Gemini 3 Deep Think Performance on ARC-AGI Benchmarks
Gemini 3 Deep Think is getting a significant upgrade. We’ve refined Deep Think in close partnership with scientists and researchers to tackle tough, real-world challenges. And it’s pushing the frontier across the most challenging benchmarks, achieving an unprecedented 84.6% on ARC-AGI-2. It also sets a new standard on Humanity’s Last Exam, scoring 48.4% without tools.
