Finding signal on Twitter is more difficult than it used to be. We curate the best tweets on topics like AI, startups, and product development every weekday so you can focus on what matters.

Gemini 3 Deep Think Performance on ARC-AGI Benchmarks

Gemini 3 Deep Think is getting a significant upgrade. We’ve refined Deep Think in close partnership with scientists and researchers to tackle tough, real-world challenges. It’s pushing the frontier across the most challenging benchmarks, achieving an unprecedented 84.6% on ARC-AGI-2. It also sets a new standard on Humanity’s Last Exam, scoring 48.4% without tools.
