Finding signal on Twitter is more difficult than it used to be. We curate the best tweets on topics like AI, startups, and product development every weekday so you can focus on what matters.
Claude Opus 4.6 (120K Thinking) on ARC-AGI Semi-Private Eval
Max Effort:
- ARC-AGI-1: 93.0%, $1.88/task
- ARC-AGI-2: 68.8%, $3.64/task
New ARC-AGI SOTA model from @AnthropicAI
ARC Prize submitted a response to @NSF's new Tech Labs program, which funds teams breaking down barriers for emerging tech. We believe NSF will be a neutral anchor for this work. Tech Labs will accelerate the path from AI research to real-world deployment.
A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task
Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task
This represents a ~390X efficiency improvement in one year
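For readers who want to check the arithmetic, here is a minimal sketch of how the ~390X figure can be reproduced from the two per-task costs quoted above. It assumes "efficiency" means cost per task only, and the variable and function names are illustrative, not anything ARC Prize published. The $4.5k figure is itself an estimate.

```python
# Per-task costs as quoted in the tweet; the o3 figure is an estimate.
O3_PREVIEW_COST_USD = 4500.00   # o3 (High) preview, est. $4.5k/task
GPT_5_2_PRO_COST_USD = 11.64    # GPT-5.2 Pro (X-High), $11.64/task


def cost_ratio(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    if new_cost <= 0:
        raise ValueError("new_cost must be positive")
    return old_cost / new_cost


ratio = cost_ratio(O3_PREVIEW_COST_USD, GPT_5_2_PRO_COST_USD)
print(f"{ratio:.1f}x cheaper per task")  # prints "386.6x cheaper per task"
```

The result lands slightly under the rounded ~390X, and it compares cost only: the verified score also rose from 88% to 90.5% over the same period.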