GPT 5.2 achieves higher accuracy with more compute time

Press Space for next Tweet

OpenAI just dropped that GPT 5.2 is #1 on the METR eval, finishing 6.5hr long human tasks half of the time, but.. It takes ~10x the working time per task of Gemini 3 Pro and ~25x of Opus 4.5. The models are smarter, but by using insane amounts of test-time compute reasoning!

Topics

artificial intelligence machine learning technology programming innovation data science software engineering

Read the stories that matter.The stories and ideas that actually matter.

Save hours a day in 5 minutesTurn hours of scrolling into a five minute read.