GDPval Score and AI Economic Competitiveness
Whoa. This new GDPval score is a very big deal. It is probably the most economically relevant measure of AI ability, and it suggests that in head-to-head competition with human experts on tasks that take a human 4-8 hours to do, GPT-5.2 wins 71% of the time, as judged by other humans.