Finding signal on Twitter is more difficult than it used to be. We curate the best tweets on topics like AI, startups, and product development every weekday so you can focus on what matters.

AI Benchmarking Focuses on Coding Over Real Work Tasks

Ethan Mollick

What a great illustration of the central problem of AI benchmarking for real work. All of the effort is going into benchmarking for coding, but that is a small part of the actual jobs people do, which leaves the true trajectory of AI progress less clear. https://arxiv.org/pdf/2603.01203
