Finding signal on Twitter is more difficult than it used to be. We curate the best tweets on topics like AI, startups, and product development every weekday so you can focus on what matters.
> "Less than half the tokens of 5.2-Codex for same tasks"
That one line already says a lot. In 2026, nobody assumes compute or budget is infinite anymore. But if you can get better model performance while using fewer tokens, that's a win-win.
> " Less than half the tokens of 5.2-Codex for same tasks" That one line already says a lot. There is no assumption anymore that compute or budget is infinite in 2026. But if you can get better modeling performance while using fewer tokens, that's a win-win. x.com/sama/status/20 Image This post is unavailable.
Ch 6 on RL with verifiable rewards is now available. Essentially GRPO from scratch, and probably my favorite chapter so far. (First 363 pages done, yay!) I'm now working on the follow-up with more RLVR runs, more metrics & analyses, and extensions like policy clipping and KL regularization.
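Not from the chapter itself, but as a rough sketch of the pieces that tweet names: with verifiable rewards, a verifier scores a group of completions sampled from the same prompt, GRPO turns those scores into group-relative advantages, and the follow-up extensions add a PPO-style clipped surrogate and a KL penalty against a frozen reference policy. All function and variable names below are illustrative, not from the book.

```python
import math

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each reward against its group's mean and std.

    With verifiable rewards, `rewards` are e.g. 0/1 correctness scores for a
    group of completions sampled from the same prompt.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var) + 1e-8  # avoid division by zero when all rewards match
    return [(r - mean) / std for r in rewards]

def clipped_surrogate(logp_new, logp_old, advantage, eps=0.2):
    """PPO-style clipped objective for one completion (or one token)."""
    ratio = math.exp(logp_new - logp_old)
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1 + eps), 1 - eps) * advantage
    return min(unclipped, clipped)

def kl_penalty(logp_new, logp_ref):
    """Per-token KL estimate against a frozen reference policy (the k3 estimator
    commonly used in RLHF/RLVR code)."""
    log_ratio = logp_ref - logp_new
    return math.exp(log_ratio) - log_ratio - 1

# Example: 4 sampled answers to one prompt, graded by a verifier (1 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))  # roughly [1.0, -1.0, -1.0, 1.0]
```

The group-relative baseline is the key simplification: no learned value model, just the group's own mean and std, which is what makes a from-scratch implementation tractable.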