Finding signal on Twitter is more difficult than it used to be. We curate the best tweets on topics like AI, startups, and product development every weekday so you can focus on what matters.
Sonnet 4.6 first loss on snake bench. After looking at many games over the months, 99% of losses are from unforced errors. The model simply doesn't think that many moves in advance. Here Sonnet (green) chose to go right, which trapped it in the corner, resulting in an L

Who’s up to build an RLM agent for ARC-AGI-3? Bounty available https://github.com/arcprize/ARC-AGI-3-Agents
skills as markdown feel like a stepping stone. the production version of this will be paid, metered skills agents call as services. paid prompts didn’t work, but this might. who’s building this?
I've gone through sprints of content over the past ~6 years But haven't posted seriously for 24 months (since going full time w/ ARC Prize) * Email: 20K subs * YT: 68K subs ~100% subs are AI/ML focused Not sure what to do with this


What’s the standard way to document/log agent runs? A JSON DAG that shows all prompts, tool calls, sub-agent runs, tool responses, etc.? Is there a standard yet?
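One possible shape for such a log, offered only as a minimal sketch and not as an existing standard: each prompt, tool call, tool response, or sub-agent run becomes a node with a list of parent node ids, and the whole run serializes to JSON. The names `RunNode`, `RunLog`, and `build_example_log`, and the payload fields, are hypothetical and chosen for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RunNode:
    node_id: str
    kind: str          # e.g. "prompt", "tool_call", "tool_response", "sub_agent"
    payload: dict      # free-form content for this step
    parents: list = field(default_factory=list)  # ids of upstream nodes


class RunLog:
    """Append-only log; parents must already exist, so no cycles can form."""

    def __init__(self):
        self.nodes = {}

    def add(self, node_id, kind, payload, parents=()):
        for parent in parents:
            if parent not in self.nodes:
                raise KeyError(f"unknown parent node: {parent}")
        self.nodes[node_id] = RunNode(node_id, kind, payload, list(parents))
        return node_id

    def to_json(self):
        # Nodes are emitted in insertion order, which is a valid topological order.
        return json.dumps({"nodes": [asdict(n) for n in self.nodes.values()]}, indent=2)


def build_example_log():
    # Hypothetical run: one prompt, one tool call, one tool response.
    log = RunLog()
    log.add("n1", "prompt", {"text": "Summarize open deals"})
    log.add("n2", "tool_call", {"tool": "query", "args": {"object": "deal"}}, parents=["n1"])
    log.add("n3", "tool_response", {"rows": 3}, parents=["n2"])
    return log


if __name__ == "__main__":
    print(build_example_log().to_json())
```

Because a node can only point at nodes that were logged earlier, the structure stays acyclic without a separate validation pass, and a sub-agent run can be recorded as a node whose payload carries its own nested log.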
Inspired by @danshipper’s recent video, I’m taking an “agent-native” approach to ARC Prize. For code, this is straightforward. So much greenfield on non-code business use cases. Step 1 is get the data into a consumable spot
Even if model growth slowed to a halt today (it won’t), we have 4-5yrs of harness growth + compute coming online before we asymptote
Someone cold-emailed me asking for an internship But the body of the email was just an LLM error message Rather than roasting him, I got on a call with @sanjitr11 (the op) to get his story and see if I could help I'm glad I did! He showed agency for his goals. 5 different
Claude Agent SDK is a hammer and I'm looking for nails everywhere Hooked it up to: CRM <> Claude SDK <> Slack I gave it the most general tools along w/ a skill.md Tools: - list_objects: Find all object types - list_attributes: Find fields on objects - query: Query objects
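To make the idea of "most general tools" concrete, here is a minimal sketch of the three named tools as plain Python functions over an in-memory stand-in for a CRM. It does not use the Claude Agent SDK or any real CRM API; the `CRM` dictionary, its records, and the function signatures are all assumptions made for illustration.

```python
# Hypothetical in-memory CRM: object type -> list of records.
CRM = {
    "contact": [
        {"name": "Ada", "company": "Acme"},
        {"name": "Lin", "company": "Globex"},
    ],
    "deal": [
        {"title": "Renewal", "stage": "open"},
    ],
}


def list_objects():
    """Find all object types."""
    return sorted(CRM)


def list_attributes(object_type):
    """Find fields on objects of the given type."""
    return sorted({key for record in CRM[object_type] for key in record})


def query(object_type, **filters):
    """Query objects, keeping records whose fields equal every filter value."""
    return [
        record
        for record in CRM[object_type]
        if all(record.get(key) == value for key, value in filters.items())
    ]


if __name__ == "__main__":
    print(list_objects())
    print(list_attributes("contact"))
    print(query("contact", company="Acme"))
```

The point of keeping the tools this generic is that an agent can discover the schema itself (types first, then fields, then records) instead of needing one bespoke tool per question.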