Claude Hallucinated a GitHub Repository During Deployment

Guillermo Rauch

A Vercel user reported an issue that sounded extremely scary: an unknown open-source GitHub codebase had been deployed to their team. We, of course, took the report extremely seriously and began an investigation, with security and infra engineering engaged.

It turns out Opus 4.6 *hallucinated a public repository ID* and used our API to deploy it. Luckily for this user, the repository was harmless and random. The JSON payload looked like this:

"gitSource": {
  "type": "github",
  "repoId": "913939401", // ⚠️ hallucinated
  "ref": "main"
}

When the user asked the agent to explain the failure, it confessed:

The agent never looked up the GitHub repo ID via the GitHub API. There are zero GitHub API calls in the session before the first rogue deployment. The number 913939401 appears for the first time at line 877; the agent fabricated it entirely. The agent knew the correct project ID (prj_▒▒▒▒▒▒) and project name (▒▒▒▒▒▒) but invented a plausible-looking numeric repo ID rather than looking it up.

Some takeaways:

▪️ Even the smartest models have bizarre failure modes that are very different from ours. Humans make lots of mistakes, but making up a random repo ID is certainly not one of them.
▪️ Powerful APIs create additional risks for agents. The API exists to import and deploy legitimate code, but not if the agent decides to hallucinate what code to deploy!
▪️ Thus, it's likely the agent would have had better results had it not decided to use the API and stuck with the CLI or MCP.

This reinforces our commitment to make Vercel the most secure platform for agentic engineering. Through deeper integrations with tools like Claude Code and additional guardrails, we're confident security and privacy will be upheld.

Note: the repo ID above is randomized for privacy reasons.
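
The failure described above, an identifier that first appears in the deployment call with no earlier lookup behind it, suggests one possible guardrail: refuse to deploy unless the repo ID already showed up in a prior tool result in the same session. The sketch below is illustrative only and is not Vercel's or Anthropic's implementation. `SessionLog`, `check_git_source`, and `UngroundedIdentifierError` are hypothetical names, and the sample lookup response is made up apart from the randomized ID quoted in the post.

```python
import re
from dataclasses import dataclass, field


class UngroundedIdentifierError(ValueError):
    """Raised when an identifier has no provenance in the session (hypothetical)."""


@dataclass
class SessionLog:
    """Hypothetical record of raw tool outputs the agent has received so far."""

    tool_outputs: list[str] = field(default_factory=list)

    def record(self, output: str) -> None:
        self.tool_outputs.append(output)

    def has_seen(self, value: str) -> bool:
        # Match the value as a whole token, so a fragment such as "913"
        # cannot vouch for the full ID "913939401".
        pattern = re.compile(rf"(?<!\w){re.escape(value)}(?!\w)")
        return any(pattern.search(out) for out in self.tool_outputs)


def check_git_source(git_source: dict, session: SessionLog) -> None:
    """Reject a gitSource whose repoId never appeared in an earlier tool result."""
    repo_id = str(git_source.get("repoId", ""))
    if not repo_id:
        raise UngroundedIdentifierError("gitSource.repoId is missing")
    if not session.has_seen(repo_id):
        raise UngroundedIdentifierError(
            f"repoId {repo_id} was never returned by a tool call in this session"
        )


if __name__ == "__main__":
    session = SessionLog()
    payload = {"type": "github", "repoId": "913939401", "ref": "main"}

    try:
        check_git_source(payload, session)  # no lookup has happened yet
    except UngroundedIdentifierError as err:
        print(f"blocked: {err}")

    # Simulate a repository lookup result (made-up response body).
    session.record('{"id": 913939401, "full_name": "example/repo"}')
    check_git_source(payload, session)  # passes once the ID has provenance
    print("allowed")
```

A check like this only establishes that the ID came from somewhere in the session, not that it is the repository the user intended, so it would complement rather than replace scoped tokens or a confirmation step.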
