Artificial General Coding Intelligence Before AGI

We'll achieve Artificial General Coding Intelligence well before AGI. The reason is simple: the quality of the generic dataset is very low, and it isn't easy to filter or rank it. 99% of the content on the internet is "human slop," and AI is trained on it. So we simply don't have enough high-quality data that encapsulates generic and emotional reasoning.

In contrast, it's very easy to curate a high-quality training set for coding: just filter all public repos on GitHub by star count. I don't see any solution for a significant boost in general dataset quality. Most likely we'd need a million humans ranking content for a few years, and even then we might end up filtering out most of it, leaving a final dataset that is far too small. So the huge challenge will be producing high-quality general training data, and it'll take a few decades to solve. Coding, by contrast, is already solved, and we can clearly see how every new model does so much better at coding than the previous one.

The next step for coding agents is to start training on the code they created. Unlike generic LLM-generated slop, which might degrade the dataset, AI-generated code gets instant human review, so only good code survives. If my assumption is true, vibe coding will become the default way of building software pretty soon. It's already happening: every month, the ratio of AI-generated to human-written code shifts in favor of AI.
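The star-count filter described above can be sketched in a few lines. This is a minimal illustration, not a real GitHub API call: the repo records and the threshold are hypothetical placeholders.

```python
# Hypothetical sketch: curate a code training set by keeping only
# well-starred repos, sorted best-first. Repo data is made up for
# illustration; a real pipeline would pull metadata from GitHub.

def curate_by_stars(repos, min_stars=1000):
    """Keep repos at or above the star threshold, highest-starred first."""
    kept = [r for r in repos if r["stars"] >= min_stars]
    return sorted(kept, key=lambda r: r["stars"], reverse=True)

repos = [
    {"name": "popular/lib", "stars": 42000},
    {"name": "obscure/experiment", "stars": 3},
    {"name": "solid/tool", "stars": 1500},
]

print([r["name"] for r in curate_by_stars(repos)])
# → ['popular/lib', 'solid/tool']
```

The design choice is the whole argument in miniature: star count is a cheap, pre-existing human quality signal, so the filter needs no new labeling effort.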

