Today we're launching our first data partnership with @Similarweb. Now you can:
- Access 12 months of web traffic history
- Benchmark competitors instantly
- Analyze marketing channels and traffic sources
- Get regional traffic breakdowns
All powered by Similarweb's trusted digital intelligence. Available now for all Manus Pro users.
We'll achieve Artificial General Coding Intelligence way before AGI. The reason is simple: the quality of the generic dataset is super low, and it isn't easy to filter or rank it. 99% of the content on the internet is "human slop," and AI is trained on it. So we simply don't have enough high-quality data that encapsulates general and emotional reasoning.

In contrast, it's very easy to curate a high-quality training set for coding: simply filter the public repos on GitHub by star count.

I don't see any solution for a significant boost in general dataset quality. Most likely we'd need a million humans sitting and ranking content for a few years, and we might end up filtering out most of it, leaving a final dataset that is way too small. So the huge challenge is producing high-quality general training data; it'll take us a few decades to solve. But coding is already solved, and we can clearly see how every new model does so much better at coding than the previous one.

The next step for coding agents is to start training on the code they created. And unlike generic LLM-generated slop that might reduce the quality of the dataset, AI-generated code gets instant human review, so only good code survives.

If my assumption is true, vibe coding is going to become the default way of building software pretty soon (it's already happening: every month the ratio of AI-generated to human-written code shifts further in favor of AI).
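To make that curation step concrete, here is a minimal sketch of filtering public GitHub repos by star count via the GitHub search API. The star threshold and language filter are arbitrary assumptions for illustration, not a recommendation from the post:

```python
import requests

# Minimal sketch: pull highly-starred public repos from the GitHub search API.
# The 1000-star threshold and "python" language filter are arbitrary assumptions.
def top_starred_repos(min_stars=1000, language="python", per_page=50):
    resp = requests.get(
        "https://api.github.com/search/repositories",
        params={
            "q": f"stars:>{min_stars} language:{language}",
            "sort": "stars",
            "order": "desc",
            "per_page": per_page,
        },
        headers={"Accept": "application/vnd.github+json"},
        timeout=30,
    )
    resp.raise_for_status()
    return [(r["full_name"], r["stargazers_count"]) for r in resp.json()["items"]]

if __name__ == "__main__":
    for name, stars in top_starred_repos():
        print(f"{stars:>7}  {name}")
```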
The majority of the ruff ruff is between people who look at the current point and people who look at the current slope.
There's been a lot of exuberance over the holidays about context graphs, stemming from the post by @JayaGup10 and @ashugarg on AI's trillion-dollar opportunity: context graphs. At Glean, we're excited, because it finally has a name. Context graphs that understand not just your data, but how your company actually works.

We're at a point where nearly everything in the enterprise has been digitized: decisions and structured data are captured in systems of record, while everyday work unfolds across communications tools, project management systems, code repositories, and more. Context graphs shed light on how work really gets done in the enterprise, enabling automation. With the rise of agents and their ability to reason and act, there's a major unlock ahead with automation, but only if that reasoning is grounded in the right enterprise context.

## Context has to evolve as AI advances

Glean was founded with the understanding that great search is the foundation of context. That means understanding content: indexing unstructured data so employees can search across their enterprise and quickly find the most recent and relevant information, like the latest design doc, policy update, or customer note, needed to answer a question or unblock a task.

But as AI has begun to take on more complex work, we've learned that this foundation needs to expand. It's not enough to understand enterprise data alone; systems also need relationship knowledge. How work gets done in an enterprise is fundamentally relationship-driven: knowing who owns an account, who approves a contract, which engineer is on call, or which teams collaborate when an incident escalates.

RPA and workflow tools have sought to automate the best-understood processes in organizations, but the majority of work is distributed, done by individuals and small teams, with the processes documented only as 'tribal knowledge.' How do we bring this majority of work into the automation fold and enable agents to learn and automate it? This is where context graphs come into play.

## Context graphs are really about capturing process reality to automate work

Glean's refinement to context graphs: "You can't reliably capture the why; you can capture the how." The why is often a thinking step that usually resides in someone's head; you can't actually model it. Sometimes it's hinted at in a meeting transcript or a Slack thread, but much of it never gets written down in a clean or durable way. The how, on the other hand, leaves a rich digital trail: recurring steps, data updates, approvals, collaboration patterns, changing fields, and cross-system behavior over time.

Over many cycles, those process traces approximate the why: you can infer rationales from patterns in how work repeatedly gets done, not from trying to literally store every human thought. With that in mind, the goal behind context graphs becomes capturing the "how" (the process) now, and learning the "why" (the intent) over time. If agents are meant to automate real work in the enterprise, the path is modeling processes deeply enough to understand the conditions under which work proceeds, pauses, or escalates, so that the next time a situation presents itself, the agent can figure out the right actions to take.

## Context graphs are a technical investment

Creating this level of knowledge and understanding isn't easy.
Building context graphs is hard:

• Observability (via connectors and apps): Getting a full understanding of what happens in an enterprise requires more than clean, structured decision data from systems of record. It requires observability across the connectors and applications where work actually happens: both the breadth to capture activity across the many tools employees use, and the depth to extract meaningful signals from each connector. For example, a connector to Salesforce may expose a deal stage change, but true observability comes from also seeing activity across connected apps: a document edited in Google Docs, a message sent in Slack, a meeting created in Calendar, or a record updated in Salesforce, each captured directly from the underlying system via its connector.

• Understanding activity data: Beyond indexing content, systems must capture low-level activity signals: discrete, timestamped actions taken within tools. These include events like a document edit, a field update, a comment added, a Slack message sent, or a file uploaded. Capturing these actions in chronological order, and tracking how state changes between them, provides the raw activity data (see the sketch at the end of this section).

• Deriving higher-level understanding of tasks, projects, and initiatives: Only after collecting this atomic activity data can systems begin to infer higher-level constructs. Patterns, along with semantic understanding across many low-level actions (repeated document edits, coordinated Slack messages, and frequent updates to the same records), can be aggregated to indicate a task, a project, or a broader initiative. For example, a sequence of document creation, edits, Slack messages, and record updates across several days may collectively represent a customer onboarding effort or a product launch, even if that work was never explicitly labeled as such in any single system.

Separating signals from noise is difficult, especially in an enterprise. At @Glean, for example, our task understanding reaches ~80% accuracy, an indicator of how strong all the upstream technology needs to be to make this viable. This is made more impressive by the fact that, being built for the enterprise, context graphs aren't built at internet scale. Data can't be aggregated across customers, and the resulting datasets are both smaller and inaccessible to humans due to privacy constraints, requiring the graphs to be inferred algorithmically.

## Context graphs are part of the foundational suite of technologies that form the next data platform

While context graphs are getting the most attention right now, at @Glean we know that solving context can't rely on a single technology. Getting to real process understanding requires a stack of technologies working together: connectors to observe activity across tools, indexes to enable fast retrieval, graphs to model enterprise structure and relationships, and memory to capture what happens when agents actually execute work. This stack is what allows systems to move from raw enterprise data to agents that can act.

As agents begin to operate in the enterprise, learning becomes essential. What works for humans doesn't always translate directly to agents. By capturing execution traces (how agents use tools, in what sequences, and with what outcomes), systems can learn from agentic work in practice. These traces form enterprise memory, capturing what actually works for agents over time. Process understanding doesn't come from the context graph alone; it emerges from the combination of structural understanding and learned behavior.
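To make the activity-data idea above concrete, here is a minimal, illustrative sketch of modeling timestamped activity events and grouping them into candidate tasks. All type and field names are hypothetical; this is not Glean's schema or algorithm:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from collections import defaultdict

# Hypothetical event model: one discrete, timestamped action from some tool.
@dataclass
class ActivityEvent:
    actor: str        # who did it, e.g. "alice@acme.com"
    tool: str         # e.g. "gdocs", "slack", "salesforce"
    action: str       # e.g. "doc_edit", "message_sent", "field_update"
    object_id: str    # the document, channel, or record touched
    timestamp: datetime

def group_into_candidate_tasks(events, window=timedelta(days=3)):
    """Naive illustration: events on the same object within a time window
    are treated as one candidate task. Real systems would layer semantic
    understanding on top of this purely structural grouping."""
    by_object = defaultdict(list)
    for e in sorted(events, key=lambda e: e.timestamp):
        by_object[e.object_id].append(e)

    tasks = []
    for object_id, evs in by_object.items():
        current = [evs[0]]
        for e in evs[1:]:
            if e.timestamp - current[-1].timestamp <= window:
                current.append(e)
            else:
                tasks.append((object_id, current))
                current = [e]
        tasks.append((object_id, current))
    return tasks
```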
When you step back and look at all of these layers together (connectors, indexes, graphs, and personal and enterprise memory), you realize you've effectively built an entirely new data platform. One designed not for reporting or analytics, but as the backbone of agentic automation: a system that observes how work happens, learns from execution in practice, and enables agents to reliably carry work forward across the enterprise.

## Context is foundational to agentic work

The real question behind Jaya and Ashu's post is how we enable agents to successfully get work done in the enterprise. How can they learn, understand, and operate like your enterprise? If agents are going to take on more work, that opportunity depends on a context foundation, one that understands your enterprise data, your relationships, and your processes.
BREAKING: Databricks is raising over $4 billion in Series L funding at a $134 billion valuation.
Anthropic:
- Haiku 4.5
- Sonnet 4.5
- Opus 4.5

Simple. Clear. Easy to make a choice, especially when you factor in price.

We don't use any OpenAI models at Doist (we mainly use Anthropic and Google), and not because they aren't good, but because it's a maze to figure out which OpenAI model to use for what purpose.
Claude Code did in 30 minutes a side project that took me ~2 weeks in 2025.

Semantic tweet visualization:
- downloaded all X data
- semantically embedded every post
- reduced dimensions (UMAP) and clustered
- used LLMs to label the clusters

And the output quality is *far* better!
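A minimal sketch of that pipeline, assuming sentence-transformers for embeddings, umap-learn for dimensionality reduction, and scikit-learn KMeans for clustering. The model name, cluster count, and `tweets` list are placeholder assumptions; the original post only specifies UMAP and LLM labeling:

```python
# pip install sentence-transformers umap-learn scikit-learn
from sentence_transformers import SentenceTransformer
import umap
from sklearn.cluster import KMeans

tweets = ["example tweet one", "example tweet two", "example tweet three"]  # placeholder: your exported X posts

# 1) Semantically embed every post (embedding model is an assumption).
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(tweets)

# 2) Reduce dimensions with UMAP for a 2D visualization.
n_neighbors = min(15, len(tweets) - 1)
coords = umap.UMAP(n_components=2, n_neighbors=n_neighbors, random_state=42).fit_transform(embeddings)

# 3) Cluster the posts (cluster count chosen arbitrarily, capped by dataset size).
n_clusters = min(8, len(tweets))
labels = KMeans(n_clusters=n_clusters, n_init="auto").fit_predict(embeddings)

# 4) The labeling step would send a sample of posts from each cluster to an LLM
#    and ask for a short topic name (omitted here).
for tweet, (x, y), cluster in zip(tweets, coords, labels):
    print(f"cluster {cluster} @ ({x:.2f}, {y:.2f}): {tweet[:60]}")
```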
ICYMI - Claude Code in the Claude Desktop app!

Benefits:
→ Visual session management instead of terminal tabs
→ Parallel sessions via git worktrees
→ Run locally or in the cloud
→ One-click to open in VS Code or CLI

Same Claude Code. Better ergonomics.
The longer I work with Claude Code, the more I realise I should keep all my files locally instead of scattered across many cloud SaaS apps. You can't give context without owning it all on your device. Sure, you can call the tools, but Claude just browsing your folders is so much more powerful for context. A true renaissance for local, open-source tools.

Thinking about a free Ralph tutorial.

Start with the basic loop (sketched below), then layer on top:
- Containerization
- Feedback loops
- Testing
- Formatting
- Linting/types
- Skills (for steering)
- To plan or not to plan?

All while using it to build an actual production app. WDYT?
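For readers unfamiliar with the pattern, here is a minimal sketch of what the basic Ralph-style loop might look like: repeatedly feeding the same prompt to a coding agent CLI until the work is done. The `claude -p` invocation and the PROMPT.md / done-marker convention are assumptions for illustration, not the tutorial's actual setup:

```python
import subprocess
import time
from pathlib import Path

PROMPT_FILE = Path("PROMPT.md")   # hypothetical: the standing instructions for the agent
DONE_MARKER = Path("DONE")        # hypothetical: agent creates this file when finished

# Basic loop: run the agent non-interactively with the same prompt, over and over,
# until it signals completion (or we hit an iteration cap).
for iteration in range(100):
    if DONE_MARKER.exists():
        print(f"Finished after {iteration} iterations")
        break
    subprocess.run(
        ["claude", "-p", PROMPT_FILE.read_text()],  # assumes the Claude Code CLI's print mode
        check=False,
    )
    time.sleep(5)  # small pause between iterations
```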
I just added insights to startup pages on TrustMRR.

I scrape the website to extract the HTML, ask an LLM to parse it, and extract interesting data like B2B/B2C, total number of users, pricing, etc.

Soon, this data will be editable in the dashboard. Hopefully that increases the chance your startup ranks on Google (better backlink!)
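A rough sketch of that scrape-then-parse flow, assuming the Anthropic Python SDK. The model name, prompt, and extracted fields are placeholders, not TrustMRR's actual pipeline:

```python
# pip install requests anthropic
import json
import requests
import anthropic

def extract_startup_insights(url: str) -> dict:
    # 1) Scrape the page HTML (trimmed so it fits comfortably in context).
    html = requests.get(url, timeout=30).text[:50_000]

    # 2) Ask an LLM to parse it into structured fields (field names are illustrative).
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    message = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model choice
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": (
                "From this startup landing page HTML, return only JSON with keys "
                "audience (B2B/B2C), total_users, pricing_summary. HTML:\n" + html
            ),
        }],
    )
    return json.loads(message.content[0].text)

print(extract_startup_insights("https://example.com"))
```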
BREAKING: Google is rolling out a NotebookLM integration for Gemini, where users will be able to attach notebooks as context to their conversations. Grendizer, combine!
How Spotify Wrapped Actually Works: A Software Engineer's Breakdown:
BREAKING: Google Research just dropped the textbook killer. It's called "Learn Your Way" and it uses LearnLM to transform any PDF into 5 personalized learning formats. Students using it scored 78% vs 67% on retention tests. The education revolution is here.
human data will be a $1 trillion/year market

This is not a short-term prediction. It is a structural claim about where the economy converges. To believe this, you need to accept two assumptions:
• Digital and physical intelligence can eventually automate the tedious parts of the economy
• Self-learning intelligence without human data is impossible at the frontier

automation is the most useful & liberating thing humanity can do

If AI systems can automate functions, then automating all functions is the highest-leverage task for humanity. Automation compresses time. It allows:
• Aspirations to be fulfilled faster, by orders of magnitude
• Humans to focus on the enjoyable, judgment-heavy parts of work while robots and agents handle the rest

As humans gain time, they create more. Net-new work is initially creative and high-value. Over time it becomes legible, repeatable, and ready for automation. Once automated, it continues delivering value while freeing humans to focus on new creative work. This loop is permanent. Automation does not eliminate human work. It pushes humans toward higher-value, more creative work.

At a societal level, automation reshapes the economics of the world. As AI systems take on more production and coordination, the cost of producing goods and services collapses while availability explodes. At the same time, distribution becomes increasingly optimal. Digitally and physically intelligent systems coordinate supply and demand with less friction, less waste, and less delay, making access faster, cheaper, and more reliable every year.

AI models learn from humans forever

Every artificially intelligent system learns from humans in some form:
• Demonstrations
• Supervised fine-tuning
• Preference learning
• Complex rubrics and evaluations
• Continual corrections

Even self-play and synthetic data depend on human grounding: humans define objectives, rewards, and what "good" looks like.

As a result:
• Every function in the economy contains useful learning signal
• Every decision, exception, failure, and tradeoff creates data

But raw activity is not enough. That data must be:
• Recorded
• Structured
• Evaluated
• Packaged into usable pipelines

And importantly, functions must continue running while they are being automated. Automation is iterative, not instantaneous.

this creates a universal obligation and opportunity

To iteratively automate functions, every company, government agency, or institution running real operations must consume and produce structured data related to those functions. In most cases, it will not be optimal for them to create or structure that data themselves, due to scale inefficiencies, high fixed costs, and the operational difficulty of producing high-quality, reusable structured data in-house.

We already see this dynamic today. For example, many lawyers produce more leverage per hour working on standardized, structured legal data through platforms like micro1 than they do performing unstructured work inside individual law firms. At micro1, over 1,000 lawyers work in structured data creation and earn on average ~20% more than in traditional firm roles. Law firms themselves are unlikely to become large-scale producers of structured training data, but they will increasingly be consumers of that data, either directly or by having it embedded in the tools they use.

This creates a powerful incentive structure.
Labs that are automating functions will pay for this data, because long term the value gained from incremental automation far exceeds the cost of acquiring the data. As a result:
• Entities are incentivized to produce high-quality human data not just to automate themselves, but because that data has external market value
• Every hour of work can simultaneously:
  • Run the organization
  • Train AI models
  • Generate additional revenue for the organization

Human labor becomes not just labor to produce goods & services, but a revenue-generating asset on its own.

the ultimate convergence: 5%+ of human time is spent on human data

It's reasonable to think that most functions in the economy will spend some amount of time trying to automate themselves. Not fully, and not all at once, but continuously pushing work out of the human loop as it becomes repeatable and scalable.

Today, even knowledge workers spend the majority of their time on communication and coordination rather than on what we would consider actual productive work. As automation advances, tedious parts of knowledge work are progressively removed, and automation increasingly absorbs coordination, scheduling, routing, and routine communication. The result is a larger share of human time being spent on judgment-heavy knowledge work.

Even under conservative assumptions, it is reasonable to expect that in a more automated economy roughly 75% of work time is still spent on communication and coordination, while about 25% is spent doing actual work. Not all of that work needs to be structured. But a meaningful fraction does. Work that produces decisions, judgments, demonstrations, evaluations, and exceptions becomes far more valuable when captured in a structured, reusable form, both to complete the task and to enable future automation.

If only one fifth of that actual work is performed in structured environments, that implies roughly 5% of total human labor time is spent generating structured human data. With global GDP at roughly $100T, and labor representing about 50% of that, total labor spend is around $50T annually. Five percent of that corresponds to roughly $2.5T per year of human time directed at enabling automation, creating demonstrations, feedback, evaluations, and learning signals for AI systems (see the back-of-envelope sketch below).

Certainly not all of this will become explicit spend in the human data market. Much of it will remain implicit, fragmented, or unpriced. But even with aggressive discounting, you still arrive at something on the order of $1T per year.

automation reshapes labor, it doesn't shrink it

As automation scales, some of what was spent on human labor is redirected towards:
• Energy
• Compute
• AI labor

However, total human labor spend continues to increase. Why? Automation creates time. Time enables creativity. Creativity produces net-new functions within the economy. Those functions are initially done by humans. Over time, they follow the same automation cycle.

human labor gets more expensive because:
• Human time is finite at any moment
• Creativity and judgment are scarce
• Net-new ideas command premium value

As automation expands, humans concentrate more of their time on higher-leverage work. While total human hours do grow over time, that growth cannot be rapidly accelerated in response to demand. The fastest and dominant way the labor market expands is by increasing the value created per human hour.
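Restating the 5% back-of-envelope from the convergence section above as a quick calculation; every number is taken from the post's own assumptions:

```python
# Back-of-envelope restatement of the estimate above, using the post's own assumptions.
actual_work_share = 0.25       # ~25% of work time is "actual work" (vs. coordination)
structured_fraction = 1 / 5    # one fifth of that actual work happens in structured environments
structured_time_share = actual_work_share * structured_fraction   # = 0.05, i.e. ~5% of labor time

global_gdp = 100e12            # ~$100T global GDP
labor_share = 0.5              # labor is ~50% of GDP -> ~$50T annual labor spend
human_data_value = global_gdp * labor_share * structured_time_share

print(f"{structured_time_share:.0%} of labor time ~= ${human_data_value/1e12:.1f}T per year")
# -> 5% of labor time ~= $2.5T per year, discounted in the post to ~$1T of explicit market spend
```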
As this continues:
• Total human labor spend rises
• A larger share of human time is spent generating learning signals and enabling automation

we should never call it annotation again

The importance of this work in shaping AI means calling it "data labeling" or "annotation" is completely inaccurate. These phrases describe mechanical tasks, when the real value comes from human judgment, expertise, and decision-making expressed in structured form. A more accurate description is expert human data creation or structured human judgment.

This is how human expertise compounds in an automated economy. It explains why human data scales with automation rather than disappearing, and why it becomes a first-class economic input over time.

human brilliance is needed more than ever

This does not require extreme assumptions. It only requires that automation continues to work, and that intelligence continues to learn from humans. If that is true, then human data is not a phase or a temporary bottleneck. It is a structural input to the economy.

Human judgment is captured, structured, and refined. That judgment becomes the training substrate of intelligence. That intelligence, in turn, produces more automation. As functions are automated, human time is freed. That time is spent creating new functions to automate, and the beautiful cycle continues.
Anthropic launched Claude for Healthcare with HIPAA-ready products and expanded Claude for Life Sciences with new connectors ranging from clinical trial management to regulatory operations.

- Claude for Healthcare connects to the Centers for Medicare and Medicaid Services Coverage Database, International Classification of Diseases 10th Revision codes, and the National Provider Identifier Registry, with new Agent Skills for FHIR development and a sample prior authorization review skill that can be customized to organizations' policies
- US Claude Pro and Max plan subscribers get beta access to HealthEx and Function connectors now, with Apple Health and Android Health Connect integrations rolling out in beta this week on iOS and Android apps for accessing lab results and health records
- Claude for Life Sciences adds connectors to Medidata for trial data and site performance, ClinicalTrials.gov, ToolUniverse with 600+ vetted scientific tools, bioRxiv and medRxiv preprint servers, Open Targets, ChEMBL, and Owkin Pathology Explorer for tissue image analysis
- New Agent Skills cover scientific problem selection, converting instrument data to Allotrope, scVI-tools and Nextflow deployment for bioinformatics, and a sample skill for clinical trial protocol draft generation with endpoint recommendations accounting for regulatory pathways, competitive landscape, and FDA guidelines
- Anthropic is hosting "The Briefing: Healthcare and Life Sciences", a free livestreamed virtual event on January 12 at 11:30 AM PST with Anthropic leadership and customer perspectives on AI in healthcare
Today we're introducing Scribe v2: the most accurate transcription model ever released. While Scribe v2 Realtime is optimized for ultra-low latency and agent use cases, Scribe v2 is built for batch transcription, subtitling, and captioning at scale.
A new (beta) feature of Claude that I've been learning about today is programmatic tool calling. Claude writes code that calls and runs your tools directly in a sandbox before returning results to the model. This reduces latency + token consumption because you can essentially filter or process data before it reaches the model's context window. https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling…
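To illustrate the general pattern (not the actual Anthropic API; see the linked docs for that), here is a toy sketch of the idea: the model emits a small script that calls tools and filters their output in a sandboxed namespace, so only the distilled result ever enters the model's context. Every name here is hypothetical.

```python
# Toy illustration of the pattern, not Anthropic's API: tools are plain functions,
# and "model_written_code" stands in for code the model would generate.

def fetch_orders(customer_id: str) -> list[dict]:
    """Hypothetical tool that would normally dump a large payload into context."""
    return [{"id": i, "total": i * 10.0} for i in range(1000)]

TOOLS = {"fetch_orders": fetch_orders}

model_written_code = """
orders = fetch_orders("cust_42")
# Filter/aggregate inside the sandbox so only a tiny summary goes back to the model.
result = {"order_count": len(orders), "total_spend": sum(o["total"] for o in orders)}
"""

def run_in_sandbox(code: str) -> dict:
    # Execute the model-written code with only the registered tools in scope.
    namespace = dict(TOOLS)
    exec(code, namespace)
    return namespace["result"]

print(run_in_sandbox(model_written_code))
# -> {'order_count': 1000, 'total_spend': 4995000.0}  (a few tokens instead of 1000 records)
```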
We collaborated with @a16z to publish the **State of AI** - an empirical report on how LLMs have been used on OpenRouter. After analyzing more than 100 trillion tokens across hundreds of models and 3+ million users (excluding 3rd party) from the last year, we have a lot of insights to share.