Why Selling Work Instead of Software Doesn't Work
📝 Don’t Sell the Work

For the last four years, the prevailing theory in Silicon Valley was that software companies should “sell work, not software.” The overly simplistic explanation of the idea is that as AI got better at AIing, customers would need fewer employees, and therefore companies who charged on a per-seat basis would be screwed. By selling work, a startup was offering “outcome-based pricing,” whereby it could capture a percentage of a customer’s payroll rather than competing with the rest of software for scraps of a firm’s IT budget.

This is an elegant idea. It neatly solves the problems that AI introduced to software’s business model, and like all of Silicon Valley’s most important theses, it could theoretically make founders several yachts’ worth of cash. The narrative reverberated throughout the valley. Sequoia adopted a “tools → copilots → autopilots” framework with outcome-based pricing as the evolutionary endpoint. a16z began framing the AI era as a shift from “Software-as-a-Service” to “Service-as-a-Software.” And everyone, including myself, nodded in agreement.

Unfortunately for all of us, this elegant idea turned out to be very wrong. There is exactly one category of startups where the model has been effective (AI customer service), and every other major AI application today uses a different positioning and pricing strategy.

So what happened? Why has “selling work” been such a dud? Outside of mere intellectual curiosity, the answer to this question really matters right now. If you buy my argument from last week, namely that the sell-off in software stocks was justified because profit pools would shift to a new type of application focused on the context layer, then we need to understand how these new apps should be monetized. Just how exactly are these new types of companies going to make money?

Why hasn’t selling work, uh, worked? The answer is simple: there is no actual “work” to sell.

The opacity of production

The counter-arguments to “sell work” have mostly been economic: margins are bad, 50-60% versus 80-90% for SaaS. These critiques are valid but insufficient. The deeper problem is ontological.

When you hire McKinsey, you can’t reverse-engineer what they did into a bill of materials. The production process is opaque; you’re paying for judgment, relationships, and institutional knowledge bundled into a deliverable whose value is difficult to quantify. That opacity is what sustains pricing power.

AI work has no such opacity. The production process is increasingly transparent. Sophisticated buyers can see the API calls, estimate token counts, and calculate the rough cost of what you just delivered. The “work” firms are charging for (a market analysis, a contract review, a support resolution) is a markup on inference wrapped in varying degrees of prompt engineering, workflow orchestration, and domain context. And unlike McKinsey’s black box, the buyer can peek inside this one.

This is why the “it’s just tokens” critique sticks in a way that “it’s just compute” never stuck against SaaS. Everyone knew Salesforce’s marginal cost per seat approached zero; that was the whole bull case for software! But it didn’t matter. You couldn’t replicate Salesforce’s value by renting your own servers from AWS. The pricing power came from platform lock-in, data gravity, integrations, and switching costs, none of which had anything to do with the cost of production.

AI work is different. The production process is reproducible. Your customer can see that you’re calling Claude with a system prompt and some retrieved context, and increasingly, they can do exactly that themselves. The API is right there. The prompt engineering isn’t patented. The workflow orchestration is a weekend project for a competent engineer. When the buyer can not only see your costs but replicate your production process, the opacity is effectively zilch.
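To make “reproducible” concrete, here is a minimal sketch of that production process using the Anthropic Python SDK. The model name, system prompt, and retrieved context are illustrative placeholders, not any vendor’s actual pipeline:

```python
# A minimal sketch of the "reproducible production process": one API call
# wrapping a system prompt around some retrieved context. The model name,
# prompt, and context below are illustrative placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# In a real product this string would come from a retrieval step over the
# customer's own documents; here it is just a stand-in.
retrieved_context = "Q3 revenue was $4.2M, up 12% quarter over quarter ..."

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # any frontier model; placeholder choice
    max_tokens=1024,
    system="You are a market analyst. Answer using only the provided context.",
    messages=[
        {
            "role": "user",
            "content": f"Context:\n{retrieved_context}\n\nSummarize the key risks.",
        }
    ],
)

print(response.content[0].text)
# The buyer can also read response.usage.input_tokens / output_tokens and
# price the whole deliverable against the provider's public rate card.
```

That last comment is the whole point: the usage metadata ships with every response, so the cost of the “work” is visible to anyone who reproduces the call.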
And this matters because of what sits underneath. Price is only partially determined by the value of a product. A pricing model is a reflection of the scarcity in your supply chain. When your production process is reproducible and your inputs are abundant (tokens are hardly hard to come by nowadays), you’ve lost pricing power from both directions.

Take KPMG’s audit group. They forced their own auditor to give them a 14% discount because they know that AI is doing most of the work, and the work is cheap. (Yes, there is some irony in KPMG demanding audit discounts while not extending the same courtesy to their own customers.) This is what happens when the opacity that sustained professional-services pricing disappears. The customer benchmarks your price against the visible cost of the underlying unit of production, not against the value of the output.

Inference costs dropped roughly 99% per token between 2023 and 2025. Any business model that prices against the value of human labor while its input costs follow a Moore’s Law trajectory is building on a foundation of perpetual margin compression. Your customers will demand that you pass those savings through.

The scissors problem

But even if you could maintain the opacity, the economics of “selling work” have a structural problem that makes the model worse over time, not better. There’s a scissors dynamic. On one blade, inference costs per token keep falling. On the other blade, token consumption per task is exploding.

OpenRouter’s analysis of over 100 trillion tokens in December of last year shows that reasoning models now account for over 50% of all tokens processed. Average sequence lengths have more than tripled over the past 20 months, from under 2,000 tokens to over 5,400. Programming workloads routinely exceed 20,000 input tokens per request. A reasoning model can consume 10,000 internal “thinking” tokens to produce a 200-token answer.

Ask any founder of an AI application today if their token costs have gone down over the last few years and they’ll laugh in your face. Sure, tokens are cheaper, but the frontier models from OpenAI and Anthropic charge a hefty markup on them, and the work we are demanding from these models forces ever more token consumption. Sophisticated customers demand the smartest models, because why shouldn’t they?

This is the treadmill that “sell work” companies are trapped on. You save 50% on per-token costs, but your customers now expect agentic, multi-step workflows that consume 10-100x the tokens. The price per token drops; the tokens per task skyrocket. Net costs stay flat or rise, while your customer still benchmarks your price against the ever-cheaper unit cost of the commodity underneath. The margin improvement that was supposed to rescue outcome-based pricing never arrives because the goalposts keep moving.
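The back-of-envelope arithmetic makes the treadmill concrete. The prices and token counts below are assumptions chosen to sit inside the ranges quoted above, not measured figures:

```python
# Scissors arithmetic with illustrative numbers. All prices and token
# counts are assumptions for the sketch, not measurements.

# Yesterday's task: a single short prompt/completion pair.
old_price_per_mtok = 30.00     # $ per million tokens (assumed)
old_tokens_per_task = 3_000

# Today's task: agentic, multi-step, with reasoning tokens.
new_price_per_mtok = 15.00     # per-token price cut in half
new_tokens_per_task = 120_000  # 40x the tokens per task (within the 10-100x range)

old_cost = old_price_per_mtok * old_tokens_per_task / 1_000_000
new_cost = new_price_per_mtok * new_tokens_per_task / 1_000_000

print(f"old cost per task: ${old_cost:.2f}")  # $0.09
print(f"new cost per task: ${new_cost:.2f}")  # $1.80
# The per-token price fell 50%; the cost per task rose 20x.
```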
The quality specification problem

What makes the case for “selling work” even worse is that even if you could stabilize the pricing, “work” requires something that tokens cannot provide: a contractual specification of quality.

The reason firms hire lawyers at $800/hour rather than paying per brief is that the quality of legal reasoning cannot be specified ex ante in a contract. You’re buying access to judgment, not a deliverable. You are buying lawyers who all went to Harvard Law because that carries with it a promise of indefinable quality. The same problem reappears with AI. If you’re “selling work,” you need a contract that defines what good work looks like, and that specification problem is often as expensive as the work itself.

Ronald Coase explained decades ago that firms exist because they reduce the “costs of using the price mechanism.” Subscription pricing is the software equivalent: it eliminates per-transaction negotiation, measurement, and verification costs. Per-task or per-outcome pricing reintroduces all three. Every “outcome” must be defined, measured, verified, and attributed. The AI vendor knows the actual difficulty and cost of completing each task; the customer does not. Classic principal-agent problems emerge on both sides.

This creates what economists call moral hazard in both directions. Vendors paid per “completed task” face incentives to use cheaper models, minimize compute, skip quality checks, and classify ambiguous outcomes as successes. Customers face incentives to overload systems with low-value tasks or dispute outcomes to avoid payment. You’ve rebuilt the adversarial dynamics of the consulting industry, except with less accountability and no reputational mechanism to discipline quality.

Subscriptions or per-token credit systems sidestep all of this. You’re buying access to the tool. Quality assessment becomes the buyer’s problem, not a contractual negotiation. The entire transaction cost structure is simpler.

The binary exception proves the rule

There is exactly one domain where “sell work” has produced explosive growth: customer support. Sierra AI grew from roughly $20M to $150M ARR in about 15 months, charging ~$1.50 per resolution. Decagon grew from ~$6M to ~$35M ARR using per-conversation pricing. Intercom’s Fin, at $0.99 per resolution, drove 40% higher adoption. The only other prominent case study I could find is the legal tech firm EvenUp, which charges per brief generated, but even then customers have a monthly minimum they have to purchase.

But customer support isn’t really “selling work” in the way the argument originally meant. A resolved support ticket is a binary state change. Did the customer’s problem go away? Yes or no. The outcome is measurable not because the “work” is well-defined, but because the absence of a problem is well-defined. Attribution is unambiguous: the AI either handled the ticket or it didn’t.

These conditions don’t really exist anywhere else. What’s a good contract clause? What’s a high-quality line of code? These are judgment calls, and judgment is precisely what’s hardest to contractually specify and verify. The moment you move from binary state changes to quality spectrums, outcome-based pricing falls apart.
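Here is a sketch of why the binary case is contractible when nothing else is. The data structure, the clawback window, and the price are all hypothetical, but they show how the entire “outcome” fits in a one-line predicate:

```python
# Why per-resolution pricing is contractible: the billable event is a binary
# state change. The dataclass, the 7-day clawback, and the $1.50 price are
# hypothetical illustrations, not any vendor's actual billing logic.
from dataclasses import dataclass

PRICE_PER_RESOLUTION = 1.50  # illustrative, in the spirit of ~$1.50/resolution

@dataclass
class Ticket:
    id: str
    resolved_by_ai: bool      # unambiguous: the ticket either closed or it didn't
    reopened_within_7d: bool  # a simple, checkable clawback condition

def billable(ticket: Ticket) -> bool:
    # The whole contract fits in one line because the outcome is a state
    # change, not a quality judgment.
    return ticket.resolved_by_ai and not ticket.reopened_within_7d

tickets = [
    Ticket("T-1", resolved_by_ai=True, reopened_within_7d=False),
    Ticket("T-2", resolved_by_ai=True, reopened_within_7d=True),
    Ticket("T-3", resolved_by_ai=False, reopened_within_7d=False),
]

invoice = sum(PRICE_PER_RESOLUTION for t in tickets if billable(t))
print(f"invoice: ${invoice:.2f}")  # $1.50

# Now try writing the equivalent predicate for "a high-quality contract
# clause" or "a good line of code": there is no boolean to check.
```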
Meanwhile, seat-based AI is eating the world

The irony of the “sell work” thesis is that the fastest-growing AI companies of the past three years mostly ignored it. Harvey AI uses traditional seat-based pricing. It grew from roughly $50M to $190M ARR in 2025, with median seat counts doubling within 12 months, and reached an $8-11B valuation. Cursor uses tiered subscriptions with compute credits. It became arguably the fastest-growing B2B SaaS company in history, reaching approximately $1B ARR in 24 months. Microsoft Copilot charges $30/seat/month and reached 100M+ monthly active users.

The pricing model was not the differentiator. Product quality, distribution, and domain-specific value drove growth regardless of whether companies charged per seat or per usage unit. Harvey’s seat-based model created strong net revenue retention as firms expanded licenses. Cursor’s subscription model created predictable revenue while allowing usage flexibility. Neither needed to solve the quality specification problem because they sold access to tools, not deliverables.

And note what these companies do have that a raw inference wrapper does not: deep workflow integration, proprietary data pipelines, purpose-built UX, and switching costs that compound over time. The production of AI work (the inference itself) is reproducible. The production of AI products, with their context layers, workflow lock-in, and accumulated user data, is not.

The confusion at the heart of the thesis

The “sell work” thesis confuses the application of intelligence with the output of intelligence. Professional services solved the quality verification problem not by making outputs easier to evaluate, but by building accountability infrastructure around the producer: licenses, liability, long-term reputation. AI has the capability to produce the output but none of the scaffolding that lets a buyer trust it. It’s the most competent worker in the room who can’t sign the contract, can’t be sued, and won’t remember the engagement next week. Until that’s solved, the human professional’s role shifts from doing the cognitive work to bearing responsibility for it.

The winning AI companies will sell software that does work, maintaining software economics through workflow lock-in, data moats, and subscription revenue, rather than work done by software, which is just consulting economics (and consulting problems) in an API trenchcoat.

So what does monetization actually look like?

If everything is tokens of context in and tokens of intelligence out, where does the pricing power come from? Not from the tokens. Inference is a commodity, getting cheaper by the quarter, and your customers know it. Not from the “work,” which is nebulous, hard to specify, and invites the adversarial dynamics of every consulting engagement ever. The pricing power comes from the thing that makes the tokens useful in the first place: context.

As I argued in “Context is King,” the software stack is splitting into three layers: systems of record (databases), point solutions (the interface layer), and a new middle layer, the context layer, that holds the institutional knowledge telling AI agents what to do, in what order, and whether they’re allowed to do it.

The “sell work” thesis was an attempt to answer the monetization question, but it landed on the wrong answer because it focused on the output rather than the input. The provider sells the conditions for good work, not the work itself. This is why margins are a lagging indicator of where value accrues, not a leading one. Scarcity drives pricing more than anything else, and inference is abundant. Context is scarce. The “sell work” companies are stuck arguing about gross margins on a commodity. The context layer companies are building something that appreciates.
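What that context layer actually holds is easier to see in miniature. Below is a toy sketch of one workflow’s worth of institutional knowledge; every field name and rule here is invented for illustration:

```python
# A toy sketch of a context-layer policy: the institutional knowledge that
# tells an agent what to do, in what order, and whether it's allowed to do
# it. All structure, field names, and rules are invented for illustration.
REFUND_WORKFLOW = {
    "steps": [                  # what to do, in what order
        "verify_purchase",      # look up the order in the system of record
        "check_refund_policy",  # e.g. 30-day window, original payment method
        "issue_refund",
        "notify_customer",
    ],
    "permissions": {            # whether the agent is allowed to do it
        "issue_refund": {"max_amount_usd": 500, "above": "require_human_approval"},
    },
    "escalate_if": [            # tribal knowledge that never lived in a database
        "customer_is_enterprise",
        "chargeback_already_filed",
    ],
}

def refund_allowed(amount_usd: float) -> bool:
    # The agent consults the policy before acting; the policy, not the
    # model, is where the institutional judgment lives.
    limit = REFUND_WORKFLOW["permissions"]["issue_refund"]["max_amount_usd"]
    return amount_usd <= limit

print(refund_allowed(120.0))   # True
print(refund_allowed(2400.0))  # False: escalate to a human
```

The model that executes this workflow is interchangeable; the accumulated policy is not. That asymmetry is the appreciation the essay is pointing at.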
The real lesson

The thesis was right in one important way: AI does expand addressable markets from software budgets into labor budgets. The TAM expansion thesis holds. AI companies should think about replacing work, not merely augmenting workers. The context layer will capture much of the payroll budget that went to project management and operations.

We already have early evidence of this happening. A recent paper out of Ramp, “Payrolls to Prompts,” puts hard numbers on this. Researchers tracked firm-level spending on freelance labor marketplaces like Upwork and Fiverr alongside spending on AI model providers from Q3 2021 through Q3 2025. After ChatGPT launched, the firms most reliant on contracted online labor adopted AI earlier, spent more on it, and cut their freelance budgets by 15%. For every $1 firms stopped spending on human contractors, they spent just $0.03 on AI.

Keep in mind that this is across all firms; for the ones that have gone all in on AI, the effect is more dramatic: “More than 50% of businesses that had spent on online labor marketplaces in Q2 2022 spent 0% in Q2 2025, whereas roughly 80% of businesses spent between 0 and 5% of their total spend on AI model providers in Q2 2025.”

Companies are pulling directly from payroll and contractor budgets, exactly the kind of budget migration the “sell work” camp predicted. They just proposed the wrong business model to capture it. “Sell work” sounds like a paradigm shift, but it’s actually a regression to the services economics that software was invented to escape.

The frontier AI companies already understand this. They sell subscriptions that give customers access to intelligence-on-tap and let the customer define what “good work” means. They’ll maintain 70-90% gross margins, predictable revenue, and deep switching costs. They look like the best software companies in history, not outsourcing firms with better technology.

The market has also spoken. ~92% of AI companies now use hybrid or subscription pricing. Pure outcome-based models represent just 7%. The “sell work” era lasted about as long as it took buyers to realize they were paying a markup on tokens and to start demanding token-level pricing.

The next time someone tells you to “sell work, not software,” ask them one question: can you write a contract that specifies what “good work” looks like for your AI? If the answer involves a spectrum of quality rather than a binary outcome, you’re not selling work. You’re selling tokens at a markup. And your customers will figure that out faster than you’d like.

If you want to read more from me, subscribe at gettheleverage.com
