AI Cost Management and the Tokenmaxxing Backlash

From AI Gold Rush to AI Cost Management

The current pullback in AI spending marks a shift: large companies are moving from aggressive, experimental AI adoption toward disciplined AI cost management, with a focus on measurable productivity gains and sustainable returns on investment rather than raw usage and hype‑driven deployment. In the initial burst of enthusiasm, businesses wired AI into coding workflows, customer service, and internal tools, assuming that more tokens and more automation would mean lower costs and faster output. Instead, they hit rising cloud bills, confusing productivity metrics, and backlash from workers and customers. The trend now is not to abandon AI but to slow down, scrutinize bills, and question whether internal dashboards showing high AI usage say anything about enterprise AI ROI. That change in mood is reshaping which tools survive, how they are deployed, and how AI vendors must prove their value.

Uber, Microsoft, Klarna: When Usage Soared but ROI Stalled

Uber’s rollout of Anthropic’s Claude Code to roughly 5,000 engineers became the most visible cautionary tale. Usage was huge, with 95% of engineers using AI tools monthly and 70% of code commits described as AI‑driven, yet leadership could not link those numbers to clear consumer value. Uber COO Andrew Macdonald said, “It’s very hard to draw a line between one of those stats and ‘Okay, now we’re actually producing 25% more useful consumer features.’” Microsoft faced a similar issue when Claude Code became “perhaps a little too popular” with its own engineers, prompting a quiet license pullback and a shift toward GitHub Copilot CLI. Klarna replaced about 700 support roles with an OpenAI chatbot, which cut headcount but hurt satisfaction: it dropped by 22%, and the company rehired humans after realizing that the efficiency gains were undermining quality.

Why Big Tech Is Slowing AI Spend and Rethinking ROI

The Tokenmaxxing Backlash and the Limits of Raw Consumption

The AI spending pullback is tightly linked to criticism of tokenmaxxing: pushing as many AI tokens as possible through models, often to show ambition or chase perceived productivity, without clear evidence of enterprise AI ROI. Macdonald’s comment that the link between token use and output “is not there yet” went viral and echoed a wider concern that internal token bills are exploding. Visa has bragged that its monthly token spend is almost 2 trillion, while some engineers argue that “50% of internal token spend is completely useless.” Google’s Sundar Pichai has said chief information officers are worried about how quickly AI budgets are being burned through. Analysis from Jellyfish suggests diminishing returns: the top 10% of Claude Code users consume about 10 times the tokens of a median developer but produce only about twice the output, or roughly one‑fifth the output per token, which undercuts the case for unrestrained consumption.
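
The diminishing-returns arithmetic in the Jellyfish figures can be made concrete with a short sketch. The 10x and 2x ratios are the ones quoted above; the function name and the choice to normalize the median developer to 1.0 are illustrative assumptions, not part of the Jellyfish analysis.

```python
def relative_output_per_token(token_ratio: float, output_ratio: float) -> float:
    """Output per token of a heavy user, relative to a median developer.

    Both ratios are measured against a median developer normalized to 1.0
    (an assumption made here for illustration).
    """
    if token_ratio <= 0:
        raise ValueError("token_ratio must be positive")
    return output_ratio / token_ratio


# Ratios quoted in the text: about 10x the tokens, about 2x the output.
heavy_user_efficiency = relative_output_per_token(token_ratio=10.0, output_ratio=2.0)
print(f"Output per token vs. median: {heavy_user_efficiency:.0%}")
```

Under these ratios, each token spent by a top-decile user yields about a fifth of what a median developer's token yields.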

Why Measuring Enterprise AI ROI Is So Hard

Behind the AI spending pullback is a practical question: when and how should companies judge AI’s impact? Variable token pricing means that the more employees rely on AI, the less predictable the bill becomes, especially with agentic AI, where agents make many calls in the background. Uber’s experience shows that impressive internal metrics, such as agentic AI usage rising from 32% to 84% in a single month, do not automatically translate into more features that customers notice or pay for. Klarna and the Commonwealth Bank found that cutting staff and adding chatbots or voice bots can drive up call volumes and damage satisfaction, erasing the expected savings. Duolingo’s decision to stop evaluating employees on raw AI usage reflects a growing belief that tools should be judged on outcomes such as resolved tickets, shipped features, and retention, not on the number of prompts sent to a model.
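
One way to picture the difference between usage metrics and outcome metrics is a cost-per-outcome calculation. The sketch below is illustrative only: the per-million-token prices, the token counts, and the ticket counts are invented for the example and do not come from any vendor or company named here.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    input_tokens: int
    output_tokens: int
    resolved_tickets: int  # the outcome being measured (hypothetical)


def token_bill(usage: MonthlyUsage,
               input_price_per_million: float,
               output_price_per_million: float) -> float:
    """Variable bill for one month under simple per-token pricing."""
    return (usage.input_tokens / 1_000_000 * input_price_per_million
            + usage.output_tokens / 1_000_000 * output_price_per_million)


def cost_per_resolved_ticket(usage: MonthlyUsage,
                             input_price_per_million: float,
                             output_price_per_million: float) -> float:
    """Token spend divided by outcomes, rather than by prompts sent."""
    if usage.resolved_tickets <= 0:
        raise ValueError("no resolved tickets to attribute cost to")
    bill = token_bill(usage, input_price_per_million, output_price_per_million)
    return bill / usage.resolved_tickets


# Hypothetical months: token use doubles while resolved tickets rise 10%.
before = MonthlyUsage(input_tokens=400_000_000, output_tokens=50_000_000,
                      resolved_tickets=10_000)
after = MonthlyUsage(input_tokens=800_000_000, output_tokens=100_000_000,
                     resolved_tickets=11_000)

for label, month in (("before", before), ("after", after)):
    cost = cost_per_resolved_ticket(month, 3.0, 15.0)  # assumed prices
    print(f"{label}: {cost:.3f} per resolved ticket")
```

In this made-up scenario a usage dashboard would show AI activity doubling, while the cost of each resolved ticket rises, which is the kind of gap between consumption and outcomes the paragraph above describes.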

From Tokenmaxxing to Targeted, Agentic AI Adoption

As the tokenmaxxing phase fades, enterprises are moving toward narrower, outcome‑driven deployments and smarter AI cost management. Agentic coding tools, which can analyze codebases and act semi‑autonomously, remain attractive but are now evaluated on unit economics rather than novelty. At Salesforce, aggressive adoption of agentic coding pushed token use so far that the initial budget became “an almost absurd underestimate,” a warning to others planning similar rollouts. The emerging playbook is to cap usage, focus on high‑value workflows, and instrument tools so teams can track when AI helps close tickets faster, reduce bugs, or cut handoffs. Instead of mandating AI in every task, leaders are starting to ask where AI is the best option and where humans or simpler automation are cheaper and more reliable. That makes enterprise AI ROI the central metric for future investment decisions.
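
The "cap usage and instrument tools" part of that playbook can be sketched as a small budget tracker. This is a minimal illustration under stated assumptions: the class name, the workflow labels, and the hard-cap policy are hypothetical and do not describe how any company mentioned here manages its token budget.

```python
from collections import defaultdict


class TokenBudget:
    """Tracks token use per workflow against a fixed monthly cap (hypothetical policy)."""

    def __init__(self, monthly_cap_tokens: int) -> None:
        if monthly_cap_tokens <= 0:
            raise ValueError("monthly_cap_tokens must be positive")
        self.monthly_cap_tokens = monthly_cap_tokens
        self.used_by_workflow: dict[str, int] = defaultdict(int)

    @property
    def total_used(self) -> int:
        return sum(self.used_by_workflow.values())

    def try_spend(self, workflow: str, tokens: int) -> bool:
        """Record the spend and return True, or return False if it would exceed the cap."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.total_used + tokens > self.monthly_cap_tokens:
            return False
        self.used_by_workflow[workflow] += tokens
        return True


budget = TokenBudget(monthly_cap_tokens=1_000_000)
budget.try_spend("ticket_triage", 600_000)  # accepted
budget.try_spend("code_review", 300_000)    # accepted
budget.try_spend("code_review", 200_000)    # rejected: would exceed the cap

# Per-workflow totals show where the budget actually went.
for workflow, tokens in sorted(budget.used_by_workflow.items()):
    print(f"{workflow}: {tokens:,} tokens")
```

Recording spend by workflow rather than by user is a design choice made for this sketch: it lines token cost up with the outcomes the paragraph above mentions, such as tickets closed or bugs reduced.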