What Tokenmaxxing Means—and Why the Backlash Is Growing
Tokenmaxxing is the practice of pushing large language models to process as many tokens as possible, through long prompts, large context windows, and lengthy outputs, in the hope of maximizing productivity. The link between higher token volumes, better work quality, and measurable business outcomes remains unclear. In theory, more AI tokens should mean richer reasoning, more context-aware answers, and faster output from coding agents or chat assistants. In practice, industry leaders are calling that assumption into question. Uber COO Andrew Macdonald said he has not seen a clear line between increased token usage and concrete gains like “25% more useful consumer features,” despite heavy internal experimentation. As AI token costs rise, this gap between expectations and proof is turning tokenmaxxing from a bragging point into a budget risk that boards and finance teams can no longer ignore.

Budgets Burn First, Productivity Proof Comes Later
Early adopters rushed into AI with generous internal budgets, confident that more tokens would translate into higher output. That confidence is fading. Uber reportedly burned through its annual AI budget in the first four months of the year, and engineering leaders complain that vast token volumes create more noise than value. Akshat Bubna of Modal estimated that “50% of internal token spend is completely useless, but right now it’s hard to know which 50%.” Google CEO Sundar Pichai said he has heard from chief information officers who are “so concerned about how much their companies are blowing through budgets,” and warned that the problem is likely to worsen. At the same time, Meta, Disney, JPMorgan, and Visa promote AI-heavy workflows and track or reward token use. The result is a tension between visible AI enthusiasm and a growing demand for harder AI ROI measurement before the next budget cycle.

Agentic Coding Investment: From Hype to Cost Discipline
Agentic coding—AI systems that autonomously write, modify, and test code—has become a major driver of AI token costs. Salesforce, for example, rolled out agentic coding across its engineering teams, only to find that its initial token budget was “an almost absurd underestimate.” The more capable the agent, the more tokens it consumes while iterating on tasks, calling tools, and re-running fixes. A report from engineering intelligence company Jellyfish shows how uneven the returns can be: the top 10% of Claude Code users consumed about ten times as many AI tokens as the median developer but produced only about twice the output. Rather than reward high consumption, the report argues that companies should tie AI spending to clear engineering metrics such as pull requests, defect rates, or cycle times, making agentic coding investment accountable to the same standards as any other tooling.
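
The Jellyfish comparison can be made concrete with a little arithmetic. The sketch below is illustrative only: it normalizes the median developer to one unit of tokens and one unit of output, plugs in the report's approximate multiples for the top 10% of users, and compares output per token. The function name and the normalization are assumptions made for this example, not part of the report.

```python
def output_per_token(output_units: float, tokens: float) -> float:
    """Return output produced per token consumed (hypothetical helper)."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return output_units / tokens


# Assumption: the median developer is normalized to 1 unit of output
# for 1 unit of tokens, so the figures below are relative multiples.
median = output_per_token(output_units=1.0, tokens=1.0)

# Top 10% of users per the report: about 10x the tokens, about 2x the output.
top_decile = output_per_token(output_units=2.0, tokens=10.0)

# Efficiency of the heaviest users relative to the median developer.
relative_efficiency = top_decile / median
print(f"{relative_efficiency:.2f}")  # 0.20
```

On those rough multiples, the heaviest users get about one-fifth as much output per token as the median developer. Swapping in real counts of pull requests and tokens would turn the same ratio into one of the engineering metrics the report recommends tracking.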

Rethinking AI ROI Measurement and the Bubble Question
The tokenmaxxing backlash is feeding a broader concern: are current AI valuations and spending levels sustainable, or is this another tech bubble? “Big Short” investor Michael Burry has described tokenmaxxing as a “crazy, rushed, temporary phase” and warned that Nvidia stock faces a high risk of an “aggressive” fall. The worry is that companies are paying for sheer volume of computation without matching it to steady revenue growth or durable productivity gains. Some investors and executives argue that smarter AI ROI measurement is the way forward. That means tracking outcomes such as features shipped, customer satisfaction, and developer throughput, then linking them directly to AI token costs. If firms can show reliable, unit-level economics from AI, tokenmaxxing gives way to targeted spending. If they cannot, the current enthusiasm for heavy AI infrastructure may look less like long-term investment and more like speculative excess.
