AI Spending Pullback and the Tokenmaxxing Debate

From AI Hype to AI Spending Pullback

An AI spending pullback is the emerging phase in which companies that rushed to deploy generative tools pause or scale back investment. The focus moves to clear productivity gains, cost efficiency, and provable business outcomes rather than raw usage metrics or experimental projects. After a wave of optimism that AI would cut costs and speed up delivery, the bill has arrived. Uber rolled out Anthropic’s Claude Code to about 5,000 engineers and saw 95% monthly usage and AI involvement in 70% of code commits, yet it exhausted its annual AI tools budget in four months. Microsoft, a flagship AI champion, has started revoking internal Claude Code licenses and steering engineers toward its own GitHub Copilot CLI. These moves signal not a rejection of AI but a shift toward asking harder questions about AI return on investment.

Inside the Tokenmaxxing Debate

The tokenmaxxing debate is at the center of this reassessment. Tokenmaxxing describes a mindset in which teams push AI tool usage as high as possible, treating token consumption as a proxy for innovation and productivity. Uber COO Andrew Macdonald captured the backlash when he said he has not seen a direct link between higher token usage and higher productivity, adding that it is “very hard to draw a line” between usage stats and more useful consumer features. Meanwhile, firms such as Meta, Disney, JPMorgan and Visa track or even reward heavy AI use, with Visa reportedly citing monthly token use close to 2 trillion. Engineers warn that a large share of internal token spend may be waste. As Google’s Sundar Pichai noted, CIOs are increasingly worried that AI usage will blow their budgets apart.

Why Tech Giants Are Pumping the Brakes on AI Spending

Agentic Coding Costs Hit Uber, Microsoft and Salesforce

Agentic coding tools, which let AI plan and execute multi-step coding tasks, were supposed to be the next productivity leap. Instead, they have exposed fresh AI cost efficiency problems. Uber’s experience with Claude Code shows how variable token pricing makes it hard to budget at scale: as agents take on more work, usage spikes and the bills follow. At Microsoft, Claude Code became “perhaps a little too popular” with engineers, and that popularity translated into higher costs, prompting a shift back to GitHub Copilot CLI. At Salesforce, early token budgets for agentic coding now look wildly low compared with real usage. The pattern is clear: tokenmaxxing on agentic tools can deliver convenience for developers while turning AI spend into an open-ended line item, forcing leaders to rethink how they measure AI return on investment in software delivery.
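To put the budgeting problem in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes spending continues at the same average monthly rate, which is a simplification, and it normalizes the budget to 1.0 because no dollar figure is reported. The function name `projected_annual_spend` is purely illustrative.

```python
def projected_annual_spend(annual_budget: float, months_to_exhaust: float) -> float:
    """Extrapolate full-year spend from how quickly the budget ran out.

    Assumption: the average monthly burn rate stays constant. Real agentic
    usage may keep growing or be throttled, so treat this as a rough guide.
    """
    if months_to_exhaust <= 0:
        raise ValueError("months_to_exhaust must be positive")
    monthly_burn = annual_budget / months_to_exhaust
    return monthly_burn * 12


# Budget normalized to 1.0 (hypothetical unit; no dollar amount is reported).
budget = 1.0
spend = projected_annual_spend(budget, months_to_exhaust=4)
overrun_multiple = spend / budget
print(f"Projected spend: {overrun_multiple:.1f}x the annual budget")
# prints "Projected spend: 3.0x the annual budget"
```

On these assumptions, a budget that runs out in four months implies a run rate of roughly three times the planned annual figure, which is why a flat annual allocation sits uneasily with usage-based pricing.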

Customer-Facing AI: When Cost Cuts Damage Quality

Some of the sharpest AI reversals have come in customer service, where the limits of generative tools are most visible. Klarna cut about 700 roles and shifted two-thirds to three-quarters of customer interactions to an OpenAI-powered chatbot, then saw a 22% drop in satisfaction and began rehiring human agents. Commonwealth Bank of Australia replaced 45 call-centre agents with a voice bot, only to see call volumes and queues surge; it eventually reinstated staff and called the decision an error. Duolingo briefly tied performance reviews to AI tool usage before dropping that metric after staff questioned whether they were being pushed to use AI for its own sake. These cases show that chasing headcount reduction or token counts can backfire when AI cannot match human empathy, nuance, and problem solving in complex scenarios.

From Usage Metrics to Measurable AI Return on Investment

Across Silicon Valley and beyond, the narrative is shifting from “AI everywhere” to “AI that pays for itself.” Early phases were dominated by adoption targets, usage leaderboards and pride in big token bills. Now executives want clear, defensible AI return on investment. Engineering intelligence data shows that the top 10% of Claude Code users consumed about ten times as many tokens as the median developer but produced only about twice the output, undercutting the idea that heavy usage always equals value. Companies are starting to ask which workflows see faster cycle times, fewer bugs, higher customer satisfaction or lower support load because of AI, and which efforts are merely expensive experiments. As the AI spending pullback gathers pace, the winners are likely to be teams that move beyond tokenmaxxing and treat agentic coding and chatbots as tools that must earn their place in the cost base.
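The ten-times-tokens, twice-the-output comparison can be restated as tokens spent per unit of output. The short Python sketch below does that arithmetic. It takes the two approximate multiples at face value and assumes "output" is measured the same way for both groups, which the figures alone do not establish. The function name is illustrative.

```python
def tokens_per_output_multiple(token_multiple: float, output_multiple: float) -> float:
    """Return how many times more tokens a group spends per unit of output,
    relative to the baseline (median) developer.

    Both arguments are multiples of the median developer's figures.
    """
    if output_multiple <= 0:
        raise ValueError("output_multiple must be positive")
    return token_multiple / output_multiple


# Approximate multiples for the top 10% of users, as quoted in the text.
heavy_user_cost = tokens_per_output_multiple(token_multiple=10, output_multiple=2)
print(f"Heavy users spend about {heavy_user_cost:.0f}x the tokens per unit of output")
# prints "Heavy users spend about 5x the tokens per unit of output"
```

Read this way, the heaviest users get roughly a fifth of the output per token that the median developer does, assuming the reported multiples are representative.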