From AI Gold Rush to Token Shock
Enterprise AI costs refer to the cumulative spending by organizations on token-based access to large language models and AI tools, including per-token API fees, consumption-based pricing, and related usage charges that scale with every prompt, document, or coding task employees run through these systems. This spending is now spiraling beyond expectations because pricing is tied to usage volume instead of clear business outcomes, leaving companies exposed to runaway bills when AI is adopted broadly across the workforce. The bill has arrived, and it is ugly. Uber burned through its entire 2026 AI coding budget in roughly four months, with per‑engineer monthly API costs between USD 500 (approx. RM2,300) and USD 2,000 (approx. RM9,200). Finance leaders were caught off guard because early AI rollouts were treated as strategic experiments, not metered utilities. Companies had enthusiastically adopted AI when it first emerged, but they are now far more cautious about its use. Instead of asking “Where can we use AI?”, executives are asking “What is this doing to our P&L?”

The Consumer–Enterprise Pricing Trap
At the heart of the crisis is a lopsided token pricing model. Consumers often get powerful AI systems for free, while enterprises are asked to pay premium rates per token, becoming the primary source of revenue for the labs that need huge compute budgets to keep chasing AGI. Nikesh Arora, CEO of a major security firm and board member at Uber, argues that charging enterprises high token prices while consumers pay nothing is a trap that pushes businesses toward cheaper open‑source models rather than the frontier systems labs want them to adopt. His warning is not theoretical. When Uber’s engineers embraced AI coding tools, the usage metrics looked great (95 percent using AI monthly, 70 percent of commits AI‑driven), but the value to riders and revenue was unclear while the budget evaporated. Arora’s prescription has three parts, starting with a blunt demand: cut token pricing now so enterprises can experiment without immediate fear of burning a year’s budget in a quarter. Until model makers accept thinner margins per token, enterprise AI costs will keep undermining long‑term adoption.

Everyday Office Work: The Hidden Budget Killer
The most uncomfortable revelation is that the biggest AI invoices are not coming from elite engineering teams but from routine office work. Leaked audio from inside a global consulting giant shows executives alarmed at “soaring token spend,” with the heaviest use driven by workers converting PDFs into slides, reformatting documents into markdown, and automating tasks that used to cost nothing but time. Scaled across thousands of seats, those mundane tasks, not frontier experiments, are quietly generating the largest recurring bills. Uber’s own budget collapse was not triggered by a single moonshot project; widespread, heavy use of AI coding tools across the organization drained the annual allocation by April. That story is repeating elsewhere as AI pricing shifts from flat subscriptions to seat‑fee‑plus‑pre‑committed‑token models, according to FinOps‑style analysis. The unlimited AI buffet is closing as finance teams realize that “freeing people from drudgery” now has a visible, often painful, line item.

Consumption-Based Pricing Meets Murky ROI
AI vendors have shifted from familiar seat‑based licenses to consumption‑based pricing, where costs are driven by token usage rather than simple user counts. For developer teams, that has produced wildly variable bills: AI coding charges jumping from USD 20 (approx. RM90) or USD 100 (approx. RM460) to USD 2,000 (approx. RM9,200) or USD 5,000 (approx. RM23,000) per developer per month, with extreme cases hitting USD 20,000 (approx. RM92,000) in token fees. Yet, as Gartner’s Nitish Tyagi bluntly states, “There is no direct relation between the increase in token consumption and an increase in productivity gains.” Satya Nadella describes the management problem in even sharper terms: “The marginal cost of productivity improvement has to match the marginal cost of the token. That’s a management discipline.” He admits there has been “a lot” of token maxing inside his own company. Everybody goes and vibe codes and token maxes, but that is not how you get to transformative growth. Gartner predicts that by 2028, AI coding costs will overtake the average developer’s salary due to rising token consumption and the shift to consumption‑based licensing models. In some markets, Tyagi says, coding agents may already cost more than the developers using them.
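
Nadella's rule of thumb can be turned into a rough check: compare the extra value a team gets from additional token spend against the extra spend itself. The sketch below is illustrative only. The USD 100 and USD 2,000 monthly bills come from the range above, but the hours saved, the hourly rate, and the names `UsageSnapshot` and `marginal_return` are assumptions made up for the example, not figures or tooling from Microsoft or Gartner.

```python
from dataclasses import dataclass


@dataclass
class UsageSnapshot:
    """One developer's month. Both fields are hypothetical inputs."""
    token_cost_usd: float  # token spend for the month
    hours_saved: float     # estimated hours saved, however the team measures it


def marginal_return(before: UsageSnapshot, after: UsageSnapshot,
                    loaded_hourly_rate_usd: float) -> float:
    """Value gained per extra dollar of token spend between two months.

    Above 1.0: the extra spend paid for itself.
    Below 1.0: the extra tokens cost more than the time they saved.
    """
    extra_cost = after.token_cost_usd - before.token_cost_usd
    if extra_cost <= 0:
        raise ValueError("token spend did not increase; nothing to evaluate")
    extra_value = (after.hours_saved - before.hours_saved) * loaded_hourly_rate_usd
    return extra_value / extra_cost


# Assumed scenario: spend rises from USD 100 to USD 2,000 a month,
# while estimated time saved only doubles from 10 to 20 hours.
low = UsageSnapshot(token_cost_usd=100, hours_saved=10)
high = UsageSnapshot(token_cost_usd=2_000, hours_saved=20)
ratio = marginal_return(low, high, loaded_hourly_rate_usd=95)
print(f"{ratio:.2f}")  # prints 0.50
```

In this made-up scenario each extra dollar of tokens returns about 50 cents of saved time, which is the kind of gap Tyagi's remark points to. The hard part in practice is the hours-saved estimate, which this sketch simply takes as given.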

From Unlimited AI to Token Governance
The era of unmetered enterprise AI experiments is over. The aggressive AI adoption mandate is already being walked back at firms that have not admitted it publicly. The tab just arrived, and finance teams are choking on it. Microsoft’s Nadella is clear: cheaper tokens would make the value equation easier, but discipline is not optional — the math has to work even at current prices. Token governance is shaping up to be the cloud cost crisis of the AI era, and the playbook looks familiar. Expect what comes next to feel like mature cloud FinOps: token quotas, role‑based access tiers, consumption dashboards, and chargeback models landing in every department. FinOps frameworks argue that token economics must connect consumption directly to business outcomes, or organizations are paying too much with no receipt. Gartner urges teams to adopt context engineering and model routing, sending simpler, frequent tasks to smaller models and reserving frontier systems for complex, high‑value work. If CEOs get what they are now demanding — lower model prices plus real cost‑control tools — enterprise AI can move from budget shock to sustainable, outcome‑driven use. Without that shift, token‑priced AI risks pricing itself out of the very enterprises it wants to transform.
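
To make the governance playbook concrete, here is a minimal sketch of two of the controls named above: model routing and a per-department token quota with chargeback. Everything specific in it is an assumption for illustration: the tier names, the prices in `PRICE_PER_1K`, the task labels in `SIMPLE_TASKS`, and the `Department`, `route`, and `charge` helpers are hypothetical and do not reflect any vendor's API or real price list.

```python
from dataclasses import dataclass

# Hypothetical model tiers and prices (USD per 1,000 tokens); not real vendor rates.
PRICE_PER_1K = {"small": 0.0005, "frontier": 0.015}

# Hypothetical labels for simple, high-frequency work.
SIMPLE_TASKS = {"pdf_to_slides", "reformat_markdown", "summarize_email"}


@dataclass
class Department:
    name: str
    monthly_token_quota: int
    tokens_used: int = 0
    spend_usd: float = 0.0


def route(task_type: str) -> str:
    """Send simple, frequent tasks to the small model; everything else to frontier."""
    return "small" if task_type in SIMPLE_TASKS else "frontier"


def charge(dept: Department, task_type: str, tokens: int) -> str:
    """Enforce the quota, then record usage and cost against the department."""
    if dept.tokens_used + tokens > dept.monthly_token_quota:
        raise RuntimeError(f"{dept.name} would exceed its monthly token quota")
    model = route(task_type)
    dept.tokens_used += tokens
    dept.spend_usd += tokens / 1000 * PRICE_PER_1K[model]
    return model


ops = Department("operations", monthly_token_quota=1_000_000)
charge(ops, "pdf_to_slides", 200_000)        # routed to "small"
charge(ops, "architecture_review", 50_000)   # routed to "frontier"
print(ops.tokens_used, round(ops.spend_usd, 2))
```

With these assumed prices, the 50,000 frontier tokens cost several times more than the 200,000 small-model tokens, which is the asymmetry routing is meant to exploit. A production setup would more likely classify tasks by content than by a fixed label set, and would feed the per-department totals into a consumption dashboard.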




