AI Compute Costs Are Surging Past Payroll
AI was sold as a path to cheaper, more scalable work. Instead, many enterprises are discovering that AI compute costs can surpass what they spend on people. Microsoft’s internal move to cancel most direct licenses for Anthropic’s Claude Code after only six months is a telling signal: enthusiastic adoption quickly exhausted the allocated token budget, making routine use financially unsustainable. Uber reportedly burned through its entire 2026 AI coding tool budget within four months, driven by a culture and incentives that rewarded heavy AI usage. Analysts now warn that falling token prices are masking a harsher reality: total AI operational costs keep rising as organizations run more complex, agentic systems that consume far more tokens per task. Nvidia executive Bryan Catanzaro has noted that compute spending for AI can significantly exceed employee payroll, challenging the assumption that replacing human labor with AI is automatically economical.

The Cannes Case Study: When GPUs Eat 80% of the Budget
The film “Hell Grind,” premiering at Cannes, offers a vivid picture of GPU infrastructure expenses in practice. The project reportedly cost USD 500,000 (approx. RM2,300,000) to produce, with around USD 400,000 (approx. RM1,840,000) spent purely on compute, roughly 80% of the entire budget. Every character, environment, and explosion was generated by AI rather than shot on set, shifting costs from crews and locations to massive GPU workloads. The team used prompts averaging 3,000 words and generated more than 16,000 video clips for just the opening portion, discarding most attempts because of subtle visual flaws. Each failed iteration still incurred compute charges, turning experimentation into a major line item. Compared with traditional mid-budget films, whose budgets can reach tens of millions of dollars, the total is low, but the cost structure is inverted: GPU infrastructure expenses dominate while human and physical production costs shrink.
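The arithmetic behind those figures can be sketched in a few lines. The budget and compute totals below are the ones reported above; the per-attempt price and the number of attempts per usable clip are hypothetical placeholders, since the production's actual rates are not given.

```python
# Figures reported above.
total_budget_usd = 500_000
compute_spend_usd = 400_000
compute_share = compute_spend_usd / total_budget_usd  # fraction of budget spent on compute


def cost_per_kept_clip(cost_per_attempt, attempts_per_kept_clip):
    """Every attempt is billed, so discarded clips inflate the cost of each kept one."""
    if attempts_per_kept_clip < 1:
        raise ValueError("at least one attempt is needed per kept clip")
    return cost_per_attempt * attempts_per_kept_clip


# Hypothetical rates: $10 per generated clip, 20 attempts for each usable clip.
effective = cost_per_kept_clip(cost_per_attempt=10.0, attempts_per_kept_clip=20)

print(f"compute share of budget: {compute_share:.0%}")
print(f"effective cost per kept clip: ${effective:,.2f}")
```

Under these assumed rates, a clip that nominally costs $10 to generate effectively costs $200 once the discarded attempts are counted.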
Why Cheaper Tokens Don’t Equal Cheaper AI
Many executives assume that dropping token prices will make AI steadily cheaper. In reality, a shift toward agentic AI is driving total AI operational costs higher. Agentic systems break work into many small reasoning steps, invoke multiple tools, and call models repeatedly, dramatically increasing token volume per task. At firms like Uber, Meta, and Amazon, internal leaderboards and “tokenmaxxing” cultures encouraged employees to lean heavily on AI, rapidly inflating consumption. The result is the “AI paradox”: unit prices fall while bills explode. Total spend is the price per token multiplied by the number of tokens consumed, so a price cut is wiped out whenever volume grows faster than the price falls. Beyond a certain point, automating a workflow with AI can therefore cost more in compute than paying human staff to do it, especially for complex tasks that trigger long chains of AI calls. Organizations must evaluate not just price per token, but total usage patterns, task complexity, and where human judgment remains more cost-effective.
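The paradox is easy to see once the bill is written as price times volume. The sketch below uses hypothetical figures (none come from the companies named above) in which the per-token price falls by 75% while the monthly bill still grows, because each agentic task consumes 20 times as many tokens.

```python
def monthly_bill(price_per_million_tokens, tokens_per_task, tasks_per_month):
    """Total spend = unit price x tokens per task x number of tasks."""
    total_tokens = tokens_per_task * tasks_per_month
    return price_per_million_tokens * total_tokens / 1_000_000


# Hypothetical "before": single-shot prompts at a higher unit price.
before = monthly_bill(price_per_million_tokens=10.0,
                      tokens_per_task=2_000,
                      tasks_per_month=100_000)

# Hypothetical "after": the price is 75% lower, but each agentic task
# chains many model calls and uses 20x the tokens.
after = monthly_bill(price_per_million_tokens=2.5,
                     tokens_per_task=40_000,
                     tasks_per_month=100_000)

print(f"before: ${before:,.0f}  after: ${after:,.0f}  ratio: {after / before:.1f}x")
```

With these assumed inputs the bill rises from $2,000 to $10,000 a month even though every token is four times cheaper.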
Guaranteed Capacity and the Power-Law of Scale
As AI compute costs climb, large providers are introducing new ways to lock in infrastructure at scale. OpenAI’s Guaranteed Capacity offering lets enterprises secure long-term access to compute for one, two, or three years, with discounts increasing for longer commitments. CEO Sam Altman has framed this as a response to rising demand and capacity constraints, noting that advanced models will keep the world compute-limited for some time. OpenAI itself is reportedly targeting roughly USD 600 billion (approx. RM2,760,000,000,000) in total compute spending by 2030, underscoring how capital-intensive the field has become. Long-term capacity deals can stabilize AI compute costs for big buyers, but they also preserve a margin advantage for hyperscale providers that can afford massive upfront investments. Smaller organizations, by contrast, remain exposed to volatile on-demand pricing and may struggle to match the economics of the largest players.
How to Decide Between AI and Human Resources
For most businesses, the emerging lesson is that AI is not a blanket replacement for human labor. Instead, AI is economically attractive where workloads are high-volume and repetitive enough to amortize GPU infrastructure expenses, such as bulk content generation, code refactoring at scale, or standardized support interactions. For exploratory or low-volume use cases, human employees may be cheaper and more reliable. When budgeting for enterprise AI, leaders should model full lifecycle costs (compute, orchestration, monitoring, prompt engineering, and the cost of failed runs), not just subscription fees. They should also stress-test scenarios where usage doubles or triples, given how quickly token budgets have been exhausted at major tech firms. The most resilient strategies blend AI with human oversight, reserving expensive compute for tasks where it demonstrably outperforms people and keeping humans in the loop for nuanced, judgment-heavy work.
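One way to run that stress test is a small cost model that compares AI and human cost for the same workload as token usage grows. The sketch below is illustrative only: every input (task counts, token volumes, prices, failure rate, overhead, and staffing figures) is a hypothetical placeholder to be replaced with an organization's own numbers.

```python
def ai_monthly_cost(tasks, tokens_per_task, price_per_million_tokens,
                    failure_rate, fixed_overhead):
    """Compute spend plus fixed overhead (orchestration, monitoring, prompt engineering).

    Failed runs are assumed to be retried until one succeeds, so the expected
    number of billed attempts per completed task is 1 / (1 - failure_rate).
    """
    attempts = tasks / (1.0 - failure_rate)
    compute = attempts * tokens_per_task * price_per_million_tokens / 1_000_000
    return compute + fixed_overhead


def human_monthly_cost(tasks, tasks_per_person, cost_per_person):
    """Staff cost, rounding head count up to whole people."""
    people = -(-tasks // tasks_per_person)  # ceiling division
    return people * cost_per_person


# Hypothetical workload: 5,000 tasks a month, handled either way.
human = human_monthly_cost(tasks=5_000, tasks_per_person=1_000, cost_per_person=6_000)

results = []
for multiplier in (1, 2, 3):  # token usage at plan, doubled, tripled
    ai = ai_monthly_cost(tasks=5_000,
                         tokens_per_task=500_000 * multiplier,
                         price_per_million_tokens=5.0,
                         failure_rate=0.2,
                         fixed_overhead=4_000)
    results.append((multiplier, ai, human))
    cheaper = "AI" if ai < human else "human"
    print(f"usage x{multiplier}: AI ${ai:,.0f} vs human ${human:,.0f} -> {cheaper} cheaper")
```

With these assumed inputs, AI is cheaper at planned usage but becomes the more expensive option once token consumption per task doubles, which is the kind of crossover a budget review should look for.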
