When AI Becomes More Expensive Than Hiring
AI was sold as a way to automate work at scale, but the economics are inverting. Microsoft’s internal reversal on Anthropic’s Claude Code is one example: after initially giving employees free access, the company reportedly cancelled most direct licenses when the tool burned through its token budget far faster than expected. Uber’s leadership experienced a similar shock when the company consumed its entire coding-tool budget for the year in just four months, driven by internal incentives that rewarded heavy AI usage. These are not edge cases; they illustrate what some analysts now call an AI paradox. Despite falling per‑token prices, overall AI compute costs have surged because agentic systems use vastly more tokens per task. Nvidia executive Bryan Catanzaro has warned that, in aggregate, AI compute costs are now overtaking employee payrolls, meaning that replacing humans with AI can actually increase operating expenses.
AI Film Production: When 80% of the Budget Is GPU Time
Nowhere is the cost imbalance more vivid than in AI-driven filmmaking. The Cannes-premiering project “Hell Grind” reportedly cost USD 500,000 (approx. RM2,300,000) to produce, with about USD 400,000 (approx. RM1,840,000) spent purely on compute. In other words, roughly 80 percent of the entire budget went to GPU time rather than actors, sets, or cameras. Every character, environment, and explosion was generated by AI, turning what might once have been a people- and location-heavy production into a capital-intensive GPU pipeline. The workflow was punishingly compute-hungry: prompts averaging around 3,000 words per clip and tens of thousands of generated video segments, many ultimately discarded. This flips traditional film economics on their head. Instead of labor and logistics being the primary variable costs, AI infrastructure spending dominates, making experimental AI cinema viable mainly for teams that can afford massive compute bills.
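The budget split above can be checked with a few lines of arithmetic. The Python sketch below uses only the two USD figures reported in the text; the function and variable names are illustrative, not part of any production accounting.

```python
def compute_share(total_budget_usd: float, compute_spend_usd: float) -> float:
    """Return the fraction of a budget that went to compute."""
    if total_budget_usd <= 0:
        raise ValueError("total budget must be positive")
    return compute_spend_usd / total_budget_usd


# Figures as reported in the text (USD).
TOTAL_BUDGET_USD = 500_000
COMPUTE_SPEND_USD = 400_000

share = compute_share(TOTAL_BUDGET_USD, COMPUTE_SPEND_USD)
remaining_usd = TOTAL_BUDGET_USD - COMPUTE_SPEND_USD

print(f"Compute share of budget: {share:.0%}")
print(f"Left for everything else: USD {remaining_usd:,}")
```

On the reported figures, that leaves USD 100,000 for everything that is not GPU time.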

Rising GPU Pricing Trends and Agentic AI’s Token Hunger
Behind these headline projects lies a broader shift in AI infrastructure spending. Hyperscalers and model developers built their first GPU farms for occasional training jobs, not for round‑the‑clock inference across millions of users. As generative AI tools turn into code copilots, agents, and orchestration platforms, they are burning through tokens orders of magnitude faster than early chatbots. This demand has coincided with aggressive price increases at the model level. When OpenAI launched GPT‑5.5, it reportedly doubled per‑token pricing, and Google’s Gemini Flash 3.5 arrived at several times the cost of its Flash‑Lite predecessor. Flat‑rate seat pricing, once attractive for light usage, collapses as soon as teams consistently hit high token volumes, which is exactly what agentic workflows do. The result is a world where businesses pay escalating invoices even as vendors tout cheaper tokens and more efficient silicon.
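To see why flat‑rate seats break down under agentic load, it helps to find the token volume at which a seat's monthly fee equals the metered cost of the tokens consumed. The Python sketch below does that. Every number in it (the seat price, the per‑million‑token price, and the two usage profiles) is a hypothetical assumption chosen for illustration, not a figure from any vendor's price list.

```python
def metered_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of a token volume at a given per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


def break_even_tokens(seat_price_usd: float, price_per_million_usd: float) -> float:
    """Monthly token volume at which metered cost equals the flat seat fee."""
    return seat_price_usd / price_per_million_usd * 1_000_000


# Hypothetical assumptions, for illustration only.
SEAT_PRICE_USD = 20.0          # flat monthly fee per seat
PRICE_PER_MILLION_USD = 10.0   # blended metered price per million tokens
LIGHT_USER_TOKENS = 500_000    # occasional chat-style usage per month
AGENT_USER_TOKENS = 50_000_000 # agentic workflow usage per month

threshold = break_even_tokens(SEAT_PRICE_USD, PRICE_PER_MILLION_USD)
light_cost = metered_cost_usd(LIGHT_USER_TOKENS, PRICE_PER_MILLION_USD)
agent_cost = metered_cost_usd(AGENT_USER_TOKENS, PRICE_PER_MILLION_USD)

print(f"Break-even volume: {threshold:,.0f} tokens per month")
print(f"Light user metered cost: USD {light_cost:,.2f}")
print(f"Agentic user metered cost: USD {agent_cost:,.2f}")
```

Under these assumed numbers the light user sits well below the break-even volume, while the agentic user consumes many times the seat fee in tokens, which is the gap a flat rate cannot absorb.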
Guaranteed Capacity: Stability for the Few, Not Savings for the Many
OpenAI’s new Guaranteed Capacity program shows how providers are responding: with financial instruments aimed squarely at large enterprises. The offering lets customers lock in long‑term access to AI compute over one‑, two‑, or three‑year commitments, with discounts that scale with contract length. CEO Sam Altman has framed the move as a response to persistent capacity constraints and as a way to give enterprises predictable access while helping OpenAI plan massive infrastructure investments, including a reported target of USD 600 billion (approx. RM2,760,000,000,000) in total compute spending by 2030. Crucially, this is about stability, not democratization. Only organizations that can commit to significant, multi‑year volumes qualify, and any efficiency gains largely translate into provider margin relief. Smaller companies and individual creators still buy into metered, volatile pricing, absorbing higher AI compute costs without the benefit of guaranteed throughput or negotiated discounts.

A Widening Gap Between AI Haves and Have‑Nots
The current cost structure risks hardening a two‑tier AI economy. At the top, well‑funded enterprises sign multi‑billion‑dollar compute deals, participate in programs like Guaranteed Capacity, and ultimately convert upcoming hardware efficiency gains into better margins or bundled software offerings. At the bottom, startups, agencies, and independent creators pay retail for access to the same models, then watch usage spiral as they adopt more agentic, autonomous workflows. For them, AI compute costs can quickly exceed the expense of hiring human staff, yet they lack both the bargaining power and the capital to secure preferential terms. New generations of GPUs and accelerators promise lower cost per token, but those savings will arrive first as balance‑sheet relief for hyperscalers, not as visible price cuts. Unless pricing models evolve, AI may entrench existing power imbalances rather than level the playing field.
