A Permanent 75% Cut That Resets AI Model Pricing
DeepSeek has turned a temporary promotion into a permanent overhaul of AI model pricing by cutting the price of its flagship V4-Pro model by 75%. Instead of reverting after May 31, the promotional rates will remain in place, so the model is now priced at 25% of its launch rates. V4-Pro now charges about USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, down from around USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00) respectively. Cache-hit input pricing has fallen even more steeply, in some cases to one-tenth of earlier levels, which matters most for long-running agents and repeated prompts that reuse the same context. For enterprises consuming tens or hundreds of billions of tokens each month, the lower rates could add up to savings of hundreds of thousands to millions of dollars annually and effectively reset expectations for AI model pricing worldwide.
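To put the per-token figures in perspective, the following is a minimal sketch of the savings arithmetic using the launch and post-cut rates quoted above. The monthly volumes are hypothetical, and the sketch assumes all input is billed at the uncached rate, since exact cache-hit prices are not given here.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """USD per million tokens, as quoted in the article."""
    input_uncached: float
    output: float


LAUNCH = Rates(input_uncached=1.74, output=3.48)
POST_CUT = Rates(input_uncached=0.435, output=0.87)


def monthly_cost(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Monthly bill in USD, treating every input token as uncached (assumption)."""
    total = input_tokens * rates.input_uncached + output_tokens * rates.output
    return total / TOKENS_PER_MILLION


def annual_savings(input_tokens: int, output_tokens: int) -> float:
    """Yearly difference in USD between launch and post-cut pricing."""
    old = monthly_cost(LAUNCH, input_tokens, output_tokens)
    new = monthly_cost(POST_CUT, input_tokens, output_tokens)
    return 12 * (old - new)


# Hypothetical workload: 100 billion input and 20 billion output tokens per month.
savings = annual_savings(100_000_000_000, 20_000_000_000)
print(f"Estimated annual savings: USD {savings:,.0f}")
```

At this hypothetical volume the difference comes to roughly USD 2.19 million per year. A workload with heavy cache reuse would save more than this sketch shows, because cached input is discounted further.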
Premium Capabilities at Budget Prices Intensify AI Price Wars
What makes the DeepSeek price cut so disruptive is that V4-Pro is positioned as a high-end system rather than a budget model. It is built on a Mixture-of-Experts architecture with an estimated 1.6 trillion total parameters, of which about 49 billion are activated during inference, to balance capability and compute efficiency. The model supports a one-million-token context window and can output up to 384,000 tokens in a single request, allowing entire codebases, legal archives, or scientific datasets to be processed in one pass. Estimates suggest that, after the cut, V4-Pro is between 20 and 35 times cheaper than some premium frontier models from leading Western providers for certain workloads. This combination of frontier-grade performance and aggressively low pricing directly escalates the AI price wars and pressures rivals to justify why their models cost so much more.
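As a rough illustration of what these specifications imply, the sketch below computes the share of parameters active during inference and the list-price cost of one maximal request at the post-cut rates. It assumes input and output limits can both be reached in the same request, which the figures above do not confirm, and it bills all input as uncached.

```python
# Figures quoted in the article; parameter counts are estimates.
TOTAL_PARAMS = 1.6e12          # total Mixture-of-Experts parameters
ACTIVE_PARAMS = 49e9           # parameters activated during inference
CONTEXT_TOKENS = 1_000_000     # maximum context window
MAX_OUTPUT_TOKENS = 384_000    # maximum output per request

# Post-cut prices in USD per million tokens.
INPUT_USD_PER_M = 0.435
OUTPUT_USD_PER_M = 0.87


def active_fraction() -> float:
    """Share of total parameters that are active during inference."""
    return ACTIVE_PARAMS / TOTAL_PARAMS


def request_cost_usd(input_tokens: int = CONTEXT_TOKENS,
                     output_tokens: int = MAX_OUTPUT_TOKENS) -> float:
    """List-price cost of one request, with all input billed as uncached."""
    cost = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return cost / 1_000_000


print(f"Active parameter share: {active_fraction():.2%}")
print(f"Cost of a maximal request: USD {request_cost_usd():.2f}")
```

Under these assumptions, about 3% of the parameters are active at a time, and a request that fills the full context and output limits costs under one US dollar at list price.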
From Smartphones to Cloud: The Commoditization Playbook Repeats
DeepSeek’s move echoes earlier waves of technology commoditization seen in smartphones and cloud computing. In both markets, hardware and infrastructure that were once premium and scarce became abundant, driving a race to the bottom on price while pushing vendors to differentiate on ecosystem, support, and specialized features. With AI model pricing, a similar pattern is emerging: once frontier models become widely available, price competition becomes the quickest lever for gaining share. As models like V4-Pro prove that high-context, reasoning-capable systems can be run more cheaply, the perceived “luxury” status of frontier AI starts to erode. Over time, this can normalize expectations that large-context, high-performance AI should be an everyday utility rather than a premium service, expanding accessibility but compressing margins and reshaping how value is captured across the stack.
Pressure on OpenAI, Anthropic and the Enterprise AI Business Model
The aggressive DeepSeek price cut forces incumbents such as OpenAI and Anthropic to confront a strategic dilemma: defend high prices to preserve profitability, or lower them to protect market share. With estimates indicating that V4-Pro can be 20 to 35 times cheaper than some premium frontier offerings for particular tasks, enterprise buyers now have a powerful benchmark when negotiating contracts and planning deployments. For many organisations, AI spending is shifting from experimental budgets to core operating expenses, which makes price a more decisive factor in vendor selection. Incumbents may respond with tiered offerings, usage-based discounts, or bundled services such as security, governance, and compliance to justify higher rates. Yet as AI price wars intensify, even value-added features may struggle to offset stark per-token gaps, potentially splitting the market between low-cost infrastructure-style providers and premium, service-heavy platforms.
Hardware Strategy and the Trade-Off Between Scale and Profit
Underlying DeepSeek’s aggressive pricing is its infrastructure pivot toward Huawei’s Ascend AI accelerators, particularly the Ascend 950 and 950PR supernode systems. By optimising the V4 series to run on these chips, DeepSeek reduces its dependence on more supply-constrained and often pricier alternatives, improving control over its cost base. Huawei reportedly aims to ship around 750,000 Ascend 950PR units during 2026, expanding available compute capacity. This hardware alignment gives DeepSeek room to prioritise market penetration over near-term margins, betting that lower prices will drive massive volume and ecosystem lock-in. The trade-off is clear: aggressive pricing may compress profit per token, but it also raises the barrier for rivals that lack similar hardware supply and cost advantages. As AI model pricing trends downward, the winners may be those who own or tightly integrate the compute layer, which allows them to sustain low-cost offerings without sacrificing long-term viability.
