DeepSeek AI pricing and the new cost war

What DeepSeek’s 75% Price Cut Really Means

DeepSeek has made the 75 percent price reduction on its V4-Pro AI model permanent, cutting inference costs to a quarter of launch levels. What started as a temporary promotion is now a standing offer, and it signals a new phase of price competition that could change how enterprises worldwide evaluate and adopt AI infrastructure. According to Technology.org, “DeepSeek’s V4-Pro now costs a quarter of its launch price, and the company says that rate is here to stay.” Updated API prices now range from 0.025 to 6 yuan per million tokens, or around USD 0.0035 (approx. RM0.02) to USD 0.83 (approx. RM3.80), depending on workload type. That is far below many flagship competitors and positions DeepSeek as a cost-focused alternative in a crowded AI model market.

How the New DeepSeek AI Pricing Undercuts Rivals

The new DeepSeek AI pricing reflects a clear strategy: win developers and enterprises by pushing per-token costs far below incumbent models. On its public pricing page, V4-Pro now ranges from USD 0.003625 (approx. RM0.02) to USD 0.87 (approx. RM4.00) per million tokens, down from USD 0.0145 (approx. RM0.07) to USD 3.48 (approx. RM16.00). These dollar figures differ slightly from the conversion quoted earlier (USD 0.0035 to USD 0.83), most likely because of different exchange-rate assumptions. A cut of that scale directly targets high-volume workloads that process millions of tokens daily, such as AI agents, customer support tools, and data-heavy copilots. Engadget notes that the discount was originally set to expire at the end of May, but DeepSeek chose to make it permanent. By undercutting OpenAI’s GPT-5 and Google’s Gemini 3.5 Flash on price, DeepSeek is signaling that the next competitive frontier is not only model quality but also sustained low AI infrastructure pricing for enterprise buyers.
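The 75 percent figure can be checked directly against the quoted dollar ranges. The short Python sketch below is illustrative only: it uses the per-million-token prices cited above, and the names LAUNCH_PRICE_USD, CURRENT_PRICE_USD, and reduction_percent are made up for this example rather than taken from any DeepSeek API.

```python
# Per-million-token USD prices as quoted in this article
# (low and high ends of the V4-Pro range).
LAUNCH_PRICE_USD = {"low": 0.0145, "high": 3.48}
CURRENT_PRICE_USD = {"low": 0.003625, "high": 0.87}


def reduction_percent(old: float, new: float) -> float:
    """Return the percentage price reduction when moving from old to new."""
    return (old - new) / old * 100


# Compute the reduction at both ends of the range, rounded to two decimals.
reductions = {
    tier: round(reduction_percent(LAUNCH_PRICE_USD[tier], CURRENT_PRICE_USD[tier]), 2)
    for tier in LAUNCH_PRICE_USD
}

for tier, pct in reductions.items():
    print(f"{tier} end of range: {pct}% cheaper than at launch")
```

Both ends of the range come out at a 75 percent reduction, which matches the claim that V4-Pro now costs a quarter of its launch price.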

Huawei Chips and the Economics Behind the Cut

Behind DeepSeek’s aggressive AI infrastructure pricing sits a changing hardware story anchored around Huawei’s Ascend 950 AI chips. DeepSeek previously warned that V4-Pro could cost up to 12 times more than its lighter Flash model because of “constraints in high-end compute capacity” and limited access to advanced AI hardware. Digital Trends reports that usage costs for V4-Pro have dropped from 0.1–24 yuan per million tokens to 0.025–6 yuan, which suggests that hardware supply pressures may be easing. Technology.org reports that V4-Pro leans on Huawei’s Ascend 950 supernodes and that DeepSeek had predicted prices would fall once those chips shipped in higher volume. The company has not explicitly credited Huawei for the permanent price cut, but the timing suggests that more reliable access to domestic AI chips is letting DeepSeek sustain lower inference costs without sacrificing performance or capacity.

Implications for Enterprise AI Adoption and Budgeting

For enterprise AI adoption, DeepSeek’s move reshapes how teams budget, experiment, and scale. Lower per-million-token costs make it easier to pilot multiple AI agents or long-context applications without exhausting budgets on inference alone. DeepSeek’s promise of an “era of cost-effective 1M context length” matters for enterprises that need large context windows for documents, conversations, and analytics but have been constrained by AI infrastructure pricing. With V4-Pro now permanently discounted, enterprises can compare price–performance across more vendors rather than defaulting to a single incumbent. The savings compound at scale: companies processing millions or billions of tokens monthly can redirect budget to integration, governance, and domain fine-tuning. In practical terms, DeepSeek’s pricing reduces the cost penalty of multi-model strategies, making it more realistic to pair different AI models for specialized tasks within the same stack.
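To make the budgeting effect concrete, here is a rough Python sketch of how much monthly spend a 75 percent cut could free up. The workload of 2 billion tokens a month is a hypothetical assumption, not a reported figure, and the calculation prices every token at the top of the V4-Pro range quoted in this article (USD 3.48 at launch, USD 0.87 now). Real bills would depend on the mix of workload types, so treat the output as an upper-bound illustration.

```python
TOKENS_PER_MILLION = 1_000_000

# Top-of-range V4-Pro prices in USD per million tokens, as quoted in this article.
LAUNCH_TOP_PRICE_USD = 3.48
CURRENT_TOP_PRICE_USD = 0.87

# Hypothetical workload: an assumption for illustration, not a reported figure.
MONTHLY_TOKENS = 2_000_000_000


def monthly_cost(tokens: int, price_per_million: float) -> float:
    """Estimate monthly spend for a token volume at a flat per-million-token price."""
    return round(tokens / TOKENS_PER_MILLION * price_per_million, 2)


cost_at_launch = monthly_cost(MONTHLY_TOKENS, LAUNCH_TOP_PRICE_USD)
cost_now = monthly_cost(MONTHLY_TOKENS, CURRENT_TOP_PRICE_USD)
freed_budget = round(cost_at_launch - cost_now, 2)

print(f"At launch pricing:  USD {cost_at_launch:,.2f}")
print(f"At current pricing: USD {cost_now:,.2f}")
print(f"Freed per month:    USD {freed_budget:,.2f}")
```

Under these assumptions the monthly bill falls from USD 6,960 to USD 1,740, leaving USD 5,220 that could go toward integration, governance, or fine-tuning instead of inference.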

How Competitors Might Respond to DeepSeek’s Cost Shock

DeepSeek’s decision to keep the 75 percent price cut turns a short-term promotion into a strategic shock that competitors cannot ignore. Digital Trends points out that if AI models keep getting cheaper while performance improves, the global AI pricing battle could turn far more aggressive, putting pressure on both regional startups and major Western providers. Engadget notes that DeepSeek’s discounting already undercuts flagship models such as GPT-5 and Gemini 3.5 Flash. Past accusations from Anthropic about “distillation attacks” show that the tensions are reputational as well as economic. In response, rivals may introduce tiered offerings, more generous free quotas, or targeted discounts for high-volume enterprise customers to stay competitive. Over time, low cost is likely to become a key selling point alongside safety, latency, and ecosystem tools, forcing providers to rethink how they monetize premium models.