DeepSeek AI pricing and the 75% V4-Pro price cut

What DeepSeek’s 75% V4-Pro Price Cut Means

DeepSeek has made its 75% V4-Pro price cut permanent, turning a short-lived promotional discount into standard API pricing for high-volume, long-context workloads. V4-Pro now costs a quarter of its launch price, and the earlier deadline after which prices were set to rise again has been removed. According to Technology.org, “DeepSeek’s V4-Pro now costs a quarter of its launch price, and the company says that rate is here to stay.” The company’s updated price band runs from 0.025 to 6 yuan per million tokens, down from the previous 0.1 to 24 yuan per million tokens. For developers already building on V4-Pro, the change converts what looked like a temporary bargain into a predictable baseline and signals a more aggressive long-term pricing strategy from DeepSeek.
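As a quick sanity check on the figures above, the sketch below confirms that both ends of the quoted price band fall by the same 75%. The band values come from the article; the function and variable names are illustrative only.

```python
from fractions import Fraction

# Price bands quoted in the article, in yuan per million tokens (low end, high end).
OLD_BAND = (Fraction("0.1"), Fraction("24"))
NEW_BAND = (Fraction("0.025"), Fraction("6"))


def percent_cut(old: Fraction, new: Fraction) -> Fraction:
    """Return the reduction from old to new as a percentage of old."""
    return (old - new) / old * 100


# One percentage per end of the band; exact arithmetic avoids float rounding.
cuts = [percent_cut(old, new) for old, new in zip(OLD_BAND, NEW_BAND)]
print([float(c) for c in cuts])
```

Both ends of the band shrink by the same proportion, which is consistent with a uniform cut to a quarter of the launch price rather than a selective discount on one tier.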

DeepSeek’s Permanent 75% Price Cut Rewrites AI Model Economics

From Time-Limited Deal to Stable API Pricing

The most important shift is not the size of the V4-Pro price cut, but its permanence. DeepSeek has replaced an expiring 75% discount with standard API pricing, so the cheaper rate no longer reverts after May 31. WinBuzzer notes that V4-Pro now “sits at a quarter of its original price,” giving finance teams a single rate card instead of a looming reset. For API buyers, that change turns V4-Pro into a predictable option for long-context services such as coding tools, document search, and retrieval-heavy support bots. Long-context workloads can consume millions of tokens daily; moving from a temporary discount to a standing rate lets teams commit to launches, contracts, and capacity plans without budgeting for a sudden spike in AI model costs. The focus moves from short-term promotions to sustainable pricing.
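To make the budgeting point concrete, here is a minimal sketch comparing a year of spend under a standing rate with a year in which the discount expires partway through. The 4x multiplier follows from the "quarter of its original price" figure. The monthly spend, the calendar-year framing, and the function name are hypothetical, chosen only for illustration.

```python
from typing import Optional

# Launch price relative to the discounted price: a 75% cut means launch = 4x current.
REVERT_MULTIPLIER = 4


def annual_spend(monthly_at_discount: int, last_discount_month: Optional[int]) -> int:
    """Total 12-month spend for a fixed monthly token volume.

    monthly_at_discount: monthly bill at the discounted rate (any currency unit).
    last_discount_month: last month (1-12) billed at the discounted rate,
        or None if the discounted rate is permanent.
    """
    if last_discount_month is None:
        return 12 * monthly_at_discount
    discounted_months = last_discount_month
    full_price_months = 12 - last_discount_month
    return (discounted_months * monthly_at_discount
            + full_price_months * monthly_at_discount * REVERT_MULTIPLIER)


# Hypothetical team spending 1,000 per month at the discounted rate.
permanent = annual_spend(1_000, None)   # rate never reverts
with_cliff = annual_spend(1_000, 5)     # discount ends after May (month 5)
print(permanent, with_cliff)
```

Under these assumed numbers, the same workload costs nearly three times as much over the year when the discount lapses after May, which is the kind of gap that makes teams hesitate to commit to a promotional rate.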

Huawei’s Ascend Chips and the Economics Behind Cheaper Inference

DeepSeek has not fully explained how it can sustain such a steep price cut, but the sources point to a likely factor: Huawei’s Ascend 950 chips. Technology.org reports that the V4-Pro model “leans on Huawei’s Ascend 950 chips,” and earlier statements from DeepSeek suggested prices would fall once those chips shipped in volume. Digital Trends adds that limited access to high-end compute hardware had previously forced V4-Pro pricing far above DeepSeek’s cheaper Flash model, with Pro access reportedly costing up to 12 times more at launch. As Huawei’s AI hardware becomes a stronger option for large deployments, DeepSeek’s inference costs appear to be dropping, allowing lower token rates without collapsing margins. If supply of these chips continues to improve, V4-Pro’s cheaper API tier could be sustainable rather than a short-term loss leader.

How Lower AI Model Costs Change Developer and Enterprise Planning

DeepSeek’s new rate card directly affects how builders budget and architect AI systems. WinBuzzer lists V4-Pro’s current pricing at roughly USD 0.435 (about RM2.01) per million uncached input tokens and USD 0.87 (about RM4.03) per million output tokens, with cheaper rates for cached input. These levels can be 20 to 35 times cheaper than some premium models from OpenAI, Anthropic, and Google, depending on workload. For developers running high-volume, long-context workloads, this changes the cost comparison between providers: instead of trimming features to control token usage, teams can consider more generous context windows, richer retrieval results, or more frequent agent calls. For enterprises, predictable, lower token rates mean multi-quarter contracts and internal recharge models can assume a stable cost curve, rather than planning around promotional cliffs or uncertain future AI model costs.
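The per-million-token rates above translate into a monthly bill with simple arithmetic. The sketch below uses the two USD rates attributed to WinBuzzer; the daily token volumes and the 30-day month are hypothetical, and cached-input pricing is left out because the article does not state it.

```python
from decimal import Decimal

# Rates attributed to WinBuzzer in the article, in USD per million tokens.
INPUT_RATE_USD = Decimal("0.435")   # uncached input tokens
OUTPUT_RATE_USD = Decimal("0.87")   # output tokens
TOKENS_PER_MILLION = Decimal(1_000_000)


def monthly_cost_usd(input_tokens_per_day: int,
                     output_tokens_per_day: int,
                     days: int = 30) -> Decimal:
    """Estimate the monthly bill, treating all input as uncached."""
    daily = (Decimal(input_tokens_per_day) / TOKENS_PER_MILLION * INPUT_RATE_USD
             + Decimal(output_tokens_per_day) / TOKENS_PER_MILLION * OUTPUT_RATE_USD)
    return daily * days


# Hypothetical long-context workload: 50M input and 5M output tokens per day.
cost = monthly_cost_usd(50_000_000, 5_000_000)
print(f"Estimated monthly cost: USD {cost:.2f}")
```

Because the estimate treats every input token as uncached, it should be read as an upper bound for workloads that reuse long prompts; actual bills would depend on the cached-input rate and cache hit ratio.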

Competitive Pressure and the Next Phase of AI Pricing

DeepSeek’s permanent price cut may trigger a broader reset in AI economics. Digital Trends frames it as one of the boldest moves in the AI race so far, warning that “the global AI pricing battle could become far more aggressive over the next year.” WinBuzzer notes that V4-Pro’s lower token rates could pressure rivals such as Kimi, Qwen, and MiniMax when budget-sensitive buyers compare API prices. If one provider offers long-context, high-parameter models at a fraction of the per-million-token cost, others may be forced to respond by cutting list prices, introducing deeper volume discounts, or carving out cheaper tiers for cost-conscious users. At the same time, DeepSeek depends on Huawei meeting its 2026 chip shipment targets, so how long it can hold this price edge will determine whether the cut remains an isolated move or becomes the new standard for AI model costs.