DeepSeek’s Permanent 75% Price Cut Puts New Pressure on AI API Pricing

From Temporary Promotion to Permanent DeepSeek API Pricing Reset

DeepSeek has converted what looked like a short-term promotion into a structural price reset for its flagship DeepSeek-V4-Pro model. The company now treats the 75% V4-Pro discount as permanent, fixing prices at one quarter of the original launch level once the current discount window ends on May 31, 2026 at 15:59 UTC. The updated rate card lists V4-Pro at USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, compared with crossed-out reference prices of USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00). Cached input is listed at USD 0.003625 (approx. RM0.02) per million tokens after the same 75% cut. This is not a marginal tweak: developers get a long-context model at a lasting price that sits far below many premium competitors, a deliberate and aggressive move in AI model cost reduction.
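To make the rate card concrete, here is a minimal Python sketch that checks the quoted prices against the 75% figure and estimates the cost of a single request. The rates are the ones quoted above; the function names and the example workload (200,000 input tokens, 4,000 output tokens) are illustrative assumptions, not DeepSeek figures.

```python
# Rates in USD per million tokens, copied from the rate card quoted above.
REFERENCE = {"input": 1.74, "output": 3.48}
DISCOUNTED = {"input": 0.435, "output": 0.87}


def discount_fraction(reference: float, discounted: float) -> float:
    """Return the fractional price cut, e.g. 0.75 for a 75% discount."""
    return 1.0 - discounted / reference


def request_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """USD cost of one request at per-million-token rates."""
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Fractional cut for each token type, rounded to hide floating-point noise.
cuts = {
    kind: round(discount_fraction(REFERENCE[kind], DISCOUNTED[kind]), 4)
    for kind in REFERENCE
}

# Hypothetical long-context request: 200,000 input and 4,000 output tokens.
before = request_cost(200_000, 4_000, REFERENCE)
after = request_cost(200_000, 4_000, DISCOUNTED)

print(f"cuts: {cuts}")
print(f"per-request cost: USD {before:.4f} before, USD {after:.4f} after")
```

Under these assumptions the same request drops from roughly USD 0.36 to roughly USD 0.09, all of it uncached input and output; cache hits would lower the figure further.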

How a Permanent V4-Pro Discount Changes AI Developer Pricing Strategies

Making the V4-Pro discount permanent fundamentally alters how startups and product teams think about AI economics. Previously, founders had to weigh fine-tuning open models or building narrow in-house systems against opaque or volatile API bills. Now, DeepSeek’s lower, stable token rates give teams a predictable baseline for long-context workloads such as coding agents, research assistants, document-heavy search, and customer support bots. DeepSeek has also cut input cache-hit prices across its model lineup to one tenth of launch pricing, which is especially important for agentic systems that reuse large prompt contexts all day. For AI-native startups whose gross margins were squeezed by usage-based token costs, this change could make low-ticket AI features viable and reduce the need to aggressively trim context or over-summarize user data. In short, the new DeepSeek API pricing reshapes product spreadsheets as much as it changes model benchmarks.
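The effect of cache hits on an agentic workload can be sketched with a simple weighted average. The two rates are the V4-Pro input prices quoted earlier in this article; the function name and the hit ratios are hypothetical, and real cache behaviour depends on how a provider matches prompt prefixes.

```python
# V4-Pro input rates in USD per million tokens, from the rate card quoted earlier.
UNCACHED_INPUT = 0.435
CACHED_INPUT = 0.003625


def blended_input_rate(cache_hit_ratio: float) -> float:
    """Average USD price per million input tokens for a given cache-hit ratio.

    cache_hit_ratio is the share of input tokens served from cache (0.0 to 1.0).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0.0 and 1.0")
    cached_part = cache_hit_ratio * CACHED_INPUT
    uncached_part = (1.0 - cache_hit_ratio) * UNCACHED_INPUT
    return cached_part + uncached_part


# Hypothetical hit ratios for an agent that keeps resending the same context.
for ratio in (0.0, 0.5, 0.9):
    rate = blended_input_rate(ratio)
    print(f"hit ratio {ratio:.0%}: USD {rate:.4f} per million input tokens")
```

At an assumed 90% hit ratio the blended input price works out to roughly USD 0.047 per million tokens, about a ninth of the uncached rate, which is why prompt reuse matters so much for systems that resend large contexts.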

Hardware Tailwinds: Huawei Chips and the Economics Behind Cheaper Tokens

Such a steep and permanent price drop raises a key question: what changed in DeepSeek’s cost base? While the company has not given a detailed breakdown, industry attention has focused on Huawei’s Ascend AI chips. DeepSeek previously acknowledged that limited access to advanced compute pushed V4-Pro pricing far above its cheaper Flash model, reportedly up to 12 times more at launch, because high-end hardware was constrained. Now, usage costs for V4-Pro are described as ranging from 0.025 to 6 yuan per million tokens depending on workload type, down from 0.1 to 24 yuan. Huawei’s growing AI chip ecosystem, including Ascend 950, appears to be easing those constraints by providing scalable, non-NVIDIA infrastructure. If chip supply continues to improve, DeepSeek can sustain lower inference costs, turning hardware advantages into structural AI model cost reduction rather than a short-lived promotion.

Competitive Shock: API Cost Comparison with Premium Models

In a market where API prices have been trending down, V4-Pro still stands out as aggressively cheap. Analysis of current rate cards suggests that for some workloads, V4-Pro can be roughly 20 to 35 times cheaper than premium offerings from providers like OpenAI, Anthropic, and Google, with recent GPT-5.5 list pricing used as a reference point for the gap. DeepSeek-V4-Flash is even cheaper than V4-Pro, at USD 0.14 (approx. RM0.65) per million input tokens and USD 0.28 (approx. RM1.30) per million output tokens, with cache hits at USD 0.0028 (approx. RM0.01). This kind of API cost comparison will be difficult for rivals to ignore. Even if competitors can justify higher prices on reliability, compliance, or ecosystem depth, procurement teams and startup founders will increasingly benchmark token costs against DeepSeek’s rate card when designing new AI products or renegotiating contracts.
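For teams benchmarking the two DeepSeek tiers against each other, a small rate table is enough for a first-pass estimate. The sketch below uses only the uncached input and output prices quoted in this article; the dictionary keys are illustrative labels rather than confirmed API model identifiers, and the monthly volume (500 million input tokens, 50 million output tokens) is a made-up workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens."""
    input_usd: float
    output_usd: float


# Prices as quoted in the article; the keys are illustrative labels only.
RATE_CARD = {
    "deepseek-v4-pro": Rates(input_usd=0.435, output_usd=0.87),
    "deepseek-v4-flash": Rates(input_usd=0.14, output_usd=0.28),
}


def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """USD cost for a month's volume, given in millions of tokens."""
    rates = RATE_CARD[model]
    return input_millions * rates.input_usd + output_millions * rates.output_usd


# Hypothetical monthly volume: 500M uncached input tokens, 50M output tokens.
pro = monthly_cost("deepseek-v4-pro", 500, 50)
flash = monthly_cost("deepseek-v4-flash", 500, 50)

print(f"V4-Pro:   USD {pro:,.2f}")
print(f"V4-Flash: USD {flash:,.2f}")
print(f"Pro costs {pro / flash:.2f}x Flash for this workload")
```

On this assumed volume V4-Pro comes to about USD 261 a month against about USD 84 for V4-Flash, a gap of roughly 3.1x. Adding a competitor is a matter of one more `RATE_CARD` entry taken from that provider's published prices.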

What This Means Next for Startups and the Wider AI Market

For developers, the immediate benefit is the ability to budget around a stable, low-cost long-context model instead of a discount that might vanish. Coding tools, document search engines, and high-volume support bots can now be scoped with more generous context windows and less fear of margin collapse. That said, price is only one dimension: reliability, latency, governance, tool-calling quality, and data handling policies still shape model selection, especially for sensitive or regulated workloads. Many teams will likely adopt a multi-model strategy, using DeepSeek where cost and context dominate and other providers where compliance or specific capabilities matter more. For the broader market, DeepSeek is clearly trading margin for reach, and its move could force competitors such as Kimi, Qwen, MiniMax, and Western hyperscalers to revisit their own cost structures. The AI price war is no longer theoretical; it is showing up on developers’ invoices.