DeepSeek’s Permanent 75% AI Price Cut Signals a New Era of Model Pricing

From Temporary Promotion to Permanent AI Model Pricing Shock

DeepSeek has made a limited-time discount on its DeepSeek V4-Pro model permanent. Instead of expiring after May 31, the 75% V4-Pro API discount is now the standard rate, cutting prices to a quarter of their launch level. The company’s pricing page lists V4-Pro at USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, down from crossed-out reference prices of USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00). Cached input now costs USD 0.003625 (approx. RM0.02) per million tokens after the same 75% reduction. This is not just another sale; it is a structural API cost reduction that resets expectations for what a frontier-style model can cost, and it leaves competitors to justify much higher invoices to cost-conscious developers.
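The arithmetic behind those figures is easy to check. The following is a minimal Python sketch that applies the 75% cut to the crossed-out reference prices quoted above and estimates the cost of a single request. The helper names (`apply_discount`, `request_cost`) and the example request size are illustrative assumptions, not part of any DeepSeek SDK or of the original reporting.

```python
# Reference (crossed-out) V4-Pro prices quoted in the article,
# in USD per million tokens.
REFERENCE_PRICES = {"input": 1.74, "output": 3.48}
DISCOUNT = 0.75  # the permanent 75% cut


def apply_discount(price: float, discount: float = DISCOUNT) -> float:
    """Return the price after the discount, rounded to 6 decimal places."""
    return round(price * (1.0 - discount), 6)


def request_cost(input_tokens: int, output_tokens: int, prices: dict) -> float:
    """Cost in USD of one request with no cache hits (hypothetical helper)."""
    return (
        input_tokens * prices["input"] + output_tokens * prices["output"]
    ) / 1_000_000


# Discounted prices derived from the reference prices.
current_prices = {kind: apply_discount(p) for kind, p in REFERENCE_PRICES.items()}

if __name__ == "__main__":
    print(current_prices)
    # Illustrative request size, not taken from the article.
    before = request_cost(20_000, 2_000, REFERENCE_PRICES)
    after = request_cost(20_000, 2_000, current_prices)
    print(f"before: ${before:.4f}  after: ${after:.4f}")
```

The derived prices match the listed USD 0.435 and USD 0.87, so any uncached workload costs one quarter of what it did at launch prices.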

Why a Cheaper V4-Pro Changes AI Development Costs and Product Design

For startups and independent developers, DeepSeek’s move directly reshapes AI development costs and the way products are architected. AI-native products often look like software but behave financially like infrastructure: every support reply, generated report, or coding action produces a token bill. Until now, this has squeezed gross margins and made low-priced or freemium AI features difficult to sustain. With V4-Pro’s permanent 75% cut and deeply reduced cache-hit pricing, it becomes much cheaper to keep large, persistent context windows: long documents, user histories, and reusable instructions can stay in the prompt instead of being aggressively trimmed or summarized. That opens space for richer agents, more capable coding assistants, and document-heavy workflows that were previously uneconomical. Founders weighing whether to fine-tune open models, build narrow in-house systems, or call external APIs now run very different spreadsheets when forecasting unit economics, experimentation budgets, and pricing strategies.
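To see why cache-hit pricing matters for long contexts, consider a rough Python sketch that blends cached and uncached input at the V4-Pro prices quoted in this article. The function name `long_context_cost`, the cache-hit ratio, and the token counts are assumptions chosen for illustration; how much of a prompt actually hits the cache depends on the provider's caching behaviour and on how the prompt is structured.

```python
# V4-Pro prices quoted in the article, in USD per million tokens.
UNCACHED_INPUT = 0.435
CACHED_INPUT = 0.003625
OUTPUT = 0.87


def long_context_cost(
    context_tokens: int, output_tokens: int, cache_hit_ratio: float
) -> float:
    """Estimate the USD cost of one request.

    cache_hit_ratio is the assumed share of input tokens billed at the
    cached rate (0.0 = nothing cached, 1.0 = everything cached).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached = context_tokens * cache_hit_ratio
    uncached = context_tokens - cached
    total = cached * CACHED_INPUT + uncached * UNCACHED_INPUT + output_tokens * OUTPUT
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 100,000-token context and a 1,000-token reply.
    for ratio in (0.0, 0.5, 0.9):
        cost = long_context_cost(100_000, 1_000, ratio)
        print(f"cache hit ratio {ratio:.0%}: ${cost:.6f} per request")
```

Under these assumptions, the uncached input dominates the bill, which is why keeping a stable, reusable prompt prefix becomes the main cost lever.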

Competitive Pressure: Undercutting Rivals and Reframing the AI Price War

DeepSeek’s new rate card drops V4-Pro into a price tier that looks unusually aggressive compared to many premium models. Some workloads may now be 20 to 35 times cheaper than top-end offerings from major rivals, depending on prompt and output volume. V4-Flash is even cheaper, at USD 0.14 (approx. RM0.65) per million input tokens and USD 0.28 (approx. RM1.30) per million output tokens, with still lower prices for cache hits. This puts immediate pressure on other providers that compete on cost-sensitive use cases, including emerging regional players like Kimi, Qwen, and MiniMax. For global developers, the message is clear: high-quality capabilities no longer need to come with premium, frontier-level price tags. While enterprises will still consider reliability, latency, data policies, tool calling, and trust, the sheer magnitude of the price gap ensures that many teams will at least benchmark DeepSeek’s APIs before committing to longer contracts elsewhere.
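For teams benchmarking the two tiers, a small Python sketch can turn the quoted per-token prices into a monthly estimate. It uses only the uncached V4-Pro and V4-Flash prices listed in this article; the helper names and the workload figures are hypothetical, and cache hits are ignored because the Flash cache-hit price is not given here.

```python
# Uncached prices quoted in the article, in USD per million tokens.
PRICES = {
    "V4-Pro": {"input": 0.435, "output": 0.87},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}


def monthly_cost(
    model: str, requests_per_month: int, input_tokens: int, output_tokens: int
) -> float:
    """Estimated monthly USD bill for a uniform workload with no cache hits."""
    p = PRICES[model]
    per_request = (
        input_tokens * p["input"] + output_tokens * p["output"]
    ) / 1_000_000
    return requests_per_month * per_request


def price_ratio(model_a: str, model_b: str, kind: str) -> float:
    """How many times more model_a costs than model_b for 'input' or 'output'."""
    return PRICES[model_a][kind] / PRICES[model_b][kind]


if __name__ == "__main__":
    # Hypothetical workload: one million requests a month,
    # each with 1,000 input tokens and 500 output tokens.
    for model in PRICES:
        print(f"{model}: ${monthly_cost(model, 1_000_000, 1_000, 500):,.2f} per month")
    print(f"Pro/Flash input ratio: {price_ratio('V4-Pro', 'V4-Flash', 'input'):.2f}")
```

At the listed prices, V4-Pro costs roughly 3.1 times as much as V4-Flash for both uncached input and output, so the choice between them turns mainly on whether the quality difference justifies that multiple for a given workload.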

Huawei’s AI Chips and the Infrastructure Behind DeepSeek’s Price Reset

Such an extreme, permanent API cost reduction raises the question: what changed under the hood? DeepSeek previously acknowledged that limited access to high-end compute forced V4-Pro pricing much higher than its cheaper Flash model, with Pro access costing up to 12 times more at launch due to constrained advanced hardware. Now, industry attention is turning to Huawei’s Ascend AI chips, particularly the Ascend 950 line, which has grown in importance after export restrictions limited access to certain competing accelerators. A more stable pipeline of local AI hardware may be enabling DeepSeek to run inference at significantly lower cost, making the permanent 75% cut more sustainable. Huawei’s 2026 chip shipment targets could shape how long DeepSeek can keep these prices, but for now, the move suggests that improved domestic compute supply is beginning to translate into tangible API savings for developers.

What This Means Next for Developers, Startups, and AI Strategy

The implications of DeepSeek’s pricing reset extend beyond a single model. For builders, lower model pricing offers a more predictable cost baseline for long-context, high-volume workloads like coding tools, document search, and customer support automation. It makes experimentation cheaper and could revive ideas that previously failed the unit-economics test. Startups serving students, small businesses, and international users may now add or enrich AI features without immediately sacrificing margin. At the same time, teams must weigh trade-offs around provider concentration, reliability, latency, and data governance, rather than reflexively migrating everything to the cheapest endpoint. Strategically, the permanent 75% cut signals a broader shift: as AI hardware supply matures and inference becomes cheaper, competitive advantage will move from pure model access to product design, workflow integration, and differentiated user experience built on top of increasingly affordable APIs.
