DeepSeek API pricing cut reshapes AI costs

What DeepSeek’s Permanent 75% Price Cut Really Means

DeepSeek’s decision to make its 75 percent V4-Pro API discount permanent is a structural reset of AI model cost expectations. It turns a short-term promotion into a long-term baseline that changes how developers and startups plan, price, and ship AI products. The V4-Pro price is now fixed at one quarter of its original launch level: DeepSeek’s pricing page lists USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, down from crossed-out reference prices of USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00). Cached input pricing reflects the same 75 percent cut, and the deepseek-v4-flash model remains cheaper still. For AI-native companies that pay per token, this permanent shift in DeepSeek API pricing is closer to a wholesale change in unit economics than a marketing event.
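To make the unit economics concrete, here is a minimal Python sketch that prices a workload at the current and reference rates quoted above. The workload itself (one million requests a month, 2,000 uncached input tokens and 500 output tokens each) is a hypothetical assumption for illustration, not a figure from DeepSeek, and the sketch ignores cached input entirely.

```python
# Prices in USD per million tokens, as quoted in the article.
V4_PRO_INPUT_USD = 0.435   # uncached input, current price
V4_PRO_OUTPUT_USD = 0.87   # output, current price
REF_INPUT_USD = 1.74       # crossed-out reference price, uncached input
REF_OUTPUT_USD = 3.48      # crossed-out reference price, output


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of one request given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload (an assumption, not from the article).
requests_per_month = 1_000_000
input_tokens, output_tokens = 2_000, 500

now = requests_per_month * request_cost(
    input_tokens, output_tokens, V4_PRO_INPUT_USD, V4_PRO_OUTPUT_USD
)
before = requests_per_month * request_cost(
    input_tokens, output_tokens, REF_INPUT_USD, REF_OUTPUT_USD
)
discount = 1 - now / before

print(f"monthly cost now: ${now:,.2f}")
print(f"monthly cost at reference prices: ${before:,.2f}")
print(f"effective discount: {discount:.0%}")
```

Under these assumed volumes the bill comes to about USD 1,305 a month, against about USD 5,220 at the reference prices, which is the same 75 percent cut applied to the whole workload.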

Cost Drivers: Chips, Cache, and the Economics Behind the Cut

The size and permanence of DeepSeek’s price reduction suggest more than a marketing strategy. According to Digital Trends, usage costs for V4-Pro now range from 0.025 to 6 yuan per million tokens, down from 0.1 to 24 yuan per million tokens, which hints at lower underlying infrastructure costs. Industry attention is focusing on Huawei’s Ascend AI chips: DeepSeek previously lacked them at scale, a shortage that forced higher prices for V4-Pro than for its cheaper Flash model. On top of that, DeepSeek slashed input cache-hit prices across its lineup to one tenth of launch levels, directly benefiting workloads that reuse large shared prompts: customer support flows, coding agents, document-heavy retrieval systems, and research tools. WinBuzzer notes that Huawei’s 2026 chip shipment targets may shape how long DeepSeek can sustain this cheaper V4-Pro access, but for now, the underlying economics appear to support the permanent discount.

How Developers and Startups Can Rebuild AI Budgets

For developers, the most important change is stability: the V4-Pro API discount is no longer a short-lived deal, but the standing rate card. WinBuzzer reports that for some workloads DeepSeek-V4-Pro may be 20 to 35 times cheaper than premium offerings from OpenAI, Anthropic, and Google, depending on prompt structure and output volume. Pricing at that level lets founders rebuild their AI budgets from the ground up. Instead of trimming prompts to save tokens, teams can keep more context in the model and still maintain healthy margins. Low-ticket AI features for students, small businesses, and solo operators, which once looked unprofitable, now become much easier to justify. Crucially, predictable pricing that will not reset after May 31 means long-context coding tools, search products, and agents can be planned on multi-year timelines rather than short promotional windows.

Competitive Shockwaves Across the AI Model Market

DeepSeek’s move does more than cut its own revenue per token; it forces a visible comparison with every rival API provider. Startup Fortune notes that every competitor now has to explain why its bill remains higher when DeepSeek-V4-Pro’s uncached input sits at USD 0.435 (approx. RM2.00) per million tokens and cached input at USD 0.003625 (approx. RM0.02) per million tokens. In parallel, WinBuzzer highlights that this pricing could pressure players like Kimi, Qwen, and MiniMax in budget-sensitive use cases, especially where long context and heavy reuse of prompts dominate costs. As Chinese firms scale performance while reducing inference costs, Digital Trends argues that a wider AI price war is likely, with global incumbents under pressure to answer a 75 percent permanent discount on a frontier-style model. The field is shifting from feature comparisons to a blunt question: whose tokens are cheapest at acceptable quality and reliability?
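The gap between the two input prices quoted above is what makes prompt reuse so valuable. As a rough sketch, the effective input price can be modeled as a weighted average of the cached and uncached rates. The cache-hit ratios below are hypothetical, and the linear blend is a simplifying assumption rather than DeepSeek’s documented billing logic.

```python
# Prices in USD per million input tokens, as quoted in the article.
UNCACHED_INPUT_USD = 0.435
CACHED_INPUT_USD = 0.003625


def blended_input_price(cache_hit_ratio):
    """Effective USD price per million input tokens for a given cache-hit share.

    Assumes a simple linear blend of cached and uncached tokens.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return (
        cache_hit_ratio * CACHED_INPUT_USD
        + (1 - cache_hit_ratio) * UNCACHED_INPUT_USD
    )


# Hypothetical hit ratios, chosen only for illustration.
for ratio in (0.0, 0.5, 0.9):
    price = blended_input_price(ratio)
    print(f"{ratio:.0%} cache hits -> ${price:.4f} per million input tokens")
```

At an assumed 90 percent hit rate, the effective input price falls to roughly USD 0.047 per million tokens, about a ninth of the uncached rate, which is why workloads built around large shared prompts see the biggest savings.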

Risks, Trade-offs, and What Comes Next for AI Buyers

Even with the aggressive V4-Pro API discount, price is only one dimension of model choice. Startup Fortune points out that teams still need to weigh reliability, latency, data policies, tool calling behavior, and trust around sensitive workloads when considering a provider. Some buyers may be cautious about concentrating critical systems on any single vendor, especially when future hardware supply, such as Huawei’s planned chip shipments, could affect capacity and throttling. At the same time, the price gap is now wide enough that many teams will test DeepSeek for at least part of their stack, particularly for experimental features, internal tools, or high-volume but low-risk automation. The strategic question for AI startups is how to blend DeepSeek’s cheaper tokens with other models to balance cost and resilience. For now, DeepSeek is trading margin for reach and, in the process, resetting the baseline for AI model costs everywhere.