DeepSeek API pricing slashes AI model costs

What DeepSeek’s Permanent 75% Price Cut Means

DeepSeek’s decision to make its 75 percent V4-Pro API discount permanent resets AI model cost expectations for the long term, turning a temporary promotion into a structural change in how long-context AI services are priced and consumed. Instead of letting the sale expire, the company is fixing V4-Pro API pricing at one quarter of its launch level. According to Startup Fortune, DeepSeek-V4-Pro is listed at USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, down from crossed-out reference prices of USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00). Cached input is now USD 0.003625 (approx. RM0.02) per million tokens after the same 75 percent reduction, while V4-Flash is quoted at USD 0.14 (approx. RM0.65) per million input tokens and USD 0.28 (approx. RM1.30) per million output tokens. These figures set a new baseline for DeepSeek API pricing.
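To make the arithmetic concrete, here is a minimal Python sketch that checks the quoted V4-Pro prices against a 75 percent discount and compares the bill for one job before and after the cut. The prices are the ones quoted above. The function names and the 200,000-input, 20,000-output token workload are illustrative assumptions, not figures from DeepSeek or the cited reports.

```python
# USD per million tokens for DeepSeek-V4-Pro, as quoted in the article.
# These are hard-coded for illustration, not fetched from any API.
LAUNCH_PRICES = {"input": 1.74, "output": 3.48}
CURRENT_PRICES = {"input": 0.435, "output": 0.87}


def discounted(price, discount=0.75):
    """Apply a fractional discount to a per-million-token price."""
    return price * (1 - discount)


def job_cost(prices, input_tokens, output_tokens):
    """USD cost of one job, given uncached input and output token counts."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


# Hypothetical workload: 200,000 uncached input tokens, 20,000 output tokens.
before = job_cost(LAUNCH_PRICES, 200_000, 20_000)
after = job_cost(CURRENT_PRICES, 200_000, 20_000)

print(f"launch price:  ${before:.4f}")
print(f"current price: ${after:.4f}")
print(f"ratio:         {after / before:.2f}")
```

Because both the input and output rates fall by the same proportion, the ratio stays at one quarter whatever mix of input and output tokens a job uses.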

A New Economics for Developer AI Costs

DeepSeek’s move directly targets the economics of AI-native products by changing the math behind developer AI costs. Many founders face a mismatch: their apps look like software, but their margins resemble those of infrastructure businesses, because every support reply, report, agent task or coding job adds to a token bill. By cutting V4-Pro prices by 75 percent and trimming cache-hit input costs across the model lineup to one tenth of launch pricing, DeepSeek makes it cheaper to send larger prompts and richer context to long-context AI models. This matters most for agents, coding assistants, customer support bots and document-heavy workflows that repeatedly reuse the same instructions and reference material. For startups serving students, small firms or international users with limited budgets, low-ticket AI features become easier to justify. The new pricing can also tip the decision toward using an external model instead of fine-tuning or building narrow in-house systems.
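The effect of cache hits on a token bill can be sketched as a weighted average. The snippet below uses the V4-Pro uncached and cached input prices quoted earlier in this article. The cache-hit rates are hypothetical values chosen for illustration, and the function name is an assumption rather than anything from DeepSeek’s API.

```python
# USD per million input tokens for DeepSeek-V4-Pro, as quoted in the article.
UNCACHED_INPUT = 0.435
CACHED_INPUT = 0.003625


def blended_input_price(cache_hit_rate):
    """Effective USD price per million input tokens.

    cache_hit_rate is the fraction of input tokens billed as cache hits
    (0.0 to 1.0). This is a hypothetical workload property, not a value
    reported by DeepSeek.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return cache_hit_rate * CACHED_INPUT + (1 - cache_hit_rate) * UNCACHED_INPUT


# Illustrative hit rates: no reuse, half the prompt reused, mostly reused.
for rate in (0.0, 0.5, 0.9):
    price = blended_input_price(rate)
    print(f"hit rate {rate:.0%}: ${price:.6f} per million input tokens")
```

Under these assumptions, a workload that reuses 90 percent of its prompt pays roughly a tenth of the uncached input rate, which is why repeated instructions and reference material benefit most.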

How Huawei’s AI Chips Enable Aggressive Pricing

Behind the headline discount is a shift in hardware supply that appears to be enabling DeepSeek’s AI model cost reduction. Technology.org notes that DeepSeek’s V4-Pro API costs now range from 0.025 to 6 yuan per million tokens, or about USD 0.0035 (approx. RM0.02) to USD 0.83 (approx. RM3.80), down from 0.1 to 24 yuan. The model leans on Huawei’s Ascend 950 chips, and DeepSeek had earlier warned that V4-Pro could cost up to 12 times more than the lighter Flash version because of “constraints in high-end compute capacity.” As Ascend 950 supernodes arrive in greater volume, those constraints appear to be easing, giving DeepSeek more room to compress inference margins. Digital Trends reports the same 0.025 to 6 yuan band. Taken together, the reports suggest the cut is less a short-term promotion than a bet that improved AI hardware supply can sustain lower prices over time.

Global AI API Price War and Strategic Trade-Offs

DeepSeek’s permanent V4-Pro discount sharpens competitive pressure on other AI providers and signals a wider AI API price war. If one frontier-style model with a million-token context window is priced at a quarter of its launch rate, rivals must justify why their invoices remain higher. This could push incumbents to cut prices, introduce cheaper long-context tiers, or rely more heavily on routing between “Flash” and “Pro” style models to lower effective costs. At the same time, DeepSeek is trading margin for reach while pursuing ambitious goals in research and open-source development. Bloomberg reporting cited by Startup Fortune notes that management has discussed prioritizing breakthrough research over short-term commercialization even as it talks about a large funding round. Buyers still weigh latency, reliability, tooling, data policy and trust, but the cost gap is wide enough that many teams will run serious evaluations rather than treat DeepSeek as a niche option.
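The routing idea mentioned above can be illustrated with a rough sketch that compares sending every request to a Pro-tier model against sending only the hard ones there. The per-million-token prices are the V4-Flash and V4-Pro figures quoted earlier in this article. Everything else is an assumption for illustration: the `hard` flag, the routing rule, the workload, and the `v4-flash` and `v4-pro` keys, which are labels local to this sketch and not confirmed API model identifiers.

```python
from dataclasses import dataclass

# USD per million tokens, as quoted in the article. The dictionary keys are
# labels for this sketch only, not confirmed API model names.
PRICES = {
    "v4-flash": {"input": 0.14, "output": 0.28},
    "v4-pro": {"input": 0.435, "output": 0.87},
}


@dataclass
class Request:
    input_tokens: int
    output_tokens: int
    hard: bool  # hypothetical flag: does this request need the larger model?


def route(req):
    """Toy routing rule: hard requests go to Pro, everything else to Flash."""
    return "v4-pro" if req.hard else "v4-flash"


def cost(req, model):
    """USD cost of one request on the given model."""
    p = PRICES[model]
    return (req.input_tokens * p["input"] + req.output_tokens * p["output"]) / 1_000_000


def total_cost(requests, router):
    """Total USD cost when each request goes to the model the router picks."""
    return sum(cost(r, router(r)) for r in requests)


# Hypothetical workload: 8 easy and 2 hard requests of identical size.
workload = [Request(10_000, 1_000, hard=False)] * 8 + [Request(10_000, 1_000, hard=True)] * 2

all_pro = total_cost(workload, lambda r: "v4-pro")
routed = total_cost(workload, route)
print(f"all Pro: ${all_pro:.5f}")
print(f"routed:  ${routed:.5f}")
```

In practice the hard part is deciding which requests count as hard. This sketch takes that label as given and shows only the billing side of the trade-off.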