DeepSeek’s 75% Price Cut on V4-Pro Rewrites the Economics of Enterprise AI

From Temporary Discount to a New Baseline for AI Model Cost Reduction

DeepSeek has converted what was initially a time-limited promotion for its flagship V4-Pro model into permanent pricing, marking one of the most aggressive AI model cost reductions yet. Instead of reverting after May 31, the company locked in the 75% discount, keeping V4-Pro at 25% of its launch price. The updated rate card lists roughly USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, down from USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00). In yuan terms, the rate card now ranges from 0.025 to 6 yuan per million tokens, compared with 0.1 to 24 yuan previously. For enterprises running long-context workloads or AI agents at scale, this translates into substantial savings and positions DeepSeek V4-Pro as a high-end yet remarkably affordable AI API.
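To make the arithmetic concrete, here is a minimal Python sketch that prices a single request at the launch and current rates quoted above. The function names and the example workload (200,000 input tokens, 20,000 output tokens) are illustrative assumptions, not part of DeepSeek's API or documentation.

```python
# USD per million tokens, as quoted in the article (uncached input, output).
LAUNCH_RATES = {"input": 1.74, "output": 3.48}
CURRENT_RATES = {"input": 0.435, "output": 0.87}


def request_cost(input_tokens, output_tokens, rates):
    """Return the USD cost of one request, assuming no cache hits."""
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


def discount(old_rate, new_rate):
    """Return the fractional price cut between two rates."""
    return 1 - new_rate / old_rate


# Hypothetical workload, chosen only for illustration.
before = request_cost(200_000, 20_000, LAUNCH_RATES)
after = request_cost(200_000, 20_000, CURRENT_RATES)

print(f"launch:  USD {before:.4f}")
print(f"current: USD {after:.4f}")
print(f"input discount:  {discount(1.74, 0.435):.0%}")
print(f"output discount: {discount(3.48, 0.87):.0%}")
```

For this assumed workload the request drops from roughly USD 0.42 to roughly USD 0.10, and both rates work out to the same 75% reduction.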

Long-Context Performance at Budget Prices Changes Enterprise ROI Math

V4-Pro is not a cut-down or budget-tier engine; it is a high-end reasoning and coding model built on a Mixture-of-Experts architecture with an estimated 1.6 trillion total parameters, activating around 49 billion during inference. It supports a one-million-token context window and can output up to 384,000 tokens in a single request, enabling entire codebases, legal archives, and persistent agent memories to fit into a single session. Historically, such capabilities carried premium price tags that limited experimentation and large-scale deployment. With DeepSeek V4-Pro now reportedly 20 to 35 times cheaper than some premium models for certain workloads, enterprises can redesign their AI strategies: shifting from narrow, cost-constrained pilots to always-on AI assistants, retrieval-heavy document systems, and complex multi-agent workflows, without blowing through their enterprise AI budgets.
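As a rough illustration of what those limits mean in practice, the sketch below checks whether a job fits into one request. The two limits come from the figures above; the helper name is hypothetical, and whether output tokens count against the one-million-token window is an assumption exposed as a parameter, since the article does not say.

```python
MAX_CONTEXT_TOKENS = 1_000_000  # context window quoted in the article
MAX_OUTPUT_TOKENS = 384_000     # maximum output per request quoted in the article


def fits_in_single_request(prompt_tokens, requested_output_tokens,
                           output_counts_toward_context=True):
    """Return True if the job fits within the quoted limits.

    output_counts_toward_context is an assumption: set it to False if
    the provider budgets output separately from the context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    total = prompt_tokens
    if output_counts_toward_context:
        total += requested_output_tokens
    return total <= MAX_CONTEXT_TOKENS


# Hypothetical jobs: a large codebase prompt with a short or a long answer.
short_answer_ok = fits_in_single_request(900_000, 50_000)
long_answer_ok = fits_in_single_request(900_000, 200_000)
```

Under the default assumption the second job does not fit, because prompt and output together exceed one million tokens; a real deployment would use the provider's tokenizer to obtain the counts.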

Predictable Pricing Reshapes Enterprise AI Budgets and Planning Cycles

By turning a steep discount into standard pricing, DeepSeek has removed a major planning headache for finance and engineering leaders. Instead of budgeting around an imminent price reset, enterprise buyers can now treat the lower V4-Pro rates as a stable baseline for long-context AI workloads. The current pricing—USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, plus even lower cache-hit input rates—gives teams a clearer view of steady-state operating costs. This predictability is crucial for services that consume billions of tokens per month, such as customer support bots, code assistants, and retrieval-augmented search platforms. It also reduces the perceived risk of overcommitting to a single provider, encouraging enterprises to migrate more workloads to AI APIs instead of overinvesting in bespoke infrastructure that may quickly become cost-inefficient.
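A steady-state budget at these rates can be sketched as follows. The monthly volumes are hypothetical, and the cache-hit rate is deliberately left as a parameter with a placeholder value, because the article does not give the cache-hit price in USD.

```python
INPUT_RATE_USD = 0.435   # per million uncached input tokens (from the article)
OUTPUT_RATE_USD = 0.87   # per million output tokens (from the article)


def monthly_cost(input_tokens, output_tokens,
                 cache_hit_share=0.0, cache_hit_rate_usd=None):
    """Estimate monthly spend in USD.

    cache_hit_share is the fraction of input tokens served from cache.
    cache_hit_rate_usd (per million tokens) must be supplied by the
    caller; the article does not state it in USD.
    """
    if not 0.0 <= cache_hit_share <= 1.0:
        raise ValueError("cache_hit_share must be between 0 and 1")
    if cache_hit_share > 0 and cache_hit_rate_usd is None:
        raise ValueError("cache_hit_rate_usd is required when cache hits occur")
    cached = input_tokens * cache_hit_share
    uncached = input_tokens - cached
    cost = uncached * INPUT_RATE_USD + output_tokens * OUTPUT_RATE_USD
    if cached:
        cost += cached * cache_hit_rate_usd
    return cost / 1_000_000


# Hypothetical service: 5 billion input and 1 billion output tokens per month.
no_cache = monthly_cost(5_000_000_000, 1_000_000_000)

# Same volume with half the input cached at a placeholder rate of USD 0.10.
half_cached = monthly_cost(5_000_000_000, 1_000_000_000,
                           cache_hit_share=0.5, cache_hit_rate_usd=0.10)
```

With no caching, this assumed volume comes to about USD 3,045 per month at the current rates, which is the kind of figure a finance team can now carry forward without planning for a price reset.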

Huawei’s AI Chips and the Emerging Price War in Affordable AI APIs

DeepSeek has not fully detailed how it achieved such a dramatic cost reduction, but industry attention is turning to Huawei’s Ascend AI chips. Limited access to advanced hardware previously forced DeepSeek to price V4-Pro significantly higher than its Flash model. As Ascend 950 shipments ramp up, those constraints appear to be easing, giving DeepSeek cheaper and more reliable compute capacity. Lower inference costs can then be passed through as reduced API prices, which may explain why V4-Pro’s rates are now so far below those of many frontier models. This hardware-software combination is intensifying an emerging AI price war, putting pressure on rivals such as Kimi, Qwen, and MiniMax, as well as major international providers. If the trend continues, affordable AI APIs may become the norm, compelling competitors to revisit their own pricing or risk losing cost-sensitive enterprise workloads.

How Cheaper AI APIs Will Transform Enterprise AI Strategies

The permanent DeepSeek V4-Pro pricing shift has implications far beyond one vendor’s rate card. When a frontier-grade model with a one-million-token context becomes this affordable, the economics of build-versus-buy tilt decisively toward API consumption. Enterprises that once considered training specialist models or building on-premise clusters may now find it cheaper and faster to plug into V4-Pro for many use cases. Lower token prices—especially for cache hits—encourage designs that lean on persistent context, long-running agents, and dense retrieval pipelines, rather than aggressively pruning prompts to save costs. Over time, this can change how AI projects are scoped, funded, and measured: ROI calculations will increasingly account for lower variable costs and faster deployment cycles, enabling more experimental products, broader internal automation, and a wider rollout of AI-driven services across business units.
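The build-versus-buy comparison can be framed as a break-even volume: the monthly token count at which API spend equals the fixed cost of running in-house infrastructure. The sketch below is a simplification under stated assumptions. The fixed monthly cost and the output share are hypothetical inputs, cache hits are ignored, and the self-hosted option is treated as having no per-token cost.

```python
INPUT_RATE_USD = 0.435   # per million uncached input tokens (from the article)
OUTPUT_RATE_USD = 0.87   # per million output tokens (from the article)


def blended_rate(output_share):
    """USD per million tokens for a mix with the given share of output tokens."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return (1 - output_share) * INPUT_RATE_USD + output_share * OUTPUT_RATE_USD


def break_even_tokens(fixed_monthly_usd, output_share):
    """Monthly tokens at which API spend equals the fixed in-house cost."""
    return fixed_monthly_usd / blended_rate(output_share) * 1_000_000


# Hypothetical inputs: USD 50,000 per month of in-house cost, 20% output tokens.
rate = blended_rate(0.2)
threshold = break_even_tokens(50_000, 0.2)
print(f"blended rate: USD {rate:.3f} per million tokens")
print(f"break-even:   {threshold / 1e9:.1f} billion tokens per month")
```

Under these assumptions the API remains the cheaper option until usage approaches roughly 96 billion tokens per month, which illustrates why lower token prices push the break-even point further out.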
