From Temporary Promotion to Structural Price Reset
DeepSeek has converted what looked like a short-lived promotion into a structural reset of its AI model economics. The company confirmed that its 75% discount on the V4-Pro API will become the official rate after the current discount period ends on May 31, 2026 at 15:59 UTC, effectively turning the sale price into the real price. V4-Pro is now listed at USD 0.435 (approx. RM2.00) per million uncached input tokens and USD 0.87 (approx. RM4.00) per million output tokens, down from crossed-out reference prices of USD 1.74 (approx. RM8.00) and USD 3.48 (approx. RM16.00). Cached input falls to USD 0.003625 (approx. RM0.02) per million tokens, while the lighter V4-Flash is cheaper still at USD 0.14 (approx. RM0.65) input and USD 0.28 (approx. RM1.30) output per million tokens. This permanent DeepSeek API pricing move will be hard for other frontier providers to ignore.

A New Floor for AI Model Cost Reduction
Beyond headline percentages, DeepSeek’s restructuring sharply compresses the practical cost of running serious AI workloads. Input cache-hit pricing across its lineup has been cut to one tenth of original launch levels, an especially big deal for agents, coding copilots, customer support bots, and document-heavy systems that repeatedly reuse the same instructions or reference material. In yuan terms, the company says its API price range now runs from 0.025 yuan to 6 yuan per million tokens, down from 0.1 yuan to 24 yuan. Estimates in industry commentary suggest that after the cut, V4-Pro may be between 20 and 35 times cheaper than some premium models from incumbent leaders for certain workloads. For developers processing billions of tokens, those ratios can translate into savings worth millions annually, effectively redefining what “affordable AI development” means at frontier scale.
Accessibility Over Exclusivity: A Strategic Shift in AI Competition
DeepSeek is not discounting a stripped-down model; it is undercutting the market with a frontier-grade system. V4-Pro is described as a 1.6 trillion-parameter Mixture-of-Experts model that activates about 49 billion parameters per request, supports a one-million-token context window, and can output up to 384,000 tokens in a single call. V4-Flash sits alongside it as a faster, more economical sibling, creating a ladder where everyday tasks route to Flash and complex reasoning or agentic coding goes to Pro. This architecture, paired with aggressive DeepSeek API pricing, signals a pivot in competitive strategy: winning on accessibility instead of purely on feature exclusivity. Rather than treating long context windows and advanced reasoning as premium, rare capabilities, DeepSeek is normalizing them as baseline tools. Competitors must now justify why similar capabilities should still command much higher prices.
How Developers and Startups Gain New Room to Experiment
AI-native startups have struggled with a margin problem: products look like software but behave like metered infrastructure, where every support reply, generated report, or coding task carries a token bill. With V4-Pro’s 75% price cut now permanent and cache hits at one-tenth of launch pricing, founders suddenly face a different spreadsheet. Instead of agonizing over whether to stitch together narrow internal models or aggressively trim prompts, teams can keep richer context in the system and trial low-ticket AI features that were previously uneconomical. This unlocks affordable AI development for segments such as students, small businesses, solo operators, and price-sensitive international users. It also changes the build-versus-buy calculus: using a high-end external model becomes far more viable against fine-tuning or hosting costs, especially when routing between V4-Pro and V4-Flash further lowers the effective cost per application.
Risks, Infrastructure Bets, and the Broader Industry Recalibration
DeepSeek is consciously trading margin for reach at a time when frontier AI remains expensive to train and serve. Reports indicate that its V4 series is optimized to run on Ascend accelerators rather than relying primarily on more supply-constrained alternatives, an infrastructure shift that likely underpins confidence in sustaining lower prices. Management has reportedly told potential investors it will prioritize breakthrough research over short-term commercialization while pursuing a large funding round, and its founder has pledged to keep open-source model development in the mix even as the company pushes toward artificial general intelligence. For developers, price is only one dimension; reliability, latency, governance, regional restrictions, and trust still matter, and some enterprises may hesitate to move sensitive workflows immediately. Yet the price gap is now so wide that many teams will at least test DeepSeek, pressuring incumbents to respond with their own AI model cost reduction strategies.
