DeepSeek Turns a Temporary Discount into a Structural Shock
DeepSeek has converted what began as a promotional discount into a permanent reset of AI API pricing. The company’s notice confirms that DeepSeek V4 Pro will now remain at 25% of its original standard rate, locking in a 75% reduction instead of reverting to higher prices after the promotion period. Enterprise developers are charged 6 Renminbi for every 1,000,000 output tokens, while consumer access via the original platform and mobile app remains free. This puts DeepSeek’s model inference costs on a dramatically different footing from premium rivals: OpenAI’s GPT 5.5 is listed at 30 USD (approx. RM138) per 1,000,000 output tokens at standard tiers and 180 USD (approx. RM828) at premium levels. Once the currencies are converted, that is a per-token gap on the order of tens of times at the standard tier and hundreds of times at the premium tier, and it signals that capable models can be priced as high-volume infrastructure instead of luxury software.
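To make the size of that gap concrete, the arithmetic can be sketched as follows. This is only an illustration: the three prices come from the figures quoted above, while the exchange rate `ASSUMED_CNY_PER_USD` is an assumption chosen for the example and not a figure from DeepSeek or OpenAI, so the resulting multiples should be read as rough orders of magnitude.

```python
# Output prices per 1,000,000 tokens, as quoted in the article.
DEEPSEEK_V4_PRO_CNY = 6.0
GPT_55_STANDARD_USD = 30.0
GPT_55_PREMIUM_USD = 180.0

# ASSUMPTION: illustrative exchange rate, not taken from the article.
ASSUMED_CNY_PER_USD = 7.0


def price_ratio(rival_usd: float, deepseek_cny: float, cny_per_usd: float) -> float:
    """Return how many times more the rival charges per output token."""
    deepseek_usd = deepseek_cny / cny_per_usd
    return rival_usd / deepseek_usd


standard_ratio = price_ratio(GPT_55_STANDARD_USD, DEEPSEEK_V4_PRO_CNY, ASSUMED_CNY_PER_USD)
premium_ratio = price_ratio(GPT_55_PREMIUM_USD, DEEPSEEK_V4_PRO_CNY, ASSUMED_CNY_PER_USD)

print(f"standard tier: about {standard_ratio:.0f}x")
print(f"premium tier: about {premium_ratio:.0f}x")
```

Changing the assumed exchange rate shifts the multiples proportionally, but within any plausible range the standard tier stays in the tens and the premium tier in the hundreds.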

MiMo V2.5 Pro Confirms the Downward Trajectory of Model Inference Costs
Xiaomi’s MiMo V2.5 Pro reinforces how quickly AI API pricing is compressing, especially for reasoning-focused systems. According to its API page, MiMo V2.5 Pro is offered at about 1 USD (approx. RM5) per 1,000,000 input tokens and 3 USD (approx. RM14) per 1,000,000 output tokens for prompts up to 256,000 tokens, with higher long‑context pricing above that point. This places MiMo directly in the same buying conversation as DeepSeek V4 Pro and demonstrates that low-cost access is no longer limited to simpler chat models. Reasoning-heavy workloads—coding agents, analysis tools and complex workflows with repeated tool calls—have historically carried token bills that constrained product design. MiMo’s pricing shows those constraints easing. Together with DeepSeek pricing, it signals a broader shift: capable reasoning models are starting to look like scalable infrastructure, pushing the entire market toward lower model inference costs.
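As a rough illustration of what those rates mean for a reasoning-heavy workload, the sketch below estimates the cost of an agent run made of repeated tool-call steps. The two rates and the 256,000-token threshold are the approximate figures quoted above; the function names and the example workload (20 steps of 50,000 input and 2,000 output tokens) are hypothetical. Because the long-context rates are not stated here, the sketch refuses to price prompts above the threshold instead of guessing.

```python
# Approximate rates quoted in the article (USD per 1,000,000 tokens),
# valid for prompts up to 256,000 tokens.
INPUT_USD_PER_MILLION = 1.0
OUTPUT_USD_PER_MILLION = 3.0
STANDARD_PROMPT_LIMIT = 256_000


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the standard-context rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > STANDARD_PROMPT_LIMIT:
        # The article does not give the long-context rates, so do not guess.
        raise ValueError("prompt exceeds the standard tier; long-context rate unknown")
    input_cost = input_tokens * INPUT_USD_PER_MILLION
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / 1_000_000


def agent_run_cost_usd(steps: list[tuple[int, int]]) -> float:
    """Sum the cost of every (input_tokens, output_tokens) step in a run."""
    return sum(request_cost_usd(inp, out) for inp, out in steps)


# Hypothetical workload: 20 tool-call steps, each resending a 50,000-token context.
example_run = [(50_000, 2_000)] * 20
print(f"estimated run cost: {agent_run_cost_usd(example_run):.2f} USD")
```

In this example the input side dominates the bill, since each step resends the accumulated context.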
How AI Cost Competition Rewrites the Economics for Startups and Developers
For startups, the new wave of AI cost competition is less about abstract market theory and more about concrete budgets. Lower AI API pricing from DeepSeek, MiMo and other emerging labs directly expands what small teams can afford to build and test. Researchers, coding assistant founders and workflow automation startups can now explore longer context windows, higher output volumes and more tool calls without the token bill consuming their business model. This particularly benefits teams relying on third‑party inference rather than training proprietary foundation models. With multiple providers undercutting each other on price, developers gain leverage: they can pilot several models, negotiate better terms and design multi‑model routing strategies that would have been prohibitively expensive a year ago. While latency, uptime and data policies still matter, the falling baseline of model inference costs gives builders more room to iterate before imposing strict usage caps on their users.
Pressure Mounts on OpenAI, Google and Anthropic to Rethink Pricing
As DeepSeek pricing and MiMo’s low tariffs reset expectations, established leaders such as OpenAI, Google and Anthropic face growing pressure to reassess their AI API pricing strategies. GPT 5.5’s standard and premium rates—30 USD (approx. RM138) and 180 USD (approx. RM828) per 1,000,000 output tokens respectively—stand in stark contrast to DeepSeek V4 Pro’s 6 Renminbi per 1,000,000 output tokens. Incumbents still enjoy advantages in brand recognition, ecosystem maturity and enterprise trust, and many customers are willing to pay a premium for compliance, support and perceived safety. Yet procurement teams are unlikely to ignore a gap of this magnitude. Even if contracts are not replaced overnight, every renewal conversation now includes cheaper, increasingly capable alternatives. To defend margins, the big providers will need to emphasize differentiated capabilities, bundled offerings and reliability—or move toward more granular, value‑aligned pricing that reflects the total cost of getting useful work done.
From Scarcity to Abundance: The Next Phase of AI API Economics
The shift from scarce, expensive AI capacity to abundant, competitively priced AI APIs is reshaping the entire stack. Falling base prices expand usage but also challenge middleware platforms that rely on charging a spread over underlying model costs. Aggregators and routing layers must now prove they offer more than simple pass‑through access, adding value via observability, failover, billing controls and governance. For application builders, the economic frontier is moving away from headline dollars per million tokens toward a fuller view of model inference costs: cache strategies, long‑context tiers, verbosity defaults, rate limits and reliability. The winners will likely be teams that architect systems to exploit rapid price compression—using multiple models, switching providers as economics change and aligning product design with cost curves. As DeepSeek and MiMo normalize aggressive pricing, AI APIs start to resemble classic cloud infrastructure, where efficiency and flexibility matter as much as raw capability.
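One way to reason about that fuller view of cost is to fold caching, verbosity and reliability into a single expected cost per completed task. The sketch below is a simplified model under stated assumptions, not any provider's billing formula: every name and number in it is hypothetical, cached input tokens are assumed to be billed at a separate discounted rate, and failed attempts are assumed to be retried independently until one succeeds.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Hypothetical per-model pricing and behavior; all values are illustrative."""
    input_usd_per_million: float
    cached_input_usd_per_million: float  # assumed discounted rate for cache hits
    output_usd_per_million: float
    cache_hit_rate: float   # fraction of input tokens served from cache (0..1)
    output_tokens: int      # typical verbosity per attempt
    success_rate: float     # probability that one attempt yields a usable result


def cost_per_completed_task(profile: ModelProfile, input_tokens: int) -> float:
    """Expected USD cost per successful task, counting retries."""
    if not 0 < profile.success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    blended_input_rate = (
        (1 - profile.cache_hit_rate) * profile.input_usd_per_million
        + profile.cache_hit_rate * profile.cached_input_usd_per_million
    )
    per_attempt = (
        input_tokens * blended_input_rate
        + profile.output_tokens * profile.output_usd_per_million
    ) / 1_000_000
    # Independent retries: expected number of attempts is 1 / success_rate.
    return per_attempt / profile.success_rate


# Two made-up profiles: a cheaper headline price versus a concise, cache-friendly model.
cheap_headline = ModelProfile(1.0, 1.0, 3.0, cache_hit_rate=0.0,
                              output_tokens=4_000, success_rate=0.5)
pricier_headline = ModelProfile(2.0, 0.5, 6.0, cache_hit_rate=0.75,
                                output_tokens=1_000, success_rate=1.0)

cheap_cost = cost_per_completed_task(cheap_headline, input_tokens=20_000)
pricier_cost = cost_per_completed_task(pricier_headline, input_tokens=20_000)
print(f"cheaper headline price: {cheap_cost:.4f} USD per completed task")
print(f"higher headline price:  {pricier_cost:.4f} USD per completed task")
```

With these made-up inputs, the model with twice the headline price ends up cheaper per completed task, which is the kind of reversal that a dollars-per-million-tokens comparison alone would miss.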
