AI coding agents and the new developer price floor

AI coding agents and a 72-hour wave of releases

AI coding agents are software tools that pair large language models with code editors, terminals, and automation so they can read repositories, propose plans, run commands, and modify files across multi-step development workflows with minimal human prompting. In the space of 72 hours, three releases reshaped what developers pay for intelligence. Cursor released its Composer 2.5 coding model, Anthropic announced new infrastructure features for Claude Managed Agents, and Alibaba made its Qwen 3.7 Max API available. At nearly the same time, Reasonix appeared as a DeepSeek-native terminal agent focused on cheaper long shell sessions. Taken together, these moves lower the effective price floor for frontier-level capability while raising expectations around plan mode, MCP support, and terminal-native options. Developers now have to evaluate not just raw quality, but also how long sessions and tool orchestration affect their monthly costs.

Cursor’s Composer 2.5 pushes price per task downward

Composer 2.5 is Cursor’s third-generation proprietary coding model, built on the open-source Kimi K2.5 base and trained on 25 times as many synthetic coding tasks as its March predecessor. Cursor now discloses that base model upfront, after earlier criticism of its transparency. The standard tier for Composer 2.5 is priced at USD 0.50 (approx. RM2.30) per million input tokens and USD 2.50 (approx. RM11.50) per million output tokens, with a faster default variant at USD 3.00 (approx. RM13.80) per million input tokens and USD 15.00 (approx. RM69.10) per million output tokens. According to Developer-Tech, Composer 2.5 reaches around 63% accuracy on CursorBench v3.1 at roughly USD 0.50 (approx. RM2.30) per task, while Claude Opus 4.7 at its default setting delivers comparable scores at about USD 7 (approx. RM32.20) per task, roughly 14 times as much. The benchmark may be vendor-controlled, but the order-of-magnitude pricing gap is now a core part of developer tool pricing debates.
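To make the per-million-token prices concrete, the short Python sketch below turns them into a per-task figure. The token counts are hypothetical and chosen only for illustration; they are not Cursor’s benchmark workload, and the function name `task_cost_usd` is an assumption of this example.

```python
# Composer 2.5 prices as quoted in the article, in USD per million tokens.
PRICES = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 3.00, "output": 15.00},
}

def task_cost_usd(tier, input_tokens, output_tokens):
    """Return the token cost of one task in USD for the given tier."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical task size (an assumption, not a measured workload).
INPUT_TOKENS = 400_000
OUTPUT_TOKENS = 120_000

standard_cost = task_cost_usd("standard", INPUT_TOKENS, OUTPUT_TOKENS)
fast_cost = task_cost_usd("fast", INPUT_TOKENS, OUTPUT_TOKENS)
print(f"standard: USD {standard_cost:.2f}, fast: USD {fast_cost:.2f}")

# Per-task figures reported by Developer-Tech: about USD 7 vs about USD 0.50.
reported_gap = 7 / 0.50
print(f"reported per-task gap: {reported_gap:.0f}x")
```

Because the fast variant is six times the standard price on both input and output, the ratio between the two tiers stays the same whatever token mix a task uses; only the absolute spend changes.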

Anthropic and Alibaba chase enterprise and API price sensitivity

Anthropic’s Code with Claude London event added two key features to Claude Managed Agents: self-hosted sandboxes and MCP tunnels. Self-hosted sandboxes, now in public beta, let teams run Claude agents and execute tools inside their own infrastructure while Anthropic maintains the orchestration loop. MCP tunnels, in research preview, route encrypted traffic through a lightweight gateway so agents can reach internal systems without a public endpoint. Both arrive with caveats and access controls, but they narrow the gap that had slowed enterprise deployments. In parallel, Alibaba’s Qwen 3.7 Max API went live on Alibaba Cloud Model Studio, adding another competitive option for high-end code generation. Even without headline-grabbing per-token figures, another frontier-scale model competing for developer workloads increases the pressure to cut API costs across the market and reinforces that AI coding agents will be judged on both capability and predictable spend.

Reasonix and the DeepSeek terminal agent cost story

Reasonix approaches the same market from a different angle: instead of promising a larger model, it offers a DeepSeek-native terminal coding agent tuned for cheaper long sessions. Built around DeepSeek prefix caching, Reasonix aims to avoid resending the same repository context and instructions on every turn, which can otherwise inflate API costs during long-running shell workflows. The project author frames the release as “MCP first-class · plan mode · cache-first loop · MIT licensed,” highlighting its cache-first design, Model Context Protocol support, and open-source license. According to WinBuzzer, a single-day study by the project reports about USD 12 (approx. RM55.30) in spend instead of about USD 61 (approx. RM281.20) under comparable usage, though this evidence comes from the tool’s own data. Reasonix requires Node.js 22 or later and targets developers who already live in the terminal and want a DeepSeek terminal agent that focuses on API cost reduction rather than editor lock-in.
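The cost argument behind a cache-first loop can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the prices, the session shape, and the function name `session_cost_usd` are hypothetical, not DeepSeek’s actual rates or Reasonix’s implementation. The model also assumes the shared prefix stays fixed for the whole session, which real sessions do not guarantee.

```python
def session_cost_usd(turns, prefix_tokens, new_tokens_per_turn,
                     miss_price, hit_price, cache_first):
    """Estimate input-token spend for one session, in USD.

    Prices are USD per million input tokens. The shared prefix
    (repository context plus instructions) is assumed to stay fixed.
    """
    total = 0.0
    for turn in range(turns):
        if cache_first and turn > 0:
            # Prefix was already sent once: bill it at the cache-hit rate.
            total += prefix_tokens * hit_price
        else:
            # First turn, or no caching: the prefix is billed in full.
            total += prefix_tokens * miss_price
        # New instructions and tool output are always billed in full.
        total += new_tokens_per_turn * miss_price
    return total / 1_000_000

# Hypothetical workload and prices, chosen only for illustration.
naive = session_cost_usd(50, 100_000, 2_000,
                         miss_price=1.00, hit_price=0.10, cache_first=False)
cached = session_cost_usd(50, 100_000, 2_000,
                          miss_price=1.00, hit_price=0.10, cache_first=True)
print(f"resend every turn: USD {naive:.2f}, cache-first: USD {cached:.2f}")

# Figures reported in the article (the project's own data): USD 12 vs USD 61.
reported_saving = 1 - 12 / 61
print(f"reported reduction: {reported_saving:.0%}")
```

In this model the saving grows with session length and with the share of each request that is repeated prefix, which is why the approach is pitched at long shell sessions rather than short one-off prompts.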

New default expectations and the race toward consolidation

The compressed 72-hour release window says as much about strategy as it does about features. Cursor, Anthropic, Alibaba, and Reasonix are not only competing on accuracy and latency; they are testing how far they can push developer tool pricing down while expanding what “standard” capability means. MCP support, plan mode, and agent access to terminals and repositories are becoming expected, not novel. At the same time, cost stories are diverging: Cursor focuses on lower per-task prices, Reasonix on cache-first session economics, and Anthropic on deployment control rather than raw tokens-per-dollar. For developers, this means AI coding agents can no longer be evaluated as a generic monthly subscription. They have to be measured by repository size, typical session length, and security constraints. With so many overlapping agents, the pace suggests a land grab ahead of eventual consolidation, where a smaller set of platforms may define the rules for cost and capability.