MilikMilik

Cursor’s Composer 2.5 Targets Claude with 10x Cheaper, Long-Running AI Coding Assistance

Cursor Strikes Back in the AI Coding Assistant Race

Cursor’s Composer 2.5 release marks a strategic push to reclaim ground in AI-assisted coding, where Claude Code has recently dominated. While Anthropic benefits from owning both the model and the product, Cursor has been paying for Claude inference even as it competes with it, creating mounting pressure on margins. Composer 2.5 is Cursor’s answer: an in-house AI coding assistant designed to cut costs while improving reliability on complex, long-running tasks. Built into Cursor’s IDE, the model acts like an autonomous coding agent that can apply changes across multiple files, refactor large codebases, and debug in context. Cursor still boasts strong adoption—billions of accepted lines of code per day and heavy enterprise usage—but perception has shifted toward agentic tools. Composer 2.5 is positioned to reset that narrative by demonstrating that smarter post-training can rival frontier models without requiring a new base model.

Benchmark Parity with Opus 4.7 at a Fraction of the Cost

On public benchmarks, Composer 2.5 now sits in the same conversation as leading frontier models. It scores 79.8% on SWE-Bench Multilingual, just behind Opus 4.7’s 80.5% and ahead of GPT-5.5’s 77.8%. On Terminal-Bench 2.0, it essentially matches Opus 4.7 (69.3% vs. 69.4%). On CursorBench v3.1, Cursor’s own harder-task suite that stresses agent-style workflows, Composer 2.5 posts 63.2%. While Opus 4.7 can edge ahead at its max setting, its default setting falls below Composer 2.5, and GPT-5.5 trails further. The bigger story is cost efficiency. Standard Composer 2.5 pricing is set at USD 0.50 (approx. RM2.30) per million input tokens and USD 2.50 (approx. RM11.40) per million output tokens, with an effort curve showing similar accuracy at under USD 1 (approx. RM4.60) per task, where rivals cost several dollars more.
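To make the per-task figure concrete, here is a minimal Python sketch that turns the quoted standard-tier prices into a cost for a single task. The prices come from the article; the token counts and the function name `task_cost_usd` are hypothetical, chosen only to illustrate the arithmetic, and real tasks will vary widely.

```python
# Standard Composer 2.5 prices quoted in the article, in USD per million tokens.
INPUT_USD_PER_M = 0.50
OUTPUT_USD_PER_M = 2.50


def task_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one task: tokens times the per-million price, for each direction."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical agent task (assumed numbers, not from the article):
# 1.2M input tokens read across many tool calls, 100k output tokens written.
cost = task_cost_usd(1_200_000, 100_000)
print(f"USD {cost:.2f}")  # 0.60 for input + 0.25 for output = USD 0.85
```

Under these assumed token counts the task lands below the USD 1 mark mentioned above. A task that reads or writes substantially more would not.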

Targeted Reinforcement Learning and 25x More Synthetic Training

Composer 2.5 keeps the same Kimi K2.5 base checkpoint as Composer 2 but leans heavily on post-training to close the gap with larger Claude-class models. Cursor says around 85% of total compute went into its own training and reinforcement learning on top of the open-source foundation. A key innovation is targeted RL with textual feedback: instead of waiting for a single reward at the end of a long rollout, the system injects localized hints right where the model goes wrong, such as a faulty tool call. These corrections become teacher signals, sharpening credit assignment for complex, multi-step tasks. On top of that, Cursor scaled synthetic task training by 25x, exposing the model to far more simulated workloads. Additional behavioral calibration further tunes communication style, coding consistency, and instruction-following, making the AI coding assistant feel more predictable and cooperative in real projects.
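The difference between a single end-of-rollout reward and localized textual feedback can be shown with a toy Python sketch. This is not Cursor's training code, and the article does not describe the implementation. The `Step` type, both function names, and the example rollout are hypothetical, and the sketch covers only the credit-assignment idea: an outcome-only signal gives every step the same scalar, while targeted feedback attaches a hint to the specific step that went wrong.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Step:
    action: str                 # what the agent did at this step
    ok: bool                    # verdict from a hypothetical per-step checker
    hint: Optional[str] = None  # textual correction, set only for bad steps


def terminal_reward_credit(steps: List[Step], final_reward: float) -> List[float]:
    """Outcome-only baseline: every step receives the same final scalar."""
    return [final_reward for _ in steps]


def targeted_feedback(steps: List[Step]) -> List[Tuple[int, str]]:
    """Localized signal: (step index, hint) for each step flagged as wrong."""
    return [(i, s.hint) for i, s in enumerate(steps) if not s.ok and s.hint]


# Hypothetical three-step rollout with one faulty tool call in the middle.
rollout = [
    Step("read_file('app.py')", ok=True),
    Step("run_tests(path='test/')", ok=False,
         hint="tests live in 'tests/', not 'test/'"),
    Step("edit_file('app.py')", ok=True),
]

print(terminal_reward_credit(rollout, 0.0))  # [0.0, 0.0, 0.0]
print(targeted_feedback(rollout))  # [(1, "tests live in 'tests/', not 'test/'")]
```

In the baseline, the two correct steps are penalized along with the faulty one. The targeted version isolates step 1, which is the kind of sharper credit assignment the paragraph above describes.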

Longer Coding Jobs, Heavier Post-Training, and a Claude Alternative

Beyond headline numbers, Composer 2.5 is explicitly tuned for longer, more demanding coding jobs. Cursor reports tougher reinforcement-learning environments and richer practice workloads designed to stretch planning depth and tool use across multi-file refactors, repeated tool calls, and extended debugging chains. Developers can already access Composer 2.5 inside Cursor and test those claims directly against live repositories instead of relying solely on benchmarks. Importantly, Cursor has maintained aggressive pricing on both the standard tier (USD 0.50, approx. RM2.30, per million input tokens and USD 2.50, approx. RM11.40, per million output tokens) and a faster default variant at USD 3.00 (approx. RM13.70) per million input tokens and USD 15.00 (approx. RM68.50) per million output tokens. Taken together, the benchmark parity, the claimed 10x cost efficiency, and stronger long-task reliability position Composer 2.5 as a credible Claude alternative for teams that want affordability without giving up capability.
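The two price points can be compared directly. The short Python sketch below uses only the USD prices quoted above. The `Tier` class, its field names, and the sample workload are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Cost in USD of a workload with the given token counts."""
        total = (input_tokens * self.input_usd_per_m
                 + output_tokens * self.output_usd_per_m)
        return total / 1_000_000


# Prices as quoted in the article.
STANDARD = Tier("standard", 0.50, 2.50)
FAST = Tier("fast", 3.00, 15.00)


def price_ratio(a: Tier, b: Tier) -> Tuple[float, float]:
    """How many times more tier `a` charges than tier `b`, per direction."""
    return (a.input_usd_per_m / b.input_usd_per_m,
            a.output_usd_per_m / b.output_usd_per_m)


print(price_ratio(FAST, STANDARD))  # (6.0, 6.0)

# Hypothetical workload: 1.2M input tokens, 100k output tokens.
print(f"{STANDARD.cost(1_200_000, 100_000):.2f}")  # 0.85
print(f"{FAST.cost(1_200_000, 100_000):.2f}")      # 5.10
```

Because the ratio is the same for input and output, the faster variant costs six times the standard tier for any mix of tokens at these list prices.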

What Comes Next for Cursor’s Model Strategy

Composer 2.5 also signals how Cursor sees its long-term model strategy. Rather than swapping base models with every release cycle, the company is betting that deep post-training on a stable foundation can keep pace with rapidly evolving competitors. The collaboration with SpaceXAI on a larger model trained with significantly more compute, and a separate effort on a from-scratch system using roughly ten times more total compute, hint at an eventual tiered lineup: a fast, cost-efficient workhorse like Composer 2.5 and heavier agents for the hardest enterprise workloads. For now, Cursor’s immediate challenge is proving that the new model’s benchmark gains translate into day-to-day productivity. Developers will judge Composer 2.5 on multi-file accuracy, response speed, and cost predictability—but if its real-world behavior mirrors its benchmarks, Cursor may successfully redefine itself as the most cost-efficient, high-capability AI coding assistant on the market.
