AI coding agents: pricing shifts and new tools

AI coding agents and a week that reset developer expectations

AI coding agents are software tools that combine large language models, file awareness, and automated workflows to propose, edit, and run code on behalf of developers throughout a project’s lifecycle. In the span of 72 hours, three new or upgraded offerings (Grok Build from xAI, Cursor’s Composer 2.5, and Anthropic’s latest Claude Managed Agents features) reset expectations for what developers pay for this level of intelligence. Cursor released a new in‑house model, Anthropic shipped infrastructure that addresses security and deployment constraints, and Grok Build arrived as a full coding CLI tied to X subscriptions. Together they mark a shift from the “AI assistant” as a premium add‑on toward AI as an expected part of the developer toolchain, forcing teams to examine not only raw token costs but also the total cost of integrating these agents into everyday coding work.

Grok Build’s subscription route: AI coding inside the X ecosystem

Grok Build is xAI’s new coding agent and command‑line tool aimed at professional software engineering and complex automation work. It is currently in early beta and available to all SuperGrok and X Premium Plus subscribers, which means access is bundled into an existing subscription rather than sold as a separate metered developer tool. Its features include Plan Mode for complex tasks, in which developers can approve or rewrite the plan before execution, and support for AGENTS.md, plugins, hooks, skills, and MCP servers in an existing repository. It also supports subagents that run in parallel for larger jobs, deep worktree integrations, and a headless mode for embedding agents into scripts and automations. The tool spans the whole workflow: code search, multi‑file edits, Git integration, web search, terminal execution, sandboxed runs, and code review. That breadth signals that xAI wants Grok to be a primary development environment rather than a sidecar assistant.

Cursor’s Composer 2.5 and the price compression of coding intelligence

Cursor’s Composer 2.5 is its third‑generation proprietary coding model, trained on 25 times as many synthetic coding tasks as its March release and built on the open‑source Kimi K2.5 base. Cursor publicly named the base model this time, after facing criticism for not disclosing the foundation in the previous launch. The standard tier runs at USD 0.50 (approx. RM2.30) per million input tokens and USD 2.50 (approx. RM11.50) per million output tokens, while a faster default variant costs USD 3.00 (approx. RM13.80) per million input tokens and USD 15.00 (approx. RM69.00) per million output tokens. On CursorBench v3.1, Cursor’s own benchmark, Composer 2.5 scores about 63% accuracy at roughly USD 0.50 (approx. RM2.30) per task, whereas Claude Opus 4.7 at its default setting lands near USD 7.00 (approx. RM32.20) per task. That gap of roughly 14 times, even allowing for the bias of a vendor‑run benchmark, shows how quickly developer tool pricing for frontier‑level AI coding has compressed.
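To make the per-million-token prices concrete, the following minimal Python sketch computes the cost of a single task on each Composer 2.5 tier. The prices come from the figures quoted above; the workload size (200,000 input tokens and 20,000 output tokens) and the function and variable names are hypothetical, chosen only for illustration.

```python
def task_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one task, given prices quoted per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Prices quoted in the article, in USD per million tokens.
STANDARD = {"input": 0.50, "output": 2.50}
FAST = {"input": 3.00, "output": 15.00}

# Hypothetical workload, not taken from the article.
INPUT_TOKENS = 200_000
OUTPUT_TOKENS = 20_000

standard_cost = task_cost_usd(
    INPUT_TOKENS, OUTPUT_TOKENS, STANDARD["input"], STANDARD["output"]
)
fast_cost = task_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, FAST["input"], FAST["output"])

print(f"standard tier: USD {standard_cost:.2f}")
print(f"fast tier:     USD {fast_cost:.2f}")
print(f"fast / standard: {fast_cost / standard_cost:.1f}x")
```

Because the faster variant is priced at six times the standard tier for both input and output tokens, the ratio between the two stays at about six whatever the mix of input and output in a given task.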

Anthropic and Alibaba: infrastructure, protocols, and a closed flagship model

Anthropic’s Code with Claude London event focused less on a new model and more on infrastructure that makes Claude Managed Agents easier to deploy at scale. Self‑hosted sandboxes, now in public beta, let teams run Claude tools inside their own infrastructure while leaving the agent orchestration loop with Anthropic. MCP tunnels, in research preview, provide encrypted connections into private internal systems without exposing public endpoints, though they are offered as‑is and require access approval. On a separate track, Alibaba launched Qwen 3.7 Max, a closed‑weight flagship model, as an API on Alibaba Cloud Model Studio. It is priced at USD 2.50 (approx. RM11.50) per million input tokens and USD 7.50 (approx. RM34.50) per million output tokens, with a 90% discount on cached input tokens. Extended thinking is enabled by default, which makes sessions verbose unless developers cap max_tokens, and the model natively supports the Anthropic Messages protocol, so it can be dropped into existing Claude Code setups.
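The effect of the cached-input discount and of a max_tokens cap on a Qwen 3.7 Max bill can be sketched in a few lines of Python. The prices and the 90% discount are the figures quoted above. The request sizes, the function names, and the assumption that a max_tokens cap bounds the number of billable output tokens are illustrative only and are not taken from Alibaba’s documentation.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 7.50
CACHED_INPUT_DISCOUNT = 0.90  # cached input tokens cost 10% of the list price


def request_cost_usd(input_tokens, cached_input_tokens, output_tokens):
    """Return the USD cost of one request when part of the input is cached."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    cached_price_per_m = INPUT_PRICE_PER_M * (1 - CACHED_INPUT_DISCOUNT)
    total = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * cached_price_per_m
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


def max_output_cost_usd(max_tokens):
    """Upper bound on output cost, ASSUMING max_tokens caps billable output."""
    return max_tokens * OUTPUT_PRICE_PER_M / 1_000_000


# Hypothetical session: 1M input tokens (800k served from cache), 100k output.
with_cache = request_cost_usd(1_000_000, 800_000, 100_000)
without_cache = request_cost_usd(1_000_000, 0, 100_000)

print(f"with cache:    USD {with_cache:.2f}")
print(f"without cache: USD {without_cache:.2f}")
print(f"output ceiling at max_tokens=4096: USD {max_output_cost_usd(4096):.5f}")
```

Under these assumptions the cached session costs about USD 1.45 against USD 3.25 without caching, which suggests that for long agent sessions with a stable prompt prefix the cache hit rate can matter as much as the list price.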

Pricing signals, feature gaps, and the race to dominate AI coding

Taken together, these launches show that AI coding agents are moving into a new phase in which capability, security, and price are all in motion at once. Cursor is attacking cost directly with Composer 2.5, placing Claude‑adjacent performance at a fraction of the price on its own benchmarks. xAI is folding Grok Build into X‑tier subscriptions, expanding access by tying a full coding agent and CLI to a consumer‑plus‑developer bundle. Anthropic and Alibaba are focusing on infrastructure and integration: Anthropic with self‑hosted sandboxes and MCP tunnels, Alibaba with a closed flagship model that plugs into the Anthropic Messages protocol. The rapid iteration cycle suggests vendors are racing to lock in developers before the market consolidates. For teams, a coding agent comparison now means weighing token prices, review overhead, and security posture together, rather than assuming the most expensive model is the only viable option.