GitHub Copilot pricing goes token-based

What GitHub’s Token-Based Billing Change Really Means

GitHub Copilot’s token-based billing is a metered pricing model in which developers pay for each token of AI input and output instead of relying on a fixed subscription fee. That ties GitHub Copilot pricing directly to how much assistance a developer consumes, so costs rise or fall with every prompt, chat, and generated code block. In April, GitHub announced that all Copilot plans would move from flat-rate “requests” to usage-based pricing measured in tokens, with the change taking effect in June. Under the old model, heavy users benefited from cross-subsidies as GitHub absorbed much of the growing inference cost from large models and long sessions. Now each paid tier includes a pool of AI credits, and one credit equals one cent of usage. The result is a billing system that exposes the real cost of long conversations, large context windows, and premium models instead of hiding it behind a single subscription fee.

From Flat Fees to Vanishing Credits: Developer Costs Spike

Developers are discovering that Copilot’s new token-based billing drains monthly credits far faster than expected. Reports on GitHub’s own community forum describe credits “burned like anything” for what users considered minor edits, such as updating only a few lines across several files. One user showed that half of a 7,000-credit allowance disappeared after a single day of work, while another said they used 840 credits despite being “super cautious” with Claude Sonnet 4.6. According to ArtificialIntelligence-News.com, GitHub’s deprecated fixed-price subscriptions likely functioned as loss leaders that let users consume far more tokens than their fees covered. Now, the cross-subsidy is gone and those same habits translate into real expenditure. One developer quoted by TechSpot reported that “a few prompts” consumed 700 credits, while another saw 5,000 credits vanish after only a couple of Copilot-driven commits, turning what felt like a limitless tool into a meter that is always running.

How Model Choice and Context Size Drive Metered AI Billing Costs

Under usage-based pricing, Copilot costs are tightly linked to model selection and context size. Each token of input and output consumes credits, so larger prompts, long-running chats, and more powerful models all cost more. TechSpot notes that one million output tokens from OpenAI’s GPT-5.4 nano model cost about USD 1.25 (approx. RM5.75) through Copilot, while the same volume from the frontier-class GPT-5.5 costs roughly USD 30 (approx. RM138). That gap was hidden when both counted as a single premium request. Now, a simple “build a Minesweeper game” prompt via Claude Haiku 4.5 can use 94 credits, and a single complex prompt can consume 171 credits. Long chat threads exacerbate the problem because every new request resends the entire conversation as context. As developer Neil Hewitt pointed out, those old messages are all input tokens, and input tokens now have a direct, visible price attached.
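
To make that price gap concrete, the arithmetic can be sketched in a few lines of Python. This is an illustration, not GitHub’s billing logic: it assumes the per-million output-token prices quoted by TechSpot, treats one credit as one cent, and leaves out input tokens because the reporting gives no prices for them. The function and dictionary names are invented for the example.

```python
# Illustrative arithmetic only, not GitHub's actual billing code.
# All names here are hypothetical.

CENTS_PER_USD = 100  # one AI credit equals one cent of usage

# USD per one million output tokens, as quoted in the article
OUTPUT_USD_PER_MILLION = {
    "gpt-5.4-nano": 1.25,
    "gpt-5.5": 30.00,
}


def output_credits(model: str, output_tokens: int) -> float:
    """Convert an output-token count into AI credits for the given model."""
    usd = OUTPUT_USD_PER_MILLION[model] * output_tokens / 1_000_000
    return usd * CENTS_PER_USD


if __name__ == "__main__":
    for model in OUTPUT_USD_PER_MILLION:
        credits = output_credits(model, 1_000_000)
        print(f"{model}: {credits:.0f} credits per million output tokens")
```

Under these assumptions, a million output tokens works out to 125 credits on the nano model and 3,000 credits on GPT-5.5, a 24-fold difference for the same volume of output.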

Why Token-Based Billing Is Spreading Across AI Tools

GitHub Copilot’s pricing shift is part of a broader move toward metered billing for AI services across the industry. Running large language models involves substantial ongoing costs, from inference and infrastructure to model development and maintenance, and subscription plans that ignore usage patterns were always going to be temporary. ArtificialIntelligence-News.com argues that letting users burn far more tokens than their subscription fees covered was not sustainable, and that Copilot’s update signals an end to “all-you-can-eat” AI for heavy users. TechSpot notes that Copilot’s decision is unlikely to remain an outlier as other AI coding assistants work to align their revenue with actual consumption. At the same time, alternatives built on cheaper models are emerging, such as workflows around DeepSeek that reportedly cost “about 7 cents for 15 million tokens,” showing how widely costs can differ across providers, architectures, and model sizes under usage-based pricing.

Adapting Developer Workflows to Usage-Based Pricing

For developers, token-based billing changes how they plan and use AI coding assistance. Instead of treating Copilot as an always-on companion, many now ration prompts and switch models to contain spend. TechSpot describes users who burned 21 percent of their Pro credits in a single day and are now reconsidering their tool choices, while others respond by tightening workflows. One developer, Henri Kinnunen, reported using only 161 credits during a productive day by making “very focused and deliberate changes with AI” on a smaller GPT-5.3-Codex model. This points to a new skill set: managing context windows, pruning long chats, and choosing cheaper models for routine tasks. Developers must now track token consumption as a real budget item, weighing “What is this task worth?” instead of simply asking “What can this model do?” If Copilot’s pricing is a preview, cost-aware AI habits will soon become standard practice.
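
The payoff from pruning a long chat can be sketched with a rough Python model. It assumes that every request resends earlier messages as input tokens; the token counts are invented, model replies are ignored for simplicity, and `keep_last` is a hypothetical knob rather than a Copilot setting.

```python
# Toy model of input-token growth in a chat. Not based on Copilot internals;
# the function name, parameters, and token counts are all hypothetical.
from typing import Optional, Sequence


def total_input_tokens(message_tokens: Sequence[int],
                       keep_last: Optional[int] = None) -> int:
    """Sum the input tokens sent over a whole chat.

    Each request sends the new message plus earlier messages as context.
    keep_last=None resends the full history; keep_last=k resends only the
    k most recent earlier messages.
    """
    total = 0
    for i, new_message in enumerate(message_tokens):
        history = list(message_tokens[:i])
        if keep_last is not None:
            # Guard keep_last == 0, because history[-0:] would keep everything.
            history = history[-keep_last:] if keep_last > 0 else []
        total += sum(history) + new_message
    return total


if __name__ == "__main__":
    chat = [500] * 10  # ten messages of 500 tokens each (made-up figures)
    print("full history:", total_input_tokens(chat))
    print("last two only:", total_input_tokens(chat, keep_last=2))
```

In this toy case, ten 500-token messages add up to 27,500 input tokens when the full history is resent each time, and 13,500 when only the two most recent messages are carried forward.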