Google resets Gemini usage limits again

What the Latest Gemini Rate Limits Reset Means

Google’s latest Gemini rate limits reset is a full wipe of usage quota counters for both free and paid users. It arrives alongside a performance update to the Gemini 3.5 Flash model, so developers can test the improved system without earlier consumption counting against them. The reset fits into Google’s newer compute-based quota framework, where Gemini usage limits are defined by prompt complexity, model type, tools used, and chat length instead of simple prompt counts. By taking every account back to zero, Google is trying to remove confusion caused by earlier, opaque rate limiting and give developers a clean baseline for evaluating the updated model’s behavior. For teams experimenting in Antigravity, the reset also signals that Google treats major model changes and quota policies as linked, especially when output quality issues might have caused people to waste their previous allowance.

From Restrictive Gemini Usage Limits to Pro Quota Caps

Google’s reset arrives after weeks of criticism that Gemini usage limits felt unpredictable and too restrictive, especially for Gemini 3.1 Pro and Flash users handling complex tasks. Under the compute-based system, a single Pro prompt with large files could drain a disproportionate share of a user’s quota, which made experimentation costly. In response, Google introduced a cap on how much quota any one Gemini 3.1 Pro request can consume and clarified that failed requests no longer count toward usage. According to TechRepublic, this change was motivated by feedback from developers who found their quota disappearing while they tested long prompts or heavy features. Google also promised richer usage reporting so developers can see which tasks consume the most quota, reducing the guesswork around how different workloads affect their Gemini free tier quota and paid allowances.
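To make those two accounting rules concrete, here is a minimal sketch in Python. It is not Google's implementation: the class name, the quota units, and the budget and cap values are all hypothetical, chosen only to illustrate how a per-request cap and a no-charge rule for failed requests might interact.

```python
from dataclasses import dataclass


@dataclass
class QuotaMeter:
    """Toy compute-based quota meter (hypothetical, not Google's accounting)."""

    budget: float           # total quota units per window (assumed unit)
    per_request_cap: float  # most units one request may consume (assumed value)
    used: float = 0.0

    def charge(self, raw_cost: float, succeeded: bool) -> float:
        """Record one request and return the units actually charged."""
        if not succeeded:
            return 0.0  # failed requests do not count toward usage
        charged = min(raw_cost, self.per_request_cap)  # apply the per-request cap
        self.used += charged
        return charged

    def remaining(self) -> float:
        """Units left in the current window, never below zero."""
        return max(self.budget - self.used, 0.0)

    def reset(self) -> None:
        """Wipe the counter, as a full rate limits reset would."""
        self.used = 0.0


meter = QuotaMeter(budget=100.0, per_request_cap=10.0)
meter.charge(raw_cost=40.0, succeeded=True)   # large-file prompt, capped at 10 units
meter.charge(raw_cost=5.0, succeeded=False)   # failed request, charged nothing
meter.charge(raw_cost=3.0, succeeded=True)    # light prompt, charged in full
```

Under these assumed numbers, the large-file prompt costs 10 units rather than 40, which is the kind of protection the cap is meant to provide.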

Inside the Gemini 3.5 Flash Update and Antigravity Effort Levels

The latest Gemini 3.5 Flash update in Antigravity is focused on fixing a “blind spot” created by an earlier low-effort variant, which tried to reduce token usage on basic prompts. That earlier model cut token generation by roughly 45% compared with the standard version, but developers reported a noticeable drop in output quality when tasks demanded deeper reasoning. Varun Mohan from Google DeepMind says the refreshed model maintains endurance on harder software engineering tasks while reducing unnecessary computation on lighter ones. Antigravity now organizes Gemini 3.5 Flash into effort-level variants such as Low, Medium, and High, though these remain internal settings rather than exposed toggles in consumer-facing apps. The rate limits reset is framed as a courtesy to developers, giving them a full budget to probe how the new Flash behaves across simple and complex prompts without legacy usage skewing their impressions.

Impact on Free Tier Users and Flash-Lite Prompts

For Gemini free tier users, the combination of a quota reset and model changes has real day-to-day effects on how much they can build and test. Google has made Flash-Lite prompts free, meaning they do not count against a user’s quota. That encourages developers to route lightweight questions and quick iterations through Flash-Lite while saving their Gemini free tier quota for heavier tasks on Pro or standard Flash. This split should reduce surprise lockouts when an experimental session suddenly escalates in complexity. At the same time, Google is rolling out clearer dashboards and planning more granular metrics so free users can see how specific prompts and models contribute to their total consumption. Community requests in Antigravity for a weekly usage bar show that visibility remains a pain point, even as Google tweaks Gemini usage limits to be less punishing.
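The routing idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the model labels are shorthand rather than real API model identifiers, and the word-count threshold is an arbitrary stand-in for whatever "lightweight" means in a given workflow.

```python
# Hypothetical labels, not official model identifiers.
FREE_MODELS = {"flash-lite"}


def pick_model(
    prompt: str,
    has_files: bool = False,
    needs_reasoning: bool = False,
    light_word_limit: int = 50,  # arbitrary threshold for a "light" prompt
) -> str:
    """Choose a model tier so that quota is spent only on heavier tasks."""
    if has_files or needs_reasoning:
        return "pro"  # heavy work goes to the quota-consuming tier
    if len(prompt.split()) <= light_word_limit:
        return "flash-lite"  # quick questions go to the free tier
    return "flash"  # longer prompts without heavy features


def counts_against_quota(model: str) -> bool:
    """Flash-Lite prompts are free; other tiers draw down the quota."""
    return model not in FREE_MODELS


choice = pick_model("What does HTTP status 429 mean?")
```

A short question like the one above lands on the free tier, while a prompt with attached files would be sent to Pro and charged against the quota.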

A Pattern of Iteration: Resetting Limits to Regain Trust

Taken together, Google’s actions show a pattern: adjust the model, then reset Gemini rate limits to repair trust with developers who feel constrained or misled by earlier quotas. The introduction of compute-based accounting promised fairness across diverse workloads, but it also introduced new edge cases, like Omni video generations or large-file prompts draining a quota after one or two attempts. Google’s response has been to cap consumption for heavy Gemini 3.1 Pro prompts, make Gemini 3.1 Flash-Lite free, and reset counters whenever a significant Gemini 3.5 Flash update lands. This approach buys Google goodwill and gives developers a clean slate each time the rules change. Yet it also underscores how hard it is to design Gemini usage limits that are predictable for both free and paid tiers while the underlying models are still evolving quickly.