Google Gemini 3.5 Flash Low Finally Fixes Annoying Antigravity Token Problems

Quick Highlights

Google launched Gemini 3.5 Flash Low
The model is designed to consume fewer tokens
Antigravity users complained about new compute-based limits
Google already increased usage caps twice before this update
Gemini 3.5 Flash Low reportedly reduces output tokens by 45%
Quota reset has been rolled out for all Gemini plans


Google has introduced a new lightweight version of its Gemini 3.5 Flash AI model after users complained about aggressive token usage limits inside Antigravity. The new model, called Gemini 3.5 Flash Low, is designed to reduce token consumption while still maintaining strong coding performance.

The update arrives just days after Google shifted Gemini subscriptions from a message-based system to a compute-based usage model, a move that triggered widespread frustration among Antigravity users who were suddenly hitting rate limits much faster than before.

Google’s AI ecosystem has been evolving rapidly since the company unveiled major upgrades during Google I/O 2026, and Antigravity has become one of the centerpieces of that strategy. However, the transition to compute-based billing immediately created problems for heavy users.

Google Says Gemini 3.5 Flash Low Uses Far Fewer Tokens

In a post on X, Varun Mohan of Google DeepMind said the company had heard feedback from Antigravity users about excessive token consumption on simple tasks.

According to Google, Gemini 3.5 Flash Low is optimized for lightweight coding and agentic tasks where users do not need maximum reasoning depth. The company claims the new variant generates around 45 percent fewer output tokens than the standard Gemini 3.5 Flash model.
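To put the 45 percent figure in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that a quota is consumed only by output tokens and that the reduction applies evenly to every request; Google has not said either of these things, and the function names are hypothetical.

```python
def reduced_output_tokens(baseline_tokens: int, reduction: float = 0.45) -> float:
    """Output tokens left after a fractional reduction (0.45 = 45 percent fewer)."""
    return baseline_tokens * (1.0 - reduction)


def request_multiplier(reduction: float = 0.45) -> float:
    """How many times more requests fit in the same output-token budget.

    Assumption: the budget is spent only on output tokens. Input tokens,
    which the claimed reduction does not cover, are ignored here.
    """
    return 1.0 / (1.0 - reduction)


# A response that took 1,000 output tokens would take roughly 550.
print(reduced_output_tokens(1000))

# The same output-token budget would stretch to roughly 1.82x as many requests.
print(round(request_multiplier(), 2))
```

In practice the gain would likely be smaller, since input tokens such as prompts and code context also count toward usage and are not covered by the claimed reduction.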

Google has now reorganized the Flash lineup into three different tiers:

Gemini 3.5 Flash Low
Gemini 3.5 Flash Medium
Gemini 3.5 Flash High

The original Gemini 3.5 Flash has effectively become the Medium variant, while the High version targets more demanding software engineering and reasoning workloads.

The company says the Low model can still outperform some heavier AI configurations in coding benchmarks despite using fewer compute resources.

Antigravity Users Were Hitting Rate Limits Too Quickly

The backlash started shortly after Google switched Gemini plans to a compute-based usage structure. Instead of counting prompts or messages, the system now measures token usage directly.
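The difference between the two metering styles can be sketched in a few lines. This is an illustration only: the token counts are invented, and weighting input and output tokens equally is an assumption, since Google's actual accounting has not been described in detail.

```python
from dataclasses import dataclass


@dataclass
class Request:
    input_tokens: int   # prompt plus any code context sent to the model
    output_tokens: int  # tokens the model generates in reply


def messages_used(requests: list[Request]) -> int:
    """Message-based metering: every request costs one unit."""
    return len(requests)


def tokens_used(requests: list[Request]) -> int:
    """Token-based metering: cost scales with request size.

    Assumption: input and output tokens are weighted equally.
    """
    return sum(r.input_tokens + r.output_tokens for r in requests)


# Hypothetical workloads: ten short chat prompts vs. ten agentic coding tasks.
chat = [Request(input_tokens=200, output_tokens=300) for _ in range(10)]
agentic = [Request(input_tokens=20_000, output_tokens=5_000) for _ in range(10)]

print(messages_used(chat), messages_used(agentic))  # same message count
print(tokens_used(chat), tokens_used(agentic))      # very different token cost
```

Under message counting the two workloads look identical, while under token counting the agentic one costs 50 times as much in this made-up example, which is consistent with coding-heavy users reporting that they hit limits sooner.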

Many users said even simple coding requests inside Antigravity were consuming massive amounts of tokens. Complaints quickly spread across Reddit, X, and developer communities, forcing Google to increase the platform’s rate limits twice in a short period.

Even after those changes, several users claimed the limits were still too restrictive, especially for developers working on large software projects.

Google reset Gemini quotas across all subscription tiers this week, including free-tier accounts, giving users more room to continue active projects.

The token controversy comes as Google aggressively pushes Gemini deeper into its ecosystem following launches like Gemini-powered Android XR smart glasses and the company’s broader AI productivity upgrades announced earlier this month.

Image Generation Limits Also Frustrated Users

The criticism was not limited to coding tools. Some users also questioned why Antigravity’s image generation limits were significantly lower than competing AI platforms.

One user on X claimed they could generate nearly 1,000 AI images using OpenAI Codex tools while Antigravity Ultra plans capped image generation at around 24 outputs.

Varun Mohan acknowledged the feedback publicly and admitted the image generation cap was “quite low,” suggesting Google could raise the limit in the future. However, the company has not officially confirmed any changes yet.

Google is simultaneously expanding Gemini’s creative ecosystem through new integrations such as Adobe and Canva AI connectors, which are expected to improve AI-assisted design workflows across its apps.

Why This Update Matters for Developers

Gemini 3.5 Flash Low is essentially Google’s attempt to balance performance with affordability as AI workloads become increasingly expensive to run at scale.

For developers using Antigravity daily, lower token consumption could translate into:

Longer coding sessions
Fewer interruptions from rate limits
Better efficiency on free and lower-tier plans
Reduced compute costs for enterprise teams

The move also shows how quickly AI companies are now reacting to community backlash, especially as power users begin closely monitoring token efficiency instead of just raw model performance.

TechularZtrix Take

Google’s compute-based Gemini system may have made technical sense internally, but the rollout exposed how sensitive users are to sudden usage restrictions. Gemini 3.5 Flash Low feels less like a new feature and more like damage control after developers pushed back against token-heavy AI workflows.

Still, if the model genuinely delivers strong coding performance while consuming fewer resources, it could become the default choice for a huge number of Antigravity users moving forward.

Google continues to publish official AI updates and announcements on the Google AI Blog.