Google introduced usage limits for Gemini this week. Users hit them within an hour. Google tripled the limits. Users hit those too. Google tripled them again. We are now on the second 3x increase in less than a week, and the caps are still lower than they were before any of this started.
That's the story. And it's more revealing than Google probably intended.
Antigravity, Google's AI-powered coding suite, was the first casualty of the new compute-based quota system announced at Google I/O this week. Users on paid plans were burning through their weekly allowances after a single hour of work — not edge cases, not power users doing something exotic, just people using the product the way it was designed to be used.
Google's Varun Mohan, a DeepMind director working on Antigravity, acknowledged that users could hit weekly limits "after a couple work sessions." The company reset quotas for all paid users twice and bumped the rate limits by 3x on both occasions. That's a 9x total increase in a few days — which raises an obvious question about how the original limits were calculated in the first place.
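The compounding is easy to check. The sketch below is illustrative only: it assumes a hypothetical launch cap normalized to 1.0, since the actual quota figures aren't given here. It shows that two successive 3x bumps multiply rather than add, and that if the raised cap is still below the pre-change cap, the launch cap must have been less than one ninth of what users had before.

```python
# Hypothetical, normalized numbers: the launch cap is set to 1.0 for illustration only.
launch_cap = 1.0
increases = [3, 3]  # the two 3x bumps described above

cap = launch_cap
for factor in increases:
    cap *= factor  # successive increases multiply, they do not add

total_multiplier = cap / launch_cap
print(total_multiplier)  # 9.0

# If the raised cap is still lower than the pre-change cap, the launch cap
# was below this fraction of the old one.
max_launch_fraction_of_old = 1 / total_multiplier
print(round(max_launch_fraction_of_old, 3))  # 0.111
```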
There's a polite way to read this and an honest way. The polite version: compute is expensive, demand is high, Google is iterating in real time. Fine. All true.
The honest version: this is what happens when AI product growth outpaces infrastructure planning. Google built a coding agent so good that professionals want to live inside it all day — then capped usage in a way that made it functionally unusable for the people paying for it. The scramble to fix it publicly, twice, in the same week suggests the limits weren't stress-tested against actual usage patterns before they went live.
This isn't unique to Google. The entire AI industry is wrestling with a version of this tension: models are getting better faster than the economics of serving them at scale can be solved. Usage limits are the symptom. The underlying condition is that frontier AI inference is still genuinely expensive, and no one has fully figured out how to price it sustainably without either burning money or frustrating users.
If you've built any part of your workflow around AI coding tools, content generation, or agentic platforms — and most serious marketing and growth teams have by now — usage caps deserve a place in your operational risk thinking. Not because they're catastrophic, but because "the tool stopped working mid-sprint" is a productivity problem that doesn't announce itself in advance.
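One way to make that risk visible is a simple burn-rate projection. This is a rough sketch, assuming your tool reports quota and usage in some common unit. The function name, units, and numbers are hypothetical and not part of any Google API.

```python
from typing import Optional


def projected_hours_left(weekly_quota: float, used: float, hours_worked: float) -> Optional[float]:
    """Estimate working hours left before the quota runs out at the current burn rate."""
    if hours_worked <= 0 or used <= 0:
        return None  # no usage data yet, so nothing to project
    burn_rate = used / hours_worked           # units consumed per working hour
    remaining = max(weekly_quota - used, 0.0)  # never report negative headroom
    return remaining / burn_rate


# Hypothetical week: 100 units of quota, 40 used after 2 hours of work.
hours_left = projected_hours_left(weekly_quota=100, used=40, hours_worked=2)
planned_hours = 30  # assumed remaining work planned for the week

print(hours_left)                  # 3.0
print(hours_left < planned_hours)  # True: the quota runs out well before the week does
```

Even a back-of-the-envelope check like this turns "the tool stopped working mid-sprint" from a surprise into something a team can plan around.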
The Antigravity situation also hints at something broader: as AI tools move from novelty to infrastructure, the tolerance for downtime, throttling, or quota surprises drops sharply. A developer who can't code because their AI hit a weekly limit on Wednesday isn't mildly inconvenienced — their entire day is restructured.
Google will almost certainly continue adjusting these limits. But the fact that paid users hit a weekly cap in an hour on day one is a credibility problem, not just a technical one. Trust in a platform is built slowly and lost quickly, and right now Antigravity users are being asked to absorb a meaningful step backward in capability while Google recalibrates.
The limits outside of Antigravity, across Gemini's broader suite of tools, haven't changed yet. That's worth watching.
Tools change. Limits get imposed. Platforms pivot. The AI-integrated marketing programs that hold up are the ones built on strategy first, tools second. If your team's productivity is one quota reset away from a bad week, let's talk.