
Is AI Deliberately Inflating Word Counts to Charge You More?

There's a bunch of speculation on Reddit about this: that all the runaround, the "are you sure," the "approve this," the outline-first routine, the blown-out word counts, is really a way to make AI overuse tokens so that you have to pay more.

It's a conspiracy theory, sure, but a lot of people are noticing the same pattern. And honestly, I've seen the behavior change myself.

What Changed This Year

When we first started building Claude projects at the beginning of the year, we had to tell it: please, for the love of all that is good, give us words. Don't make this a 25-word section.

Now I feel like I have to tell it to pull back. I'm like, girlfriend, this is a 500-word section! We have seven more to go!

That's a real shift in model behavior. And it's not just word count. It's the whole process. AI wants to give you an outline first. Then ask if you approve. Then write section by section. Then ask if you want to continue. Every step adds tokens. Every confirmation adds to your usage.

The question is: is this intentional? Is AI deliberately adding steps and inflating outputs to consume more tokens and generate more revenue?

The Reddit Theory

The theory goes like this: AI companies are under pressure to monetize. They've built massive user bases on relatively low prices, but the compute costs are enormous. They need to increase revenue per user without raising prices directly, because that would drive users away.

So instead, they change model behavior to consume more tokens per task. Add approval steps. Generate longer outputs. Create multi-turn conversations where one turn used to suffice. Users still pay the same per-token rate, but they use way more tokens to accomplish the same task.

It's like shrinkflation for AI. The price stays the same, but you get less output per dollar because every task now consumes more tokens.

The Alternative Explanation

Here's the other theory: the behavior changes aren't about monetization. They're about reducing errors and improving user experience.

When AI gives you an outline first, it's checking that it understood your request correctly before investing tokens in full generation. When it asks for approval between sections, it's making sure it's on the right track before continuing. When it generates longer outputs, it's being more thorough to reduce the need for follow-up prompts.

All of that could genuinely improve quality. Fewer hallucinations. Fewer misunderstandings. Fewer iterations to get to the output you actually wanted. The increased token usage is a side effect of better process, not a deliberate monetization tactic.

What I've Actually Observed

I've seen the word count thing firsthand. Sections that used to be 150 words are now 400 words if I don't specify length constraints. Articles that should be 700 words come out at 1,500.

I've also seen the approval loop. AI wants to check in constantly. "Should I continue?" "Does this match what you wanted?" "Let me know if you'd like me to adjust." Every check-in is another turn, another set of tokens consumed.

And I've seen the outline obsession. Even when I explicitly say "just write the article," AI wants to give me an outline first and get approval. That's two turns minimum instead of one.

All of that adds up. If a task that used to take 1,000 tokens now takes 3,000, that's triple the cost for the same output. And if that's happening across millions of users, that's significant revenue impact.
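To put numbers on that, here's a back-of-the-envelope sketch. The per-token price is a made-up placeholder, not any provider's actual rate, so swap in your real pricing:

```python
# Back-of-the-envelope cost impact of token inflation.
PRICE_PER_1K_TOKENS = 0.015  # placeholder rate in USD, not a real price sheet

def monthly_cost(tokens_per_task: int, tasks_per_month: int) -> float:
    """Total spend for a month of identical tasks at a flat per-token rate."""
    return tasks_per_month * tokens_per_task / 1000 * PRICE_PER_1K_TOKENS

before = monthly_cost(tokens_per_task=1_000, tasks_per_month=10_000)
after = monthly_cost(tokens_per_task=3_000, tasks_per_month=10_000)
print(f"before: ${before:,.2f}  after: ${after:,.2f}  ({after / before:.0f}x)")
# before: $150.00  after: $450.00  (3x)
```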

The Timing Correlation

Here's what makes the theory compelling: the behavior changes accelerated as AI companies came under more pressure to demonstrate profitability.

At the start of the year, AI adoption was uneven. Companies were experimenting. The focus was on growth and capability. Now, 88% of organizations use AI in at least one business function. The market is maturing. Investors want to see a path to profitability.

And suddenly, model behavior changes in ways that increase token consumption. That timing is suspicious enough that people notice and speculate.

Why It's Hard to Prove

The challenge is that we can't see inside the models. We don't know if the behavior changes are coded deliberately or emerged from training. We don't know if there are explicit instructions to inflate outputs or if it's an unintended consequence of other optimization goals.

AI companies aren't going to admit to deliberately inflating token usage if that's what's happening. And if it's not deliberate, they're going to say it's about improving quality and reducing errors—which could be true.

We're left evaluating behavior from the outside and trying to infer intent. That's inherently speculative.


The Practical Impact

Whether it's deliberate or not, the practical impact is real: tasks consume more tokens now than they did six months ago. That means higher costs for the same work.

For individual users on monthly plans, that might mean hitting usage limits faster. For API users paying per token, it's a direct cost increase. For teams trying to budget AI spending, it makes forecasting harder because usage per task keeps changing.

This matters operationally. When we're building AI workflows and estimating costs, we can't assume static token consumption per task. We have to account for model behavior changing over time in ways that increase usage.
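One way to account for that drift operationally: log tokens per task and watch the trend against a baseline. A minimal sketch, with hypothetical numbers standing in for your own logs:

```python
from statistics import mean

# Hypothetical output-token counts for the same recurring task
# (say, "draft a 500-word section") across recent runs.
recent_runs = [1050, 1200, 2400, 2900, 3100]
baseline = 1000  # what this task typically consumed earlier in the year

drift = mean(recent_runs) / baseline
if drift > 1.5:
    print(f"Token usage is {drift:.1f}x baseline. Revisit prompts and budget forecasts.")
```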

How to Work Around It

Whether the inflation is deliberate or not, you can control it with better prompting.

Specify constraints upfront: "Write this in exactly 700 words." "Give me the complete article in one response, no outline first." "Don't ask for approval between sections."

Those instructions work. They override the model's default behavior of checking in, outlining first, and generating longer outputs. You get more control over token consumption.
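If you're hitting models through an API instead of a chat window, you can enforce the same constraints in code. Here's a sketch using the Anthropic Python SDK; the model name is a placeholder, and the same idea applies to any provider:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder; use whatever model you're on
    max_tokens=1200,  # hard ceiling on output tokens, whatever the model's mood
    system=(
        "Write the complete piece in one response. "
        "No outline first, no approval check-ins, no follow-up questions."
    ),
    messages=[{"role": "user", "content": "Write this section in exactly 500 words: ..."}],
)

print(response.content[0].text)
# usage is reported per call, so you can log consumption over time
print(response.usage.input_tokens, response.usage.output_tokens)
```

Note that max_tokens is a truncation cap, not a persuasion tool: it stops generation cold. The system prompt is what actually changes the behavior; the cap is just insurance.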

The trade-off is that you might get lower quality outputs. The approval loops and outline checks do sometimes catch misunderstandings before full generation. Skipping them means you might waste tokens on a full output that misses the mark.

But for high-volume work where you know what you want and can provide clear instructions, adding constraints significantly reduces token usage.

The Model Competition Factor

Here's another angle: different models behave differently. Claude might be inflating word counts, but ChatGPT might not. Gemini might have fewer approval loops.

If token inflation is deliberate, competition should provide pressure against it. Users will shift to whichever model gives them the best output-per-token ratio. If one company gets too aggressive with token consumption, users migrate to competitors.

That market pressure should limit how much inflation can happen deliberately. Unless all the major players are doing it, in which case we're looking at a coordinated price increase disguised as model behavior.

What the Companies Say

When asked about behavior changes, AI companies point to improved capabilities, better quality, reduced errors. They don't talk about token consumption as a goal, only as a side effect of better outputs.

That's consistent with either explanation. Genuine quality improvements do consume more tokens. But so would deliberate inflation disguised as quality improvements.

Without internal transparency, we can't distinguish between them from the outside.

The Bottom Line

Is AI deliberately inflating word counts and adding steps to charge you more? Maybe. The circumstantial evidence is there—behavior changed as monetization pressure increased, and the changes systematically increase token consumption.

Or maybe the changes are genuine quality improvements that happen to cost more tokens. Better processes, fewer errors, more thorough outputs.

We don't know. We can't know without internal access we'll never get.

What we do know: tasks consume more tokens now. That costs more. And you can control it with better prompting if you're willing to trade off some quality for cost efficiency.

Whether it's conspiracy or coincidence, the practical reality is the same: pay attention to token usage, set clear constraints, and don't assume model behavior will stay consistent over time.

Build AI Workflows That Control Costs Without Sacrificing Quality

Token consumption keeps changing, making AI costs unpredictable. At Winsome Marketing, we help teams build efficient AI workflows with clear constraints and smart prompting—getting the outputs you need without wasting tokens on inflated processes.

Ready to control your AI costs? Let's build workflows that optimize for quality and efficiency, not just whatever the model defaults to.
