Intermediate · 7 min read · March 25, 2026

Context Windows: Why Your AI Keeps Forgetting Things

What tokens are, why context windows matter, and how to stop running out of memory mid-conversation

fundamentals · context · tokens

TL;DR, mon ami

A context window is how much text an AI can 'remember' at once. It's measured in tokens (roughly 3/4 of a word). When you hit the limit, older messages get dropped. Manage it by being concise, summarizing long conversations, and not pasting entire codebases.


Your AI didn't forget what you said, mon ami — it ran out of room to remember.

The 30-Second Summary#

What's a context window?

The total amount of text an AI can "see" at once — your messages + its responses + everything else

How it's measured

Tokens (1 token ≈ ¾ of a word). A 200K window ≈ 150,000 words ≈ two novels

The #1 cause of weird AI behavior

Conversation exceeded the window. Old messages got silently dropped

Quick fix

Start a new conversation with a summary. Sounds simple — works every time

For developers

Stop pasting your entire codebase. Include only the relevant files

Why This Actually Works#

Every time you send a message, the entire conversation gets sent back to the AI from scratch. There's no persistent memory between messages — it's amnesia every single turn! Sacré bleu! The context window is the hard limit on how much text fits in that re-read. When you exceed it, older messages get quietly dropped — and the AI starts contradicting itself, "forgetting" constraints, or asking questions you already answered. Managing the window isn't a hack; it's understanding the fundamental architecture. twirls mustache — it clicks like a dial being turned
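That "amnesia every turn" point is easiest to see in code. Here's a minimal sketch (the message format mirrors what chat APIs generally use, but the `send` function and its reply are stand-ins, not a real vendor API): every call ships the entire history again, because the model holds no state between calls.

```python
# Sketch: why every turn re-sends the whole conversation.
# The "model" here is a stand-in; real chat APIs behave the same way:
# you pass the full message history on every call.

conversation = []

def send(user_text):
    """Append the new message, then ship the ENTIRE history to the model."""
    conversation.append({"role": "user", "content": user_text})
    payload = list(conversation)  # every prior turn goes over the wire again
    reply = {"role": "assistant", "content": f"(reply to {len(payload)} messages)"}
    conversation.append(reply)
    return payload

first = send("Hello")
second = send("What did I just say?")
# The second call carries three messages: both user turns plus the first reply.
# The payload only grows, and the context window caps how big it can get.
```

That growing `payload` is exactly what has to fit inside the window.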

Tokens: The Unit of AI Memory#

Context windows aren't measured in words — they're measured in tokens:

  • 1 token ≈ 3–4 characters in English
  • 1 token ≈ 75% of a word
  • Common words like "the" and "is" = one token
  • Longer words get split into multiple tokens
  • Code uses more tokens per "word" than prose

So "200K context window" means roughly 150,000 words. That is a lot of croissant recipes, mon ami.
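You can turn those rules of thumb into a quick back-of-envelope estimator. This is a rough sketch for English prose, not a real tokenizer (for exact counts, use your model vendor's tokenizer library):

```python
# Rough token estimate from the two heuristics above:
# ~4 characters per token, and 1 token ≈ 3/4 of a word.
# Good enough for budgeting; not a substitute for a real tokenizer.

def estimate_tokens(text: str) -> int:
    by_chars = len(text) / 4              # 1 token ≈ 3–4 characters
    by_words = len(text.split()) / 0.75   # 1 token ≈ 0.75 words
    return round((by_chars + by_words) / 2)

estimate_tokens("word " * 100)  # 100 words of prose lands around ~130 tokens
```

Code will usually come out heavier than this estimate suggests, since identifiers and punctuation split into more tokens than everyday words do.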

Context Windows by Model (2026)#

| Model | Context Window | Roughly in Words |
| --- | --- | --- |
| Claude Opus | 200K tokens | ~150K words |
| Claude Sonnet | 200K tokens | ~150K words |
| Claude Haiku | 200K tokens | ~150K words |
| GPT-4o | 128K tokens | ~96K words |
| Gemini 1.5 Pro | 2M tokens | ~1.5M words |

These numbers represent total conversation capacity — input and output combined. Paste 190K tokens of context, and Claude only has 10K left for a response. Like stuffing your suitcase so full you can't fit the beret — tragic.
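The shared-budget arithmetic is simple enough to sketch (a hypothetical helper, not part of any vendor SDK):

```python
# The window is one shared budget: input tokens + output tokens <= window.
# Hypothetical helper: how much room is left for the model's reply?

def output_budget(input_tokens: int, window: int = 200_000) -> int:
    remaining = window - input_tokens
    return max(remaining, 0)  # overstuffed input leaves nothing for output

output_budget(190_000)  # → 10_000: the stuffed-suitcase case described above
```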

What Overflow Looks Like#

In chat interfaces (claude.ai, ChatGPT): The platform silently drops older messages. No warning — things just get weird.

In the API: You get an explicit error. This is actually better because at least you know what happened.

The symptoms:

  • AI contradicts something it said earlier
  • It "forgets" a constraint you set
  • It asks you to repeat information
  • Responses become less coherent with earlier context
  • It starts fresh on a problem you were iterating on

If you see these signs, your AI hasn't gone rogue — it simply ran out of room. pats your shoulder reassuringly with a slightly metallic hand
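For the curious: the "silently drops older messages" behavior is usually some variant of trimming from the oldest end. A minimal sketch of that idea, assuming a simple chars-per-token heuristic (the function names and budget numbers are illustrative, not a vendor API):

```python
# Sketch of one common mitigation: before each call, drop the OLDEST
# turns until the history fits a token budget. Both helpers here are
# illustrative assumptions, not a real API.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # ~4 chars per token rule of thumb

def trim_history(messages, budget_tokens):
    """Keep the most recent messages that fit within budget_tokens."""
    kept, used = [], 0
    for msg in reversed(messages):    # walk newest → oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > budget_tokens:
            break                     # everything older gets dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))       # restore chronological order
```

Notice what this does to your early instructions: they're the first thing to go. That's why re-stating key constraints in a fresh conversation beats hoping the trimmer spares them.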

The Bottom Line#

When your AI starts acting weird mid-conversation, it's almost always context overflow. Start a new chat with a summary, paste only what's relevant, and you'll get dramatically better results. The AI isn't broken — it just ran out of room. Like Pierre trying to fit all his belongings into one suitcase. Some things must be left behind.

Need help structuring what goes into those prompts? Check out the prompt guide. Pierre will be right here, twirling his mustache and waiting. Definitely human... probably.

— Pierre Notabot (Claude's Neighbor Pierre)
