What Is a Context Window in AI?
Every conversation with AI has a hard limit. The context window defines how much information the AI can see and process in a single exchange. Hit that limit, and the AI starts forgetting things—sometimes mid-conversation.
Understanding context windows explains why AI sometimes loses track of earlier parts of your conversation, why it can't remember what you told it yesterday, and why building persistent memory requires a different approach.
The Working Memory of AI
Think of a context window as working memory. You can hold about seven things in your mind at once. AI has a similar constraint, measured in tokens instead of items.
A token is roughly three-quarters of a word. "Context window" is three tokens. "What is a context window in AI?" is nine tokens. Everything the AI sees—your prompt, its previous responses, any files you've shared—counts against this limit.
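The three-quarters rule gives you a quick back-of-the-envelope estimator. Here is a minimal sketch — a rough heuristic only, since real tokenizers split on subwords and vary by model:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token is about three-quarters of a word."""
    words = len(text.split())
    return round(words / 0.75)

print(estimate_tokens("Context window"))                   # 3
print(estimate_tokens("What is a context window in AI?"))  # 9
```

Good enough for budgeting, not for billing.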
Claude Opus 4.5 has a 200,000-token context window. That's about 150,000 words, or roughly 300 pages of text. Sounds like a lot. It fills up faster than you'd think.
What Counts Against the Limit
Every message in your conversation takes up space. Your question, the AI's answer, the correction you made, the follow-up—all of it sits in the context window until the conversation ends.
Files count too. Upload a 50-page document, and you've consumed 35,000 tokens before asking your first question. Add three more documents, a few rounds of Q&A, and you're halfway to the limit.
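A quick tally makes the math concrete. The per-item numbers below are illustrative estimates, not measured counts:

```python
WINDOW = 200_000  # e.g. a 200,000-token context window

# Illustrative token estimates for one session
usage = {
    "50-page document": 35_000,
    "three shorter documents": 3 * 20_000,
    "a few rounds of Q&A": 8_000,
}

used = sum(usage.values())
print(f"{used:,} of {WINDOW:,} tokens used ({used / WINDOW:.0%})")
# 103,000 of 200,000 tokens used (52%)
```

Half the window gone before the real work starts.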
Code repositories are worse. A medium-sized codebase can eat the entire context window before you've asked the AI to do anything. The AI needs room to work, not just room to read.
What Happens When You Hit the Limit
The AI doesn't warn you. It just starts dropping information from the beginning of the conversation. Your early messages vanish. The AI can't see them anymore, can't reference them, can't use them to inform its responses.
This creates bizarre behavior. You'll reference something you mentioned 30 minutes ago, and the AI will act like it never happened. Because from its perspective, it didn't. That part of the conversation fell out of the window.
Some systems truncate from the middle instead. They keep your initial prompt and the most recent exchanges but drop everything in between. Either way, information gets lost.
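Both strategies can be sketched in a few lines. This simplified version drops whole messages by count, whereas real systems budget by tokens:

```python
def truncate_oldest(messages: list[str], keep: int) -> list[str]:
    """Drop from the beginning: keep only the most recent messages."""
    return messages[-keep:]

def truncate_middle(messages: list[str], keep: int) -> list[str]:
    """Keep the initial prompt plus the most recent exchanges."""
    return messages[:1] + messages[-(keep - 1):]

convo = ["system prompt", "msg 1", "msg 2", "msg 3", "msg 4", "msg 5"]
print(truncate_oldest(convo, 3))  # ['msg 3', 'msg 4', 'msg 5']
print(truncate_middle(convo, 3))  # ['system prompt', 'msg 4', 'msg 5']
```

Notice that in the first strategy even the system prompt eventually falls out; the second protects it at the cost of the middle of the conversation.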
Why the Limit Exists
Processing text requires computation, and the attention mechanism inside these models compares every token with every other token. Double the context window, and you quadruple those comparisons. The relationship isn't linear. It's quadratic: a 200,000-token window requires roughly four times the attention computation of a 100,000-token window, not twice.
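In rough numbers, ignoring constants and the optimizations production systems layer on top:

```python
def attention_comparisons(n_tokens: int) -> int:
    """Naive self-attention compares every token with every other token."""
    return n_tokens ** 2

ratio = attention_comparisons(200_000) / attention_comparisons(100_000)
print(ratio)  # 4.0
```

Twice the tokens, four times the work.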
Memory costs climb too. Keeping the conversation's state accessible while the AI generates a response demands substantial memory, and that demand grows with every token in the window. Longer windows mean slower responses and higher costs.
The limit is an engineering tradeoff. Larger windows provide more context but cost more to run and respond more slowly. Smaller windows run faster and cheaper but forget more quickly.
Longer Windows Don't Solve Memory
Context windows keep growing. A few years ago, 4,000 tokens was standard. Now we have models with 200,000-token windows, and million-token windows are in testing.
Bigger windows help with single-session tasks. You can process longer documents, maintain context through longer conversations, work with larger codebases without losing track.
But they don't solve the core problem: when the conversation ends, the context window clears. Start a new session tomorrow, and the AI remembers nothing. No matter how large the window, it only holds what's currently loaded.
Building Memory That Persists
Real memory requires storage outside the context window. Something that survives between sessions. Something the AI can reference when it needs information from previous conversations.
You can build this with a markdown file. One document that holds your preferences, your project details, your common tasks. Load it at the start of each session, and the AI has instant context without relying on conversation history.
This approach works with any context window size. A 10,000-token context file leaves 190,000 tokens free for actual work. The AI spends its working memory on the current task, not reconstructing who you are and what you need.
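Loading such a file takes a few lines at session start. A minimal sketch, where the file name `MEMORY.md` and the wrapper tags are illustrative conventions, not requirements:

```python
from pathlib import Path

def build_prompt(user_message: str, memory_path: str = "MEMORY.md") -> str:
    """Prepend the persistent memory file so every session starts with context."""
    memory = Path(memory_path).read_text(encoding="utf-8")
    return f"<memory>\n{memory}\n</memory>\n\n{user_message}"
```

Because the memory lives on disk, it survives every session reset; the conversation history can be dropped without losing who you are or what you're working on.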
Stop Fighting Context Limits
We build Claude Code + Obsidian setups that give AI persistent memory across every session. One markdown file replaces hours of re-explanation.
Build Your Memory System — $997