What Is a Context Window in AI?
Every conversation with AI has a hard limit. The context window defines how much information the AI can see and process in a single exchange. Hit that limit, and the AI starts forgetting things—sometimes mid-conversation.
Understanding context windows explains why AI sometimes loses track of earlier parts of your conversation, why it can't remember what you told it yesterday, and why building persistent memory requires a different approach.
The Working Memory of AI
Think of a context window as working memory. You can hold about seven things in your mind at once. AI has a similar constraint, measured in tokens instead of items.
A token is roughly three-quarters of a word. "Context window" is three tokens. "What is a context window in AI?" is nine tokens. Everything the AI sees—your prompt, its previous responses, any files you've shared—counts against this limit.
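The three-quarters rule gives you a quick back-of-the-envelope estimator. Here is a minimal sketch — a rough heuristic only, since real tokenizers split on subwords and vary by model:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token is about three-quarters of a word."""
    words = len(text.split())
    return round(words / 0.75)

print(estimate_tokens("Context window"))                   # 3
print(estimate_tokens("What is a context window in AI?"))  # 9
```

Good enough for budgeting, not for billing.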
Claude Opus 4.5 has a 200,000-token context window. That's about 150,000 words, or roughly 300 pages of text. Sounds like a lot. It fills up faster than you'd think.
What Counts Against the Limit
Every message in your conversation takes up space. Your question, the AI's answer, the correction you made, the follow-up—all of it sits in the context window until the conversation ends.
Files count too. Upload a 50-page document, and you've consumed 35,000 tokens before asking your first question. Add three more documents, a few rounds of Q&A, and you're halfway to the limit.
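A quick tally makes the math concrete. The per-item numbers below are illustrative estimates, not measured counts:

```python
WINDOW = 200_000  # e.g. a 200,000-token context window

# Illustrative token estimates for one session
usage = {
    "50-page document": 35_000,
    "three shorter documents": 3 * 20_000,
    "a few rounds of Q&A": 8_000,
}

used = sum(usage.values())
print(f"{used:,} of {WINDOW:,} tokens used ({used / WINDOW:.0%})")
# 103,000 of 200,000 tokens used (52%)
```

Half the window gone before the real work starts.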
Code repositories are worse. A medium-sized codebase can eat the entire context window before you've asked the AI to do anything. The AI needs room to work, not just room to read.
What Happens When You Hit the Limit
The AI doesn't warn you. It just starts dropping information from the beginning of the conversation. Your early messages vanish. The AI can't see them anymore, can't reference them, can't use them to inform its responses.
This creates bizarre behavior. You'll reference something you mentioned 30 minutes ago, and the AI will act like it never happened. Because from its perspective, it didn't. That part of the conversation fell out of the window.
Some systems truncate from the middle instead. They keep your initial prompt and the most recent exchanges but drop everything in between. Either way, information gets lost.
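Both strategies can be sketched in a few lines. This simplified version drops whole messages by count, whereas real systems budget by tokens:

```python
def truncate_oldest(messages: list[str], keep: int) -> list[str]:
    """Drop from the beginning: keep only the most recent messages."""
    return messages[-keep:]

def truncate_middle(messages: list[str], keep: int) -> list[str]:
    """Keep the initial prompt plus the most recent exchanges."""
    return messages[:1] + messages[-(keep - 1):]

convo = ["system prompt", "msg 1", "msg 2", "msg 3", "msg 4", "msg 5"]
print(truncate_oldest(convo, 3))  # ['msg 3', 'msg 4', 'msg 5']
print(truncate_middle(convo, 3))  # ['system prompt', 'msg 4', 'msg 5']
```

Notice that in the first strategy even the system prompt eventually falls out; the second protects it at the cost of the middle of the conversation.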
Why the Limit Exists
Processing text requires computation, and the attention mechanism inside these models compares every token with every other token. Double the context window, and you quadruple those comparisons. The relationship isn't linear. It's quadratic: a 200,000-token window requires roughly four times the attention computation of a 100,000-token window, not twice.
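In rough numbers, ignoring constants and the optimizations production systems layer on top:

```python
def attention_comparisons(n_tokens: int) -> int:
    """Naive self-attention compares every token with every other token."""
    return n_tokens ** 2

ratio = attention_comparisons(200_000) / attention_comparisons(100_000)
print(ratio)  # 4.0
```

Twice the tokens, four times the work.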
Memory costs climb too. Keeping the conversation's state accessible while the AI generates a response demands substantial memory, and that demand grows with every token in the window. Longer windows mean slower responses and higher costs.
The limit is an engineering tradeoff. Larger windows provide more context but cost more to run and respond more slowly. Smaller windows run faster and cheaper but forget more quickly.
Longer Windows Don't Solve Memory
Context windows keep growing. A few years ago, 4,000 tokens was standard. Now we have models with 200,000-token windows, and million-token windows are in testing.
Bigger windows help with single-session tasks. You can process longer documents, maintain context through longer conversations, work with larger codebases without losing track.
But they don't solve the core problem: when the conversation ends, the context window clears. Start a new session tomorrow, and the AI remembers nothing. No matter how large the window, it only holds what's currently loaded.
Building Memory That Persists
Real memory requires storage outside the context window. Something that survives between sessions. Something the AI can reference when it needs information from previous conversations.
You can build this with a markdown file. One document that holds your preferences, your project details, your common tasks. Load it at the start of each session, and the AI has instant context without relying on conversation history.
This approach works with any context window size. A 10,000-token context file leaves 190,000 tokens free for actual work. The AI spends its working memory on the current task, not reconstructing who you are and what you need.
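Loading such a file takes a few lines at session start. A minimal sketch, where the file name `MEMORY.md` and the wrapper tags are illustrative conventions, not requirements:

```python
from pathlib import Path

def build_prompt(user_message: str, memory_path: str = "MEMORY.md") -> str:
    """Prepend the persistent memory file so every session starts with context."""
    memory = Path(memory_path).read_text(encoding="utf-8")
    return f"<memory>\n{memory}\n</memory>\n\n{user_message}"
```

Because the memory lives on disk, it survives every session reset; the conversation history can be dropped without losing who you are or what you're working on.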
Stop Fighting Context Limits
We build Claude Code + Obsidian setups that give AI persistent memory across every session. One markdown file replaces hours of re-explanation.
Build Your Memory System — $997