Context windows – an LLM's short-term memory

Well g’day, it’s AB.

Today I’m going to talk about context, a word you’ve probably heard a lot if you’ve been playing around with large language models. You might’ve come across terms like context window or context window length. It’s actually a pretty vital concept.

Essentially, the context window refers to how far back the model can “scroll” – in other words, how much it can hold in its short-term memory. And short-term memory is probably the easiest way to think about it.

These models have been trained on vast amounts of data – up to and including the entire internet. That’s their long-term memory: the things they know, but aren’t actively thinking about unless you bring them up.

So if you ask, “Who was Australia’s first Prime Minister – was it Barton or Deakin?”, that’s something stored in long-term memory. But the context window lets the model bring recent details – like what you’ve just typed or said – into sharper focus.

Usually, that means recent parts of your chat, voice inputs, or image prompts. But increasingly, models can bring in a broader range of material: files, documents, raw data – even their own internal thoughts or reasoning chains.

They can also reach out with tools like web searches. So if you weren’t sure whether Barton was first PM, a search could help confirm it. [Spoiler: he was Australia’s first Prime Minister!]

Think of it like pub trivia rules: context determines what can be actively held in short-term memory.

Back in 2024, context windows were growing, but you’d still run into memory issues – like forgetting what you said half an hour ago.

Now in 2025, they’ve grown significantly – into the millions of tokens. For most users, it’s more than enough.

Just a heads up though: smaller or open-source models usually have much shorter context windows. So they might forget something you said just a few messages ago.
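That "forgetting" usually comes down to trimming: once a conversation outgrows the window, the oldest messages get dropped so the newest ones fit. Here's a minimal sketch of the idea, assuming a rough ~4-characters-per-token estimate (real models use proper tokenizers) and an illustrative 2048-token budget – both made-up numbers, not any particular model's:

```python
def estimate_tokens(text):
    """Very rough token estimate: ~4 characters per token.
    (Assumption for illustration -- real tokenizers vary by model.)"""
    return max(1, len(text) // 4)

def trim_to_window(messages, budget_tokens=2048):
    """Keep the most recent messages that fit the token budget.

    Older messages are dropped first -- which is exactly why a
    small-context model 'forgets' what you said a few turns ago."""
    kept = []
    used = 0
    for msg in reversed(messages):      # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > budget_tokens:
            break                       # everything older falls out of memory
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order
```

So with a tight budget, only the tail end of the chat survives – the start of the conversation is simply no longer in the model's short-term memory.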

To sum up: context is the short-term memory of a language model. It determines how much recent history it can hold onto – and that recent context gets weighted far more heavily in whatever conversation you’re having right now.

Listen Time: 2:28