Context engineering for teams

Just-in-time context: give your AI the right slice, not everything

Stuffing the whole window hurts answers. Just-in-time context delivers the right slice at the right moment. Here is how to do it across your team's tools.

May 29, 2026 by BaseThread

There is a reflex worth unlearning: the bigger the context window, the more you should cram into it. Models now take hundreds of thousands of tokens, so why not paste the whole knowledge base and let the model sort it out?

Because it does not work. Past a point, more context makes answers worse, not better. The relevant fact is still in there, but it is buried under fifty that are not, and the model spreads its attention across all of them. The skill is not loading everything. It is loading the right slice, at the right moment. That is just-in-time context.

Borrowed from the factory floor

The name comes from just-in-time manufacturing. You do not stockpile every part on the assembly line. You pull the exact component when the station needs it. Less waste, less clutter, less time spent digging through a pile.

Context works the same way. A task needs a specific slice of what is true: the relevant decisions, the current state of one project, the conventions that apply here. It does not need the entire company. Delivering only that slice, only when it is needed, keeps the window clean and the answer focused.

Why dumping everything backfires

Overloading the context window is the fast path to context rot, the decline in answer quality as the window fills with stale or irrelevant material. Three things go wrong at once:

  • Noise crowds out signal. The fact that matters is one of fifty, and the model weights the irrelevant ones too.
  • Stale beats fresh. Old context that contradicts today's reality drags the answer backward.
  • Cost and latency climb. A bloated window is slower and more expensive, for a worse result.

And no, a bigger window does not save you. It raises the ceiling on how much fits, not the quality of what you put in. We make the full argument in "Bigger context windows won't fix team knowledge". Capacity is not curation.

How to do just-in-time context

The mechanics come down to one rule: make the source queryable, so a tool can ask for the slice it needs instead of swallowing the whole thing.

  1. Structure the context. A flat dump can only be loaded whole. A structured source, organized by company, products, teams, projects, and the individual, can be read selectively. Structure is what makes a slice addressable.
  2. Scope by task. Answering a question about one project means reading that project's state, decisions, and open work, not those of the other ten. Scoping gives the model better context and limits what each tool can see, which is also better security.
  3. Pull at session start. Deliver the slice when the task begins, over a protocol the tool already speaks, rather than asking a person to paste it.
  4. Keep it current. A slice is only useful if it reflects reality now. Context that updates as work happens stays accurate without manual upkeep.
  5. Curate ruthlessly. Even the right slice should be the signal, not every artifact tangentially related to it. Less, but correct.
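
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not any product's API: `ContextItem`, `select_slice`, the field names, and the 30-day freshness cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ContextItem:
    """One addressable piece of context (hypothetical schema)."""
    project: str         # structure: which project this item belongs to
    kind: str            # e.g. "decision", "task", "activity"
    text: str
    updated_at: datetime
    priority: int = 0    # higher means more important to surface


def select_slice(items, project, now, max_age=timedelta(days=30), limit=5):
    """Return the slice for one project: scoped, fresh, and capped."""
    scoped = [i for i in items if i.project == project]             # scope by task
    fresh = [i for i in scoped if now - i.updated_at <= max_age]    # keep it current
    ranked = sorted(fresh, key=lambda i: (i.priority, i.updated_at), reverse=True)
    return ranked[:limit]                                           # curate ruthlessly


now = datetime(2026, 5, 29)
store = [
    ContextItem("billing", "decision", "Move invoicing to monthly cycles", now - timedelta(days=2), 2),
    ContextItem("billing", "task", "Migrate legacy invoices", now - timedelta(days=90), 1),
    ContextItem("onboarding", "decision", "Drop the welcome survey", now - timedelta(days=1), 2),
]

slice_ = select_slice(store, project="billing", now=now)
print([i.text for i in slice_])  # prints ['Move invoicing to monthly cycles']
```

Calling something like `select_slice` once when a session opens corresponds to step 3. The stale billing task and the unrelated onboarding decision never reach the window.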

This is context engineering in motion. The discipline is not just adding the right things. It is also keeping the wrong things out, and timing the arrival of the right ones.

What this looks like with BaseThread

BaseThread is built around a curated context graph: the structure (company, products, teams, projects, you) plus the live streams of activity, decisions, and tasks. That structure is what makes just-in-time context possible.

When a tool connects over MCP, through either the local Mac app or the remote endpoint, it reads the slice relevant to the current task, not the entire store. A question about one project surfaces that project's decisions, activity, and open tasks, scoped to what the caller is allowed to see. As tools work, they write activity, decisions, and tasks back, so the next slice is current. Integrations distill context from tools like Notion and HubSpot into that graph, keeping the signal rather than the raw dump, so what gets read is already curated.
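
To make the read-and-write-back loop concrete, here is a small in-memory sketch. It does not use MCP and is not BaseThread's interface: the `ContextGraph` class, its method names, and the caller name `editor-session` are hypothetical stand-ins for the idea of a permission-scoped read followed by a write-back.

```python
from collections import defaultdict


class ContextGraph:
    """In-memory stand-in for a shared context store (hypothetical, not a real API)."""

    def __init__(self):
        self._entries = defaultdict(list)   # project -> list of (kind, text)
        self._access = defaultdict(set)     # caller -> projects the caller may read

    def grant(self, caller, project):
        """Allow a caller to read one project's context."""
        self._access[caller].add(project)

    def write(self, project, kind, text):
        """Write-back: record a decision, task, or activity as work happens."""
        self._entries[project].append((kind, text))

    def read_slice(self, caller, project):
        """Return only the requested project's entries, if the caller may see them."""
        if project not in self._access[caller]:
            raise PermissionError(f"{caller} may not read {project}")
        return list(self._entries[project])


graph = ContextGraph()
graph.grant("editor-session", "billing")
graph.write("billing", "decision", "Invoices go out monthly")
graph.write("hiring", "task", "Schedule interviews")

first = graph.read_slice("editor-session", "billing")
graph.write("billing", "activity", "Updated invoice template")   # the tool writes back
second = graph.read_slice("editor-session", "billing")           # the next slice is current
```

In this sketch `first` holds one entry and `second` holds two, while a request for the `hiring` project from the same caller raises `PermissionError`. A real deployment would put this behind whatever protocol the tools already speak.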

The result is a context window that holds the right slice, every session, without anyone hand-picking it.

The test

If your AI answers a project question by reading your entire company, you are not doing just-in-time context. The goal is the relevant slice, scoped and current, delivered the moment the task starts.
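
One rough way to run this test is to count how much of what was loaded belongs to some other project. The function name `off_scope_share` and the sample project labels below are assumptions for illustration, not an established metric.

```python
def off_scope_share(loaded_projects, target_project):
    """Fraction of loaded items that belong to a project other than the target."""
    if not loaded_projects:
        return 0.0
    off_scope = sum(1 for p in loaded_projects if p != target_project)
    return off_scope / len(loaded_projects)


# Each entry is the project label of one item placed in the window.
whole_company = ["billing"] * 3 + ["hiring"] * 9 + ["onboarding"] * 8
just_in_time = ["billing"] * 3

print(off_scope_share(whole_company, "billing"))  # prints 0.85
print(off_scope_share(just_in_time, "billing"))   # prints 0.0
```

A share close to zero suggests the window holds a scoped slice. A share close to one suggests the tool is reading the whole company to answer a single-project question.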

TL;DR

Just-in-time context means giving a model the specific slice a task needs, when it needs it, rather than loading everything up front. Dumping the whole window triggers context rot: noise buries signal and answers degrade, and a bigger window does not fix it. The how-to has five steps: structure the source, scope by task, pull at session start, keep it current, and curate hard. A context graph read selectively over MCP delivers the right slice each session, which is what BaseThread is built to do.

A curated context graph, read by the slice, current as you work. BaseThread is in closed beta.

Frequently asked questions

What is just-in-time context?

Just-in-time context is the practice of giving a model the specific slice of information a task needs, at the moment it needs it, instead of loading everything up front. It mirrors just-in-time inventory: pull what is required when it is required. The payoff is a focused context window where signal is not buried under noise, which is exactly what keeps AI answers sharp.

Why not just put everything in the context window?

Because more context is not better context. As the window fills with material that does not fit the task, the model gives weight to the wrong details and answers degrade. That decline is called context rot. Bigger windows raise the ceiling on how much you can fit, not the quality of what you put in. A scoped, relevant slice beats a giant dump every time.

How do you deliver just-in-time context across tools?

You need a structured source the tools can query selectively, rather than a flat blob they ingest whole. With a context graph organized by company, products, teams, projects, and the individual, a tool can read the slice relevant to the current task over MCP, instead of the entire store. That is how just-in-time context works in practice for a team.
