"ai slop" is no longer the fault of the model, but the fault of the user.
in black box systems like claude code, context is the single input we can control - so how do we optimize it?
what is context?
context refers to everything you provide an llm when you send it a message.
this includes the prompt itself, along with all of the surrounding information; system prompts, metadata, your previous messages, the llm's thinking, tool calls, and responses - everything.
llms have limited context windows - simply because the bigger a conversation gets, the harder it becomes for them to accurately keep track of everything in it.
context is everything an llm can see when you ask it to generate a response
in claude code, our context window is only 200k tokens - this might sound like a lot, but actually fills up quite quickly. if we run /context, we can see why:
/context inside claude code
22.5% is reserved, and 10.2% is taken up by the system prompts. after things like mcp servers, subagents, and rules, you are not left with much.
we actually only have around 120k tokens to work with - and not only that, the performance of the llm degrades the more context we give it, regardless of whether we're anywhere near the window limit.
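roughly, the back-of-the-napkin math looks like this (the buffer and system prompt figures come from /context above; the last line varies per setup and is just an estimate):

```text
  200k   total window
-  45k   autocompact buffer (22.5%)
-  20k   system prompt + built-in tools (10.2%)
- ~15k   mcp servers, subagents, memory files (varies)
≈ 120k   left for your actual work
```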
so with this in mind, what should we be putting in our context that classifies it as the "optimal set of tokens" to maximize the outputs of the llm?
most of it is not that complicated
like most things, the 80/20 rule applies to vibe coding. i.e. you're already 80% of the way there if you've installed claude code and done the following basics:
from here, most of the generic advice you've likely already heard is correct.
creating subagents, custom commands, hooks, and multi-agent orchestration setups is a lot of fun, but... it's really not as big of a deal as we make it seem.
knowing how to properly use the basics is more important.
how to actually use this workflow
treat each new conversation as an objective, and keep to the scope of that objective. for example, in each new thread, have a goal:
for new projects, it's okay for your objective to be broader in scope - but this requires more planning and refining; because you are introducing more room for misinterpretation through ambiguity.
plan for longer, and then refine your plan for even longer - have claude ask you questions until it gets to the point of just asking questions for the hell of it.
ask it to review your plan multiple times - ask about the architecture, best practices, security risks, production readiness, and testing strategy - the goal is to provide detail wherever there is ambiguity.
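a review pass over a plan might look something like this (PLAN.md and the wording are just an illustration, not a magic prompt):

```text
review PLAN.md one more time before we write any code:
- does the architecture fit the existing project structure?
- any security risks or auth edge cases we haven't covered?
- what's the testing strategy for each phase?
- what is still ambiguous? keep asking me questions until nothing is left to guess.
```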
when you should reset (and how)
generally, if things are going well, and you intend to continue working on tasks that are similar or at least relevant to what's already in the context window - continue!
if you're close to reaching the context window limit, run /compact to make space for more - or let claude code automatically do it for you (that's what the 22.5% buffer is there for).
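you can also pass /compact instructions about what should survive the summary - something like:

```text
/compact keep the migration plan and the decisions we made about the auth flow
```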
if you don't know how your context window is looking, i created a helpful claude code plugin that shows information on how full your context is for you:
https://x.com/jarrodwatts/status/2007579355762045121
but what about when things are not going well? the model didn't do what you wanted, and now you're stuck in a loop of "that's terrible, please fix" → slop → "dude that is even worse, wtf are you thinking?" → slop.
if you're here, just rewind or start over
when this happens, you have a few options. what you should not do is continue in the thread and try to recover - it's not worth it. instead:
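roughly (exact keybindings vary between claude code versions):

```text
# rewind: press esc twice, pick the message where things went wrong,
# and re-send it with the detail the model was missing

# start over: wipe the thread entirely and re-prompt with what you learned
/clear
```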
the complexity trap to avoid
if you're on 𝕏, you're likely already flooded with different flashy setups - mcp servers, subagents, skills... you may have even bookmarked many things you "plan" to get around to reading.
we will cover these topics shortly - but my first piece of advice is not to over-complicate things prematurely. our goal, as anthropic puts it, is to "find the smallest possible set of high-signal tokens".
the more you flood your context with stuff like raw data from mcp servers, the more you're filling it with low-signal tokens - and burning your money in the process.
so let's go over some strategies of using the more complex features claude code and other ai tools have to offer.
using mcp servers for good context
mcp servers are basically just third-party tools that the llm can source useful context from - think docs, code from github, linear tickets, figma designs, etc.
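wiring one up is just a small bit of config - a project-scoped .mcp.json looks roughly like this (the server name and package below are placeholders, not a recommendation):

```json
{
  "mcpServers": {
    "docs-server": {
      "command": "npx",
      "args": ["-y", "some-docs-mcp-server"]
    }
  }
}
```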
they were originally giga hyped when they came out, but people quickly realized that a lot of them eat up your context like crazy and often weren't worth it.
personally, i'm currently running three that have proven very useful to me:
i mostly use mcp servers to gather context on how to properly implement the code - things i could do myself by researching docs and finding relevant code snippets.
but, it turns out anthropic has a name for this - a "just in time" context strategy - where the agent uses tools to find information itself when required - and it is quite effective for agentic coding tools like claude code.
however, this approach still eats up context; so let's talk about using subagents to call them more efficiently - one of my favourite sleeper tips.
using subagents to save context
claude code can create subagents (other instances of claude code) as children of the main agent - you can see what subagents you have setup using /agents.
these subagents, like your main agent, have system prompts on when to trigger, how to behave, and what tools they can call - including tools from mcp servers.
what's important about subagents is that they:
this means we can spawn subagents that perform token-expensive operations (like research) and provide a summary to the main agent that consumes comparatively very few tokens; a concise, value-dense version.
using subagents to save the main agent's context
my favourite workflow that implements this idea is a custom "librarian" subagent that runs sonnet (instead of opus) to scan open source repos and documentation, and provide a condensed summary back to my main agent.
i'll ask my main agent, "use librarian to research how to do x with y library, and then implement z" → the subagent will trigger, and use all the tools at its disposal to find me high quality, accurate answers.
this strategy prevents your main context window from being polluted, and saves you money by using a cheaper model for an easier task.
i share the full details of this subagent here:
https://x.com/jarrodwatts/status/2006231693435486563
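the post above has the full version, but the general shape is simple: a subagent is just a markdown file in .claude/agents/ with yaml frontmatter. a minimal sketch (the prompt wording here is illustrative, not the exact one i use):

```markdown
---
name: librarian
description: researches open source repos and docs, then reports back a short, value-dense summary. use when external library or API context is needed.
tools: WebSearch, WebFetch, Read, Grep, Glob
model: sonnet
---

you are a research librarian. given a library or question, find the relevant
docs and source code, then return ONLY a concise summary: key APIs, minimal
code snippets, and gotchas. never dump raw pages back to the caller.
```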
using skills to pull in relevant context
skills are kinda the reverse of subagents, because instead of delegating tasks to a specialized agent with a separate context - you bring specialized skills into the current agent's context window.
for example, claude code includes a "frontend designer" skill that allows you to pull a rather long prompt into context that tells claude a set of dos & don'ts of designing frontends.
how a skill gets pulled into context and used
again, these workflows sound flashy - but are not that complicated. claude is simply bringing a chunk of text into its context when it thinks it should make use of a skill.
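under the hood, a skill is just a folder containing a SKILL.md file - the frontmatter description tells claude when to load it, and the body is the text that gets pulled in. a stripped-down sketch (not the actual frontend designer skill):

```markdown
---
name: frontend-design-review
description: guidelines for writing and reviewing frontend UI code. use when designing or restyling components.
---

when working on frontend code:
- reuse the project's existing design tokens and spacing scale
- prefer accessible defaults (labels, focus states, contrast)
- keep components small and composition-friendly
```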
i also made a quick post on how to install this skill here:
https://x.com/jarrodwatts/status/2006880530860748845
takeaways
good vibe coding is about optimizing for value-dense context. any information you add, or receive back from the llm, should concisely help it answer your next request.
if it is not doing that - you should not continue in the same context; this is key to avoiding the frustrating slop traps you may often find yourself in.
the fancy commands you see on twitter - subagents, mcps, skills, etc. may make you feel like you're falling behind...
in reality, it's not as complicated as it sounds - try your best to help the llm with concise, quality information & give it tools to find relevant information itself; just as you would a colleague.