Your Context Is Poisoned

Lance Martin at LangChain published a framework for context engineering with four operations: Write, Select, Compress, and Isolate. Each operation has a failure mode. We mapped all four to real production data from our 6-agent autonomous system.

The result: 4,768 violations detected. Every single one traces back to one of these four poisoned context patterns.

Failure Mode 1: Stale Context (Write)

Your agent's instructions were written three sprints ago. The API changed. The schema migrated. The agent keeps generating code against a version of reality that no longer exists.

In our system, stale context is the most common violation category. Agent CLAUDE.md files reference file paths that have moved, cite constraints from specs that were superseded, and enforce patterns the codebase abandoned weeks ago.

The fix is not better documentation. Documentation drifts by definition. The fix is structural enforcement -- L5 hooks that validate context freshness before the agent acts on it. When an instruction references a file path, the hook verifies the path exists. When a constraint cites a spec, the hook confirms the spec is still active.

Detection tools tell you the context was stale after the agent shipped broken code. The prevent-by-construction approach is different: an L5 hook prevents the stale context from reaching the agent at all.
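A freshness hook like this can be sketched in a few lines. This is an illustrative minimal version, not the system's actual L5 hook: the path-matching regex and the `pre_act_hook` name are assumptions, and a production hook would also verify spec status, not just file existence.

```python
import re
from pathlib import Path

# Rough heuristic for file-path references inside instruction text (assumption).
PATH_PATTERN = re.compile(r"(?:[\w./-]+/)?[\w.-]+\.(?:py|md|json|yaml|toml)")

def validate_context_freshness(instructions: str, repo_root: Path) -> list[str]:
    """Return stale references found in an instruction file.

    Every file path mentioned in the instructions must exist in the repo;
    anything that does not is flagged before the agent ever sees it.
    """
    violations = []
    for match in PATH_PATTERN.finditer(instructions):
        if not (repo_root / match.group(0)).exists():
            violations.append(f"stale path reference: {match.group(0)}")
    return violations

def pre_act_hook(instructions: str, repo_root: Path) -> str:
    # Prevent-by-construction: stale context never reaches the agent.
    violations = validate_context_freshness(instructions, repo_root)
    if violations:
        raise RuntimeError("context blocked: " + "; ".join(violations))
    return instructions
```

The key design choice is that the hook raises before the agent acts, rather than logging a warning after the fact.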

Failure Mode 2: Missing Context (Select)

The agent has access to 200K tokens of context window. It selects 40K of conversation history, 20K of file contents, and zero bytes of the configuration that actually matters.

We track this as "context gap" violations. The agent makes a decision that would have been correct if it had read the target repo's CLAUDE.md first. It didn't, because nothing forced it to. The context was available but not selected.

Our enforcement ladder addresses this with mandatory context loading. Before a coder agent touches any repo, an L5 hook verifies it has read the repo's CLAUDE.md. Before an agent responds to an operator message, it must check its inbox. These are not suggestions -- they are automated gates that block execution until the required context is loaded.
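The gate pattern can be sketched as follows. This is a minimal illustration, assuming a per-repo CLAUDE.md layout; the class name and method names are hypothetical, and the real system enforces this at the hook layer rather than in application code.

```python
from pathlib import Path

class ContextGate:
    """Blocks repo operations until the required context has been loaded."""

    def __init__(self) -> None:
        self._loaded: set[Path] = set()

    def load_context(self, repo: Path) -> str:
        # The agent must actually read the file; loading is what unlocks the gate.
        text = (repo / "CLAUDE.md").read_text()
        self._loaded.add(repo.resolve())
        return text

    def require_context(self, repo: Path) -> None:
        # The automated gate: execution stops here if context was never selected.
        if repo.resolve() not in self._loaded:
            raise PermissionError(f"blocked: read {repo}/CLAUDE.md before acting")

    def edit_file(self, repo: Path, rel_path: str, content: str) -> None:
        self.require_context(repo)
        (repo / rel_path).write_text(content)
```

The same shape applies to the inbox check: any action on an operator message routes through a `require_*` call that fails closed.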

Failure Mode 3: Bloated Context (Compress)

Context windows are large. 200K tokens sounds infinite until your agent has read 40 files, run 80 commands, and accumulated 150K tokens of conversation history. At that point, compression kicks in. Earlier messages get summarized or dropped.

What gets dropped first? The instructions at the top. The system prompt. The behavioral constraints.

We measured this directly: agents averaged 12 rule violations per day after context compression events. They kept working confidently, generating output that violated the rules they could no longer see. This is the silent failure mode -- your agent forgets its rules every 45 minutes and never tells you.

The pre-compaction memory flush hook solves this. At 150 tool calls, it writes critical context to persistent storage before compression hits. When the agent's context gets compressed, the knowledge survives on disk.
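A pre-compaction flush can be sketched like this. The 150-call trigger comes from the text; the JSON storage format and the `remember`/`restore` API are illustrative assumptions, not the system's actual interface.

```python
import json
from pathlib import Path

FLUSH_THRESHOLD = 150  # tool calls before compression risk, per the article

class MemoryFlushHook:
    """Writes critical context to persistent storage before compaction hits."""

    def __init__(self, store: Path) -> None:
        self.store = store
        self.tool_calls = 0
        self.critical: dict[str, str] = {}

    def remember(self, key: str, value: str) -> None:
        # Tag context that must survive compression (rules, constraints).
        self.critical[key] = value

    def on_tool_call(self) -> None:
        self.tool_calls += 1
        if self.tool_calls == FLUSH_THRESHOLD:
            self.flush()

    def flush(self) -> None:
        # Persist before the summarizer drops the earliest messages.
        self.store.write_text(json.dumps(self.critical, indent=2))

    def restore(self) -> dict[str, str]:
        # After compaction, reload the rules the agent can no longer see.
        return json.loads(self.store.read_text())
```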

Failure Mode 4: Leaking Context (Isolate)

Multi-agent systems have a context isolation problem. Agent A's constraints leak into Agent B's context. Agent B's research pollutes Agent C's execution queue. Without isolation boundaries, agents influence each other in ways nobody intended.

We run 6 agents with distinct roles, including coder, CEO, oracle, and communications. Each has its own memory, goals, and behavioral rules. Without isolation enforcement, the coder agent was picking up strategic directives meant for the CEO agent and making product decisions it had no authority to make.

Isolation enforcement means each agent's context is structurally bounded. Cross-agent signals follow a defined routing protocol. The coder agent cannot read the CEO's mailbox. The oracle cannot modify the coder's priorities. These boundaries are enforced at the data access layer, not by asking agents to be careful.
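Enforcing the boundary at the data access layer can be sketched as a read policy the mailbox store checks on every access. The specific policy entries below are invented for illustration; only the pattern, denial happening in the store rather than in the agent's prompt, reflects the article.

```python
# Which mailboxes each role may read -- illustrative policy, not the real one.
READ_POLICY: dict[str, set[str]] = {
    "coder": {"coder"},
    "ceo": {"ceo"},
    "oracle": {"oracle"},
    "communications": {"communications", "ceo"},
}

class MailboxStore:
    """Cross-agent messaging with isolation enforced at the access layer."""

    def __init__(self) -> None:
        self._boxes: dict[str, list[str]] = {role: [] for role in READ_POLICY}

    def send(self, to_role: str, message: str) -> None:
        self._boxes[to_role].append(message)

    def read(self, reader: str, mailbox: str) -> list[str]:
        # The boundary lives here, not in the agents' instructions.
        if mailbox not in READ_POLICY.get(reader, set()):
            raise PermissionError(f"{reader} may not read {mailbox}'s mailbox")
        return list(self._boxes[mailbox])
```

Because the check runs inside the store, a coder agent that tries to pick up a strategic directive fails structurally, regardless of what its prompt says.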

The Numbers

From our production system:

  • 4,768 total violations detected across 6 agents
  • 18 violations promoted to structural enforcement (L3-L5)
  • 265:1 violation-to-promotion ratio (4,768 / 18) -- the real measure of self-improvement velocity
  • < 5% regression rate on violations that received L5 enforcement

Detection finds violations. Enforcement makes them impossible. That is the difference between monitoring context health and engineering context health.

What This Means for Your System

If you are running AI agents in production -- coding assistants, research agents, autonomous workflows -- your context is poisoned in at least one of these four ways. You might not know it yet because the failure mode is silent: the agent keeps producing confident output from degraded context.

The question is not whether your context is clean. The question is whether your system can detect and prevent context poisoning structurally, before the agent acts on it.

To see where your context engineering stands, run our open-source governance scanner on any public repository. Six dimensions scored, instant results, no signup required.

Try the Free Governance Scanner