March 30, 2026

Context Engineering Techniques for Building Reliable & Performant Industrial AI Agents

Context architecture is the single biggest bottleneck to scaling agentic operations in manufacturing. Not model capability. Not compute. Not even data volume. To perform reliably, AI agents need to receive exactly the knowledge they need, structured in a way they can reason over, and scoped to the task at hand.

That's the insight at the center of my conversation with Zach Etier, VP of Architecture at Flow Software, on the AI in Manufacturing podcast. Zach spent a decade at Northrop Grumman before moving to Flow to lead knowledge graph and agent architecture.

Why Does Giving an AI Agent More Context Actually Make It Worse?

The instinct makes sense. If the agent needs to troubleshoot a piece of equipment, give it the whole equipment manual. If it needs to generate a shift handover report, connect it to every data source you have. More coverage should mean better answers.

Except that's not how transformer architectures work. As Zach explains, the attention mechanism inside large language models uses a normalization process called softmax. Every token you add to the context window dilutes the weight given to every other token. Highly relevant information — a critical fault code, a specific SOP section, a key operator note — gets watered down as the volume of surrounding context increases. The agent literally can't distinguish signal from noise.
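This dilution is easy to see in miniature. The sketch below is a toy illustration, not real transformer attention: it applies softmax to one hypothetical "highly relevant" token score surrounded by growing amounts of low-relevance filler, and shows the relevant token's weight collapse as the context grows.

```python
import math

def softmax(scores):
    """Convert raw attention scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One highly relevant token (score 5.0) among low-relevance filler (score 1.0).
relevant_score, filler_score = 5.0, 1.0

for n_filler in (10, 100, 1000):
    weights = softmax([relevant_score] + [filler_score] * n_filler)
    print(f"{n_filler:>5} filler tokens -> relevant token weight: {weights[0]:.3f}")
```

The relevant token's score never changes; only the volume around it does. Because softmax normalizes across everything in the window, adding context mechanically takes weight away from what matters.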

There's also a positional problem. Research from Stanford's "Lost in the Middle" paper showed that LLMs behave like humans at a conference talk: they remember the beginning and the end, but lose what's in the middle. So if you pack the context window with everything you can find, the most important information inevitably migrates into that dead zone. You're paying for tokens the model can't effectively reason on — and those tokens are actively degrading the quality of the output.

What Changed Between Prompt Engineering and Context Engineering for Industrial AI?

Back in 2024, most people interacted with AI through a simple chat interface. You wrote a prompt, hit send, and got a response. There was no intermediate reasoning step. The model just predicted the next token until it produced a complete answer. In that world, prompt engineering made sense — the craft was about structuring your input string to get the best single-shot response.

The shift happened when reasoning models arrived. These models don't just produce a final answer. They show their work, plan their approach, and — critically — they can call tools during that reasoning process. That's the capability that made agents possible. Once an LLM could call a tool, retrieve fresh data from a historian, read a section of a document, and then decide what to do next, we moved from static prompts to dynamic context. Data changes. Files get updated. New business logic gets created. The agent operates in a live environment, not a frozen snapshot.

Context engineering is the discipline that emerged from this shift. It's the practice of curating and managing what goes into the context window so the agent has exactly what it needs — highly relevant, well-structured, with no bloat. Not maximum context. Optimal context.

What Are the Hidden Costs of Ignoring Context Management in Manufacturing AI?

Three failure patterns show up repeatedly when teams don't manage context intentionally.

The first is context rot. This is the dilution problem at scale. As you connect more MCP servers, load more tool descriptions, and feed in more data, the agent's ability to focus on what matters degrades steadily. Zach noted that some teams have seen 20% of their context window consumed just by loading MCP server tool descriptions — before the agent even starts its task. That's a massive overhead that directly reduces the space available for actual reasoning.
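A quick back-of-the-envelope check makes this overhead concrete. The numbers below are illustrative (the word-to-token ratio is a rough heuristic, and the tool counts are invented), but the arithmetic is the point: tool metadata alone can eat a large slice of the window before any task data arrives.

```python
def mcp_overhead(window_tokens, tool_descriptions, tokens_per_word=1.3):
    """Estimate the share of the context window consumed by tool metadata
    before the agent has read a single byte of task data."""
    overhead = sum(int(len(d.split()) * tokens_per_word) for d in tool_descriptions)
    return overhead, overhead / window_tokens

# Illustrative only: 50 connected tools, ~600 words of schema and
# description each, against a 200k-token window.
descriptions = ["word " * 600] * 50
tokens, share = mcp_overhead(200_000, descriptions)
print(f"~{tokens} tokens ({share:.0%} of the window) gone before the task starts")
```

Running this kind of audit against your actual MCP configuration is a cheap first step toward deciding which tools an agent genuinely needs loaded.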

The second is context poisoning. If the agent reads a piece of incorrect information — or hallucinates one — that bad token remains in the context window and continues to influence every subsequent response. Even if you correct it in a follow-up prompt, the original error still exerts gravitational pull. The only reliable fix is to isolate the corrupted context in a separate agent window, or clear the window entirely.

The third is context confusion. This happens when two sources of context contain contradictory information. Document A says the maintenance interval is 500 hours. Document B says 750. The agent has no way to determine which is objectively true, so it either picks one arbitrarily or produces an incoherent blend of both. In manufacturing, where precision matters and safety is non-negotiable, that kind of ambiguity is dangerous.
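One defensive tactic is to flag contradictions before they ever reach the context window. This is a minimal sketch (the fact keys and file names are hypothetical): facts are expressed as (key, value, source) triples, and any key whose sources disagree gets surfaced for human resolution instead of being handed to the agent.

```python
from collections import defaultdict

def find_conflicts(facts):
    """Group (key, value, source) facts and flag keys whose sources disagree."""
    values = defaultdict(set)
    sources = defaultdict(list)
    for key, value, source in facts:
        values[key].add(value)
        sources[key].append((source, value))
    return {k: sources[k] for k, vals in values.items() if len(vals) > 1}

facts = [
    ("pump_101/maintenance_interval_hours", 500, "doc_a.pdf"),
    ("pump_101/maintenance_interval_hours", 750, "doc_b.pdf"),
    ("pump_101/max_rpm", 3600, "doc_a.pdf"),
]
print(find_conflicts(facts))  # flags only the maintenance-interval key
```

The agent never has to arbitrate between 500 and 750 hours, because the pipeline refuses to load both.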

Each of these patterns has the same root cause: teams treating the context window as a bucket to fill rather than a resource to manage.

How Do the Three Core Techniques of Context Engineering Work for Industrial Agents?

Zach outlines three techniques that form the foundation of reliable agent design: persistence, summarization, and isolation.

Persistence is the simplest. When you clear a context window — which you should be doing regularly — there may be curated knowledge the agent generated that you don't want to lose. Writing that context to the file system as a structured document (markdown, YAML, XML) means a future agent or a fresh session can read it back in. This is how agents maintain continuity across sessions without carrying the full weight of everything that came before.
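In practice, persistence can be as simple as writing a structured markdown file before clearing the window. The sketch below assumes a hypothetical file layout and note format; the pattern, not the specifics, is what matters.

```python
from pathlib import Path

# Hypothetical location for persisted agent notes.
NOTES_PATH = Path("agent_memory") / "line3_troubleshooting.md"

def persist_context(summary: str, key_facts: dict) -> None:
    """Write curated knowledge to disk before the context window is cleared."""
    NOTES_PATH.parent.mkdir(exist_ok=True)
    body = "# Session summary\n\n" + summary + "\n\n## Key facts\n"
    for key, value in key_facts.items():
        body += f"- **{key}**: {value}\n"
    NOTES_PATH.write_text(body)

def restore_context() -> str:
    """A fresh session reads the structured notes back in."""
    return NOTES_PATH.read_text() if NOTES_PATH.exists() else ""
```

A fresh session starts with a few hundred tokens of distilled knowledge instead of the full transcript that produced it.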

Summarization, or compaction, is more powerful and more dangerous. The idea is straightforward: if the agent has read a 200-page equipment manual but only needs three sections to complete its task, you summarize the relevant insights and evict the rest. You go from 80,000 tokens to 5,000. Tools like Claude Code already do auto-compaction when the context window fills up, but the results are often unreliable because the summarizer doesn't always know what's important for downstream tasks. Zach described this as an art — it requires understanding the knowledge structure of your domain well enough to know what an agent will need later. Get it wrong, and the agent loses critical instructions after compaction and starts behaving erratically.
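The mechanics of compaction can be sketched in a few lines. In a real system the `summarize` callable would be an LLM call; here it is a stub, and the manual sections are invented, but the shape is the same: keep what downstream tasks need, evict the rest, and retain only a summary of what was dropped.

```python
def compact(sections, relevant_titles, summarize):
    """Keep the sections needed downstream; replace the rest with a summary.

    `summarize` stands in for an LLM call in a real system -- here it is
    whatever callable the caller passes in."""
    kept = [(title, body) for title, body in sections if title in relevant_titles]
    evicted = [(title, body) for title, body in sections if title not in relevant_titles]
    return kept, summarize(evicted)

# A stub summarizer that just records what was dropped.
stub = lambda evicted: "Evicted sections: " + ", ".join(t for t, _ in evicted)

manual = [("Safety", "..."), ("Wiring", "..."), ("Fault codes", "..."), ("Warranty", "...")]
kept, summary = compact(manual, {"Fault codes", "Wiring"}, stub)
```

The hard part Zach describes is hidden in `relevant_titles`: deciding what a future step will need is domain judgment, and no amount of tooling removes that responsibility.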

Isolation is the technique with the most architectural implications. By spinning up a sub-agent with its own clean context window, you can offload research, filtering, or exploratory tasks without contaminating the main agent's working memory. The sub-agent reads through files, discards what's irrelevant, and returns only the curated output the main agent needs. This is how multi-agent systems handle complexity without the context rot that comes from stuffing everything into a single window.
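The isolation pattern reduces to a simple contract: raw material goes into the sub-agent's window, and only curated findings come back. This is a minimal sketch with a stub LLM call and invented file names; a production system would wire `llm_call` to a real model.

```python
def run_subagent(task, files, llm_call):
    """Do the research inside a fresh context; only findings come back."""
    sub_context = [f"Task: {task}"] + [f"--- {path} ---\n{text}" for path, text in files]
    return llm_call("\n".join(sub_context))  # the sub-agent's private window

def main_agent_step(main_context, task, files, llm_call):
    """The main agent receives curated findings, never the raw files."""
    findings = run_subagent(task, files, llm_call)
    main_context.append(f"Sub-agent findings: {findings}")
    return main_context

# Stub LLM: pretends to extract the one relevant fact from the manual.
stub_llm = lambda prompt: "Fault E42 indicates a jammed capper head."
ctx = main_agent_step(["System: you are the line-3 agent."],
                      "Find the meaning of fault E42",
                      [("manual_ch7.txt", "hundreds of irrelevant lines")],
                      stub_llm)
```

Whatever the sub-agent read to reach its conclusion is discarded along with its window; the main agent's working memory grows by one sentence, not one manual.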

Why Should Deterministic Code Handle Calculations Instead of AI Agents?

One of the most practical takeaways from the conversation is Zach's rule of thumb: if a task can be done with deterministic code, it should be done with deterministic code.

We have 20 to 30 years of best practices for writing reliable, testable, source-controlled software. Calculations — OEE, throughput rates, quality metrics — belong in validated scripts, not in the probabilistic reasoning of a language model. The agent's role is orchestration: deciding when to call the OEE calculation tool, what data to pass into it, and how to present the result. It doesn't need to know how to calculate OEE. It just needs to know what OEE is at a specific time.
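OEE is a good example of what belongs in a validated script rather than a model's reasoning trace. The standard formula is availability × performance × quality; the numbers below are an invented example shift.

```python
def oee(planned_time_min, run_time_min, ideal_cycle_time_min, total_count, good_count):
    """Deterministic OEE: availability * performance * quality."""
    availability = run_time_min / planned_time_min
    performance = (ideal_cycle_time_min * total_count) / run_time_min
    quality = good_count / total_count
    return availability * performance * quality

# Example shift: 480 min planned, 400 min running, 1.0 min ideal cycle,
# 350 units produced, 340 of them good.
print(round(oee(480, 400, 1.0, 350, 340), 3))  # → 0.708
```

Exposed as a tool, this function is testable, source-controlled, and gives the same answer every time. The agent's job is only to decide when to call it and with which data.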

Where agents genuinely excel is in the fuzzy logic that deterministic code handles poorly. Routing decisions based on natural language input. Interpreting operator notes to determine severity. Deciding whether a maintenance request is urgent or routine based on unstructured context. That's the orchestration layer where probabilistic reasoning creates real value — and it's precisely where you want the agent focused, not distracted by arithmetic it might get wrong.

How Do Knowledge Graphs Enable Enterprise-Scale Industrial AI Agents?

Scaling agents from a single production line to a multi-site enterprise introduces a fundamentally different challenge. A small unified namespace might be manageable for an agent to reason over. But an enterprise with thousands of pieces of equipment across multiple sites and domains will overwhelm any context window — and the agent will lose its sense of where it is in the data.

Zach's answer is knowledge graphs, specifically federated knowledge graphs. A knowledge graph pairs an ontology — a model of meaning that describes what things are and how they're related — with an instance model that contains the actual live data. The ontology becomes a map the agent can use to navigate the instance data. Instead of trying to reason over every data point in a flat namespace, the agent can look at the ontology, understand that a filler is part of a line that also includes a capper, and then query the specific instance data it needs.
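The two-layer idea can be sketched with plain dictionaries. The ontology and instance tags below are hypothetical, and a real knowledge graph would use a proper graph store and query language (RDF/SPARQL or similar), but the division of labor is the same: the ontology answers "what relates to what," and the instance model is queried only for the slice the agent needs.

```python
# Ontology: a model of meaning -- what things are and how they relate.
ontology = {
    "Line": {"has_part": ["Filler", "Capper"]},
    "Filler": {"is_a": "Equipment"},
    "Capper": {"is_a": "Equipment"},
}

# Instance model: the actual live data (hypothetical tags and values).
instances = {
    "site1/line3/filler1": {"type": "Filler", "speed_bpm": 580},
    "site1/line3/capper1": {"type": "Capper", "torque_ok": True},
}

def siblings_on_line(equipment_type):
    """Use the ontology to learn what else sits on the same line."""
    return [p for p in ontology["Line"]["has_part"] if p != equipment_type]

def query_instances(equipment_type):
    """Pull only the instance data the agent actually needs."""
    return {tag: data for tag, data in instances.items() if data["type"] == equipment_type}
```

An agent troubleshooting a filler consults the ontology, discovers the capper is a sibling on the same line, and queries just those two instance slices, instead of holding every tag in the enterprise in its window.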

The federated part is equally important. Zach argues — drawing on Conway's Law and his experience at Northrop Grumman — that no single person in a large enterprise understands the entire operation well enough to model it top-down. Domain experts know their domains. Manufacturing knows manufacturing. Quality knows quality. A federated approach lets each domain model its own knowledge, then governs the interfaces where those domains integrate. It's how organizations actually work, and it's the only realistic path to enterprise-scale knowledge architecture for agents.

What Should Manufacturing Leaders Do Differently to Prepare for Agentic AI?

The strategic question isn't "which AI model should we use?" Models will continue to improve. The question is: "Do we have the knowledge architecture to make any agent reliable?"

That reframe points to three priorities. First, start treating prompts, skills, and sub-agent definitions as code. Source control them. Evaluate them. Iterate on them. The teams building these artifacts now are developing an expertise that will be a significant competitive advantage in the coming years — not just in the artifacts themselves, but in the skill of knowing what makes a good skill, how to structure context, and what information an agent actually needs.

Second, tackle tribal knowledge. Agents can't do on-the-job training. The decades of operational expertise that live in the heads of experienced operators and engineers need to be digitized and structured in a way that agents can reason over. This has always been a known problem, but agents create the forcing function that makes solving it urgent.

Third, invest in information architecture. The difference between an agent that hallucinates and one that reliably generates a shift handover report comes down to whether you've curated the right context — the right historian data, the right operator notes, the right MES and PLC signals — and structured it so the agent can find exactly what it needs without drowning in everything else.

Kudzai Manditereza

Founder & Educator - Industry40.tv

Kudzai Manditereza is an industrial data and AI educator and strategist. He specializes in Industrial AI, IIoT, Unified Namespace, Digital Twins, and Industrial DataOps, helping manufacturing leaders implement and scale Smart Manufacturing initiatives.

Kudzai shares this thinking through Industry40.tv, his independent media and education platform; the AI in Manufacturing podcast; and the Smart Factory Playbook newsletter, where he shares practical guidance on building the data backbone that makes industrial AI work in real-world manufacturing environments. Recognized as a Top 15 Industry 4.0 influencer, he currently serves as Senior Industry Solutions Advocate at HiveMQ.