October 6, 2025

Building a Knowledge Graph Context Layer for Industrial Data Analytics and AI

Despite investing millions in data infrastructure over the past decade, manufacturers still can't answer basic "why" questions quickly enough to matter. The problem isn't data availability; most companies have historians full of sensor data going back years. The problem is context. Or, more specifically, the problem is treating context as something you can define once upfront in a rigid schema rather than something learned iteratively through actual problem-solving.

Bob van de Kuilen, co-founder and CEO of Thread, with 25 years in operational strategy, argues that the industry has fundamentally misunderstood what context means. Traditional approaches (rigid ontologies, hierarchical tag structures, comprehensive metadata definitions) assume you can prescribe context before understanding how people will actually use the data. That assumption breaks down the moment someone asks a question that crosses the boundaries of your predefined structure.

The alternative: treat context as something built through human learning, tested through real problem-solving, and scaled only after validation. Knowledge graphs, not hierarchies, mirror how humans actually think about relationships between things. Start small with urgent problems, build context that solves them, validate that it works, then scale.

Why Industrial Data Ops Hasn't Solved the Problem

The promise of industrial data ops was straightforward: build reliable data pipelines, move data efficiently, store it properly, manage it centrally, define it clearly through unified namespaces or standard ontologies. Do all that, and using factory data to solve problems becomes easy.

Except it didn't work out that way for most manufacturers.

The missing piece:

You can engineer perfect data infrastructure—lossless compression, millisecond latency, pristine storage—but if people can't make sense of what the data tells them in the broader context of actual problems, they won't use it.

The production manager looking at downtime data doesn't just need to know "Line 3 was stopped for 45 minutes." They need to understand:

  • Was it during scheduled maintenance or unexpected failure?
  • Which specific component failed and why?
  • Has this happened before under similar conditions?
  • What other factors (materials, personnel, upstream processes) might have contributed?
  • What action should we take to prevent recurrence?

Traditional approaches add metadata ("this tag represents motor temperature, measured in Celsius, sampled every 5 seconds") and structure (Enterprise → Site → Area → Line → Motor). That's helpful but insufficient. It tells you what the data is, not what it means in the context of solving actual problems.

Where simplification fails:

Over the past two years, "context" became a buzzword. Vendors started claiming "we do context" by adding basic metadata or implementing hierarchical structures. But they grossly oversimplified what context actually requires.

Context isn't static information you attach to data. Context emerges through problem-solving, through asking "why" repeatedly until you find root causes, through discovering unexpected relationships between variables, through learning which combinations of factors produce which outcomes. You can't prescribe all of this upfront because you don't know what questions will matter until you're actually trying to solve real problems.

Context Is Language, Not Metadata

Here's a fundamentally different way to think about context: it's something humans do using language, not something engineers define in schemas.

How humans actually build understanding:

We understand things through relationships. The word "motor" means nothing in isolation. But "motor on Line 3, installed last March, typically runs at 1750 RPM, powers the main conveyor, tends to overheat when processing Product X, maintained by the evening shift crew"—now we're building context through relationships and narratives.
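
Written down, that narrative is just a handful of subject-relationship-object statements. A toy sketch, with entity and relationship names invented for illustration rather than any prescribed vocabulary:

```python
# Illustrative only: the motor's context expressed as relationships,
# not as metadata fields. All names are made up for the example.
motor_context = [
    ("Motor_L3", "installed", "last March"),
    ("Motor_L3", "typically_runs_at_rpm", 1750),
    ("Motor_L3", "powers", "Main conveyor"),
    ("Motor_L3", "overheats_when_processing", "Product X"),
    ("Motor_L3", "maintained_by", "Evening shift crew"),
]

for subject, relation, obj in motor_context:
    print(f"{subject} --{relation}--> {obj}")
```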

We learn context through conversation—call and response, questions and answers:

  • "What does that temperature spike mean?"
  • "It happens when we switch to Product X"
  • "Why only for Product X?"
  • "That product runs at higher speed, generates more friction"
  • "Is there a vibration sensor that confirms increased friction?"
  • "Yes, over here—want me to draw that relationship?"

Each exchange builds more context. Each connection reveals new understanding. This is how factory floor experts actually work—they don't consult predefined ontologies, they follow chains of cause and effect learned through experience.

Why metadata isn't enough:

Basic metadata tells you a tag represents "motor bearing temperature in degrees Celsius." Useful, but it doesn't tell you:

  • That readings are invalid when the line is in startup mode
  • That this motor behaves differently than the identical motor on Line 5 due to ventilation differences
  • That temperature spikes correlate with specific material suppliers
  • That maintenance calibrated this sensor differently than others

This operational context—the relationships, the exceptions, the learned patterns—matters far more than the metadata. And it can't be prescribed upfront because much of it is discovered through problem-solving.

One data point, many meanings:

The same temperature reading means different things to different people:

  • Production manager: Is this causing downtime?
  • Maintenance engineer: Is this indicating bearing failure?
  • Quality lead: Is this affecting product specifications?
  • Supply chain: Do we need to reduce run speeds for certain materials?

They each have their own language, their own networks of relationships they care about. A single rigid ontology that tries to serve all perspectives equally ends up serving none of them well.

The Knowledge Graph Alternative to Hierarchical Context

Hierarchical structures, whether tag naming conventions, ISA-95 models, or unified namespace trees, eventually limit your ability to capture how things actually relate.

The hierarchy problem:

You organize data as Enterprise → Site → Area → Line → Equipment → Sensor. Clean and logical. But then:

  • A quality issue on Line 3 correlates with material from Supplier B, which also affects Line 7 at a different site
  • That relationship crosses your hierarchy—it's not parent-child, it's lateral across completely different branches
  • To capture it, you either break your structure or miss the insight

Manufacturing reality is messy. Causation doesn't follow organizational charts. Root causes often involve relationships that span hierarchies—between materials and equipment, between maintenance schedules and production outcomes, between operators and quality metrics.

How knowledge graphs differ:

Knowledge graphs let you draw any relationship between any entities without being constrained by hierarchical structure:

  • Motor A relates to Product X (runs hotter with this product)
  • Product X relates to Supplier B (higher friction coefficient)
  • Supplier B relates to Quality issue Y (fails coating adhesion test)
  • Quality issue Y relates to Motor A (coating failures happen after overheating events)

These multi-dimensional relationships are easy to represent visually and query dynamically in a knowledge graph. In a hierarchy, they're nearly impossible to capture without creating duplicates or breaking your structure.
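
A minimal sketch of those lateral links, using the networkx library (node names and relation labels are invented for the example; a real graph would carry far more attributes):

```python
import networkx as nx

# Directed graph holding the lateral relationships described above.
# Node names and relation labels are illustrative, not a prescribed model.
g = nx.DiGraph()
g.add_edge("Motor A", "Product X", relation="runs hotter with")
g.add_edge("Product X", "Supplier B", relation="higher friction coefficient")
g.add_edge("Supplier B", "Quality issue Y", relation="fails coating adhesion test")
g.add_edge("Quality issue Y", "Motor A", relation="follows overheating events")

# A hierarchy answers parent/child questions; a graph can answer
# "how is Motor A connected to Quality issue Y?" directly.
for path in nx.all_simple_paths(g, source="Motor A", target="Quality issue Y"):
    print(" -> ".join(path))
```

None of these edges fits a parent-child slot in an Enterprise → Site → Area → Line tree, which is exactly the point.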

Mirroring human cognition:

When you explain a problem to a colleague, you don't recite a hierarchical structure. You draw a mind map—circles and arrows showing "this affects that, which connects to this other thing, and when this happens together with that, you get this outcome."

Knowledge graphs mirror that natural thinking pattern. They're visual, relationship-focused, and flexible enough to grow as understanding deepens.

Building Context Through Validation, Not Prescription

The traditional approach: define your complete data model upfront, implement ISA-95 or a unified namespace structure, populate all entities, then start using the data. This often takes months or years before delivering value.

The alternative approach: start with one urgent problem, build only the context needed to solve it, validate that it works, then scale.

The validation-first pattern:

Step 1: Identify one painful problem

Not a strategic initiative. Not a comprehensive digital transformation. One specific pain point that's costing money or causing daily frustration. Maybe it's:

  • Unexplained downtime on a specific line
  • Quality issues with a particular product
  • Maintenance surprises in stock takes
  • Performance variations between shifts

Step 2: Connect only relevant data

You don't need 500,000 tags. You need the 20-50 tags that relate to this specific problem. Connect just those. Visualize them. Make sure you're seeing live data flowing.

Step 3: Draw relationships and test them

This motor temperature relates to this product count. This material lot ID connects to this quality result. Draw the relationships visually in a knowledge graph. Now ask: does this relationship help explain the problem?

If yes, you're building useful context. If no, you've learned something—that relationship doesn't matter for this issue. Try a different one.
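
One lightweight way to test a drawn relationship is to pull the two signals for the same window and see whether they actually move together. A hedged sketch with pandas, where the file and column names are assumptions for the example:

```python
import pandas as pd

# Hypothetical export of two tags over the same time window.
df = pd.read_csv("line3_last_week.csv", parse_dates=["timestamp"])

# Resample both signals to a common interval so they can be compared.
hourly = df.set_index("timestamp").resample("1h").mean()

# Correlation isn't causation, but it tells you whether this drawn
# relationship is worth keeping in the graph or discarding.
corr = hourly["motor_temperature"].corr(hourly["product_count"])
print(f"motor temperature vs. product count: r = {corr:.2f}")
```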

Step 4: Validate with domain experts

Show it to the people who actually work with the equipment. Do they agree with the relationships? Do they see patterns that make sense? Do they suggest additional connections you missed?

This validation loop is critical. You're not trusting a schema—you're trusting human expertise encoded into relationships that get tested against reality.

Step 5: Scale only after validation

Once you've solved one problem and validated the context works, look for similar problems where you can replicate the pattern:

  • OEE calculation working on Line 3? Scale to Line 5 and 7
  • Vibration analysis on one motor-gearbox? You have 500 similar assets—apply the same context model
  • Material traceability for Product A? Extend to Products B, C, D

Scaling becomes much faster because you're replicating proven context, not building from scratch.
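
Because the validated relationships are data rather than diagrams in someone's head, replication can be as simple as re-applying a template to the next asset. A sketch with invented identifiers:

```python
import networkx as nx

# The relationship pattern validated on Line 3, written as a template.
# Tag paths and relation labels are invented for illustration.
CONTEXT_TEMPLATE = [
    ("{line}/motor", "drives", "{line}/conveyor"),
    ("{line}/motor/bearing_temp", "indicates_load_on", "{line}/motor"),
    ("{line}/product_count", "measures_output_of", "{line}/conveyor"),
]

def apply_template(graph: nx.DiGraph, line: str) -> None:
    """Replicate the proven context pattern onto another production line."""
    for src, relation, dst in CONTEXT_TEMPLATE:
        graph.add_edge(src.format(line=line), dst.format(line=line), relation=relation)

g = nx.DiGraph()
for line in ("Line 3", "Line 5", "Line 7"):
    apply_template(g, line)

print(f"{g.number_of_edges()} relationships created from one validated pattern")
```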

Why this avoids the "Wild West" problem:

Critics argue this creates ungoverned chaos—everyone building their own context, no standards, contradictions everywhere. But validation prevents this:

  1. Context gets tested against real outcomes. If it doesn't help solve problems, it doesn't survive.
  2. Domain experts review and approve relationships before they scale.
  3. Successful patterns become de facto standards that naturally spread.
  4. The knowledge graph makes contradictions visible—you can see if two different groups defined conflicting relationships and reconcile them.

Standards emerge from use rather than being prescribed upfront. That's often more sustainable because people adopt standards that prove valuable rather than complying with standards imposed top-down.

The Practical Implementation: From Connection to Insight in Hours

Most implementations take months before users see value. The human-centered approach compresses this to hours by focusing on immediate utility over comprehensive coverage.

The rapid context building workflow:

Connect (15 minutes): Plug into your Ignition gateway or other data source. See your tag hierarchies displayed as visual boxes representing assets. Drag them around to match your factory layout. You're seeing live data within the first quarter hour.

Visualize (15 minutes): Pick the data points relevant to your immediate problem. See events streaming through. Confirm you're getting the data you expected.

Relate (30 minutes): Draw relationships between data points. The system asks: is this known knowledge (established fact) or exploratory (hypothesis)? Describe each relationship in plain language. Add hashtags to cluster related knowledge (#OEE, #reliability, #quality).
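
A relationship drawn in this step carries more than its two endpoints. Sketched as a Python dataclass, with field names that are assumptions for illustration rather than any product's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Relationship:
    source: str                   # tag or asset the relationship starts from
    target: str                   # tag or asset it points to
    description: str              # plain-language explanation of the link
    status: str = "exploratory"   # "known" (established fact) or "exploratory" (hypothesis)
    hashtags: list[str] = field(default_factory=list)  # clusters related knowledge

rel = Relationship(
    source="Line3/Motor/BearingTemp",
    target="Line3/ProductCount",
    description="Bearing temperature climbs when throughput exceeds the nominal rate",
    status="exploratory",
    hashtags=["#OEE", "#reliability"],
)
```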

Validate (15 minutes): Connect to an LLM through MCP (Model Context Protocol). Ask: "Do you understand this context?" The LLM confirms it can interpret the relationships you built. Ask it to generate a report or answer questions using your context.
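
One way to hand that context to an LLM is to expose simple graph lookups as MCP tools. A minimal sketch using the Python MCP SDK's FastMCP helper; the graph variable and tool name are assumptions for illustration:

```python
import networkx as nx
from mcp.server.fastmcp import FastMCP

graph = nx.DiGraph()  # assume this is the knowledge graph built in the Relate step

mcp = FastMCP("factory-context")

@mcp.tool()
def relationships_for(entity: str) -> list[str]:
    """Return the plain-language relationships attached to an entity."""
    found = []
    for src, dst, data in graph.edges(data=True):
        if entity in (src, dst):
            found.append(f"{src} --{data.get('relation', 'relates to')}--> {dst}")
    return found

if __name__ == "__main__":
    mcp.run()  # an MCP-capable LLM client can now query your context
```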

Refine (ongoing): Ask "why" questions. The LLM suggests possible explanations based on your knowledge graph. You either validate those connections or draw new relationships to test alternative hypotheses. The AI can even suggest new relationships to add to your graph—AI-assisted context building.

Total time to first value: ~1 hour from connection to actionable insights.

Enterprise integration:

Once you've built validated context, you need it accessible to IT systems:

  • GraphQL APIs for custom integrations
  • Direct connectors to Snowflake, BigQuery, Databricks
  • Standard enterprise data platform connections

The strategy: don't try to be another visualization platform. Deliver modeled data to whatever tools the organization already uses (Power BI, Grafana, custom dashboards). Focus on the context layer, not reinventing the entire stack.
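
The simplest version of that hand-off is to flatten the graph's relationships into a table that any warehouse or BI connector can ingest. A hedged sketch, with the output file name as a placeholder and connection details omitted:

```python
import networkx as nx
import pandas as pd

graph = nx.DiGraph()  # the validated knowledge graph from the earlier steps

# Flatten relationships into rows so existing BI and warehouse tools
# (Power BI, Snowflake, BigQuery, Databricks) can consume the context
# without needing a graph database of their own.
rows = [
    {"source": src, "relation": data.get("relation"), "target": dst}
    for src, dst, data in graph.edges(data=True)
]
pd.DataFrame(rows).to_csv("context_relationships.csv", index=False)
```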

Conclusion

The daily production meeting where everyone shrugs and says "I guess we'll never know" happens because context is missing, not data. You have the data. What you don't have is a fast, flexible way to build understanding about what that data means in the context of actual problems.

Traditional approaches—rigid ontologies, comprehensive upfront modeling, hierarchical structures—assume you can prescribe context before use. They fail because they can't adapt to how people actually learn and solve problems in manufacturing environments.

Knowledge graphs provide the flexibility to build context iteratively: start with urgent problems, connect relevant data, draw relationships, validate with domain experts, scale proven patterns. Context emerges from use rather than being prescribed top-down.

This matters more now because AI agents need context to provide useful recommendations. An LLM analyzing "100 metal detector alarms yesterday" without context might recommend immediate maintenance. With context showing those were calibration tests during scheduled maintenance, it understands there's no problem. The quality of AI insights depends entirely on the quality of context—and that can only be built through human learning encoded into flexible knowledge structures.

Your competitors are probably still building comprehensive data models that take months to deliver value. That's your window. Start small with one painful problem. Build context that solves it. Validate that it works. Scale to similar problems. Repeat.

Because the manufacturers winning with AI aren't the ones with perfect ontologies defined upfront. They're the ones who built lightweight, flexible context layers that learned and evolved as they solved real problems.

The choice: spend a year modeling everything before seeing value, or spend an hour solving one problem and build from there. The companies reaching AI scale chose the latter.