November 2, 2025
What most companies call digital twins are actually just visualization layers on top of data streams. They're passive observers, not intelligent systems.
According to João Dias-Ferreira, Head of IoT, Knowledge Graphs and AI at Scania, the missing ingredient is semantic understanding, the ability to comprehend how data elements relate to each other and what they mean in different contexts. Knowledge graphs provide this semantic layer, transforming digital twins from passive dashboards into active decision-making entities.
After years of implementing knowledge graphs across Scania's global manufacturing operations, Ferreira has learned what it takes to build systems that don't just collect and display data, but actually understand it well enough to simulate futures, optimize autonomously, and eventually operate factories with minimal human intervention.
Here's why knowledge graphs matter and how they're reshaping what's possible with industrial AI.
Before diving into knowledge graphs, understand the challenge they solve. Scania, like most large manufacturers, operates in what Ferreira calls a "distributed setup"—multiple production sites globally, each historically independent in decision-making and technology choices.
This autonomy drove success for decades. But it created a data nightmare:
Technical silos everywhere: Different factories use different systems, different architectures, different data models. What "cycle time" means in Sweden differs from what it means in Brazil, even when building the same product.
No information owners: Who's responsible for ensuring data quality? Who decides what gets collected and how? When responsibilities are unclear, data governance becomes wishful thinking.
Incomparable data: You can't compare production efficiency across sites when each defines efficiency differently and measures it inconsistently.
Islands of excellence: Individual factories develop sophisticated analytics solutions that never scale beyond their walls because the foundations aren't portable.
The traditional response is centralization—impose standards, mandate systems, force compliance. But this kills the autonomy that made these sites successful in the first place.
Ferreira's team takes a different approach: "We're trying to centralize to distribute." Build the minimum common foundation needed for interoperability while letting sites maintain operational independence.
Knowledge graphs enable this because they add semantic context without requiring you to move or restructure existing data.
Most people conflate these terms, but the distinction matters for implementation:
Ontologies are schemas—the structure defining how things connect. Think of it as a blueprint showing relationships: "A motor has bearings. Bearings have vibration sensors. Vibration patterns correlate with failure modes."
Knowledge graphs are instances—actual data populating that structure. "Motor #47 in Line 3 has bearing serial #892, which shows vibration of 4.2mm/s, which is 20% above normal for this motor type."
The ontology remains relatively stable. The knowledge graph updates constantly as real-world conditions change.
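To make the distinction concrete, here's a minimal sketch in Python using the rdflib library. This is one of many ways to represent the idea and reflects nothing about Scania's actual model or tooling; the namespace, class names, and values simply mirror the motor example above.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS
from rdflib.namespace import XSD

EX = Namespace("http://example.org/factory#")  # illustrative namespace
g = Graph()
g.bind("ex", EX)

# --- Ontology: the relatively stable schema ---
g.add((EX.Motor, RDF.type, RDFS.Class))
g.add((EX.Bearing, RDF.type, RDFS.Class))
g.add((EX.hasBearing, RDF.type, RDF.Property))       # a motor has bearings
g.add((EX.vibrationMmPerS, RDF.type, RDF.Property))  # bearings report vibration

# --- Knowledge graph: instances that change as conditions change ---
g.add((EX.motor47, RDF.type, EX.Motor))
g.add((EX.bearing892, RDF.type, EX.Bearing))
g.add((EX.motor47, EX.hasBearing, EX.bearing892))
g.add((EX.bearing892, EX.vibrationMmPerS, Literal(4.2, datatype=XSD.decimal)))

# Query: which motors have a bearing vibrating above 3.5 mm/s?
q = """
SELECT ?motor ?vib WHERE {
    ?motor ex:hasBearing ?bearing .
    ?bearing ex:vibrationMmPerS ?vib .
    FILTER (?vib > 3.5)
}
"""
for motor, vib in g.query(q, initNs={"ex": EX}):
    print(f"{motor} vibrating at {vib} mm/s")
```

The schema triples rarely change; the instance triples are what keep updating as the factory runs.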
This separation is powerful but creates a learning curve. Most engineers think in terms of databases with rows and columns. Ontologies require thinking about relationships and context, which feels unfamiliar.
Ferreira admits this threshold is a significant barrier. But once teams grasp the concept, they unlock capabilities impossible with traditional data architectures.
If you're familiar with unified namespace (and Ferreira's audience certainly is), think of knowledge graphs as the semantic enrichment layer.
Unified namespace provides a hierarchical structure showing where data lives, real-time data flows through publish-subscribe patterns, and consistency in how data is organized and accessed.
Knowledge graphs add multiple dimensional views of the same data, contextual relationships explaining what data means, connections to historical data and domain knowledge, and the ability to query across domains without moving data.
Here's the key insight: Unified namespace tells you where to find data. Knowledge graphs tell you how that data relates to everything else, what it means in different contexts, and what other information you need to make a decision.
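As a rough illustration of that division of labor, the sketch below takes a hypothetical UNS topic path and layers graph relationships on top of it. The topic string, namespace, and relationship names are all invented for the example.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/factory#")  # illustrative
g = Graph()
g.bind("ex", EX)

# The unified namespace already answers "where does this value live?"
uns_topic = "plant-sweden/line3/press12/vibration"  # hypothetical topic path

# The knowledge graph adds what the value means and what it relates to.
sensor = EX["press12_vibration"]
g.add((sensor, RDF.type, EX.VibrationSensor))
g.add((sensor, EX.publishedOn, Literal(uns_topic)))    # link back to the UNS
g.add((sensor, EX.mountedOn, EX.press12))              # equipment context
g.add((EX.press12, EX.partOf, EX.line3))               # plant hierarchy
g.add((EX.press12, EX.maintainedUnder, EX.plan_P12))   # maintenance context
g.add((EX.press12, EX.producesPart, EX.part_4711))     # product context

# A consumer can now ask cross-domain questions ("which maintenance plans
# cover equipment publishing on line 3?") without knowing the topic tree.
```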
An example from Scania: a "part" in their system means completely different things depending on whether you approach it from a logistics, production scheduling, tooling, or engineering perspective.
Traditional databases force you to choose one perspective. Knowledge graphs let you model all of them simultaneously and navigate between perspectives based on your query.
Want to know how a logistics delay affects production scheduling, which impacts tooling wear, which requires engineering analysis? The knowledge graph connects these domains automatically because the relationships are explicitly modeled.
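Here's a hedged sketch of what such a cross-domain query could look like, again with invented entities and relationship names, using rdflib and SPARQL:

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/factory#")  # illustrative
g = Graph()
g.bind("ex", EX)

# One "part", several perspectives, all in the same graph.
g.add((EX.part_4711, RDF.type, EX.Part))
g.add((EX.shipment_88, EX.delivers, EX.part_4711))     # logistics view
g.add((EX.shipment_88, EX.delayedByHours, Literal(6)))
g.add((EX.order_123, EX.consumes, EX.part_4711))       # scheduling view
g.add((EX.order_123, EX.runsOn, EX.line3))
g.add((EX.order_123, EX.usesTool, EX.die_77))          # tooling view
g.add((EX.die_77, EX.analyzedIn, EX.wearReport_9))     # engineering view

# One query walks from a logistics delay to the engineering analysis it touches.
q = """
SELECT ?order ?tool ?report WHERE {
    ?shipment ex:delayedByHours ?h ;
              ex:delivers ?part .
    ?order ex:consumes ?part ;
           ex:usesTool ?tool .
    ?tool ex:analyzedIn ?report .
    FILTER (?h > 0)
}
"""
for row in g.query(q, initNs={"ex": EX}):
    print(row)
```

The traversal works only because the relationships between domains were modeled explicitly; no application code has to stitch the silos together at query time.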
Traditional databases require upfront design. You define tables, columns, relationships, and constraints based on what you know you'll need. Then reality hits: new equipment, new processes, new questions nobody anticipated.
Changing a relational database schema after deployment is painful. Tables depend on each other. Queries assume specific structures. Application code breaks when columns move. Major refactoring becomes necessary.
Knowledge graphs work differently. As Ferreira explains, you simply extend the graph with new dimensions when needs change:
Start with one piece of equipment. Model its components, sensors, and behaviors. Works great.
Expand to the entire production line. Connect equipment together, model material flows, add quality checkpoints. The original equipment model doesn't change—you just add more connections.
Scale to the whole factory. Include supply chain, maintenance systems, and planning tools. Previous models remain untouched.
Connect multiple factories globally. Add geographic dimensions, shipping logistics, and regional compliance requirements. Everything you built still works.
Integrate enterprise knowledge. Link production data to design documents, training materials, work instructions, and tribal knowledge. The graph just grows.
This extensibility is what makes knowledge graphs suitable for enterprise-scale deployments. You don't need to design the perfect system upfront—impossible in complex manufacturing environments. You design for what you know, then extend continuously as understanding deepens.
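The sketch below shows that growth pattern in miniature: triples added at later stages sit alongside the earlier ones without modifying them. Names and values are invented for the example.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/factory#")  # illustrative
g = Graph()
g.bind("ex", EX)

# Stage 1: model a single machine.
g.add((EX.press12, RDF.type, EX.Press))
g.add((EX.press12, EX.cycleTimeSeconds, Literal(42)))

# Stage 2: later, extend to the line and the site -- nothing above changes.
g.add((EX.press12, EX.partOf, EX.line3))
g.add((EX.line3, EX.locatedAt, EX.plantSweden))

# Stage 3: later still, add a supply-chain dimension alongside the rest.
g.add((EX.supplier_A, EX.suppliesTo, EX.plantSweden))

# The original model is untouched; queries written against stage 1 keep working.
assert (EX.press12, EX.cycleTimeSeconds, Literal(42)) in g
```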
Here's a practical problem knowledge graphs solve: When integrators deploy new equipment, how do you ensure data connects properly to your existing systems?
Traditionally, this is manual. Integrators provide documentation (if you're lucky). Someone manually maps sensor tags to your SCADA system. Mistakes happen. Inconsistencies accumulate. Onboarding takes weeks.
Scania's approach: Use knowledge graphs to automate mapping.
When new equipment arrives with its data point definitions, the knowledge graph already contains patterns for similar equipment, so it can match the new data points against concepts it already understands.
The integrator connects equipment using standard protocols. The knowledge graph figures out where everything belongs and routes data automatically.
Ferreira admits they're still working toward this vision—it's harder than it sounds. But the direction is clear: move from manual, error-prone integration to automated onboarding that gets faster with each deployment.
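One way to sketch the core idea, using nothing Scania-specific: fuzzy-match the tags a new machine exposes against signal patterns the graph already knows, and flag anything that doesn't match for human review. The tag names and concepts below are hypothetical, and a real pipeline would query the patterns from the graph rather than hard-code them.

```python
from difflib import get_close_matches

# Signal patterns already known for this equipment class (hypothetical names;
# in practice these would come from the knowledge graph).
known_signals = {
    "spindle_vibration": "ex:VibrationSensor",
    "spindle_temperature": "ex:TemperatureSensor",
    "cycle_time": "ex:CycleTimeMetric",
}

# Data points the integrator's new machine exposes over a standard protocol.
incoming = ["SpindleVibr", "spindle_temp", "cycleTime", "door_open_count"]

def propose_mapping(tag: str) -> str | None:
    """Suggest a known concept for an incoming tag, or None if nothing is close."""
    normalized = tag.lower().replace("-", "_")
    match = get_close_matches(normalized, list(known_signals), n=1, cutoff=0.5)
    return known_signals[match[0]] if match else None

for tag in incoming:
    concept = propose_mapping(tag)
    print(f"{tag:>18} -> {concept if concept else 'needs human review'}")
```

Each onboarding that gets confirmed by a human can be written back into the graph as a new pattern, which is what makes the process faster with every deployment.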
Here's where things get strategically important. Ferreira sees knowledge graphs not as an end in themselves, but as the foundational layer that AI operates on.
Scania isn't building their own large language models. They're leveraging foundation models like GPT, Claude, and LLaMA. But generic models don't understand your specific processes, equipment, or domain knowledge.
Knowledge graphs bridge this gap:
Provide company-specific context: LLMs know general manufacturing concepts. Knowledge graphs add your specific process flows, equipment configurations, quality standards, and operational constraints.
Enable grounded responses: Instead of AI hallucinating plausible-sounding but wrong answers, it retrieves actual facts from your knowledge graph and reasons over them.
Support multi-agent architectures: Different AI agents handle different domains (logistics, quality, maintenance, planning). The knowledge graph provides the shared understanding that lets them coordinate.
Make services discoverable: Using patterns like Model Context Protocol (MCP), manufacturing services become available to AI agents. The knowledge graph describes what each service does and when to use it.
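A minimal sketch of the grounding step, assuming the facts live in an rdflib graph: retrieve everything known about a subject and hand it to whichever foundation model you use as context. Entity names and values are illustrative.

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/factory#")  # illustrative
g = Graph()
g.bind("ex", EX)
g.add((EX.line3, EX.currentThroughputPerHour, Literal(118)))
g.add((EX.line3, EX.qualityYieldPercent, Literal(97.4)))
g.add((EX.line3, EX.bottleneckStation, EX.press12))

def facts_about(subject) -> str:
    """Render every statement about a subject as plain text for prompt context."""
    lines = []
    for _, pred, obj in g.triples((subject, None, None)):
        pred_name = pred.split("#")[-1]
        obj_name = str(obj).split("#")[-1]
        lines.append(f"- {pred_name}: {obj_name}")
    return "\n".join(lines)

# The retrieved facts become grounded context for whatever foundation model is used.
prompt = (
    "You are assisting with production optimization.\n"
    "Known facts about Line 3 (from the plant knowledge graph):\n"
    f"{facts_about(EX.line3)}\n\n"
    "Question: how can throughput be raised without hurting quality yield?"
)
print(prompt)
```

The model answers against retrieved facts instead of guessing, which is the difference between a plausible-sounding response and a grounded one.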
The vision: an AI agent receives a request like "optimize throughput on Line 3 while maintaining quality," breaks it down, pulls the relevant context from the knowledge graph, and coordinates the services needed to act on it.
This only works if the AI has a rich, structured understanding of how everything connects—exactly what knowledge graphs provide.
Most digital twins today are, in Ferreira's blunt assessment, "very passive entities." They show you what's happening in real-time, maybe with some 3D visualization. That's useful, but it's not transformative.
True digital twins should let you simulate futures, test decisions before committing to them, and see how changes propagate through the system.
Building this requires several layers:
Semantic layer: Knowledge graphs defining how equipment components interact, how processes flow, and how actions propagate through the system.
Historical data: Understanding normal behavior patterns and how the system responds to changes.
Simulation capability: Either detailed physics-based models or AI models trained on operational data that can predict outcomes.
Multi-agent orchestration: AI agents that can decompose complex scenarios, test subsystems independently, and coordinate results.
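Here is a toy sketch of how those layers could fit together: the semantic layer answers "what does this change touch," and a stand-in function plays the role of the physics-based or learned simulation model. All names and numbers are invented.

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/factory#")  # illustrative
g = Graph()
g.bind("ex", EX)

# Semantic layer: how a change propagates through the line.
g.add((EX.press12, EX.feeds, EX.weldCell4))
g.add((EX.weldCell4, EX.feeds, EX.paintBooth1))
g.add((EX.press12, EX.cycleTimeSeconds, Literal(42.0)))

def downstream_of(station):
    """Walk 'feeds' edges to find everything a change at this station can affect."""
    frontier, seen = [station], []
    while frontier:
        current = frontier.pop()
        for _, _, nxt in g.triples((current, EX.feeds, None)):
            if nxt not in seen:
                seen.append(nxt)
                frontier.append(nxt)
    return seen

def predict_throughput(cycle_time_s: float) -> float:
    """Stand-in for a real simulation or model trained on operational data."""
    return 3600.0 / cycle_time_s

# What-if: shave two seconds off the press cycle. Which stations need
# re-checking, and what does the (stub) model predict?
affected = downstream_of(EX.press12)
print("Stations to re-validate:", [s.split("#")[-1] for s in affected])
print("Predicted hourly output:", predict_throughput(40.0))
```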
Ferreira points out that current tools don't make this easy unless you rely on vendor-provided solutions. But the components are emerging.
The challenge is integration. Which brings up his key advice for vendors...
Ferreira has strong opinions about what industrial software vendors get wrong: vertical integration that locks customers in.
Vendors try to control everything—data collection, storage, processing, visualization, analytics. Buy their complete stack or nothing works properly. This makes vendors' lives easier but customers' lives harder.
What Scania wants instead:
Modular, best-of-breed components: Use the best database for time-series data, the best visualization platform, the best AI tooling—from different vendors if necessary.
Data ownership and access: Keep control of operational data. It's strategic IP that determines competitive advantage in the AI era.
Standard interfaces: Connect components through open protocols and APIs, not proprietary middleware that creates dependencies.
Ability to customize: Leverage standard solutions where possible, but maintain ability to differentiate where it matters competitively.
This isn't unique to Scania. Most sophisticated manufacturers share this view. But vendor incentives push the opposite direction—lock-in drives recurring revenue.
The vendors that succeed long-term will be those that embrace openness.
Knowledge graphs exemplify this philosophy. They add a semantic layer that works with any underlying data system, connects to any analytics tool, and supports any AI platform. They're inherently open.
Ferreira's excitement about AI in manufacturing centers on multi-agent systems—networks of specialized AI agents that coordinate to control complex operations.
This isn't science fiction; the technology exists.
What's missing is production-scale implementation. Most companies are still figuring out basic AI—deploying a single model for predictive maintenance or quality inspection. Multi-agent orchestration feels distant.
But Ferreira sees the path: Start with knowledge graphs that model processes and systems semantically. Build AI agents that can query these graphs and understand domain context. Make manufacturing services available through standard protocols. Let agents gradually take on more autonomous decision-making.
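A deliberately simplified sketch of that direction: two toy agents reading shared context from the same graph. A real orchestrator would decompose requests with a language model and call actual manufacturing services, but the coordination pattern is the same. Everything here is illustrative.

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/factory#")  # illustrative
shared_graph = Graph()
shared_graph.add((EX.line3, EX.qualityYieldPercent, Literal(97.4)))
shared_graph.add((EX.line3, EX.bottleneckStation, EX.press12))

class Agent:
    """A domain-specialized agent that reads shared context from the graph."""
    def __init__(self, name: str, graph: Graph):
        self.name, self.graph = name, graph
    def handle(self, task: str) -> str:
        raise NotImplementedError

class QualityAgent(Agent):
    def handle(self, task):
        yield_pct = self.graph.value(EX.line3, EX.qualityYieldPercent)
        return f"{self.name}: yield is {yield_pct}%, keep it above 97%."

class ThroughputAgent(Agent):
    def handle(self, task):
        bottleneck = self.graph.value(EX.line3, EX.bottleneckStation)
        return f"{self.name}: focus on {bottleneck.split('#')[-1]}, it is the bottleneck."

# A (very) simplified orchestrator: route one request to every relevant specialist.
request = "optimize throughput on Line 3 while maintaining quality"
agents = [ThroughputAgent("throughput", shared_graph), QualityAgent("quality", shared_graph)]
for agent in agents:
    print(agent.handle(request))
```

The knowledge graph is what keeps the specialists consistent with each other: they reason over the same facts rather than their own private copies.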
The future factory: Not controlled by humans making real-time decisions, but by AI agents that continuously optimize, predict disruptions, simulate improvements, and coordinate execution—all while understanding constraints, quality requirements, and business objectives through their knowledge graph foundation.
The factory becomes, in Ferreira's words, "its own entity that will try to optimize itself in an autonomous way."
That's when digital twins evolve from passive observers to active participants.
Knowledge graphs sound complex (they are) and distant (they don't have to be). Here's how to begin:
Phase 1: Map what you know
Phase 2: Connect to existing data
Phase 3: Enable new use cases
Phase 4: Scale systematically
Phase 5: Enable AI
The key insight: You don't need the complete solution upfront. Knowledge graphs are designed to grow and extend. Start small, prove value, expand systematically.
Three trends converge to make knowledge graphs suddenly critical:
AI needs semantic understanding: Foundation models are powerful but generic. They need rich, structured domain knowledge to operate effectively in manufacturing. Knowledge graphs provide this.
Complexity is increasing: More SKUs, shorter product lifecycles, distributed supply chains, regulatory requirements. Traditional data architectures can't keep up. Knowledge graphs handle complexity by design.
Autonomy is the future: From predictive maintenance to autonomous optimization, manufacturers are pushing toward systems that operate with minimal human intervention. These systems need machine-understandable knowledge—exactly what ontologies and knowledge graphs deliver.
Companies that build knowledge graphs now create a strategic asset: structured understanding of their operations that becomes more valuable as AI capabilities advance.
Those that stick with traditional data architectures will find their AI initiatives repeatedly hitting walls. The models are ready. The bottleneck is giving them the knowledge to operate on.
Your unified namespace collects data. Your time-series databases store it efficiently. Your edge infrastructure processes it in real-time.
But without knowledge graphs adding semantic understanding, you're building impressive infrastructure for AI that can't fully leverage it.
The digital twin revolution isn't about better visualization. It's about creating autonomous systems that understand operations deeply enough to optimize without constant human guidance.
Knowledge graphs are how you get there.
Kudzai Manditereza is an Industry 4.0 technology evangelist and creator of Industry40.tv, an independent media and education platform focused on industrial data and AI for smart manufacturing. He specializes in Industrial AI, IIoT, Unified Namespace, Digital Twins, and Industrial DataOps, helping digital manufacturing leaders implement and scale AI initiatives.
Kudzai hosts the AI in Manufacturing podcast and writes the Smart Factory Playbook newsletter, where he shares practical guidance on building the data backbone that makes industrial AI work in real-world manufacturing environments. He currently serves as Senior Industry Solutions Advocate at HiveMQ.