November 2, 2025

Powering Industrial AI and Digital Twins With Knowledge Graphs

What most companies call digital twins are actually just visualization layers on top of data streams. They're passive observers, not intelligent systems.

According to João Dias-Ferreira, Head of IoT, Knowledge Graphs and AI at Scania, the missing ingredient is semantic understanding: the ability to comprehend how data elements relate to each other and what they mean in different contexts. Knowledge graphs provide this semantic layer, transforming digital twins from passive dashboards into active decision-making entities.

After years of implementing knowledge graphs across Scania's global manufacturing operations, Ferreira has learned what it takes to build systems that don't just collect and display data but actually understand it well enough to simulate futures, optimize autonomously, and eventually operate factories with minimal human intervention.

Here's why knowledge graphs matter and how they're reshaping what's possible with industrial AI.

The Data Management Problem No One Talks About

Before diving into knowledge graphs, understand the challenge they solve. Scania, like most large manufacturers, operates in what Ferreira calls a "distributed setup"—multiple production sites globally, each historically independent in decision-making and technology choices.

This autonomy drove success for decades. But it created a data nightmare:

Technical silos everywhere: Different factories use different systems, different architectures, different data models. What "cycle time" means in Sweden differs from what it means in Brazil, even when building the same product.

No information owners: Who's responsible for ensuring data quality? Who decides what gets collected and how? When responsibilities are unclear, data governance becomes wishful thinking.

Incomparable data: You can't compare production efficiency across sites when each defines efficiency differently and measures it inconsistently.

Islands of excellence: Individual factories develop sophisticated analytics solutions that never scale beyond their walls because the foundations aren't portable.

The traditional response is centralization—impose standards, mandate systems, force compliance. But this kills the autonomy that made these sites successful in the first place.

Ferreira's team takes a different approach: "We're trying to centralize to distribute." Build the minimum common foundation needed for interoperability while letting sites maintain operational independence.

Knowledge graphs enable this because they add semantic context without requiring you to move or restructure existing data.

Understanding Ontologies vs. Knowledge Graphs

Most people conflate these terms, but the distinction matters for implementation:

Ontologies are schemas—the structure defining how things connect. Think of an ontology as a blueprint showing relationships: "A motor has bearings. Bearings have vibration sensors. Vibration patterns correlate with failure modes."

Knowledge graphs are instances—actual data populating that structure. "Motor #47 in Line 3 has bearing serial #892, which shows vibration of 4.2mm/s, which is 20% above normal for this motor type."

The ontology remains relatively stable. The knowledge graph updates constantly as real-world conditions change.
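
To make the distinction concrete, here is a minimal sketch in Python using the rdflib library. The namespace, class names, and sensor values are invented for illustration; this is not Scania's actual model.

  # Minimal ontology-vs-knowledge-graph sketch (names and values are invented).
  # Requires: pip install rdflib
  from rdflib import Graph, Literal, Namespace, RDF, RDFS

  EX = Namespace("http://example.org/factory#")   # hypothetical namespace
  g = Graph()
  g.bind("ex", EX)

  # Ontology (schema): stable structure describing how things relate
  g.add((EX.Motor, RDF.type, RDFS.Class))
  g.add((EX.Bearing, RDF.type, RDFS.Class))
  g.add((EX.hasBearing, RDF.type, RDF.Property))
  g.add((EX.hasBearing, RDFS.domain, EX.Motor))
  g.add((EX.hasBearing, RDFS.range, EX.Bearing))
  g.add((EX.vibrationMmPerS, RDF.type, RDF.Property))

  # Knowledge graph (instances): data that changes as the real world changes
  g.add((EX.Motor47, RDF.type, EX.Motor))
  g.add((EX.Bearing892, RDF.type, EX.Bearing))
  g.add((EX.Motor47, EX.hasBearing, EX.Bearing892))
  g.add((EX.Bearing892, EX.vibrationMmPerS, Literal(4.2)))

  # Query instances through the structure the ontology defines
  q = """
  SELECT ?motor ?vibration WHERE {
      ?motor a ex:Motor ;
             ex:hasBearing ?bearing .
      ?bearing ex:vibrationMmPerS ?vibration .
  }
  """
  for motor, vibration in g.query(q, initNs={"ex": EX}):
      print(motor, vibration)

The first block of triples is the ontology and rarely changes; the second block is the knowledge graph and would be refreshed continuously as readings arrive.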

This separation is powerful but creates a learning curve. Most engineers think in terms of databases with rows and columns. Ontologies require thinking about relationships and context, which feels unfamiliar.

Ferreira admits this conceptual hurdle is a significant barrier. But once teams grasp the concept, they unlock capabilities impossible with traditional data architectures.

How Knowledge Graphs Enrich Unified Namespace

If you're familiar with unified namespace (and Ferreira's audience certainly is), think of knowledge graphs as the semantic enrichment layer.

Unified namespace provides: Hierarchical structure showing where data lives, real-time data flows through publish-subscribe patterns, and consistency in how data is organized and accessed.

Knowledge graphs add: Multiple dimensional views of the same data, contextual relationships explaining what data means, connections to historical data and domain knowledge, and the ability to query across domains without moving data.

Here's the key insight: Unified namespace tells you where to find data. Knowledge graphs tell you how that data relates to everything else, what it means in different contexts, and what other information you need to make a decision.

An example from Scania: A "part" in their system means completely different things depending on context:

  • Logistics views parts as inventory items with locations and quantities
  • Production views parts as assembly steps with cycle times and tooling requirements
  • Engineering views parts as CAD models with tolerances and materials
  • Embedded systems view parts as components with firmware and communication protocols

Traditional databases force you to choose one perspective. Knowledge graphs let you model all of them simultaneously and navigate between perspectives based on your query.

Want to know how a logistics delay affects production scheduling, which impacts tooling wear, which requires engineering analysis? The knowledge graph connects these domains automatically because the relationships are explicitly modeled.
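
Here's a rough sketch of what that multi-perspective model could look like, again using rdflib with invented property names. The point is that one part carries logistics, production, and engineering facts at once, and a single query can walk across all of them.

  # Sketch: one "part" carrying several domain perspectives (hypothetical model).
  from rdflib import Graph, Literal, Namespace, RDF

  EX = Namespace("http://example.org/factory#")
  g = Graph()
  g.bind("ex", EX)

  part = EX.Part1234
  g.add((part, RDF.type, EX.Part))
  g.add((part, EX.storageLocation, Literal("Warehouse B, rack 12")))  # logistics view
  g.add((part, EX.quantityOnHand, Literal(340)))                      # logistics view
  g.add((part, EX.assembledAtStation, EX.Station7))                   # production view
  g.add((part, EX.cycleTimeSeconds, Literal(95)))                     # production view
  g.add((part, EX.material, Literal("stainless steel 1.4301")))       # engineering view

  # One query walks across the perspectives without moving any data
  q = """
  SELECT ?location ?qty ?station ?cycle ?material WHERE {
      ex:Part1234 ex:storageLocation ?location ;
                  ex:quantityOnHand ?qty ;
                  ex:assembledAtStation ?station ;
                  ex:cycleTimeSeconds ?cycle ;
                  ex:material ?material .
  }
  """
  for row in g.query(q, initNs={"ex": EX}):
      print(dict(zip(("location", "qty", "station", "cycle", "material"), row)))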

The Extensibility Advantage

Traditional databases require upfront design. You define tables, columns, relationships, and constraints based on what you know you'll need. Then reality hits: new equipment, new processes, new questions nobody anticipated.

Changing a relational database schema after deployment is painful. Tables depend on each other. Queries assume specific structures. Application code breaks when columns move. Major refactoring becomes necessary.

Knowledge graphs work differently. As Ferreira explains, you simply extend the graph with new dimensions when needs change:

Start with one piece of equipment. Model its components, sensors, and behaviors. Works great.

Expand to the entire production line. Connect equipment together, model material flows, add quality checkpoints. The original equipment model doesn't change—you just add more connections.

Scale to the whole factory. Include supply chain, maintenance systems, and planning tools. Previous models remain untouched.

Connect multiple factories globally. Add geographic dimensions, shipping logistics, and regional compliance requirements. Everything you built still works.

Integrate enterprise knowledge. Link production data to design documents, training materials, work instructions, and tribal knowledge. The graph just grows.

This extensibility is what makes knowledge graphs suitable for enterprise-scale deployments. You don't need to design the perfect system upfront—impossible in complex manufacturing environments. You design for what you know, then extend continuously as understanding deepens.
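
A sketch of that extension pattern, under the same illustrative model: new dimensions arrive as new triples, and nothing already built has to be altered. The plant, maintenance plan, and property names are made up.

  # Sketch: extending an existing graph with new dimensions, no schema migration.
  from rdflib import Graph, Literal, Namespace, RDF

  EX = Namespace("http://example.org/factory#")
  g = Graph()
  g.bind("ex", EX)

  # Existing model: a single piece of equipment (never altered below)
  g.add((EX.Motor47, RDF.type, EX.Motor))
  g.add((EX.Motor47, EX.partOfLine, EX.Line3))

  # Later: add a geographic dimension simply by adding triples
  g.add((EX.Line3, EX.locatedInPlant, EX.PlantA))
  g.add((EX.PlantA, EX.country, Literal("Sweden")))

  # Later still: link maintenance knowledge without touching earlier triples
  g.add((EX.Motor47, EX.hasMaintenancePlan, EX.Plan_PM_0042))
  g.add((EX.Plan_PM_0042, EX.intervalHours, Literal(2000)))

  # Old queries keep working; new queries can cross the added dimensions
  q = """
  SELECT ?motor ?country WHERE {
      ?motor ex:partOfLine ?line .
      ?line ex:locatedInPlant ?plant .
      ?plant ex:country ?country .
  }
  """
  for motor, country in g.query(q, initNs={"ex": EX}):
      print(motor, country)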

Automating Asset Onboarding with Knowledge Graphs

Here's a practical problem knowledge graphs solve: When integrators deploy new equipment, how do you ensure data connects properly to your existing systems?

Traditionally, this is manual. Integrators provide documentation (if you're lucky). Someone manually maps sensor tags to your SCADA system. Mistakes happen. Inconsistencies accumulate. Onboarding takes weeks.

Scania's approach: Use knowledge graphs to automate mapping.

When new equipment arrives with its data point definitions, the knowledge graph already contains patterns for similar equipment. It can automatically:

  • Identify what type of equipment this is based on sensor signatures
  • Map data points to the appropriate locations in the unified namespace
  • Configure MQTT topic structures without human intervention
  • Validate that all expected data is flowing correctly

The integrator connects equipment using standard protocols. The knowledge graph figures out where everything belongs and routes data automatically.

Ferreira admits they're still working toward this vision—it's harder than it sounds. But the direction is clear: move from manual, error-prone integration to automated onboarding that gets faster with each deployment.
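
A toy sketch of the idea, not Scania's implementation: match an incoming asset's data points against patterns already captured for known equipment types, then derive unified-namespace MQTT topics from the match. The pattern library, tag names, matching threshold, and topic layout are all assumptions.

  # Toy sketch: auto-onboarding a new asset by matching its data points
  # against patterns captured for similar equipment (all names illustrative).

  KNOWN_PATTERNS = {
      "servo_press": {"force_kN", "position_mm", "cycle_count"},
      "welding_robot": {"current_A", "wire_feed_mps", "arc_on_time_s"},
  }

  UNS_ROOT = "site1/area2/line3"   # hypothetical unified-namespace path

  def classify(datapoints: set[str]) -> str | None:
      """Pick the known equipment type whose signature best matches the new asset."""
      best_type, best_score = None, 0.0
      for eq_type, signature in KNOWN_PATTERNS.items():
          score = len(datapoints & signature) / len(signature)
          if score > best_score:
              best_type, best_score = eq_type, score
      return best_type if best_score >= 0.7 else None   # threshold is arbitrary

  def propose_topics(asset_id: str, datapoints: set[str]) -> dict[str, str]:
      """Map each data point to an MQTT topic under the unified namespace."""
      eq_type = classify(datapoints)
      if eq_type is None:
          raise ValueError(f"{asset_id}: no matching pattern, manual onboarding needed")
      return {dp: f"{UNS_ROOT}/{eq_type}/{asset_id}/{dp}" for dp in datapoints}

  # Example: an integrator plugs in a new press and sends its tag list
  print(propose_topics("press_09", {"force_kN", "position_mm", "cycle_count"}))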

Knowledge Graphs as AI's Operating System

Here's where things get strategically important. Ferreira sees knowledge graphs not as an end in themselves, but as the foundational layer that AI operates on.

Scania isn't building their own large language models. They're leveraging foundation models like GPT, Claude, and LLaMA. But generic models don't understand your specific processes, equipment, or domain knowledge.

Knowledge graphs bridge this gap:

Provide company-specific context: LLMs know general manufacturing concepts. Knowledge graphs add your specific process flows, equipment configurations, quality standards, and operational constraints.

Enable grounded responses: Instead of AI hallucinating plausible-sounding but wrong answers, it retrieves actual facts from your knowledge graph and reasons over them.

Support multi-agent architectures: Different AI agents handle different domains (logistics, quality, maintenance, planning). The knowledge graph provides the shared understanding that lets them coordinate.

Make services discoverable: Using patterns like Model Context Protocol (MCP), manufacturing services become available to AI agents. The knowledge graph describes what each service does and when to use it.

The vision: An AI agent receives a request like "optimize throughput on Line 3 while maintaining quality." It:

  1. Queries the knowledge graph to understand Line 3's configuration, constraints, and current state
  2. Identifies which services it can call (adjust speeds, modify parameters, reroute material)
  3. Simulates different scenarios using the digital twin
  4. Evaluates outcomes against quality requirements
  5. Executes the optimal plan through appropriate service calls

This only works if the AI has a rich, structured understanding of how everything connects—exactly what knowledge graphs provide.
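
A minimal sketch of the grounding step: pull the facts the graph holds about Line 3 and hand them, together with the request, to whichever foundation model or agent framework you use. The line model and property names are invented, and the model call is left as a stub.

  # Sketch: grounding an AI agent's request in knowledge-graph facts.
  # Property names and the line model are illustrative; the LLM call is a stub.
  from rdflib import Graph, Literal, Namespace

  EX = Namespace("http://example.org/factory#")
  g = Graph()
  g.bind("ex", EX)
  g.add((EX.Line3, EX.maxLineSpeed, Literal(62)))            # units per hour
  g.add((EX.Line3, EX.currentLineSpeed, Literal(55)))
  g.add((EX.Line3, EX.qualityLimitDefectsPpm, Literal(350)))

  def retrieve_context(line_uri) -> str:
      """Collect every known fact about a line as plain text for the model."""
      facts = []
      for p, o in g.predicate_objects(line_uri):
          facts.append(f"{p.split('#')[-1]} = {o}")
      return "\n".join(facts)

  def ask_model(prompt: str) -> str:
      # Placeholder: call your foundation model / agent framework of choice here.
      return f"[model would answer based on:]\n{prompt}"

  request = "Optimize throughput on Line 3 while maintaining quality."
  context = retrieve_context(EX.Line3)
  print(ask_model(f"Facts from the knowledge graph:\n{context}\n\nTask: {request}"))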

Redefining Digital Twins: From Passive to Proactive

Most digital twins today are, in Ferreira's blunt assessment, "very passive entities." They show you what's happening in real-time, maybe with some 3D visualization. That's useful, but it's not transformative.

True digital twins should let you:

  • Simulate future scenarios: What happens if we change this parameter? Run 1,000 variations and find the optimal settings.
  • Predict impacts: How does introducing a new product variant affect cycle times, quality, and tooling wear?
  • Optimize autonomously: Continuously test improvements in simulation before applying them to physical assets.
  • Respond to disruption: Supply chain issue? Instantly model alternative routing and material substitutions.

Building this requires several layers:

Semantic layer: Knowledge graphs defining how equipment components interact, how processes flow, and how actions propagate through the system.

Historical data: Understanding normal behavior patterns and how the system responds to changes.

Simulation capability: Either detailed physics-based models or AI models trained on operational data that can predict outcomes.

Multi-agent orchestration: AI agents that can decompose complex scenarios, test subsystems independently, and coordinate results.
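
Here is a sketch of the simulate-and-evaluate loop that sits on top of those layers, with a stand-in surrogate model. The parameter ranges, quality limit, and prediction function are invented placeholders for a real simulator or a model trained on operational data.

  # Sketch: sweep candidate settings in simulation, keep those that meet the
  # quality constraint, then pick the best. The surrogate is a stand-in for a
  # physics-based simulator or a model trained on operational data.
  import random

  QUALITY_LIMIT_DEFECTS_PPM = 350    # hypothetical constraint from the knowledge graph

  def surrogate(line_speed: float, feed_rate: float) -> tuple[float, float]:
      """Stand-in predictor returning (throughput, defects_ppm) for given settings."""
      throughput = line_speed * 0.9 + feed_rate * 0.4
      defects_ppm = 50 + 4.5 * max(0.0, line_speed - 55) ** 2 + random.uniform(0, 20)
      return throughput, defects_ppm

  def optimize(n_scenarios: int = 1000):
      best = None
      for _ in range(n_scenarios):
          speed = random.uniform(40, 70)
          feed = random.uniform(5, 25)
          throughput, defects = surrogate(speed, feed)
          if defects > QUALITY_LIMIT_DEFECTS_PPM:
              continue                      # violates quality, discard scenario
          if best is None or throughput > best[0]:
              best = (throughput, speed, feed, defects)
      return best

  print(optimize())   # best feasible (throughput, line_speed, feed_rate, defects_ppm)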

Ferreira points out that current tools don't make this easy unless you rely on vendor-provided solutions. But the components are emerging:

  • Knowledge graphs for semantic understanding
  • Foundation models for reasoning and planning
  • Agent frameworks for orchestration
  • Time-series databases for operational data
  • Digital twin platforms for simulation

The challenge is integration. Which brings up his key advice for vendors...

The Open vs. Closed Systems Battle

Ferreira has strong opinions about what industrial software vendors get wrong: vertical integration that locks customers in.

Vendors try to control everything—data collection, storage, processing, visualization, analytics. Buy their complete stack or nothing works properly. This makes vendors' lives easier but customers' lives harder.

What Scania wants instead:

Modular, best-of-breed components: Use the best database for time-series data, the best visualization platform, the best AI tooling—from different vendors if necessary.

Data ownership and access: Keep control of operational data. It's strategic IP that determines competitive advantage in the AI era.

Standard interfaces: Connect components through open protocols and APIs, not proprietary middleware that creates dependencies.

Ability to customize: Leverage standard solutions where possible, but maintain ability to differentiate where it matters competitively.

This isn't unique to Scania. Most sophisticated manufacturers share this view. But vendor incentives push in the opposite direction: lock-in drives recurring revenue.

The vendors that succeed long-term will be those that embrace openness:

  • Expose rich APIs and support standard protocols
  • Let customers own their data and access it directly
  • Focus on excellence in specific components rather than mediocre full-stack solutions
  • Enable integration with competitors' products when it serves customers

Knowledge graphs exemplify this philosophy. They add a semantic layer that works with any underlying data system, connects to any analytics tool, and supports any AI platform. They're inherently open.

From Multi-Agent Research to Manufacturing Reality

Ferreira's excitement about AI in manufacturing centers on multi-agent systems—networks of specialized AI agents that coordinate to control complex operations.

This isn't science fiction. The technology exists:

  • Agent frameworks that handle orchestration and communication
  • Foundation models that provide reasoning capabilities
  • MCP (Model Context Protocol) that makes services discoverable to agents
  • Knowledge graphs that provide shared understanding

What's missing is production-scale implementation. Most companies are still figuring out basic AI—deploying a single model for predictive maintenance or quality inspection. Multi-agent orchestration feels distant.

But Ferreira sees the path: Start with knowledge graphs that model processes and systems semantically. Build AI agents that can query these graphs and understand domain context. Make manufacturing services available through standard protocols. Let agents gradually take on more autonomous decision-making.
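
A toy sketch of the discoverability piece: agents find the services relevant to their domain through a shared registry. In practice that registry role could be played by MCP servers described in the knowledge graph; the services and agents named here are illustrative.

  # Toy sketch: agents discovering manufacturing services through a shared
  # registry. Service names, domains, and agents are illustrative assumptions.

  SERVICE_REGISTRY = {
      "adjust_line_speed": {
          "domain": "production",
          "description": "Set target speed for a production line",
      },
      "reroute_material": {
          "domain": "logistics",
          "description": "Send material through an alternative route",
      },
  }

  class Agent:
      def __init__(self, name: str, domain: str):
          self.name, self.domain = name, domain

      def available_services(self) -> list[str]:
          """Discover only the services relevant to this agent's domain."""
          return [s for s, meta in SERVICE_REGISTRY.items() if meta["domain"] == self.domain]

  agents = [Agent("production-agent", "production"), Agent("logistics-agent", "logistics")]
  for agent in agents:
      print(agent.name, "can call:", agent.available_services())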

The future factory: Not controlled by humans making real-time decisions, but by AI agents that continuously optimize, predict disruptions, simulate improvements, and coordinate execution—all while understanding constraints, quality requirements, and business objectives through their knowledge graph foundation.

The factory becomes, in Ferreira's words, "its own entity that will try to optimize itself in an autonomous way."

That's when digital twins evolve from passive observers to active participants.

Practical Steps for Getting Started

Knowledge graphs sound complex (they are) and distant (they don't have to be). Here's how to begin:

Phase 1: Map what you know

  • Pick one production line or process
  • Document how equipment connects and interacts
  • Identify the different "contexts" where the same data means different things
  • Create a simple ontology showing these relationships

Phase 2: Connect to existing data

  • Don't move data—connect to it where it lives
  • Use tools like Scania's open-source Search Graph for data virtualization
  • Query across multiple sources without replication (sketched below)
  • Validate that the semantic layer accurately represents reality
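
Phase 2 in code could look like the sketch below: one query against a virtualization endpoint that exposes several underlying sources as a single graph. The endpoint URL and property names are placeholders, not a real Scania service.

  # Sketch: querying a virtualization endpoint that exposes several underlying
  # sources as one graph. Endpoint URL and properties are placeholders.
  # Requires: pip install SPARQLWrapper
  from SPARQLWrapper import SPARQLWrapper, JSON

  sparql = SPARQLWrapper("https://kg.example.internal/sparql")   # hypothetical endpoint
  sparql.setReturnFormat(JSON)
  sparql.setQuery("""
  PREFIX ex: <http://example.org/factory#>
  SELECT ?site ?line ?cycleTimeSeconds WHERE {
      ?line a ex:ProductionLine ;
            ex:locatedInPlant ?site ;
            ex:cycleTimeSeconds ?cycleTimeSeconds .
  }
  ORDER BY ?cycleTimeSeconds
  """)

  results = sparql.query().convert()
  for row in results["results"]["bindings"]:
      print(row["site"]["value"], row["line"]["value"], row["cycleTimeSeconds"]["value"])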

Phase 3: Enable new use cases

  • Start with problems impossible to solve before (multi-domain queries, cross-site comparisons)
  • Build confidence that knowledge graphs deliver value
  • Document patterns that work for replication

Phase 4: Scale systematically

  • Extend the ontology to cover more equipment and processes
  • Connect additional data sources as needed
  • Deploy the same approach at other sites
  • Build the enterprise knowledge layer gradually

Phase 5: Enable AI

  • Make the knowledge graph available to AI systems
  • Start with simple automation (auto-generating work instructions, answering technical questions)
  • Progress toward autonomous optimization and control
  • Build multi-agent systems as capabilities mature

The key insight: You don't need the complete solution upfront. Knowledge graphs are designed to grow and extend. Start small, prove value, expand systematically.

Why This Matters Now

Three trends converge to make knowledge graphs suddenly critical:

AI needs semantic understanding: Foundation models are powerful but generic. They need rich, structured domain knowledge to operate effectively in manufacturing. Knowledge graphs provide this.

Complexity is increasing: More SKUs, shorter product lifecycles, distributed supply chains, regulatory requirements. Traditional data architectures can't keep up. Knowledge graphs handle complexity by design.

Autonomy is the future: From predictive maintenance to autonomous optimization, manufacturers are pushing toward systems that operate with minimal human intervention. These systems need machine-understandable knowledge—exactly what ontologies and knowledge graphs deliver.

Companies that build knowledge graphs now create a strategic asset: structured understanding of their operations that becomes more valuable as AI capabilities advance.

Those that stick with traditional data architectures will find their AI initiatives repeatedly hitting walls. The models are ready. The bottleneck is giving them the knowledge to operate on.

Your unified namespace collects data. Your time-series databases store it efficiently. Your edge infrastructure processes it in real-time.

But without knowledge graphs adding semantic understanding, you're building impressive infrastructure for AI that can't fully leverage it.

The digital twin revolution isn't about better visualization. It's about creating autonomous systems that understand operations deeply enough to optimize without constant human guidance.

Knowledge graphs are how you get there.

Kudzai Manditereza

Founder & Educator - Industry40.tv

Kudzai Manditereza is an Industry 4.0 technology evangelist and creator of Industry40.tv, an independent media and education platform focused on industrial data and AI for smart manufacturing. He specializes in Industrial AI, IIoT, Unified Namespace, Digital Twins, and Industrial DataOps, helping digital manufacturing leaders implement and scale AI initiatives.

Kudzai hosts the AI in Manufacturing podcast and writes the Smart Factory Playbook newsletter, where he shares practical guidance on building the data backbone that makes industrial AI work in real-world manufacturing environments. He currently serves as Senior Industry Solutions Advocate at HiveMQ.