November 2, 2025

Modernizing Industrial Data Architecture for AI-Readiness

More than half of manufacturing executives report that poor data quality is the primary factor preventing them from realizing value from AI investments. But the problem runs deeper than quality alone. Most manufacturers are attempting to build AI capabilities on data architectures designed decades ago, when the primary goal was simply moving information between systems, not training intelligent algorithms.

Jonathan Wise, Chief Technology Architect at CESMII (the Clean Energy Smart Manufacturing Innovation Institute), has worked with over 50 manufacturers on their data modernization journeys. His perspective cuts through the hype to reveal a fundamental truth: you cannot build effective AI on inadequate data infrastructure.

This episode explores the specific challenges manufacturers face with legacy data architectures and provides a practical roadmap for modernization that makes industrial AI initiatives actually work.

The Three Data Problems Blocking Your Industrial AI Success

Before diving into solutions, it's essential to understand exactly why current data architectures fail AI projects. The problem has three distinct layers:

Problem One: Data Isn't Moving:

Many manufacturers struggle to get data out of their systems in the first place. Legacy architectures were built with security boundaries between layers: the Purdue model, with DMZs separating the shop floor from enterprise systems. These boundaries made sense in the 1980s and 1990s, but they now act as barriers that prevent data from reaching the places where it can be analyzed.

What gets historized in traditional systems is often just someone's idea of what might be useful someday. But if you assume your historian contains everything important about your operation, you're limiting your analysis before it even begins. The data that could reveal breakthrough insights might never make it past the first filter.

Problem Two: Data Isn't Meaningful:

Even when data moves freely, it often lacks context. Consider a typical scenario: you've liberated data from your PLCs and moved it all to a data lake. But the labels are just tag names like "temperature" or "DI_01" or "D_0." You might have dozens of sensors all called "temperature" with no indication of what they're measuring or where they're located.

Moving unlabeled data to a lake doesn't create a foundation for AI. It creates a data swamp, a mess that requires extensive cleanup before any meaningful analysis can begin. Good PLC programmers sometimes use naming conventions that embed semantic meaning, but this is far from universal. Without consistent, meaningful labels, your data remains unusable for AI training.
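To make the difference concrete, here is a minimal Python sketch of attaching context to raw tags before they land in a lake. The tag names "temperature", "DI_01", and "D_0" come from the example above; the PLC names, asset paths, units, and the TAG_CONTEXT table are hypothetical and would have to come from people who know the process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagContext:
    asset_path: str   # where the signal lives in the plant (hypothetical)
    measurement: str  # what is actually being measured
    unit: str         # engineering unit


# Hypothetical lookup table keyed by (source PLC, raw tag name).
TAG_CONTEXT = {
    ("PLC_A", "temperature"): TagContext("Plant1/Line1/Oven", "AirTemperature", "degC"),
    ("PLC_B", "temperature"): TagContext("Plant1/Line2/Chiller", "CoolantTemperature", "degC"),
    ("PLC_A", "DI_01"): TagContext("Plant1/Line1/Oven", "DoorClosed", "bool"),
}


def contextualize(source: str, tag: str, value):
    """Return a labeled record, or flag the tag when no context is known."""
    ctx = TAG_CONTEXT.get((source, tag))
    if ctx is None:
        # Do not guess a meaning; mark the record so someone can review it.
        return {"source": source, "tag": tag, "value": value, "contextualized": False}
    return {
        "path": f"{ctx.asset_path}/{ctx.measurement}",
        "unit": ctx.unit,
        "value": value,
        "contextualized": True,
    }
```

Two sensors that are both called "temperature" now resolve to different, readable paths, and an unknown tag such as "D_0" is flagged rather than silently stored.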

Problem Three: Relationships Are Lost:

This is the most critical issue for modern AI approaches. Large language models and generative AI, the technologies generating the most excitement right now, work by learning relationships between data points. An LLM, for example, learns statistical relationships between tokens, which is how it captures the way words connect to other words.

Manufacturing data works the same way. A data point from a machine isn't an island. It has relationships with upstream suppliers, downstream processes, quality measurements, and environmental conditions. These relationships contain the intelligence that AI needs to learn from.

When you move data without preserving these relationships, you lose the ability to train the kinds of AI that everyone is excited about. Getting data moving is step one. Labeling it intelligently is step two. But preserving the relationships between data points is step three, and it's the step most organizations miss.
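One simple way to keep relationships alongside the values is to store them explicitly as subject, relationship, object triples. The sketch below is only an illustration: every node name and relationship label in it is hypothetical, and a real deployment would more likely use a graph database or an information model than a Python list.

```python
# (subject, relationship, object) triples. All names here are hypothetical.
RELATIONSHIPS = [
    ("Line1/Oven/Temperature", "measured_on", "Line1/Oven"),
    ("Line1/Oven", "feeds", "Line1/Cooler"),
    ("Line1/Oven", "consumes_material_from", "Supplier A"),
    ("Line1/Oven", "affects", "Quality/CrustColor"),
    ("Plant/Ambient/Humidity", "influences", "Line1/Oven"),
]


def related(node: str):
    """Return every relationship that touches `node`, in either direction."""
    out = []
    for subject, relation, obj in RELATIONSHIPS:
        if subject == node:
            out.append((relation, obj))
        elif obj == node:
            out.append((f"inverse of {relation}", subject))
    return out
```

Calling related("Line1/Oven") returns the downstream process, the supplier, the quality measurement, and the environmental condition in one lookup, which is exactly the information a flat table of tag values throws away.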

Why Legacy Industrial Architectures Can't Support Modern AI

The data architectures most manufacturers inherited were designed for a different era with different priorities. Understanding why they fall short helps clarify what needs to change.

The Purdue Model Legacy:

Traditional manufacturing IT followed the Purdue model, with strict layers separating field devices, control systems, supervisory systems, and enterprise systems. Each layer had specific purposes, and data moved up through these layers in a controlled way. This made sense when the primary concern was keeping production systems stable and secure.

The problem is that modern analytics and AI need access to raw data from multiple sources simultaneously. Waiting for data to percolate up through layers means losing timeliness and context. By the time data reaches your analytics layer, crucial relationships have been filtered out.

The Historian Bottleneck:

Many organizations treat their historians as the single source of truth for manufacturing data. But historians were designed to store what someone thought would be important, often with sampling rates and retention policies that made sense for manual analysis.

AI and machine learning need different things. They might need higher frequency data. They might need data points that seemed irrelevant to human analysts but contain patterns invisible without algorithmic analysis. They definitely need the relationships between data points preserved.
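A small illustration of why sampling rate matters. The numbers below are invented, and real historians often use deadband or compression settings rather than plain decimation, so treat this as a sketch of the principle and not a model of any specific product.

```python
# Ten minutes of 1 Hz readings: a steady 50.0 with a 5-second spike to 80.0.
raw = [50.0] * 600
for second in range(301, 306):
    raw[second] = 80.0

# Assumed retention policy: keep one sample per minute.
historized = raw[::60]

raw_peak = max(raw)                 # the spike is present in the raw signal
historized_peak = max(historized)   # none of the kept samples hit the spike
```

The raw signal peaks at 80.0, while the historized copy never leaves 50.0. An algorithm trained only on the historized data has no way to learn that the spike happened.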

The Silo Problem:

Even beyond the vertical Purdue layers, most manufacturers have horizontal silos. Production data lives in one system. Quality data in another. Maintenance data in yet another. Supply chain data is somewhere else entirely.

Manufacturing isn't isolated within your four walls. A data point in your facility is related to data from your suppliers because that supplier's material is part of your finished product. When there's a quality issue, you need to trace it back through the supply chain. But if your data architecture treats your operation as an island, these connections are invisible.

The Unified Namespace: A Modern Approach to Data Architecture

The solution to these challenges is what industry leaders call a unified namespace. Think of it as a comprehensive directory system for all your manufacturing data that preserves context and relationships.

What It Actually Does:

A unified namespace creates a single, hierarchical structure where every piece of data has a meaningful, contextualized location. Instead of having "temperature" sensors scattered across disconnected systems, you have paths like "Site/Building/Line/Station/Sensor/Temperature" that tell you exactly what you're looking at.

This isn't just better organization. It fundamentally changes what's possible with your data. When an AI algorithm can walk through these relationships, understanding how data points connect to each other, it can learn patterns that were previously invisible.
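As a rough sketch of what those paths make possible, the Python below builds paths from the levels in the example above and then selects data by position in the hierarchy. The level names follow the "Site/Building/Line/Station/Sensor/Temperature" example; the site, sensor names, values, and the "*" wildcard syntax are assumptions made for illustration.

```python
# Hierarchy levels, taken from the example path in the text.
LEVELS = ("site", "building", "line", "station", "sensor", "measurement")


def make_path(**parts: str) -> str:
    """Build a namespace path, refusing to create one with missing levels."""
    missing = [level for level in LEVELS if level not in parts]
    if missing:
        raise ValueError(f"missing levels: {missing}")
    return "/".join(parts[level] for level in LEVELS)


def select(namespace: dict, pattern: str) -> dict:
    """Return entries whose path matches `pattern`; '*' matches one level."""
    want = pattern.split("/")
    matches = {}
    for path, value in namespace.items():
        have = path.split("/")
        if len(have) == len(want) and all(w in ("*", h) for w, h in zip(want, have)):
            matches[path] = value
    return matches


# Hypothetical plant data.
namespace = {
    make_path(site="Austin", building="B1", line="Line1", station="Oven",
              sensor="TT101", measurement="Temperature"): 182.5,
    make_path(site="Austin", building="B1", line="Line2", station="Filler",
              sensor="TT201", measurement="Temperature"): 21.0,
    make_path(site="Austin", building="B1", line="Line2", station="Filler",
              sensor="PT201", measurement="Pressure"): 3.2,
}

temperatures = select(namespace, "Austin/B1/*/*/*/Temperature")
```

Here temperatures holds the two temperature readings and leaves out the pressure reading, and each key says which line, station, and sensor the value came from.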

Beyond Internal Boundaries:

The real power of a unified namespace emerges when you think beyond your facility's four walls. Your namespace should connect to your suppliers' data and your customers' systems. This doesn't mean everyone has access to everything, but it means the relationships can be preserved and traced when needed.

One manufacturer discovered that a quality issue in their facility traced back to a change their supplier made three months earlier. Without a unified namespace that could follow these relationships, they would have struggled for weeks to identify the root cause. With proper data architecture, the connection was visible immediately.
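The kind of trace described in that story can be sketched as a walk over lot genealogy. The lot numbers, supplier names, and the MADE_FROM and SUPPLIER_OF tables below are hypothetical; the point is only that once the relationships are recorded, the trace is a few lines of code.

```python
from collections import deque

# Hypothetical genealogy: each lot maps to the lots it was made from.
MADE_FROM = {
    "FG-1001": ["WIP-77"],
    "WIP-77": ["RM-A-42", "RM-B-07"],
    "RM-A-42": [],
    "RM-B-07": [],
}

# Hypothetical link from raw-material lots to the supplier that shipped them.
SUPPLIER_OF = {"RM-A-42": "Supplier A", "RM-B-07": "Supplier B"}


def trace_upstream(lot: str) -> list:
    """Breadth-first walk from a finished lot back to everything it contains."""
    seen = {lot}
    queue = deque([lot])
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for parent in MADE_FROM.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def suppliers_for(lot: str) -> list:
    """Suppliers whose material ended up in the given lot."""
    return sorted({SUPPLIER_OF[l] for l in trace_upstream(lot) if l in SUPPLIER_OF})
```

Asking suppliers_for("FG-1001") returns both suppliers, which narrows the search to their recent changes.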

Standards Enable Scale:

For unified namespaces to work across organizational boundaries, they need standards. This is where industry efforts like ISA-95, PackML, and other specifications become crucial. These aren't just academic exercises. They provide common languages that let different systems communicate while preserving meaning.

If your internal namespace aligns with industry standards, connecting with partners becomes straightforward. If every company invents their own structure, integration remains painful and expensive. The good news is that standards are evolving based on real-world feedback. If you find a standard that should work for you but doesn't, that feedback helps improve it for everyone.

IT and OT Convergence: Where the Real Work Happens

Every discussion about modernizing data architecture eventually comes down to IT and OT convergence. This is where data and analytics leaders earn their value.

Understanding the Impedance Mismatch: When IT and OT professionals sit in the same room, there's often an impedance mismatch. They use the same words but mean different things. Their technology evolution paths have been so different that they sometimes struggle to understand each other's perspectives.

IT comes from a world of rapid iteration. "Move fast and break things" is a celebrated motto in software development. You can deploy updates quickly, roll back changes if they don't work, and experiment freely.

OT brings a completely different mindset, one informed by the reality that broken things on a factory floor can hurt people or destroy expensive equipment. The discipline and rigor around engineering that OT professionals bring to the table isn't bureaucracy; it's necessary for safety and reliability.

What Each Side Brings: IT has developed principles and practices over decades that apply perfectly to manufacturing data challenges. Software development practices, data modeling approaches, and architectural patterns for distributed systems all have direct applicability to unified namespaces and modern manufacturing data platforms.

OT understands the actual processes: what's important on the shop floor and how systems really work in practice. They know which measurements matter and which don't. They understand the consequences of getting things wrong.

The Convergence Imperative: These disciplines must work together because they have essential things to teach each other. When you build a unified namespace, you're creating software that people depend on to make critical decisions. If you change that namespace and break dependent systems, the consequences aren't just lost productivity. Safety and product quality are at stake.

This requires IT's architectural thinking combined with OT's engineering discipline. It requires OT's process knowledge combined with IT's data management expertise. Organizations that can genuinely bring these worlds together create sustainable competitive advantages.
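The risk of changing a namespace and breaking dependent systems is the kind of thing a simple pre-deployment check can catch. The sketch below compares two versions of a namespace against a list of consumers; the paths, consumer names, and the breaking_changes helper are hypothetical.

```python
def breaking_changes(old_paths, new_paths, subscriptions):
    """Map each consumer to the paths it depends on that the new version removes."""
    removed = set(old_paths) - set(new_paths)
    report = {}
    for consumer, paths in subscriptions.items():
        lost = sorted(removed & set(paths))
        if lost:
            report[consumer] = lost
    return report


# Hypothetical namespace versions and consumers.
old = ["Line1/Oven/Temperature", "Line1/Oven/DoorClosed", "Line1/Cooler/Flow"]
new = ["Line1/Oven/Temperature", "Line1/Cooler/Flow", "Line1/Cooler/Pressure"]
subscriptions = {
    "safety_interlock_monitor": ["Line1/Oven/DoorClosed", "Line1/Oven/Temperature"],
    "energy_report": ["Line1/Cooler/Flow"],
}

report = breaking_changes(old, new, subscriptions)
```

In this example the report shows that the proposed change would cut off the safety monitor from "Line1/Oven/DoorClosed", which is the sort of finding that should block a release until OT has reviewed it.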

Practical Steps for Getting Started

Understanding the problem and the solution is one thing. Actually modernizing your data architecture is another. Here's a practical approach based on what works:

Step One: Assess Where You Are: Start by mapping your current data landscape honestly. Where does data live? What systems are connected? What isn't connected? What data exists but isn't accessible? What relationships between data points are currently captured, and which are lost?

This assessment isn't about creating shame around technical debt. Every manufacturer has legacy systems. The goal is understanding your specific starting point so you can chart the right path forward.
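The assessment does not need special tooling to start. A plain inventory that answers the questions above for each system is enough to show where the gaps are. The system names and yes/no answers below are made up for illustration.

```python
# Hypothetical inventory: one row per system, one flag per assessment question.
SYSTEMS = [
    {"name": "Historian", "connected": True, "contextualized": False, "relationships": False},
    {"name": "MES", "connected": True, "contextualized": True, "relationships": True},
    {"name": "Quality lab spreadsheet", "connected": False, "contextualized": False, "relationships": False},
    {"name": "Maintenance system", "connected": True, "contextualized": True, "relationships": False},
]

CRITERIA = ("connected", "contextualized", "relationships")


def gaps(systems):
    """For each criterion, list the systems that do not meet it yet."""
    return {
        criterion: [s["name"] for s in systems if not s[criterion]]
        for criterion in CRITERIA
    }


summary = gaps(SYSTEMS)
```

The summary lists, per question, which systems still fall short, and that list becomes the starting point for choosing a first use case.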

Step Two: Define Your First Use Case: Don't try to modernize everything at once. Identify a specific business problem where better data architecture would drive measurable impact. Maybe it's reducing downtime on a critical line. Maybe it's improving first-pass yield on a quality-sensitive process. Maybe it's reducing energy consumption.

The key is picking something concrete with clear success metrics. This becomes your proving ground for new approaches and your source of momentum for broader changes.

Step Three: Start Building Your Namespace: Even if you can't unify everything immediately, start building the structure you want to have. Create meaningful naming conventions. Organize new data properly. Document relationships explicitly.

As you add sensors or connect new systems, do it right from the start. Over time, you can gradually migrate legacy data into this structure, but every new addition should fit the modern architecture you're building toward.
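One way to enforce "do it right from the start" is to check every new path against the naming convention before it is accepted. The rules below (six levels, segments that start with a letter and contain only letters, digits, and underscores) are assumptions chosen for this sketch; your own convention will differ.

```python
import re

# Assumed convention: six levels, matching the example path used earlier.
EXPECTED_DEPTH = 6
# Assumed rule: a segment starts with a letter, then letters, digits, or underscores.
SEGMENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def check_path(path: str) -> list:
    """Return a list of problems with `path`; an empty list means it conforms."""
    problems = []
    segments = path.split("/")
    if len(segments) != EXPECTED_DEPTH:
        problems.append(f"expected {EXPECTED_DEPTH} levels, got {len(segments)}")
    for segment in segments:
        if not SEGMENT.match(segment):
            problems.append(f"invalid segment: {segment!r}")
    return problems
```

A bare tag such as "temperature" is rejected for having one level instead of six, while "Austin/B1/Line2/Filler/TT201/Temperature" passes. Running a check like this when a sensor is onboarded keeps new data out of the swamp without touching legacy systems.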

Step Four: Invest in the Right Skills: This work requires people who understand both IT and OT. They might be IT people who've learned manufacturing or OT people who've developed data skills. Either path works, but you need individuals who can bridge both worlds.

These people aren't necessarily found at the executive level, and that's fine. Often the best candidates are at the manager or director level, people close enough to the technology to understand it deeply but experienced enough to navigate organizational dynamics.

Step Five: Think Beyond Your Walls: From the beginning, design your namespace to connect with external partners. This doesn't mean opening your systems to everyone, but it means thinking about how supplier data and customer data relate to your internal processes.

Align with industry standards where they exist. Participate in standards bodies or work with organizations like CESMII that can represent your needs. The interoperability you build now determines how easily you can collaborate in the future.
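Connecting with partners without opening everything can be sketched as read grants on branches of the namespace. The partner names, paths, and the GRANTS table below are hypothetical, and a real system would enforce this in the broker or platform rather than in application code.

```python
# Hypothetical grants: each partner may read only the listed branches.
GRANTS = {
    "supplier_a": ["Austin/B1/Line2/Receiving"],
    "customer_x": ["Austin/B1/Line2/Packout"],
}


def can_read(partner: str, path: str) -> bool:
    """True if `path` sits under a branch granted to `partner`."""
    segments = path.split("/")
    for prefix in GRANTS.get(partner, []):
        granted = prefix.split("/")
        # Compare whole segments so "Receiving" does not match "ReceivingDock".
        if segments[:len(granted)] == granted:
            return True
    return False
```

Because access follows the hierarchy, a supplier can see the receiving data that relates to its own material while the rest of the line stays private.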

The Culture Change Nobody Talks About

Technical challenges get most of the attention in data modernization projects, but culture change often determines success or failure.

Empowering the Workforce: When you add data visibility for the first time, you're not just installing technology. You're changing how people work. Operators who never had access to process data are suddenly empowered to make data-driven decisions.

This can be exciting, but it can also be uncomfortable. People need training not just on how to use new systems, but on how to interpret data and what actions they're empowered to take based on what they see.

Leadership Commitment: Modernizing data architecture isn't a six-month project that ends when systems are connected. It's an ongoing journey of improvement. Leadership needs to commit to this reality and provide sustained support, not just initial funding.

This includes being willing to make process changes based on data insights. If data reveals that a longtime practice isn't optimal, leadership needs to support changing it. Otherwise, the investment in better data architecture becomes academic.

Celebrating Small Wins: Large-scale transformation can feel overwhelming. Breaking it into achievable steps and celebrating successes along the way builds momentum. When the first unified namespace project reduces downtime or improves quality, that success becomes a template for expanding the approach.

Key Takeaways

If you're responsible for data and analytics infrastructure in manufacturing, here's what you should focus on:

First, understand that poor data quality is often a symptom, not the root cause. The real problem is architecture that doesn't preserve context and relationships. Fixing quality without fixing architecture is treating symptoms.

Second, recognize that modern AI requires different infrastructure than traditional analytics. Historical approaches to data collection were designed for human analysis. Machine learning and generative AI need raw data with relationships intact.

Third, invest in IT and OT convergence. This isn't optional. The organizations that figure out how to genuinely combine these disciplines will dominate their markets. Those that don't will struggle with technical debt and missed opportunities.

Fourth, think beyond your four walls from the beginning. Manufacturing is a connected ecosystem. Your data architecture needs to reflect this reality.

Finally, remember that this is both a technology project and a culture change effort. Technical excellence without organizational readiness leads nowhere. Plan for both.

Conclusion: Building the Foundation for Competitive Advantage

The manufacturers who successfully modernize their data architectures aren't just enabling AI projects. They're building fundamental competitive advantages that compound over time.

Better data architecture means faster problem resolution. It means more consistent quality. It means lower energy consumption and reduced waste. It means suppliers and customers integrate more smoothly into your processes.

Most importantly, it means you can actually realize the promise of AI in manufacturing. Not the hype, but the real value that comes from algorithms that can learn from your data and provide actionable insights.

The path forward isn't easy, especially for organizations with decades of legacy systems. But it's also not as difficult as it seems when broken into achievable steps. Start with an honest assessment. Pick a meaningful first use case. Build the right team. Align with standards. Think beyond your walls.

Every step you take toward modern data architecture makes the next step easier and more valuable. The organizations that start this journey now will look back in five years with clear competitive advantages over those who waited.

The question isn't whether to modernize your data architecture. It's whether you'll lead the transformation or scramble to catch up when your competitors pull ahead. The foundation you build today determines what's possible tomorrow.

Kudzai Manditereza

Founder & Educator - Industry40.tv

Kudzai Manditereza is an Industry 4.0 technology evangelist and creator of Industry40.tv, an independent media and education platform focused on industrial data and AI for smart manufacturing. He specializes in Industrial AI, IIoT, Unified Namespace, Digital Twins, and Industrial DataOps, helping digital manufacturing leaders implement and scale AI initiatives.

Kudzai hosts the AI in Manufacturing podcast and writes the Smart Factory Playbook newsletter, where he shares practical guidance on building the data backbone that makes industrial AI work in real-world manufacturing environments. He currently serves as Senior Industry Solutions Advocate at HiveMQ.