November 8, 2025

Edge Computing for Manufacturing: A Guide to Distributed Data Architecture and Industrial AI

Production data in manufacturing operations has historically been isolated within individual facilities, creating challenges for enterprise-wide analytics and optimization. As manufacturing organizations seek to implement AI and machine learning at scale, understanding how to architect data infrastructure that spans from cloud to edge devices becomes increasingly important.

Angelo Corsaro, CEO and CTO at ZettaScale, brings over 25 years of experience at the intersection of distributed computing and industrial systems. His insights provide a comprehensive view of edge computing that goes beyond the common definition of placing servers closer to machines. This article explores edge computing as a cloud-to-device continuum and examines how this architectural approach enables data-driven manufacturing operations.

Understanding Edge Computing: Definition and Evolution

The confusion around edge computing stems from its dual origins. In 2014, two separate communities independently recognized that cloud-centric architectures couldn't meet their needs.

Telecommunications providers developed Mobile Edge Computing (MEC) to reduce latency by intercepting requests at network boundaries before they reached backend data centers. For telcos, defining the "edge" was straightforward—it was their network infrastructure.

Simultaneously, industrial teams realized that the Industrial Internet of Things demanded a different approach. Bandwidth limitations made it impossible to push all production data to the cloud. Data sovereignty requirements meant strategic operational data couldn't leave the premises. Connectivity reliability issues—particularly in mobility applications like autonomous agricultural equipment—made cloud dependency unacceptable.

This parallel evolution created lasting confusion about what "edge" actually means. The reality is that the edge location changes depending on your application. A better framework is the cloud-to-device continuum: infrastructure that allows you to provision, monitor, and manage applications seamlessly from data centers down to microcontrollers, placing computation wherever it makes the most sense for your specific use case.

Cloud-Centric Architecture Limitations in Manufacturing Environments

Traditional industrial automation operates with data contained within individual facilities. Each factory has its own production systems, with information typically remaining within facility boundaries.

This architecture presents challenges for three capabilities that data and analytics leaders need to develop:

Cross-facility optimization. When production data stays locked in individual factories, you cannot optimize globally across your manufacturing network. Supply chain decisions lack real-time production visibility. Capacity planning relies on delayed reports rather than live data streams. Quality insights from one facility cannot immediately inform processes at another.

Real-time responsiveness. Cloud round-trips introduce latency that makes real-time control impossible. Production decisions that require millisecond response times cannot wait for data to travel to a distant data center and back. Safety systems cannot depend on cloud connectivity. Quality control adjustments need immediate data processing at the source.

Data sovereignty and security. Strategic production data often cannot leave your premises due to competitive sensitivity or regulatory requirements. Cloud architectures force you to choose between data utilization and data control. This binary choice leaves valuable operational intelligence unused.

The underlying challenge is architectural: systems designed around centralized cloud computing have limitations when addressing the distributed, real-time, and security requirements of modern manufacturing operations.

Industrial Data Infrastructure: Current Protocol Considerations

Current industrial protocols were designed to address specific integration challenges. OPC UA, the widely adopted standard for industrial connectivity, was developed to solve point-to-point integration—connecting machines to control systems. While OPC UA has evolved to include publish-subscribe capabilities, its architecture reflects its request-response origins.

This creates three considerations for data and analytics leaders:

Data modeling requirements. OPC UA requires upfront data modeling and schema management. Adding a new data source or modifying an existing stream requires integration work. For teams running multiple ML experiments, understanding these modeling requirements matters when planning development cycles and resource allocation.
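To make this concrete, here is a minimal sketch of the modeling step using the open-source asyncua Python library, one common way to build OPC UA servers; the endpoint, namespace URI, and node names are all illustrative. Every signal must be registered in the server's address space before a consumer can read it, which is the upfront integration work described above.

```python
import asyncio

from asyncua import Server  # pip install asyncua


async def main():
    server = Server()
    await server.init()
    server.set_endpoint("opc.tcp://0.0.0.0:4840/plant/")

    # Model the address space up front: a namespace, an object
    # representing a machine, and one variable on that machine.
    idx = await server.register_namespace("http://example.org/plant")
    machine = await server.nodes.objects.add_object(idx, "PressLine1")
    temperature = await machine.add_variable(idx, "SpindleTemperature", 0.0)
    await temperature.set_writable()

    # Adding a sensor later means repeating the modeling steps above
    # and updating every consumer that browses this structure.
    async with server:
        while True:
            await asyncio.sleep(1)
            await temperature.write_value(20.0)  # placeholder reading


if __name__ == "__main__":
    asyncio.run(main())
```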

Multicast efficiency considerations. When multiple systems need the same data stream, traditional protocols often require separate unicast connections, with each consumer creating additional network load. This becomes particularly relevant when streaming high-frequency sensor data to multiple analytics applications, ML models, and monitoring systems simultaneously.

Real-time delivery characteristics. Many manufacturing applications require deterministic, low-latency data delivery. Understanding how different protocol architectures handle timing guarantees becomes important in distributed environments where data flows between edge devices, local processing nodes, and centralized systems with specific timing requirements.

Building Data Infrastructure for Distributed Intelligence

Modern edge computing requires fundamentally different communication patterns. Data Distribution Service (DDS), a standard originally developed for military and aerospace applications, offers an alternative architectural model based on three key principles.

Data-centricity rather than message-centricity. Instead of sending messages between endpoints, DDS creates a global data space where applications publish and subscribe to data topics. Systems declare what data they need and what data they produce, then the infrastructure handles routing, filtering, and delivery. This abstraction eliminates point-to-point integration complexity.

Quality of Service guarantees. DDS provides 23 different QoS policies that let you specify exactly how data should be delivered—durability, reliability, latency budgets, resource limits, and temporal constraints. You can say "I need this production data delivered within 5 milliseconds" or "I need the last 100 samples of this sensor stream even if I temporarily disconnect," and the infrastructure enforces these requirements.
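As a sketch of what declaring such requirements can look like, the snippet below uses the Eclipse Cyclone DDS Python binding (the cyclonedds package); the topic name and sample type are illustrative rather than a prescribed schema. The two requirements quoted above map onto the Deadline, Durability, and History policies.

```python
from dataclasses import dataclass

from cyclonedds.core import Qos, Policy
from cyclonedds.domain import DomainParticipant
from cyclonedds.idl import IdlStruct
from cyclonedds.pub import DataWriter
from cyclonedds.sub import DataReader
from cyclonedds.topic import Topic
from cyclonedds.util import duration


# Illustrative sample type: applications publish and subscribe to
# typed topics in a shared data space instead of addressing peers.
@dataclass
class SensorSample(IdlStruct, typename="SensorSample"):
    sensor_id: str
    value: float


# "Delivered within 5 milliseconds" and "keep the last 100 samples
# even if I temporarily disconnect," expressed as QoS policies.
qos = Qos(
    Policy.Reliability.Reliable(max_blocking_time=duration(seconds=1)),
    Policy.Deadline(duration(milliseconds=5)),
    Policy.Durability.TransientLocal,
    Policy.History.KeepLast(amount=100),
)

participant = DomainParticipant()
topic = Topic(participant, "line1_vibration", SensorSample)

writer = DataWriter(participant, topic, qos=qos)  # e.g. a machine gateway
reader = DataReader(participant, topic, qos=qos)  # e.g. an analytics service

writer.write(SensorSample(sensor_id="spindle-7", value=0.42))
for sample in reader.take():
    print(sample.sensor_id, sample.value)
```

Because the requirements live in declarative policies rather than in application code, enforcing them is the infrastructure's job, and the same declarations apply whether the reader runs in the same process or in another facility.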

Location transparency with optimal routing. The same application code works whether components are in the same process, on the same machine, across a local network, or distributed across facilities. DDS automatically discovers peers and establishes optimal communication paths. When infrastructure availability changes—like a vehicle moving from urban 5G coverage to rural areas with limited connectivity—the system adapts without application-level changes.

This architecture provides specific capabilities for analytics teams deploying AI at scale. ML models can subscribe to data streams with clear integration patterns. Feature stores can deliver training data across distributed compute resources. Real-time inference systems can receive sensor data with specified latency characteristics.
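For instance, a real-time inference consumer reduces to a short reader loop. The sketch below reuses the illustrative topic from the previous snippet, and run_inference is a hypothetical stand-in for whatever model-serving call you use; the model neither knows nor cares where each sample was published.

```python
import time
from dataclasses import dataclass

from cyclonedds.domain import DomainParticipant
from cyclonedds.idl import IdlStruct
from cyclonedds.sub import DataReader
from cyclonedds.topic import Topic


@dataclass
class SensorSample(IdlStruct, typename="SensorSample"):
    sensor_id: str
    value: float


def run_inference(sample: SensorSample) -> None:
    # Hypothetical stand-in for your model-serving call.
    print(f"scoring {sample.sensor_id}: {sample.value}")


participant = DomainParticipant()
topic = Topic(participant, "line1_vibration", SensorSample)
reader = DataReader(participant, topic)

# Subscribe to the stream: which machine, gateway, or facility
# published each sample is handled entirely by the infrastructure.
while True:
    for sample in reader.take():
        run_inference(sample)
    time.sleep(0.01)  # simple polling; DDS waitsets avoid the busy loop
```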

Practical Implementation: From Architecture to Operations

Effective applications combine edge intelligence with dynamic infrastructure adaptation. Automotive driver assistance systems provide a useful parallel for understanding manufacturing mobility challenges.

In an urban environment with robust 5G coverage and nearby edge servers, a vehicle can offload computationally intensive processing to infrastructure one network hop away. This conserves battery for driving rather than computing. But as that vehicle moves toward areas with limited connectivity, those same functions must seamlessly migrate back to onboard processing. The application doesn't change—the infrastructure adapts.

Manufacturing faces analogous challenges. A mobile autonomous guided vehicle in a warehouse requires real-time object detection and path planning. When operating in areas with strong WiFi coverage and local edge servers, it can offload model inference to more powerful compute resources. But it must function independently when moving through connectivity dead zones. The same pattern applies to mobile robots in large facilities, autonomous equipment in outdoor agricultural operations, or even production lines where temporary connectivity issues cannot halt operations.

The key architectural requirement is computation mobility—the ability to place processing optimally based on available infrastructure, then migrate that processing as conditions change. This goes beyond traditional failover. It's dynamic optimization of where computation happens based on latency requirements, available bandwidth, compute resources, and energy constraints.

For data and analytics leaders, this has direct implications for ML deployment strategies. Rather than assuming models will always run in one location (edge devices or cloud infrastructure), design for dynamic placement. Build systems where the same model package can execute on edge hardware when necessary or leverage more powerful centralized compute when available. Implement monitoring that tracks not just model performance but also infrastructure conditions that might warrant placement changes.
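A minimal sketch of such a placement policy follows; the InfraStatus fields, the thresholds, and the decision rules are all hypothetical. The point is that placement becomes a runtime decision driven by measured infrastructure conditions rather than a deployment-time constant.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    ONBOARD = "onboard"      # run on the edge device itself
    EDGE_SERVER = "edge"     # offload to a nearby edge server
    CLOUD = "cloud"          # use centralized compute


@dataclass
class InfraStatus:
    # Hypothetical measurements your infrastructure monitoring collects.
    edge_reachable: bool     # is a nearby edge server discoverable?
    link_latency_ms: float   # round-trip time to that edge server
    bandwidth_mbps: float    # available uplink bandwidth


def choose_placement(status: InfraStatus, latency_budget_ms: float) -> Placement:
    """Decide where inference should run; thresholds are illustrative."""
    # Connectivity dead zone: the device must stand alone.
    if not status.edge_reachable:
        return Placement.ONBOARD
    # The network round-trip alone would blow the latency budget.
    if status.link_latency_ms > latency_budget_ms:
        return Placement.ONBOARD
    # Generous budget and ample bandwidth: centralized compute is fine.
    if latency_budget_ms > 100.0 and status.bandwidth_mbps > 10.0:
        return Placement.CLOUD
    # Otherwise offload nearby, conserving device energy for the job
    # the device exists to do (driving, picking, inspecting).
    return Placement.EDGE_SERVER


# An AGV entering a dead zone migrates inference back onboard.
dead_zone = InfraStatus(edge_reachable=False, link_latency_ms=float("inf"),
                        bandwidth_mbps=0.0)
assert choose_placement(dead_zone, latency_budget_ms=20.0) is Placement.ONBOARD
```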

Implementation Strategy: Key Considerations for Data Leaders

Implementing edge computing for industrial AI starts with a comprehensive evaluation of your data architecture. Identify where production data currently resides and how it flows through your organization. Map which production insights require real-time processing versus those that can tolerate cloud latency. Evaluate where data sovereignty requirements necessitate on-premises processing.

Next, assess your current integration architecture. Document the point-to-point connections between data sources and consuming applications. Analyze how engineering resources are allocated between maintaining existing integrations and developing new analytics capabilities. This analysis helps inform decisions about architectural investments.

Finally, design for the cloud-to-device continuum rather than cloud-first or edge-first architectures. The question isn't whether to use edge or cloud—it's how to build infrastructure that lets you place computation optimally for each use case while maintaining operational flexibility.

Success with industrial AI implementation depends on establishing effective data infrastructure. Organizations benefit from addressing foundational data architecture requirements—enabling data accessibility across facility boundaries and supporting operations across the spectrum from device to cloud. Edge computing, when properly designed and implemented, provides this foundational capability for distributed manufacturing operations.