November 8, 2025

Why Open Source Software Is Critical for Your Industrial IoT Data Strategy

In manufacturing environments where production systems can run for decades, the wrong technology choices today can lock you into costly vendor dependencies tomorrow. Frédéric Desbiens, Program Manager for IoT and Edge Computing at the Eclipse Foundation, makes a compelling case for why open source software should be the foundation of your industrial IoT architecture—especially for data leaders navigating the complex intersection of legacy systems and modern analytics requirements.

The Real Challenge: Your 20-Year Technology Horizon

When enterprise data leaders plan IT systems, they typically work on 3-5 year refresh cycles. But on the factory floor, the timeline is completely different. Production equipment and control systems often remain in place for 20 years or longer, representing massive capital investments that can't simply be replaced.

This creates a unique challenge for data and analytics teams:

  • Legacy integration burden: Your data infrastructure must connect to systems that were installed before cloud computing existed, while simultaneously supporting modern AI and machine learning workloads
  • Technology debt accumulation: Proprietary protocols from major automation vendors create data silos that make it expensive to extract value from your operational data
  • Modernization pressure: Business demands for real-time analytics, predictive maintenance, and AI-driven optimization require you to integrate cutting-edge technologies alongside decades-old systems

The complexity is staggering. Consider that the OPC UA specification—one of the leading standards for industrial data exchange—spans over 3,000 pages. For teams tasked with building data platforms that can serve multiple business units across global manufacturing operations, navigating this landscape without getting locked into proprietary ecosystems is a critical strategic decision.

Why Vendor-Neutral Open Source Matters for Data Architecture

Desbiens uses a powerful analogy: imagine if bread-making had been a closed secret controlled by five global companies. The diversity, innovation, and local variations we enjoy today simply wouldn't exist. The same principle applies to your data infrastructure.

Open source creates the conditions for robust, innovative ecosystems in three specific ways that matter to data leaders:

  • Competition drives innovation: When multiple vendors build on the same open protocols, they compete on implementation quality, performance, and features—not on lock-in. This benefits you by providing better tools at lower costs while maintaining the flexibility to switch providers
  • Community-driven problem solving: Complex challenges in data integration, protocol implementation, and security get solved faster when developers worldwide contribute solutions, rather than waiting for a single vendor's product roadmap
  • Long-term viability: Your architecture decisions today need to remain valid in 2045. Open source projects with active communities and vendor-neutral governance provide more confidence than proprietary systems tied to a single company's business strategy

For data and analytics leaders building enterprise-scale platforms, this translates directly to reduced risk. When you standardize on open protocols such as MQTT and OPC UA, and build on vendor-neutral implementations like Eclipse Milo rather than proprietary APIs, you maintain architectural flexibility while reusing battle-tested components.

Building Your Open Source IoT Data Stack

The Eclipse IoT working group maintains over 50 open source projects that form a comprehensive toolkit for industrial data infrastructure. These aren't academic experiments—they're production-grade implementations used by major manufacturers and cloud providers.

Protocol implementations for every industrial standard:

  • MQTT clients and brokers (Eclipse Paho clients, the Mosquitto broker, and Amlen, IBM's recently open-sourced enterprise broker)
  • CoAP implementations for constrained devices (Eclipse Californium)
  • Lightweight M2M for device management (Eclipse Leshan server, Wakaama client)
  • OPC UA connectivity (Eclipse Milo for Java-based systems)
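Building against these clients usually starts with agreeing on a topic and payload convention. Here is a minimal, stdlib-only sketch of that idea: the topic hierarchy, field names, and unit are illustrative assumptions, not anything the Eclipse projects prescribe, and the actual publish call (via a client such as Eclipse Paho) is shown only in comments.

```python
import json
import time

def make_reading(site: str, line: str, sensor: str, value: float):
    """Build an MQTT topic and JSON payload for one sensor reading.

    Topic layout and field names are a hypothetical in-house convention.
    """
    topic = f"factory/{site}/{line}/{sensor}"
    payload = json.dumps({
        "value": value,
        "unit": "degC",  # assumed unit for this sketch
        "timestamp_ms": int(time.time() * 1000),
    })
    return topic, payload

topic, payload = make_reading("plant-a", "line-3", "temp-01", 71.5)
print(topic)  # factory/plant-a/line-3/temp-01

# An open MQTT client such as Eclipse Paho would then publish it, e.g.:
#   client.connect("broker.example.com", 1883)
#   client.publish(topic, payload, qos=1)
```

Because the convention lives in one small function rather than in device firmware, swapping brokers or client libraries later does not disturb the data contract your analytics depend on.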

Edge computing and orchestration:

  • Eclipse ioFog provides container orchestration specifically designed for distributed edge deployments, letting you manage compute resources across factories worldwide
  • Integration with Project Skupper enables secure connectivity between edge nodes without complex VPN infrastructure—critical when you need to deploy analytics at the edge while maintaining centralized governance

Data transformation and integration:

  • Components for semantic data modeling and transformation
  • Support for the MQTT Sparkplug specification, which adds a standardized topic namespace and payload format on top of MQTT messaging

What makes this ecosystem valuable isn't just the breadth of components, but the fact that they're designed to work together while remaining independently useful. You can adopt pieces incrementally rather than committing to a monolithic platform.

MQTT Sparkplug: Solving the Data Standardization Problem

One of the most significant developments for manufacturing data platforms is the MQTT Sparkplug specification. While MQTT provides reliable messaging, it doesn't define what that data should look like. Different devices publishing to your message broker might use completely different data formats, making it difficult to build analytics that work across your entire operation.

Sparkplug addresses this by defining:

  • Standardized data models that represent industrial devices and their metrics in a consistent way
  • Discovery mechanisms so your analytics systems can automatically understand what data is available
  • State management that ensures your data platform always knows whether devices are online and data is current

For data leaders building centralized analytics platforms that need to ingest data from thousands of devices across multiple facilities, Sparkplug dramatically reduces integration complexity. Instead of writing custom parsers for each device type, you build against a single, well-defined data model.
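To make that concrete, the Sparkplug B specification defines its topic namespace as `spBv1.0/<group_id>/<message_type>/<edge_node_id>[/<device_id>]`. The sketch below parses that namespace with stdlib Python only; the group, node, and device names are invented for illustration, STATE messages (which use a different topic form) are omitted, and real Sparkplug payloads are Protobuf-encoded, which this sketch does not cover.

```python
# Message types defined by Sparkplug B for node (N*) and device (D*)
# birth, death, data, and command messages.
MESSAGE_TYPES = {"NBIRTH", "NDEATH", "DBIRTH", "DDEATH",
                 "NDATA", "DDATA", "NCMD", "DCMD"}

def parse_sparkplug_topic(topic: str) -> dict:
    """Split a Sparkplug B topic into its named components."""
    parts = topic.split("/")
    if len(parts) not in (4, 5) or parts[0] != "spBv1.0":
        raise ValueError(f"not a Sparkplug B topic: {topic!r}")
    _ns, group_id, message_type, edge_node_id, *rest = parts
    if message_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {message_type}")
    return {
        "group_id": group_id,
        "message_type": message_type,
        "edge_node_id": edge_node_id,
        "device_id": rest[0] if rest else None,  # absent on node-level topics
    }

info = parse_sparkplug_topic("spBv1.0/plant-a/DDATA/edge-01/press-07")
print(info["message_type"], info["device_id"])  # DDATA press-07
```

Because every publisher follows the same namespace, a single parser like this can route data from thousands of devices, which is exactly the integration saving described above.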

The specification emerged from real industrial implementations at companies like Cirrus Link and has gained adoption precisely because it solves practical problems that data engineers face every day.

Security in Open Source: Addressing the Critical Concern

For enterprise data leaders, security isn't optional—it's a requirement that must be baked into every layer of your architecture. The Eclipse approach to security deserves attention because it's fundamentally different from how proprietary systems often handle it.

Rather than having a separate "security project," each Eclipse IoT component implements the security standards relevant to its protocol:

  • CoAP implementations include proper DTLS (Datagram Transport Layer Security) support
  • MQTT clients support TLS encryption and certificate-based authentication
  • OPC UA implementations handle the security specifications defined in the standard

This matters because security isn't a feature you add on top—it's part of the protocol implementation itself. When you consume these components, you get security implementations that have been reviewed by the broader community and are actively maintained as standards evolve.
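At the transport layer, "TLS encryption and certificate-based authentication" boils down to configuration like the following stdlib-only sketch. An MQTT client library would accept a context like this (or equivalent parameters); the certificate paths in the comment are placeholders, not real files.

```python
import ssl

def make_mqtt_tls_context() -> ssl.SSLContext:
    """Build a TLS context of the kind an MQTT client uses for an
    encrypted, certificate-validated broker connection."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse legacy TLS versions
    # For mutual TLS, the client would also present its own certificate:
    #   ctx.load_cert_chain("client.crt", "client.key")  # placeholder paths
    return ctx

ctx = make_mqtt_tls_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: broker cert is validated
```

The point mirrors the one above: you are not bolting security on afterward, you are configuring the security machinery the protocol implementation already carries.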

Additionally, the Eclipse Foundation enforces IP cleanliness and contribution processes that ensure code meets quality and security standards before it's accepted. This provides a level of governance that informal open source projects often lack.

Practical Takeaways for Data Leaders

If you're architecting data platforms for manufacturing operations, here's how to apply these insights:

Start with protocols, not platforms: Build your architecture around open protocols (MQTT, OPC UA, Sparkplug) rather than proprietary APIs. This gives you flexibility to change vendors or build custom components as your needs evolve.

Evaluate vendor neutrality: When selecting commercial products, ask whether they're built on open standards and whether you can access your data using standard protocols. Products from Eclipse Foundation members often provide this assurance.

Consider edge-to-cloud architecture early: The Eclipse ioFog approach to container orchestration at the edge, combined with secure connectivity through tools like Project Skupper, provides a blueprint for deploying analytics close to your data sources while maintaining central governance.

Invest in semantic modeling: The Sparkplug specification shows that simply moving data isn't enough—you need standardized data models that make sense across your entire operation. Allocate resources to defining these models for your organization.
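One low-cost way to start that investment is a single in-house metric model that every ingestion path normalizes into before data reaches analytics. The shape below is purely illustrative; the field names are assumptions, not part of any standard.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class Metric:
    """Hypothetical organization-wide metric model: every device-specific
    parser converts its readings into this one shape."""
    site: str
    asset: str          # machine or device identifier
    name: str           # e.g. "spindle_temperature"
    value: float
    unit: str           # e.g. "degC"
    timestamp_ms: int

m = Metric("plant-a", "press-07", "spindle_temperature",
           71.5, "degC", 1700000000000)
print(asdict(m)["unit"])  # degC
```

Once such a model exists, questions like "which assets exceeded 70 degC last shift" can be answered across every facility with one query instead of one per device family.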

Plan for the 20-year horizon: Your data architecture decisions today will impact your ability to implement AI and advanced analytics in 2045. Choose components with active communities and vendor-neutral governance that are likely to evolve with your needs.

Conclusion

The choice between open source and proprietary technologies for industrial IoT isn't just a technical decision—it's a strategic one that determines how flexible, scalable, and cost-effective your data platform will be over the next two decades. For data and analytics leaders in manufacturing, the Eclipse IoT ecosystem provides a proven foundation that balances the need for production-grade reliability with the flexibility to adapt as your requirements evolve.

The message is clear: open standards and vendor-neutral governance aren't nice-to-have features. They're fundamental requirements for building industrial data platforms that can support the AI, machine learning, and analytics capabilities your business will need—not just today, but twenty years from now.