November 11, 2025

Bridging the gap between operational technology (OT) and information technology (IT) remains a significant challenge in manufacturing. Apache StreamPipes addresses this by providing an open-source toolbox that enables non-technical users to connect, analyze, and exploit industrial IoT data streams. This comprehensive guide explores StreamPipes' architecture, features, and capabilities for building connected factory solutions.
The guide is aimed at industrial engineers, data scientists, and architects who want to build IIoT analytics platforms without extensive programming expertise.
Modern factories contain numerous sensors and data sources using diverse protocols—OPC UA, MQTT, Modbus, PROFIBUS, and robotics-specific protocols like ROS. These systems generate substantial data from sensors, production control systems, and automated assembly lines.
The core problem: a significant gap exists between OT personnel familiar with production processes and IT specialists who can perform data analytics. Bridging this gap requires tools that make industrial data accessible to non-technical experts.
As an Apache project, StreamPipes integrates with other Apache Software Foundation projects:
StreamPipes relies on Apache PLC4X for IoT connectivity. PLC4X is a library of drivers for industrial protocols that enables connections to PLCs and production systems.
The platform uses Apache ECharts, a powerful visualization library, for all data visualizations including charts, graphs, and dashboards.
Beyond Apache projects, StreamPipes connects to numerous third-party systems:
StreamPipes consists of several modules supporting different stages of the IIoT lifecycle:
Quickly connect to industrial data sources without programming:
Web-based tool for creating data analytics pipelines:
Visual exploration of historical time-series data:
Real-time monitoring for shop floor personnel:
SDKs for developers:
Data Sources: OPC UA, PLCs, MQTT, REST APIs, file systems
Adapter Library: Microservices containing protocol-specific connectors that are configured through the web UI. Adapters collect data from sources and forward it to the message broker.
Message Broker: Central communication channel between data sources and processing algorithms. StreamPipes supports multiple brokers:
The messaging layer is exchangeable—choose the broker that fits your infrastructure at installation time.
Time-Series Database: Included storage for historical data analysis. StreamPipes can use various time-series databases including InfluxDB.
Pipeline Element Microservices: Standalone services providing business logic for transforming and analyzing live data:
Pipeline Management: Component orchestrating microservices based on user-defined pipelines, interacting with the web UI and programmatic clients.
User Interface: Web-based interface for all StreamPipes modules—pipeline editor, data explorer, dashboards, adapter configuration.
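The data flow between the components listed above can be illustrated with a minimal in-memory sketch. This is not StreamPipes code: `InMemoryBroker`, `run_adapter`, and `threshold_filter` are hypothetical names chosen for illustration, and a simple in-process publish/subscribe object stands in for a real message broker.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Event = Dict[str, float]
Handler = Callable[[Event], None]


class InMemoryBroker:
    """Hypothetical stand-in for the message broker: topic-based publish/subscribe."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        for handler in self._subscribers[topic]:
            handler(event)


def run_adapter(readings: Iterable[Event], broker: InMemoryBroker, topic: str) -> None:
    """Adapter role: collect events from a source and forward them to the broker."""
    for event in readings:
        broker.publish(topic, event)


def threshold_filter(
    broker: InMemoryBroker, in_topic: str, out_topic: str, field: str, limit: float
) -> None:
    """Pipeline element role: forward only events whose field exceeds the limit."""

    def handle(event: Event) -> None:
        if event[field] > limit:
            broker.publish(out_topic, event)

    broker.subscribe(in_topic, handle)


broker = InMemoryBroker()
alerts: List[Event] = []

# Pipeline management role: wire the elements together through topics.
threshold_filter(broker, "raw", "alerts", field="temperature", limit=80.0)
broker.subscribe("alerts", alerts.append)

# Simulated source readings (assumed values, not real sensor data).
run_adapter(
    [{"temperature": 72.5}, {"temperature": 85.1}, {"temperature": 79.9}],
    broker,
    "raw",
)
print(alerts)  # [{'temperature': 85.1}]
```

Because the adapter and the filter only share topic names, either side can be replaced or scaled without touching the other, which is the decoupling the broker is meant to provide.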
StreamPipes uses a microservices architecture providing several advantages:
Scalability: Individual components scale independently based on workload
Flexibility: Add or remove processing elements without affecting the system
Technology Diversity: Different microservices can use different technologies and languages
Fault Isolation: Failures in one component don't cascade to others
StreamPipes supports numerous industrial protocols:
OPC UA: Modern standard for industrial communication
S7 (Siemens): Direct PLC connectivity via Apache PLC4X
Modbus: Widely used in legacy equipment
MQTT: Lightweight messaging protocol
REST APIs: HTTP-based interfaces
File Formats: CSV, JSON uploads for testing
Simulators: Built-in data generators
Connecting data sources follows a consistent workflow:
Step 1: Select Protocol
Choose the appropriate adapter from the library based on your data source type.
Step 2: Configure Protocol Settings
Provide protocol-specific parameters:
Configuration templates can be saved for reuse with similar data sources.
Step 3: Define Format (if applicable)
For brokers or file sources, specify the data format:
Step 4: Model Event Schema
StreamPipes works with event streams having defined schemas. Configure:
Data Types: Integer, Float, Boolean, String, Timestamp
Semantic Descriptions: Add metadata describing what measurements represent
Transformations: Apply unit conversions or calculations during ingestion (e.g., Fahrenheit to Celsius)
Required Fields: Ensure timestamp fields exist (auto-generate if missing)
The event schema modeling step is crucial—it enables StreamPipes to provide intelligent assistance when building analytics pipelines.
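What schema modeling does to an incoming event can be sketched in a few lines of Python. This is an illustration under assumptions, not how StreamPipes implements it: the `SCHEMA` dictionary, the field names, and the `normalize_event` function are hypothetical, and the source temperature is assumed to be in Fahrenheit.

```python
import time
from typing import Any, Dict, Optional

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"machine_id": str, "temperature": float, "running": bool}


def fahrenheit_to_celsius(value: float) -> float:
    return (value - 32.0) * 5.0 / 9.0


def normalize_event(raw: Dict[str, Any], now_ms: Optional[int] = None) -> Dict[str, Any]:
    """Check field types, convert units, and add a timestamp if it is missing."""
    event: Dict[str, Any] = {}
    for name, expected in SCHEMA.items():
        if name not in raw:
            raise ValueError(f"missing field: {name}")
        value = raw[name]
        # Accept plain integers where a float is expected (but not booleans).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise TypeError(f"{name}: expected {expected.__name__}")
        event[name] = value

    # Transformation during ingestion: Fahrenheit to Celsius.
    event["temperature"] = round(fahrenheit_to_celsius(event["temperature"]), 2)

    # Required field: keep the source timestamp or generate one (milliseconds).
    if "timestamp" in raw:
        event["timestamp"] = raw["timestamp"]
    else:
        event["timestamp"] = now_ms if now_ms is not None else int(time.time() * 1000)
    return event


event = normalize_event(
    {"machine_id": "press-01", "temperature": 212, "running": True},
    now_ms=1_700_000_000_000,
)
print(event["temperature"], event["timestamp"])  # 100.0 1700000000000
```

Passing `now_ms` explicitly keeps the example deterministic; in a live setting the current time would be used instead.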
Step 5: Execute Adapter
Start the adapter, which begins collecting data and publishing it to the message broker. Optionally, the data can also be persisted automatically to the time-series database for later analysis.
Apache StreamPipes provides a comprehensive open-source platform for industrial IoT data collection and analysis. Its microservices architecture enables non-technical users to connect diverse industrial protocols, build sophisticated analytics pipelines through drag-and-drop interfaces, explore historical data visually, and create monitoring dashboards—all without programming.
For data scientists and developers, StreamPipes offers Python and Java clients enabling programmatic access, integration with machine learning libraries like River, and a platform for building custom IoT applications. The active Apache community ensures ongoing development, while commercial support options exist for enterprise deployments.
By bridging the gap between OT and IT, StreamPipes empowers manufacturing organizations to unlock the value in their industrial data, enabling data-driven decision-making and continuous improvement across production operations.