November 2, 2025
In manufacturing, time-series data is everywhere, but most plants are still relying on static dashboards, lagging insights, and manual root-cause analysis.
The result?
- Downtime that’s explained, not prevented
- Insights that arrive only after the line has already slowed down
- Human effort wasted on repeat investigations
AI agents transform the way manufacturers harness time-series data. They process live sensor feeds while simultaneously referencing historical records, enabling instant anomaly detection and context-aware decisions.
They can correlate vast time-series data with external factors to uncover insights missed by rigid statistical models.
They can trigger actions like maintenance tickets or production adjustments directly from analytics, bypassing manual interpretation steps. They connect the dots across thousands of data streams in real time, automatically identifying root causes and recommending actions on the fly.
According to Jeff Tao, CEO and founder of TDengine, the root of the problem is existing manufacturing data collection infrastructure, which was never designed for modern AI.
Traditional data historians do what they're designed for brilliantly: reliably storing process data from PLCs, SCADA systems, and instruments. They've proven themselves over decades in mission-critical environments.
But they're fundamentally closed systems. Tao breaks down exactly what this means:
Limited connectivity options: Most historians use proprietary interfaces. Getting data out requires vendor-specific tools and APIs. Want to run a Python analysis? You'll need special libraries and documentation that varies by vendor.
Restricted query capabilities: SQL support is often an afterthought or nonexistent. You learn the historian's custom query language or use their GUI tools. Modern data scientists working in Jupyter notebooks struggle to integrate efficiently.
Walled garden ecosystems: The historian, visualization tools, analytics capabilities—everything comes from one vendor. Integrating external AI platforms or open-source tools requires complex middleware layers.
Performance not optimized for analytics: Historians prioritize write performance and data retention. Complex analytical queries that modern AI requires can be slow or resource-intensive.
This architecture made sense when historians were primarily for compliance reporting and troubleshooting. But it creates friction for every AI project you try to build.
Time-series databases take a different approach: open by default, SQL-native, designed for both operational storage and analytical workloads. The difference isn't just technical, it's strategic.
Modern time-series databases were built for the big data and AI era. TDengine exemplifies this approach with several architectural decisions that matter for manufacturing AI:
Native SQL support with full ecosystem access: Query your operational data using standard SQL. Connect from Python, R, Spark, or any tool data scientists already use. No proprietary query languages, no special drivers, no friction.
Comprehensive data processing built-in: Beyond just storage, TDengine includes caching, message queuing, and stream processing. These are essential components for AI pipelines, but with traditional historians you'd need to integrate Redis, Kafka, and Flink separately.
Industrial protocol connectors without code: Read directly from MQTT, OPC UA, OPC DA, Sparkplug B, and major historians (PI System, Wonderware, Aspen IP-21) without writing integration code. OT engineers can connect data sources using configuration, not programming.
Out-of-order data handling: Unlike IT monitoring databases (Prometheus, VictoriaMetrics), industrial time-series databases expect data to arrive out of sequence due to network latency or buffering. This is critical for real-world manufacturing where perfect ordering is impossible.
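To make the requirement concrete, here is a minimal Python sketch of what out-of-order handling means at the ingestion layer. The data structures are illustrative, not TDengine internals:

```python
from dataclasses import dataclass

@dataclass
class Reading:
    ts: float     # epoch seconds
    value: float

def merge_out_of_order(stored: list[Reading], incoming: list[Reading]) -> list[Reading]:
    """Merge late-arriving readings into an already-sorted series.

    Industrial gateways buffer during network outages and flush later,
    so a new batch can predate data already on disk. The database must
    slot these readings into the right position, not append blindly.
    """
    combined = stored + incoming
    combined.sort(key=lambda r: r.ts)  # stable sort keeps equal timestamps in arrival order
    return combined

# A buffered batch from a gateway arrives after newer live data was written.
stored = [Reading(100.0, 20.1), Reading(103.0, 20.4)]
late_batch = [Reading(101.0, 20.2), Reading(102.0, 20.3)]
series = merge_out_of_order(stored, late_batch)
print([r.ts for r in series])  # → [100.0, 101.0, 102.0, 103.0]
```

A real engine does this against compressed on-disk blocks, but the contract is the same: queries always see the series in time order regardless of arrival order.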
Massive performance optimization: TDengine's "one table per device" model with super tables provides better compression, faster queries, and higher ingestion rates than general-purpose time-series databases. When you have thousands of sensors updating every second, this matters.
The architecture difference is fundamental: historians were designed to archive operational data, while modern time-series databases were designed to power AI applications on operational data.
Here's where things get interesting. Even with better data infrastructure, deploying AI in manufacturing requires choosing models, managing training pipelines, handling multiple algorithms, and dealing with constant innovation in AI techniques.
Most manufacturers lack the data science teams to do this at scale. And those that have data scientists watch them spend more time on infrastructure than on solving business problems.
TDengine's solution: AI agents that hide complexity behind simple SQL interfaces.
The concept: TD GPT acts as an intelligent proxy between your time-series data and various AI models—whether that's ChatGPT, LLaMA, DeepSeek, or custom machine learning algorithms. You don't interface directly with AI models. You call SQL functions.
Want to forecast power consumption? `SELECT forecast(power_consumption) FROM equipment`
Need anomaly detection on vibration data? `SELECT anomaly(vibration_sensor) FROM motor`
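Because the AI functions are exposed through plain SQL, they can be invoked from any client. A minimal Python sketch, reusing the table and column names from the examples above; the execution path in the comment assumes TDengine's Python connector and a running server:

```python
def forecast_query(column: str, table: str) -> str:
    # With TD GPT, "calling a model" is just building an ordinary SQL string.
    return f"SELECT forecast({column}) FROM {table}"

def anomaly_query(column: str, table: str) -> str:
    return f"SELECT anomaly({column}) FROM {table}"

print(forecast_query("power_consumption", "equipment"))
# → SELECT forecast(power_consumption) FROM equipment

# Executing the query requires a live TDengine instance and its Python
# connector (an assumption here, not shown running); roughly:
#   import taos
#   conn = taos.connect(host="localhost")
#   result = conn.query(anomaly_query("vibration_sensor", "motor"))
```

The point is the interface: no model SDK, no inference server client, just a query string handed to the same connection the rest of your tooling already uses.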
The AI agent handles everything behind the scenes: selecting the appropriate model, preparing the data, running inference, and writing results back.
Why this matters: Your process engineers and OT teams can leverage sophisticated AI without becoming data scientists. As AI models evolve—new versions of GPT, better time-series algorithms, improved foundation models—users don't change their code. The AI agent adapts.
General-purpose large language models like GPT-4 know a lot about many things. Custom machine learning models know a lot about one specific thing. Time-series foundation models sit in the middle: specialized for temporal patterns but general enough to work across industries.
TDengine trained their own time-series foundation model based on LLaMA architecture. Why build yet another model when Amazon, TikTok, and others already offer time-series models?
Size and cost: 100 million parameters instead of billions. Smaller models run on cheaper hardware, require less memory, and run inference faster. For cost-sensitive manufacturing environments, this matters.
On-premises deployment: Regulated industries (pharmaceutical, defense, critical infrastructure) can't send data to public cloud LLMs. A smaller model you can deploy on-premises solves data sovereignty concerns while still providing AI capabilities.
Customization path: With their own model and training pipeline, TDengine can create specialized versions for specific customers. Your proprietary process data stays private while you get a model trained specifically on patterns relevant to your operations.
Open SDK for model integration: Don't like their foundation model? Build your own. The SDK lets you plug any algorithm into the platform. Your PhD student working on novel forecasting methods can contribute their model, and every application across your enterprise can use it through the same SQL interface.
The key insight: you don't need the biggest, most sophisticated model. You need models optimized for your use case that run where your data lives and integrate with your existing tools.
Let's walk through a concrete example: predictive maintenance on a critical motor. Here's how data flows through a modern AI-enabled time-series architecture:
Data ingestion: Motor sensors connect to an OPC gateway or MQTT broker. TDengine reads data automatically using built-in connectors—no custom integration code needed. Data includes vibration, temperature, current draw, speed.
Storage and indexing: Time-series database stores raw data with high compression. One table per motor, super table structure grouping similar equipment. Historical data readily available for training, real-time data for inference.
Stream processing triggers: As new data arrives, stream processing rules automatically feed it to AI pipelines. No batch jobs running hourly—inference happens as data flows.
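A toy sketch of the trigger pattern, with hypothetical names, illustrating push-based inference rather than hourly batch jobs:

```python
from typing import Callable

class StreamTrigger:
    """Toy stand-in for database-side stream processing: each new
    reading is pushed straight to its handlers instead of waiting
    for a scheduled batch job."""

    def __init__(self) -> None:
        self._handlers: list[Callable[[str, float], None]] = []

    def subscribe(self, handler: Callable[[str, float], None]) -> None:
        self._handlers.append(handler)

    def ingest(self, tag: str, value: float) -> None:
        for handler in self._handlers:
            handler(tag, value)  # inference fires as data flows in

alerts: list[tuple[str, float]] = []
trigger = StreamTrigger()
# Hypothetical rule: flag motor temperature above 80.0
trigger.subscribe(lambda tag, v: alerts.append((tag, v)) if v > 80.0 else None)
trigger.ingest("motor_temp", 75.0)  # below threshold, nothing happens
trigger.ingest("motor_temp", 91.5)  # handler fires immediately
print(alerts)  # → [('motor_temp', 91.5)]
```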
AI agent orchestration: TD GPT receives the data stream and relevant historical context. It calls the appropriate model—maybe a time-series foundation model for baseline prediction, plus an anomaly detection algorithm for outlier identification.
Model inference: Multiple models analyze the data in parallel. A time-series foundation model generates the baseline prediction while an anomaly detection algorithm scores deviations from it.
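As a stand-in for the real models, this hedged sketch pairs a naive moving-average "forecast" with a z-score "anomaly detector" to show how multiple simple analyses combine on the same vibration stream; the data and thresholds are invented:

```python
import statistics

def baseline_forecast(history: list[float]) -> float:
    # Stand-in for a foundation-model forecast: naive moving average
    # over the last five readings.
    return statistics.fmean(history[-5:])

def anomaly_score(history: list[float], latest: float) -> float:
    # Stand-in for an anomaly detector: z-score of the latest
    # reading against recent history.
    mu = statistics.fmean(history)
    sigma = statistics.stdev(history)
    return abs(latest - mu) / sigma

vibration = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0]  # mm/s, steady state
spike = 4.8                                       # sudden excursion

print(round(baseline_forecast(vibration), 2))  # → 2.04
print(anomaly_score(vibration, spike) > 3.0)   # → True (flag for maintenance)
```

Production systems swap in trained models, but the orchestration shape is the same: several scorers run on one stream and their outputs are combined into a maintenance recommendation.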
Results return to database: Predictions, anomaly scores, and maintenance recommendations flow back to TDengine as new time-series. These aren't separate "AI results"—they're integrated into your operational data model.
Application integration: Maintenance dashboards, alert systems, and work order platforms query results using standard SQL. No special AI infrastructure—just data.
The entire pipeline runs continuously without human intervention. As models improve or new versions deploy, applications don't change. The AI agent abstracts all the complexity.
For pharmaceutical manufacturers, defense contractors, and critical infrastructure operators, sending operational data to public cloud AI services is simply not an option. Data must stay on-premises.
But most AI innovation happens in the cloud. How do you get modern AI capabilities while maintaining data sovereignty?
Deploy everything on-premises: Time-series database, AI agents, and models all run in your data center. No data leaves your network. You maintain complete control.
Use smaller, efficient models: You don't need GPT-4's 175 billion parameters for industrial time-series analysis. TDengine's 100 million parameter model runs on standard servers and often performs better on focused tasks.
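The size argument is easy to check with back-of-the-envelope arithmetic: weight memory alone scales linearly with parameter count (assuming fp32, 4 bytes per parameter; activations and caches add more on top):

```python
def model_memory_gb(params: int, bytes_per_param: int = 4) -> float:
    """Rough memory floor for model weights alone, in gigabytes.

    fp32 = 4 bytes per parameter; quantized deployments would use
    less, but the linear scaling is the point.
    """
    return params * bytes_per_param / 1e9

# A 100M-parameter time-series model vs. a 175B-parameter LLM:
print(model_memory_gb(100_000_000))      # → 0.4  (fits easily on a standard server)
print(model_memory_gb(175_000_000_000))  # → 700.0 (multi-GPU territory)
```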
Custom model training: For highly sensitive processes, train models exclusively on your data. TDengine's engineering platform makes this straightforward—they can train specialized models for individual customers without those customers sharing proprietary information.
Air-gapped deployments: In extreme security environments, install everything without internet connectivity. Models train and run entirely within your secure zone.
The technology to maintain data sovereignty while fully leveraging AI already exists. The barrier isn't technical—it's choosing architectures that prioritize this from the start rather than bolting on security after designing for cloud deployment.
Single-machine pilots are easy. Production deployments across hundreds of lines and dozens of sites are hard. Here's how modern time-series databases enable scale:
Distributed by design: TDengine deploys as a cluster that scales horizontally. Start with 16 CPU cores, add more nodes as data volumes grow. Performance scales linearly with hardware.
Edge-cloud synchronization: Deploy instances at individual sites collecting local data. Aggregate data from multiple sites at regional or corporate levels. Each site runs AI models locally for real-time decisions while corporate sees the complete picture.
Hierarchical architecture: Local installations at each production line. Site-level aggregation for plant-wide analytics. Corporate data lake for enterprise insights. All using the same technology stack, just at different scales.
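One common edge-side pattern implied by this hierarchy is downsampling before forwarding upward: each site averages raw readings into fixed time buckets so the regional and corporate tiers see trends without carrying every raw point. A minimal sketch with an illustrative bucket size:

```python
from collections import defaultdict

def site_rollup(readings: list[tuple[int, float]], bucket_seconds: int = 60) -> dict[int, float]:
    """Average (timestamp, value) readings into fixed time buckets.

    Edge-side aggregation like this keeps local raw data at the site
    while forwarding compact summaries to the tier above.
    """
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in readings:
        bucket = int(ts // bucket_seconds) * bucket_seconds
        buckets[bucket].append(value)
    return {b: sum(vals) / len(vals) for b, vals in sorted(buckets.items())}

raw = [(0, 10.0), (30, 12.0), (60, 20.0), (90, 22.0)]
print(site_rollup(raw))  # → {0: 11.0, 60: 21.0}
```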
Cloud and on-premises flexibility: Some sites in public cloud for easy access and unlimited scale. Others on-premises for security or latency reasons. Mix and match based on requirements without architectural changes.
Observability built-in: Monitor latency, throughput, CPU usage, and model performance across the entire deployment. Identify bottlenecks before they impact production. Track model drift and retrain automatically when accuracy degrades.
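Tracking model drift can be as simple as comparing a rolling error metric against the accuracy measured at deployment; a hedged sketch with hypothetical thresholds:

```python
def drift_detected(errors: list[float], baseline_mae: float,
                   window: int = 5, tolerance: float = 1.5) -> bool:
    """Flag retraining when the rolling mean absolute error climbs
    well above the baseline accuracy recorded at deployment time."""
    recent = errors[-window:]
    rolling_mae = sum(abs(e) for e in recent) / len(recent)
    return rolling_mae > tolerance * baseline_mae

# Prediction errors creep upward as the process drifts away from the
# conditions the model was trained on (values invented for illustration):
errors = [0.9, 1.1, 1.0, 2.4, 2.6, 2.5, 2.7, 2.8]
print(drift_detected(errors, baseline_mae=1.0))  # → True, trigger retraining
```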
The pattern that works: start with one line, prove value, document the architecture, then replicate systematically across sites. Same infrastructure, same tools, same SQL interfaces—just more of them.
Manufacturing leaders often struggle with AI adoption not because the technology doesn't work, but because the path to getting started seems overwhelming. Tao's advice: stop overthinking it and just try.
Start with free tools: TDengine, InfluxDB, TimescaleDB—multiple open-source time-series databases offer free tiers and cloud services. Test for a month without spending money or writing code.
Don't get locked in: Traditional data historians lock you into vendor ecosystems. Modern time-series databases are open by default. If one doesn't work, switch. Your SQL queries and Python scripts still work.
Focus on quick wins: Predictive maintenance, power forecasting, anomaly detection—pick use cases with clear ROI and fast time to value. Prove AI works before tackling harder problems.
Embrace the learning curve: Large language models know your processes even if you don't think they do. Chemical plants, power generation, food and beverage—LLMs trained on vast technical corpora understand these domains. Use them as expert assistants.
Build on open standards: Open-source software, SQL interfaces, standard protocols. Every decision that increases openness reduces future risk and cost.
The barrier to experimenting with modern AI infrastructure has never been lower. The risk of waiting while competitors adopt has never been higher.
Your historian isn't "bad." It does exactly what it was designed for in an era before AI, before big data, before cloud computing.
But the world changed. The AI capabilities manufacturers need—real-time optimization, predictive maintenance at scale, autonomous process control, digital twins that actually adapt—require modern data infrastructure.
Time-series databases aren't just faster historians. They're the foundation for AI-native manufacturing: open by default, SQL-native, and built for both operational storage and analytical workloads.
AI agents sitting on top don't just make AI easier—they make it accessible to everyone. Process engineers, maintenance technicians, quality managers—anyone who can write SQL can leverage sophisticated AI.
The companies that recognize this shift and rebuild their data infrastructure accordingly will unlock capabilities their competitors can't match. The ones that stick with "if it's not broken, don't fix it" will wonder why their AI initiatives keep stalling.
Your data is already there. Your AI models are ready. The question is whether your infrastructure can connect them in ways that actually scale.
Maybe it's time to rethink that historian.
Kudzai Manditereza is an Industry 4.0 technology evangelist and creator of Industry40.tv, an independent media and education platform focused on industrial data and AI for smart manufacturing. He specializes in Industrial AI, IIoT, Unified Namespace, Digital Twins, and Industrial DataOps, helping digital manufacturing leaders implement and scale AI initiatives.
Kudzai hosts the AI in Manufacturing podcast and writes the Smart Factory Playbook newsletter, where he shares practical guidance on building the data backbone that makes industrial AI work in real-world manufacturing environments. He currently serves as Senior Industry Solutions Advocate at HiveMQ.