November 2, 2025

AI-Powered Smart Guidance Solutions For Smart Manufacturing

According to ARC Advisory Group, 82% of industrial equipment failures appear to be random—meaning we don't understand their causes. Only 18% are predictable enough to use conventional monitoring systems like vibration analysis or oil testing.

For data leaders in manufacturing, this statistic reveals a massive opportunity disguised as chaos. Those "random" failures aren't actually random; they're happening because conditions arise that we didn't know existed, couldn't detect in time, or couldn't connect to outcomes.

Nikunj Mehta has spent 15 years focused on this problem. As founder and CEO of Falkonry and former executive at C3.ai, he's built his career on one insight: if you can detect conditions in real-time from operational data, you can guide actions that prevent problems across maintenance, quality, energy, and emissions, without waiting 10-20 years for experts to learn what conditions matter through trial and error.

His work shows that the answer isn't better physics models or more manual labeling. It's using deep learning on time series data to automatically discover conditions that human experts would never have time to identify—and doing it without pulling those experts away from running the plant.

From Condition-Based Maintenance to Condition-Based Actions

Most manufacturers understand condition-based maintenance: if you can detect a condition (high vibration, low pressure), you can make better maintenance decisions. The problem? This only works for conditions you already know about.

Condition-based actions is different. It recognizes that new conditions constantly arise that nobody knew could exist. And those conditions don't just affect maintenance—they impact quality, energy consumption, emissions, and reliability simultaneously.

Take the example of a steel company operating blast furnaces. High humidity in the atmosphere causes ore to absorb moisture. Moisture-laden ore requires substantially more energy to process. Higher energy use affects three things:

  • Energy costs and carbon emissions
  • Refractory lining wear (maintenance implications)
  • Product quality variations

Which is the "real" condition? High humidity? Moisture content in ore? Excess energy use in the furnace? They're all conditions connected by cause-and-effect, and you need to detect all of them to make informed decisions.

Or consider a shale gas operation. Depending on how you maintain pressure at the well bottom, you either produce gas or produce emissions—which triggers state penalties. The condition isn't just equipment health. It's the physical state of the well that determines whether your next action extracts value or creates waste.

This is why condition-based actions is more encompassing than condition-based maintenance. It's about detecting any condition affecting your operation—whether you've seen it before or not—and determining the best action given that condition.

The Real-Time Nature of Conditions (And Why 24-Hour Delays Don't Work)

Conditions happen exactly when they happen. Look six or twelve hours later, and you won't see them. This creates a fundamental problem for traditional approaches.

A mining company processing trona (raw soda ash) faced this in 2017-18. Variations in ore grade were affecting their equipment, but the chemical testing process took 24 hours to assess ore quality. By the time they knew the grade, the ore was already processed—and any quality or equipment issues had already occurred.

You cannot use 24-hour-delayed information to control real-time production. This is what makes smart manufacturing "smart"—the ability to detect conditions as they happen and act on them immediately.

In steel production, data comes in at 10, 100, sometimes 200 times per second for a single sensor. At those speeds, traditional approaches of manually setting thresholds or building physics models for every possible condition simply don't scale.

The Dead End of Physics-Based Models and Manual Labeling

For nearly a century, experts built mathematical models of systems—understanding the physics, creating equations, designing algorithms for specific failure modes. This approach works beautifully when governing laws are known and you can create clean models.

But here's the problem: what we omit to create those clean models is the real world.

When you build a physics-based model or a specific algorithm for one failure mode, you're making progress on that narrow problem. But you can't scale it. You can't address the thousands of other conditions that might arise. And you certainly can't discover the conditions you didn't even know existed.

The analogy Nikunj uses is language. For a century, linguists constructed mathematical models of language—valuable scientific progress, but no social impact. Then in the past 20 years, Google and others used computational methods heavily driven by statistics, without necessarily understanding the model of language. Today we have ChatGPT because the approach shifted from perfect models to pattern discovery at scale.

The same transformation is happening in industrial operations. The gap that remains won't be filled by more physics models. Just like autonomous driving evolved from individual use cases (turning corners, rural roads) to deep learning that generalizes across scenarios, condition intelligence needs general methods that work across many situations.

Why Deep Learning Changes Everything (Without Expert Labeling)

The biggest hurdle in smart manufacturing has been the lack of labeled datasets. When you need labels, you're pulling experts away from running the plant to teach the AI. This slows down the entire manufacturing sector—and they don't have spare people anyway, given the retirement cliff many organizations face.

The productivity plateau over the past 10 years happened partly because we're not getting as much out of technology as we're putting into it. Experts drawn out of manufacturing to label data for AI aren't moving the industry forward.

Falkonry's breakthrough came from applying deep learning to detect conditions without requiring labeled datasets. The system can identify:

  • Statistical deviations from normal patterns
  • Threshold violations (simple but necessary)
  • Complex shapes, waveforms, distributions, and frequencies
  • Correlations across multiple signals

Critically, it does this in a computationally efficient, explainable, and generalized way that works across different scenarios without starting from scratch each time.

This matters because you can't manually build individual models for every possible condition across thousands of parameters streaming in hundreds of times per second. The only way forward is general methods that automatically discover conditions.
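To make the first bullet concrete, here is a minimal, purely illustrative sketch of unsupervised detection on a single signal: a rolling z-score against a trailing baseline. This is not Falkonry's method (the article describes deep learning across shapes, frequencies, and correlations between many signals); it only shows how a condition can be flagged with no labeled failure examples at all.

```python
from statistics import mean, stdev

def detect_conditions(signal, window=50, z_threshold=3.0):
    """Flag samples that deviate from the recent rolling baseline.

    Unsupervised: no labeled failures are needed. "Normal" is
    learned from the trailing window of the signal itself.
    """
    flagged = []
    for i in range(window, len(signal)):
        baseline = signal[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: z-score is undefined
        z = (signal[i] - mu) / sigma
        if abs(z) > z_threshold:
            flagged.append((i, signal[i], round(z, 2)))
    return flagged

# A steady signal with one injected deviation at index 120
readings = [10.0 + 0.01 * (i % 5) for i in range(200)]
readings[120] = 14.0
print(detect_conditions(readings))  # flags the sample at index 120
```

A production system would replace the single rolling z-score with models that also handle waveform shapes and cross-signal correlations, but the key property is the same: the baseline comes from the data, not from an expert's labels.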

Building Trust: The Roadmap That Actually Works

Smart systems need to earn trust before people will act on their recommendations in real-time. Here's the three-phase roadmap successful manufacturers follow:

Phase 1: Trusted Leaders

Start with people who hold strong reputations within the organization, plant, or field. They take the first steps moving the organization toward smart guidance, and their credibility helps others trust the system.

Phase 2: Retrospective Analysis

Use the guidance system to look backwards at known incidents: What was happening at time T when that problem occurred? Can the system identify the condition that needed action? Can it help understand proximate causes?

This builds confidence without real-time risk. When incidents occur and someone asks "why did this happen?" the smart system provides proximate causes immediately, improving recovery processes. This proves the system's intelligence level before anyone relies on it for forward-looking decisions.

Connect IIoT sensor data or SCADA data—a few hundred or few thousand parameters depending on comfort level—for a specific line or subsystem. You don't need to know which conditions can be predicted from those parameters. You just need to give the system data from operations where trusted leaders have expertise.
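The retrospective question, "what was happening at time T," can be pictured as a simple ranking exercise: compare each parameter's pre-incident window against its own earlier baseline and sort by deviation. A hypothetical stdlib-only sketch (the signal names, window size, and scoring are illustrative, not taken from any product):

```python
from statistics import mean, stdev

def rank_signals_at_incident(history, incident_idx, window=20):
    """Rank parameters by how abnormal they were just before an incident.

    history: dict of signal name -> list of samples (same sample rate).
    Compares the pre-incident window against the earlier baseline.
    """
    scores = {}
    for name, samples in history.items():
        baseline = samples[:incident_idx - window]
        recent = samples[incident_idx - window:incident_idx]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            sigma = 1e-9  # flat baseline: any drift is highly abnormal
        # Average absolute z-score over the pre-incident window
        scores[name] = mean(abs(x - mu) / sigma for x in recent)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

history = {
    "furnace_pressure": [5.0] * 80 + [5.4] * 20,  # drifted before incident
    "motor_current":    [12.0, 12.1] * 50,        # stayed normal
}
print(rank_signals_at_incident(history, incident_idx=100))
```

Even this toy version shows why retrospective mode builds trust cheaply: it answers "why did this happen?" for incidents that already occurred, with no real-time risk attached.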

Phase 3: Real-Time Alerting

Once people develop confidence, transition from analysis to alerting. But only after ensuring:

  • Low false positive rate (people won't ignore alerts if they're mostly wrong)
  • Clear, actionable information (operators understand what alerts mean)
  • Integration with work management systems (alerts flow into existing workflows)
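The false-positive requirement in particular is often met by treating alerts as persistent conditions rather than one-off detections. A hypothetical gate, assuming an upstream detector that emits a boolean per evaluation window (the k-of-n thresholds here are illustrative):

```python
from collections import deque

class AlertGate:
    """Suppress one-off spikes: alert only when a condition persists.

    Requiring k abnormal results among the last n windows trades a
    little detection latency for a much lower false-positive rate.
    """
    def __init__(self, k=3, n=5):
        self.k = k
        self.recent = deque(maxlen=n)

    def update(self, condition_detected: bool) -> bool:
        self.recent.append(condition_detected)
        return sum(self.recent) >= self.k

gate = AlertGate(k=3, n=5)
# A single spike does not alert; a sustained condition does.
observations = [False, True, False, False, True, True, True]
alerts = [gate.update(obs) for obs in observations]
print(alerts)  # alerts only once the condition has persisted
```

The same idea extends to the other two bullets: each alert that does fire should carry the condition context operators need, and should be posted into the existing work-management queue rather than a separate dashboard.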

This progression—trusted people, retrospective validation, then real-time alerting—builds the organizational confidence necessary for smart manufacturing to succeed.

Real Results: $1M Saved, 1% Unproductive Time Eliminated

ArcelorMittal's Calvert cold rolling plant provides a published example. They suffered 3-4% unproductive time due to "breaks"—the metal being processed breaking during production. At that scale, 3-4% translates to $4-5 million per year in lost value.

Falkonry helped them reduce unproductive time by one percentage point. That didn't eliminate all breaks, but it reduced the problem enough to save approximately $1 million annually on a single line.

Other customers using Falkonry in analysis mode see returns in the high six figures to low seven figures—for a single line. The value comes from reducing lost production time that can range from an hour or two to multiple days per incident.

The Use Case Question: Maintenance vs. The Whole Operation

Manufacturers often stumble on how to frame "use cases." If you define a use case as one specific failure mode, you can't scale—and that failure mode might not occur for a long time, so you don't see ROI even if you have the right solution.

Better framing: the use case is "maintenance" or "emissions reduction" or "process optimization"—not one specific failure.

Most organizations start with reliability or process improvement. Sometimes a process engineer has an ongoing goal to improve energy efficiency. Sometimes maintenance downtime is the trigger. The starting point varies, but successful implementations recognize that engineering teams—not just maintenance teams—need to drive smart manufacturing.

Why? Because conditions are a new kind of resource. Process engineers will exploit conditions to optimize entire line operations, not just one objective. They might start with maintenance, but they naturally expand to energy and quality because conditions affect all of them simultaneously.

The Data Integration Reality: MQTT vs. Parquet Files

For organizations working with slower data—like oil and gas extraction—MQTT works well as the communication protocol. For high-speed production environments (steel, aluminum), where a single sensor can generate data 10-200 times per second, many organizations instead transfer Parquet files to S3 buckets every few minutes.

The lesson: be prepared for hybrid approaches. The ideal future is one mechanism for both, but today's reality requires flexibility based on data velocity and volume characteristics.
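The Parquet-to-S3 path described above is essentially time-based micro-batching: buffer high-rate readings in memory and ship one file per interval instead of one message per sample. A stdlib-only sketch with a stubbed writer (a real pipeline would use a Parquet library such as pyarrow plus an S3 client; both are assumptions beyond this article):

```python
import time

class BatchUploader:
    """Micro-batching for high-rate sensors.

    Buffers readings in memory and flushes them as one batch per
    interval. flush_fn stands in for writing a Parquet file to an
    S3 bucket; here it can be any callable that takes the batch.
    """
    def __init__(self, flush_fn, interval_s=180.0):
        self.flush_fn = flush_fn
        self.interval_s = interval_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def add(self, timestamp, sensor, value):
        self.buffer.append((timestamp, sensor, value))
        if time.monotonic() - self.last_flush >= self.interval_s:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flush_fn(self.buffer)
            self.buffer = []
        self.last_flush = time.monotonic()

batches = []
uploader = BatchUploader(flush_fn=batches.append, interval_s=0.0)
uploader.add(0.0, "roll_speed", 812.5)  # interval of 0 flushes immediately
print(batches)
```

By contrast, the MQTT path for slower data publishes each reading as it arrives; the hybrid reality is choosing per data source based on velocity and volume.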

The Workforce You Actually Need (Not Data Scientists)

Six to seven years ago, the narrative was: every manufacturing company needs to hire data science teams and hold onto them for 20 years until systems are in production across the enterprise.

That roadmap isn't feasible. Just hiring data scientists to churn out algorithmic solutions for individual problems doesn't move you forward. It might give confidence there's a path, but it doesn't deliver scaled results.

The same applies to data engineers—moving and preparing data for each project doesn't scale when there's a common approach that makes it work successfully across AI systems.

The workforce you need? Navigators and drivers—people who take guidance and act on it. In steel companies worldwide (Mexico, Scandinavia, Canada, US), this role is filled by "advanced maintenance teams" combining:

  • Veterans who've done maintenance for years (understand the operation)
  • Recent mechanical or electrical engineering graduates (bring fresh knowledge)

Neither group needs to be software engineers, mathematicians, or physicists. They just need to understand the operation well enough to evaluate guidance and decide: "This condition requires action. I'm going to act now."

What This Means for Your Data Strategy

For data and analytics leaders, the implications reshape how you think about smart manufacturing:

Stop waiting for perfect labeled datasets: If your AI approach requires experts to label data for months before deployment, you're stuck. The breakthrough comes from methods that discover conditions without supervision.

Think in conditions, not failure modes: Building 50 individual models for 50 failure modes doesn't scale. Building one system that detects thousands of conditions across any failure mode does.

Time series data is the foundation: If you're not treating operational data as time series—with all the temporal patterns, frequencies, and correlations that matter—you're missing the signal.

Trust precedes adoption: Don't push real-time alerting before proving the system works retrospectively. Retrospective validation builds the confidence necessary for forward-looking actions.

Engineering leads, not just maintenance: If you position smart manufacturing as "for the maintenance team," you'll limit impact. Engineers who optimize entire operations will exploit conditions across all objectives.

Commercial solutions over bespoke development: Government agencies are choosing commercial off-the-shelf solutions for sustainability. Manufacturers should do the same. Build internal data science capabilities for governance and organization-specific optimization, not to reinvent foundational technology.

The Urgency: Executive Decisions, Not Technical Experiments

Smart manufacturing is no longer a technical risk—it's a business risk. The economics have been worked out. ROI is proven. The technology is ready.

This means responsibility for decisions must sit with executives, not technical teams waiting for executive attention. Smart manufacturing should be led by executive-level people who can make business risk decisions quickly.

Too many delays happen because it's not treated as an executive decision. The result? Slow progress while competitors move faster.

The Bottom Line

82% of failures don't appear random because there's no pattern. They appear random because the conditions causing them happen too fast to detect manually, involve too many interrelated variables to model individually, or simply have never occurred before so nobody thought to watch for them.

The answer isn't hiring more data scientists to manually label conditions. It isn't building 1,000 physics models for 1,000 failure modes. It's using deep learning on time series data to automatically discover conditions—including ones experts didn't know existed—and providing real-time guidance on actions to take.

The manufacturers who figure this out will reduce the "random" 82% down to maybe 35%—using smart guidance for 50% of situations and conventional systems for the remaining 15%. That shift from chaos to understanding translates directly to millions of dollars per line in reduced downtime, lower energy costs, better quality, and fewer emissions.

For data and analytics leaders, the path forward is clear: build trust through retrospective analysis, prove the system's intelligence on known incidents, then transition to real-time alerting integrated into work systems. Start with trusted leaders on one line. Expand from there.

The conditions are happening right now in your operations. The only question is whether you're detecting them.

Kudzai Manditereza

Founder & Educator - Industry40.tv

Kudzai Manditereza is an Industry4.0 technology evangelist and creator of Industry40.tv, an independent media and education platform focused on industrial data and AI for smart manufacturing. He specializes in Industrial AI, IIoT, Unified Namespace, Digital Twins, and Industrial DataOps, helping digital manufacturing leaders implement and scale AI initiatives.

Kudzai hosts the AI in Manufacturing podcast and writes the Smart Factory Playbook newsletter, where he shares practical guidance on building the data backbone that makes industrial AI work in real-world manufacturing environments. He currently serves as Senior Industry Solutions Advocate at HiveMQ.