When companies talk about digitalization, they often think of beautiful dashboards and Power BI reports. However, the path to these graphs begins long before the dispatcher's screen, with a small sensor on a pipe or machine. This is where the primary signal is generated; it is then repeatedly transformed, cleaned, aggregated, and interpreted before it becomes a clear management report. Modern industrial IoT devices make it possible to build this path "from hardware to business," but if errors are made at any stage, the data becomes dead weight: it sits somewhere, but no one uses it and no one makes decisions based on it.

What is a full data path in Industrial IoT?

The complete data path in an industrial IoT system can be represented as a chain of several logically interconnected stages: measurement of a physical parameter, its digitization and initial processing, transmission via communication channels, storage in the data infrastructure, analytics, and, finally, visualization and use in management processes. It's important to understand that this isn't just a technical pipeline, but an end-to-end business process: if the questions for which the data is being collected aren't addressed from the start, the system risks becoming an expensive telemetry adornment. Therefore, the data path always begins with a simple but crucial question: what decisions do we want to make, and what metrics do we need to support them?

Measurement stage: the role of sensors and the quality of the primary signal

Any industrial IoT system relies on sensors that convert physical quantities into electrical signals. Temperature, pressure, flow, vibration, level, current, voltage—all of these are initially measured by sensors attached to specific points in the process. The quality of this first step determines everything that follows: if a sensor is installed in the wrong place, incorrectly calibrated, or operated outside its specifications, no amount of "smart" analytics will save the situation.

Several common mistakes occur at this stage. First, sensors are selected on the "cheapest wins" principle, without considering operating conditions: high temperatures, vibration, and aggressive media. Second, installation is skimped on: poorly chosen mounting locations, missing thermowells, improper flow-meter taps, and poor cable protection. Third, regular verification and calibration are neglected. As a result, the system neatly assembles "pretty nonsense": the numbers are there and the graphs are drawn, but they reflect measurement errors rather than the actual process.
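One cheap defense against "pretty nonsense" is to flag every reading against the sensor's datasheet range before it enters the pipeline. Below is a minimal sketch in Python, assuming a hypothetical temperature transmitter; the names, limits, and the 2% margin are all illustrative, not from any standard:

```python
from dataclasses import dataclass

@dataclass
class SensorSpec:
    """Hypothetical sensor datasheet limits (all names here are illustrative)."""
    tag: str
    unit: str
    lower: float   # lower measurement limit per the datasheet
    upper: float   # upper measurement limit per the datasheet

def classify_reading(spec: SensorSpec, value: float) -> str:
    """Flag readings that fall outside the sensor's specified range.

    Out-of-range values usually signal a wiring fault, a failed sensor,
    or operation outside datasheet conditions, not a real process value.
    """
    if value < spec.lower or value > spec.upper:
        return "bad"          # do not feed into analytics
    margin = 0.02 * (spec.upper - spec.lower)
    if value < spec.lower + margin or value > spec.upper - margin:
        return "suspect"      # near the range edge: worth checking calibration
    return "good"

spec = SensorSpec(tag="TT-101", unit="degC", lower=-40.0, upper=120.0)
print(classify_reading(spec, 25.0))    # a normal in-range reading
print(classify_reading(spec, 150.0))   # physically impossible for this sensor
```

Readings flagged "bad" here never reach storage as trustworthy values, which is far cheaper than discovering the error in a quarterly report.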

Digitization and pre-processing: what does a controller or smart module do?

After a sensor produces an analog signal or pulses, it must be digitized and converted into a form suitable for transmission and processing. This is where controllers, I/O modules, RTUs, and edge devices come into play. They digitize the analog signal via an ADC, apply filtering, convert values to engineering units, and can perform basic logic.
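To make the conversion step concrete, here is a minimal sketch of scaling a raw ADC count into engineering units for a common 4-20 mA current loop. The ADC resolution, loop mapping, and engineering span are illustrative assumptions, not values from the article:

```python
def counts_to_engineering(raw: int,
                          adc_max: int = 65535,
                          ma_lo: float = 4.0, ma_hi: float = 20.0,
                          eu_lo: float = 0.0, eu_hi: float = 10.0) -> float:
    """Convert a raw ADC count to engineering units for a 4-20 mA loop.

    Assumes the ADC front end maps 0..adc_max linearly onto 0..20 mA.
    The 0..4 mA band is the "live zero": a current well below it usually
    means a broken loop, so it is reported as a fault, not as a value.
    All defaults here are illustrative.
    """
    ma = raw / adc_max * ma_hi              # counts -> milliamps
    if ma < ma_lo * 0.9:                    # well below live zero -> wire break
        raise ValueError(f"loop fault: {ma:.2f} mA")
    span = (ma - ma_lo) / (ma_hi - ma_lo)   # 0.0 .. 1.0 across the loop range
    return eu_lo + span * (eu_hi - eu_lo)   # milliamps -> e.g. bar or degC
```

The live-zero check is exactly the kind of basic logic an edge device can apply before a bad value ever leaves the site.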

It's important to choose the sampling frequency and the amount of preprocessing wisely. An unnecessarily high sampling rate produces gigantic volumes of data that overload the network and storage, even though far less frequent measurements would suffice for the task. Conversely, overly aggressive averaging and filtering can erase the very peaks and anomalies that diagnostics and predictive analytics depend on.
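A simple compromise between these two extremes is to downsample in windows while keeping the minimum and maximum alongside the mean, so short spikes survive aggregation. A sketch, with an illustrative window size and data:

```python
def downsample_keep_peaks(samples: list[float], window: int) -> list[dict]:
    """Aggregate raw samples into fixed-size windows, keeping min/mean/max.

    Plain averaging hides the short spikes that diagnostics needs; storing
    (min, mean, max) per window preserves them at a fraction of the volume.
    """
    out = []
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        out.append({
            "min": min(chunk),
            "mean": sum(chunk) / len(chunk),
            "max": max(chunk),
        })
    return out

raw = [1.0, 1.1, 9.8, 1.0, 1.2, 1.1]   # one short spike at index 2
bins = downsample_keep_peaks(raw, 3)
print(bins)   # the 9.8 spike survives in the first window's "max"
```

With a mean-only aggregate the spike would be smeared into an unremarkable average; with min/mean/max it remains visible to anomaly detection.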

This is where problems with metadata most often arise. Simply recording a number isn't enough; you need to know what object it refers to, what units it was measured in, what channel and sensor type it came from, its timestamp, and its quality status. If this information isn't generated correctly at the controller or edge device level, then at subsequent stages the data turns into endless "Tag_01," "AI_12," and other abstract entities that no one wants to deal with.
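In practice this means each transmitted value should travel as a self-describing record rather than a bare number. A minimal sketch of such a payload; all field names and the asset path are illustrative, not a standard:

```python
import json
from datetime import datetime, timezone

def make_reading(asset: str, measurement: str, value: float,
                 unit: str, source: str, quality: str = "good") -> str:
    """Build a self-describing telemetry message (field names are illustrative).

    A bare number is useless downstream; the payload carries the asset,
    unit, source channel, UTC timestamp, and quality status alongside it.
    """
    return json.dumps({
        "asset": asset,                     # e.g. "shop1/line2/pump3"
        "measurement": measurement,         # e.g. "bearing_temperature"
        "value": value,
        "unit": unit,                       # engineering unit, stated explicitly
        "source": source,                   # physical channel or sensor id
        "quality": quality,                 # good / suspect / bad
        "ts": datetime.now(timezone.utc).isoformat(),
    })

msg = make_reading("shop1/line2/pump3", "bearing_temperature",
                   61.4, "degC", "AI_12")
```

Here "AI_12" is demoted to a source attribute rather than serving as the value's entire identity, which is precisely what keeps later analytics out of the "abstract tag" trap.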

Transmission: protocols, network, and channel reliability

Once the data is prepared, it must be transported to a central location: a server, a cloud, or a specialized platform. A variety of protocols are used here, from classic industrial protocols (Modbus, Profibus, Profinet) to modern IoT-oriented ones (MQTT, OPC UA, etc.). The choice of protocol and architecture significantly affects the system's flexibility, fault tolerance, and scalability.

At the transmission stage, a common problem is the lack of a well-thought-out network architecture. Often everything runs over a single unsegmented network, ignoring the fact that telemetry, video streams, office traffic, and internet access will all compete for resources. As a result, at the most inopportune moments data is delayed or dropped entirely. Another common mistake is the absence of a robust mechanism for buffering and redelivering data across connection interruptions: a controller or edge device sends data once, the connection drops, and those points are lost forever. Gaps appear in reports, and trust in the system declines.

For industrial IoT, not only speed and throughput are critical, but also predictable latency. If the site is remote and connected over cellular or radio, a "store and forward" strategy is essential: data is temporarily stored locally and transmitted in batches as soon as the connection is restored. Without this, reporting becomes a patchwork quilt.
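The store-and-forward idea can be sketched with a local SQLite outbox: readings are persisted first and deleted only after confirmed delivery, so a link outage causes a delay rather than a gap. This is an illustrative sketch, not a production implementation; the class and method names are invented:

```python
import sqlite3
import time

class StoreAndForward:
    """Minimal store-and-forward buffer (illustrative sketch only).

    Every reading is written to a local SQLite queue first; a separate
    flush step sends queued rows in batches and deletes only what the
    sender acknowledged, so nothing is lost across link outages.
    """
    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox "
            "(id INTEGER PRIMARY KEY, ts REAL, payload TEXT)")

    def enqueue(self, payload: str) -> None:
        """Persist a reading locally before any transmission attempt."""
        self.db.execute("INSERT INTO outbox (ts, payload) VALUES (?, ?)",
                        (time.time(), payload))
        self.db.commit()

    def flush(self, send, batch: int = 100) -> int:
        """Try to deliver one batch; send(payloads) returns True only
        on confirmed delivery. Returns how many rows were delivered."""
        rows = self.db.execute(
            "SELECT id, payload FROM outbox ORDER BY id LIMIT ?",
            (batch,)).fetchall()
        if rows and send([p for _, p in rows]):
            self.db.executemany("DELETE FROM outbox WHERE id = ?",
                                [(i,) for i, _ in rows])
            self.db.commit()
            return len(rows)
        return 0   # link down or nothing queued: rows stay for retry
```

A real deployment would add retention limits and backoff, but the core invariant is the same: delete from the local queue only after the remote side has acknowledged.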

Storage: from raw data streams to a structure the business understands

Once on a server or in the cloud, data shouldn't simply be written "into a database somewhere," but structured with future use cases in mind. Typically, at least several levels are distinguished: raw data, cleaned and validated data, aggregated data, and business-oriented data. Industry uses different types of storage: time-series databases for telemetry, classic relational DBMSs for reference data and reports, and object storage for archives and large data sets.

Here, it's especially important to think through a naming scheme and how data is linked to real-world objects: equipment, lines, sections, workshops, contracts, and clients. If the table and tag structure doesn't reflect the actual structure of the enterprise, any analytics becomes a pain. The analyst opens the data warehouse and sees thousands of tags with no obvious connection to equipment or business processes, and then says, "The system has collected a ton of data, but I can't use it."
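One way to avoid the "thousands of opaque tags" problem is a naming standard that encodes the enterprise hierarchy directly in the tag name, so any tag can be resolved back into business context. A sketch under an assumed, purely illustrative naming convention:

```python
import re

# Illustrative naming standard: <site>.<shop>.<line>.<equipment>.<measurement>
TAG_PATTERN = re.compile(
    r"^(?P<site>\w+)\.(?P<shop>\w+)\.(?P<line>\w+)"
    r"\.(?P<equipment>[\w-]+)\.(?P<measurement>\w+)$")

def parse_tag(tag: str) -> dict:
    """Resolve a structured tag name into business context.

    A tag like 'plant1.shop2.lineA.pump-3.flow' carries the enterprise
    hierarchy in its name, so an analyst can group and filter by shop or
    equipment without a lookup table of opaque 'Tag_01' identifiers.
    Tags that violate the standard are rejected at ingestion time.
    """
    m = TAG_PATTERN.match(tag)
    if not m:
        raise ValueError(f"tag does not follow the naming standard: {tag}")
    return m.groupdict()

ctx = parse_tag("plant1.shop2.lineA.pump-3.flow")
print(ctx["shop"], ctx["equipment"])
```

Enforcing the pattern at ingestion, rather than cleaning names retroactively, is what keeps the warehouse navigable as it grows.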

A classic mistake is designing the storage solely for automation engineers, neglecting future reports for technical directors, sales, and finance. Then, when someone attempts an "energy consumption by workshop and product" report, it turns out that the data is stored only by measurement point, with no link to products or orders.

Analytics: transforming data into knowledge and management action

Once the storage architecture is in place, the key moment comes: analysis. Initially, this may involve simple tasks: calculating energy consumption KPIs, monitoring equipment downtime, analyzing accidents and protection response statistics. Then, more complex scenarios are added: predictive diagnostics, optimizing operating modes, and modeling "what will happen if a certain parameter is changed."
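As an example of the simpler end of this spectrum, downtime can be computed directly from a chronological series of state-change events. A minimal sketch; the state names and the sample events are illustrative:

```python
from datetime import datetime

def downtime_minutes(events: list[tuple[datetime, str]]) -> float:
    """Total downtime from chronologically ordered (timestamp, state) events.

    States are assumed to be "run" or "stop"; downtime accumulates from
    each "stop" until the next "run". A stop that is still open at the
    end of the list is deliberately not counted in this simple version.
    """
    total = 0.0
    stop_started = None
    for ts, state in events:
        if state == "stop" and stop_started is None:
            stop_started = ts
        elif state == "run" and stop_started is not None:
            total += (ts - stop_started).total_seconds() / 60
            stop_started = None
    return total

events = [
    (datetime(2024, 1, 1, 8, 0), "run"),
    (datetime(2024, 1, 1, 9, 30), "stop"),
    (datetime(2024, 1, 1, 9, 50), "run"),   # a 20-minute outage
]
```

Even this trivial KPI silently depends on everything upstream: correct timestamps, complete event delivery, and a clean mapping of events to equipment.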

High-quality analytics always rests on accurate timestamps, units of measurement, normalized reference data, and a clear understanding of context. If these factors were neglected in the earlier stages, analytics devolves into endless wrangling with data exports and manual cleaning in Excel. The data is technically there, but its real value tends toward zero.

The feedback loop between analysts, process engineers, and control-system engineers deserves particular attention. Without regular dialogue, analysts don't understand which patterns are meaningful from a process perspective, and engineers don't know which additional parameters should be measured, or at what frequency, to improve the accuracy of the models. Where this communication is established, the system constantly evolves: new sensors are added, processing algorithms are adjusted, and reports are refined.

Visualization and reporting: how to convey meaning to people

The final, but crucial, step is presenting the results in a user-friendly format. This could include online dashboards for dispatchers and shift supervisors, analytical panels for management, regular email reports, or mobile apps for frontline staff. The goal of visualization isn't just to present a pretty graph; it's to highlight deviations, trends, and areas requiring attention.

A well-designed report answers a specific question: "How efficiently is the equipment operating?", "Where are we losing energy?", "Which assets are at risk of failure?" It's not overloaded with dozens of graphs, but highlights the main points and allows for drilling down into the details as needed. If this logic is broken, users quickly stop looking at the dashboards. As a result, even potentially useful data goes unused.

A common mistake is creating visualizations for the sake of visualization. Engineers and developers create complex diagrams with numerous indicators, but fail to ensure they are user-friendly for the people who will be making decisions in real time. The true value of visualization is revealed when a dispatcher or manager can see in a couple of minutes whether everything is in order and, if necessary, quickly identify the cause of a problem.

Why data becomes "dead" and what leads to it

In the context of the Industrial IoT, the term "dead data" typically describes a situation where data is formally collected and stored somewhere, but in reality, no one analyzes it or uses it for management. The reasons are almost always rooted in early design errors.

The first reason is the lack of clear business goals. If a project starts with the slogan "let's implement IoT, it's trendy," without a list of specific tasks, after a while the team realizes it has collected massive amounts of telemetry but doesn't know what to do with it. The second reason is underestimating the quality of measurements and metadata. When parameters are measured irregularly, without reliable timestamps, without units, and without links to physical objects, analytics turns into archeology.

The third reason is poor integration with existing systems. If IoT data lives in its own "sandbox" and isn't connected to ERP, MES, or maintenance and repair systems, it can't directly influence planning, procurement, or HR management. The fourth reason is the lack of a "data owner"—a specific department or role responsible for the quality, structure, and development of data, as well as for ensuring that decisions are made based on it.

As a result, a system into which much effort and money has been invested turns into a collection of logs, accessed only in emergency situations after the fact. This is a classic example of dead data: it exists, but creates no value.

How to design a data path that works from the start

To ensure data has a full lifespan, from sensor to report, it's important to define business cases at the outset of the project: what decisions will be made, what metrics are needed, how often, who the end users are, and what their roles and tasks are. Based on this, measurement points, sensor types, controller and edge device architecture, transmission protocols, storage types, and analytics approaches are selected.

It's also necessary to agree on unified reference data and classifiers, tag-naming standards, rules for handling timestamps, and data-quality requirements. It's worth defining system maintenance processes in advance: who monitors sensor calibration, who keeps the data warehouse schemas up to date, and who initiates changes to reports and dashboards.

When the data path is treated as a single living system rather than a collection of disparate instrumentation, SCADA, and reporting projects, industrial IoT solutions cease to be a fad and begin to deliver real benefits: from reduced downtime and energy losses to more accurate production and equipment-maintenance planning. Then every byte generated in a controller has the potential to become part of a meaningful management decision, rather than a meaningless number in a forgotten database.