The standard I-IoT flow

In Chapter 2, Understanding the Industrial Process and Devices, we looked at how industrial data is generated, gathered, and transferred to the cloud by familiarizing ourselves with computer-integrated manufacturing (CIM). We also looked at industrial equipment, such as DCS, PLCs, and SCADA, and network protocols, such as Fieldbus, ControlBus, and OPC. This gave us a picture of the whole path of industrial signals, from when they are generated by sensors in a factory to when they are processed in the cloud. In Chapter 3, Industrial Data Flow and Devices, and Chapter 4, Implementing the Industrial IoT Data Flow, we talked in detail about data acquisition, data sources, related protocols, and the edge devices used to push data into the cloud. This showed us how data acquisition is the foundation of the I-IoT and how we need to comply with several security and safety standards. Once acquired, data must be analyzed and processed. In a typical I-IoT architecture, this happens through parallel mechanisms, such as central data storage, data processing, and analytics execution.

The following diagram shows the flow of I-IoT data on the cloud side, summing up what was already presented in Chapter 2, Understanding the Industrial Process and Devices. During data transfer, the data coming from the sensors is gathered from the data sources (such as PLCs, DCSs, SCADA, or a historian) and stored temporarily to avoid data loss due to a connectivity issue. The data can be time-series data or events, semi-structured data such as logs or binary files, or completely unstructured data such as images. Time-series data and events are collected frequently (from every second to every few minutes). Files are normally collected when they are triggered locally, which can happen through machine shutdown or inspection. A shadow copy of the data can then be sent over the LAN to the department datacenter. The data is then sent by the edge over the WAN and stored in a centralized data lake and a time-series database (TSDB). The data lake can be cloud-based, an on-premises datacenter, or a third-party storage system.
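The store-and-forward step described above can be sketched in a few lines. The following is a minimal illustration, not a production edge agent: the class name, the `send_fn` callback, and the queue size are all assumptions made for the example. Readings are buffered locally and only removed once the upstream push succeeds, so a connectivity outage does not lose data.

```python
import collections
import time

class StoreAndForwardBuffer:
    """Hypothetical edge-side buffer: hold readings locally during a
    connectivity outage and flush them once the link is back."""

    def __init__(self, send_fn, max_size=10_000):
        self.send_fn = send_fn            # callable that pushes one record upstream
        self.queue = collections.deque(maxlen=max_size)

    def publish(self, tag, value, ts=None):
        """Queue one sensor reading, then try to drain the queue."""
        record = {"tag": tag, "value": value, "ts": ts or time.time()}
        self.queue.append(record)
        self.flush()

    def flush(self):
        """Send buffered records oldest-first; stop at the first failure
        so the failed record is retried on the next publish."""
        while self.queue:
            if not self.send_fn(self.queue[0]):
                break
            self.queue.popleft()
```

In practice, the edge device would persist this queue to disk (or use an MQTT client with a persistent session) rather than keep it in memory, but the retry-until-acknowledged pattern is the same.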

Data can be immediately processed using data-stream analytics (this is called the hot path), with a simple rule-engine platform based on thresholds or smart thresholds:

End-to-end data flow on the I-IoT
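A threshold-based rule engine of the kind just mentioned can be sketched very simply. The rules, tag names, and threshold values below are illustrative assumptions, not values from any real standard or platform: each rule pairs a tag with a predicate, and every incoming reading is checked against all matching rules as it streams in.

```python
# Hypothetical hot-path rule engine: each rule is a
# (tag, predicate, alarm message) triple evaluated against
# every incoming reading as it arrives on the stream.

RULES = [
    ("vibration", lambda v: v > 7.1, "vibration above trip level"),
    ("temperature", lambda v: v > 95.0, "bearing temperature high"),
]

def evaluate(reading):
    """Return the list of alarms raised by one streamed reading."""
    alarms = []
    for tag, predicate, message in RULES:
        if reading["tag"] == tag and predicate(reading["value"]):
            alarms.append({"tag": tag, "message": message})
    return alarms
```

A smart threshold would replace the fixed predicate with one computed from recent history (for instance, a rolling mean plus a few standard deviations), but the evaluation loop stays the same.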

Advanced analytics, including digital twins, machine learning, deep learning, and data-driven or physics-based analytics, can process a large amount of data (from ten minutes' to one month's worth) coming from different sensors. This data is stored in an intermediate repository (called the cold path, as can be seen in the preceding diagram). These analytics are triggered by a scheduler or by the availability of the data, and need a lot of computational resources and dedicated hardware, such as CPUs, GPUs, or TPUs.

Azure refers to cold paths and hot paths. In a hot path, data is processed immediately. In a cold path, data is stored and processed later. Normally, we use data streams for hot paths and micro-batch analytics for cold paths.
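The cold-path micro-batch idea can be illustrated with a toy aggregation: readings accumulated in storage are bucketed into fixed time windows and summarized in one scheduled pass, rather than evaluated one record at a time. The function name and window size are assumptions for the example.

```python
from collections import defaultdict

def micro_batch(readings, window_s=600):
    """Cold-path sketch: bucket (timestamp, value) readings into
    fixed windows and compute a per-window mean, the way a
    scheduled batch job over stored data might."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[int(ts // window_s)].append(value)
    # One aggregate per window, computed after the fact
    return {w: sum(v) / len(v) for w, v in sorted(buckets.items())}
```

The contrast with the hot path is in when the computation runs: the stream rule fires per reading, while this runs over a whole stored batch when the scheduler (or data availability) triggers it.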

These analytics often need additional information, such as the model of the machine being monitored and its operational attributes; this information is contained in the asset registry. The asset registry has information about the type of asset you are monitoring, including its name, serial number, symbolic name, location, operating capabilities, the history of the parts it is made up of, and the role it plays in the production process. In the asset registry, we can also store the list of measures for each asset, along with each measure's logical name, unit of measure, and boundary range. In the industrial sector, this static information is important so that the correct analytic model can be applied.
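The shape of such a registry entry can be sketched as a simple data structure. The field names, asset ID, and values below are invented for illustration; a real registry would be a database or a dedicated asset-model service, not in-memory objects.

```python
from dataclasses import dataclass, field

@dataclass
class Measure:
    logical_name: str   # e.g. the instrument tag
    unit: str           # unit of measure
    low: float          # boundary range, lower bound
    high: float         # boundary range, upper bound

@dataclass
class Asset:
    name: str
    serial_number: str
    symbolic_name: str  # position in the plant hierarchy
    location: str
    measures: dict = field(default_factory=dict)

# Hypothetical registry entry for a single monitored pump
registry = {
    "PUMP-01": Asset(
        name="Feed pump",
        serial_number="SN-1234",
        symbolic_name="plant1.line2.pump01",
        location="Plant 1",
        measures={"flow": Measure("FT-101", "m3/h", 0.0, 250.0)},
    )
}
```

An analytic can then look up the boundary range and unit for each incoming measure by asset ID before deciding which model to apply.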

The output of the analytics, either stream-based or advanced, is typically calculated data. This might be the expected performance; the operability drift; an alarm, such as an anomalous vibration; a recommendation, such as clean, replace filter, or close valve; or a report. These insights can be analyzed by an operator to provide information to the operations manager, who might then decide to open a work order to improve the process. Although maintenance operations such as these are not part of the I-IoT process itself, the analytics should skip the data produced while they are underway, because the signals do not reflect normal operation and nothing useful can be concluded from them. For example, if you decide to turn off a machine whose status is being monitored, this planned shutdown should not be treated as an anomaly; if the analytics are unaware of it, they will raise a false alert that something is wrong.
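Excluding data gathered during planned operations can be done with a simple filter over known maintenance windows. This is a minimal sketch under the assumption that planned shutdowns are recorded as (start, end) timestamp intervals; the function and parameter names are invented for the example.

```python
def outside_maintenance(readings, windows):
    """Drop (timestamp, value) readings that fall inside any planned
    maintenance window, so a planned shutdown is not fed to the
    analytics and flagged as an anomaly."""
    return [
        (ts, value)
        for ts, value in readings
        if not any(start <= ts < end for start, end in windows)
    ]
```

In a real platform, the maintenance schedule would come from the work-order or asset-management system rather than being passed in as a list.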

Occasionally, the I-IoT can use big data analytics to process a huge amount of data, such as images or raw data. These technologies can employ data lakes and big data platforms.

A data lake is a repository of data stored in its natural format, usually object blobs, images, or log files. The most common data-lake implementations are AWS S3 and Hadoop HDFS.