Machine learning in IoT

Machine learning is not a new development in computer science. On the contrary, mathematical models for data fitting and probability date back to the early 1800s, with Bayes' theorem and the least squares method of fitting data. Both are still widely used in machine learning models today, and we will briefly explore them in this chapter.
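For reference, and using generic notation of my own rather than anything from the original text, the two results can be stated compactly: Bayes' theorem updates the probability of a hypothesis A given evidence B, and least squares chooses model parameters that minimize the sum of squared residuals over the data points.

```latex
% Bayes' theorem
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Ordinary least squares objective for data points (x_i, y_i) and model f(x; \beta)
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \bigl( y_i - f(x_i; \beta) \bigr)^2
```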

It wasn't until the early 1950s, when Marvin Minsky (MIT) built one of the first neural network learning machines, that computing machines and learning were unified; Frank Rosenblatt's perceptron followed later that decade. Minsky, with Seymour Papert, later published Perceptrons (1969), which was widely interpreted as a critique of the limitations of neural networks. Certainly, during that period, computational horsepower was at a premium, and the mathematics were beyond the reasonable resources of IBM S/360 and CDC computers. As we will see, the 1960s introduced much of the mathematics and foundations of artificial intelligence in areas such as neural nets, support vector machines, fuzzy logic, and so on.

Evolutionary computation, such as genetic algorithms and swarm intelligence, became a research focus in the late 1960s and 1970s, notably with Ingo Rechenberg's Evolutionsstrategie (1973). It gained traction in solving complex engineering problems, and genetic algorithms are still used today in mechanical engineering and even automatic software design.

The mid-1960s also introduced the concept of hidden Markov models as a form of probabilistic AI, similar to Bayesian models. They have since been applied to research in gesture recognition and bioinformatics.

Artificial intelligence research then lulled, with government funding drying up, until the 1980s and the advent of logic systems. This started the field known as logic-based AI, supported by programming languages such as Prolog and LISP, which allowed programmers to easily describe symbolic expressions. Researchers found limitations with this approach to AI: principally, logic-based systems did not reason the way a human does. Attempts using anti-logic, or scruffy, models to describe objects did not work well either; essentially, one cannot describe an object precisely using loosely coupled concepts. Later in the 1980s, expert systems took root. Expert systems are another form of logic-based system for a well-defined problem, trained by experts in a particular domain. One can think of them as a rule-based engine for a control system. Expert systems proved successful in corporate and business settings, became the first commercially available AI systems sold, and spawned new industries. These types of AI grew, and IBM used the concept in building Deep Blue, which defeated chess grandmaster Garry Kasparov in 1997.

Fuzzy logic first appeared in research by Lotfi A. Zadeh at UC Berkeley in 1965, but it wasn't until 1985 that researchers at Hitachi demonstrated how fuzzy logic could be applied successfully to control systems. That demonstration sparked significant interest among Japanese automotive and electronics firms, which adopted fuzzy systems in actual products. Fuzzy logic has been used successfully in control systems ever since, and we will discuss it formally later in this chapter.

While expert systems and fuzzy logic seemed to be the mainstay of AI, there was a growing and noticeable gap between what they could do and what they would never be able to do. Researchers in the early 1990s saw that expert systems, or logic-based systems in general, could never emulate the mind. The 1990s brought the advent of statistical AI in the form of hidden Markov models and Bayesian networks. Essentially, computer science adopted models commonly used in economics, trade, and operations research to make decisions.

Support vector machines (SVMs) were first proposed by Vladimir N. Vapnik and Alexey Chervonenkis in 1963, but only became popular after the AI winter of the 1970s and early 1980s. SVMs became a foundation for linear and nonlinear classification by using a novel technique to find the best separating hyperplanes for categorizing data sets. The technique became popular with handwriting analysis, where SVMs were soon being compared against neural networks on the same problems.
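As a brief, hedged illustration of the hyperplane idea (this sketch is mine, not taken from the original text, and assumes NumPy and scikit-learn are installed), a linear SVM can be fit to a toy two-class data set in a few lines:

```python
# Minimal sketch: fitting a linear SVM to a toy two-class data set.
import numpy as np
from sklearn.svm import SVC

# Two small clusters of 2-D points, one per class
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
class_b = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(20, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 20 + [1] * 20)

# A linear kernel searches for the maximum-margin separating hyperplane
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The hyperplane is w . x + b = 0; classify a new point against it
print("w =", clf.coef_, "b =", clf.intercept_)
print("prediction for [2.5, 2.5]:", clf.predict([[2.5, 2.5]]))
```

Swapping the kernel (for example, kernel="rbf") applies the same machinery to nonlinear decision boundaries.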

Recurrent neural networks (RNNs) also became a topic of interest in the 1990s. This type of network is distinct from feed-forward deep learning networks such as convolutional neural networks because it maintains state and can therefore be applied to problems involving the notion of time, such as audio and speech recognition. RNNs have a direct bearing on IoT predictive models today, which we will discuss later in this chapter.
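To make the idea of maintaining state concrete, here is a minimal, illustrative vanilla RNN step in NumPy (a sketch with randomly initialized weights, not a trained speech model): the hidden state h is the memory carried from one time step to the next.

```python
# Minimal sketch of a vanilla RNN cell in NumPy (illustrative only).
import numpy as np

rng = np.random.default_rng(42)
input_size, hidden_size = 8, 16   # e.g. 8 audio features per time step

# Randomly initialized weights, for illustration only
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One time step: the new hidden state depends on the current input
    and on the previous hidden state -- this is the network's 'memory'."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Run a short sequence (e.g. 10 time steps of audio or sensor features)
h = np.zeros(hidden_size)
sequence = rng.normal(size=(10, input_size))
for x_t in sequence:
    h = rnn_step(x_t, h)

print("final hidden state summarizes the whole sequence:", h.shape)
```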

A seminal event occurred in 2012 in the field of image recognition. In the ImageNet Large Scale Visual Recognition Challenge, teams around the globe competed on the task of recognizing the object in each image and then drawing a bounding box around it, across a data set of over one million images. A team from the University of Toronto won the competition with a deep convolutional neural network, AlexNet, developed by Alex Krizhevsky and his colleagues. Other neural networks had attempted this machine vision exercise in the past, but the Toronto approach identified images with more accuracy than any approach before it, with an error rate of 16.4%, and it used GPUs to greatly speed up training. Google later developed another neural network that brought the error rate down to 6.4%. All of these models were built around convolutional neural networks and had processing requirements that were prohibitive until the advent of GPUs.

Today, we find AI everywhere, from self-driving cars, to speech recognition in Siri, to tools emulating humans in online customer service, to medical imaging, to retailers using machine learning models to identify consumers' interests in shopping and fashion as they move about a store:

The spectrum of artificial intelligence algorithms

What does this have to do with IoT at all? Well, IoT opens the spigot to a massive amount of constantly streaming data. The value of a system of sensors is not what one sensor measures, but what a collection of sensors measures and tells us about a much larger system. IoT, as mentioned earlier, will be the catalyst for a step-function increase in the amount of data collected. Some of that data will be structured: time-correlated series. Other data will be unstructured: cameras, synthetic sensors, audio, and analog signals. The customer wants to make useful business decisions based on that data. Consider, for example, a manufacturing plant that plans to optimize operational expenses, and potentially capital expenses, by adopting IoT and machine learning (at least, that's what it was sold on). When we think about a factory IoT use case, the manufacturer will have many interdependent systems: an assembly tool to produce a widget, a robot to cut parts out of metal or plastic, a machine to perform some type of injection molding, conveyor belts, lighting and heating systems, packaging machines, supply and inventory control systems, robots to move material around, and various levels of control systems. In fact, this company may have many such facilities spread across a campus or a wider geography. A factory like this has adopted all the traditional models of efficiency and read W. Edwards Deming's literature; however, the next industrial revolution will come in the form of IoT and machine intelligence.

Specialized individuals know what to do when an erratic event occurs. For example, a technician who has been operating one of the assembly machines for years knows when it needs service based on how it is behaving. It may start creaking in a certain way, or perhaps its pick-and-place mechanism has worn and it has dropped a few parts in the last couple of days. These simple behavioral effects are things that machine learning can see and predict, even before a human can. Sensors can surround such devices and monitor actions both perceived and inferred. An entire factory could be perceived in this way, to understand how it is performing at that very instant based on a collection of millions or billions of events from every machine and every worker in the system.

With that amount of data, only a machine learning appliance can sift through the noise and find the relevance. These are not human-manageable problems; they are, however, manageable problems for big data and machine learning.
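As a deliberately simple sketch of that kind of sifting (a basic statistical detector rather than a full machine learning pipeline, with hypothetical window sizes and thresholds of my own choosing), a rolling z-score can flag vibration readings that drift away from a machine's normal behavior:

```python
# Sketch: flag anomalous sensor readings with a rolling z-score.
from collections import deque
import math

class RollingAnomalyDetector:
    def __init__(self, window=500, threshold=4.0):
        self.readings = deque(maxlen=window)  # recent "normal" history
        self.threshold = threshold            # how many std-devs is suspicious

    def update(self, value):
        """Return True if the new reading looks anomalous."""
        if len(self.readings) >= 30:          # need some history first
            mean = sum(self.readings) / len(self.readings)
            var = sum((r - mean) ** 2 for r in self.readings) / len(self.readings)
            std = math.sqrt(var) or 1e-9      # avoid division by zero
            if abs(value - mean) / std > self.threshold:
                return True                   # anomaly: keep it out of the history
        self.readings.append(value)
        return False

# Example: feed in simulated vibration amplitudes from one machine
detector = RollingAnomalyDetector()
for i, amplitude in enumerate([0.5] * 1000 + [3.0]):  # sudden spike at the end
    if detector.update(amplitude):
        print(f"reading {i}: amplitude {amplitude} flagged as anomalous")
```

In practice, the models discussed later in this chapter replace this fixed threshold with learned representations of a machine's normal behavior.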
