Bayesian models

Bayesian models are based on Bayes' theorem, first published in 1763. Bayes' theorem describes the probability that an event will occur based on prior knowledge of conditions related to that event. For example, what is the probability that a machine will fail, given the temperature of the device?

Bayes' theorem is expressed as:

P(A|B) = P(B|A) P(A) / P(B)

A and B are the events of interest. P(A|B) asks: what is the probability that event A will occur, given that event B has occurred? The only formal requirement is that P(B) is greater than zero; A and B do not need to be independent of each other.

The denominator P(B) can be re-written using the theorem of total probability, and the result extends to any number of mutually exclusive events A_1, ..., A_n that together cover all possibilities. P(B|A_i) is the probability that event B will occur, given that event A_i has occurred. This is the formal definition of Bayes' theorem:

P(A_i|B) = P(B|A_i) P(A_i) / [P(B|A_1) P(A_1) + ... + P(B|A_n) P(A_n)]

In our case, we are dealing with a single event and its complement (pass/fail), so the equation can be re-written as:

P(A|B) = P(B|A) P(A) / [P(B|A) P(A) + P(B|not A) P(not A)]

An example follows. Suppose we have two machines producing identical parts for a widget, and a part fails if its temperature exceeds a certain value. Parts from machine A fail 2% of the time, and parts from machine B fail 4% of the time. Machine A produces 70% of the parts, and machine B produces the remaining 30%. If I pick up a random part and it fails, what is the probability that it was produced by machine A, and what is the probability that it was produced by machine B?

In this case, A is the event that the part was produced by machine A, B is the event that it was produced by machine B, and F is the event that the chosen part fails. We know:

  • P(A) = 0.7
  • P(B) = 0.3
  • P(F|A) = 0.02
  • P(F|B) = 0.04

Therefore, the probability that a failed part came from machine A is:

P(A|F) = P(F|A) P(A) / [P(F|A) P(A) + P(F|B) P(B)]

Replacing the values:

P(A|F) = (0.02 x 0.7) / (0.02 x 0.7 + 0.04 x 0.3) = 0.014 / 0.026 = 0.538

Therefore, P(A|F) is about 54%, and P(B|F) is the complement, 1 - 0.538 = 0.462, or about 46%.
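
The same calculation can be expressed in a few lines of code. The following is a minimal Python sketch; the function name and structure are illustrative and not taken from any particular library:

def posterior(prior_a, prior_b, fail_given_a, fail_given_b):
    """Bayes' theorem for two mutually exclusive sources, A and B."""
    # Total probability of picking a failed part, regardless of source
    p_fail = fail_given_a * prior_a + fail_given_b * prior_b
    p_a_given_fail = (fail_given_a * prior_a) / p_fail
    p_b_given_fail = (fail_given_b * prior_b) / p_fail
    return p_a_given_fail, p_b_given_fail

# Values from the widget example above
p_a, p_b = posterior(prior_a=0.7, prior_b=0.3,
                     fail_given_a=0.02, fail_given_b=0.04)
print(f"P(A|F) = {p_a:.3f}, P(B|F) = {p_b:.3f}")  # ~0.538 and ~0.462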

A Bayesian network is an extension of Bayes' theorem in the form of a graphical probability model, specifically a directed acyclic graph. Notice that the graph flows one way and there are no loops back to previous states; this is a requirement of a Bayesian network:

Bayesian network model

Here, the various probabilities of each state come from expert knowledge, historical data, logs, trends, or a combination of these; assigning them is the training process for a Bayesian network. Such a model can be applied in an IoT environment: as sensor data streams in, the model can predict machine failures. Additionally, the model can be used to make inferences. For example, if the sensors are reading an overheating condition, one can infer the probability that it is related to the speed of the machine or to an obstruction, as sketched below.
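
To make that inference concrete, here is a minimal, hand-rolled Python sketch. The network structure (Speed and Obstruction as parents of Overheat), the node names, and all probability values are invented for illustration; a real deployment would use probabilities learned from data and, typically, a dedicated inference library:

# Tiny Bayesian network: Speed and Obstruction are parent nodes of Overheat.
# All probabilities below are made-up values for illustration only.
P_SPEED_HIGH = 0.3        # P(speed = high)
P_OBSTRUCTION = 0.1       # P(obstruction = yes)

# Conditional probability table: P(overheat = yes | speed, obstruction)
P_OVERHEAT = {
    (True,  True):  0.95,
    (True,  False): 0.40,
    (False, True):  0.70,
    (False, False): 0.02,
}

def p_obstruction_given_overheat():
    """Infer P(obstruction | overheat) by enumerating the joint distribution."""
    num = 0.0    # P(obstruction = yes, overheat = yes)
    denom = 0.0  # P(overheat = yes)
    for speed_high in (True, False):
        p_speed = P_SPEED_HIGH if speed_high else 1 - P_SPEED_HIGH
        for obstruction in (True, False):
            p_obs = P_OBSTRUCTION if obstruction else 1 - P_OBSTRUCTION
            joint = p_speed * p_obs * P_OVERHEAT[(speed_high, obstruction)]
            denom += joint
            if obstruction:
                num += joint
    return num / denom

print(f"P(obstruction | overheating) = {p_obstruction_given_overheat():.2f}")

The same enumeration generalizes to larger networks, although exact inference becomes expensive as the graph grows, which is why practical libraries implement more efficient algorithms such as variable elimination.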

There are variants of Bayesian networks that are beyond the scope of this book, but that have benefits for certain types of data and problem sets:

  • Naive Bayes
  • Gaussian Naive Bayes
  • Bayesian belief networks

A Bayesian network is a good fit for IoT environments that can't be completely observed. Additionally, Bayesian networks have an advantage in situations where the data is unreliable: poor sample data, noisy data, and missing data have less of an effect on Bayesian networks than on other forms of predictive analytics. The caveat is that the number of samples needs to be very large. Bayesian methods are also resistant to overfitting, a problem we will discuss later with neural networks. Additionally, Bayesian models fit well with streaming data, which is a typical use case in IoT; a sketch of this sequential style of updating follows this paragraph. Bayesian networks have been deployed to find aberrations in signals and time-correlated series from sensors, and also to find and filter malicious packets in networking.
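
The streaming fit comes from the fact that yesterday's posterior becomes today's prior. The following Python sketch updates the belief that a machine is failing each time a new temperature reading arrives; the prior, likelihoods, and threshold are invented values, and the readings are assumed to be conditionally independent given the machine's state:

# Sequential Bayesian updating on a stream of temperature readings.
P_FAIL_PRIOR = 0.05          # initial belief that the machine is failing
P_HOT_GIVEN_FAIL = 0.80      # P(reading above threshold | failing)
P_HOT_GIVEN_OK = 0.10        # P(reading above threshold | healthy)
THRESHOLD_C = 85.0           # overheating threshold in degrees Celsius

def update_belief(p_fail, reading_c):
    """Fold one sensor reading into the current belief using Bayes' theorem."""
    hot = reading_c > THRESHOLD_C
    likelihood_fail = P_HOT_GIVEN_FAIL if hot else 1 - P_HOT_GIVEN_FAIL
    likelihood_ok = P_HOT_GIVEN_OK if hot else 1 - P_HOT_GIVEN_OK
    evidence = likelihood_fail * p_fail + likelihood_ok * (1 - p_fail)
    return likelihood_fail * p_fail / evidence   # posterior becomes the next prior

belief = P_FAIL_PRIOR
for temp in [82.0, 88.5, 91.2, 87.0]:            # simulated stream of readings
    belief = update_belief(belief, temp)
    print(f"reading {temp:5.1f} C -> P(failing) = {belief:.3f}")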
