Machine learning algorithms have their place in IoT. The typical case is a plethora of streaming data that must be distilled into meaningful conclusions. A small collection of sensors may need only a simple rule engine on the edge in a latency-sensitive application. Other systems, with less aggressive latency demands, may stream data to a cloud service and apply rules there. When large volumes of data, unstructured data, and real-time analytics come into play, we need to consider machine learning to solve some of the hardest problems.
In this section, we detail some tips and reminders for deploying machine learning analytics, and the use cases that may warrant such tools.
Training phase:
- For random forest, use bagging techniques to create ensembles.
- When using a random forest, use as many decision trees as your resource budget allows; a larger ensemble generally reduces variance.
- Watch for overfitting, which leads to inaccurate models in the field. Techniques such as regularization, and even injecting noise into the training data, help the model generalize.
- Don't train on the edge; training is compute- and data-intensive and belongs in the cloud or data center.
- Watch for vanishing and exploding gradients during gradient descent; RNNs are naturally susceptible to them.
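As a minimal sketch of the bagging idea behind random forests, the following uses plain NumPy on hypothetical 1-D data rather than an ML library: each weak learner (here, a simple threshold "stump") is trained on a bootstrap sample drawn with replacement, and predictions are combined by majority vote.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D dataset (hypothetical, for illustration): class 1 when x > 0.5
X = rng.uniform(0, 1, size=200)
y = (X > 0.5).astype(int)

def fit_stump(X, y):
    """Pick the threshold that best separates the two classes."""
    best_t, best_acc = 0.0, 0.0
    for t in np.linspace(0, 1, 51):
        acc = np.mean((X > t).astype(int) == y)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Bagging: train each stump on a bootstrap sample (drawn with replacement)
stumps = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))
    stumps.append(fit_stump(X[idx], y[idx]))

def predict(x):
    # Majority vote across the ensemble
    votes = sum(int(x > t) for t in stumps)
    return int(votes > len(stumps) / 2)

print(predict(0.9), predict(0.1))  # → 1 0
```

A real random forest additionally samples a random subset of features at each split, which this 1-D sketch cannot show.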
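The gradient problem in the last point can be seen directly: backpropagating through many recurrent steps multiplies the gradient by the same kind of factor at every step, so it shrinks (or, with large weights, grows) exponentially. A small NumPy illustration with a tanh recurrence and an assumed recurrent weight of 0.5:

```python
import numpy as np

w = 0.5        # assumed recurrent weight, applied at every time step
h = 0.0
states = []
for t in range(50):
    h = np.tanh(w * h + 1.0)   # simple tanh recurrence with constant input
    states.append(h)

# Backpropagate: each step multiplies the gradient by w * tanh'(pre-activation),
# and tanh'(x) = 1 - tanh(x)^2 = 1 - h_t^2
grad = 1.0
for h_t in reversed(states):
    grad *= w * (1.0 - h_t ** 2)

print(f"gradient after 50 steps: {grad:.3e}")  # vanishes toward zero
```

Each per-step factor here is well below 1, so after 50 steps the gradient is vanishingly small; with |w| large enough the same product explodes instead.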
Model in field:
- Update model with new data sets as they become available. Keep the training set current.
- Models running on the edge can be reinforced with larger and more comprehensive models in the cloud.
- Neural network execution can be optimized in the cloud and at the edge with minimal loss of accuracy by using techniques such as node pruning and reduced numerical precision.
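A minimal sketch of the two optimizations in the last point, applied to a hypothetical weight matrix: magnitude-based pruning (zeroing out the smallest weights) and precision reduction (linearly quantizing float32 weights to int8). Both leave the layer's output close to the full-precision result.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 1, size=(8, 8)).astype(np.float32)  # hypothetical layer weights
x = rng.normal(0, 1, size=8).astype(np.float32)

# Pruning: zero out weights whose magnitude falls below a threshold
threshold = np.quantile(np.abs(W), 0.5)          # drop roughly the smallest 50%
W_pruned = np.where(np.abs(W) < threshold, 0.0, W)

# Precision reduction: linear quantization of float32 weights to int8
scale = np.abs(W).max() / 127.0
W_int8 = np.round(W / scale).astype(np.int8)     # what the edge device stores
W_dequant = W_int8.astype(np.float32) * scale    # effective weights at inference

full = W @ x
print("pruned error:   ", np.abs(W_pruned @ x - full).max())
print("quantized error:", np.abs(W_dequant @ x - full).max())
```

In practice a pruned or quantized network is usually fine-tuned afterwards to recover the small accuracy loss; this sketch only shows the numerical effect of each transformation.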
| Model | Best application | Worst fit and side effects | Resource demands | Training |
| --- | --- | --- | --- | --- |
| Random forests (statistical models) | | | Low | |
| RNN (temporal and sequence-based neural networks) | | | | |
| CNN (deep learning) | | | | Supervised and unsupervised |
| Bayesian networks (probabilistic models) | | | Low | |