Making predictions in a dynamic environment does not always produce the desired outcomes, particularly with complex and unstructured data.
There are several reasons for this. For example, how do you infer a realistic outcome from only a small amount of data, or deal with unstructured and high-dimensional data, which is often found too tedious? Moreover, revising a model with strategies efficient enough to keep control in realistic environments is also costly.
Furthermore, the dimensionality of the input dataset is sometimes high; consequently, the data might be very dense or very sparse. In that case, it is challenging to deal with very large settings and to apply static models in emerging application areas such as robotics, image processing, deep learning, computer vision, or web mining. On the other hand, ensemble methods, which select and combine models from a pool of existing models, are becoming more popular for making ML models more adaptable. Hierarchical learning based on a dynamic environment is shown in Figure 10:
In this setting, ML techniques such as neural networks and statistical learning are also becoming popular because of their success in numerous industrial and research applications, such as biological systems. However, classical learning algorithms such as neural networks, decision trees, or vector quantizers are often restricted to purely feedforward settings and simple vectorial data, rather than dynamic environments, even though vectorized features often provide better predictions because of their rich structure. In summary, there are three challenges in developing ML applications in a dynamic environment:
With only limited reinforcement signals, ill-posed domains, or partially underspecified settings, how do we develop controlled and effective strategies in dynamic environments? Considering these issues and the promising advances in this research area, in this section we will provide some insights into online learning techniques through statistical and adversarial models. Since learning in a dynamic environment such as streaming will be discussed in Chapter 9, Advanced Machine Learning with Streaming and Graph Data, we will not discuss streaming-based learning in this chapter.
Batch learning techniques generate the best predictor by learning on the entire training dataset at once, and are often called static learning. A static learning algorithm takes a batch of training data to train a model, and a prediction is then made using the test sample and the relationship that was found. An online learning algorithm, in contrast, starts from an initial guess of the model, then picks up one observation at a time from the training population and recalibrates the weights on each input parameter. In the online setting, data usually becomes available in sequential order, sometimes in small batches, and the sequential data is used to update the best predictor of the outcome at each step, as outlined in Figure 11. There are three use cases of online learning:
Online learning, therefore, requires out-of-core algorithms, that is, algorithms that can work within tight memory and bandwidth constraints without needing the entire dataset at once. Two general modeling strategies exist for online learning models:
Although online and incremental learning techniques are similar, they also differ slightly. Online learning generally means a single pass over the data (epoch = 1), or at most a configurable number of epochs, whereas incremental learning means that you already have a model; no matter how it was built, that model can be mutated by new examples. In practice, a combination of online and incremental learning is often what is required.
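The batch versus online contrast, and the incremental continuation of an existing model, can be sketched in a few lines of NumPy. The data here is synthetic and purely illustrative (a noisy line, y = 2x + 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): y = 2*x + 1 plus noise
X = rng.uniform(-1, 1, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.1, size=200)
Xb = np.hstack([X, np.ones((len(X), 1))])   # add a bias column

# Batch (static) learning: fit once on the entire training set
w_batch, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Online learning: start from an initial guess and recalibrate
# the weights after each single observation (one pass, epoch = 1)
w_online = np.zeros(2)
lr = 0.1
for x_i, y_i in zip(Xb, y):
    w_online -= lr * (x_i @ w_online - y_i) * x_i

# Incremental learning: the model already exists (w_online here,
# however it was built) and is simply mutated by new examples
X_new = rng.uniform(-1, 1, size=(50, 1))
y_new = 2.0 * X_new[:, 0] + 1.0 + rng.normal(0, 0.1, size=50)
for x_i, y_i in zip(np.hstack([X_new, np.ones((50, 1))]), y_new):
    w_online -= lr * (x_i @ w_online - y_i) * x_i

print(w_batch, w_online)   # both approach [2.0, 1.0]
```

Note that the batch fit needed all 200 rows in memory at once, whereas the online learner only ever touched one observation at a time.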
Data is being generated at an unprecedented rate everywhere, every day. This imposes an enormous challenge on building ML tools that can handle data with high volume, velocity, and veracity; in short, data generated online is also big data. Therefore, we need to understand online learning algorithms, which are designed to handle data of such high volume and velocity on machines with limited resources.
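As a minimal sketch of this idea (assuming scikit-learn is available; the stream below is synthetic and purely illustrative), estimators that expose `partial_fit` can consume an arbitrarily long stream batch by batch while memory use stays constant:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)

def next_batch(n=100):
    """Simulate one arriving mini-batch of a (synthetic) stream."""
    labels = rng.integers(0, 2, size=n)
    feats = rng.normal(loc=labels[:, None] * 2.0, scale=0.7, size=(n, 2))
    return feats, labels

model = SGDClassifier(random_state=0)
X0, y0 = next_batch()
model.partial_fit(X0, y0, classes=np.array([0, 1]))  # first call needs classes

# Consume the stream: each batch is seen once and then discarded,
# so memory stays constant no matter how long the stream runs
for _ in range(20):
    Xt, yt = next_batch()
    model.partial_fit(Xt, yt)

X_test, y_test = next_batch(500)
acc = model.score(X_test, y_test)
print(acc)
```

The `next_batch` helper is hypothetical stand-in code for whatever source actually feeds the stream, such as a message queue or a file read in chunks.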
As already outlined, statistical learning models such as stochastic gradient descent (SGD) and artificial neural networks or the perceptron assume that the data samples are independent of each other. In addition, the samples are assumed to be identically distributed random variables; in other words, they do not adapt over time. Therefore, an ML algorithm has only limited access to the data.
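To make the i.i.d. assumption concrete, here is a minimal SGD sketch on synthetic regression data: each update uses one sample drawn uniformly at random, which is only justified when every sample is an independent draw from one fixed distribution.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regression data (illustrative only). The classical SGD
# assumption: every row is an independent draw from one fixed,
# unchanging distribution, so the rows are i.i.d. random variables.
X = rng.normal(size=(500, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + rng.normal(0, 0.05, size=500)

w = np.zeros(3)
lr = 0.05
for t in range(5000):
    i = rng.integers(len(X))          # sample uniformly at random:
    grad = (X[i] @ w - y[i]) * X[i]   # an unbiased gradient only
    w -= lr * grad                    # because samples are i.i.d.
print(w)   # approaches true_w = [1.0, -2.0, 0.5]
```

If the distribution drifted over time, these uniform random draws would no longer yield unbiased gradients, which is exactly why such static models struggle in dynamic environments.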
In the field of statistical learning models, two interpretations are considered significant:
Classical machine learning, which is what is usually taught in classes, emphasizes a static environment where essentially unchanging data is used to make predictions. It is, therefore, formally easier than statistical or causal inference in a dynamic environment. On the other hand, framing and solving the learning problem as a game between two players, for example, learner versus data generator, in a dynamic environment is an example of an adversarial model. This kind of modeling and predictive analytics is extremely difficult, since the world does not know that you are trying to model it formally.
Furthermore, your model does not have any positive or negative effect on the world. Therefore, the ultimate goal of this kind of model is to minimize the losses arising from the moves played by the other player. The opponent can adapt the data it generates based on the output of the learning algorithm at run time, that is, dynamically. Since no distributional assumptions are made about the data, the goal becomes performing well over the entire sequence, as if it could have been viewed ahead of time; in other words, the regret with respect to the best hypothesis in hindsight is to be minimized. According to Cathy O'Neil (Weapons of Math Destruction, Crown, September 6, 2016), adversarial machine learning can be defined as follows:
Adversarial machine learning is the formal name for studying what happens when conceding even a slightly more realistic alternative to assumptions of these types (harmlessly called relaxing assumptions).
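A standard concrete instance of this adversarial, regret-minimizing setting is the Hedge (multiplicative weights) algorithm for learning with expert advice. The loss sequence below is randomly generated purely for illustration, but the bound asserted at the end, ln(K)/eta + eta*T/8, holds for any adversarially chosen loss sequence with losses in [0, 1]:

```python
import numpy as np

def hedge(losses, eta=0.5):
    """Hedge / multiplicative weights.

    losses: (T, K) array, loss of each of K experts at each round.
    Returns the learner's cumulative expected loss and its regret
    against the single best expert in hindsight.
    """
    T, K = losses.shape
    w = np.ones(K)                       # one weight per expert
    total = 0.0
    for t in range(T):
        p = w / w.sum()                  # learner's distribution
        total += p @ losses[t]           # expected loss this round
        w *= np.exp(-eta * losses[t])    # downweight bad experts
    best = losses.sum(axis=0).min()      # best expert in hindsight
    return total, total - best

# Illustrative loss sequence: 1000 rounds, 5 experts, 0/1 losses
rng = np.random.default_rng(3)
losses = rng.integers(0, 2, size=(1000, 5)).astype(float)
cum, regret = hedge(losses)
print(regret)   # stays below ln(K)/eta + eta*T/8 for any sequence
```

Because the guarantee is worst-case over all sequences, no i.i.d. assumption is needed, which is precisely what distinguishes this adversarial model from the statistical one.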