Categorizing RL algorithms

Before diving into the first RL algorithm that solves the Bellman optimality equation, we want to give a broad but detailed overview of RL algorithms, because the distinctions between them can be quite confusing. Many parts are involved in the design of an algorithm, and many characteristics have to be considered before deciding which algorithm best fits the user's needs. This overview presents the big picture of RL so that in the next chapters, where we'll give a comprehensive theoretical and practical view of these algorithms, you will already know their general objective and have a clear idea of where they sit on the map of RL algorithms.

The first distinction is between model-based and model-free algorithms. As the names suggest, the former require a model of the environment, while the latter do not. A model of the environment is highly valuable because it carries information that can be used to find the desired policies; however, in many cases, a model is very hard or practically impossible to obtain. For example, it is quite easy to model the game of tic-tac-toe, whereas it is difficult to model the waves of the sea. Model-free algorithms avoid this problem: they learn from interaction with the environment, without assuming any model of it. The categories of RL algorithms are shown in figure 3.2.

The figure shows the distinction between model-based and model-free algorithms, along with two widely known model-free approaches, namely policy gradient and value-based. As we'll see in later chapters, the two approaches can also be combined:

Figure 3.2. Categorization of RL algorithms

The first distinction is between model-free and model-based algorithms. Model-free RL algorithms can be further divided into policy gradient and value-based algorithms. Hybrids are methods that combine important characteristics of both.
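
To make the model-based versus model-free distinction concrete, the following is a minimal sketch on a made-up three-state chain. It is not taken from the book. The environment, the `step` function, and all hyperparameters are assumptions chosen for illustration. The model-based routine is allowed to query the dynamics for every state-action pair, while the model-free routine (tabular Q-learning, one example of a value-based method) only sees the transitions it happens to sample.

```python
import random

# Hypothetical 3-state chain: states 0, 1, 2; state 2 is terminal.
# Actions: 0 = left, 1 = right. Entering state 2 yields reward 1.
N_STATES, N_ACTIONS, TERMINAL, GAMMA = 3, 2, 2, 0.9


def step(state, action):
    """Deterministic dynamics of the toy environment: (next_state, reward)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def model_based_q(sweeps=50):
    """Model-based: uses step() as a known model to back up every (s, a)."""
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(TERMINAL):          # terminal state keeps value 0
            for a in range(N_ACTIONS):
                s2, r = step(s, a)         # query the model, no interaction
                q[s][a] = r + GAMMA * max(q[s2])
    return q


def model_free_q(episodes=2000, alpha=0.5, epsilon=0.3, seed=0):
    """Model-free: tabular Q-learning from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda i: q[s][i])
            s2, r = step(s, a)             # observe an outcome, not the rule
            q[s][a] += alpha * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
    return q


if __name__ == "__main__":
    print("model-based:", model_based_q())
    print("model-free: ", model_free_q())
```

On this toy problem both routines should end up preferring the "right" action in every non-terminal state. The difference lies in what each one needs: the first cannot run without access to the dynamics, whereas the second only requires the ability to act and observe the result.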
