Solving Problems with Dynamic Programming

The purposes of this chapter are manifold. We will introduce many topics that are essential to understanding reinforcement learning problems, as well as the first algorithms used to solve them. Whereas in the previous chapters we talked about reinforcement learning (RL) from a broad, non-technical point of view, here we will formalize this understanding in order to develop the first algorithms that can solve a simple game. 

The RL problem can be formulated as a Markov decision process (MDP), a framework that provides a formalization of the key elements of RL, such as value functions and the expected reward. RL algorithms can then be built from these mathematical components; they differ from each other in how the components are combined and in the assumptions made during their design.
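To make the MDP framework concrete, the sketch below writes one down as plain Python data structures. The states, actions, transition probabilities, and rewards are invented purely for illustration; they do not correspond to any particular game from this book:

```python
# A minimal, hand-made MDP with two states and two actions.
# P[s][a] is a list of (probability, next_state, reward) triples,
# i.e. the transition model and reward function rolled into one table.
GAMMA = 0.9  # discount factor

STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]

P = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "move": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 2.0)],
        "move": [(1.0, "s0", 0.0)],
    },
}

# Sanity check: for every state-action pair, the transition
# probabilities over next states must sum to 1.
for s in STATES:
    for a in ACTIONS:
        assert abs(sum(p for p, _, _ in P[s][a]) - 1.0) < 1e-9
```

Representing the environment as an explicit table like this is only possible for small problems, but it is exactly the complete-information setting that the dynamic programming methods in this chapter assume.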

For this reason, as we'll see in this chapter, RL algorithms fall into three main categories. These categories can overlap, because some algorithms combine characteristics of more than one of them. Once these pivotal concepts have been explained, we'll present the first type of algorithm, called dynamic programming, which can solve problems when given complete information about the environment. 
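As a preview of what "complete information about the environment" buys us, here is a minimal sketch of value iteration, one of the dynamic programming methods covered later in this chapter. The tiny two-state MDP is invented for illustration; the algorithm itself is the standard Bellman optimality backup:

```python
# Value iteration: repeatedly apply the Bellman optimality backup
#   V(s) <- max_a  sum_{s'} P(s', r | s, a) * (r + gamma * V(s'))
# until the value function stops changing.
GAMMA = 0.9  # discount factor

# P[s][a] is a list of (probability, next_state, reward) triples
# describing a hand-made, purely illustrative MDP.
P = {
    "s0": {"stay": [(1.0, "s0", 0.0)],
           "move": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 2.0)],
           "move": [(1.0, "s0", 0.0)]},
}

def value_iteration(P, gamma=GAMMA, tol=1e-8):
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:  # converged: largest update is negligible
            return V

V = value_iteration(P)
```

Because the transition model P is known in full, no interaction with the environment is needed: the optimal values (and, from them, a greedy policy) are computed by sweeping over the table. This is precisely what distinguishes dynamic programming from the sampling-based methods of later chapters.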

The following topics will be covered in this chapter:

  • MDP
  • Categorizing RL algorithms
  • Dynamic programming
