Model

The model is an optional component of the agent, meaning that it is not required in order to find a policy for the environment. The model details how the environment behaves, predicting the next state and the reward, given a state and an action. If the model is known, planning algorithms can be used to interact with the model and recommend future actions. For example, in environments with discrete actions, potential trajectories can be simulated using look ahead searches (for instance, using the Monte Carlo tree search).

The model of the environment could either be given in advance or learned through interactions with it. If the environment is complex, it's a good idea to approximate it using deep neural networks. RL algorithms that use an already known model of the environment, or learn one, are called model-based methods. These solutions are opposed to model-free methods and will be explained in more detail in Chapter 9, Model-Based RL.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset