Imitation Learning with the DAgger Algorithm

The ability to learn only from rewards is the defining characteristic of reinforcement learning algorithms: it enables an agent to learn and improve its policy from scratch, without additional supervision. However, in some environments, expert agents are already available. Imitation learning (IL) algorithms leverage such an expert by imitating its actions and learning a policy from them.

This chapter focuses on imitation learning. Although different from reinforcement learning, imitation learning offers great opportunities and capabilities, especially in environments with very large state spaces and sparse rewards. Naturally, imitation learning is possible only when an expert agent is available to imitate.

The chapter will focus on the main concepts and features of imitation learning methods. We'll implement an imitation learning algorithm called DAgger, and teach an agent to play Flappy Bird. This will help you to master this new family of algorithms and appreciate their basic principles.
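As a preview, the core DAgger loop can be sketched in a few lines: run the current policy, have the expert relabel the states it visits, aggregate those labels into a growing dataset, and retrain. The toy one-dimensional environment, sign-based expert, and nearest-neighbour learner below are illustrative assumptions for the sketch, not the Flappy Bird setup used later in the chapter:

```python
import random

def expert_policy(s):
    # Hypothetical expert for this toy task: push the state back toward zero.
    return -1 if s > 0 else +1

class Learner:
    """1-nearest-neighbour policy trained on (state, action) pairs."""
    def __init__(self):
        self.data = []            # aggregated dataset D

    def fit(self, pairs):
        self.data.extend(pairs)   # DAgger *aggregates*; it never discards old data

    def act(self, s):
        if not self.data:
            return expert_policy(s)  # before any training, fall back on the expert
        # Act like the expert did in the most similar state seen so far.
        return min(self.data, key=lambda p: abs(p[0] - s))[1]

def rollout(policy, steps=30, seed=0):
    # Toy dynamics: the action nudges the state, plus a little noise.
    rng = random.Random(seed)
    s, states = rng.uniform(-2, 2), []
    for _ in range(steps):
        states.append(s)
        s += policy(s) * 0.5 + rng.uniform(-0.1, 0.1)
    return states

learner = Learner()
for i in range(5):                                    # DAgger iterations
    states = rollout(learner.act, seed=i)             # roll out the *current* policy
    labels = [(s, expert_policy(s)) for s in states]  # expert relabels visited states
    learner.fit(labels)                               # aggregate and retrain
```

The key difference from plain behavioral cloning is in the rollout line: the states are collected under the learner's own policy, not the expert's, so the dataset covers the states the learner actually reaches.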

In the last section of this chapter, we'll introduce inverse reinforcement learning (IRL). IRL infers the values and rewards that explain another agent's behavior; that is, IRL learns the reward function.

The following topics will be covered in this chapter:

  • The imitation approach
  • Playing with Flappy Bird
  • Understanding the dataset aggregation algorithm
  • IRL
