Deep Q-Network

So far, we have developed reinforcement learning algorithms that learn a value function, V, for each state, or an action-value function, Q, for each state-action pair. These methods store and update each value separately in a table (or an array). Such approaches do not scale: the table grows with the number of state-action pairs (exponentially in the number of state variables) and can easily exceed the available memory.
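
As a concrete picture of the tabular approach, the sketch below keeps the whole Q function in a NumPy array and applies a single Q-learning update. The environment sizes here are illustrative, and the closing comment shows why a raw pixel state makes such a table infeasible:

```python
import numpy as np

# Tabular Q-learning: one stored value per (state, action) pair.
n_states, n_actions = 500, 6          # illustrative sizes, not from a real environment
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One Q-learning update for the observed transition (s, a, r, s_next)."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

# This works for small problems, but a single 84x84 greyscale Atari frame
# has 256 ** (84 * 84) possible states -- no table over them can fit in
# memory, which is what motivates function approximation.
```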

In this chapter, we will introduce the use of function approximation in reinforcement learning algorithms to overcome this problem. In particular, we will focus on deep neural networks applied to Q-learning. In the first part of this chapter, we'll explain how to extend Q-learning with function approximation to represent Q values, and we'll explore some major difficulties that we may face. In the second part, we will present a new algorithm called Deep Q-Network (DQN), which uses new ideas to offer an elegant solution to some of the challenges found in the vanilla version of Q-learning with neural networks. You'll see how this algorithm achieves surprising results on a wide variety of games while learning only from pixels. Moreover, you'll implement this algorithm, apply it to Pong, and see some of its strengths and vulnerabilities for yourself.
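
To make the idea of approximating Q values concrete, here is a minimal sketch of a neural network that maps a state vector to one Q value per action. The framework (PyTorch), layer sizes, and input dimensions are illustrative assumptions, not the exact implementation developed later in the chapter:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Feed-forward network approximating Q(s, a) for all actions at once."""
    def __init__(self, obs_size: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Usage: greedy action selection for a batch containing one 4-dimensional state.
q_net = QNetwork(obs_size=4, n_actions=2)
state = torch.rand(1, 4)
q_values = q_net(state)          # shape (1, 2): one Q value per action
action = q_values.argmax(dim=1)  # pick the action with the highest Q value
```

Instead of one table entry per state, the network's weights are shared across all states, so memory no longer grows with the size of the state space.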

Since DQN was proposed, other researchers have introduced many variations that improve the algorithm's stability and efficiency. We'll quickly look at and implement some of them, both to better understand the weaknesses of the basic version of DQN and to give you some ideas for improving it yourself.

The following topics will be covered in this chapter:

  • Deep neural networks and Q-learning
  • DQN
  • DQN applied to Pong
  • DQN variations