Deep Q-Network

So far, we have developed reinforcement learning algorithms that learn a value function, V, for each state, or an action-value function, Q, for each state-action pair. These methods store and update each value separately in a table (or an array). Such approaches do not scale: the table grows with the number of state-action pairs (exponentially in the number of state variables) and can easily exceed the available memory.
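
As a concrete picture of the tabular approach, the sketch below keeps the whole Q function in a NumPy array and applies a single Q-learning update. The environment sizes here are illustrative, and the closing comment shows why a raw pixel state makes such a table infeasible:

```python
import numpy as np

# Tabular Q-learning: one stored value per (state, action) pair.
n_states, n_actions = 500, 6          # illustrative sizes, not from a real environment
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One Q-learning update for the observed transition (s, a, r, s_next)."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

# This works for small problems, but a single 84x84 greyscale Atari frame
# has 256 ** (84 * 84) possible states -- no table over them can fit in
# memory, which is what motivates function approximation.
```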

In this chapter, we will introduce the use of function approximation in reinforcement learning algorithms to overcome this problem. In particular, we will focus on deep neural networks applied to Q-learning. In the first part of this chapter, we'll explain how to extend Q-learning with function approximation to represent Q values, and we'll explore some major difficulties that we may face. In the second part, we will present a new algorithm called Deep Q-Network (DQN), which uses new ideas to offer an elegant solution to some of the challenges found in the vanilla version of Q-learning with neural networks. You'll see how this algorithm achieves surprising results on a wide variety of games while learning only from pixels. Moreover, you'll implement this algorithm, apply it to Pong, and see some of its strengths and vulnerabilities for yourself.
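
To make the idea of approximating Q values concrete, here is a minimal sketch of a neural network that maps a state vector to one Q value per action. The framework (PyTorch), layer sizes, and input dimensions are illustrative assumptions, not the exact implementation developed later in the chapter:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Feed-forward network approximating Q(s, a) for all actions at once."""
    def __init__(self, obs_size: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Usage: greedy action selection for a batch containing one 4-dimensional state.
q_net = QNetwork(obs_size=4, n_actions=2)
state = torch.rand(1, 4)
q_values = q_net(state)          # shape (1, 2): one Q value per action
action = q_values.argmax(dim=1)  # pick the action with the highest Q value
```

Instead of one table entry per state, the network's weights are shared across all states, so memory no longer grows with the size of the state space.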

Since DQN was proposed, other researchers have introduced many variations that improve the algorithm's stability and efficiency. We'll quickly look at and implement some of them, both to better understand the weaknesses of the basic version of DQN and to give you some ideas for improving it yourself.

The following topics will be covered in this chapter:

  • Deep neural networks and Q-learning
  • DQN
  • DQN applied to Pong
  • DQN variations