Model architecture

So far, we have talked about the algorithm itself, but we haven't explained the architecture of the DQN. Besides the new ideas adopted to stabilize its training, the architecture of the DQN plays a crucial role in the final performance of the algorithm. In the DQN paper, a single model architecture is used in all of the Atari environments. It combines CNNs and FNNs. In particular, because the observations are given as images, it employs a CNN to learn feature maps from those images. CNNs are widely used with images because their response to a pattern is largely independent of where that pattern appears in the image (translation invariance), and because they share weights across spatial locations, which allows the network to learn with fewer weights than other types of deep neural networks.

The output of the model corresponds to the state-action values, one for each action. Thus, to control an agent with five actions, the model will output a value for each of those five actions. Such a model architecture allows us to compute all the Q-values for a state with a single forward pass, rather than one pass per action.
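
To make the "one output per action" idea concrete, the following is a minimal sketch in plain Python. The `forward` function is a hypothetical stand-in for the network (a single linear layer with made-up weights, not the DQN itself), and `greedy_action` is an illustrative helper name; the point is only that one call yields all five Q-values, from which the best action is read off directly.

```python
def forward(state, weights, biases):
    """Hypothetical stand-in for the Q-network: one linear output per action.

    Returns a list with Q(state, a) for every action a, computed in one pass.
    """
    return [
        sum(w * x for w, x in zip(row, state)) + b
        for row, b in zip(weights, biases)
    ]


def greedy_action(q_values):
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=q_values.__getitem__)


# Made-up parameters for an agent with five actions and a 2-dimensional state.
weights = [[0.1, 0.2], [0.4, -0.1], [0.0, 0.9], [-0.3, 0.3], [0.2, 0.2]]
biases = [0.0, 0.0, 0.0, 0.0, 0.0]

state = [1.0, 0.5]
q_values = forward(state, weights, biases)  # a single forward pass
action = greedy_action(q_values)            # action with the largest Q-value
```

With a network that took a state-action pair as input and returned a single value, the same decision would require five separate forward passes.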

There are three convolutional layers. Each layer consists of a convolution operation followed by a non-linear function; moving deeper into the network, the number of filters increases while the spatial dimensions of the feature maps decrease. The last hidden layer is a fully connected layer with a rectifier (ReLU) activation function, and it is followed by a fully connected linear layer with one output for each action. A simple representation of this architecture is shown in the following illustration:

Figure 5.2. Illustration of a DNN architecture for DQN composed with a CNN and FNN
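
As a rough check on how the layers fit together, the sketch below traces the tensor shapes through such a network. The concrete sizes are assumptions that this section does not state: they follow the configuration reported in the published DQN paper (84x84 input frames, convolutions of 32 filters of 8x8 with stride 4, 64 filters of 4x4 with stride 2, and 64 filters of 3x3 with stride 1, no padding, and a 512-unit hidden layer). The function names are illustrative only.

```python
# Assumed layer settings (filters, kernel size, stride), taken from the
# published DQN paper rather than from this section.
CONV_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def conv_output_size(size, kernel, stride):
    """Spatial size after a convolution with no padding."""
    return (size - kernel) // stride + 1


def trace_shapes(input_size=84, num_actions=5, hidden_units=512):
    """Return the output shape of every layer, from first conv to Q-values."""
    shapes = []
    size = input_size
    filters = None
    for filters, kernel, stride in CONV_LAYERS:
        size = conv_output_size(size, kernel, stride)
        shapes.append((size, size, filters))   # feature maps shrink, filters grow
    shapes.append((size * size * filters,))    # flattened input to the FNN part
    shapes.append((hidden_units,))             # fully connected layer + ReLU
    shapes.append((num_actions,))              # linear layer: one Q-value per action
    return shapes


for shape in trace_shapes():
    print(shape)
```

Under these assumptions, the feature maps shrink from 20x20 to 9x9 to 7x7 while the filter count goes from 32 to 64, and the last convolutional output flattens to 3,136 values before the fully connected layers.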