DQN implementation

Although DQN is a fairly simple algorithm, it demands particular attention when it comes to its implementation and design choices. Like every other deep RL algorithm, it is not easy to debug and tune, so throughout this book, we'll give you techniques and suggestions for doing this.

The DQN code contains four main components:

  • DNNs
  • An experience buffer
  • A computational graph
  • A training (and evaluation) loop

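To make the role of the experience buffer concrete, here is a minimal sketch of how such a buffer could be implemented with a `deque`. The names `ReplayBuffer`, `add`, and `sample_minibatch` are illustrative, not necessarily the ones used later in the chapter:

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-size buffer storing (obs, action, reward, next_obs, done) transitions."""

    def __init__(self, maxlen):
        # When the deque is full, the oldest transitions are discarded automatically
        self.buffer = deque(maxlen=maxlen)

    def add(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample_minibatch(self, batch_size):
        # Uniform random sampling breaks the temporal correlation
        # between consecutive transitions
        batch = random.sample(self.buffer, batch_size)
        obs, actions, rewards, next_obs, dones = map(np.array, zip(*batch))
        return obs, actions, rewards, next_obs, dones

    def __len__(self):
        return len(self.buffer)
```

Sampling minibatches uniformly from a large buffer of past transitions, rather than learning from consecutive steps, is one of the key ingredients that stabilizes DQN training.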
The code, as usual, is written in Python and TensorFlow, and we'll use TensorBoard to visualize the training and the performance of the algorithm. 

All the code is available in this book's GitHub repository, so make sure to check it out there. To avoid weighing down the code listings, we don't reproduce the implementation of some of the simpler functions here.

Let's immediately jump into the implementation by importing the required libraries:

import numpy as np
import tensorflow as tf
import gym
from datetime import datetime
from collections import deque
import time
import sys

from atari_wrappers import make_env

atari_wrappers includes the make_env function we defined previously.
