Keras RL

Keras-RL provides a very useful API that wraps several RL variations, such as DQN, DDQN, SARSA, and so on. We won't get into the details of those variations right now, but we will cover the important parts later, as we get into more complex models. For now, though, we are going to look at how you can quickly build a DRL model to play Atari games. Open up Chapter_5_6.py and follow these steps:

  1. We first need to install several dependencies with pip; open a command shell or Anaconda window, and enter the following commands:
pip install Pillow
pip install keras-rl

pip install gym[atari] # on Linux or Mac
pip install --no-index -f https://github.com/Kojoley/atari-py/releases atari_py # on Windows thanks to Nikita Kniazev

  2. This will install the Keras-RL API, Pillow (an imaging library), and the Atari environment for Gym.
  3. Run the example code as you normally would. This sample does take script arguments, but we don't need to use them here. An example of the rendered Atari Breakout environment follows:


Atari Breakout environment

Unfortunately, you cannot see the game run as the agent plays, because all the action takes place in the background, so just let the agent run until it completes training and saves the model.
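Because training runs headless, a quick sanity check that gym[atari] installed correctly (and a look at what the agent actually receives as state) is to build the same kind of Gym environment yourself in a Python shell. This snippet is not part of Chapter_5_6.py; the environment name BreakoutDeterministic-v4 is taken from the standard Keras-RL Atari example, and the id used in the sample may differ:

import gym

# Build an Atari Breakout environment directly (assumed id; the sample may use another)
env = gym.make('BreakoutDeterministic-v4')

# Each raw state is one screen frame: a (210, 160, 3) RGB array in stock Breakout
obs = env.reset()
print(obs.shape)

# The discrete actions the agent can choose from (4 for Breakout: noop, fire, right, left)
print(env.action_space.n)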

  4. You can rerun the sample using --mode test as an argument to let the agent run for 10 episodes and see the results.
  5. As the sample runs, look through the code and pay special attention to the model, as follows:
model = Sequential()
if K.image_dim_ordering() == 'tf':
    # (width, height, channels)
    model.add(Permute((2, 3, 1), input_shape=input_shape))
elif K.image_dim_ordering() == 'th':
    # (channels, width, height)
    model.add(Permute((1, 2, 3), input_shape=input_shape))
else:
    raise RuntimeError('Unknown image_dim_ordering.')
model.add(Convolution2D(32, (8, 8), strides=(4, 4)))
model.add(Activation('relu'))
model.add(Convolution2D(64, (4, 4), strides=(2, 2)))
model.add(Activation('relu'))
model.add(Convolution2D(64, (3, 3), strides=(1, 1)))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))
print(model.summary())
  6. Note how our model is built from stacked convolutional layers (strided convolutions take the place of pooling here). This is because this example reads each screen/frame of the game as its input (state) and responds accordingly. In this case, the state space is massive, which demonstrates the real power of DRL. We are still training against a state (value) model here; in future chapters, we will look at training a policy rather than a model. How this model gets wired into the agent is sketched just after this list.
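For context on how this model actually gets used, the following is a condensed sketch of the surrounding Keras-RL plumbing, based on the library's standard Atari DQN example. The exact hyperparameters, environment id, and file names in Chapter_5_6.py may differ; env, model, nb_actions, and input_shape are assumed to be defined as in the sample, with input_shape = (4, 84, 84):

from PIL import Image
import numpy as np
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.core import Processor
from rl.memory import SequentialMemory
from rl.policy import LinearAnnealedPolicy, EpsGreedyQPolicy

class AtariProcessor(Processor):
    # Downsample each raw frame to 84 x 84 grayscale before it is stored
    def process_observation(self, observation):
        img = Image.fromarray(observation).resize((84, 84)).convert('L')
        return np.array(img).astype('uint8')

    # Scale pixel values to [0, 1] only when a batch is fed to the network
    def process_state_batch(self, batch):
        return batch.astype('float32') / 255.

    # Clip rewards so gradient magnitudes stay comparable across games
    def process_reward(self, reward):
        return np.clip(reward, -1., 1.)

# Replay memory: past transitions are sampled in random minibatches for training
memory = SequentialMemory(limit=1000000, window_length=4)

# Epsilon-greedy exploration, annealed from fully random to mostly greedy
policy = LinearAnnealedPolicy(EpsGreedyQPolicy(), attr='eps', value_max=1.,
                              value_min=.1, value_test=.05, nb_steps=1000000)

# The DQN agent wraps the Keras model built in the previous step
dqn = DQNAgent(model=model, nb_actions=nb_actions, policy=policy,
               memory=memory, processor=AtariProcessor(),
               nb_steps_warmup=50000, gamma=.99, target_model_update=10000,
               train_interval=4, delta_clip=1.)
dqn.compile(Adam(lr=.00025), metrics=['mae'])

# Train, save the weights, then run 10 test episodes (what --mode test does)
dqn.fit(env, nb_steps=1750000, log_interval=10000)
dqn.save_weights('dqn_breakout_weights.h5f', overwrite=True)
dqn.test(env, nb_episodes=10, visualize=False)

The key design point to notice is that the Sequential model only estimates Q-values; everything that makes it reinforcement learning (replay memory, exploration policy, target network updates) lives in the agent that wraps it.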

This was a simple introduction to RL, and we have omitted several details that can trip up newcomers. Since we plan to cover RL over several more chapters, and in particular Proximal Policy Optimization (PPO) in Chapter 8, Understanding PPO, don't fret too much yet about distinctions such as policy-based versus model-based RL.

There is an excellent example of this same DQN in TensorFlow at this GitHub link: https://github.com/floodsung/DQN-Atari-Tensorflow. The code may be a bit dated, but it is a simple and excellent example that is worth taking a look at.

We won't look any further at the code, but the reader is certainly invited to. Now let's try some exercises.
