Understanding visual state

RL is a very powerful technique, but it can become very computationally complex when we start to look at massive state inputs. To cope with massive state spaces, many powerful RL algorithms use the concept of model-free or policy-based learning, something we will cover in a later chapter. As we already know, Unity uses a policy-based algorithm, which allows it to learn any size of state space by generalizing to a policy. This lets us move easily from a state space of 15 vector inputs, as in the example we just ran, to something far more massive, as in the VisualHallway example.
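To make the idea of generalizing over a state space more concrete, here is a minimal sketch of a policy network that maps an observation vector to action probabilities. This is written in PyTorch purely for illustration; it is not the network ML-Agents builds internally, and the observation and action sizes shown are hypothetical:

# Illustrative sketch only -- not ML-Agents' internal code.
# A policy network maps an observation vector of any chosen size to a
# probability distribution over discrete actions.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    def __init__(self, obs_size, num_actions, hidden=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_size, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, obs):
        # Softmax turns raw scores into action probabilities.
        return torch.softmax(self.body(obs), dim=-1)

# The same architecture handles a small vector state or a huge flattened
# image state -- only obs_size changes (the sizes below are hypothetical).
small_policy = PolicyNet(obs_size=15, num_actions=5)
large_policy = PolicyNet(obs_size=84 * 84 * 3, num_actions=5)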

Let's open up Unity to the VisualHallway example scene and look at how to reduce the visual input space in the following exercise:

  1. With the VisualHallway scene open, locate the HallwayLearningBrain in the Assets | ML-Agents | Examples | Hallway | Brains folder and select it.
  2. Under Brain Parameters | Visual Observation, modify the first camera observation to an input of 32 x 32 with Grayscale enabled. An example of this is shown in the following screenshot:

Setting up the visual observation space for the agent
  3. When Visual Observations are set on a brain, every frame is captured from the camera at the selected resolution. Previously, the captured image was 84 x 84 pixels, by no means as large as the game screen in player mode, but still significantly larger than 35 vector inputs. By reducing the image size and making it grayscale, we reduce one input frame from 84 x 84 x 3 = 21,168 inputs to 32 x 32 x 1 = 1,024 (see the short sketch after these steps). In turn, this greatly reduces our required model input space and the complexity of the network needed to learn.
  4. Save the project and the scene.
  5. Run the VisualHallway example in learning mode again using the following command:
mlagents-learn config/trainer_config.yaml --run-id=vh_reduced --train
  6. Notice how we change the --run-id parameter with every run. Recall that if we want to use TensorBoard, each of our runs needs a unique name; otherwise, it just overwrites the previous runs.
  7. Let the sample train for as long as you ran the earlier VisualHallway exercise, as this will give you a good comparison of the change we made to the state.
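Before comparing results, here is a quick back-of-the-envelope sketch (plain Python/NumPy, not ML-Agents code) that checks the input-size arithmetic from step 3 and shows roughly what resizing a captured frame to 32 x 32 grayscale does; the exact capture and resampling ML-Agents performs may differ:

# Rough illustration only -- the exact resampling inside ML-Agents may differ
# from this naive version.
import numpy as np

full_rgb = 84 * 84 * 3    # original visual observation: 21,168 values per frame
small_gray = 32 * 32 * 1  # reduced observation: 1,024 values per frame
print(full_rgb, small_gray, round(full_rgb / small_gray, 1))  # 21168 1024 20.7

frame = np.random.randint(0, 256, (84, 84, 3), dtype=np.uint8)  # stand-in camera frame
gray = frame.mean(axis=-1)              # average the RGB channels to grayscale
small = gray[::2, ::2][:32, :32]        # naive downsample and crop to 32 x 32
print(small.shape)                      # (32, 32)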

Are the results what you expected? Yeah, the agent still doesn't learn, even after reducing the state. The reason for this is that the smaller visual state actually works against the agent in this particular case, not unlike the results we would expect a human to have when trying to solve a task by looking through a pinhole. However, there is another way to reduce visual state into feature sets: using convolution. As you may recall, we covered convolution and CNNs at some length in Chapter 2, Convolutional and Recurrent Networks. In the next section, we will look at how we can reduce the visual state of our example by adding convolutional layers.
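As a preview of that idea, the following is a minimal sketch (again in PyTorch, for illustration only and not the exact architecture ML-Agents uses) of how a couple of convolutional layers can compress a 32 x 32 grayscale frame into a small feature vector before any policy layers see it:

# Illustrative sketch only -- not the exact network ML-Agents builds.
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    def __init__(self, features=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=4, stride=2), nn.ReLU(),   # 32x32 -> 15x15
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),  # 15x15 -> 7x7
        )
        self.head = nn.Linear(32 * 7 * 7, features)

    def forward(self, frames):
        x = self.conv(frames)            # (batch, 32, 7, 7)
        return self.head(x.flatten(1))   # (batch, features)

encoder = ConvEncoder()
batch = torch.zeros(1, 1, 32, 32)        # one grayscale 32 x 32 observation
print(encoder(batch).shape)              # torch.Size([1, 64])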
