The Curiosity Intrinsic module in action

With our appreciation of the difficulty of the Pyramids task, we can move on to training the agent with curiosity in the following exercise:

  1. Open the Pyramids scene in the editor.
  2. Select the AreaRB | Agent object in the Hierarchy window.
  3. Set the Pyramid Agent | Brain property to the PyramidsLearning brain.
  4. Select the Academy object in the Hierarchy window.
  5. Enable the Control option on the Academy | Pyramid Academy | Brains | Control property, as shown in the following screenshot:

Setting the Academy to Control
  6. Open a Python or Anaconda console and prepare it for training.
  7. Open the trainer_config.yaml file located in the ML-Agents/ml-agents/config folder.
  8. Scroll down to the PyramidsLearning configuration section, as follows:
      PyramidsLearning:
          use_curiosity: true
          summary_freq: 2000
          curiosity_strength: 0.01
          curiosity_enc_size: 256
          time_horizon: 128
          batch_size: 128
          buffer_size: 2048
          hidden_units: 512
          num_layers: 2
          beta: 1.0e-2
          max_steps: 5.0e5
          num_epoch: 3

  9. There are three new configuration parameters, use_curiosity, curiosity_strength, and curiosity_enc_size:
    • use_curiosity: Set this to true to enable the module; it is generally false by default.
    • curiosity_strength: This sets how strongly the agent weights the intrinsic curiosity reward relative to the extrinsic rewards.
    • curiosity_enc_size: This is the size of the encoded layer that the curiosity module compresses its input to. If you think back to autoencoders, a size of 256 may seem quite large, but also consider the size of the state space or observation space you may be encoding.
     Leave the parameters at their current values.
  10. Launch the training session with the following command:
      mlagents-learn config/trainer_config.yaml --run-id=pyramids --train
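
The way a brain-specific section such as PyramidsLearning sits on top of shared default settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DEFAULTS dictionary and the resolve_config helper are hypothetical names, and the merge is a plain dictionary update, not the actual ML-Agents configuration loader. The PyramidsLearning values are copied from the listing above.

```python
# Hypothetical sketch: brain-specific settings overriding shared defaults.
# DEFAULTS is an assumption for illustration; only the use_curiosity default
# (false) is taken from the text.
DEFAULTS = {
    "use_curiosity": False,
}

# Values copied from the PyramidsLearning section shown above.
PYRAMIDS_LEARNING = {
    "use_curiosity": True,
    "summary_freq": 2000,
    "curiosity_strength": 0.01,
    "curiosity_enc_size": 256,
    "time_horizon": 128,
    "batch_size": 128,
    "buffer_size": 2048,
    "hidden_units": 512,
    "num_layers": 2,
    "beta": 1.0e-2,
    "max_steps": 5.0e5,
    "num_epoch": 3,
}


def resolve_config(defaults, overrides):
    """Return a new dict in which brain-specific keys win over defaults."""
    merged = dict(defaults)    # copy so the shared defaults stay untouched
    merged.update(overrides)   # brain-specific values take precedence
    return merged


config = resolve_config(DEFAULTS, PYRAMIDS_LEARNING)
print("curiosity enabled:", config["use_curiosity"])
print("curiosity strength:", config["curiosity_strength"])
```

A brain without its own section would simply keep the defaults in this sketch, so curiosity would stay switched off for it.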

While this training session may take a while, it can be entertaining to watch how the agent explores. Even with the current settings and only one training area, you may be able to see the agent solve the puzzle on a few iterations.
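
To make the role of curiosity_strength more concrete, the following minimal sketch adds a scaled intrinsic reward to the extrinsic one. It assumes the intrinsic reward is the squared error between a predicted and an actual encoded next state, which is the general idea behind curiosity-driven exploration. The function names and the toy vectors are hypothetical, and this is not the ML-Agents implementation.

```python
# Conceptual sketch only: names and numbers are illustrative assumptions.

def intrinsic_reward(predicted_next, actual_next, strength):
    """Squared prediction error between two encoded states, scaled by strength."""
    error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return strength * error


def total_reward(extrinsic, predicted_next, actual_next, strength=0.01):
    """Extrinsic reward plus the scaled curiosity bonus."""
    return extrinsic + intrinsic_reward(predicted_next, actual_next, strength)


# A transition the agent predicted perfectly earns no curiosity bonus.
familiar = total_reward(0.0, [0.5, 0.5], [0.5, 0.5])

# A surprising transition earns a small bonus even with no extrinsic reward.
novel = total_reward(0.0, [0.5, 0.5], [1.5, -0.5])

print("familiar:", familiar)
print("novel:", novel)
```

With an extrinsic reward of zero, the poorly predicted transition still earns a small positive reward, which is what nudges the agent toward unexplored parts of the environment. A larger strength value would make that bonus count for more relative to the extrinsic rewards.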

Since ICM is a module, it can quickly be enabled for any other example where we want to see its effects, which is what we will do in the next section.
