Transferring a brain

We now want to take the brain we have just been training and reuse it in a new, but similar, environment. Since our agent uses visual observations, which take the same form in both environments, our task is easier, but you could try this example with other agents as well.

Let's open Unity, navigate to the VisualPushBlock example scene, and follow this exercise:

  1. Select the Academy and enable Control of the brains.
  2. Select the Agent and set it to use the VisualPushBlockLearning brain. You should also confirm that this brain is configured in the same way as the VisualHallwayLearning brain we just ran, meaning that the Visual Observation and Vector Action spaces match.
  3. Open the ML-Agents/ml-agents_b/models/vishall-0 folder in File Explorer or your preferred file browser.
  4. Change the name of the file and folder from VisualHallwayLearning to VisualPushBlockLearning as shown in the following screenshot:
Changing the model path manually
  5. By changing the name of the folder (and file), we are essentially telling the model-loading system to restore our VisualHallway brain as the VisualPushBlockLearning brain. The trick here is making sure that both brains have exactly the same hyperparameters and configuration settings. A scripted version of this rename is sketched below.
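If you prefer to script the rename rather than do it by hand, the following sketch is a rough equivalent of steps 3 and 4. The path shown matches the folder we opened above, but you should adjust it to wherever your ML-Agents install keeps its models, and the exact files inside the run folder can vary between ML-Agents versions:

from pathlib import Path

# Rough scripted equivalent of steps 3 and 4 (adjust the path to your install).
OLD_NAME = "VisualHallwayLearning"
NEW_NAME = "VisualPushBlockLearning"
run_dir = Path("ML-Agents/ml-agents_b/models/vishall-0")

# Rename every file and folder that carries the old brain name, so the
# trainer restores the VisualHallway checkpoints under the new brain name.
for item in list(run_dir.iterdir()):
    if OLD_NAME in item.name:
        item.rename(item.with_name(item.name.replace(OLD_NAME, NEW_NAME)))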
  6. Speaking of hyperparameters, open the trainer_config.yaml file and make sure that the VisualHallwayLearning and VisualPushBlockLearning parameters are the same. The configuration for both is shown in the following code snippet for reference (a quick scripted comparison follows the snippet):
VisualHallwayLearning:
    use_recurrent: true
    sequence_length: 64
    num_layers: 1
    hidden_units: 128
    memory_size: 256
    beta: 1.0e-2
    gamma: 0.99
    num_epoch: 3
    buffer_size: 1024
    batch_size: 64
    max_steps: 5.0e5
    summary_freq: 1000
    time_horizon: 64

VisualPushBlockLearning:
    use_recurrent: true
    sequence_length: 64
    num_layers: 1
    hidden_units: 128
    memory_size: 256
    beta: 1.0e-2
    gamma: 0.99
    num_epoch: 3
    buffer_size: 1024
    batch_size: 64
    max_steps: 5.0e5
    summary_freq: 1000
    time_horizon: 64
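If you would rather not compare the two sections by eye, a quick scripted check works just as well. This is only a sketch: it assumes the file lives at config/trainer_config.yaml relative to where you run it and that PyYAML is installed. Note that it only compares the settings listed explicitly under each brain; both sections also inherit the default section at the top of trainer_config.yaml:

import yaml

# Load the trainer configuration and compare the two brain sections key by key.
with open("config/trainer_config.yaml") as f:
    config = yaml.safe_load(f)

hallway = config["VisualHallwayLearning"]
pushblock = config["VisualPushBlockLearning"]

# Print any setting that differs between the two brains; no output means they match.
for key in sorted(set(hallway) | set(pushblock)):
    if hallway.get(key) != pushblock.get(key):
        print(f"{key}: {hallway.get(key)} != {pushblock.get(key)}")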
  7. Save the configuration file when you are done editing.
  8. Open your Python/Anaconda window and launch training with the following code:
mlagents-learn config/trainer_config.yaml --run-id=vishall --train --save-freq=10000 --load
  9. The previous code is not a misprint; it is the exact same command we used to run the VisualHallway example, except with --load appended to the end. This should launch the training and prompt you to run the editor. If you want to confirm that the renamed model files are in place first, see the short check after these steps.
  10. Feel free to run the training for as long as you like, but keep in mind that we barely trained the original agent.
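As mentioned in step 9, it is worth confirming that the renamed VisualPushBlockLearning files are actually in the run folder before resuming with --load. The following sketch simply lists the folder's contents; it uses the same assumed path as the earlier rename sketch:

from pathlib import Path

# List everything under the run folder so you can confirm the renamed
# VisualPushBlockLearning files are in place before resuming with --load.
run_dir = Path("ML-Agents/ml-agents_b/models/vishall-0")
for item in sorted(run_dir.rglob("*")):
    print(item.relative_to(run_dir))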

Now, even if we had fully trained the agent to complete VisualHallway, that knowledge likely would not have transferred very effectively to VisualPushBlock. We chose these two environments because they are quite similar, which makes transferring a trained brain from one to the other less complicated. For your own projects, transferring trained brains may be more about retraining agents on new or modified levels, perhaps even letting the agents train on progressively more difficult ones.

Depending on your version of ML-Agents, this example may or may not work well. The difficulty is that the model's complexity, the number of hyperparameters, the input space, and the reward system all have to line up, and keeping every one of these factors identical requires keen attention to detail. In the next section, we will take a short diversion to explore just how complex these models are.
