Configuring the agents' personalities

With all the code set up, we can now return to the editor and configure each agent with the personality we want it to use.
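As a quick reminder, the Person Role dropdown used in the following steps comes from the enum field we added to the AgentSoccer script in the previous section. The sketch below shows the general shape of that field; the exact member names in your own script may differ:

public enum PersonRole
{
    Girl,
    Boy,
    Police,
    Zombie
}

// Exposed in the Inspector as Agent Soccer | Person Role
public PersonRole personRole = PersonRole.Girl;

Open the editor again and follow the next exercise to apply the personalities to the agents and start training: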

  1. Select RedStriker in Hierarchy and set the Agent Soccer | Person Role parameter we just created to Girl, as shown:

Setting the personalities on each of the agents
  2. Update all the agents with the relevant personality that matches the model we assigned earlier: BlueStriker -> Boy, BlueGoalie -> Police, and RedGoalie -> Zombie, as shown in the preceding screenshot.
  3. Save the scene and project.
  4. At this point, if you want to be more thorough, you can go back and rename each of the agent brains to reflect their personalities, such as GirlStrikerLearning or PoliceGoalieLearning, omitting the team colors. Be sure to also add matching brain configuration sections to your trainer_config.yaml file (a sample sketch follows the training command below).
  5. Open your Python/Anaconda training console and start training with the following command:
mlagents-learn config/trainer_config.yaml --run-id=soccer_peeps --train
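Before you start training, if you did rename the brains in step 4, make sure trainer_config.yaml contains a matching section for each new brain name. The entries might look something like the following sketch; the brain names and hyperparameter values here are only placeholders, so copy the settings from your existing striker and goalie entries:

GirlStrikerLearning:
    max_steps: 5.0e5
    batch_size: 128
    buffer_size: 2000
    hidden_units: 256
    time_horizon: 128

PoliceGoalieLearning:
    max_steps: 5.0e5
    batch_size: 320
    buffer_size: 2000
    hidden_units: 256
    time_horizon: 128

Add similar sections for any other brains you renamed, such as BoyStrikerLearning and ZombieGoalieLearning.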
  6. Now, this can be very entertaining to watch, as you can see in the following screenshot:

Watching individual personalities play soccer
  7. Note how we kept the team color cubes active in order to show which team each agent is on.
  8. Let the agents train for several thousand iterations, then look at the console again; note how the agents now appear less symbiotic. In our example, they are still paired with each other, since we only applied a simple linear transformation to the rewards. You could, of course, apply more complex functions that are non-linear and not inversely related, describing some other motivation or personality for your agents (a sketch of one such modifier appears at the end of this section).
  9. Finally, let's open TensorBoard to get a better comparison of our multi-agent training. Open another Python/Anaconda console in the ML-Agents/ml-agents folder you are currently working in and run the following command:
tensorboard --logdir=summaries
  10. Use your browser to open the TensorBoard interface and examine the results. Be sure to hide any results from earlier runs and focus only on the four brains in the current training run. The three main plots we want to focus on are shown merged together in the following diagram:

TensorBoard Plots showing results of training four brains

As you can see from the TensorBoard results, the agents are not training very well. We could improve this, of course, by adding more training areas and feeding the policy more observations. However, if you look at the Policy Loss plot, the results show that the agents' competition is causing minimal policy change, which is a bad sign this early in training. If anything, the zombie agent appears to be learning the best from these results.

There are, of course, plenty of other ways you can modify your extrinsic reward function in order to encourage particular behaviors in multi-agent training scenarios. Some of these techniques work well and some not so well; we are still in the early days of this technology, and best practices have yet to emerge.
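As one illustration of the idea from step 8, the following sketch swaps a simple linear reward scale for a personality-dependent, non-linear curve. This is a hypothetical example rather than the code from this chapter: the method name, the curve shapes, and the personRole field it reads are assumptions you would adapt to your own AgentSoccer script, calling the method on the raw extrinsic reward before passing the result to AddReward():

// Hypothetical non-linear reward modifier, placed inside the AgentSoccer class
// (which already pulls in UnityEngine for Mathf). Each personality reshapes the
// raw extrinsic reward in a different way.
float ModifyReward(float reward)
{
    switch (personRole)
    {
        case PersonRole.Girl:
            // Concave curve: rewards between -1 and 1 are pushed toward the extremes
            return Mathf.Sign(reward) * Mathf.Pow(Mathf.Abs(reward), 0.8f);
        case PersonRole.Boy:
            // Risk-seeking: quadratic curve downplays small rewards, exaggerates large ones
            return reward * Mathf.Abs(reward) * 2.0f;
        case PersonRole.Police:
            // Cautious: penalties hurt twice as much as rewards help
            return reward < 0f ? reward * 2.0f : reward * 0.5f;
        case PersonRole.Zombie:
            // Sluggish: flattens all feedback
            return reward * 0.25f;
        default:
            return reward;
    }
}

Whichever curve you pick, keeping the sign of the modified reward consistent with the underlying game outcome helps ensure the agents are still rewarded for playing soccer rather than for exploiting the modifier itself.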

In the next section, we look at further exercises you can work on to reinforce your knowledge of the material we covered in this chapter.
