What's in a brain?

One of the brilliant aspects of the ML-Agents platform is the ability to switch from player control to AI/agent control quickly and seamlessly. To do this, Unity uses the concept of a brain. A brain may be either player-controlled, a player brain, or agent-controlled, a learning brain. The brilliant part is that you can build and test a game as a player, and then turn the game loose on an RL agent. This has the added benefit of making any game written in Unity controllable by an AI with very little effort. In fact, this workflow is so powerful that we will spend an entire chapter, Chapter 12, Debugging/Testing a Game with DRL, on testing and debugging your games with RL.

Training an RL agent with Unity is fairly straightforward to set up and run. Unity uses Python externally to build the learning brain model. Using Python makes far more sense since, as we have already seen, several DL libraries are built on top of it. Follow these steps to train an agent for the GridWorld environment:

  1. Select the GridAcademy again and switch the Brains from GridWorldPlayer to GridWorldLearning as shown:
Switching the brain to use GridWorldLearning
  2. Make sure to click the Control option at the end. This simple setting is what tells the brain it may be controlled externally. Be sure to double-check that the option is enabled.
  3. Select the trueAgent object in the Hierarchy window, and then, in the Inspector window, change the Brain property under the Grid Agent component to a GridWorldLearning brain:
Setting the brain on the agent to GridWorldLearning
  4. For this sample, we want to switch our Academy and Agent to use the same brain, GridWorldLearning. In more advanced cases we will explore later, this is not always the case. You could, of course, have a player and an agent brain running in tandem, or many other configurations.
  5. Be sure you have an Anaconda or Python window open and set to the ML-Agents/ml-agents folder or your versioned ml-agents folder.
  6. Run the following command in the Anaconda or Python window using the ml-agents virtual environment:
mlagents-learn config/trainer_config.yaml --run-id=firstRun --train
  7. This will start the Unity PPO trainer and run the agent example as configured. At some point, the command window will prompt you to run the Unity editor with the environment loaded.
  8. Press Play in the Unity editor to run the GridWorld environment. Shortly after, you should see the agent training, with the results being output in the Python script window:
Running the GridWorld environment in training mode
  9. Note how the mlagents-learn script is the Python code that builds the RL model to run the agent. As you can see from the output of the script, there are several parameters, or what we refer to as hyperparameters, that need to be configured. Some of these parameters may sound familiar, and they should, but several may be unclear. Fortunately, over the rest of this chapter and this book, we will explore how to tune these parameters in some detail.
  10. Let the agent train for several thousand iterations and note how quickly it learns. The internal model here, PPO (Proximal Policy Optimization), has been shown to be a very effective learner on many kinds of tasks and is well suited to game development. Depending on your hardware, the agent may learn to perfect this task in less than an hour.
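The hyperparameters you see in the trainer output come from the config/trainer_config.yaml file you passed to mlagents-learn. As a rough sketch, a per-brain entry in that file looks something like the following; the exact keys and defaults depend on your ML-Agents version, and the values shown here are illustrative, not recommendations:

```yaml
# Illustrative trainer_config.yaml fragment; keys and values vary by
# ML-Agents version and are shown only as an example of the shape.
GridWorldLearning:
    trainer: ppo          # use the PPO trainer for this brain
    batch_size: 32        # experiences per gradient update
    buffer_size: 256      # experiences collected before an update
    learning_rate: 3.0e-4 # step size for gradient descent
    max_steps: 5.0e4      # total simulation steps to train for
    gamma: 0.9            # discount factor for future rewards
    summary_freq: 1000    # how often training stats are reported
```

Entries named after a brain override the trainer's defaults, which is why switching the Academy to GridWorldLearning picks up this configuration.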

Keep the agent training, and we will look at more ways to inspect the agent's training progress in the next section.
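To give a flavor of what the PPO trainer is doing while you watch it learn, here is a minimal sketch of PPO's clipped surrogate objective for a single sampled action. This illustrates the published algorithm in plain Python; it is not code from ML-Agents itself, and the function name is our own:

```python
# Minimal sketch of PPO's clipped surrogate objective for one sample.
# ratio = new_policy_prob / old_policy_prob for the sampled action;
# advantage = estimated advantage of that action;
# epsilon = the clip-range hyperparameter seen in the trainer config.
def ppo_clipped_objective(ratio, advantage, epsilon=0.2):
    # Clamp the probability ratio into [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic (smaller) of the two surrogates,
    # which keeps each policy update conservative.
    return min(ratio * advantage, clipped_ratio * advantage)

# A large policy change (ratio 1.5) with a positive advantage is
# clipped back to (1 + epsilon) * advantage = 1.2.
print(ppo_clipped_objective(1.5, 1.0))
```

The clipping is what makes PPO stable enough to train unattended for thousands of iterations: no single batch can push the policy too far from its previous behavior.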
