Introducing rewards to the game

The scene currently has no well-defined goal. Plenty of open-world and exploration-style games leave the goal loosely defined, but for our purposes we really only want the agent to test-play the whole game level and, hopefully, identify any game flaws or strategies that we never foresaw. Of course, if the car-driving agents become good enough, nothing stops us from also using them as game opponents. The bottom line is that our agent needs to learn, and it does that through rewards; therefore, we need to define some reward functions.

Let's first define a reward function for our goal, as follows:

reward = +1 (when the agent reaches a goal)

It's pretty simple; whenever the agent encounters a goal, it scores a reward of 1. Now, to avoid the agent taking too long, we will also introduce a standard step reward, as follows:

step reward = -1 / maxSteps (applied on every agent action)

This means that we apply a step reward of -1 divided by the maximum number of steps on every agent action. This is quite standard (our Hallway agent used it, for instance), so there is nothing new here. Our reward functions are therefore quite simple, which is good.
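To make this concrete, the following snippet sketches how these two rewards might be wired into an agent using the ML-Agents Agent API of this book's era (AddReward, Done, and agentParameters.maxStep). The class name, the OnCollisionEnter handler, and the goal tag are our assumptions for illustration; the chapter's real TestingAgent is built later.

using MLAgents;
using UnityEngine;

namespace Packt.HoDLG
{
    // Sketch only: illustrates the two reward functions described previously.
    public class RewardSketchAgent : Agent
    {
        public override void AgentAction(float[] vectorAction, string textAction)
        {
            // Step reward: -1 divided by the maximum steps allowed per episode.
            AddReward(-1f / agentParameters.maxStep);
        }

        void OnCollisionEnter(Collision collision)
        {
            // Assumes the goal prefab is tagged "goal" (our assumption).
            if (collision.gameObject.CompareTag("goal"))
            {
                AddReward(1f); // the goal reward
                Done();        // end the episode once a goal is reached
            }
        }
    }
}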

In many cases, your game may have well-defined goals that you can use to assign rewards. A driving game, for example, has a clear goal that we could map directly to a reward for our agent. In our open-world game, it instead makes sense to add goals for the agent to locate. How you implement your reward structure does matter, of course, but use whatever makes sense for your situation.

With the reward functions defined, it is time to introduce the concept of a goal into our game. We want to keep this system somewhat generic, so we will build a goal deployment system into a new object called TestingAcademy. That way, you can take this academy, drop it into any similar FPS- or third-person-controlled world, and it will work the same.

First-person shooter (FPS) refers to a type of game, but also a type of control/camera system. We are interested in the latter, since it is the method by which we control our car.

Open the editor to the new combined project, and follow the next exercise to build the TestingAcademy object:

  1. Click in the Hierarchy window, and from the menu, select GameObject | Create Empty. Name the new object TestingAcademy.
  2. Locate and click inside the HoDLG Scripts folder, and then open the Create sub-menu in the Project window. 
  3. From the Create menu, select C# Script. Rename the script TestingAcademy.
  4. Open the new TestingAcademy script and enter the following code:
using MLAgents;
using UnityEngine;

namespace Packt.HoDLG
{
    public class TestingAcademy : Academy
    {
        public GameObject goal;        // the goal prefab to spawn
        public int numGoals;           // number of goals to deploy per reset
        public Vector3 goalSize;       // dimensions of the virtual spawning cube
        public Vector3 goalCenter;     // center of the virtual spawning cube
        public TestingAgent[] agents;  // agents found in the scene at startup
        public GameObject[] goals;     // the currently spawned goal instances
    }
}
All of the code for this chapter's exercise is included in the Chapter_12_Code.assetpackage that ships with the book's source code.
  5. This code imports the required namespaces, defines our own namespace, Packt.HoDLG, and declares the class, which extends Academy, an ML-Agents base class. Next come several variables that define the goal deployment cube. Think of this as a virtual cube in space from which the goals will spawn; the idea is to let physics do the rest and let each goal simply drop to the ground. An optional sketch for visualizing this cube follows the tip below.
Namespaces are optional in Unity, but it is highly recommended to put your code within a namespace in order to avoid most naming issues, which can be a common problem if you are using many assets or if you find yourself modifying existing assets, as we are doing here.
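If you want to see the deployment cube while working in the editor, a few lines of gizmo code can outline it in the Scene view. This is an optional sketch of ours, not part of the chapter's script; OnDrawGizmosSelected is a standard Unity callback, but the GoalCubeGizmo class name is hypothetical.

using UnityEngine;

namespace Packt.HoDLG
{
    // Optional helper: attach to the TestingAcademy object and copy its
    // goalSize/goalCenter values to outline the spawn volume in the Scene view.
    public class GoalCubeGizmo : MonoBehaviour
    {
        public Vector3 goalSize;
        public Vector3 goalCenter;

        void OnDrawGizmosSelected()
        {
            // Draw a wireframe cube matching the volume used in AcademyReset.
            Gizmos.color = Color.yellow;
            Gizmos.DrawWireCube(goalCenter, goalSize);
        }
    }
}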
  6. Next, we will define the standard Academy class setup method, InitializeAcademy. This method is called automatically, and is shown as follows:
public override void InitializeAcademy()
{
    // Find every TestingAgent in the scene and allocate the goals array.
    agents = FindObjectsOfType<TestingAgent>();
    goals = new GameObject[numGoals];
}
  7. This method is called as a part of the ML-Agents setup, and it essentially starts the whole SDK; by adding the Academy (here, TestingAcademy), we effectively enable ML-Agents. Next, we will add the final method, which is called when the academy resets at the end of all of the agents' episodes, as follows:
public override void AcademyReset()
{
    if (goalSize.magnitude > 0)
    {
        // Destroy any goals left over from the previous episode.
        for (int i = 0; i < numGoals; i++)
        {
            if (goals[i] != null && goals[i].activeSelf)
                Destroy(goals[i]);
        }
        // Spawn new goals at random positions inside the virtual cube.
        for (int i = 0; i < numGoals; i++)
        {
            var x = Random.Range(-goalSize.x / 2 + goalCenter.x, goalSize.x / 2 + goalCenter.x);
            var y = Random.Range(-goalSize.y / 2 + goalCenter.y, goalSize.y / 2 + goalCenter.y);
            var z = Random.Range(-goalSize.z / 2 + goalCenter.z, goalSize.z / 2 + goalCenter.z);
            goals[i] = Instantiate(goal, new Vector3(x, y, z), Quaternion.identity, transform);
        }
    }
}
  8. This code spawns the goals randomly within the virtual cube's bounds. Before it does so, it first clears the old goals by using the Destroy method, which removes an object from the game. Then, the code loops again and creates new goals at random locations within the virtual cube. The line that actually creates a goal in the game uses the Instantiate method, which creates an object at the specified position and rotation (and, here, parents it to the academy's transform). For the spawned goals to drop to the ground, the goal prefab needs a physics setup; see the sketch after these steps.
  9. Save the file and return to the editor. Don't worry about any compiler errors at this time; if you are writing the code from scratch, you will be missing some types, which we will define later.
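As noted in step 8, each spawned goal relies on physics to fall to the ground and on a collider so that the agent can detect it. In the book, the goal prefab is configured in the editor; the snippet below is merely our illustrative sketch of what that prefab needs at runtime, and the GoalValidator name is hypothetical.

using UnityEngine;

namespace Packt.HoDLG
{
    // Hypothetical helper: ensures a goal prefab can fall and be detected.
    public class GoalValidator : MonoBehaviour
    {
        void Awake()
        {
            // A Rigidbody lets physics drop the goal to the ground after spawning.
            if (GetComponent<Rigidbody>() == null)
                gameObject.AddComponent<Rigidbody>();

            // A Collider is required for the agent to register contact with the goal.
            if (GetComponent<Collider>() == null)
                gameObject.AddComponent<BoxCollider>();
        }
    }
}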

With the new TestingAcademy script created, we can move on to adding the component to the game object and setting up the academy in the next section.
