You may have read sci-fi novels from the 1950s and 1960s; they are full of visions of what life in the 21st century would look like. They imagined a world of personal jet packs, underwater cities, intergalactic travel, flying cars, and truly intelligent robots capable of independent thought. The 21st century has now arrived; sadly, we are not going to get those flying cars, but thanks to deep learning, we may get that robot.
What does this have to do with deep learning for board games? In this chapter and the next, we will look at how to build Artificial Intelligence (AI) that can learn game environments. Reality has a vast space of possibilities. Even simple human tasks, such as getting a robot arm to pick up objects, require analyzing huge amounts of sensory data and controlling many continuous response variables for the movement of the arm.
Games act as a great playing field for testing general-purpose learning algorithms. They give you an environment of large but manageable possibilities. Also, when it comes to computer games, we know that humans can learn to play a game just from the pixels visible on the screen and minimal instructions. If we input the same pixels plus an objective into a computer agent, we know we have a solvable problem, given the right algorithm. In fact, for the computer, the problem is easier, because a human being first has to identify that the things they are seeing in their field of vision are actually game pixels, as opposed to the area around the screen. This is why so many researchers are looking at games as a great place to start developing true AIs: self-learning machines that can operate independently from us. Also, if you like games, it's lots of fun.
In this chapter, we will cover the different tools used for solving board games, such as checkers and chess. Eventually, we'll build up enough knowledge to understand and implement the kind of deep learning solution that was used to build AlphaGo, the AI that defeated one of the world's top human Go players. We'll use a variety of deep learning techniques to accomplish this. The next chapter will build on this knowledge and cover how deep learning can be used to learn to play computer games, such as Pong and Breakout.
The full list of concepts that we will cover across both chapters is as follows:
We will use a few different terms to describe tasks and their solutions. The following are some of the definitions. They all use the example of a basic maze game, as it is a good, simple example of a reinforcement learning environment. In a maze game, there is a set of locations with paths between them. There is an agent in this maze that can use the paths to move between the different locations. Some locations have a reward associated with them. The agent's objective is to navigate its way through the maze to get the best possible reward.
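To make the maze description concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the location names, the `PATHS` and `REWARDS` tables, and the `MazeAgent` class are all hypothetical and do not come from the book's repository.

```python
# Hypothetical maze: locations are keys, paths are the lists of neighbours,
# and only some locations carry a reward. All names here are illustrative.
PATHS = {
    "start": ["hall", "cellar"],
    "hall": ["start", "vault"],
    "cellar": ["start"],
    "vault": ["hall"],
}
REWARDS = {"cellar": 1, "vault": 10}


class MazeAgent:
    """An agent that moves along paths and accumulates rewards."""

    def __init__(self, location="start"):
        self.location = location
        self.total_reward = 0

    def available_moves(self):
        # The agent may only follow a path out of its current location.
        return PATHS[self.location]

    def move(self, destination):
        if destination not in self.available_moves():
            raise ValueError(
                "no path from %s to %s" % (self.location, destination))
        self.location = destination
        # Locations without an entry in REWARDS give no reward.
        reward = REWARDS.get(destination, 0)
        self.total_reward += reward
        return reward


agent = MazeAgent()
for step in ["hall", "vault"]:
    agent.move(step)
print(agent.location, agent.total_reward)
```

In this toy maze, heading for the cellar gives a small reward immediately, while the longer route through the hall reaches the larger reward in the vault. An agent trying to get the best possible reward has to learn to prefer the second route.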
A lot of this chapter is code-heavy, so as an alternative to copying all the samples from the book, you can find the full code in a GitHub repository at https://github.com/DanielSlater/PythonDeepLearningSamples. All the examples in the chapters are presented using TensorFlow, but the concepts could be translated into other deep learning frameworks.
Building AIs to play games started in the 1950s, with researchers building programs that played checkers and chess. These two games have a few properties in common:
The combination of perfect information and determinism in chess and checkers means that, given the current state, we know exactly what state we will be in if the current player takes an action. This property also chains: if we have a state and take an action leading to a new state, we can again take an action in this new state, and so keep playing as far into the future as we want.
To experiment with some of the approaches to mastering board games, we will give examples using a Python implementation of Tic-Tac-Toe. Also known as noughts and crosses, this is a simple game where players take turns making marks on a 3 by 3 grid. The first player to get three marks in a row wins. Tic-Tac-Toe is another deterministic, zero-sum, perfect-information game, and is chosen here because a Python implementation of it is a lot simpler than chess. In fact, the whole game can be done in less than a page of code, which will be shown later in this chapter.
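As a rough preview of how determinism lets us chain states in a game like this, the following sketch plays a short sequence of moves. It is not the implementation presented later in the chapter. The board encoding (a tuple of tuples, with `1` and `-1` for the two players and `0` for an empty square) and the function names `new_board`, `apply_move`, and `winner` are assumptions made for illustration.

```python
def new_board():
    # 0 marks an empty square; players are 1 and -1 (an assumed encoding).
    return ((0, 0, 0), (0, 0, 0), (0, 0, 0))


def apply_move(board, move, player):
    """Return the new state reached when player marks the square at move.

    The result depends only on the inputs, so the same state and action
    always lead to the same next state.
    """
    row, col = move
    if board[row][col] != 0:
        raise ValueError("square already taken")
    rows = [list(r) for r in board]
    rows[row][col] = player
    return tuple(tuple(r) for r in rows)


def winner(board):
    """Return 1 or -1 if that player has three in a row, else 0."""
    lines = list(board) + list(zip(*board))  # rows, then columns
    lines.append(tuple(board[i][i] for i in range(3)))      # main diagonal
    lines.append(tuple(board[i][2 - i] for i in range(3)))  # anti-diagonal
    for line in lines:
        if line[0] != 0 and line[0] == line[1] == line[2]:
            return line[0]
    return 0


# Chain states: each action leads to a new state we can act in again.
state = new_board()
player = 1
for move in [(0, 0), (1, 1), (0, 1), (2, 2), (0, 2)]:
    state = apply_move(state, move, player)
    player = -player
print(winner(state))
```

Because `apply_move` returns a new board rather than changing the old one, earlier states stay available. That is convenient when a program wants to look ahead along several possible futures from the same position.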