Understanding reinforcement learning

As we have already seen, reinforcement learning is an on-the-go learning technique: the learner improves by acting and observing what happens, rather than by studying a prepared set of examples. Let's consider a simple analogy to understand it. Think about a nine-month-old baby trying to get up and walk.

The following diagram represents our analogy:

Figure: Baby walking analogy

The baby's first step is to try to get up by pressing their legs against the ground. Then, they try to balance and hold still. If this succeeds, you will see a smile on the baby's face. Next, the baby takes one step forward and tries to balance again. If the baby loses balance and falls, they may frown or cry. At that point, the baby may either give up, if they lack the motivation to walk, or try once again to get up and walk. If the baby manages two steps forward, you might see a bright smile, along with a happy sound. The more steps the baby takes, the more confident they become, and eventually they keep walking or, at times, even running:

Figure: Transition representation

To start with, the baby is called an agent. The two states that the agent assumes are represented in the preceding diagram. The goal of the agent is to get up and walk in an environment such as a living room or bedroom. The steps that the agent takes, such as getting up, balancing, and walking, are called actions. The agent's smiles and frowns are treated as rewards (positive and negative, respectively), and the happy sound the baby makes while walking is a further reward that probably indicates the agent is doing the right thing. The motivation that drives the agent either to get up and walk once again or to stay put on the ground illustrates the explore-exploit concept in reinforcement learning. What do we mean by this? Let's look at this in detail now.
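
Before going further, the vocabulary above (agent, state, action, reward, explore-exploit) can be made concrete with a toy sketch. Everything specific in it is an assumption made for illustration and does not come from the analogy itself: the state names `on_ground` and `standing`, the three action names, the reward values, and the success probabilities. The agent uses epsilon-greedy selection, one common way (not the only one) of balancing exploration and exploitation, and it only tracks the average immediate reward of each action in each state, so this is a simplified illustration rather than a full reinforcement learning algorithm.

```python
import random

# Assumed names for the two states and the available actions (hypothetical).
STATES = ("on_ground", "standing")
ACTIONS = ("stay_put", "get_up", "walk")


def environment_step(state, action, rng):
    """Toy living room: return (next_state, reward) for one action.

    Rewards stand in for smiles (positive) and frowns (negative).
    The probabilities and reward sizes are illustrative assumptions.
    """
    if action == "stay_put":
        return state, 0.0                  # nothing happens, no smile or frown
    if state == "on_ground":
        if action == "get_up" and rng.random() < 0.7:
            return "standing", 1.0         # managed to get up: smile
        return "on_ground", -1.0           # failed attempt: frown
    if action == "walk":
        if rng.random() < 0.8:
            return "standing", 2.0         # successful step: bright smile
        return "on_ground", -1.0           # lost balance and fell: frown
    return "standing", 0.0                 # get_up while already standing


def choose_action(values, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)                          # explore
    return max(ACTIONS, key=lambda a: values[(state, a)])   # exploit


def run_episode(steps=500, epsilon=0.1, seed=0):
    """Let the agent act for a number of steps and return what it learned."""
    rng = random.Random(seed)
    values = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    counts = {key: 0 for key in values}
    state, total_reward = "on_ground", 0.0
    for _ in range(steps):
        action = choose_action(values, state, epsilon, rng)
        next_state, reward = environment_step(state, action, rng)
        key = (state, action)
        counts[key] += 1
        # Running average of the immediate reward seen for this state-action pair.
        values[key] += (reward - values[key]) / counts[key]
        state, total_reward = next_state, total_reward + reward
    return values, total_reward
```

Calling `run_episode()` returns the learned value table and the total reward collected. With `epsilon=0.0` the agent never explores, so it keeps choosing `stay_put` (the first action among equally valued ones) and never discovers that getting up pays off. This mirrors the baby who gives up and stays on the ground. A small positive `epsilon` corresponds to occasionally trying again despite an earlier fall.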
