To grasp these key concepts better, we can use the example of a teenager learning to drive. Let's assume that they have never been in a car, that this is the first time they are seeing one, and that they don't have any knowledge of how it works. There are three approaches to learning:
- They are given the keys and have to learn all by themselves, with no supervision at all.
- Before being given the keys, they sit in the passenger seat for 100 hours and watch an expert drive in different weather conditions and on different roads.
- They observe the expert driving but, most importantly, they have sessions where the expert provides feedback while driving. For example, the expert can give real-time instructions on how to park the car, and give direct feedback on how to stay in a lane.
As you may have guessed, the first case is a reinforcement learning approach: the agent learns by trial and error, guided only by sparse rewards such as not damaging the car, not being yelled at by pedestrians, and so on.
The second case is a passive IL approach, in which competence is acquired purely by reproducing the expert's actions. Overall, it is very close to supervised learning, with the expert's demonstrations playing the role of labeled examples.
The third and final case is an active IL approach, which is imitation learning in the full sense. Here, during the training phase, the expert must instruct the learner on every move the learner makes.
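The difference between the second and third cases can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional "lane offset" state, the linear `expert_action` rule, and the helper names `fit_gain` and `rollout` are all hypothetical. The active loop is written in the style of a dataset-aggregation scheme, which is one possible way to realize expert feedback, not necessarily the method used later in this text. The reinforcement learning case is not shown.

```python
import random

def expert_action(offset):
    """Hypothetical expert: steer back toward the lane center (offset 0)."""
    return -0.5 * offset

def fit_gain(pairs):
    """Least-squares fit of action = gain * offset (supervised regression)."""
    num = sum(s * a for s, a in pairs)
    den = sum(s * s for s, _ in pairs)
    return num / den if den else 0.0

def rollout(policy, start, steps=20):
    """Drive for a fixed number of steps and return the offsets visited."""
    offset, visited = start, []
    for _ in range(steps):
        visited.append(offset)
        offset = offset + policy(offset)
    return visited

rng = random.Random(0)  # fixed seed so the run is repeatable

# Passive IL: the learner only watches. The states come from the
# expert's own driving, and the learner regresses on (state, action) pairs.
expert_states = rollout(expert_action, start=1.0)
passive_data = [(s, expert_action(s)) for s in expert_states]
passive_gain = fit_gain(passive_data)

# Active IL: the learner drives, and the expert labels every state the
# learner actually visits. The labeled data is aggregated and refit.
data = list(passive_data)
gain = passive_gain
for _ in range(5):
    learner_states = rollout(lambda s: gain * s, start=rng.uniform(-2, 2))
    data += [(s, expert_action(s)) for s in learner_states]
    gain = fit_gain(data)

print(passive_gain, gain, len(data))
```

In this toy setting the expert is exactly linear, so both approaches recover the same steering gain. The point of the sketch is only where the training states come from: the expert's trajectories in the passive case, and the learner's own trajectories, labeled by the expert, in the active case.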