We introduced the term active imitation in the previous example, with the teenager learning to drive a car. Specifically, we referred to it in the situation in which the learner was driving with additional feedback from the expert. In general, for active imitation, we mean learning from on-policy data with the actions assigned by the expert.
Speaking in terms of input s (the state or observation) and output a (the action), in passive learning, s and a both come from the expert. In active learning, s is sampled from the learner, and a is the action that the expert would have taken in state s. The objective of the newbie agent is to learn a mapping, .
Active learning with on-policy data allows the learner to fix small deviations from the expert trajectory that the learner wouldn't know how to correct with only passive imitation.