Efficiency

In the previous section, Choosing the appropriate algorithm, we saw that sample efficiency varies widely across algorithms. Moreover, from the previous chapters, we saw that even the more efficient methods, such as value-based learning, still require a substantial number of interactions with the environment to learn. Only model-based RL largely escapes this hunger for data. Unfortunately, model-based methods have other downsides, such as a lower performance bound.

For this reason, hybrid model-based and model-free approaches have been developed. However, these are difficult to engineer and remain impractical for real-world problems. As you can see, the efficiency problem is very hard to solve, but at the same time it is crucial to address if we want to deploy RL methods in the real world.

There are two alternative ways to deal with very slow environments such as the physical world. One is to train the agent in a lower-fidelity simulator first and then fine-tune it in the final environment. The other is to train the agent directly in the final environment, but to transfer some prior related knowledge so that it doesn't have to learn the task from scratch. It's like learning to drive when you've already trained your sensory system. In both cases, because knowledge is transferred from one environment to another, this methodology is called transfer learning. We'll elaborate on it very soon in the Advanced techniques section, and a brief sketch follows here.
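
To make the idea concrete, here is a minimal sketch of the fine-tuning flavor of transfer learning, assuming a PyTorch policy network. All names, layer sizes, and the checkpoint file name are illustrative assumptions, not the API of any specific RL library; the point is simply to reuse pretrained feature layers and adapt only the task-specific head in the expensive final environment.

```python
import torch
import torch.nn as nn

# Hypothetical policy network: the early layers learn general
# features (the "sensory system"), the last layer is the
# task-specific action head.
policy = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),   # shared feature layers
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 4),              # task-specific action head
)

# Phase 1: pretrain in the cheap, low-fidelity simulator
# (training loop omitted; any RL algorithm works here), then
# save the weights, for example:
# torch.save(policy.state_dict(), "sim_policy.pt")

# Phase 2: transfer to the real environment. Reload the weights,
# freeze the general feature layers, and fine-tune only the head
# with a small learning rate, so far fewer real interactions are
# needed than when learning from scratch.
# policy.load_state_dict(torch.load("sim_policy.pt"))
for layer in list(policy.children())[:4]:
    for param in layer.parameters():
        param.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in policy.parameters() if p.requires_grad), lr=1e-4
)
```

Freezing the general layers means only a small fraction of the parameters must be adapted with real-world data, which is exactly what makes fine-tuning so much more sample efficient than training from scratch.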
