Questions

  1. Would you use a model-based or a model-free algorithm if you had only 10 games in which to train your agent to play checkers?
  2. What are the disadvantages of model-based algorithms?
  3. If a model of the environment is unknown, how can it be learned?
  4. Why are data aggregation methods used?
  5. How does ME-TRPO stabilize training?
  6. How does using an ensemble of models improve policy learning?