The solution

The key innovations introduced by DQN are a replay buffer, which addresses the problem of correlated training data, and a separate target network, which addresses the non-stationarity problem.
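
The two ideas can be illustrated with a minimal sketch in plain Python. This is not the implementation from the original DQN work; the names `ReplayBuffer`, `sync_target`, `online_params`, and `target_params` are hypothetical, and the "network" is just a dictionary of weights standing in for a real model. The buffer stores transitions and returns uniformly sampled mini-batches, so consecutive (correlated) steps are unlikely to be trained on together. The target parameters are a snapshot of the online parameters that stays fixed between syncs, so the values used to build training targets do not shift with every update.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement mixes transitions from
        # different time steps, which weakens their temporal correlation.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def sync_target(online_params):
    """Return an independent snapshot of the online parameters."""
    return copy.deepcopy(online_params)


# Fill a small buffer; with capacity 3, only the last 3 transitions remain.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(step, 0, 1.0, step + 1, False)

batch = buffer.sample(2)

# Hypothetical parameters: a dict of weights standing in for a network.
online_params = {"w": [0.1, 0.2]}
target_params = sync_target(online_params)
online_params["w"][0] = 0.5  # an online update leaves the snapshot unchanged
```

In a full training loop, `sync_target` would typically be called only every fixed number of steps, and the sampled batch would be used to compute the loss against values produced with `target_params`.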
