Solving Acrobot

We'll test ESBAS on yet another Gym environment—Acrobot-v1. As described in the OpenAI Gym documentation, the Acrobot system includes two joints and two links, where the joint between the two links is actuated. Initially, the links are hanging downward, and the goal is to swing the end of the lower link up to a given height. The following diagram shows the movement of the acrobot in a brief sequence of timesteps, from the start to an end position:

Sequence of the acrobot's movement

The portfolio comprises three deep neural networks of different sizes: a small network with a single hidden layer of size 64, a medium network with two hidden layers of size 16, and a large network with two hidden layers of size 64. Furthermore, we set the algorithm's hyperparameter to the same value that is used in the paper.
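To make the three architectures concrete, the following minimal sketch lists their hidden-layer sizes and works out the shape of each weight matrix. It is an illustration under stated assumptions, not the book's implementation: the names `PORTFOLIO`, `layer_shapes`, and `count_parameters` are hypothetical, the networks are assumed to be plain fully connected layers with biases, and each network is assumed to take the 6-dimensional Acrobot-v1 observation as input and produce one output per each of its 3 discrete actions.

```python
# Illustrative sketch only: names and structure are assumptions, not the book's code.

OBS_DIM = 6     # Acrobot-v1 observations have 6 components
N_ACTIONS = 3   # Acrobot-v1 has 3 discrete actions

# Hidden-layer sizes of the three networks in the portfolio, as described in the text.
PORTFOLIO = {
    "small": [64],        # one hidden layer of size 64
    "medium": [16, 16],   # two hidden layers of size 16
    "large": [64, 64],    # two hidden layers of size 64
}


def layer_shapes(hidden_sizes, in_dim=OBS_DIM, out_dim=N_ACTIONS):
    """Return the (fan_in, fan_out) shape of every dense layer, output layer included."""
    sizes = [in_dim, *hidden_sizes, out_dim]
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(hidden_sizes, in_dim=OBS_DIM, out_dim=N_ACTIONS):
    """Count weights plus biases, assuming fully connected layers with a bias term."""
    return sum(
        fan_in * fan_out + fan_out
        for fan_in, fan_out in layer_shapes(hidden_sizes, in_dim, out_dim)
    )


if __name__ == "__main__":
    for name, hidden in PORTFOLIO.items():
        print(name, layer_shapes(hidden), count_parameters(hidden))
```

Under these assumptions, the "medium" network ends up with fewer trainable parameters than the "small" one, because two narrow layers of 16 units hold fewer weights than a single layer of 64 units. The labels therefore track depth and width rather than raw parameter count.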
