Loading the expert inference model

The expert should be a policy that takes a state as input and returns the best action. In practice, though, it can be anything. For these experiments, we used an agent trained with Proximal Policy Optimization (PPO) as the expert. Strictly speaking, using an RL-trained agent as the expert defeats the purpose of imitation learning, but we adopted this solution for academic reasons, as it simplifies integration with the imitation learning algorithms.

The expert's model trained with PPO has been saved on file so that we can easily restore it with its trained weights. Three steps are required to restore the graph and make it usable:

  1. Import the meta graph. The computational graph can be restored with tf.train.import_meta_graph.
  2. Restore the weights. Now, we have to load the pretrained weights onto the computational graph we have just imported. The weights were saved in the latest checkpoint, and they can be restored with saver.restore(session, tf.train.latest_checkpoint(checkpoint_dir)), where saver is the object returned by tf.train.import_meta_graph and checkpoint_dir is the directory containing the checkpoint files.
  3. Access the output tensors. The tensors of the restored graph are accessed with graph.get_tensor_by_name(tensor_name), where tensor_name is the tensor's name in the graph.

The following lines of code summarize the entire process:

    def expert():
        graph = tf.get_default_graph()
        sess_expert = tf.Session(graph=graph)

        saver = tf.train.import_meta_graph('expert/model.ckpt.meta')
        saver.restore(sess_expert, tf.train.latest_checkpoint('expert/'))

        p_argmax = graph.get_tensor_by_name('actor_nn/max_act:0')
        obs_ph = graph.get_tensor_by_name('obs:0')

Then, because we are only interested in a simple function that, given a state, returns the expert action, we can design expert() so that it returns exactly such a function. Thus, inside expert(), we define an inner function called expert_policy(state) and return it as the output of expert():

        def expert_policy(state):
            act = sess_expert.run(p_argmax, feed_dict={obs_ph: [state]})
            return np.squeeze(act)

        return expert_policy
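The factory pattern used here can be seen in isolation. The sketch below mimics the structure of expert() with a stub linear policy, with no TensorFlow involved (make_expert and its weights are purely illustrative): the inner function keeps a reference to the resources created by the outer one, just as expert_policy keeps the session and the tensors alive after expert() returns.

```python
import numpy as np

def make_expert(weights):
    # The outer function sets up expensive resources once
    # (here just an array; in expert(), the session and tensors).
    w = np.asarray(weights)

    def expert_policy(state):
        # The inner function closes over `w`, exactly as
        # expert_policy closes over sess_expert and p_argmax.
        scores = w @ np.asarray(state)
        return int(np.argmax(scores))

    return expert_policy

# The caller only ever sees a plain state -> action function.
policy = make_expert([[1.0, 0.0], [0.0, 2.0]])
action = policy([0.5, 1.0])  # greedy action under this toy policy
```

This is why the imitation learning code never needs to know anything about TensorFlow sessions or tensor names: it just calls the returned function with a state.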
