Summary

This chapter covers the basic principles of Reinforcement Learning and the fundamental Q-learning algorithm.

The distinctive feature of Q-learning is its capacity to trade off immediate rewards against delayed rewards. Q-learning at its simplest uses a table to store the value of every state/action pair. This approach quickly becomes infeasible as the state and action spaces of the system being monitored or controlled grow.
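The tabular update at the heart of Q-learning can be sketched as follows. The 5-state chain environment, the hyperparameters, and the `step` helper are illustrative assumptions for this sketch, not code from the chapter:

```python
import numpy as np

# A minimal sketch of tabular Q-learning on a hypothetical 5-state chain:
# states 0..4, action 0 moves left, action 1 moves right, and reaching
# state 4 yields reward 1 and ends the episode.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))  # one table cell per (state, action) pair

def step(state, action):
    """Deterministic environment transition for the toy chain."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

for episode in range(200):
    state, done, steps = 0, False, 0
    while not done and steps < 100:
        if rng.random() < EPSILON:                 # explore occasionally
            action = int(rng.integers(N_ACTIONS))
        else:                                      # otherwise exploit the table
            best = np.flatnonzero(Q[state] == Q[state].max())
            action = int(rng.choice(best))         # break ties randomly
        next_state, reward, done = step(state, action)
        # Core update: immediate reward plus discounted best future value.
        target = reward + GAMMA * np.max(Q[next_state]) * (not done)
        Q[state, action] += ALPHA * (target - Q[state, action])
        state, steps = next_state, steps + 1

# The learned greedy policy should move right in every non-terminal state.
print(np.argmax(Q[:-1], axis=1))
```

The discount factor `GAMMA` is what lets the update weigh a delayed reward against an immediate one; note also that the table `Q` has one entry per state/action pair, which is exactly what stops scaling as those spaces grow.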

We can overcome this problem by using a neural network as a function approximator that takes the state and action as input and outputs the corresponding Q-value.
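To make the shift from table to function approximation concrete, here is a hedged sketch using a linear approximator in plain NumPy rather than the chapter's TensorFlow network; the feature map `phi`, the sample transition, and the learning rate are all assumptions for illustration:

```python
import numpy as np

# Hypothetical linear Q-function approximator: Q(s, a) ~ w . phi(s, a).
# Instead of one table cell per (state, action) pair, we learn a weight
# vector over features -- the same idea a neural network scales up.
N_STATES, N_ACTIONS = 5, 2
ALPHA = 0.1

def phi(state, action):
    """One-hot (state, action) features; a real network would learn richer ones."""
    f = np.zeros(N_STATES * N_ACTIONS)
    f[state * N_ACTIONS + action] = 1.0
    return f

w = np.zeros(N_STATES * N_ACTIONS)

def q(state, action):
    return float(w @ phi(state, action))

# One semi-gradient Q-learning update for a single observed transition
# (s=3, a=1, r=1.0, terminal); the gradient of w . phi w.r.t. w is phi itself.
s, a, r = 3, 1, 1.0
target = r  # terminal transition: no bootstrapped future value
w += ALPHA * (target - q(s, a)) * phi(s, a)
print(q(3, 1))  # → 0.1 after one update with ALPHA = 0.1
```

With one-hot features this reduces exactly to the table, so nothing is gained yet; the payoff comes when `phi` is replaced by compact, learned features, which is precisely the role the neural network plays in the chapter.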

Following this idea, we implemented Q-learning with a neural network using the TensorFlow framework and OpenAI Gym, a toolkit for developing and comparing Reinforcement Learning algorithms.

Our journey into Deep Learning with TensorFlow ends here.

Deep learning is a very productive research area; there are many books, courses, and online resources that can help you go deeper into its theory and programming. In addition, TensorFlow provides a rich set of tools for working with deep learning models.

We hope you will become part of the TensorFlow community, which is very active and welcomes enthusiastic newcomers!
