LSTM networks

Long Short-Term Memory (LSTM) is a special Recurrent Neural Network architecture, originally conceived by Hochreiter and Schmidhuber in 1997. This type of network has recently been rediscovered in the context of deep learning because it is largely immune to the vanishing gradient problem, and it offers excellent results and performance. LSTM-based networks are ideal for the prediction and classification of temporal sequences, and they are replacing many traditional approaches to deep learning.

An LSTM network is composed of cells (LSTM blocks) linked to each other. Each LSTM block contains three types of gate: an Input gate, an Output gate, and a Forget gate, which implement the functions of writing, reading, and resetting on the cell's memory, respectively. These gates are not binary but analog, generally managed by a sigmoid activation function mapped to the range [0, 1], where 0 indicates total inhibition and 1 indicates total activation.
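As a concrete illustration, each gate can be computed as a sigmoid of an affine combination of the current input and the previous block output. The following minimal NumPy sketch shows the idea; the weight names and sizes here are hypothetical choices for illustration, not code from this book:

import numpy as np

def sigmoid(z):
    # Maps any real value into [0, 1]: 0 = total inhibition, 1 = total activation
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes: 4 input features, 3 memory cells
n_in, n_hid = 4, 3
rng = np.random.default_rng(42)

# One input-weight matrix, recurrent-weight matrix, and bias per gate:
# "i" = Input gate, "f" = Forget gate, "o" = Output gate
W = {g: 0.1 * rng.standard_normal((n_hid, n_in)) for g in ("i", "f", "o")}
U = {g: 0.1 * rng.standard_normal((n_hid, n_hid)) for g in ("i", "f", "o")}
b = {g: np.zeros(n_hid) for g in ("i", "f", "o")}

x_t = rng.standard_normal(n_in)   # current input
h_prev = np.zeros(n_hid)          # previous block output

# Analog gate activations in [0, 1], not binary switches
i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # write gate
f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # reset gate
o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # read gate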

The presence of these gates allows LSTM cells to remember information for an indefinite time: if the Input gate is below the activation threshold, the cell retains the previous state, and if it is enabled, the current input is combined with the stored value. As its name suggests, the Forget gate resets the current state of the cell (when its value is driven to 0), and the Output gate decides whether the value in the cell must be passed to the output or not.
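Putting the gates together, one forward step of the cell can be sketched as follows, continuing the variables from the previous sketch. This follows the standard LSTM update equations; it is an illustrative sketch, not necessarily the exact variant shown in the block diagram below:

# Candidate value that may be written into the memory cell
W_c = 0.1 * rng.standard_normal((n_hid, n_in))
U_c = 0.1 * rng.standard_normal((n_hid, n_hid))
c_tilde = np.tanh(W_c @ x_t + U_c @ h_prev)

c_prev = np.zeros(n_hid)  # previous cell state

# Forget gate scales the old state (f_t near 0 resets the cell);
# Input gate scales how much of the new candidate is written in
c_t = f_t * c_prev + i_t * c_tilde

# Output gate decides how much of the cell content is exposed as output
h_t = o_t * np.tanh(c_t)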

Block diagram of an LSTM cell