Long Short-Term Memory units

Special thanks to Sepp Hochreiter and Jürgen Schmidhuber, who published a paper titled Long Short-Term Memory in 1997, which described a method for augmenting RNNs with a more advanced form of memory.

So, what does memory actually mean in this context? LSTMs take the dumb RNN cell and add another neural network (consisting of inputs, operations, and activations) that is selective about which information is carried from one timestep to the next. The cell does this by maintaining a hidden state (like a vanilla RNN cell) and a new cell state, both of which are then fed into the next step. The gates, as indicated in the following diagram, learn which information should be kept, updated, or discarded at each step:

Here, we can see that multiple gates are contained within r(t), z(t), and h(t). Each has its own activation function: sigmoid for r(t) and z(t), and tanh for h(t).
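The gating mechanism described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not a production implementation; it uses the common f/i/o naming (forget, input, and output gates, plus a tanh candidate), which maps onto the sigmoid- and tanh-activated gates shown in the diagram. The function name `lstm_step` and the single stacked weight matrix `W` are assumptions made for compactness:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM timestep.

    W maps the concatenated [x; h_prev] vector to the four gate
    pre-activations, stacked as [forget, input, output, candidate].
    """
    hidden = h_prev.shape[0]
    z = np.concatenate([x, h_prev])
    acts = W @ z + b
    f = sigmoid(acts[0:hidden])           # forget gate: what to drop from the cell state
    i = sigmoid(acts[hidden:2*hidden])    # input gate: what to write into the cell state
    o = sigmoid(acts[2*hidden:3*hidden])  # output gate: what to expose in the hidden state
    g = np.tanh(acts[3*hidden:4*hidden])  # candidate cell values
    c = f * c_prev + i * g                # new cell state
    h = o * np.tanh(c)                    # new hidden state
    return h, c
```

Note how the sigmoid gates squash their outputs to (0, 1), so multiplying by them acts as a soft on/off switch over each element of the state, while tanh keeps the candidate values in (-1, 1).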
