LSTM and GRU networks

As we saw, the recursive structure of RNNs causes problems with gradients: they either vanish or explode. One workaround is to introduce forget gates, which selectively delete some of the old information. This helps the network keep track of relevant information without destroying the gradients, and better preserve important data observed a long time ago.
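
To make this concrete, here is a minimal sketch of a single LSTM step written with NumPy. The weight layout, sizes, and variable names are illustrative assumptions rather than a reference implementation; the point is only to show how the forget gate f scales the previous cell state before new information is written in.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [h_prev; x] to the four gate pre-activations."""
    z = W @ np.concatenate([h_prev, x]) + b
    H = h_prev.shape[0]
    f = sigmoid(z[0:H])       # forget gate: how much of the old cell state to keep
    i = sigmoid(z[H:2*H])     # input gate: how much new information to write
    o = sigmoid(z[2*H:3*H])   # output gate: how much of the cell state to expose
    g = np.tanh(z[3*H:4*H])   # candidate cell update
    c = f * c_prev + i * g    # forget gate selectively deletes old information
    h = o * np.tanh(c)        # new hidden state (the output at this step)
    return h, c

# Hypothetical sizes, used only for illustration.
rng = np.random.default_rng(0)
H, X = 4, 3
W = rng.normal(scale=0.1, size=(4 * H, H + X))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=X), h, c, W, b)
```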

Both LSTM and GRU share the same design principle as plain recurrent neural networks: given an input, they compute an output, and a black box updates the internal state. Keeping this high-level view in mind is crucial for understanding the bigger picture.
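
The following sketch illustrates this shared interface using PyTorch's LSTMCell and GRUCell; the sizes and the random sequence are made up for illustration. In both loops the cell acts as a black box that takes the current input and the previous state and returns an output together with an updated state.

```python
import torch
import torch.nn as nn

x_seq = torch.randn(10, 3)                      # 10 time steps, input size 3 (illustrative)

lstm_cell = nn.LSTMCell(input_size=3, hidden_size=4)
gru_cell = nn.GRUCell(input_size=3, hidden_size=4)

# LSTM: the state is a pair (h, c); the cell updates it at every step.
h, c = torch.zeros(1, 4), torch.zeros(1, 4)
for x in x_seq:
    h, c = lstm_cell(x.unsqueeze(0), (h, c))    # output is h, state is (h, c)

# GRU: the state is just h, but the loop looks exactly the same.
h = torch.zeros(1, 4)
for x in x_seq:
    h = gru_cell(x.unsqueeze(0), h)             # output is h, state is h
```

The only difference visible from the outside is the shape of the state: the LSTM carries a pair (h, c), while the GRU carries a single hidden vector h.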
