Batch normalization

Batch normalization, as its name suggests, normalizes the outputs of a layer across each batch so that they are distributed around a mean of 0. This allows the network to use a higher learning rate while still avoiding the vanishing or exploding gradient problem. Because the values flowing through the layer are normalized, there are fewer shifts, or less training wobble, as we have seen before.
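
To make the idea concrete, here is a minimal NumPy sketch (my own illustration, not code from this chapter) that normalizes a small, made-up batch of layer outputs to zero mean and unit variance per feature:

import numpy as np

# A made-up batch of layer outputs: 4 samples, 3 features
x = np.array([[1.0,  2.0, 10.0],
              [2.0,  0.0, 12.0],
              [3.0, -2.0,  8.0],
              [4.0,  4.0, 14.0]])

eps = 1e-5                         # small constant for numerical stability
mean = x.mean(axis=0)              # per-feature mean over the batch
var = x.var(axis=0)                # per-feature variance over the batch
x_hat = (x - mean) / np.sqrt(var + eps)

print(x_hat.mean(axis=0))          # approximately 0 for every feature
print(x_hat.std(axis=0))           # approximately 1 for every feature

The real layer also learns a scale and shift (gamma and beta) so the network can adjust or undo the normalization if that helps, but the centering step above is the core of the technique.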

By normalizing a layer's outputs, we allow the network to use a higher learning rate and thus train faster. We can also avoid, or at least reduce, the need for Dropout. You will see that we use the standard layer, shown here, to normalize the layers:

model.add(BatchNormalization(momentum=0.8))

Recall from our discussions of optimizers that momentum controls how much of the previous update carries over into the next one, smoothing the training gradient. In this case, momentum instead refers to how quickly the running estimates of the mean and variance of the normalized distribution are updated from batch to batch; values closer to 1 make those running statistics change more slowly.
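
As a minimal sketch of how the layer slots into a model (the import path, layer sizes, and activation settings here are my own illustrative choices, not this chapter's network), we might write:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, BatchNormalization, LeakyReLU

model = Sequential()
model.add(Input(shape=(100,)))                # illustrative input size
model.add(Dense(256))
model.add(BatchNormalization(momentum=0.8))   # normalize this layer's outputs
model.add(LeakyReLU(alpha=0.2))               # activation (covered in the next section)
model.add(Dense(512))
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))

model.summary()

With momentum=0.8, each new batch moves the running mean and variance 20% of the way toward that batch's statistics; it is this running average that the layer uses at inference time.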

In the next section, we look at another special layer called LeakyReLU.
