Summary

We have seen how to implement FFNN architectures, which are characterized by a set of input units, a set of output units, and one or more hidden layers that connect the input layer to the output layer. We have also seen how to organize the network layers so that the connections between layers are complete and run in a single direction: each unit receives a signal from all the units of the previous layer and transmits its output value, suitably weighted, to all the units of the next layer.
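The following is a minimal NumPy sketch (not the chapter's code; the layer sizes and weights are illustrative) of this fully connected, feed-forward flow: each layer's output is an activation applied to a weighted sum of all the previous layer's units.

```python
# Illustrative forward pass through two fully connected layers.
import numpy as np

def dense_forward(x, weights, biases, activation):
    """One fully connected layer: activation(W @ x + b)."""
    return activation(weights @ x + biases)

relu = lambda z: np.maximum(0.0, z)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                           # 4 input units
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)    # input -> hidden (8 units)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)    # hidden -> output (3 units)

h = dense_forward(x, W1, b1, relu)               # hidden activations
y = dense_forward(h, W2, b2, lambda z: z)        # raw output scores
print(y.shape)                                   # (3,)
```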

We have also seen how to define an activation function (for example, sigmoid, ReLU, tanh, and softmax) for each layer, where the choice of an activation function depends on the architecture and the problem being addressed.
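As a hedged sketch of how such choices look in practice (using tf.keras, with illustrative layer sizes rather than the chapter's exact models), hidden layers might use ReLU or tanh, while a classification output layer uses softmax to produce class probabilities:

```python
import tensorflow as tf

# A small feed-forward classifier with one activation function per layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(128, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),  # one unit per class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```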

We then implemented four different FFNN models. The first model had a single hidden layer with a softmax activation function. The other three, more complex models had five hidden layers in total, but used different activation functions. We also saw how to implement a deep MLP and a DBN with TensorFlow to solve a classification task; using these implementations, we managed to achieve above 90% accuracy. Finally, we discussed how to tune the hyperparameters of DNNs for better, more optimized performance.
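A minimal illustration of such tuning (an assumed setup, not the chapter's code) is a simple grid search: train a small MLP for each combination of learning rate and hidden-layer size and keep the configuration with the best validation accuracy.

```python
import itertools
import tensorflow as tf

# Load and flatten MNIST images for a dense network.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

best = (None, 0.0)
for lr, units in itertools.product([1e-3, 1e-4], [64, 256]):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(units, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train, epochs=2,
                        validation_split=0.1, verbose=0)
    val_acc = history.history["val_accuracy"][-1]
    if val_acc > best[1]:
        best = ((lr, units), val_acc)

print("best (lr, units):", best[0], "validation accuracy:", best[1])
```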

Although a regular FFNN, such as an MLP, works fine for small images (for example, MNIST or CIFAR-10), it breaks down for larger images because of the huge number of parameters required. For example, a 100×100 image has 10,000 pixels, and if the first layer has just 1,000 neurons (which already severely restricts the amount of information transmitted to the next layer), this means 10 million connections. And that is just for the first layer.
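A quick back-of-the-envelope check of these numbers shows how the connection count of a fully connected first layer grows quadratically with the image side length (the 300×300 case is an added illustration, not a figure from the chapter):

```python
# Connections from an image to a fully connected first layer of 1,000 units.
neurons = 1_000
for side in (28, 100, 300):
    pixels = side * side
    connections = pixels * neurons
    print(f"{side}x{side} image -> {connections:,} connections")
# 28x28   ->    784,000 connections
# 100x100 -> 10,000,000 connections
# 300x300 -> 90,000,000 connections
```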

Importantly, a DNN has no prior knowledge of how pixels are organized, so it does not know that nearby pixels are related to one another. The architecture of a CNN embeds this prior knowledge. Lower layers typically identify features in small areas of the images, while higher layers combine the lower-level features into larger features. This works well with most natural images, giving CNNs a decisive head start compared to DNNs.

In the next chapter, we will look further into the complexity of neural network models by introducing CNNs, which have had a major impact on deep learning. We will study their main features and see some implementation examples.
