In this chapter, we took a quick look at two fundamental topics related to optimizing the computation of DNNs.
The first topic explained how to use GPUs with TensorFlow to implement DNNs. DNNs are structured in a very uniform manner: at each layer of the network, thousands of identical artificial neurons perform the same computation. Hence, the architecture of a DNN fits well with the kinds of computation that a GPU can perform efficiently.
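The uniformity described above can be sketched in plain Python (the toy weights and values here are illustrative, not from the chapter): every neuron in a layer repeats the identical weighted-sum-plus-bias operation, so a whole layer reduces to one uniform, data-parallel computation, which is exactly the pattern a GPU executes efficiently.

```python
def neuron(weights, inputs, bias):
    # The identical computation repeated by every neuron in the layer.
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def layer(weight_matrix, inputs, biases):
    # The same operation applied across all neurons of the layer;
    # on a GPU, these per-neuron computations run in parallel.
    return [neuron(w, inputs, b) for w, b in zip(weight_matrix, biases)]

inputs = [1.0, 2.0]
weight_matrix = [[0.5, 0.5], [1.0, -1.0]]  # one weight row per neuron
biases = [0.0, 1.0]
print(layer(weight_matrix, inputs, biases))  # -> [1.5, 0.0]
```

In a real TensorFlow program, this loop over neurons becomes a single matrix multiplication that the framework dispatches to the GPU as one kernel.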
The second topic introduced distributed computing. It was originally used to perform very complex calculations that a single machine could not complete. Likewise, splitting the analysis of large amounts of data among different nodes is often the best strategy when faced with such a challenge.
At the same time, DL problems can take advantage of distributed computing. DL computations can be divided into multiple tasks: each task is given a fraction of the data and returns a result that must be recombined with the results produced by the other tasks. Alternatively, in more complex situations, a different computation can be assigned to each machine.
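The split/recombine pattern described above can be sketched with Python's standard library rather than a real cluster (the function names and workload here are illustrative only): the data is divided into fractions, each worker task processes its own chunk, and the partial results are recombined at the end.

```python
from concurrent.futures import ProcessPoolExecutor

def partial_task(chunk):
    # Each task works on its own fraction of the data
    # (here, a toy sum-of-squares workload).
    return sum(x * x for x in chunk)

def split(data, n_tasks):
    # Divide the data into roughly equal fractions, one per task.
    k = (len(data) + n_tasks - 1) // n_tasks
    return [data[i:i + k] for i in range(0, len(data), k)]

if __name__ == "__main__":
    data = list(range(10))
    with ProcessPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(partial_task, split(data, 4)))
    # Recombine the partial results into the final answer.
    print(sum(partials))  # -> 285
```

In a real distributed setting, the workers would be separate machines and `partial_task` would be a gradient computation on a shard of the training data, but the structure is the same.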
Finally, in the last example, we showed how TensorFlow computation can be distributed across multiple nodes.