ReLU classifier

The last architectural change improved the accuracy of our model, but we can do even better by replacing the sigmoid activation function with the Rectified Linear Unit (ReLU), shown in the following figure:

ReLU function

A Rectified Linear Unit (ReLU) computes the function f(x) = max(0, x). ReLU is computationally cheap because it does not require any exponential computation, such as that required by the sigmoid or tanh activations. Furthermore, it was found to greatly accelerate the convergence of stochastic gradient descent compared to the sigmoid/tanh functions.
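The following minimal sketch illustrates this: it evaluates tf.nn.relu on a few sample values and compares it with the function max(0, x) written out explicitly with tf.maximum. The sample values are chosen only for illustration.

import tensorflow as tf

# ReLU is an element-wise max(0, x): negative inputs are clipped to zero,
# positive inputs pass through unchanged.
x = tf.constant([-2.0, -0.5, 0.0, 1.5, 3.0])

relu_builtin = tf.nn.relu(x)       # TensorFlow's built-in ReLU
relu_manual = tf.maximum(0.0, x)   # the same function written out explicitly

with tf.Session() as sess:
    print(sess.run(relu_builtin))  # [0.  0.  0.  1.5 3. ]
    print(sess.run(relu_manual))   # identical output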

To use the ReLU function, we simply change the definitions of the first four layers in the previously implemented model.

First layer output:

Y1 = tf.nn.relu(tf.matmul(XX, W1) + B1)  

Second layer output:

Y2 = tf.nn.relu(tf.matmul(Y1, W2) + B2) 

Third layer output:

Y3 = tf.nn.relu(tf.matmul(Y2, W3) + B3)  

Fourth layer output:

Y4 = tf.nn.relu(tf.matmul(Y3, W4) + B4)  

Of course, tf.nn.relu is TensorFlow's implementation of the ReLU function.
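For context, the following sketch shows how these four definitions fit into the full network. The layer sizes, the weight and bias initialization, and the softmax output layer shown here are assumptions made for illustration; adapt them to the model you implemented in the previous section.

import tensorflow as tf

# Illustrative hidden-layer sizes (assumed for this sketch); the input is a
# flattened 28x28 MNIST image and the output is one of 10 digit classes.
L1, L2, L3, L4 = 200, 100, 60, 30

XX = tf.placeholder(tf.float32, [None, 784])

# Truncated-normal weights and small positive biases are a common choice
# with ReLU, so that units start in the active (non-zero) region.
W1 = tf.Variable(tf.truncated_normal([784, L1], stddev=0.1))
B1 = tf.Variable(tf.ones([L1]) / 10)
W2 = tf.Variable(tf.truncated_normal([L1, L2], stddev=0.1))
B2 = tf.Variable(tf.ones([L2]) / 10)
W3 = tf.Variable(tf.truncated_normal([L2, L3], stddev=0.1))
B3 = tf.Variable(tf.ones([L3]) / 10)
W4 = tf.Variable(tf.truncated_normal([L3, L4], stddev=0.1))
B4 = tf.Variable(tf.ones([L4]) / 10)
W5 = tf.Variable(tf.truncated_normal([L4, 10], stddev=0.1))
B5 = tf.Variable(tf.zeros([10]))

# The four hidden layers now use ReLU instead of sigmoid.
Y1 = tf.nn.relu(tf.matmul(XX, W1) + B1)
Y2 = tf.nn.relu(tf.matmul(Y1, W2) + B2)
Y3 = tf.nn.relu(tf.matmul(Y2, W3) + B3)
Y4 = tf.nn.relu(tf.matmul(Y3, W4) + B4)

# The output layer is unchanged: logits followed by a softmax over 10 classes.
Ylogits = tf.matmul(Y4, W5) + B5
Y = tf.nn.softmax(Ylogits)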

The accuracy of the model is almost 98%, as you can see by running the network:

>>>  
Loading data/train-images-idx3-ubyte.mnist
Loading data/train-labels-idx1-ubyte.mnist
Loading data/t10k-images-idx3-ubyte.mnist
Loading data/t10k-labels-idx1-ubyte.mnist
Epoch: 0
Epoch: 1
Epoch: 2
Epoch: 3
Epoch: 4
Epoch: 5
Epoch: 6
Epoch: 7
Epoch: 8
Epoch: 9
Accuracy: 0.9789
done
>>>