After stacking up several convolution and pooling steps, we follow them with a fully connected layer. We feed the high-level features extracted from the input image into this layer, which performs the actual classification based on those features:
For example, in the case of the digit classification task, we can follow the convolution and pooling steps with a fully connected layer that has 1,024 neurons and ReLU activation to perform the actual classification. This fully connected layer expects its input in the following format:
[batch_size, features]
So, we need to reshape, or flatten, our input feature map from pool_layer2 to match this format. We can use the following line of code to reshape the output:
pool2_flat = tf.reshape(pool_layer2, [-1, 7 * 7 * 64])
In this reshape function, we have used -1 to indicate that the batch size will be determined dynamically, and each example in the pool_layer2 output has a width of 7 and a height of 7 with 64 channels, which gives 7 * 7 * 64 = 3,136 features per example.
So the final output of this reshape operation will be as follows:
[batch_size, 3136]
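The effect of this flattening step can be sketched with NumPy, whose reshape semantics (including the -1 placeholder) mirror tf.reshape. The batch size of 32 here is purely illustrative:

```python
import numpy as np

# Hypothetical pooled feature map: a batch of 32 examples,
# each 7 x 7 spatial with 64 channels, as produced by pool_layer2.
pooled = np.zeros((32, 7, 7, 64), dtype=np.float32)

# Flatten every example into a single feature vector;
# -1 lets the batch dimension be inferred automatically.
flat = pooled.reshape(-1, 7 * 7 * 64)

print(flat.shape)  # (32, 3136)
```

Each example's 7 x 7 x 64 feature map is now a single vector of 3,136 values, matching the [batch_size, features] format the fully connected layer expects.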
Finally, we can use TensorFlow's tf.layers.dense() function to define our fully connected layer with the required number of neurons (units) and the desired activation function:
dense_layer = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
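Under the hood, a dense layer is just an affine transformation followed by the activation: relu(x @ W + b), where tf.layers.dense creates and manages the weight matrix W and bias b internally. A minimal NumPy sketch of this computation (with illustrative random weights) looks as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size = 32
# Flattened pooled features, as fed into the dense layer.
flat = rng.standard_normal((batch_size, 7 * 7 * 64)).astype(np.float32)

# The dense layer's parameters: a weight matrix mapping 3,136 input
# features to 1,024 units, plus one bias per unit.
W = rng.standard_normal((7 * 7 * 64, 1024)).astype(np.float32) * 0.01
b = np.zeros(1024, dtype=np.float32)

# Affine map followed by the ReLU activation.
dense_out = np.maximum(flat @ W + b, 0.0)

print(dense_out.shape)  # (32, 1024)
```

Each of the 1,024 neurons sees all 3,136 input features, which is what makes the layer "fully connected".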