Step 4 – Constructing the CNN layers

Once we have defined the CNN hyperparameters, the next task is to implement the CNN network. As you can guess, for our task, we will construct a CNN network having three convolutional layers, a flattened layer, and two fully connected layers (refer to LayersConstructor.py). Moreover, we need to define the weights and the biases as well. Furthermore, we will have implicit max-pooling layers too. First, let's define the weights. In the following, we have the new_weights() method, which takes the desired shape of the weight tensor and returns a TensorFlow Variable initialized with values drawn from a truncated normal distribution:

def new_weights(shape): 
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05)) 

Then we define the biases using the new_biases() method:

def new_biases(length): 
    return tf.Variable(tf.constant(0.05, shape=[length])) 
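
For illustration, creating the parameters for a hypothetical bank of 32 filters of size 5 x 5 operating on a 3-channel (RGB) input would look as follows (the shape values here are just an example, not the ones used in the project):

# Hypothetical example: 32 filters of size 5 x 5 over 3 input channels.
# The shape format is [height, width, in_channels, out_channels].
weights = new_weights(shape=[5, 5, 3, 32])
biases = new_biases(length=32)  # one bias per filter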

Now let's define a method, new_conv_layer(), for constructing a convolutional layer. The method takes the input batch, the number of input channels, the filter size, and the number of filters, along with a use_pooling flag (if true, a 2 x 2 max pooling is applied) to construct a new convolutional layer. The workflow of the method is as follows:

  1. Define the shape of the filter weights for the convolution, in the format expected by the TensorFlow API.
  2. Create the new weights (that is, filters) with the given shape and new biases, one for each filter.
  3. Create the TensorFlow operation for the convolution, where the strides are set to 1 in all dimensions. The first and last stride must always be 1, because the first is for the image number and the last is for the input channel. For example, strides=[1, 2, 2, 1] would mean that the filter is moved two pixels across the x axis and y axis of the image.
  4. Add the biases to the results of the convolution, so that a bias value is added to each filter channel.
  5. Use pooling to downsample the image resolution. This is 2 x 2 max pooling, which means that we consider 2 x 2 windows and select the largest value in each window. Then we move two pixels to the next window.
  6. Apply ReLU, which calculates max(x, 0) for each input pixel x. As stated earlier, ReLU is normally executed before the pooling, but since relu(max_pool(x)) == max_pool(relu(x)) we can save 75% of the ReLU operations by max-pooling first (see the numerical sketch after this list).
  7. Finally, return both the resulting layer and the filter weights, because we will plot the weights later.
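
To see why the order does not matter here, the following is a minimal numerical sketch in NumPy (the toy array and the 1-D pooling helper are purely illustrative). Pooling first shrinks the feature map by a factor of four, so ReLU then touches only a quarter of the values:

import numpy as np

# ReLU and max pooling commute because max(x, 0) is monotonically
# non-decreasing: the largest value in a window stays the largest.
x = np.array([-3.0, 5.0, -1.0, 2.0])  # toy 1-D "feature map"

def relu(a):
    return np.maximum(a, 0.0)

def max_pool_1d(a, k=2):
    # Non-overlapping windows of size k; keep the maximum of each.
    return a.reshape(-1, k).max(axis=1)

print(relu(max_pool_1d(x)))  # [5. 2.]
print(max_pool_1d(relu(x)))  # [5. 2.]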

Now we define a function to construct the convolutional layer to be used:

def new_conv_layer(input, num_input_channels, filter_size, num_filters,
                   use_pooling=True):
    shape = [filter_size, filter_size, num_input_channels, num_filters] 
    weights = new_weights(shape=shape) 
    biases = new_biases(length=num_filters) 
    layer = tf.nn.conv2d(input=input, 
                         filter=weights, 
                         strides=[1, 1, 1, 1], 
                         padding='SAME') 
    layer += biases 
    if use_pooling: 
        layer = tf.nn.max_pool(value=layer, 
                               ksize=[1, 2, 2, 1], 
                               strides=[1, 2, 2, 1], 
                               padding='SAME') 
    layer = tf.nn.relu(layer) 
    return layer, weights 
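
As a minimal usage sketch, the layer could be applied to a batch of images as follows (the x_image placeholder and the 128 x 128 x 3 image shape are assumptions for illustration; the real values come from the hyperparameters defined in the previous step):

# Hypothetical usage: first convolutional layer over a batch of
# 128 x 128 RGB images (image size and filter counts are illustrative).
x_image = tf.placeholder(tf.float32, shape=[None, 128, 128, 3])
layer_conv1, weights_conv1 = new_conv_layer(input=x_image,
                                            num_input_channels=3,
                                            filter_size=5,
                                            num_filters=32,
                                            use_pooling=True)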

The next task is to define the flattened layer:

  1. Get the shape of the input layer.
  2. The number of features is img_height * img_width * num_channels. TensorFlow's get_shape() function is used to calculate this.
  3. It will then reshape the layer to [num_images, num_features]. We just set the size of the second dimension to num_features and the size of the first dimension to -1, which means the size in that dimension is calculated so that the total size of the tensor is unchanged by the reshaping.
  4. Finally, it returns both the flattened layer and the number of features.

The following code does exactly what was just described:

def flatten_layer(layer):
    layer_shape = layer.get_shape() 
    num_features = layer_shape[1:4].num_elements() 
    layer_flat = tf.reshape(layer, [-1, num_features]) 
    return layer_flat, num_features 
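
For example, flattening the output of the last convolutional layer (layer_conv3 here is a hypothetical name for that output) would look like this:

# Hypothetical usage: flatten the last convolutional layer's output so it
# can be fed into the first fully connected layer.
layer_flat, num_features = flatten_layer(layer_conv3)
# layer_flat has shape [num_images, num_features]; num_features is
# img_height * img_width * num_channels of the incoming feature map.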

Finally, we need to construct the fully connected layers. The following function, new_fc_layer(), takes the input batch, the number of inputs, and the number of outputs (that is, the number of predicted classes), and optionally applies the ReLU. It creates the weights and biases using the methods we defined earlier in this step. Finally, it calculates the layer as the matrix multiplication of the input and the weights, and then adds the bias values:

def new_fc_layer(input, num_inputs, num_outputs, use_relu=True):  
    weights = new_weights(shape=[num_inputs, num_outputs]) 
    biases = new_biases(length=num_outputs) 
    layer = tf.matmul(input, weights) + biases 
    if use_relu: 
        layer = tf.nn.relu(layer) 
    return layer 
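
Putting everything together, the three convolutional layers, the flattened layer, and the two fully connected layers described at the beginning of this step could be wired up as sketched below. All of the concrete values (filter sizes, filter counts, the fully connected layer size, and num_classes) are placeholders for illustration; in the project they come from the hyperparameters defined earlier:

# Sketch of the full network, continuing from the hypothetical x_image
# placeholder above (all hyperparameter values are illustrative).
num_classes = 2  # hypothetical number of target classes

layer_conv1, weights_conv1 = new_conv_layer(x_image, num_input_channels=3,
                                            filter_size=5, num_filters=32)
layer_conv2, weights_conv2 = new_conv_layer(layer_conv1, num_input_channels=32,
                                            filter_size=3, num_filters=32)
layer_conv3, weights_conv3 = new_conv_layer(layer_conv2, num_input_channels=32,
                                            filter_size=3, num_filters=64)
layer_flat, num_features = flatten_layer(layer_conv3)
layer_fc1 = new_fc_layer(layer_flat, num_inputs=num_features,
                         num_outputs=128, use_relu=True)
layer_fc2 = new_fc_layer(layer_fc1, num_inputs=128,
                         num_outputs=num_classes, use_relu=False)  # logits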