In TensorFlow, a subsampling (pooling) layer is normally implemented with a max_pool operation while keeping the layer's initial parameters. In TensorFlow, max_pool has the following signature:
tf.nn.max_pool(value, ksize, strides, padding, data_format, name)
Now let's learn how to create a function that utilizes the preceding signature and returns a tensor with type tf.float32, that is, the max pooled output tensor:
import tensorflow as tf
def maxpool2d(x, k=2):
    # Max pooling with a k x k window and stride k
    return tf.nn.max_pool(x, ksize=[1, k, k, 1],
                          strides=[1, k, k, 1], padding='SAME')
In the preceding code segment, the parameters can be described as follows:
- value: This is a 4D tensor of float32 elements with shape [batch, height, width, channels]
- ksize: A list of integers representing the window size on each dimension
- strides: The step of the moving windows on each dimension
- padding: VALID or SAME
- data_format: NHWC, NCHW, and NCHW_VECT_C are supported
- name: An optional name for the operation
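To make these parameters concrete, here is a minimal pure-Python sketch (a hypothetical helper, not the TensorFlow API) of what a k x k max pool with stride k computes on a single-channel image whose dimensions divide evenly by k:

```python
# Pure-Python sketch of k x k max pooling with stride k, assuming a
# single channel and input dimensions divisible by k.
def maxpool2d_ref(image, k=2):
    rows = len(image) // k
    cols = len(image[0]) // k
    return [[max(image[i * k + di][j * k + dj]
                 for di in range(k) for dj in range(k))
             for j in range(cols)]
            for i in range(rows)]

image = [[2., 4., 6., 8.],
         [10., 12., 14., 16.]]
print(maxpool2d_ref(image))  # [[12.0, 16.0]]
```

Each output element is simply the maximum over one non-overlapping 2 x 2 window of the input.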
However, depending upon the layering structures in a CNN, there are other pooling operations supported by TensorFlow, as follows:
- tf.nn.avg_pool: This returns a reduced tensor with the average of each window
- tf.nn.max_pool_with_argmax: This returns the max_pool tensor and a tensor with the flattened index of max_value
- tf.nn.avg_pool3d: This performs an avg_pool operation with a cubic-like window; the input has an added depth
- tf.nn.max_pool3d: This performs the same operation as tf.nn.avg_pool3d but applies the max operation
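To illustrate the first two operations, here is a pure-Python sketch (hypothetical helpers, not the TensorFlow API) of average pooling and of max pooling with the flattened argmax, over 2 x 2 windows with stride 2 on a single-channel image:

```python
# Pure-Python sketches, assuming a single channel, 2 x 2 windows,
# stride 2, and input dimensions divisible by 2.
def avgpool2d_ref(image, k=2):
    # Average of each k x k window (what tf.nn.avg_pool computes).
    return [[sum(image[i * k + di][j * k + dj]
                 for di in range(k) for dj in range(k)) / (k * k)
             for j in range(len(image[0]) // k)]
            for i in range(len(image) // k)]

def maxpool_with_argmax_ref(image, k=2):
    # Max of each window plus the flattened index of that maximum
    # (the idea behind tf.nn.max_pool_with_argmax).
    w = len(image[0])
    pooled, argmax = [], []
    for i in range(len(image) // k):
        prow, arow = [], []
        for j in range(len(image[0]) // k):
            window = [(image[i * k + di][j * k + dj],
                       (i * k + di) * w + (j * k + dj))
                      for di in range(k) for dj in range(k)]
            val, flat = max(window)
            prow.append(val)
            arow.append(flat)
        pooled.append(prow)
        argmax.append(arow)
    return pooled, argmax

image = [[2., 4., 6., 8.],
         [10., 12., 14., 16.]]
print(avgpool2d_ref(image))            # [[7.0, 11.0]]
print(maxpool_with_argmax_ref(image))  # ([[12.0, 16.0]], [[5, 7]])
```

The flattened indices 5 and 7 point at the values 12 and 16 in the row-major layout of the 2 x 4 input.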
Now let's see a concrete example of how padding works in TensorFlow. Suppose we have an input image x of shape [2, 4] with one channel. We want to see the effect of both VALID and SAME paddings:
- valid_pad: Max pool with 2 x 2 kernel, stride 2, and VALID padding
- same_pad: Max pool with 2 x 2 kernel, stride 2, and SAME padding
Let's see how we can achieve this in Python and TensorFlow. First, we define the single-channel input image of shape [2, 4]:

import tensorflow as tf

x = tf.constant([[2., 4., 6., 8.],
                 [10., 12., 14., 16.]])
Now let's give it a shape accepted by tf.nn.max_pool:
x = tf.reshape(x, [1, 2, 4, 1])
To apply the max pool with a 2 x 2 kernel, stride 2, and VALID padding:
VALID = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
On the other hand, to use the max pool with a 2 x 2 kernel, stride 2, and SAME padding:
SAME = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
For VALID padding, no padding is applied, and since both input dimensions (2 and 4) divide evenly by the stride, the output shape is [1, 2]. For SAME padding, the output shape is also [1, 2]: SAME would pad the image with -inf before applying the max pool, but no padding is actually needed here because the dimensions already divide evenly. Let's validate this:
print(VALID.get_shape())
print(SAME.get_shape())

>>> (1, 1, 2, 1)
>>> (1, 1, 2, 1)
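The shapes above follow TensorFlow's documented output-size rules: for VALID padding, out = ceil((in - k + 1) / stride); for SAME padding, out = ceil(in / stride). A quick check with a hypothetical helper:

```python
import math

def out_size(in_size, k, stride, padding):
    # TensorFlow's output-size rules for pooling and convolution.
    if padding == 'VALID':
        return math.ceil((in_size - k + 1) / stride)
    return math.ceil(in_size / stride)  # SAME

for padding in ('VALID', 'SAME'):
    h = out_size(2, 2, 2, padding)
    w = out_size(4, 2, 2, padding)
    print(padding, (1, h, w, 1))
# VALID (1, 1, 2, 1)
# SAME (1, 1, 2, 1)
```

The two paddings only diverge when a dimension does not divide evenly: for a width-3 input, out_size(3, 2, 2, 'VALID') gives 1 (the last column is dropped), while out_size(3, 2, 2, 'SAME') gives 2 (the input is padded out to width 4).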