Image augmentation

While training a CNN model, we do not want its predictions to change with the size, angle, or position of the object in the image. An image is represented as a matrix of pixel values, so size, angle, and position have a huge effect on those values. To make the model more size-invariant, we can add differently sized versions of the images to the training set; similarly, to make it more rotation-invariant, we can add images rotated by different angles. This process is known as image data augmentation. It also helps to avoid overfitting, which happens when a model is exposed to very few samples. Image data augmentation is one way to reduce overfitting, but it may not be enough on its own, because the augmented images are still correlated with the originals. Keras provides an image augmentation class called ImageDataGenerator that defines the configuration for image data augmentation. It also provides other features, such as:

  • Sample-wise and feature-wise standardization
  • Random rotation, shifts, shear, and zoom of the image
  • Horizontal and vertical flip
  • ZCA whitening
  • Dimension reordering
  • Saving the changes to disk

An augmented image generator object can be created as follows:

from keras.preprocessing.image import ImageDataGenerator
imagedatagen = ImageDataGenerator()
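
The constructor accepts a keyword argument for each of the features listed above. The following is an illustrative sketch that combines several of them; the parameter values are arbitrary examples, not recommendations:

# illustrative configuration combining several of the listed features
imagedatagen = ImageDataGenerator(
    featurewise_center=True,             # feature-wise standardization: zero mean
    featurewise_std_normalization=True,  # feature-wise standardization: unit variance
    rotation_range=20,                   # random rotations of up to 20 degrees
    width_shift_range=0.1,               # random horizontal shifts (10% of width)
    height_shift_range=0.1,              # random vertical shifts (10% of height)
    shear_range=0.1,                     # random shear transformations
    zoom_range=0.1,                      # random zoom
    horizontal_flip=True,                # random horizontal flips
    vertical_flip=False)                 # vertical flips disabled here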

This API generates batches of tensor image data with real-time data augmentation, instead of processing the entire image dataset in memory. It is designed to create augmented image data during the model fitting process, which reduces memory overhead but adds some time cost to model training.

After the generator is created and configured, you must fit it to your data. This computes any statistics required to apply the transformations to the image data, and is done by calling the fit() function on the data generator and passing it the training dataset, as follows:

imagedatagen.fit(train_data)
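
Note that fit() only has an effect for options that require dataset-wide statistics, such as feature-wise standardization or ZCA whitening. The following is a minimal sketch, assuming train_data is a four-dimensional array of images with shape (samples, height, width, channels):

from keras.preprocessing.image import ImageDataGenerator
# statistics-based transformations require fit() before batches can be generated
datagen = ImageDataGenerator(featurewise_center=True,
                             featurewise_std_normalization=True)
datagen.fit(train_data)  # computes the mean and standard deviation of train_data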

The batch size can be configured and batches of augmented images can then be retrieved by calling the flow() function on the data generator:

imagedatagen.flow(x_train, y_train, batch_size=32)
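
Note that flow() returns a generator that loops over the data indefinitely, so the iteration has to be stopped explicitly when consuming it manually. The following is a minimal sketch, assuming a compiled Keras model named model and the x_train and y_train arrays are already available:

# flow() yields augmented batches forever, so break out of the loop explicitly
batches = 0
for x_batch, y_batch in imagedatagen.flow(x_train, y_train, batch_size=32):
    model.train_on_batch(x_batch, y_batch)  # train on one augmented batch (model is assumed to exist)
    batches += 1
    if batches >= len(x_train) // 32:       # roughly one pass over the data
        break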

Finally, instead of calling the fit() function on the model, call the fit_generator() function and pass it the augmented data generator:

model.fit_generator(imagedatagen.flow(x_train, y_train, batch_size=32), steps_per_epoch=len(x_train) // 32, epochs=200)

Let's look at some examples to understand how the image augmentation API in Keras works. We will use the MNIST handwritten digit recognition task in these examples.

Let's begin by taking a look at the first nine images in the training dataset:

#Plot images 
from keras.datasets import mnist
from matplotlib import pyplot
#loading data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
#creating a grid of 3x3 images
for i in range(0, 9):
    pyplot.subplot(330 + 1 + i)
    pyplot.imshow(X_train[i], cmap=pyplot.get_cmap('gray'))
#Displaying the plot
pyplot.show()
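
To see what augmentation actually produces, the same digits can be passed through a generator and plotted again. The following is an illustrative sketch with an arbitrary rotation_range; note that flow() expects a four-dimensional array, so the MNIST images are reshaped to include a single channel:

#Plot augmented images (illustrative sketch)
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
#loading data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
#reshaping to (samples, height, width, channels) as required by flow()
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1)).astype('float32')
datagen = ImageDataGenerator(rotation_range=30)  # random rotations of up to 30 degrees
#taking a single augmented batch of nine images and plotting it
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        pyplot.subplot(330 + 1 + i)
        pyplot.imshow(X_batch[i].reshape(28, 28), cmap=pyplot.get_cmap('gray'))
    pyplot.show()
    break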

The following code snippet creates augmented image generators for the CIFAR-10 dataset. We will add the augmented images to the training set of the previous example and check whether the classification accuracy improves:

from keras.preprocessing.image import ImageDataGenerator
# creating and configuring the augmented image generator for the training data
datagen_train = ImageDataGenerator(
    width_shift_range=0.1,   # randomly shift images horizontally (10% of total width)
    height_shift_range=0.1,  # randomly shift images vertically (10% of total height)
    horizontal_flip=True)    # randomly flip images horizontally
# creating and configuring the augmented image generator for the validation data
datagen_valid = ImageDataGenerator(
    width_shift_range=0.1,   # randomly shift images horizontally (10% of total width)
    height_shift_range=0.1,  # randomly shift images vertically (10% of total height)
    horizontal_flip=True)    # randomly flip images horizontally
# fitting the augmented image generators on the data
datagen_train.fit(x_train)
datagen_valid.fit(x_valid)
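
The remaining step, which is not shown here, would typically be to train the model with these generators. The following is a hedged sketch, assuming a compiled Keras model named model and the CIFAR-10 arrays x_train, y_train, x_valid, and y_valid have been loaded beforehand; the batch size and epoch count are illustrative:

# training with the augmented generators (model and data arrays are assumed to exist)
batch_size = 32
model.fit_generator(
    datagen_train.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=len(x_train) // batch_size,   # one pass over the training data per epoch
    epochs=100,                                   # illustrative epoch count
    validation_data=datagen_valid.flow(x_valid, y_valid, batch_size=batch_size),
    validation_steps=len(x_valid) // batch_size,
    verbose=2)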
