Training the model

This section explains how to train the neural network model to identify each character.

We start by importing the required packages. The LabelBinarizer class converts a vector of labels into a one-hot encoding in one step. The train_test_split function, imported from model_selection, splits the data into training and test sets. Several Keras modules are used to build and train the model:

import cv2
import pickle
import os.path
import numpy as np
from imutils import paths
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from helpers import resize_to_fit
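
To make the one-hot step concrete, the following NumPy sketch mimics what a label binarizer produces for a problem with more than two classes: one column per distinct label, in sorted order, and a single 1 per row. The sample labels and the helper name one_hot are made up for illustration; the training code itself relies on scikit-learn's LabelBinarizer.

```python
import numpy as np

def one_hot(labels):
    """Return (classes, matrix) with one row per label and one column per class."""
    # np.unique gives the sorted distinct labels and, for each input label,
    # its index into that sorted array.
    classes, indices = np.unique(labels, return_inverse=True)
    matrix = np.zeros((len(labels), len(classes)), dtype=int)
    matrix[np.arange(len(labels)), indices] = 1
    return classes, matrix

classes, encoded = one_hot(["B", "A", "C", "A"])
print(classes)
print(encoded)
```

Each row contains exactly one 1, in the column that corresponds to that sample's label.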

We initialize the data lists and loop over the input letter images. Each image is converted to grayscale and resized to fit in 20 x 20 pixels. We then take the letter's label from the name of the folder that contains the image, and add both the image and the label to our training set, as shown:

LETTER_IMAGES_PATH = "output_letter_images"
MODEL = "captcha.hdf5"
MODEL_LABELS = "labels.dat"

dataimages = []
imagelabels = []

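for image_file in paths.list_images(LETTER_IMAGES_PATH):
    # Load the image and convert it to grayscale
    text_image = cv2.imread(image_file)
    text_image = cv2.cvtColor(text_image, cv2.COLOR_BGR2GRAY)

    # Resize the letter to 20 x 20 pixels and add a channel axis for Keras
    text_image = resize_to_fit(text_image, 20, 20)
    text_image = np.expand_dims(text_image, axis=2)

    # The label is the name of the folder that contains the image
    text_label = image_file.split(os.path.sep)[-2]

    dataimages.append(text_image)
    imagelabels.append(text_label)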
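
The label is read from the path of each image, so the loop assumes a layout in which every letter image sits in a folder named after its letter. The sketch below shows the split on a made-up path; the file name 000001.png and the helper label_from_path are hypothetical.

```python
import os.path

def label_from_path(image_file):
    # The second-to-last path component is the folder named after the letter.
    return image_file.split(os.path.sep)[-2]

example = os.path.join("output_letter_images", "A", "000001.png")
print(label_from_path(example))  # prints "A"
```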

We scale the pixel intensities to the range [0, 1] to improve training:

dataimages = np.array(dataimages, dtype="float") / 255.0
imagelabels = np.array(imagelabels)
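
As a quick check of the scaling step, this sketch builds two fake 20 x 20 grayscale images (random values standing in for real letter crops), adds the channel axis, and confirms that the resulting array has the shape and value range the network expects.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two fake 8-bit grayscale images; real data would come from cv2.imread.
fake_images = [rng.integers(0, 256, size=(20, 20), dtype=np.uint8) for _ in range(2)]

# Add a trailing channel axis so each image has shape (20, 20, 1).
batch = [np.expand_dims(img, axis=2) for img in fake_images]

# Stack into one array and scale 0..255 down to 0.0..1.0.
scaled = np.array(batch, dtype="float") / 255.0

print(scaled.shape)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```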

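We split the data into training and test sets, and then convert the letter labels into one-hot encodings, which is the label format Keras expects for a categorical cross-entropy loss. The fitted label binarizer is also saved to disk so that it can be reused later to map predictions back to letters: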

(X_train_set, X_test_set, Y_train_set, Y_test_set) = train_test_split(dataimages, imagelabels, test_size=0.25, random_state=0)

lbzr = LabelBinarizer().fit(Y_train_set)
Y_train_set = lbzr.transform(Y_train_set)
Y_test_set = lbzr.transform(Y_test_set)

with open(MODEL_LABELS, "wb") as f:
    pickle.dump(lbzr, f)
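
The 75/25 split can be pictured as shuffling the sample indices and holding out a quarter of them. The sketch below does this with NumPy only; it illustrates the idea and is not how scikit-learn implements train_test_split, so it will not reproduce the same partition. The helper name simple_split is hypothetical.

```python
import numpy as np

def simple_split(data, labels, test_size=0.25, seed=0):
    """Shuffle once, then hold out a fraction of the samples for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))
    n_test = int(np.ceil(len(data) * test_size))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return data[train_idx], data[test_idx], labels[train_idx], labels[test_idx]

data = np.arange(8).reshape(8, 1)
labels = np.array(list("ABCDEFGH"))
X_train, X_test, y_train, y_test = simple_split(data, labels)
print(len(X_train), len(X_test))  # prints "6 2"
```

Fixing the seed plays the same role as random_state=0 in the listing above: the partition is the same on every run.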

Finally, we build the neural network. Each of the two convolutional layers is followed by a max pooling layer, as shown in the following code:

nn_model = Sequential()

nn_model.add(Conv2D(20, (5, 5), padding="same", input_shape=(20, 20, 1), activation="relu"))
nn_model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

nn_model.add(Conv2D(50, (5, 5), padding="same", activation="relu"))
nn_model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
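
It helps to trace how the tensor shape changes through these four layers. With padding="same" and the default stride of 1, a convolution keeps the height and width and only changes the channel count, while each 2 x 2 max pooling with stride 2 halves the height and width. The small helper functions below are illustrative and are not part of Keras.

```python
def conv_same(shape, filters):
    # "same" padding with stride 1 preserves height and width.
    h, w, _ = shape
    return (h, w, filters)

def max_pool(shape, pool=2, stride=2):
    # Unpadded pooling: floor((size - pool) / stride) + 1 along each axis.
    h, w, c = shape
    return ((h - pool) // stride + 1, (w - pool) // stride + 1, c)

shape = (20, 20, 1)
shape = conv_same(shape, 20)   # (20, 20, 20)
shape = max_pool(shape)        # (10, 10, 20)
shape = conv_same(shape, 50)   # (10, 10, 50)
shape = max_pool(shape)        # (5, 5, 50)
print(shape, shape[0] * shape[1] * shape[2])  # prints "(5, 5, 50) 1250"
```

The final product, 1250, is the number of values that reach the fully connected part of the network once the feature maps are flattened.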

The hidden layer has 500 nodes, and the output layer has 32 nodes, one for each possible character.
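
For a sense of scale, the number of trainable parameters in each layer can be worked out by hand. The sketch assumes that the flattened input to the hidden layer has 5 x 5 x 50 = 1250 values, which follows from applying two same-padded convolutions and two 2 x 2 poolings to a 20 x 20 x 1 input. The helper names are made up for illustration.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per kernel cell per input channel, plus one bias per filter.
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    # One weight per input per unit, plus one bias per unit.
    return inputs * units + units

counts = {
    "conv1": conv_params(5, 5, 1, 20),
    "conv2": conv_params(5, 5, 20, 50),
    "hidden": dense_params(5 * 5 * 50, 500),
    "output": dense_params(500, 32),
}
print(counts)
print(sum(counts.values()))
```

Almost all of the parameters sit in the 500-node hidden layer, because it is connected to every one of the 1250 flattened values.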

Keras builds the TensorFlow model in the background and then trains the neural network:

nn_model.add(Flatten())
nn_model.add(Dense(500, activation="relu"))

nn_model.add(Dense(32, activation="softmax"))

nn_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

nn_model.fit(X_train_set, Y_train_set, validation_data=(X_test_set, Y_test_set), batch_size=32, epochs=10, verbose=1)

nn_model.save(MODEL)
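
The softmax output and the categorical_crossentropy loss used above can be written out in a few lines of NumPy. This is a sketch for a single sample with made-up scores over four classes instead of 32; the Keras implementation differs in details such as batching and numerical clipping.

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability; the result sums to 1.
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()

def categorical_crossentropy(one_hot_target, probs):
    # Only the probability assigned to the true class contributes.
    return -np.sum(one_hot_target * np.log(probs))

scores = np.array([2.0, 1.0, 0.1, -1.0])   # made-up raw outputs
target = np.array([1, 0, 0, 0])            # true class is index 0
probs = softmax(scores)

print(probs.argmax())                       # index of the predicted class
print(categorical_crossentropy(target, probs))
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is what training with this loss pushes the network toward.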
