Texture is the spatial and visual quality of an image. In this recipe, we will take a look at Haralick texture features. These features are based on the co-occurrence matrix (11.5) defined as follows:
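One standard way to write this matrix, assuming an n × m image I and a pixel offset (Δx, Δy) (symbols that are assumptions here, chosen to match the i, j, p, and q used in the text), is:

```latex
C_{\Delta x, \Delta y}(i, j) = \sum_{p=1}^{n} \sum_{q=1}^{m}
\begin{cases}
1, & \text{if } I(p, q) = i \text{ and } I(p + \Delta x, q + \Delta y) = j \\
0, & \text{otherwise}
\end{cases}
```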
In equation 11.5, i and j are intensities, while p and q are positions. The Haralick features are 13 metrics derived from the co-occurrence matrix, some of which are given in equation 11.6. For a more complete list, refer to http://murphylab.web.cmu.edu/publications/boland/boland_node26.html (retrieved December 2015).
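To make the definition concrete, the following is a minimal NumPy sketch that counts co-occurring intensity pairs for a single offset and derives two commonly cited Haralick metrics (angular second moment and contrast) from the normalized matrix. The function name cooccurrence, the tiny 3 × 3 image, and the offset are hypothetical choices for illustration; this is not how mahotas computes its features, which also combines several directions.

```python
import numpy as np


def cooccurrence(image, dx, dy, levels):
    """Count pairs (i, j) with image[p, q] == i and image[p + dx, q + dy] == j."""
    counts = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = image.shape
    for p in range(rows):
        for q in range(cols):
            p2, q2 = p + dx, q + dy
            # Skip neighbours that fall outside the image.
            if 0 <= p2 < rows and 0 <= q2 < cols:
                counts[image[p, q], image[p2, q2]] += 1
    return counts


# Hypothetical 3 x 3 image with three intensity levels (0, 1, 2).
image = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [0, 1, 2]])

# Offset of one pixel along the second axis (the right-hand neighbour).
C = cooccurrence(image, dx=0, dy=1, levels=3)

# Normalize the counts to joint probabilities before deriving metrics.
P = C / C.sum()

# Angular second moment: sum of squared probabilities.
asm = (P ** 2).sum()

# Contrast: probabilities weighted by the squared intensity difference.
i, j = np.indices(P.shape)
contrast = ((i - j) ** 2 * P).sum()

print(C)
print('ASM {:.4f}, contrast {:.4f}'.format(asm, contrast))
```

A uniform image concentrates all counts in one diagonal cell, which gives an angular second moment of 1 and a contrast of 0; more varied textures spread the counts out and move both values away from those extremes.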
We will calculate the Haralick features with the mahotas API and apply them to the handwritten digits dataset of scikit-learn.
import mahotas as mh
import numpy as np
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
from tpot import TPOT
from sklearn.model_selection import train_test_split
import dautil as dl
digits = load_digits()
X = digits.data.copy()
# np.append returns a new array, so collect the features and stack them onto X.
haralick = np.array([mh.features.haralick(img.astype(np.uint8)).ravel()
                     for img in digits.images])
X = np.hstack([X, haralick])
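Because NumPy arrays have a fixed shape, extra features are added by building a new, wider matrix rather than by extending rows in place. The following sketch illustrates this with stand-in random arrays (placeholder values, not real Haralick output). The names pixels and texture are hypothetical, and the 4 × 13 shape assumes the default output of mh.features.haralick for a 2D image.

```python
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.random((5, 64))       # stand-in for five rows of digits.data
texture = rng.random((5, 4, 13))   # stand-in for five Haralick results

# np.append returns a new array and leaves its first argument untouched.
row = pixels[0]
appended = np.append(row, texture[0].ravel())
print(row.shape, appended.shape)

# Flatten each 4 x 13 block and stack it next to the pixel columns.
combined = np.hstack([pixels, texture.reshape(len(texture), -1)])
print(combined.shape)
```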
X_train, X_test, y_train, y_test = train_test_split(
    X, digits.target, train_size=0.75)
tpot = TPOT(generations=6, population_size=101,
            random_state=46, verbosity=2)
tpot.fit(X_train, y_train)
print('Score {:.2f}'.format(tpot.score(X_train, y_train, X_test, y_test)))
dl.plotting.img_show(plt.gca(), digits.images[0])
plt.title('Original Image')

plt.figure()
dl.plotting.img_show(plt.gca(), digits.data[0].reshape((8, 8)))
plt.title('Core Features')

plt.figure()
dl.plotting.img_show(plt.gca(), mh.features.haralick(
    digits.images[0].astype(np.uint8)))
plt.title('Haralick Features')
Refer to the following screenshot for the end result:
The code is in the extracting_texture.ipynb file in this book's code bundle.