Creating the covariance matrix of the dataset

To calculate the covariance matrix of the iris dataset, we will first calculate the feature-wise mean vector (which we will need again later) and then calculate the covariance matrix itself using NumPy.

The covariance matrix is a d x d matrix, where d is the number of features: a square matrix with one row and one column per feature, in which each entry describes how a pair of features varies together. It is closely related to a correlation matrix, which is the covariance matrix rescaled by the standard deviation of each feature:

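# Calculate a PCA manually

# import numpy
import numpy as np

# calculate the feature-wise mean vector
mean_vector = iris_X.mean(axis=0)
print(mean_vector)
# [ 5.84333333 3.054 3.75866667 1.19866667]

# calculate the covariance matrix
# np.cov expects one row per feature, hence the transpose; it also centers
# the data itself, so subtracting the mean here does not change the result
cov_mat = np.cov((iris_X - mean_vector).T)
print(cov_mat.shape)
# (4, 4)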

The variable cov_mat stores our 4 x 4 covariance matrix.
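
To make it clearer what np.cov is doing, the following is a minimal sketch that builds a covariance matrix by hand and compares it with NumPy's result. The array X is a small made-up stand-in for iris_X (it is not the real iris data), and the names centered, manual_cov, and manual_corr are chosen only for this illustration. The sketch assumes the usual sample covariance with an n - 1 denominator, which is the default in np.cov.

```python
import numpy as np

# Hypothetical stand-in for iris_X: 5 samples, 3 features (not the real iris data)
X = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 2.9, 4.3],
    [5.9, 3.0, 5.1],
    [6.7, 3.1, 4.7],
])

n_samples = X.shape[0]

# Feature-wise mean vector, then center every feature around zero
mean_vector = X.mean(axis=0)
centered = X - mean_vector

# Sample covariance: centered^T . centered, divided by (n - 1)
manual_cov = centered.T.dot(centered) / (n_samples - 1)

# np.cov expects one row per feature, hence the transpose
numpy_cov = np.cov(X.T)

# Correlation is the covariance rescaled by the standard deviations
std = np.sqrt(np.diag(manual_cov))
manual_corr = manual_cov / np.outer(std, std)

print(manual_cov.shape)
print(np.allclose(manual_cov, numpy_cov))
print(np.allclose(manual_corr, np.corrcoef(X.T)))
```

The two comparisons show that the hand-built matrix agrees with np.cov, and that dividing each entry by the product of the two features' standard deviations recovers the correlation matrix returned by np.corrcoef.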
