How to do it…

To do this, we need to consider the following points:

  • Any algorithm that performs this search needs a good set of distance functions to choose from. For this purpose, SciPy provides a large collection of efficiently implemented functions in the distance submodule of the scipy.spatial module.
  • The list is long. Besides Euclidean, squared Euclidean, or standardized Euclidean, we have many more: Bray-Curtis, Canberra, Chebyshev, Manhattan, correlation distance, cosine distance, Dice dissimilarity, Hamming, Jaccard-Needham, Kulsinski, Mahalanobis, and so on.
  • The syntax in most cases is simple:
distance_function(first_vector, second_vector)
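As a minimal sketch of this two-argument form, the following calls a few of the functions from scipy.spatial.distance on two small vectors. The vectors u and v are made-up values chosen only for illustration; note that SciPy exposes the Manhattan distance under the name cityblock.

```python
import numpy as np
from scipy.spatial import distance

# Two illustrative vectors (hypothetical data, not from the text).
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 6.0, 3.0])

# Every call follows the same pattern: distance_function(u, v).
d_euclidean = distance.euclidean(u, v)      # sqrt(3^2 + 4^2 + 0^2)
d_sqeuclidean = distance.sqeuclidean(u, v)  # 3^2 + 4^2 + 0^2
d_manhattan = distance.cityblock(u, v)      # |3| + |4| + |0|
d_chebyshev = distance.chebyshev(u, v)      # max(|3|, |4|, |0|)

print(d_euclidean, d_sqeuclidean, d_manhattan, d_chebyshev)
```

The component-wise differences are 3, 4, and 0, so the four results can be checked by hand against the definitions in the comments.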

The only three cases in which the syntax is different are the Minkowski, Mahalanobis, and standardized Euclidean distances. Each takes one extra argument: the order p of the norm for the Minkowski distance (a positive number, not necessarily an integer), the inverse of the covariance matrix for the Mahalanobis distance, and a vector of per-component variances for the standardized Euclidean distance. In the pairwise routines pdist and cdist, the last two are optional and are estimated from the data when omitted.
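The three special cases might be called as in the sketch below. The vectors and the random sample used to estimate the variances and the covariance matrix are assumptions made for illustration; in practice, these statistics would come from your own dataset.

```python
import numpy as np
from scipy.spatial import distance

# Two illustrative vectors (hypothetical data, not from the text).
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 6.0, 3.0])

# Minkowski: the extra argument p is the order of the norm.
d_minkowski = distance.minkowski(u, v, p=3)

# A made-up sample, used only to estimate variances and covariance.
rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 3))

# Standardized Euclidean: V is a 1-D array with one variance per component.
V = sample.var(axis=0, ddof=1)
d_seuclidean = distance.seuclidean(u, v, V)

# Mahalanobis: VI is the inverse of the covariance matrix, not the covariance itself.
VI = np.linalg.inv(np.cov(sample, rowvar=False))
d_mahalanobis = distance.mahalanobis(u, v, VI)

print(d_minkowski, d_seuclidean, d_mahalanobis)
```

Passing the covariance matrix itself instead of its inverse to mahalanobis is an easy mistake to make, because the call still returns a number without raising an error.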
