Bisecting K-means clustering of the neighborhood using Spark MLlib

In the previous section, we saw how to cluster similar houses together to determine the neighborhood. Bisecting K-means is similar to regular K-means, except that it builds the clusters by repeatedly splitting an existing cluster in two, and the model training takes different parameters, as follows:

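import org.apache.spark.mllib.clustering.BisectingKMeans

// Cluster the data into five clusters using bisecting K-means
val bkm = new BisectingKMeans()
                 .setK(5) // Number of clusters of similar houses
                 .setMaxIterations(20) // Maximum number of K-means iterations per split
                 .setSeed(12345) // Fixed seed so that results are reproducible
val model = bkm.run(landRDD) // landRDD is the training RDD prepared in the previous section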

Refer to the previous example and reuse its steps to prepare the training data (landRDD). Now let's evaluate the clustering by computing the Within Set Sum of Squared Errors (WSSSE) as follows:

val WSSSE = model.computeCost(landRDD) // Sum of squared distances of points to their cluster centers
println("Within-Cluster Sum of Squares = " + WSSSE) // Less is better

You should observe the following output:

Within-Cluster Sum of Squares = 2.096980212594632E11

For further analysis, refer to step 5 in the previous section.
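To make the meaning of this number more concrete, the following is a minimal plain-Scala sketch (no Spark required) of the quantity that computeCost reports: the sum, over all points, of the squared distance to the nearest cluster center. The object name WssseSketch, the helper functions, and the toy points and centers are all hypothetical and are not taken from the housing dataset; this illustrates the metric only and is not Spark's implementation.

```scala
// Plain-Scala illustration of the WSSSE metric (assumed names, toy data).
object WssseSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Point, b: Point): Double = {
    require(a.length == b.length, "points must have the same dimension")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }

  // Sum, over all points, of the squared distance to the nearest center.
  def wssse(points: Seq[Point], centers: Seq[Point]): Double = {
    require(centers.nonEmpty, "at least one center is required")
    points.map(p => centers.map(c => squaredDistance(p, c)).min).sum
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical toy data: two tight groups and one center per group.
    val points = Seq(Array(1.0, 1.0), Array(2.0, 2.0), Array(9.0, 9.0), Array(10.0, 10.0))
    val centers = Seq(Array(1.5, 1.5), Array(9.5, 9.5))
    // Each point lies 0.5 away from its center on both axes, so it contributes 0.5.
    println("Within-Cluster Sum of Squares = " + wssse(points, centers)) // prints 2.0
  }
}
```

Because the value is a sum of squared distances, its scale depends on the magnitude of the features and on the number of points. A large figure such as the one above is therefore most useful when compared against runs with a different number of clusters on the same data, not when read in isolation.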
