In the previous section, we saw how to cluster similar houses together to determine the neighborhood. Bisecting K-means is similar to regular K-means, except that model training takes different parameters, as follows:
import org.apache.spark.mllib.clustering.BisectingKMeans
// Cluster the data into five clusters using bisecting K-means
val bkm = new BisectingKMeans()
  .setK(5) // Number of clusters of similar houses
  .setMaxIterations(20) // Maximum number of iterations
  .setSeed(12345) // Fixed seed so that results are reproducible
val model = bkm.run(landRDD)
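To give a feel for how the training differs from regular K-means, here is a minimal plain-Scala sketch of the bisecting idea: start with one cluster that holds every point, then repeatedly split the cluster with the largest squared error in two until K clusters exist. This is a simplification for illustration and not Spark's implementation; the object `BisectingSketch`, its helper functions, the seeding rule, and the toy points are all hypothetical.

```scala
object BisectingSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Component-wise mean of a non-empty set of points
  def centroid(ps: Seq[Point]): Point =
    ps.transpose.map(_.sum / ps.size).toVector

  // Sum of squared distances from each point to its cluster centroid
  def sse(ps: Seq[Point]): Double = {
    val c = centroid(ps)
    ps.map(sqDist(_, c)).sum
  }

  // Split one cluster in two with a few rounds of plain 2-means.
  // Seeding (an assumption of this sketch): the point farthest from the
  // centroid, then the point farthest from that first seed.
  def bisect(ps: Seq[Point], iterations: Int = 10): (Seq[Point], Seq[Point]) = {
    val seedA = ps.maxBy(sqDist(_, centroid(ps)))
    val seedB = ps.maxBy(sqDist(_, seedA))
    val initial = ps.partition(p => sqDist(p, seedA) <= sqDist(p, seedB))
    (1 to iterations).foldLeft(initial) { case ((left, right), _) =>
      if (left.isEmpty || right.isEmpty) (left, right)
      else {
        val ca = centroid(left)
        val cb = centroid(right)
        ps.partition(p => sqDist(p, ca) <= sqDist(p, cb))
      }
    }
  }

  // Repeatedly split the cluster with the largest SSE until k clusters exist
  def cluster(points: Seq[Point], k: Int): Seq[Seq[Point]] = {
    var clusters: Seq[Seq[Point]] = Seq(points)
    while (clusters.size < k && clusters.exists(_.size > 1)) {
      val worst = clusters.filter(_.size > 1).maxBy(sse)
      val (left, right) = bisect(worst)
      if (left.isEmpty || right.isEmpty) return clusters // cannot split further
      clusters = clusters.filterNot(_ eq worst) ++ Seq(left, right)
    }
    clusters
  }

  def main(args: Array[String]): Unit = {
    // Toy 2-D points forming three visually obvious groups
    val points: Seq[Point] = Seq(
      Vector(1.0, 1.0), Vector(1.5, 2.0),
      Vector(10.0, 10.0), Vector(11.0, 10.5),
      Vector(50.0, 50.0), Vector(51.0, 49.0))
    cluster(points, 3).foreach(c => println(c.mkString(", ")))
  }
}
```

Because each step only splits an existing cluster, the result is a hierarchy of clusters rather than a flat partition found in one pass, which is the main practical difference from regular K-means.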
You should refer to the previous example and reuse its steps to prepare the training data (landRDD). Now let's evaluate the clustering by computing the Within Set Sum of Squared Errors (WSSSE) as follows:
val WSSSE = model.computeCost(landRDD)
println("Within-Cluster Sum of Squares = " + WSSSE) // Lower is better
You should observe the following output:
Within-Cluster Sum of Squares = 2.096980212594632E11
For further analysis, please refer to step 5 in the previous section.
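To make the metric more concrete, the following plain-Scala sketch (no Spark required) computes the same kind of cost by hand: the sum of squared distances from every point to its closest center. The object `WssseSketch`, its helpers, and the toy points and centers are hypothetical, and assigning each point to its nearest center is an assumption of this sketch rather than a description of Spark's internals.

```scala
object WssseSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Sum over all points of the squared distance to the nearest center
  def wssse(points: Seq[Point], centers: Seq[Point]): Double =
    points.map(p => centers.map(sqDist(p, _)).min).sum

  def main(args: Array[String]): Unit = {
    val points: Seq[Point] = Seq(
      Vector(1.0, 1.0), Vector(2.0, 2.0),
      Vector(9.0, 9.0), Vector(10.0, 10.0))

    // Two centers, one per visible group
    println(wssse(points, Seq(Vector(1.5, 1.5), Vector(9.5, 9.5)))) // prints 2.0

    // A single center in the middle of all points
    println(wssse(points, Seq(Vector(5.5, 5.5)))) // prints 130.0
  }
}
```

Note that this cost generally shrinks as the number of clusters grows, so "lower is better" is most meaningful when comparing models trained with the same K.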