How to do it...

Just like with classification or regression models, building clustering models in Spark is pretty straightforward. Here's the code that looks for patterns in the census data:

import pyspark.mllib.clustering as clu

# Train a k-means model with two clusters; keep only the feature
# part of each record (row[1]) and drop the label.
model = clu.KMeans.train(
    final_data.map(lambda row: row[1])
    , 2                             # number of clusters, k
    , initializationMode='random'   # start from random centroids
    , seed=666                      # fixed seed for reproducibility
)