Regression

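The following snippet creates the regression RDD of labeled points that we will use to predict the number of hours people work. The label is the standardized value at index 3, converted back to its original scale with mu and std; the remaining values form the feature vector: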

# Mean and standard deviation of the target column (index 3)
mu, std = sModel.mean[3], sModel.std[3]

final_data_hours = (
    final_data
    .map(lambda row: reg.LabeledPoint(
        row[1][3] * std + mu,  # label, back on its original scale
        ln.Vectors.dense([row[0]] + list(row[1][0:3]) + list(row[1][4:]))
    ))
)
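To make the row layout concrete, here is a minimal pure-Python sketch of the same transformation, without Spark. The sample rows, the values of mu and std, and the helper name to_label_and_features are all invented for illustration. It assumes each row is a pair of a leading value and a sequence of standardized features, which is what the indexing in the snippet suggests.

```python
# Hypothetical stand-ins for sModel.mean[3] and sModel.std[3].
mu, std = 40.0, 10.0


def to_label_and_features(row):
    """Split a (first_value, scaled_features) row into (label, features)."""
    first_value, scaled = row
    # Undo the standardization z = (x - mu) / std for the target column.
    label = scaled[3] * std + mu
    # Keep every scaled feature except index 3, with the leading value in front.
    features = [first_value] + list(scaled[0:3]) + list(scaled[4:])
    return label, features


# Made-up rows in the assumed (first_value, scaled_features) layout.
rows = [
    (1.0, (0.5, -0.25, 1.5, 0.5, 2.0)),
    (0.0, (-1.0, 0.0, 0.25, -1.5, 0.75)),
]

converted = [to_label_and_features(r) for r in rows]
for label, features in converted:
    print(label, features)
```

In the Spark version, the label and the feature list would be wrapped in reg.LabeledPoint and ln.Vectors.dense respectively. The point of the sketch is only that the target is dropped from the feature vector, so the model cannot see the value it is asked to predict.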