Regression metrics

true_pred_reg is an RDD of tuples where the first element is the prediction from our linear regression model and the second element is the expected value (the number of hours worked per week). Here's how we create it:

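import pyspark.mllib.evaluation as ev

true_pred_reg = (
    final_data_hours_test
    .map(lambda row: (
        float(workhours_model_lm.predict(row.features)),
        row.label))
)

# Assumed step, not shown in the original listing: build the metrics object
metrics_lm = ev.RegressionMetrics(true_pred_reg)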

The metrics_lm object contains a variety of metrics: explainedVariance, meanAbsoluteError, meanSquaredError, r2, and rootMeanSquaredError. Here, we will print only a few of them:

print('R^2: ', metrics_lm.r2)
print('Explained Variance: ', metrics_lm.explainedVariance)
print('meanAbsoluteError: ', metrics_lm.meanAbsoluteError)
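For intuition, the error-based metrics can be reproduced by hand from a list of (prediction, label) pairs. The following is a minimal plain-Python sketch, not Spark code: the pairs are made-up numbers rather than values from our dataset, and the helper name regression_errors is hypothetical.

```python
import math

# Hypothetical (prediction, label) pairs; made-up numbers, not from our data.
pairs = [(38.0, 40.0), (45.0, 40.0), (20.0, 25.0), (50.0, 60.0)]


def regression_errors(pairs):
    """Compute MAE, MSE, and RMSE from (prediction, label) pairs."""
    n = len(pairs)
    # Absolute and squared differences between prediction and label
    abs_errors = [abs(pred - label) for pred, label in pairs]
    sq_errors = [(pred - label) ** 2 for pred, label in pairs]
    mse = sum(sq_errors) / n
    return {
        "meanAbsoluteError": sum(abs_errors) / n,
        "meanSquaredError": mse,
        "rootMeanSquaredError": math.sqrt(mse),
    }


errors = regression_errors(pairs)
print(errors)
```

The dictionary keys mirror the attribute names listed above only to make the correspondence easy to see; the sketch makes no claim about how Spark computes these values internally.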

Let's see what we got for the linear regression model:

Not unexpectedly, given what we have already seen, the model performs really poorly. Do not be too surprised by the negative R-squared: it turns negative when the model's predictions fit the data worse than simply predicting the mean of the labels, that is, when the predictions are essentially nonsensical.
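To see how R-squared can drop below zero, here is a small plain-Python sketch that uses the usual definition, 1 - SS_res / SS_tot. The data is made up for illustration and the function name r_squared is hypothetical; it is assumed, not confirmed by this section, that metrics_lm.r2 follows the same definition.

```python
def r_squared(pairs):
    """R-squared from (prediction, label) pairs: 1 - SS_res / SS_tot."""
    labels = [label for _, label in pairs]
    mean_label = sum(labels) / len(labels)
    # Residual sum of squares: how far the predictions are from the labels
    ss_res = sum((label - pred) ** 2 for pred, label in pairs)
    # Total sum of squares: how far the labels are from their own mean
    ss_tot = sum((label - mean_label) ** 2 for label in labels)
    return 1.0 - ss_res / ss_tot


# Made-up labels 30, 40, 50 (mean 40) paired with three sets of predictions
good = [(31.0, 30.0), (39.0, 40.0), (52.0, 50.0)]      # close to the labels
baseline = [(40.0, 30.0), (40.0, 40.0), (40.0, 50.0)]  # always predicts the mean
bad = [(60.0, 30.0), (10.0, 40.0), (90.0, 50.0)]       # far from the labels

print(r_squared(good))      # close to 1
print(r_squared(baseline))  # exactly 0
print(r_squared(bad))       # negative
```

Always predicting the mean gives an R-squared of zero, so any model whose squared errors exceed that baseline ends up with a negative value.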
