After dealing with a variety of problems in our dataset, from identifying missing values hidden as zeros and imputing those missing values to normalizing features measured on different scales, it's time to put all of our scores together into a single table and see which combination of feature engineering steps performed best:
| Pipeline description | # rows model learned from | Cross-validated accuracy |
| --- | --- | --- |
| Drop rows with missing values | 392 | 0.7449 |
| Impute values with 0 | 768 | 0.7304 |
| Impute values with mean of column | 768 | 0.7318 |
| Impute values with median of column | 768 | 0.7357 |
| Z-score normalization with median imputing | 768 | 0.7422 |
| Min-max normalization with mean imputing | 768 | 0.7461 |
| Row-normalization with mean imputing | 768 | 0.6823 |
It seems that we were finally able to beat the drop-rows baseline by applying mean imputation and min-max normalization to our dataset, while still making use of all 768 available rows. Great!
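The winning combination, mean imputation followed by min-max normalization, can be sketched in a few lines of NumPy. This is a minimal illustration on a tiny made-up matrix (the actual dataset, model, and cross-validation loop are not reproduced here); in practice you would typically chain `SimpleImputer` and `MinMaxScaler` inside a scikit-learn `Pipeline` so the statistics are learned only from the training folds:

```python
import numpy as np

# Toy feature matrix with missing values marked as NaN
# (in our dataset, they were originally hidden as zeros).
X = np.array([
    [1.0, 200.0],
    [np.nan, 400.0],
    [3.0, np.nan],
    [5.0, 600.0],
])

# Mean imputation: replace each NaN with its column's mean,
# computed while ignoring the missing entries.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)

# Min-max normalization: rescale every column to the [0, 1] range,
# so features on very different scales become comparable.
col_min = X_imputed.min(axis=0)
col_max = X_imputed.max(axis=0)
X_scaled = (X_imputed - col_min) / (col_max - col_min)

print(X_scaled)
```

Because imputation fills the gaps before scaling, all 768 rows survive into the model, which is exactly why this pipeline could outperform the 392-row drop-rows approach.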