Binarization

Binarization is another type of transformation, in which a continuous variable is converted into a binary variable. For example, given a continuous variable named AGE, we could binarize it around 50 years by assigning ages of 50 and above a value of one and ages below 50 a value of zero. Binarizing can save time and memory when you have many variables; in practice, however, the raw continuous values usually perform better because they carry more information.
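The AGE example can be sketched in pandas as follows. This is a minimal illustration rather than the code demonstrated earlier: the sample ages and the column name AGE_50_PLUS are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; only the AGE column name comes from the text.
df = pd.DataFrame({"AGE": [23, 49, 50, 51, 72]})

# Ages of 50 and above become 1, ages below 50 become 0.
# The comparison yields booleans, which astype(int) converts to 0/1.
df["AGE_50_PLUS"] = (df["AGE"] >= 50).astype(int)

print(df)
```

Note that the comparison uses `>=`, so an age of exactly 50 falls on the "one" side of the threshold, matching the description above.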

Binarization can be performed in pandas using the code demonstrated earlier, but scikit-learn also provides a Binarizer class for binarizing features. For instructions on using the Binarizer class, see http://scikit-learn.org/stable/modules/preprocessing.html#binarization.
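As a rough sketch of how the Binarizer class might be applied to the AGE example, consider the code below. The sample ages are invented for illustration. One detail worth knowing is that Binarizer maps values strictly greater than its threshold to one, so a threshold of exactly 50 would send an age of 50 to zero. To reproduce the "50 and above" rule, this sketch assumes a threshold just below 50.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical sample data: one feature (AGE), one row per person.
ages = np.array([[23.0], [49.0], [50.0], [51.0], [72.0]])

# Binarizer sets values > threshold to 1 and values <= threshold to 0,
# so with threshold=50.0 an age of exactly 50 maps to 0.
strict = Binarizer(threshold=50.0).fit_transform(ages)

# To treat "50 and above" as 1, use the largest float below 50
# as the threshold (an assumption made for this example).
just_below_50 = float(np.nextafter(50.0, -np.inf))
inclusive = Binarizer(threshold=just_below_50).fit_transform(ages)

print(strict.ravel())
print(inclusive.ravel())
```

Binarizer is stateless, so `fit` learns nothing from the data; it is mainly useful because it fits into a scikit-learn pipeline like any other transformer.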
