As we discussed in Chapter 3, A Tour of Machine Learning Classifiers Using Scikit-learn, regularization is one approach to tackling the problem of overfitting by adding additional information, thereby shrinking the parameter values of the model to induce a penalty against complexity. The most popular approaches to regularized linear regression are Ridge Regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net.
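Ridge regression is an L2 penalized model where we simply add the squared sum of the weights to our least-squares cost function:
$$J(\mathbf{w})_{\text{Ridge}} = \sum_{i=1}^{n} \left( y^{(i)} - \hat{y}^{(i)} \right)^2 + \lambda \lVert \mathbf{w} \rVert_2^2$$
Here:
$$\text{L2}: \quad \lambda \lVert \mathbf{w} \rVert_2^2 = \lambda \sum_{j=1}^{m} w_j^2$$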
By increasing the value of the hyperparameter $\lambda$, we increase the regularization strength and shrink the weights of our model. Please note that we don't regularize the intercept term $w_0$.
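The shrinking effect can be illustrated with a small NumPy sketch. It assumes the Ridge cost is the sum of squared errors plus $\lambda$ times the squared sum of the weights, and it solves that problem in closed form on made-up data. The helper name ridge_closed_form and the synthetic dataset are hypothetical and serve only as an illustration; centering the data is one way to leave the intercept unpenalized.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Minimize sum((y - (w0 + X @ w))**2) + lam * sum(w**2); w0 is not penalized."""
    # Centering X and y takes the intercept out of the penalized problem
    X_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - X_mean, y - y_mean
    # Regularized normal equations: (Xc^T Xc + lam * I) w = Xc^T yc
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    # Recover the unpenalized intercept from the means
    w0 = y_mean - X_mean @ w
    return w0, w

# Synthetic data, made up for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 5.0 + X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=50)

lams = [0.0, 1.0, 100.0, 10000.0]
norms = []
for lam in lams:
    w0, w = ridge_closed_form(X, y, lam)
    norms.append(np.linalg.norm(w))
    print(f"lambda={lam:>8}: w={np.round(w, 3)}, w0={w0:.3f}")
```

As the penalty grows, the length of the weight vector decreases toward zero, while the intercept is free to move toward the mean of the targets.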
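An alternative approach that can lead to sparse models is the LASSO. Depending on the regularization strength, certain weights can become zero, which makes the LASSO also useful as a supervised feature selection technique:
$$J(\mathbf{w})_{\text{LASSO}} = \sum_{i=1}^{n} \left( y^{(i)} - \hat{y}^{(i)} \right)^2 + \lambda \lVert \mathbf{w} \rVert_1$$
Here:
$$\text{L1}: \quad \lambda \lVert \mathbf{w} \rVert_1 = \lambda \sum_{j=1}^{m} \lvert w_j \rvert$$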
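The sparsity induced by the L1 penalty can be observed with a short sketch that uses scikit-learn's Lasso estimator, whose alpha argument sets the regularization strength. The dataset is synthetic and made up for illustration, with only two informative features. Note that scikit-learn's Lasso scales the squared-error term by 1/(2n), so alpha is not numerically identical to a $\lambda$ added to the plain sum of squared errors.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data, made up for illustration: only the first 2 of 10 features matter
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
y = 4.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.randn(100)

# Count the nonzero weights for increasing regularization strengths
n_nonzero = {}
for alpha in [0.001, 0.1, 1.0, 10.0]:
    lasso = Lasso(alpha=alpha).fit(X, y)
    n_nonzero[alpha] = int(np.sum(lasso.coef_ != 0))
    print(f"alpha={alpha}: {n_nonzero[alpha]} nonzero weights")
```

With a sufficiently strong penalty every weight is driven to exactly zero, whereas a weak penalty keeps the informative features in the model.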
However, a limitation of the LASSO is that it selects at most $n$ variables if $m > n$, where $m$ is the number of features and $n$ is the number of training samples. A compromise between Ridge regression and the LASSO is the Elastic Net, which has an L1 penalty to generate sparsity and an L2 penalty to overcome some of the limitations of the LASSO, such as the number of selected variables.
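To make the combination concrete, the Elastic Net cost is often written with two separate penalty strengths. The form below is a sketch under assumed notation: $n$ training samples, $m$ features, weights $w_j$, predictions $\hat{y}^{(i)}$, and two hypothetical strengths $\lambda_1$ and $\lambda_2$. Library implementations such as scikit-learn's ElasticNet express the same idea differently, through one overall strength and a mixing ratio.

```latex
J(\mathbf{w})_{\text{ElasticNet}}
  = \sum_{i=1}^{n} \left( y^{(i)} - \hat{y}^{(i)} \right)^2
  + \lambda_1 \sum_{j=1}^{m} w_j^2
  + \lambda_2 \sum_{j=1}^{m} \lvert w_j \rvert
```

Setting $\lambda_2 = 0$ in this form leaves only the L2 (Ridge) penalty, and setting $\lambda_1 = 0$ leaves only the L1 (LASSO) penalty.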
These regularized regression models are all available via scikit-learn, and their usage is similar to that of the regular regression model, except that we have to specify the regularization strength via the parameter $\lambda$, which can be optimized, for example, via k-fold cross-validation.
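One way to tune the regularization strength with k-fold cross-validation is a grid search. The sketch below assumes scikit-learn's Ridge estimator (whose alpha argument sets the regularization strength) together with GridSearchCV; the synthetic data and the candidate grid are made up for illustration and are not a recommendation.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data, made up for illustration
rng = np.random.RandomState(1)
X = rng.randn(80, 5)
y = X @ np.array([2.0, 0.0, -1.0, 0.5, 0.0]) + 0.5 * rng.randn(80)

# Candidate regularization strengths (a hypothetical grid)
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

# 5-fold cross-validation; scikit-learn maximizes the score, hence the negative MSE
search = GridSearchCV(Ridge(), param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

best_alpha = search.best_params_["alpha"]
print("best alpha:", best_alpha)
print("mean cross-validated MSE:", -search.best_score_)
```

The same pattern should carry over to the other regularized models by swapping the estimator and, where applicable, adding further hyperparameters to the grid.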
A Ridge regression model can be initialized as follows:
>>> from sklearn.linear_model import Ridge
>>> ridge = Ridge(alpha=1.0)
Note that the regularization strength is regulated by the parameter alpha, which is similar to the parameter $\lambda$. Likewise, we can initialize a LASSO regressor from the linear_model submodule:
>>> from sklearn.linear_model import Lasso
>>> lasso = Lasso(alpha=1.0)
Lastly, the ElasticNet implementation allows us to vary the L1 to L2 ratio:
>>> from sklearn.linear_model import ElasticNet
>>> elanet = ElasticNet(alpha=1.0, l1_ratio=0.5)
For example, if we set l1_ratio to 1.0, the ElasticNet regressor would be equal to LASSO regression. For more detailed information about the different implementations of linear regression, please see the documentation at http://scikit-learn.org/stable/modules/linear_model.html.
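The equivalence between ElasticNet with l1_ratio set to 1.0 and LASSO regression can be checked numerically. The following sketch fits both estimators on the same synthetic data (made up for illustration, with an arbitrarily chosen alpha of 0.1) and compares the learned weights and intercepts.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic data, made up for illustration
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + 0.1 * rng.randn(100)

# Same regularization strength; l1_ratio=1.0 removes the L2 part of the penalty
lasso = Lasso(alpha=0.1).fit(X, y)
elanet = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)

# Compare the fitted parameters of both models
same = bool(
    np.allclose(lasso.coef_, elanet.coef_)
    and np.isclose(lasso.intercept_, elanet.intercept_)
)
print("identical fit:", same)
```

Values of l1_ratio between 0 and 1 mix the two penalties, so the fitted weights would then generally differ from the pure LASSO solution.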