Regularization

Regularization was originally developed to cope with ill-posed problems, where the problem is underconstrained, that is, it allows multiple solutions for the given data, or where the data and the solution contain too much noise (A.N. Tikhonov, A.S. Leonov, A.G. Yagola, Nonlinear Ill-Posed Problems, Chapman and Hall, London). Adding a penalty function that skews the solution away from candidates lacking a desired property, such as smoothness in curve fitting or spectral analysis, usually resolves the ambiguity.
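
As a brief illustration of this idea (the symbols A, x, b, Gamma, and lambda are introduced here and are not from the text above), the classical Tikhonov formulation of a linear ill-posed problem adds a quadratic penalty to the least-squares objective:

$$\hat{x} = \arg\min_{x}\; \|A x - b\|_2^2 + \lambda \|\Gamma x\|_2^2$$

Here $\Gamma$ is the Tikhonov matrix (the identity in the simplest case, or a difference operator when smoothness is the desired property), and $\lambda > 0$ controls how strongly the penalty skews the solution.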

The choice of the penalty function is somewhat arbitrary, but it should reflect the desired skew in the solution. If the penalty function is differentiable, it can be incorporated into the gradient descent process; ridge regression is an example where the penalty is the L2 metric of the weights, that is, the sum of squares of the coefficients.
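
To make the differentiability point concrete, here is the ridge objective and its gradient for a linear model; the symbols ($\mathbf{w}$ for the weights, $\lambda$ for the penalty strength, $\eta$ for the learning rate) are introduced for illustration only:

$$J(\mathbf{w}) = \sum_{i=1}^{n}\left(y_i - \mathbf{w}^{\top}\mathbf{x}_i\right)^2 + \lambda\,\|\mathbf{w}\|_2^2,
\qquad
\nabla J(\mathbf{w}) = -2\sum_{i=1}^{n}\left(y_i - \mathbf{w}^{\top}\mathbf{x}_i\right)\mathbf{x}_i + 2\lambda\,\mathbf{w}$$

Each gradient descent step, $\mathbf{w} \leftarrow \mathbf{w} - \eta\,\nabla J(\mathbf{w})$, therefore shrinks the weights toward zero in addition to fitting the data.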

MLlib currently implements L1, L2, and a mixture thereof called Elastic Net, as was shown in Chapter 3, Working with Spark and MLlib. The L1 regularization effectively penalizes the number of non-zero entries in the regression weights, but has been known to converge more slowly. Least Absolute Shrinkage and Selection Operator (LASSO) uses L1 regularization.
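
The following is a minimal sketch of how these penalties are selected with the DataFrame-based spark.ml API; the Spark session setup and the toy data are illustrative assumptions, not code from Chapter 3:

import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

object RegularizationExample extends App {
  val spark = SparkSession.builder.master("local[*]").appName("regularization").getOrCreate()
  import spark.implicits._

  // Toy regression data: (label, features); purely illustrative.
  val training = Seq(
    (1.0, Vectors.dense(0.0, 1.1, 0.1)),
    (0.0, Vectors.dense(2.0, 1.0, -1.0)),
    (0.0, Vectors.dense(2.0, 1.3, 1.0)),
    (1.0, Vectors.dense(0.0, 1.2, -0.5))
  ).toDF("label", "features")

  // regParam sets the overall penalty strength; elasticNetParam mixes the norms:
  // 0.0 is pure L2 (ridge), 1.0 is pure L1 (LASSO), values in between are Elastic Net.
  val lasso = new LinearRegression()
    .setRegParam(0.1)
    .setElasticNetParam(1.0)

  val model = lasso.fit(training)
  println(s"coefficients: ${model.coefficients}, intercept: ${model.intercept}")

  spark.stop()
}

With setElasticNetParam(1.0) many of the fitted coefficients are driven exactly to zero, which is the sparsity-inducing behavior of the L1 penalty described above.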

Another way to reduce the uncertainty in underconstrained problems is to take into account prior information that may come from domain experts. This can be done with Bayesian analysis, by introducing additional factors into the posterior probability (probabilistic rules are generally expressed as multiplications rather than sums). However, since the goal is often to minimize the negative log likelihood, the Bayesian correction can often be expressed as a standard regularizer as well.
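
To sketch that correspondence (the symbols are again introduced for illustration), the maximum a posteriori estimate maximizes the log posterior, which splits into a log likelihood and a log prior:

$$\hat{\mathbf{w}}_{\mathrm{MAP}} = \arg\max_{\mathbf{w}}\;\log p(\mathbf{y}\mid X,\mathbf{w}) + \log p(\mathbf{w})$$

With a zero-mean Gaussian prior, $-\log p(\mathbf{w})$ contributes a term proportional to $\|\mathbf{w}\|_2^2$, the L2 (ridge) penalty; with a Laplace prior, it contributes a term proportional to $\|\mathbf{w}\|_1$, the L1 (LASSO) penalty.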
