Heteroscedasticity

One of the fundamental assumptions of the regression approach is that the variance of the target is constant, that is, it is not correlated with either the independent variables (attributes) or the dependent variable (target). An example where this assumption might break is count data, which is generally described by a Poisson distribution. For a Poisson distribution, the variance equals the expected value, so it grows with the mean, and the higher values can contribute more to the final variance of the weights.
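A small simulation can illustrate the Poisson case. The sketch below is only an illustration under stated assumptions: the expected counts 4 and 25 are hypothetical values, and `sample_poisson` is a helper written for this example (Knuth's multiplication method), not part of any library.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable


def sample_poisson(lam):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, random.random()
    # Keep multiplying uniforms until the product drops below exp(-lam).
    while product > threshold:
        count += 1
        product *= random.random()
    return count


# Hypothetical expected counts, chosen only for illustration.
summary = {}
for lam in (4.0, 25.0):
    draws = [sample_poisson(lam) for _ in range(20000)]
    summary[lam] = (statistics.mean(draws), statistics.variance(draws))
    mean, var = summary[lam]
    print(f"lambda={lam}: mean={mean:.2f}, variance={var:.2f}")
```

In each case the sample variance should land close to the sample mean, so the group with the larger expected count is also the noisier one.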

While heteroscedasticity may or may not significantly skew the resulting weights, one practical way to compensate for it is to apply a log transformation to the target, a common choice for count data such as Poisson-distributed targets:

$$y' = \log(y)$$
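The effect of a log transformation can be sketched numerically. The example below assumes multiplicative log-normal noise rather than Poisson counts, because that is the setting where the log equalizes the variance exactly. The levels 10, 100, and 1000, the relative noise 0.2, and the helper `noisy_targets` are all hypothetical choices made for this illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def noisy_targets(level, n=5000, rel_sigma=0.2):
    """Targets around `level` with multiplicative log-normal noise (an assumed model)."""
    return [level * math.exp(random.gauss(0.0, rel_sigma)) for _ in range(n)]


raw_var, log_var = {}, {}
for level in (10.0, 100.0, 1000.0):
    y = noisy_targets(level)
    raw_var[level] = statistics.variance(y)                         # grows with the level
    log_var[level] = statistics.variance([math.log(v) for v in y])  # roughly constant
    print(f"level={level}: raw variance={raw_var[level]:.3g}, "
          f"log variance={log_var[level]:.4f}")
```

Under this assumed noise model the raw variance grows roughly with the square of the level, while the variance of the log-transformed targets stays near `rel_sigma ** 2` at every level.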

Other (parametrized) transformations include the Box-Cox transformation:

$$y^{(\lambda)} = \frac{y^{\lambda} - 1}{\lambda}$$

Here, $\lambda$ is a parameter (the log transformation is a special case, where $\lambda = 0$ is taken as the limit $\lambda \to 0$). Another option is Tukey's lambda transformation (for attributes between 0 and 1):

$$y^{(\lambda)} = \frac{y^{\lambda} - (1 - y)^{\lambda}}{\lambda}$$
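Both parametrized transformations are short enough to write out directly. The sketch below uses the standard textbook forms, (y^lam - 1) / lam for Box-Cox and (p^lam - (1 - p)^lam) / lam for Tukey's lambda, and treats lam = 0 as the limiting case (log and logit, respectively). The function names `box_cox` and `tukey_lambda` are chosen for this example and do not refer to any library's API.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y; lam = 0 is treated as the log limit."""
    if y <= 0:
        raise ValueError("Box-Cox needs y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def tukey_lambda(p, lam):
    """Tukey's lambda transform of p in (0, 1); lam = 0 is treated as the logit limit."""
    if not 0.0 < p < 1.0:
        raise ValueError("Tukey's lambda transform needs 0 < p < 1")
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


print(box_cox(4.0, 0.5))                   # (sqrt(4) - 1) / 0.5
print(box_cox(4.0, 1e-6), math.log(4.0))   # a small lam approaches the log
print(tukey_lambda(0.8, 1.0))              # lam = 1 reduces to 2p - 1
print(tukey_lambda(0.8, 0.0))              # lam = 0 is the logit, log(p / (1 - p))
```

In practice the parameter is usually chosen from the data, for example by maximizing a likelihood. That step is left out of this sketch.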

These transformations compensate for attributes that follow a Poisson binomial distribution, or for estimates of the probability of success in a sequence of n trials drawn from a potential mix of Bernoulli distributions.

Heteroscedasticity is one of the main reasons that logistic function minimization works better than linear regression with $L_2$ minimization in a binary prediction problem. Let's consider discrete labels in more detail.
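As a first look at why binary labels are heteroscedastic: a 0/1 label with success probability p has variance p(1 - p), so its noise level depends on the very quantity being predicted. The sketch below checks this by simulation. The probabilities 0.05, 0.5, and 0.95 are hypothetical values, and `bernoulli_variance` is a helper defined only for this example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable


def bernoulli_variance(p):
    """Theoretical variance of a 0/1 label with success probability p."""
    return p * (1.0 - p)


empirical = {}
for p in (0.05, 0.5, 0.95):
    labels = [1 if random.random() < p else 0 for _ in range(20000)]
    empirical[p] = statistics.pvariance(labels)
    print(f"p={p}: theoretical={bernoulli_variance(p):.4f}, "
          f"empirical={empirical[p]:.4f}")
```

The variance peaks at p = 0.5 and shrinks toward zero as p approaches 0 or 1, so a least-squares fit that assumes constant variance weights the observations inconsistently across the range of predictions.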
