222 High-Function Business Intelligence in e-business
A.1.4 Correlation
Correlation is normalized covariance.
That is, the covariance is divided by the square root of the product of the
variances of the two variables.
The correlation value (most texts refer to it as a coefficient) is a number between
-1 and 1.
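As a minimal sketch of the definition above, the following Python code computes the correlation from a covariance and two variances. It assumes population (divide by n) forms of variance and covariance; the function names and the salary and bonus figures are hypothetical and chosen only for illustration.

```python
import math


def variance(xs):
    """Population variance: mean squared deviation from the mean (assumed form)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def covariance(xs, ys):
    """Population covariance of two equal-length sequences (assumed form)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance divided by the square root of the product of the variances."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))


# Hypothetical salary and bonus figures, invented for illustration only.
salary = [30000, 42000, 55000, 61000, 78000]
bonus = [1500, 2500, 2400, 4000, 5200]

r = correlation(salary, bonus)
print(round(r, 3))  # a value between -1 and 1
```

Because the same divisor appears in the covariance and in both variances, using sample (divide by n - 1) forms throughout would give the same correlation value.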
The meaning of the values of CORR(X,Y), the correlation coefficient, is shown in Table A-3.
Table A-3 Correlation coefficient meaning
Correlation coefficients measure the degree of the linear relationship between
two variables:
+1 indicates a perfect direct linear relationship. If we plotted salary and bonus data that had a correlation of 1, the data points would lie on a straight line with positive slope.
-1 also indicates a perfect linear relationship, but an inverse one. The data points would also lie on a straight line, but with negative slope.
0 indicates that there is no linear relationship. The data points show no straight-line trend.
Therefore, correlation is a measure of how closely the data points align along a straight line.
Figure A-1 is a visual representation of varying correlation coefficients.
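The three reference values described above (+1, -1, and 0) can be reproduced with small data sets. The sketch below is illustrative only: the helper name corr and all data values are assumptions, and the helper again uses population forms of covariance and variance.

```python
import math


def corr(xs, ys):
    """Correlation: covariance over the square root of the product of variances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
rising = [2 * v + 1 for v in x]     # points on a line with positive slope
falling = [10 - 3 * v for v in x]   # points on a line with negative slope
curved = [(v - 3) ** 2 for v in x]  # symmetric parabola: related to x, but not linearly

for name, y in [("rising", rising), ("falling", falling), ("curved", curved)]:
    print(name, round(corr(x, y), 6))
```

For these inputs the three results should come out as 1, -1, and 0. The curved case is a reminder that a correlation of 0 rules out only a linear relationship: the values are fully determined by x, yet no straight-line trend exists.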
Correlation value                        Meaning
CORR(X,Y) between 0 and 1 (positive)     The attributes are directly linearly related. As one increases, so does the other.
CORR(X,Y) equal to 0                     There is no linear relationship between the two attributes.
CORR(X,Y) between -1 and 0 (negative)    The attributes are inversely linearly related. As one increases, the other decreases.
$$\mathrm{CORR}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$
