The statistical dependence expressed in a correlation does not imply a causal relationship between the two variables; the famous line on this is "correlation does not imply causation." A correlation between two variables or datasets therefore indicates only an association, not a causal relationship or dependence. For example, there is a correlation between the amount of ice cream purchased on a given day and the weather.
The correlation measure, known as the correlation coefficient, is a number that describes the size and direction of the relationship between the two variables. Its value ranges from -1 to +1, so its magnitude ranges from 0 to 1. The direction of the relationship is expressed through the sign: a + sign indicates a positive correlation and a - sign indicates a negative correlation. The higher the magnitude, the stronger the correlation, with a magnitude of 1 termed a perfect correlation and a value of 0 indicating no linear correlation.
The most popular and widely used correlation coefficient is the Pearson product-moment correlation coefficient, known as r. It measures the linear correlation or dependence between two variables, x and y, and takes values between -1 and +1.
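To make the sign, the magnitude, and the word "linear" concrete, here is a minimal sketch using NumPy's `corrcoef` function. The sample values and the helper name `pearson_r` are hypothetical, chosen only for illustration. Two exact straight lines give correlations of +1 and -1, while a parabola over values symmetric around zero gives a correlation of about 0 even though y is completely determined by x.

```python
import numpy as np

# Hypothetical sample data: x is symmetric around zero.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])


def pearson_r(a, b):
    """Return Pearson's r, read off NumPy's 2x2 correlation matrix."""
    return float(np.corrcoef(a, b)[0, 1])


r_increasing = pearson_r(x, 2 * x + 1)   # exact increasing straight line
r_decreasing = pearson_r(x, -3 * x + 4)  # exact decreasing straight line
r_parabola = pearson_r(x, x ** 2)        # strong but nonlinear dependence

print("increasing line:", round(r_increasing, 6))
print("decreasing line:", round(r_decreasing, 6))
print("parabola:", round(r_parabola, 6))
```

The parabola case shows that a coefficient near 0 rules out only a linear relationship, not dependence in general.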
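The sample correlation coefficient, r, is defined as follows:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

Here, $\bar{x}$ and $\bar{y}$ are the sample means of x and y, and n is the number of paired observations.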
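This can also be written as follows:

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

Here, we have omitted the summation limits; every sum runs over the n paired observations, i = 1, ..., n.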
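As a quick check that the two ways of writing r agree, the following sketch computes the sample Pearson coefficient twice in plain Python: once from deviations about the sample means, and once from raw sums of x, y, xy, x squared, and y squared. It assumes the standard Pearson definitions; the function names and the temperature and ice cream figures are made up for illustration and are not real measurements.

```python
import math


def pearson_r_deviation(xs, ys):
    """r from deviations about the means:
    sum((x - mx)(y - my)) / (sqrt(sum((x - mx)^2)) * sqrt(sum((y - my)^2)))
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return numerator / (spread_x * spread_y)


def pearson_r_raw_sums(xs, ys):
    """r from raw sums:
    (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_xx - sum_x ** 2)
                   * math.sqrt(n * sum_yy - sum_y ** 2))
    return numerator / denominator


# Made-up illustrative data: daily temperature and ice cream units sold.
temperature = [18.0, 21.0, 24.0, 27.0, 30.0, 33.0]
ice_cream_sales = [120.0, 135.0, 160.0, 170.0, 205.0, 220.0]

r_deviation = pearson_r_deviation(temperature, ice_cream_sales)
r_raw = pearson_r_raw_sums(temperature, ice_cream_sales)
print(r_deviation, r_raw)
```

Both functions should return the same value up to floating-point rounding. The raw-sums form needs only one pass over the data, but it subtracts large, nearly equal quantities and can lose precision when the values are large relative to their spread, so the deviation form is usually the safer choice numerically.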