# Correlation Coefficient – Explanation and Examples

The correlation coefficient of a set of paired data is a number between $-1$ and $1$ that shows how strongly, and in which direction, two variables are linearly related.

A number closer to $0$ indicates little or no linear relationship. A number closer to $1$ indicates a positive correlation, while a number closer to $-1$ indicates a negative correlation.

Correlation coefficients are widely used in statistical analysis. The higher the absolute value of the correlation coefficient, the stronger the association between two variables.

This section covers:

• What is a Correlation Coefficient?
• Correlation Coefficient Definition
• How to Find Correlation Coefficient

## What is a Correlation Coefficient?

A correlation coefficient is a number that shows how strongly two variables are associated. The closer the absolute value of the coefficient is to 1, the stronger the association between the two variables.

Specifically, values closer to $1$ indicate a strong positive association, while values closer to $-1$ indicate a strong negative association. That is, when the value is closer to $1$, the dependent variable tends to increase as the independent variable increases. When the value is closer to $-1$, the dependent variable tends to decrease as the independent variable increases.

When the correlation coefficient is closer to $0$, it indicates a lack of linear association between the two variables.

As a rule of thumb, a correlation coefficient greater than $0.8$ or less than $-0.8$ is generally considered strong, though the exact cutoff varies by field.
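
The $0.8$ threshold above can be turned into a small labeling helper. This is only a sketch: the function name `describe_r`, its `strong` parameter, and the wording of the labels are illustrative assumptions, not standard terminology.

```python
def describe_r(r, strong=0.8):
    """Label a correlation coefficient using a rule-of-thumb cutoff.

    The default cutoff of 0.8 follows the text; it is a convention,
    not a universal rule.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r > strong:
        return "strong positive"
    if r < -strong:
        return "strong negative"
    if r > 0:
        return "weak or moderate positive"
    if r < 0:
        return "weak or moderate negative"
    return "no linear association"


label_high = describe_r(0.93)
label_low = describe_r(-0.85)
label_mid = describe_r(0.4)
```

Here `label_high` is `"strong positive"`, `label_low` is `"strong negative"`, and `label_mid` is `"weak or moderate positive"`.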

### Correlation Coefficient Definition

A correlation coefficient is a number between $-1$ and $1$ that shows how strongly two variables are linearly associated. Usually this number is denoted as $r$.

Data with no linear pattern will have values closer to $0$, data that fall close to a line with positive slope will have values closer to $1$, and data that fall close to a line with negative slope will have values closer to $-1$.
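
The three cases can be checked numerically. The sketch below uses a hypothetical helper, `pearson_r`, written with a shortcut form of the sample formula (sum of products of deviations divided by the square root of the product of the sums of squared deviations). The data sets are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient of the paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # points exactly on a line with positive slope
falling = [10 - 3 * x for x in xs]  # points exactly on a line with negative slope
no_trend = [2, 1, 3, 1, 2]          # made-up values with no linear trend

r_rising = pearson_r(xs, rising)      # 1 (up to rounding)
r_falling = pearson_r(xs, falling)    # -1 (up to rounding)
r_no_trend = pearson_r(xs, no_trend)  # 0 (up to rounding)
```

Real data rarely fall exactly on a line, so values strictly between these extremes are the norm.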

## How to Find Correlation Coefficient

The correlation coefficient is meaningful for bivariate quantitative data, that is, data in which each observation consists of two numerical values. For example, height and shoe size or temperature and humidity are bivariate quantitative data.

The correlation coefficient measures how closely the data follow a linear relationship. A value near $0$ does not rule out a non-linear relationship between the variables.

Calculating this number by hand is certainly possible, but it does take a significant amount of time, especially as the number of data points increases.

To calculate $r$ for bivariate data with independent variable $x$ and dependent variable $y$:

1. Calculate the mean of all the $x$ values, $\bar{x}$.
If there are $n$ data points, the mean is $\bar{x}=\frac{\sum\limits_{k=1}^n x_k}{n}$. That is, the sum of all the $x$ terms divided by the total number of terms.
2. Calculate the mean of all the $y$ values, $\bar{y}$.
If there are $n$ data points, the mean is $\bar{y}=\frac{\sum\limits_{k=1}^n y_k}{n}$. That is, the sum of all the $y$ terms divided by the total number of terms.
3. Calculate the standard deviation of all the $x$ terms, $s_x$.
The standard deviation is $s_x=\sqrt{\frac{\sum\limits_{k=1}^n (x_k-\bar{x})^2}{n-1}}$. This is a complicated-looking formula, but it simply finds how much the typical data point deviates from the mean.
4. Calculate the standard deviation of all the $y$ terms, $s_y$.
The standard deviation is $s_y=\sqrt{\frac{\sum\limits_{k=1}^n (y_k-\bar{y})^2}{n-1}}$. Again, this is a complicated-looking formula, but remember it just finds how much the typical data point deviates from the mean.
5. Calculate the $z$-score for each of the $x$ terms. The $z$-score (also known as the standard score) of $x_k$ is equal to $\frac{x_k-\bar{x}}{s_x}$. This number makes data from different samples easy to compare.
6. Similarly, calculate the $z$-score for each of the $y$ terms. For $y_k$, this is equal to $\frac{y_k-\bar{y}}{s_y}$.
7. Finally, calculate the correlation coefficient $r$ as the sum of the products of corresponding $z$-scores divided by $n-1$. That is, multiply the $z$-score of each $x$ value by the $z$-score of the corresponding $y$ value. Then, add together those products and divide by $1$ less than the total number of data points. In symbols, $r=\frac{1}{n-1}\sum\limits_{k=1}^n \left(\frac{x_k-\bar{x}}{s_x}\right)\left(\frac{y_k-\bar{y}}{s_y}\right)$.
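
The seven steps above can be sketched in Python as follows. The function name `correlation_by_z_scores` and the sample data are illustrative assumptions; the code assumes at least two data points and that neither variable is constant (otherwise a standard deviation is $0$ and the $z$-scores are undefined).

```python
from math import sqrt


def correlation_by_z_scores(xs, ys):
    """Compute r for paired lists xs and ys by following the seven steps."""
    n = len(xs)

    # Steps 1 and 2: means of the x values and the y values.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Steps 3 and 4: sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))

    # Steps 5 and 6: z-scores of every x value and every y value.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]

    # Step 7: sum the products of paired z-scores and divide by n - 1.
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_by_z_scores(xs, ys)
```

For this made-up data set, $r=\frac{6}{\sqrt{60}}\approx 0.77$, a moderately strong positive association.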

### Example 2

Find the z-scores for the following data points.

### Example 3

Find the correlation coefficient for this set of data.

### Example 4

Interpret a correlation coefficient in context.

### Example 5

Use a calculator to find the correlation coefficient. Then, interpret the correlation coefficient in context.