Sign in or start a free trial to avail of this feature.
5. Variance and Standard Deviation
These two metrics quantify how far your data is dispersed from the mean - they can be calculated manually or through in-built Excel functions.
Calculating Variance Manually (00:04)
The variance and the standard deviation are important metrics that we can use to measure how far our data is spread around the average. We can calculate the variance of a data set manually by following these steps:
Find the difference between each value and the mean of the data set.
Square each of these values. This ensures that the distance between each point and the mean is positive, and stops values below the mean canceling out values above the mean.
Find the average of these squared values.
Calculating Standard Deviation Manually (02:19)
It’s difficult to interpret variance intuitively. Because we square the deviations as part of the calculation, the variance is measured in square units. For example, the variance of revenues for the sports bar is measured in dollars squared.
We can solve this problem by computing the standard deviation, which is simply the square root of the variance. This is measured in the same units as the data set. A smaller standard deviation indicates the numbers are closer to the mean of the data set.
Using Excel Functions (03:14)
Excel provides inbuilt functions for calculating the variance and standard deviation. The function VAR.P can be used to find variance, and STDEV.P can be used to find standard deviation. Both these functions take an array of cells as their input, and return the variance or standard deviation of those values.
The P in both functions stands for population. When we’re performing descriptive statistics, we want to use the population versions of these two functions.
In the previous lesson, we learned how to create a frequency distribution for our sports bar. In this lesson, we'll learn how to calculate variance and standard deviation in Excel. Variance and standard deviation are two important metrics that quantify how far your data is dispersed from the mean. These two metrics can be calculated with in-built Excel functions, but in this lesson I'm going to first calculate them manually on a simple example to make sure you fully understand how they work. The example in question has 5 different dollar amounts with an average of 6 dollars. I'd like to calculate the variance and standard deviation of this data set. The first step is to simply subtract the average value from each data point. So I'll write equals 2 dollars, and then subtract the average, which is at J3, and then anchor this cell with F4.
I'll then copy this formula for the remaining four cells.
And this leaves me with the set of numbers, some positive and some negative. Variance and standard deviation don't care if a number is above or below the average, they just care how far. Since distance is always positive, we must find a way to convert the numbers from negative to positive, so they don't cancel each other out, later on in the calculation. To do this, we simply square each number.
So I'll write equals, select the first data point, write a caret, which on my keyboard is Shift+6, and put it to the power of 2.
I'll then copy this for the remaining 4 cells.
This now gives me a set of numbers that highlights the distance each point is away from the mean. The variance is simply the average of these numbers. So I'll add them together using the Sum function, and then I'll divide them by 5. So I'll write equals Sum, select the cells, and then divide by 5, which is the number of data points.
And this give me a variance of 8.
Now unfortunately variance isn't a very intuitive number. Because the unit of variance in this case is dollars squared because we squared the difference between the data points and mean earlier in the equation. A much more intuitive number is the standard deviation, which is simply the square root of the variance. I'll write equals sqrt, open a bracket, and find the variance.
I'll then press enter, and this give me a standard deviation of $2.83. This means that on average our data points are $2.83 away from the mean value.
If I now change these numbers off camera, bringing the numbers closer together and keeping the mean the same, you can see the standard deviation decrease, because on average our numbers are closer to the mean. In this way, standard deviation quantifies intuitively how far the data is dispersed from the mean, or average value. Thankfully, we don't have to complete these separate steps when we are calculating standard deviation and variance. Excel has some handy inbuilt functions we'll use for the sports bar. So for variance, I'll write equals var, and select the P option, which is for the entire population.
I'll then simply select revenues as my input.
And this gives me a rather unintuitive value of over $14,000,000, so instead we'll focus on standard deviation. I'll write equals stdev, select the P option, because we have the entire population, and again my input is revenues.
When I press enter, I'll find that on average, my data is $3,762 away from the mean value of $5,738.
Standard deviation is a very important metric in statistics, and it's imperative you understand how it's calculated and how to interpret it. To help you with this task, I've created an exercise on the third tab that asks you to calculate the mean, median, and standard deviation for a rival sports bar, called Solo. When you've calculated your results, try to compare Solo's performance against Virtuoso. I'll leave my answer in the after file below the video.