5. Variance and Standard Deviation

Subtitles Enabled

Overview

These two metrics quantify how far your data is dispersed from the mean - they can be calculated manually or through in-built Excel functions.

Lesson Notes

Variance and standard deviation

- These two metrics measure how far our data deviates from the mean (or average)
- Standard deviation is the square root of the variance
- Standard deviation is a more intuitive metric because its unit is the same as the mean

Useful functions

=VAR.P: Calculate the variance of a population (for descriptive statistics)
=STDEV.P: Calculate the standard deviation of a population (for descriptive statistics)
=VAR.S: Calculate the variance of a sample (for inferential statistics)
=STDEV.S: Calculate the standard deviation of a sample (for inferential statistics)

Keyboard shortcuts

CTRL + : Move to end of data region
CTRL + SHIFT + : Select all cells within data region
SHIFT + : Select adjacent cell
ALT + E, S, F: Paste formulas
ALT + E, S, T: Paste formats

Formula for calculating standard deviation (for descriptive statistics)

Transcript

Variance and standard deviation are two important metrics that quantify how far your data is disbursed from the mean.

These two metrics can be calculated with inbuilt Excel functions, but in this lesson I’m going to first calculate them manually on a simple example to make sure you fully understand how they work.

The example in question is five different dollar amounts with an average of \$6.

I'd like to calculate the variance and standard deviation of this data set.

The first step is to simply subtract the average value from each data point.

So I'll write “=\$2” and then subtract the average, which is at J3.

I’ll then anchor this cell with F4.

I'll then copy this formula for the remaining four cells.

And this leaves me with a set of numbers, some positive and some negative.

Variants and standard deviation don't care if a number is above or below the average, they just care how far.

Since distance is always positive, we must find a way to convert the negative numbers to positive, so they don't cancel each other out later on in the calculation.

To do this, we simply square each number.

So I'll write =, select the first data point, write a carat, which on my keyboard is Shift 6, and put it to the power of two.

I'll then copy this for the remaining four cells.

This now gives me a set of numbers that highlight the distance each point is away from the mean.

The variance is simply the average of these numbers.

So I'll add them together using the SUM function, and then divide them by five.

So I'll write “=sum”, select the cells and then divide by five, which is the number of data points.

And this gives me a variance of 8.

Now unfortunately, variance isn't a very intuitive number because the unit of variants in this case is dollar squared.

Because we squared the difference between the data points and the mean earlier in the equation.

A much more intuitive number is the standard deviation, which is simply the square root of the variants.

I'll write “=SQRT”, open a bracket and find the variants.

I'll then press Enter and this gives me a standard deviation of \$2.83.

This means that on average our data points are \$2.83 away from the main value.

If I now change these numbers off camera, moving the numbers closer together and keeping the mean the same, you can see that the standard deviation decreases, because on average, our numbers are closer to the mean.

In this way, standard deviation quantifies intuitively how far your data is disbursed from the mean or average value.

Thankfully, we don't have to complete these separate steps when we're calculating standard deviation and variants.

Excel has some handy inbuilt functions we'll use for the Sports Bar.

So for variants, I'll write “=VAR” and select the P option, which is for the entire population.

I'll then simply select revenues as my input.

Press Enter to complete.

And this gives me a rather unintuitive value of over \$14,000,000, so instead we’ll focus on standard deviation.

I'll write “=stdev”, select the P option because I have the entire population, and again my input is revenues.

When I press Enter, I'll find that on average my data is \$3,762 away from the main value of \$5,738.

Standard deviation is a very important metric in statistics and it’s imperative that you understand how it’s calculated and how to interpret it.

To help you in this task, I've created an exercise on the third tab that asks you to calculate the mean, median and standard deviation for a rival sports bar called Solo When you've calculated your results, try to compare Solo's performance against Virtuoso.

I'll leave my answer in the after file below the video.

3 mins

4 mins

4 mins

3 mins

4 mins

4 mins

5 mins