2. Average, Median and Mode
These are three of the most popular descriptive statistics. Learn how to calculate and interpret them in this lesson.
Average, median and mode
Average: The sum of all datapoints divided by the number of datapoints
Median: The middle value in our dataset when data are ordered from smallest to largest
Mode: The most commonly occurring value in our dataset
=AVERAGE: Calculates the average for an array of cells
=MEDIAN: Calculates the median for an array of cells
=MODE.SNGL:Calculates the mode for an array of cells
CTRL + F3: Open Name Manager
ALT + N: Create a new name
CTRL + SHIFT + →: Select all cells within data region
CTRL + →: Move to the last cell in the data region
CTRL + Page Up/Page Down: Move to next sheet
We'll start our analysis of the Sports Bar by calculating three very common descriptive statistics: the average or arithmetic mean, the median and the mode.
To make sure our formulas are nice and readable, and to save some time, I'll first create two named ranges for our data columns.
I'll press Ctrl + F3 to create the Name Manager, and Alt + N to create a New Name.
The first will be called Revenues, and in the Refers to box I'll simply select the Revenue column.
I'll press Alt + N again to create another name, and this will be called Dates, and in the Refers to box I'll select the first column with Ctrl + Shift + Down Arrow.
I'll press OK again and Esc to close the Name Manager dialog box.
Let's begin our analysis by now calculating the average daily revenue.
This value can be calculated manually by adding up the total revenue for the three month period and dividing by the total number of days.
Thankfully, Excel has an inbuilt AVERAGE function, so we'll use that instead to calculate the figure.
I'll write “=average” and then I’ll simply type revenues and close the bracket.
And this tells me that the average is $5,738.
Now let's find the median revenue.
The median is calculated by sorting the data from the smallest to the largest value, and then identifying the middle value in this list.
The middle value can be the same as the average, but it's almost always a different number.
The median can be calculated with an inbuilt Excel function, so I'll type “=median”, again I'll type revenues, close the bracket and press Enter.
And interestingly, we can see that the median is considerably lower than the average.
Can you guess why this might be? Well, it turns out that you can actually gain some interesting insights by examining the difference between the average and the median.
If the average is considerably greater than the median, it means there are a few large entries that are skewing the data.
A cleaver example to illustrate the difference between the median and the average is to take five of your friends and calculate the average and median of their net worth.
Now replace the richest friend with Bill Gates and his net worth, which we will say for argument sake is $10,000,000,000.
And now we can see that the average has gone through the roof, but the median or middle value stays the same.
This also applies to Virtuoso Sports Bar.
A couple of very large revenue days, for example the 6 and 7 of April, are dragging the average well above the median.
On the flip side, if the average was actually lower than the median, it would point to a few very bad days in the Sports Bar that’s dragging the average down.
Typically, companies tend to prioritize the average over the median in their analysis.
But I’d recommend calculating both, because the difference between the two values can tell you a lot about the revenue profile of your business.
To wrap up this lesson, we're going to look at the mode.
Mode returns the most frequently occurring value in a data set.
So I'll write “=mode” and I'll select singular, and then I'll simply select my revenues column.
And when we click Enter we get an N/A error, and this is because our revenue data does not have frequently occurring identical values because revenue is a continuous variable.
For the mode to be of use, your data must be discreet, which means that can only be a finite number of outcomes possible.
A scenario where you might want to use the mode is in sales transaction data, where you have multiple products on sale.
The mode could quickly help you find your most popular product among this data.