Sign in or start a free trial to avail of this feature.
6. Calculated Fields Part 2
In this second lesson on calculated fields, we explore some of the common errors encountered when building formulas, such as mixing aggregated and non-aggregated data.
To explore more Kubicle data literacy subjects, please refer to our full library.
Mixing aggregated and non-aggregated data
- Mixing these two types of data is a very common error in Tableau
- It results from trying to perform an operation between a single value (e.g. SUM(Sales) and a column (e.g. [Quantity])
- To fix this formula, simply place a SUM function around the [Quantity] field
Columns or single values
- If you want to add an additional column to the dataset, don't use aggregate functions
- Instead write formulas just with fields (e.g. [Sales]/[Quantity])
- If you want to calculate single values, use aggregate functions (e.g. Avg([Sales])
- This distinction will help you avoid mixing aggregated and non-aggregated data in formulas
In the previous lesson, we saw how easy it was to create calculated fields, such as price per unit.
But also, how easy it was to make a mistake using calculated fields as this example showed you.
In the first bar chart, we took the average of the price per unit column to come up with our answer.
Unfortunately, this column, which we can see in the data source does not take into account quantity sold after the calculation is made. And so, to find the total average price per unit for the whole dataset I get an incorrect answer.
The second calculation, which takes all of the sales for the full-time period and divides by all of the quantity for the full-time period is the correct answer.
Let's now recreate this formula in another calculated field So, I'll go to analysis, and create calculated field And I'll call this price per unit v2 and my calculated field will be the sum of sales divided by the sum of quantity and this tells me that my calculation is valid If you do make a mistake in your formulas for example, leaving out a bracket Tableau's error correction is actually very good and helps you quickly fix any errors in your formula When I add the bracket back in the error disappears When you are including functions such as sum in your calculated fields you can simply type the function as I have done here or you can search for the functions in this dropdown.
Let's take a look at some of the aggregate functions available in the dropdown, many of which you would have seen in a previous lesson such as average, count and CountD exclude, fixed and include are level of detail functions which will be covered in a later course max, median, min, percentile, sum and the remaining functions are all available in the pill dropdown that we explored in an earlier lesson each function is also given a nice description on the right-hand side.
One problem early users often run into is combining aggregated data and unaggregated data in a single formula.
For example, if I remove sum from quantity I get the following error.
which states I cannot mix aggregate and non-aggregate arguments within this function.
And this actually makes sense.
The sum of sales returns a single value say 1.3 million pounds whereas quantity is a column and I can't divide a single value by a column however, when I add back in the sum of quantity, now I have a single value divided by another single value And this is a nice way to think about calculated fields Sometimes you want to create a calculated field that represents a new column in the data source whereas other times, you want to create a formula that returns a single value such as this example.
And when this formula is used in a particular visualization it can be split by product name, date, customer ID, etcetera.
Thinking about calculated fields in this way makes them much easier to create and results in a lot less errors So, when you are creating calculated fields first ask yourself, do I want to create a single value or do I want to create a new column and this will help you decide if aggregate functions are required.
As an exercise, try using some simple aggregate functions, such as average and median, in your formulas to see if you can create some insightful visualizations with new calculated fields which should appear in your measures