Sign in or start a free trial to avail of this feature.
1. Creating Calculated Columns
Calculated columns let you add new columns to the tables in the data model. Learn how to create these new columns in this lesson.
Data Analysis Expressions - DAX
- Used to create formulas in Power BI Desktop
- In some ways similar to Excel formulas, but optimized to work with Power BI’s structured data model
- Calculated columns are computed for each row of a table and stored in the data model
- A calculated column is only computed once, when it is created
- When referring to another column, refer to the table name in single quotes, then the column name in square brackets
- Single quotes around the table name can be omitted if the table name has no spaces or special characters
- Including the table name is not required in certain circumstances, but it is best practice in all circumstances
In the previous courses, we've seen that Power BI is designed to allow you to create insightful reports in dashboards, without requiring a large amount of technical expertise. However, to fully utilize the abilities of Power BI, you should be familiar with data analysis expressions. Or DAX. DAX is a language used to create formulas in Power BI. Note that some DAX formulas are similar to Excel formulas. However, they're built to interact with data in different ways.
Excel contains unstructured data, meaning it isn't organized in any predefined manner, and cells can contain any type of data. In Power BI, data is structured, containing tables made up of columns of data. DAX is designed to work with data in this structured format.
In the next few courses, we'll teach you the most common functions and concepts in DAX.
We'll also demonstrate several business situations where you might find yourself using DAX in Power BI. In this lesson, we'll use DAX to create a calculated column that adds unit price information to our dataset. We'll start by introducing a dataset from a pharmaceutical company. The dataset records sales of various pharmaceutical products. Each row represents a sale of an individual product. We can see that each sale has a corresponding date, product name, sale amount, quantity sold, a customer id, representing who the product was sold to, and geographic information about the customer. Let's now move over to Power BI desktop. It's worth noting that DAX formulas can't be created in the online services, so all the DAX courses will use Power BI desktop. We'll start by importing the pharma dataset, which is currently housed in a CSV file. We'll navigate to the home tab, select get data, and then text slash CSV.
We'll then find the CSV file and select it.
This opens up a preview of the data.
This dataset comes from a company based in the UK, using the international date standard. However, my machine defaults to the US standard. As a result, I'll need to edit the query off-screen. If your system defaults to the international standard, you can simply select load.
Note that we have the ability to create columns in any of the three views. That being said, it's best practice to create columns in data view, as we can see the columns being added to the data model.
Let's switch to data view now. While we're here, let's ensure that the date column is set to the date data type.
As mentioned previously, each sales transaction has a corresponding quantity sold and total sales amount among other fields. Let's say we wanna create a new column, showing the price per unit. This is simply the total price of a transaction divided by the quantity sold. This will allow us to compare the average price per unit of each product, and see which product sells for the highest and lowest prices. We can create new columns from either the home tab, or the modeling tab.
Let's select the modeling tab, and new column.
Note that a new column is added to the table, and the formula bar is now highlighted.
We'll start our formula, by typing in the name unit price, followed by the equals sign. As we start typing in sales, the auto complete suggests the appropriate functions and columns. We'll select the sales field from the list of options. We'll then divide by quantity and again accept the auto complete suggestion. Before we confirm this formula, let's look at the syntax.
The table name is enclosed in a set of single quotes, followed by the column name in square brackets. The single quotes around the table name are not needed if the table name does not contain spaces.
In this situation, we could actually omit the table name, as we're referring to columns in the table we're adding to.
That said, it's best practice to always include table names for reasons we'll discuss in a later lesson.
We'll press enter to create the column, and see that each row now has a value for unit price.
Unit price also appears in the fields list, with a table symbol, indicating it's a calculated column.
Let's use this new column in a visualization.
We'll switch to report view, create a clustered bar chart, and add both unit price and product name.
Note that the chart currently shows the sum of unit price by product.
This isn't very useful, so we'll change the unit price from sum to average.
Finally, we'll sort the chart by average of unit price.
On the resulting graph, we can see that Melacaine has the highest average unit price of 10.89. However, this number is a bit deceiving.
Our new column calculated the unit price per transaction.
This average does not take into account that each transaction may cover a different quantity of products sold.
To find out the actual average unit price, we'll need a different calculation.
As we've seen here, calculated columns are relatively easy to create, however they don't offer the best solution in all situations.
In the next lesson, we'll create a measure to calculate the average unit price, and discuss the difference between these two types of calculations.