5. Data Blending


Data blending is an alternative to joins and is pretty easy to implement in Tableau, through relationships. In this lesson, we will learn how to blend data and when to use this technique.

To explore more Kubicle data literacy subjects, please refer to our full library.


Data blending

- An easier alternative to joins that are easy to build for the user
- Represents a LEFT JOIN between a primary and secondary data source
- Blending happens after aggregation rather than at a row level
- Unfortunately, blending is done in the background and can be hard for users to find errors

When you may want to use blending instead of joins

- Cross-database joins are not possible (e.g. Oracle ESSBASE)
- Data has different levels of granularity
- Data needs cleaning in Tableau
- Datasets are extremely large


In the previous lesson, we learned how to create joins between different datasets. In this lesson, we're going to explore an alternative to joins called data blending, and when you might want to use this new technique.

It's important to understand when to use data blending instead of joins, and there are a couple of instances where this is the case. Previously, when we were building relationships between datasets in different locations, only data blending would work. So for example, if I wanted to join an Excel sheet with a Google Analytics account, I couldn't do this, because joins only worked when datasets were held in the same data source. However, in Tableau 10, we can now use cross-database joins for many data sources, but not all. So for example, if I wanted to connect an extract only connection, such as Salesforce, with Oracle S-Base, then I couldn't use joins. I could only use data blending. As a consequence, you'll always need to keep data blending in mind when your data sources do not support joins. The next instance where you would consider data blending is when data has different levels of granularity.

This often results in the duplicate problem we saw in the previous lesson. For example, suppose you were analyzing daily transactional data against monthly quota data. The transactional data might capture all transactions, but the quota data might be at a monthly level. Because the transactions are captured at a different level of detail or granularity in both datasets, you might use data blending instead of joins when combining these datasets. Another time to consider data blending is when the data needs cleaning. If you tried joining unclean datasets, you may have problems if the column values don't match up exactly. For example, you might need to rename a column, change a column data type, or create a calculation. Blending allows you to do this in Tableau, and fix problems that might be occurring with a join. The last reason you might want to consider using data blending is if your datasets are very large. And consequently, the time spent querying a join is overly long. In certain instances, data blending can have shorter query times than joins if the datasets are very large. Because Tableau handles combining the data after the data is aggregated, not before. Let's create a data blend on the fee earner column, as we did in the joining lesson. In the current Tableau notebook, I have both datasets connected as extracts. If I go to data, edit relationships, I can see that transaction data is the primary data source, which is what I want, and the secondary data source is fee earner by division. I can switch these by simply hitting the dropdown. But I want transaction data as the master table, and the primary data source. If I hit automatic, Tableau sees that fee earner exists in both datasets, and creates a relationship between these two columns automatically. If I need to create a join manually, I can simply hitch the columns that I want in both datasets.

And as you can see, it only shows dimensions in the list, not measures. Then I'll press okay.

And okay again.

Now I'm going to create my chart that shows fee earner value.

And switching to my second dataset, I now have a link across the fee earner column, which allows me to use division as a color code.

So as you can see, relationships work very easily and very quickly in Tableau, but there are some health warnings associated with this technique, and a few problems that can emerge, which I'll explain more in the next lesson.

Advancing in Tableau
Optimizing for Performance


My Notes

You can take notes as you view lessons.

Sign in or start a free trial to avail of this feature.

Free Trial

Download our training resources while you learn.

Sign in or start a free trial to avail of this feature.

Free Trial