1. Extract Filters

 
Subtitles Enabled

Sign up for a free trial to continue watching this lesson.

Free trial

Filter, Groups, and Sets

8 lessons , 3 exercises , 1 exam

Start Course

Overview

Extract filters are used to reduce the amount of data stored in your extracts so as to improve performance. Find out how to use them in this lesson.

Lesson Notes

 

The 6 different types of filters in Tableau

- Extract filters
- Data source filters
- Context filters
- Dimension filters
- Measure filters
- Table calculation filters

Extract filters

- Are the lowest-level filter and are especially useful when pulling data from very large sources
- Help you specify what data needs to be included in the extract and what data can be ignored
- Ensure that you don’t have unused data stored in your extract, affecting loading times and performance
- When connecting to databases or cloud storage platforms, always consider inserting an extract filter

Transcript

Tableau has six different types of filters available, and in the first few lessons of this course, I'm going to explain each type of filter in detail, As you'll need to use different filters for different use cases. The diagram on screen shows the order in which filters are performed, with the lowest level filter called an extract filter all the way through to table calculation filters.

Let's start at the top with extract filters.

These filters are performed when you connect to a data set for the first time, and do not want to import all the data from that data source into a Tableau extract.

Say for example, you were connected to a huge server with millions of rows of data.

Your analysis requires data for a discreet time period and only ten columns. Therefore, you would be much better off applying and extract filter to pull only the data that you need into the Tableau data extract. As a consequence, Tableau will then only query a much smaller data set and have much faster response times.

Another reason for using extract filters is data security. Say your data source contains personal customer information or private and confidential information that you do not need for your analysis. An extract filter can be used to filter this sensitive data so that it does not leave the server and appear in the extract on your local machine.

Let's see how an extract works with an example in Tableau. In this data source I have one thousand customers with revenue numbers, payment dates, and regions, sub regions, and locations. What I'd like to do is create an extract filter just including companies based in the Midwest region above 20,000 in revenue, which is the focus of my analysis.

So at the top, I've got a connection.

Make sure extract is selected and hit Edit.

In this dialog box, I'll add a filter, and the filter will be region.

And I'll select Midwest and press OK.

Now I'll add another filter, select revenue, press OK, and I'll simply set a minimum bound of 20,000.

I want to only aggregate data and I will include all rows.

While not relevant here, the incremental refresh can be valuable, particularly when connecting to very large data sets.

If you are connecting to a huge data base regularly. I would recommend using an incremental refresh which will only add the new rows since the previous refresh, and ignore all previous entries.

This can save a huge amount of time, particularly when connecting to very large databases. When I populate my extract filters, I'll simply press OK.

And when my database updates, you can see that in the region I only have the Midwest, and for revenue, all the values are above 20,000.

Let's now create the extract, and I'll call it Midwest 20,000.

In the folder on screen, I have two files, one is the Tableau extract for the full data set, which is 131 Kilobytes, and the second is the data set after I've applied my extract filter, which is only 84 Kilobytes. So as you can see, these extract filter are a great way of reducing the size of files. While these files in particular are not very big, much larger data sources, particularly databases, can see huge reductions in size by applying extract filters correctly.

As a result, this improves the performance of your Tableau dashboards.

Let's now return to Tableau and see how we can remove columns and not just filter them.

Returning to my extract filter, I can see a button down the bottom, called Hide All Unused Fields, and what this does is remove any columns from our extract that aren't being used in the views.

When we have a data set with a lot of columns that we're not using as part of our visualizations, this button is very useful for reducing the size of our extract. Always consider using it if you're having performance problems with your Tableau dashboards. In the next lesson I'm going to move on to data source filters, which are at a slightly higher level than our extract filters.