1. Extract Filters
Extract filters are used to reduce the amount of data stored in your extracts so as to improve performance. Find out how to use them in this lesson.
The 6 different types of filters in Tableau
- Extract filters
- Data source filters
- Context filters
- Dimension filters
- Measure filters
- Table calculation filters
Extract filters
- Are the lowest-level filter and are especially useful when pulling data from very large sources
- Help you specify what data needs to be included in the extract and what data can be ignored
- Ensure that you don’t have unused data stored in your extract, which would slow loading times and performance
- Should always be considered when connecting to databases or cloud storage platforms
Tableau has six different types of filters available. In the first few lessons of this course, I'm going to explain each type of filter in detail, as you'll need to use different filters for different use cases. The diagram on screen shows the order in which filters are performed, from the lowest-level filter, called an Extract filter, all the way through to Table Calculation filters.
Let's start at the top with Extract filters.
These filters are performed when you connect to a dataset for the first time and do not want to import all the data from that data source into a Tableau extract.
Say, for example, you are connected to a huge server with millions of rows of data.
Your analysis requires data for a discrete time period and only 10 columns.
Therefore you would be much better off applying an Extract filter to pull only the data that you need into the Tableau data extract. As a consequence, Tableau will then only query a much smaller data set and have much faster response times.
Another reason for using Extract filters is data security. Say your data source contains personal customer information or private and confidential information that you do not need for your analysis. An Extract filter can be used to filter the sensitive data so that it does not leave the server and appear in the extract on your local machine.
Let's see how an extract works with an example in Tableau.
In this data source, I have 1,000 customers with revenue numbers, payment dates, regions, subregions and locations. What I'd like to do is create an Extract filter that includes only companies based in the Midwest region with revenue above 20,000, which are the focus of my analysis.
So at the top, I go to Connection, make sure Extract is selected and hit Edit.
In this dialog box, I'll add a filter. The filter will be Region, and I'll select Midwest and press OK.
Now I'll add another filter, select Revenue. Press OK.
And I'll simply set a minimum bound of 20,000.
I won't aggregate the data, and I will include All rows.
While not relevant here, the Incremental refresh can be valuable particularly when connecting to very large datasets.
If you are connecting to a huge database regularly, I would recommend using an Incremental refresh which will only add the new rows since the previous refresh and ignore all previous entries. This can save a huge amount of time particularly when connecting to very large databases. When I'm happy with my Extract filters, I'll simply press OK.
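The idea behind an Incremental refresh can be sketched outside Tableau in a few lines of plain Python. This is not Tableau's API; the function name, the sample rows and the choice of `payment_date` as the column that identifies new rows are all assumptions made for illustration.

```python
from datetime import date

def incremental_refresh(extract_rows, source_rows, key="payment_date"):
    """Append only source rows whose key is newer than the newest extracted row."""
    if not extract_rows:
        # Nothing extracted yet, so this behaves like a full refresh.
        return list(source_rows)
    latest = max(row[key] for row in extract_rows)
    new_rows = [row for row in source_rows if row[key] > latest]
    return extract_rows + new_rows

# Hypothetical data: the extract holds two rows, the source has gained a third.
extract = [
    {"customer": "A", "payment_date": date(2024, 1, 5)},
    {"customer": "B", "payment_date": date(2024, 1, 9)},
]
source = extract + [{"customer": "C", "payment_date": date(2024, 1, 12)}]

refreshed = incremental_refresh(extract, source)
print([row["customer"] for row in refreshed])
```

Only the row for customer C is read as new; the two rows already in the extract are left alone, which is where the time saving comes from on very large sources.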
And when my data source updates, you can see that for region I only have the Midwest, and for revenue all the values are above 20,000.
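For anyone who thinks in code, the two filters just applied amount to a simple row test. Here is a minimal sketch in plain Python, not anything Tableau itself runs; the sample rows and the function name are made up, and treating the 20,000 minimum as inclusive is an assumption.

```python
# Hypothetical sample rows; the lesson's data source has 1,000 customers.
customers = [
    {"customer": "A", "region": "Midwest", "revenue": 25_000},
    {"customer": "B", "region": "Midwest", "revenue": 12_000},
    {"customer": "C", "region": "Northeast", "revenue": 40_000},
    {"customer": "D", "region": "Midwest", "revenue": 20_000},
]

def apply_extract_filters(rows, region="Midwest", min_revenue=20_000):
    """Keep rows that pass both filters: region matches AND revenue >= minimum."""
    return [
        row for row in rows
        if row["region"] == region and row["revenue"] >= min_revenue
    ]

extract_rows = apply_extract_filters(customers)
print([row["customer"] for row in extract_rows])  # prints ['A', 'D']
```

A row has to pass both conditions to reach the extract, so customer B (revenue too low) and customer C (wrong region) are left behind at the source.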
Let's now create the extract.
And I'll call it Midwest 20,000.
In the folder on screen, I have two files. One is the Tableau extract for the full data set which is 131 kilobytes and the second is the dataset after I've applied my Extract filter which is only 84 kilobytes. So as you can see, these Extract filters are a great way of reducing the size of files. While these files in particular are not very big, much larger data sources particularly databases can see huge reductions in size by applying Extract filters correctly.
As a result, this improves the performance of your Tableau dashboards.
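To put a number on the saving for the two files in this example, here is the arithmetic as a short Python snippet. The two sizes are the ones quoted above; the variable names are only for illustration.

```python
full_kb = 131      # extract of the full data set
filtered_kb = 84   # extract after applying the Extract filter

saved_kb = full_kb - filtered_kb
saved_pct = saved_kb / full_kb * 100
print(f"Saved {saved_kb} KB ({saved_pct:.1f}% smaller)")  # Saved 47 KB (35.9% smaller)
```

So even on this small file the filtered extract is roughly a third smaller than the full one.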
Let's now return to Tableau and see how we can remove columns and not just filter rows.
Returning to my Extract filter, I can see a button down the bottom called Hide All Unused Fields. And what this does is remove any columns from our extract that aren't being used in the views.
When we have a data set with a lot of columns that we're not using as part of our visualizations, this button is very useful for reducing the size of our extract. Always consider using it if you're having performance problems with your Tableau dashboards.
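Conceptually, hiding unused fields is a column selection rather than a row filter. The sketch below shows that idea in plain Python; it is not how Tableau implements the button, and the function name, the sample rows and the list of fields used in views are assumptions.

```python
def hide_unused_fields(rows, used_fields):
    """Return copies of the rows containing only the fields that the views use."""
    return [
        {name: row[name] for name in used_fields if name in row}
        for row in rows
    ]

# Hypothetical rows with more columns than the views need.
rows = [
    {"customer": "A", "region": "Midwest", "subregion": "North", "revenue": 25_000},
    {"customer": "D", "region": "Midwest", "subregion": "South", "revenue": 20_000},
]
used_in_views = ["region", "revenue"]  # assumed: the only fields on any worksheet

slim_rows = hide_unused_fields(rows, used_in_views)
print(slim_rows[0])  # prints {'region': 'Midwest', 'revenue': 25000}
```

The number of rows stays the same, but every row carries fewer columns, which is why this helps most on wide data sets.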
In the next lesson, I'm going to move on to data source filters which are at a slightly higher level than our Extract filters.