1. Introducing the Dataset

 
Reshaping Data in Tableau Prep

7 lessons, 3 exercises

Overview

This lesson introduces the datasets that we will be using in this course. All datasets relate to the fees generated by a pharmaceutical distribution company.

Lesson Notes

Lesson Goal

Examine the data and determine the data cleaning steps required

Reorganizing sales data

In this course, we’ll work with new data from the pharmaceutical company Altevica. In this lesson, we investigate the company’s sales data.

The data is spread across 5 Excel files, one for each year between 2012 and 2016. Within each Excel file, there are 4 sheets, one for each of the products: Lomina, Samtan, Tridesta, and Wedicare. In total, we have 20 separate datasets.

In this course, we’ll demonstrate how to combine all 20 into a single dataset and we’ll then combine this master sales dataset with the customer address dataset from the previous Tableau Prep course.
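To make the layout concrete, here is a minimal Python sketch of the 20 datasets and of what stacking them into one table involves. It is only an illustration and is not how Tableau Prep does it. The "Sales_<year>.xlsx" file naming and the "Year" and "Product" column names are assumptions, since the notes do not give the real file or field names.

```python
from itertools import product

YEARS = range(2012, 2017)  # 2012 through 2016, one Excel file per year
PRODUCTS = ["Lomina", "Samtan", "Tridesta", "Wedicare"]  # one sheet per product


def list_sources(years=YEARS, products=PRODUCTS):
    """Return one (file_name, sheet_name) pair per dataset."""
    # The "Sales_<year>.xlsx" naming pattern is hypothetical.
    return [(f"Sales_{year}.xlsx", name) for year, name in product(years, products)]


def union_rows(tables):
    """Stack the rows of many tables into one list, tagging each row with its source.

    `tables` maps a (year, product_name) pair to a list of row dicts.
    The "Year" and "Product" column names are hypothetical.
    """
    combined = []
    for (year, product_name), rows in tables.items():
        for row in rows:
            combined.append({**row, "Year": year, "Product": product_name})
    return combined


sources = list_sources()
print(len(sources))  # prints 20 (5 files x 4 sheets)
```

Tagging each row with its year and product matters because that information lives only in the file and sheet names, so it would otherwise be lost once the rows are stacked.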

Transcript

In this course we're going to examine different methods of reshaping our data in Tableau Prep. In the first few lessons, we'll look at methods for merging data from multiple sources into a single dataset.

We'll then look at changing the granularity of the data by aggregating it.

We'll finish with a lesson on pivoting our data from wide to tall.

In this lesson we'll examine the data for a pharmaceutical firm called Altevica and determine the data cleaning steps required to prepare the data for interpretation by Tableau Desktop.

In this course we'll examine Altevica's sales data.

Altevica has sales data spanning five years covering four products, Lomina, Samtan, Tridesta and Wedicare.

However, prior to the decision to purchase Tableau, there was no rigorous format for data storage.

An inexperienced Excel user stored the company's sales data in multiple locations. As a result, there's a separate Excel file for each year of data from 2012 to 2016.

Within each of these Excel files there's a sheet for each of the four products.

In total, the company's sales data is spread across 20 datasets.

Altevica would like to develop some visualizations using these 20 datasets, along with the customer information dataset we manipulated in the previous course.

Unfortunately the company is having trouble using all this data in Tableau Desktop.

Working with one dataset in Tableau Desktop is far more convenient than working with 21. As such, our aim in this course will be to merge the sales datasets and customer information dataset into a single master dataset.

In the next lesson we'll start by examining how we can use unions and joins to merge datasets.