3. Exploring the Dataset

 
Subtitles Enabled

Sign up for a free trial to access more free content.

Free trial

Overview

In this lesson, we’ll create several charts that give us some insights into the dataset. Specifically, we’ll analyze the customer distribution of the company by several different variables

Lesson Notes

Analysis of Customers

  • Most customers are businesses, not individuals
  • Retail is the largest sector, but no sector is especially dominant
  • Cheaper plans have a larger proportion of individual customers, more expensive plans have more business customers

Cross Filtering Directions

  • When we create a table counting customers by plan, we initially get an error
  • This is because relationships with single direction filtering do not allow information from the Plans table to flow to the Customers table
  • Changing a relationship to bidirectional filtering creates a path from the Plans table to the Customers table and makes the visual work

Transcript

In the previous lesson, we looked over a data set containing sales data and then imported that data to Power BI Desktop.

In this lesson, we'll create a report exploring the customer and plans tables. This will allow us to get an idea of some aspects of the dataset and demonstrate that we can quickly create a visually appealing report. We'll start by looking at the distribution of customers with a pie chart.

We'll navigate to the visualizations pane, select a pie chart, and place it in the top left corner of the canvas. We'll then drag type of customer to the legend well, and customer ID to the values well.

This chart shows us that the majority of customers are businesses.

Let's add a donut chart to the canvas and place it to the right of the pie chart.

We'll again drag customer ID to the values well, but this time, drag location to the legend well. We can see that the US and Great Britain have about the same number of customers, while Ireland trails behind.

Although pie charts and donut charts are often considered undesirable visualization types, they work reasonably well in situations with a small number of customer types or locations, such as this.

Next, we'll create a chart showing the distribution of customers by sector. As there are many different sectors, we'll need a new chart type. We'll select a tree map, and place it in the bottom right corner of the canvas.

We'll then add sector to the group well, and drag customer ID to the values well.

This chart shows us the size of each sector in the dataset.

We can see that the largest sector is retail, followed by other and technology.

Finally, we'll add a table that displays the number of customers subscribed to each plan.

We'll create a table and position it below the pie and donut charts.

We'll add the plan name field from the plans table and the customer ID field from the customers table.

This just gives us a list of the various customers, so we'll change the aggregation of the customer ID field to count.

As a final step, we'll navigate to the format section of the visualizations pane, select grid, and increase the text size to 14.

We can see that this table is not working as expected. The table is showing 5,402 customers for each plan. This is due to the cross-filtering directions in the data model.

To better understand this problem, let's go back to relationship view.

Each of the two relationships has a filtering direction indicated by the arrow.

Our table aims to use the plan name field from the plans table to filter the customer IDs from the customers table.

Fields from the plans table can flow into the transactions table, but they do not then continue to the customers table because the filtering for this relationship runs in the opposite direction.

To fix this, we'll navigate to the home tab, and select manage relationships.

We'll select the relationship between the transactions table and the customers table, and then edit.

Next, we'll change the cross-filter direction to both, press okay, and then close.

We can see that the arrow for this relationship now flows both ways, creating a path from the plans table through to the customers table.

We'll return to report view, and see that our table is now correctly showing the number of customers in each plan.

As mentioned in a previous course, bi-directional filtering is not enabled by default as it can lead to circular reference errors.

However, this is not a problem in our relatively small data model.

You many notice that the number of customers for each plan does not add up to the total of 5,402.

This is because each customer can change their plan at any month in the dataset.

As such, some customers are counted more than once. Effectively, this table measures the number of customers that were subscribed to each plan at some point during the two years of transactions in our dataset.

You may also wonder why some customers have a blank plan. This refers to churning customers, who canceled their subscription, and therefore are not subscribed to any plan.

They appear once as a churned customer, before Interslice stops recording data on them.

To finish off our report, we'll add a couple of slicers.

We'll put the first slicer in the top right corner of the canvas, and add the type of customer field from the customers table.

We'll then add a second slicer next to it, and add plan name from the plans table.

Our report is now complete. We can use our new slicers to quickly explore some of our data.

For example, if we select the basic plan from the plan name slicer, we can see that almost a quarter of the basic customers are individual consumers.

This is a much larger proportion than for the dataset as a whole. If we select standard, advanced, and then enterprise, we can see the proportion of businesses increases consistently. This tells us that individual consumers are more likely to use the cheaper plans, while businesses tend to use the high-end plans. We'll stop the lesson here.

In the next lesson, we'll begin our first significant area of analysis, and look at monthly recurring revenue.