15. Scatter and Bubble Plots
Scatter and bubble plots can be useful when you want to study the relationship between two or three numeric variables. We’ll learn how to create them in this lesson.
Lesson Goal (00:10)
The goal of this lesson is to use scatter and bubble plots to understand the relationship between two numeric variables.
Understanding Scatter Plots (00:33)
A scatter plot is a chart that is used to analyze the relationship between two numeric variables. It displays data as a series of points. Here, we create a scatter plot showing the total revenue and the total number of users for each sales person.
The numeric fields are added to the X-axis well and the Y-axis well. This produces a chart with a single point, because each field is summarized across the whole data set. To see multiple points, we add a field to the Details well. For example, adding sales person to this well creates a point for each sales person. The pattern of the points gives us insight into the relationship between the two variables. For example, if the points form a linear pattern, it suggests a strong correlation between the numeric fields.
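To make the "single point versus one point per sales person" behavior concrete, here is a minimal sketch in plain Python rather than Power BI. It assumes the numeric fields are summarized with a sum, and the rows, names, and values are hypothetical, made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (sales person, revenue, users). Values are made up.
rows = [
    ("Ana", 1200.0, 10),
    ("Ana", 800.0, 6),
    ("Ben", 500.0, 4),
    ("Cara", 1500.0, 12),
    ("Ben", 700.0, 6),
]

# Nothing in the Details well: every row collapses into one (x, y) point.
single_point = (sum(r[1] for r in rows), sum(r[2] for r in rows))

# Sales person in the Details well: one (revenue, users) point per person.
totals = defaultdict(lambda: [0.0, 0])
for person, revenue, users in rows:
    totals[person][0] += revenue
    totals[person][1] += users
points = {person: (rev, usr) for person, (rev, usr) in totals.items()}

print("single point:", single_point)
for person, point in sorted(points.items()):
    print(person, point)
```

Grouping by sales person plays the same role here as the Details well: it decides how many points the same underlying rows turn into.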
After creating a scatter plot, there are numerous formatting options available in the Format pane. For example, we can add a data label to each point or give each point an individual color.
Understanding Bubble Plots (02:11)
A bubble plot is a variation of the scatter plot where the size of each point is based on the value of another field. For example, we could size the point for each sales person based on the number of companies they sold to. To create a bubble plot in Power BI, we add a field to the Size well of a scatter plot. This field is then used to determine the size of each point.
Animated Scatter and Bubble Plots (02:54)
Power BI also lets you create animated scatter and bubble plots. To create one, we add a date field to the Play Axis well of a scatter or bubble plot. We can then see an animated version of the plot for each date in the data set. This plot won’t always be effective, but in the right circumstances, it can be an interesting way of tracking data over time.
In the previous lesson, we learned how to use the Analytics pane to add reference lines to charts.
In this lesson, we'll analyze the relationship between two numeric fields, using scatter and bubble plots. In our dataset, we may want to know how the relationship between revenue and number of users varies between salespeople. Are some salespeople bringing considerably more revenue per user than others? A scatterplot makes it easy to find out. We'll start by creating an empty scatterplot.
We can see that the available wells are slightly different than what we've used so far.
To construct the plot, we'll start by selecting variables for the x and y-axes.
We'll put revenue on the x-axis and users on the y-axis.
This produces a single dot.
To see a dot for each salesperson, we'll drag Sales person to the Details well. At this point, we'll create a page-level filter on Sales person and exclude Barcus and Stefani.
We can now see the remaining salespeople much more clearly. The linear pattern of these dots suggests that revenue and users are fairly strongly correlated, and that no salesperson is getting a particularly high or low price per user.
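The two conclusions drawn from the chart, a strong correlation and a similar price per user, can also be checked numerically. The sketch below is plain Python, not something Power BI produces for us. It computes a Pearson correlation coefficient and the revenue per user for each salesperson. The names and totals are hypothetical and are not the figures from the lesson's dataset.

```python
import math

# Hypothetical per-salesperson totals: name -> (revenue, users).
points = {
    "Ana": (2000.0, 16),
    "Ben": (1200.0, 10),
    "Cara": (1500.0, 12),
    "Dev": (2600.0, 21),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

revenues = [rev for rev, _ in points.values()]
users = [usr for _, usr in points.values()]
r = pearson(revenues, users)

# Revenue per user: similar values mean nobody stands out on price.
revenue_per_user = {name: rev / usr for name, (rev, usr) in points.items()}

print(f"correlation r = {r:.3f}")
for name, value in revenue_per_user.items():
    print(f"{name}: {value:.2f} per user")
```

A coefficient close to 1 corresponds to the tight, upward-sloping line of dots seen in the scatterplot, and a salesperson with an unusually high or low revenue per user would appear as a dot well away from that line.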
Let's take a moment to format this chart and make it a bit easier to read. We'll select the Format icon and start by ensuring that fill point is turned on. This makes the points solid so that they're easy to see.
We'll also turn on category labels so we can see which salesperson is represented by each point.
On a graph with more data points, this can create a lot of clutter, but we have few enough data points here that it looks reasonable.
We'll now turn on color by category so that each salesperson gets their own color.
What we currently have here is a scatterplot.
In a scatterplot, each dot is the same size.
If we select the Fields icon again, we can see the Size well. The Size well is specific to the scatter chart type and allows each dot to be a different size. Activating this well changes a scatterplot into a bubble plot.
Let's drag Company Name to the Size well.
Each bubble is sized according to the number of companies its salesperson has sold to. The size variable is not particularly useful here as all the bubbles are of a similar size.
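To see why the bubbles end up looking alike, we can sketch how a size value might be derived for each salesperson. This is plain Python with hypothetical sales records and company names. It assumes the Size well summarizes Company Name as a distinct count, which matches how the lesson describes the result, though the summarization Power BI applies by default may differ.

```python
from collections import defaultdict

# Hypothetical (sales person, company name) pairs, one per sale.
sales = [
    ("Ana", "Acme"), ("Ana", "Globex"), ("Ana", "Acme"),
    ("Ben", "Initech"), ("Ben", "Umbrella"),
    ("Cara", "Acme"), ("Cara", "Hooli"), ("Cara", "Globex"),
]

# Collect the distinct companies each person has sold to.
companies = defaultdict(set)
for person, company in sales:
    companies[person].add(company)

# Assumed size value: number of distinct companies per salesperson.
bubble_size = {person: len(names) for person, names in companies.items()}

# Ratio of largest to smallest size; a value near 1 means similar bubbles.
spread = max(bubble_size.values()) / min(bubble_size.values())

print(bubble_size)
print(f"largest / smallest = {spread:.2f}")
```

When the ratio between the largest and smallest size value is small, as it is here, the size dimension adds little to the chart.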
We can reasonably conclude that each salesperson has sold to a similar number of companies. However, with the right data, a bubble plot like this can provide an appealing way to view the relationship between up to three variables. If we look at the other available wells in the Visualizations pane, we can see one labeled Play Axis. Let's drag our Date field to this well.
A play button now appears at the bottom left of the chart. When we click play, we can see an animated plot for each date in our dataset.
We can even click on an individual bubble to track its progress over time.
It doesn't always make sense to include a play axis.
In this case, it's not ideal as our data is collected on a daily basis and not every salesperson makes a sale every day.
As a result, bubbles constantly appear and disappear.
However, in the right circumstances, this can be an interesting way of tracking a variable over time.
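The appearing and disappearing bubbles follow directly from how the data is sliced by date. Here is a minimal sketch in plain Python, assuming each date on the play axis becomes one frame that contains only the salespeople with a sale on that date. The dates, names, and amounts are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales: (date, sales person, revenue, users).
sales = [
    (date(2024, 1, 1), "Ana", 500.0, 4),
    (date(2024, 1, 1), "Ben", 300.0, 2),
    (date(2024, 1, 2), "Ana", 400.0, 3),
    (date(2024, 1, 3), "Ben", 600.0, 5),
    (date(2024, 1, 3), "Cara", 900.0, 7),
]

# One frame per date; each frame maps person -> (revenue, users) for that day.
frames = defaultdict(dict)
for day, person, revenue, users in sales:
    rev, usr = frames[day].get(person, (0.0, 0))
    frames[day][person] = (rev + revenue, usr + users)

# Report who has a bubble in each frame and who is absent from it.
all_people = {person for _, person, _, _ in sales}
for day in sorted(frames):
    present = sorted(frames[day])
    missing = sorted(all_people - set(present))
    print(day, "shown:", present, "missing:", missing)
```

Every frame in this sketch is missing at least one salesperson, which is the same effect we see when the chart is played over daily data.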
This marks the end of our first course on visualizations in Power BI.
In this course, we learned how to use some of the most common visualization types in Power BI and saw how to format these visualizations effectively. In the next course, we'll look at the remaining visual types, including maps, waterfall charts and cards. We'll also look at more options for modifying visuals and the data used to create them, including methods of aggregation, quick measures and drilling down into visualizations. For now, you should have a good grasp of the most common visuals featured in Power BI and how to use them in your reports.