7. Transforming a Large Table

Overview

Tables are commonly found in business reports and presentations, but they’re rarely a good visualization choice. In our third case study, we learn how to start transforming a table into a more effective visualization.


Summary

  1. Lesson Goal (00:16)

    The goal of this lesson is to transform a table of data into a better visualization.

  2. Issues with Tables (00:22)

    Tables, or crosstabs, are often used in business reports and presentations, even though they are difficult to interpret and can rarely be scanned quickly for insights. Many people know tables are a poor visual choice, but use them because they don’t know how to create something better. The solution is to use the hierarchy of visual traits to create a new visualization.

    In this case study, we consider a table that contains five pieces of information: product category, region, sales, profits, and quantity sold. We want to create a visualization that demonstrates this data better than a table.

  3. Creating a New Visualization (01:23)

    When creating a visualization, we need to use the hierarchy to incorporate all the information in our table. In our case, we start with an x-y scatter plot, which uses location, to visualize sales and profit. The other pieces of information are in labels attached to each point.

    The next step is to start visualizing the information contained in the label. To do this, we simply move down through the hierarchy. After location comes size, which needs to be used for data with continuous values. Here, we encode quantity using size, as it is the only remaining data that has continuous values. This turns out not to be an ideal visualization, as the points tend to overlap, so we step down the hierarchy to gradient and encode quantity using a gradient instead.

    We then have two more pieces of data in the labels, and two more traits on the hierarchy, which leaves us needing to decide which trait to use for each of these pieces of data.
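
The walk down the hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the lesson: the names HIERARCHY, FIELDS and assign_traits are hypothetical, location is split into an x and a y slot, and color is placed before shape purely for the sake of the example (the lesson treats that choice as a trade-off still to be decided).

```python
# Visual traits in the order this lesson walks through them, each tagged
# with the kind of data it is used for here. Splitting location into two
# slots and putting color before shape are assumptions of this sketch.
HIERARCHY = [
    ("location (x)", "continuous"),
    ("location (y)", "continuous"),
    ("size", "continuous"),
    ("gradient", "continuous"),
    ("color", "categorical"),
    ("shape", "categorical"),
]

# The five pieces of information in the case-study table.
FIELDS = [
    ("sales", "continuous"),
    ("profit", "continuous"),
    ("quantity", "continuous"),
    ("category", "categorical"),
    ("region", "categorical"),
]


def assign_traits(fields, skip=()):
    """Give each field the first free trait of a matching kind.

    fields: list of (name, kind) pairs.
    skip:   traits to pass over, e.g. "size" when points overlap.
    Fields that find no matching trait stay in the label.
    """
    free = [(trait, kind) for trait, kind in HIERARCHY if trait not in skip]
    assignment = {}
    for name, kind in fields:
        for i, (trait, trait_kind) in enumerate(free):
            if trait_kind == kind:
                assignment[name] = trait
                del free[i]  # each trait is used at most once
                break
        else:
            assignment[name] = "label"
    return assignment


first_attempt = assign_traits(FIELDS)
print(first_attempt["quantity"])   # size

# Size caused overlapping points, so step down to the next continuous trait.
second_attempt = assign_traits(FIELDS, skip={"size"})
print(second_attempt["quantity"])  # gradient
```

In this sketch, category and region land on color and shape simply because of the order in which they are listed; swapping them in FIELDS swaps the result, which mirrors the open decision described in the paragraph above.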

Transcript

So far in this course, we've completed two visualization case studies, and now we're ready to move on to the third and final case.

Our goal in this lesson is to transform a table of data into a better visualization.

In this case study, we're going to consider this table.

It shows sales, profits, and quantity sold for various product categories and regions.

You might hear the term "crosstab" used to describe tables like this.

Tables are common in business reports and presentations, even though they can be difficult to interpret. It's very hard to scan a table and see the relationships across the regions, products, and between each of the values. One of the challenges people talk about is that while they don't like using crosstabs, they don't know what to do instead. This is a perfect situation for using the hierarchy.

Over the next few lessons, we're going to go through every level of the hierarchy and learn how to tackle some tough choices.

To work with location, we need two values to use as the x and y axes. Let's start with sales and profit.

Using the location trait is a good way to see the relationship of sales and profit.

It's now clear we're dealing with a large cluster of low sales and low profit, with a few points having higher sales and higher profit.

That was hard to see in the table, even with only a few values. This is a good first step. And now we have to transform the other data elements, quantity, category, and region from a label into a visual.

The thing about labels is you have to read them to understand the data and that's not using our visual processing power. And labels don't even fit on some of the data points, so we have no idea how to analyze those positions. We need to start eliminating the labels and encode them visually.

So let's move to the next level on the hierarchy after location.

Size is next, and that has to be encoded with a value. Of the three labels, quantity is the only value. We can't use category or region for size because those are categories of information.

By using size to encode the quantity, the highest values are instantly obvious. And good news, it looks like most of the low sales are also low quantity.

Now there are some challenges here from lots of overlap, and some of the smaller points are difficult to see. In some cases that would be fine. It all depends on how you want to get across the meaning of the data. In this case, we want to see each point more clearly. Instead of using size for quantity, we can step down the hierarchy to the next trait that requires a value, and that's gradient.

Now we see all the points more clearly and still have a sense of the quantity based on the gradient.

Instead of using a single-color gradient, this yellow-green option emphasizes low and high values. We've made good progress in reducing the amount of data in the label. Only region and category remain. Fortunately, the last two traits in the hierarchy can be used for categories of information.

So we have to decide which one gets assigned to color and which one gets assigned to shapes.

This is a complex decision involving a visual trade-off. So we'll stop this lesson here.

In the next lesson, we'll consider this decision and determine how to complete the visualization.