Sign in or start a free trial to avail of this feature.
11. Visualizing Co-Occurrence
In this lesson, we will import our consolidated data to Tableau, learn how to manually calculate lift, and display co-occurrence in a heat map.
- Lift is the likely of any specific rule, or ordered item set, to occur, versus what you would expect if the items were independent
Level of Detail Expressions
- Level of Detail Expressions allow users to create expressions that are tied to a specific dimension, and won’t change when new dimensions are added to the visualization
- In this lesson, we use Level of Detail Expressions to calculate Lift
In the previous lessons, we determined the co-occurrence for specific customer segments, and exported our data to a CSV file. Our goal in this lesson is to create a heat map that visualizes item association by gender and day of the week. We'll accomplish this goal in three key steps. First, we'll organize our fields into hierarchies. Next, we'll calculate the lift for each co-occurrence. Finally, we'll create a heat map to visualize the co-occurrences. Please note, that this lesson assumes that you have a good command of Tableau. If you need a refresher, or feel lost at any point, I recommend you review our Tableau courses. We'll start this lesson by organizing the fields.
We'll connect to our data set, and open up a new sheet. We'll go to the dimensions area, right click on item one, and create a hierarchy called "Item 1".
We'll then add Category 1 Item 1, and Category 2 Item 1, to this hierarchy.
Next, we'll right click on Item 2, and create a hierarchy called "Item 2".
Again, we'll place Category 1 Item 2, and Category 2 Item 2, into this hierarchy.
We're now ready to move on to step two, and calculate the lift for each co-occurrence. Lift in this case is calculated as occurrence, divided by the proportion of Item 1 occurrence, times Item 2 occurrence, over total occurrence. This will make more sense when we see it on the screen.
We currently have a field for occurrence, but we need to calculate Item 1 occurrence, Item 2 occurrence, and total occurrence.
We'll calculate these values using level of detail expressions. Level of detail expressions ensure that the formulas aren't affected by other dimensions in the visualization. We'll start with the formula for total occurrence. We'll create a formula, call it "Total Baskets", and enter fixed, for day and gender, with the sum of occurrence.
This will ensure that the total occurrence value won't change when we filter our data set by day, or gender values.
We'll wrap this formula in curly brackets, copy it, and press OK.
We'll create another formula, call it "Item 1 Baskets", and paste in the formula we just copied.
We'll add Item 1 after gender, and press OK to accept it.
We'll then create a third formula, call it "Item 2 Baskets", and again paste in the previous formula.
This time we'll add Item 2 after gender, and again press OK.
We can now calculate lift. We'll create a forth formula, and call it "Lift", In the formula area, we'll enter, sum of occurrence, divided by, open parenthesis, sum Item 1 Baskets, times, sum Item 2 Baskets, divided by, sum total baskets, closed parenthesis.
We'll the press OK to accept the formula. We can now move on to step three, and create our heat map. This heat map will simply be a matrix showing lift for different item sets. We'll put Item 2 entries in columns across the top, and Item 1 entries in rows along the side. We'll bring Category 2 Item 2, to the column shelf.
Category 2 Item 1, to the row shelf.
Item 2, to the column shelf, Item 1, to the row shelf, and the lift calculation, to the color section of the marks area.
We now have a basic two product heat map. We'll add further filtering to this presentation, by bringing both gender and day, to the filter shelf.
To help with labeling, we'll bring Category 1 Item 2, to the column shelf, and Category 1 Item 1, to the row shelf.
For presentation purposes, we'll hide the header for Category 1 Item 2.
Show both filters as single value drop-downs, and edit the color to Red Blue Diverging with zero as a center point.
We'll then add Category 1 level filters, to allow us to drill down to the item level.
If we now restrict the view to drinks, and frozen food, we can see there is high level of lift, between party food, and beer bitter ale stout.
This is expected, however it's a bit surprising that this level of lift does not extend to other alcoholic drinks such as cider and lager.
Cut price may want to investigate this further. We'll stop the lesson here. In the past few lessons, we've compared just one item versus another item on each receipt. Over the next few lessons, we use the MB rules, and MB inspect tools, to discover what relationships might exist between larger bundles of items.