4. Time Series Plot
In this lesson, we will learn how to engage the TS Plot tool and how to interpret its results.
Accessing Time Series Tools
- To access time series tools in Alteryx, download the Predictive Tools add-on from this link
- The TS Plot tool provides several different graphical representations of time series data, allowing users to better understand their data before applying any forecasting methodologies
Plots Displayed in the TS Plot Tool
- The time series plot is a graphical representation of each value in the dataset
- The season plot displays data from each cycle, allowing users to instantly compare data from corresponding periods
- The decomposition plots show the data, a seasonal trend around the mean, a long-term trend, and a remainder, or error term
- The autocorrelation function plot shows the correlation between data and the time series offset by a degree of time lag
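- The partial autocorrelation function plot shows the correlation between data and the time series offset by a degree of time lag, with the influence of the values between the two points stripped out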
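To make the decomposition bullet above concrete, here is a minimal Python sketch of a classical additive decomposition: the trend is a centered moving average, the seasonal component is the average detrended value at each position in the cycle, and the remainder is whatever is left. This is an illustration only. The lesson does not say which algorithm the TS Plot tool uses, and the function name `decompose_additive`, the cycle length of 4, and the sample values are all hypothetical.

```python
from statistics import mean


def decompose_additive(values, period):
    """Split a series into trend, seasonal, and remainder lists (additive model).

    Assumes len(values) >= 2 * period. Trend and remainder are None at the
    edges, where a full moving-average window is not available.
    """
    n = len(values)
    half = period // 2

    # Trend: centered moving average spanning one full cycle.
    trend = [None] * n
    for t in range(half, n - half):
        window = values[t - half : t + half + 1]
        if period % 2 == 0:
            # Even cycle length: give the two end points half weight each.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period

    # Seasonal: average detrended value at each position in the cycle,
    # shifted so the seasonal effects average to zero.
    detrended = [v - tr if tr is not None else None for v, tr in zip(values, trend)]
    raw = [
        mean(d for d in detrended[pos::period] if d is not None)
        for pos in range(period)
    ]
    centre = mean(raw)
    seasonal = [raw[t % period] - centre for t in range(n)]

    # Remainder: what neither the trend nor the seasonal pattern explains.
    remainder = [
        v - tr - s if tr is not None else None
        for v, tr, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, remainder


# Hypothetical data: a steady upward trend plus a repeating 4-period pattern.
pattern = [10, -5, -10, 5]
series = [100 + 2 * t + pattern[t % 4] for t in range(16)]
trend, seasonal, remainder = decompose_additive(series, period=4)
print("seasonal pattern:", seasonal[:4])
print("largest remainder:", max(abs(r) for r in remainder if r is not None))
```

For weekly data with a yearly cycle, `period` would be 52, and the series would need at least two full years of observations for this sketch to work.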
In the previous lesson, we started prepping our data for analysis by aggregating our sales figures into weekly periods. In this lesson, we'll take a look at the trends in our dataset by connecting a TS Plot tool and viewing the outputs to determine if our dataset is suitable for further analysis. Before conducting an in-depth Time Series analysis, it's advisable to begin by decomposing the dataset using the TS Plot tool.
Note that the TS Plot tool can be found in the Time Series tab.
This tab is part of the Predictive Analytics add-on that we discussed in a previous course. If you don't see the Time Series tab, you'll need to download and install the Predictive Analytics add-on. You can find a link in the lesson notes. We'll navigate to the Time Series tab and connect a TS Plot tool.
In the configuration window, we'll set the Target Field Frequency to Weekly, add all browses, and run the workflow.
Let's consider the outputs of this tool, starting with the R-Report Output Node. Notice this output contains a line chart of our data. A more informative output is available from the I-Interactive Report. If we open this window, we can see that it's made up of five panels. First, we have the Time Series Plot. This is a replica of the data presented from the R-Output Node. We can see that weekly sales max out at around 60,000. The second panel presents the Time Series data at the weekly level, with each year presented in a different color. It's quite clear that weekly sales tend to peak between Week 47 and Week 5 of the following year. Also, note that sales consistently seem to dip around Week 31. The seasonality is quite consistent across the years, which bodes well for our modeling. On the right side of the window, we have four charts that make up the decomposition plot. This breaks the data down into its constituent parts, starting with the data itself. We can then see a representation of the seasonal component. Notice it moves above and below the zero line, or the mean.
Next, we have the trend. This is the longer term direction of the data once the seasonality is stripped out. The fourth chart shows the remainder. You can think of this as the error term. It's everything that's left over that the model does not capture. Clearly, the closer the remainder is to the zero line, the more accurately the model will interpret the data. At the bottom of the TS Plot Output, we have two panels. The first is the Autocorrelation Function Plot. This looks at correlation between data points in the Time Series offset by a degree of time lag. For example, at a lag of three weeks, the relationship between one observation and the observation three weeks prior is tracked across the entire dataset.
The correlation at each lag is then plotted as a bar chart between positive one and negative one. For this dataset, anything above 0.2 is deemed to be statistically significant. For our data, there's a statistically significant positive correlation to a time lag of about nine weeks. That is to say, if sales are rising in one week, there's a high probability they will rise for two weeks, a slightly smaller chance they will rise for three weeks, and so on all the way down to nine weeks. To the right, we have the Partial Autocorrelation function. Like the Autocorrelation function, this chart plots the relationship between values at two data points in a Time Series. However, it strips out any influence from the values between the two data points. Looking at this chart, we can see there's a statistically significant relationship for a lag of one and two periods.
This may help us refine some of the specific settings in our Time Series model later. For now, simply note that these two charts are relevant because they give an indication as to the degree of randomness in our data.
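The two correlation charts described above can be sketched in a few lines of Python. The example below computes the sample autocorrelation at each lag, derives the partial autocorrelation with the Durbin-Levinson recursion, and flags lags that exceed an approximate 95% bound of 1.96 divided by the square root of the number of observations. That bound works out to about 0.2 for roughly 100 observations, which may be where the 0.2 cutoff comes from, but that is an assumption and not something the lesson states. The function names and the sample series are hypothetical, and this is not necessarily how the TS Plot tool computes its charts.

```python
from math import pi, sin, sqrt


def acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(values)
    avg = sum(values) / n
    dev = [v - avg for v in values]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(values, max_lag):
    """Partial autocorrelation for lags 0..max_lag (Durbin-Levinson recursion)."""
    rho = acf(values, max_lag)
    result = [1.0]
    phi_prev = []  # coefficients of the previous order's autoregression
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        result.append(phi_kk)
    return result


def significance_bound(n):
    """Approximate 95% bound for a single autocorrelation of white noise."""
    return 1.96 / sqrt(n)


# Hypothetical weekly series: a slow upward drift plus a 52-week cycle.
series = [0.05 * t + 10 * sin(2 * pi * t / 52) for t in range(104)]
bound = significance_bound(len(series))
auto = acf(series, 12)
partial = pacf(series, 12)
for lag in range(1, 13):
    flags = []
    if abs(auto[lag]) > bound:
        flags.append("ACF significant")
    if abs(partial[lag]) > bound:
        flags.append("PACF significant")
    print(lag, round(auto[lag], 3), round(partial[lag], 3), ", ".join(flags))
```

Because the partial autocorrelation removes the influence of the intermediate lags, it is typically significant at far fewer lags than the autocorrelation, which matches the pattern described in the lesson.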
As a final step, let's put our Time Series Plot and browses in a container.
We'll name the container Time Series Plot and disable it so that our future tasks process a bit faster. Conducting a Time Series Plot of your data is a good first step before embarking on any Forecasting exercise. As we saw here, our dataset displays very strong seasonal features. There are some clear trends in the data and some positive Autocorrelation. This gives us a positive indication that our data sample may be suitable for deploying a Forecasting model. We'll commence with this analysis in our next lesson.