9. Preparing Data for a Batch Macro

Overview

Continuing our examination of the macro options available in Alteryx, in this lesson you will learn of the versatile batch macro.

To explore more Kubicle data literacy subjects, please refer to our full library.

Summary

  1. What is a Batch Macro (00:24)

    Where a standard macro provides a one-off calculation, a batch macro can cycle through a task again and again.

  2. Setting up the Case (00:39)

    We would like to generate individual payslip files for each employee at a factory. To do this, we’ll need to cross-reference the pay rate and timesheet data.

  3. Lesson Goal (01:53)

    Our goal in this lesson is to create a workflow that creates a payslip for a single employee.

  4. Format DateTime Values and Remove Nulls (02:35)

    The time sheet contains many null values, and the dates are not in the correct format.

    We’ll connect a DateTime tool and convert the dates to the ISO standard. We’ll then connect a Select tool, deselect the old date, rename DateTime_Out to Date, and set Basic Time and Over Time to the data type Fixed Decimal 12.2.

    We’ll then connect a Data Cleansing tool to replace the null values with zeros or blanks.

  5. Apply a Filter so We Can Specify a Time Period (03:51)

    We want our payslip to apply to a specific week, so we’ll connect a Filter tool to limit the date range. We’ll create a basic filter that targets the Date field for the range of 2016-02-15 and 2016-02-21.

  6. Join the Datasets (04:39)

    We’ll use a Join tool to combine the timesheet and pay grade datasets. We’ll join these datasets on the Date field and deselect the right date.

  7. Write a Formula to Calculate Employee Wages (05:01)

    We’ll connect a Formula tool and create two formulas that help us calculate employee wages. In the first formula canvas, we’ll create a new field called Pay Grade, and enter a Switch formula that returns the numeric multiplier for the corresponding pay grade. Here is the completed formula:

    Switch([Grade],0,

    “A”, [Grade A],

    “B”, [Grade B],

    “C”, [Grade C],

    “D”, [Grade D])

    This field has the data type Fixed Decimal 12.2.

    In the second formula canvas, we’ll create a new field called Gross Wages, and enter a formula that calculates employee wages.  Here is the completed formula:

    ([Basic hours]*[Base rate]*[Pay Grade]*[Day multiplier])+([Over time]*[Pay Grade]*[Day multiplier])

    This field also has the data type Fixed Decimal 12.2.

  8. Generate a Payslip for an Individual Employee (06:05)

    Next, we’ll clean up the dataset with a Select tool, and deselect the Grade fields since we no longer need them.

    As a final step, we’ll connect an Output Data tool, and save a .csv file called Employee 001 to the directory Payslips. 

Transcript

(upbeat music) - [Instructor] In the previous two lessons, we introduced the concept of macros, focusing on the Standard Macro type.

However, if we look at the dropdown list in the configuration window of the workflow tab, we can see there are several other types of macros.

In the next two lessons, we'll look at the Batch Macro.

While the Standard Macro performs a once-off calculation on a workflow, a Batch Macro can cycle through a task again and again.

In order to understand how to apply a Batch Macro, we're going to work with some sample datasets.

Imagine a factory where employees clock in and out each day.

A database records this information in a timesheet file, unique to each employee.

This is a CSV file that contains dates, base hours, overtime hours and pay grade information.

We want to generate this payslip file for each employee, over a specific week.

You'll notice that the timesheet CSV file contains a pay grade for each employee, but we don't know exactly what each value means.

To make sense of this, we'll need to reference a pay rates file that contains information regarding base rate of pay together with increased rates for weekends and overtime.

It also includes a multiplication factor for different pay grades.

For example, an employee at grade A gets the base rate, while an employee at grade D gets 1.9 times the base rate.

We need to cross-reference this information with our timesheet data.

Our overall goal is to prepare the Batch Macro by cross-referencing these two employee datasets to calculate individual pay by date.

Note that we won't actually run the Batch Macro until the next lesson.

Our goal in this lesson is to develop a workflow that creates a payslip for a single employee.

We'll accomplish this goal by following five key steps.

First, we'll format the date time values and remove any null values.

We'll then apply a filter, so we can specify a time period.

Our third step is to join the datasets.

In step four, we'll write a formula to calculate employee wages.

In our final step, we'll generate a payslip for an individual employee.

We'll start by formatting date time values and removing null values from the employee timesheet.

Our canvas currently contains a timesheet for employee 001, and the pay rates dataset.

We need to update the date data so it's in the correct format.

We'll connect the date time tool and select the format, dd-Mon-yy.

We'll then connect a select tool.

Deselect the date field, rename date time out to date and convert basic hours and overtime to fixed decimal size 12.2.

Next, we'll connect the data cleansing tool and ensure it's set to remove all null values from the dataset.

We're now ready to move on to step two and complete the meat of our workflow.

For this step, we'll need to connect a filter so that we can generate a payslip for a specific time period.

We'll set the field as date, the operator as range, and then specify a start date of February 15th, 2016, and an end date of February 21st, 2016.

We'll then run the workflow.

If we look at the output for the true node, we can see that we now have hours worked on each of the seven days during the weekend question together with the employee's pay grade.

We now need to join this with the pay rate dataset so we can later determine the employee's wages.

We'll bring down a join tool, connecting it to both the true node of the filter tool and the pay rates dataset.

We'll specify date as a join field and deselect the duplicate date field below.

We can now calculate wages due to employee 001 for the period in question.

We'll connect a formula tool to the J output node and name the new field pay grade.

We'll use a switch function to return the corresponding numeric value for the pay grade identified in the grade column.

We'll then change the data type to fixed decimal size 12.2.

We'll now create a second formula canvas labeled gross wages.

Here, we'll enter a formula that simply calculates base pay plus overtime pay.

Again, we'll change the data type to fixed decimal size 12.2.

We know that our timesheet file splits base hours and overtime hours. So we're using this formula to first calculate the adjusted pay for base hours, and then add the adjusted pay for overtime hours.

At this point, we're ready to move on and generate a report.

Let's start by connecting a select tool to bring forward only relevant fields.

Since we just calculated gross wages, we no longer need information on individual pay grades. So we'll deselect grades A through D.

Finally, we'll connect an output data tool and set our file parameters.

We'll set the directory as payslips, call the file employee 001, set the file type as CSV and select save.

We'll run the workflow.

And see that a payslip has been generated in the directory we specified.

To recap, we achieved this in five steps.

First, we formatted the date time values so that they would be consistent.

We then applied a filter so that we could specify a time period.

Next, we joined the datasets.

After that we applied a formula that calculates employee wages.

Finally, we generated a payslip for an individual employee.

Now imagine that this company has hundreds of employees.

Clearly creating a workflow like this for each employee would be both time-consuming and inefficient.

Instead, we can use a Batch Macro to cycle through these timesheets and generate individual payslip files. We'll complete this process in the next lesson.