9. Preparing Data for a Batch Macro

 
Subtitles Enabled

Sign up for a free trial to access more free content.

Free trial

Overview

Continuing our examination of the macro options available in Alteryx, in this lesson, you will learn of the versatile batch macro.

Lesson Notes

Batch Macros

  • Batch macros perform a simple task many times, where as a standard macro only performs a task once
  • In this lesson we prepare data for a batch macro that creates payslips from employee timesheets

Transcript

In the previous lesson we introduced the concept of macros focusing on the standard macro type. However, if we look at the dropdown list in the configuration window of the workflow tab we can see there's several other types of macros. In the next two lessons, we'll look at the batch macro. Where the standard macro performs a once-off calculation on a workflow, a batch macro can cycle through a task again and again. In order to understand how to apply a batch macro we're going to work with some example datasets. Imagine a factory where employees clock in and out each day. A database records this information in a timesheet file unique to each employee. This is CSV file that contains dates, basic hours, overtime hours and pay grade information for an individual employee. We want to generate an individual payslip file like this for each employee for a specific week. You'll notice that the timesheet CSV file contains a pay grade for each employee but we don't know exactly what each value means. To make sense of this we'll need to reference a pay rates file that contains information regarding basic rate of pay together with increased rates for weekends and overtime. It also includes the multiplication factor for different pay grades. For example, an employee at grade A gets the base rate while an employee at grade D gets 1.9 times the base rate.

We need to cross reference this information with our timesheet data. Our goal is to repair the batch macro by cross referencing these two employee datasets to calculate individual pay by date. Note that we won't actually run the batch macro until the next lesson. For now we'll simply create a workflow that creates a payslip for a single employee. First, we'll format the date/time values and remove any null values. We'll then apply a date filter so we can specify a time period.

Our third step will be to join the datasets so that we can perform step four which involves writing a formula to calculate wages. In our final step we'll generate a payslip for an individual employee. We'll start by formatting date/time values and removing null values from the employee timesheet. We'll bring an input data tool onto the canvas, and connect it to the timesheet file for employee 001.

This CSV file contains dates, basic hours, overtime hours and the pay grade of the employee in question. The first step in prepping our data is getting the date data in the current format. We'll connect the date/time tool and select the format dd-Mon-yy.

We'll then connect a select tool, deselect the date field, rename DateTime_Out to Date and convert basic hours and overtime to fixed decimal size 12.2.

Next we'll connect the data cleansing tool and ensure that it's set to remove all null values from the dataset. We're now ready to move onto step two and complete the meat of our workflow. In this step we'll connect a date filter so that we can generate a payslip for a specific time period. We'll specify a starting date of February 15th, 2016 and an ending date of February 21st 2016. We'll run the workflow and see that we have the hours worked on each of the seven days during the week in question together with the employee's pay grade. We now have to join the datasets. Our information regarding rates of pay and terms and conditions are stored separately so we'll need to bring down a new input data tool, and connect the pay rates dataset.

We'll then bring down a join tool to connect the two branches, and specify date as the join field. The date information in the pay rates dataset duplicates the date information in the timesheet file, so we'll deselect it. Our next step is to calculate the wages due to employee 001 for the period in question.

We'll connect a formula tool to the J output node and name the new field Pay Grade. We'll then use the switch function to return the corresponding numeric value for the pay grade identified in the grade column. We'll then change the data type to fixed decimal size 12.2. If you'd like a more in-depth look at the switch function, please refer to the conditional expressions and conversions part two lesson in the formula tool course. We'll now create a second formula canvas labeled Gross Wages.

We'll enter a formula here that simply calculates base pay plus overtime pay. Again, we'll change the data type to fixed decimal size 12.2.

We know that our timesheet file splits basic hours and overtime hours, so we're using this formula to first calculate the adjusted pay for basic hours and then add the adjusted pay for overtime hours.

Before moving on let's connect a select tool and bring forward only relevant fields. Since we just calculated the gross wages, we no longer need the information on individual pay grades, so we'll deselect grades A through D.

Finally, we'll connect an output data tool to create a report specifying the CSV format and targeting a new save directory called Payslips.

We'll run the workflow, and see that a payslip has been generated in the directory that we specified.

To recap we achieved this in five steps.

First, we formatted the date/time values so that they would be consistent. We then applied a date filter so that we could specify a time period. Next, we joined the datasets so that we could apply a formula to calculate wages. Finally, we generated a payslip for an individual employee.

Now, imagine that this company has 100s of employees. Clearly creating a workflow like this for each employee would be both time consuming and inefficient. Instead we can make use of a batch macro to cycle through the individual timesheets available in a target directory generating individual payslip files for each one. We'll complete this process in the next lesson.