11. New Data Checklist

 
Overview

In this lesson, we will review some best practices when connecting to new data using Alteryx and develop a checklist for every data import.

Lesson Notes

Kubicle has developed a checklist of best practices that users should follow when importing data to a canvas:

  • Length – ensure that string fields are interpreted correctly and not truncated
  • Error – check the messages window for errors and ensure that data is properly connected
  • Fields – connect a Select tool to your workflow and select only the fields required for analysis
  • Types – ensure that all fields have been assigned the correct type

Transcript

Now that we have a taste for Alteryx, we're going to take a moment to consider what we have learned so far and suggest some best practices.

Alteryx can connect to and bring together an array of different data sets from both local sources and servers.

To ensure efficient data processing, you should follow a four-step checklist when connecting a new dataset.

First, you should make sure that your string fields are being interpreted correctly.

The default length is 254 characters.

If necessary, expand the total length of each field using the configuration window for the data import tool.

Remember, this may result in some of your fields becoming excessively long.

They should be trimmed back down to size using the select tool to keep your workflow efficient.
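
The truncation check described above can also be sketched outside Alteryx. The Python snippet below is only an illustration, not an Alteryx feature: the helper name `possibly_truncated` and the sample rows are hypothetical, and the rule it applies (a value that fills the whole field length may have been cut off) is an assumption.

```python
# Hypothetical helper: flag string fields whose values may have been truncated.
DEFAULT_LENGTH = 254  # default string length mentioned in the lesson


def possibly_truncated(rows, max_length=DEFAULT_LENGTH):
    """Return the names of fields with at least one value that fills the
    whole field length, which suggests the original text was cut off."""
    suspects = set()
    for row in rows:
        for field, value in row.items():
            if isinstance(value, str) and len(value) >= max_length:
                suspects.add(field)
    return sorted(suspects)


# Made-up sample data: the second comment fills all 254 characters.
rows = [
    {"id": "1", "comment": "short"},
    {"id": "2", "comment": "x" * 254},
]
print(possibly_truncated(rows))  # ['comment']
```

Any field this flags is a candidate for a longer length at import, followed by trimming back with the Select tool.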

Next, check the messages window for errors and ensure that your data is properly connected.

Third, bring a select tool onto the canvas and select only the fields required for your analysis.

Bringing unnecessary data through your workflow is inefficient and uses up extra resources.

Finally, ensure all of your fields have the correct data type.

Some of your fields will have been assigned data types automatically.

Make use of the auto field tool, but always take the time to make sure that these fields have been correctly assigned.

Numeric fields should be assigned as such.

Integer 32 will cater for all whole values between roughly negative 2.1 billion and positive 2.1 billion and is a good default setting.
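
As a rough illustration of that range, the Python sketch below checks whether a set of whole numbers fits in a signed 32-bit integer. The function name `fits_int32` is hypothetical; the bounds are the standard limits of a signed 32-bit integer, not something taken from Alteryx itself.

```python
# Standard bounds of a signed 32-bit integer.
INT32_MIN = -2**31      # -2,147,483,648
INT32_MAX = 2**31 - 1   #  2,147,483,647


def fits_int32(values):
    """Return True if every whole number in values fits in 32 bits."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values)


print(fits_int32([0, -5, 2_000_000_000]))  # True
print(fits_int32([3_000_000_000]))         # False: needs a larger integer type
```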

If your data contains decimals, then fixed decimal will usually suffice.

The default setting is 19.6, which allows 18 digits in total, 6 of them after the decimal point; the decimal point itself takes up the remaining character.

You may require more precision than this, for example when processing foreign exchange transactions.

It's something to bear in mind as rounding errors can occur.
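
To see how such a rounding error can arise, here is a minimal Python sketch using the standard `decimal` module. It does not reproduce Alteryx's internal behaviour: the rounding mode, the exchange rate, and the amount are all assumptions made for illustration. The rate is chosen so that the result is the same whether the extra digits are rounded or simply dropped.

```python
from decimal import Decimal, ROUND_HALF_UP

SCALE = Decimal("0.000001")  # 6 decimal places, as in the 19.6 default


def to_six_places(value):
    """Store a value with 6 decimal places (rounding mode is an assumption)."""
    return Decimal(value).quantize(SCALE, rounding=ROUND_HALF_UP)


rate = Decimal("0.00000849")   # hypothetical FX rate with 8 decimal places
amount = Decimal("1000000")    # hypothetical transaction amount

exact = amount * rate                    # uses the full-precision rate
stored = amount * to_six_places(rate)    # rate is stored as 0.000008
print(exact, stored, exact - stored)
```

With these made-up figures, the full-precision result is 8.49 while the 6-decimal result is 8, a difference of 0.49 on a single transaction.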

Note that these are good defaults. You should use smaller or larger data types depending on the values in your dataset.

You should get in the habit of going through each of these steps as a checklist each time you connect a new dataset.

In doing so, you safeguard the basic integrity of your workflow.

This will increase the efficiency of your workflow and cut down on potential sources of error.