10. Q&A – Data Type and Messages

 
Overview

This lesson reviews Alteryx data types and the Messages window.

Lesson Notes

Content reviewed in this lesson

  • Question 1: Why does data type matter?
  • Question 2: How do I choose what numeric data type to use?
  • Question 3: What do Alteryx messages mean?

Transcript

Alteryx classifies all data into distinct types. Many of the tools you'll use in your analysis will not function with certain data types, or will not allow data of different types to be used together.

There are five broad data types: Boolean (true and false); numeric (integers and decimals); text; dates and times; and geodata, like latitude and longitude.

Sometimes changing the data type is as simple as choosing a new type from the drop-down in the Select tool. However, often your data will need to be manipulated first. This is particularly true if your numbers contain currency symbols, or if your dates and times are not in the specific YYYY-MM-DD format.
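The kind of cleanup described above can be sketched outside Alteryx. The following Python example is only an illustration of the idea, not an Alteryx feature: the function names `clean_currency` and `to_iso_date` are hypothetical, and the sample inputs are made up.

```python
import re
from datetime import datetime
from decimal import Decimal


def clean_currency(text: str) -> Decimal:
    """Strip currency symbols and thousands separators, then convert to a number."""
    # Keep only digits, the decimal point, and a minus sign.
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    return Decimal(cleaned)


def to_iso_date(text: str, source_format: str) -> str:
    """Re-express a date string in YYYY-MM-DD form."""
    return datetime.strptime(text, source_format).strftime("%Y-%m-%d")


print(clean_currency("$1,234.50"))            # 1234.50
print(to_iso_date("31/12/2023", "%d/%m/%Y"))  # 2023-12-31
```

Once the text looks like a plain number or a YYYY-MM-DD date, changing the field's type becomes a simple conversion.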

When deciding between numeric data types, we need to consider two factors. First, does the field contain only whole numbers, or does it also contain decimals? Second, what's the physical size of the numbers we're looking at? Or simply put, how many digits do those numbers contain?

First, let's take a look at whole numbers, also known as integers. Alteryx has four different integer data types that should be used only with whole numbers: byte, integer 16, integer 32, and integer 64.

Note that if you use these data types on fields containing decimals, the decimal information will be cut off.

Each of these integer data types only covers a specific range of numbers, so you'll need to choose what data type to use based on the actual numbers in the field. You might be asking yourself why we wouldn't just use integer 64 exclusively, since it covers the widest range of numbers. There's a trade-off here between range and speed, meaning that processing calculations with integer 64 fields will take much longer than with integer 32 fields, even if they contain the same numbers.
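To make the ranges concrete, here is a minimal Python sketch that picks the narrowest integer type able to hold a set of values. It assumes the standard widths: byte as an unsigned 8-bit value (0 to 255), and integer 16, 32, and 64 as signed values. The helper `smallest_integer_type` is hypothetical and is not part of Alteryx.

```python
# (name, minimum, maximum) for each integer type, narrowest first.
INTEGER_TYPES = [
    ("Byte", 0, 2**8 - 1),          # 0 to 255
    ("Int16", -2**15, 2**15 - 1),   # -32,768 to 32,767
    ("Int32", -2**31, 2**31 - 1),   # about +/- 2.1 billion
    ("Int64", -2**63, 2**63 - 1),   # about +/- 9.2 quintillion
]


def smallest_integer_type(values):
    """Return the name of the narrowest type whose range covers every value."""
    lowest, highest = min(values), max(values)
    for name, type_min, type_max in INTEGER_TYPES:
        if type_min <= lowest and highest <= type_max:
            return name
    raise ValueError("values do not fit in a 64-bit integer")


print(smallest_integer_type([0, 200]))          # Byte
print(smallest_integer_type([-5, 200]))         # Int16
print(smallest_integer_type([0, 40_000]))       # Int32
print(smallest_integer_type([3_000_000_000]))   # Int64
```

A check like this only reflects the values present today; if the field could grow, a wider type leaves headroom.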

As such, a good rule of thumb is to default to integer 32 if you're unsure of the range of numbers in any specific field. If the field contains decimals, you can choose between float, double, and fixed decimal.

Again, the major difference between these options is the range of numbers they cover. They also differ in precision: float holds only about seven significant digits, while double holds about 15. In most cases, float and double are larger than we need, so fixed decimal is a better choice. Fixed decimal defaults to a size of 19.6. The 19 signifies the total width of the number, including the decimal point, while the six signifies how many of those digits must come after the decimal point.

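Therefore, a size of 19.6 will encompass any number that contains up to 12 digits before the decimal point and six digits after the decimal point. For a negative number, the minus sign also counts toward the total width, leaving one fewer digit before the decimal point.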
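The arithmetic behind a fixed decimal size can be sketched in a few lines of Python. This assumes the rule described above, where the total width includes one character for the decimal point, and it only considers positive numbers. The helper `fixed_decimal_max` is hypothetical.

```python
from decimal import Decimal


def fixed_decimal_max(size: str) -> Decimal:
    """Largest positive value for a size written as 'total.scale', e.g. '19.6'.

    Assumes scale is at least 1 and that one character of the total
    width is taken by the decimal point.
    """
    total, scale = (int(part) for part in size.split("."))
    integer_digits = total - scale - 1  # subtract the decimal point
    if scale < 1 or integer_digits < 1:
        raise ValueError(f"unsupported size: {size}")
    return Decimal("9" * integer_digits + "." + "9" * scale)


print(fixed_decimal_max("19.6"))  # 999999999999.999999
print(fixed_decimal_max("7.2"))   # 9999.99
```

For 19.6 the result has 12 nines before the decimal point and six after, which matches the breakdown above.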

Again, this may be larger than necessary, so I'll often use a size of 12.2 instead. By the same rule, that leaves nine digits before the decimal point and two after, so it covers positive numbers up to 999,999,999.99.
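Returning briefly to float and double: the practical effect of their limited precision is easiest to see by storing the same number at both widths. The sketch below assumes that float corresponds to a 32-bit IEEE 754 value and double to a 64-bit one (Python's own `float` is 64-bit). The helper `as_single_precision` is hypothetical.

```python
import struct


def as_single_precision(value: float) -> float:
    """Round-trip a 64-bit Python float through 32-bit storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


value = 123456789.123            # 12 significant digits
print(value)                     # 123456789.123 (64-bit keeps them all)
print(as_single_precision(value))  # 123456792.0 (32-bit keeps about seven)
```

Only the first seven or so digits survive at 32 bits, regardless of where the decimal point falls. Fixed decimal avoids this by storing exactly the digits you specify.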

Alteryx helps prevent errors from occurring by providing a Messages section in the Results window. These messages are broken into five categories. Errors inform you when a specific tool is not configured correctly. Conversion errors inform you when data exceeds field limits (for example, too many characters) or when certain data cannot be converted. Warnings inform you if a field is missing. General messages are used for feedback on the workflow. The File icon shows you all files being read into or written out of your workflow.