Sign in or start a free trial to avail of this feature.
11. Extracting Text Strings
Working with text data is another common business task. In this lesson, we’ll see several ways to find a particular string within a text field.
Finding text substrings within a string
There are several functions that allow you to find a set of characters within a larger string:
- Find: Finds the position of a substring or set of characters within a larger string
- Search: Like find, but is not case sensitive
- Left: Extracts a specified number of characters from the beginning of a string
- Right: Extracts a specified number of characters from the end of a string
Working with text-based data could be a frustrating experience, and you may well need to analyze and modify text strings in Power BI. In the next two lessons, we'll look at common DAX functions you can use for these tasks. You might think that a common use case for text functions is to clean up badly formatted data you've imported into Power BI. However, this task is better dealt with in the Query Editor.
Among other options, the Query Editor lets you convert text strings to upper, lower or title case, trim text to remove white space, or extract the length of a text field. You can also extract substrings by splitting columns of text data. All this can be done without any formulas.
However, in some situations, you'll need to use the text functions in DAX. In this lesson, we'll look at how to extract a substring of text from a larger string. Let's go into DataView, navigate to the Pharma dataset, and look at the customer ID column.
We'll right-click the column header and sort ascending.
As we scroll through the table, we can see the IDs run from A100014 to A199914.
Let's say we wanna create a new column that will identify customers whose IDs start with A19.
There are two ways to accomplish this. The first involves using the FIND or SEARCH functions. These functions follow the principle of searching a text field for a string of characters we specify.
The main difference is that FIND is case-sensitive while SEARCH is not.
In this instance, we'll use the FIND function. We'll create a new column, call it customer group, and enter FIND.
FiND has four arguments.
The first is the string we want to find. So we'll enter A19 in double quotes.
The second is the string we want to evaluate, which is the customer ID.
The third argument is the position in the string that starts our search. By default, this is the start of the string. We don't wanna change this default. So we'll leave this argument blank.
The final argument is the value if the string A19 is not found.
By default, FIND returns in error, which we don't want.
We'll specify a value of zero.
We'll press Enter to create the new column. This produces a zero-one output where one indicates an ID starting with A19, and zero indicates an ID not starting with A19. This is not very intuitive. To improve the output, let's put this FIND statement inside an IF statement.
This IF statement is checking that the value of our FIND statement is not equal to zero.
Note that the DAX symbol for not equal to is a greater than and less than sign pointing away from each other. This formula will return A19 for customer IDs starting with A19 and A1X otherwise.
We'll confirm this formula and see if the column output is now easier to understand. The second option for finding a substring, is to use the LEFT function. This identifies a number of characters at the beginning of a string. Let's replace our FIND statement with LEFT. LEFT takes two arguments.
The first argument is the string we want to search, which is the customer ID.
The second argument is the number of characters to check. In this case, three.
We'll also need to change the IF statement as we want to see if the first three characters are equal to A19.
We'll change the not equal sign to an equal sign, set it equal to A19, and confirm the formula.
Note that our results are exactly the same as before. As you might expect, a function called RIGHT exists, which allows us to check the last characters in a string. Let's stop the lesson here. In the next lesson, we'll look at some more text functions and conclude this first DAX course.