Overview

Many datasets can contain patterns or trends of interest to businesses. These patterns can be uncovered using unsupervised learning methods. Learn how this works in this lesson.


Summary

  1. Lesson Goal (00:20)

    The goal of this lesson is to learn how unsupervised learning methods can uncover patterns in business data.

  2. Supervised and Unsupervised Learning (00:27)

    To understand unsupervised learning, we contrast it with supervised learning. In a supervised learning example, a bank wants to predict which borrowers will default in the next 12 months. It does this by developing an AI model that analyzes a dataset of last year's borrowers. The dataset includes information about each borrower, including whether they defaulted. Because the outcome for each borrower is known, this information is used to supervise the development of the model that predicts defaults.

    If we didn’t know if each borrower defaulted, we would only have a dataset containing information about the borrowers. With this dataset, any analysis will focus on finding general trends or patterns in the dataset, because the supervision of knowing if a borrower defaulted is gone. The trends we find by analyzing this dataset may relate to default rates, or they may relate to something else entirely. We’ll have no way of knowing before we run the algorithm.
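    The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the lesson: the borrower figures are invented, and the helper names `fit_threshold` and `two_means` are hypothetical. The supervised function uses the known default labels to choose an income cut-off, while the unsupervised function only sees the incomes and splits them into two groups using a simple one-dimensional k-means.

```python
# Hypothetical toy data: (annual income in thousands, defaulted last year?).
borrowers = [
    (18, True), (22, True), (25, True), (27, False),
    (60, False), (65, False), (72, False), (80, False),
]


def fit_threshold(labelled):
    """Supervised: use the known labels to pick the income cut-off that
    misclassifies the fewest borrowers (predict default if income < cut-off)."""
    incomes = sorted(income for income, _ in labelled)
    # Candidate cut-offs are the midpoints between neighbouring incomes.
    candidates = [(a + b) / 2 for a, b in zip(incomes, incomes[1:])]

    def errors(cut):
        return sum((income < cut) != defaulted for income, defaulted in labelled)

    return min(candidates, key=errors)


def two_means(values, iterations=20):
    """Unsupervised: split values into two groups with 1-D k-means (k = 2).
    No labels are used. Assumes at least two distinct values."""
    low, high = min(values), max(values)  # initial group centres
    for _ in range(iterations):
        # Assign each value to the nearer centre, then recompute the centres.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


cut_off = fit_threshold(borrowers)                              # labels used
groups = two_means([income for income, _ in borrowers])         # labels ignored
print("Supervised cut-off:", cut_off)
print("Unsupervised groups:", groups)
```

    In this made-up data the two income groups overlap with the defaulters but do not match them exactly, which mirrors the point above: the groups an unsupervised method finds may relate to defaults, or to something else.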

  3. What is Unsupervised Learning (02:13)

    Unsupervised learning relates to AI applications where the output is not known in advance for the data used to build the model. Unsupervised learning can be used when we want to explore a dataset and find insights and patterns that might otherwise be missed.

  4. Course Overview (02:42)

    This course will cover the following areas:

    • Uses of unsupervised learning

    • Clustering

    • Association rules

Transcript

Data underpins many aspects of business. Sometimes we know how to analyze that data and extract value from it. But other times it's best just to give our data to a computer and let it find insights. We can do this using unsupervised learning. In this lesson, we'll learn how unsupervised learning methods can uncover patterns in business data.

To understand what unsupervised learning is, let's contrast it with supervised learning. Let's say we own a bank and we want to predict which borrowers will default in the next 12 months. We can figure this out by building an artificial intelligence model that analyzes the data we have on borrowers from last year. This dataset will include various pieces of information that might tell us if borrowers are more or less likely to default, like their job, income level, age and so on. Most importantly, the dataset will tell us whether each of last year's borrowers actually defaulted or not. This allows an AI model to analyze all the possible variables that might affect whether a borrower defaults and find out which ones actually do affect the likelihood of a default. This is called a supervised learning problem because we know whether each borrower in the dataset defaulted, and this information is used to supervise the model as it analyzes the data.

But what would happen if we don't know whether each borrower defaulted? In this case, we would still have a dataset containing various pieces of information about each borrower, but no information on defaults. We can still use artificial intelligence to analyze this dataset, but the nature of this analysis now changes. Without knowing whether each borrower has defaulted, our aim is to identify any interesting patterns or groups in the dataset. In this case, the algorithm we use to analyze the data will look for any data it can use to identify these patterns or groups. The supervision provided by knowing if each borrower defaults is gone, so this is now an unsupervised learning problem.

The groups that the algorithm finds might tell us something about which borrowers default more often, or they might tell us something else entirely. Importantly, we have no way of knowing which group a borrower will be in before we run the algorithm. As a result, we can say that an unsupervised learning AI application is one where the output is not known in advance for the data used to build the model.

Unsupervised learning can be used in various cases where we want to explore a dataset and find insights and patterns that we might not otherwise discover. It's less commonly used than supervised learning, but it still has its applications in business. We'll learn more about the applications of unsupervised learning, as well as the advantages and disadvantages of using it, in the next lesson.

Let's now consider the areas we're going to cover in this course. First, we'll consider how businesses can use unsupervised learning and also introduce a case study. We'll then learn how to find groups in a dataset using clustering methods. Finally, we'll learn how to spot patterns in a dataset using association rules.

This concludes our first lesson on finding patterns in data. In this lesson, we started by comparing supervised and unsupervised learning methods and considering how unsupervised methods can be used to explore a dataset. We then outlined the upcoming material for this course.

In the next lesson, we'll consider some examples where interesting patterns can arise in business data.