2. Structured and Unstructured Data

This lesson introduces the concepts of structured and unstructured data. We’ll also discuss the growth of unstructured data in the modern world.

Lesson Notes

What is Structured and Unstructured Data?

Structured data is data that has a clearly defined format. Usually, this is a tabular format, like the tables found in a database. For example, a table recording sales figures would be structured data.

Unstructured data is data that has no predefined format, so it cannot be readily organized into rows and columns. For example, text and images are unstructured data. Finding insights from unstructured data is usually more difficult than finding insights from structured data.
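To make the contrast concrete, here is a minimal Python sketch. The sales records, the field names (region, month, amount) and the customer post are all made up for illustration. It shows why structured data is easier to analyze: because every record has the same named fields, a total per region takes only a few lines, while the free-text post has no fields to add up.

```python
# Structured: every record has the same named fields (hypothetical sales table).
sales = [
    {"region": "North", "month": "Jan", "amount": 1200},
    {"region": "South", "month": "Jan", "amount": 950},
    {"region": "North", "month": "Feb", "amount": 1430},
]

# Unstructured: free text with no predefined fields (hypothetical customer post).
post = "Loved the new store in the north, but checkout took forever in January."


def total_by_region(rows):
    """Sum the amount field per region; this works because the fields are known."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


print(total_by_region(sales))  # {'North': 2630, 'South': 950}
```

Answering a similar question from the post (for example, which region the customer is talking about) would first require interpreting the text.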

The Rise of Unstructured Data

Traditionally, most business data was stored in structured formats. However, this is changing: more and more companies now collect and store unstructured data. For example, social media companies store the images people upload and the text of their posts. As a result, there is a growing need to generate insights from unstructured data.

Dealing With Structured and Unstructured Data

When working with applications like Alteryx, Tableau, or Power BI, all your data must be stored in structured tables. In Excel, you can lay out your data in a structured manner, but it’s not a requirement, and you can create an unstructured layout instead.

When you collect data, you’ll often collect both structured and unstructured data. For example, in a survey, questions asking people to rate something from 1 to 10 produce structured data. Questions asking people to respond with free text produce unstructured data.
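The survey example can be sketched in Python as follows. The responses and the field names (rating, comment) are invented for illustration, and the word count is only one simple way to start summarizing free text, not a recommended method. The ratings can be averaged directly, while the comments need extra processing before anything can be counted.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses: a 1-to-10 rating plus a free-text comment.
responses = [
    {"rating": 8, "comment": "Fast delivery and friendly staff"},
    {"rating": 4, "comment": "Delivery was late and the box was damaged"},
    {"rating": 9, "comment": "Friendly staff, great prices"},
]

# Structured part: the ratings are numbers, so they can be averaged directly.
average_rating = mean(r["rating"] for r in responses)


def word_counts(comments):
    """Count words across comments after lowercasing and trimming punctuation."""
    counts = Counter()
    for comment in comments:
        words = (w.strip(".,!?").lower() for w in comment.split())
        counts.update(w for w in words if w)
    return counts


# Unstructured part: the text has to be broken into words before counting.
counts = word_counts(r["comment"] for r in responses)

print(average_rating)
print(counts["delivery"], counts["friendly"], counts["staff"])
```

Even this simple word count involves choices, such as how to handle punctuation and capital letters, that the numeric ratings never require.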