3. Obtaining Data

 

Overview

If you need to gather data for an analytics project, there are numerous methods available to you. This lesson introduces some of the most common ones.

Lesson Notes

Lesson Goal

The goal of this lesson is to explore the different ways you can gather data.

Surveys

Surveys can deliver useful insights but require some effort to set up. When designing a survey, you need to consider:

  • Who should be surveyed
  • How to carry out the survey (by phone, online, in person, etc.)
  • What questions to ask

Focus Groups

Focus groups bring together a group of people for a discussion with a facilitator. They can take place in person or online. They can generate deeper insights than surveys but generally reach fewer people, due to the need for a facilitator.

Experiments

In an experiment, companies measure some outcome of interest, make some sort of intervention, then measure the outcome again. Comparing before and after lets them judge the impact of the intervention. For example, a bank can measure the impact of an automation project on waiting times using an experiment.

Experiments can be challenging to set up. A company needs to ensure that the results it observes can be attributed to its intervention, and not to some other factor.

Observation

Observation works similarly to an experiment, except no intervention is made. We simply watch some business process to see how it works. Many online companies observe customer behavior using cookies and other tracking technology. They can use this information to improve their business at a later date.

Transcript

When considering a data analysis project, we rarely give much thought to where the data actually comes from.

We tend to just dive in and start analyzing it.

In reality, you should consider many important factors when obtaining data for further analysis.

In this lesson, we'll explore different ways to gather data, such as surveys, focus groups, experiments, and observations.

To do this, we'll assume that your data analysis project requires you to create a new data set.

A common method for obtaining data is from surveys or questionnaires.

There are several things you need to think about when creating a survey.

First, you should decide who you want to survey.

Next, you need to consider how the survey will be carried out: online, by phone, or in person.

Finally, you have to determine what questions to ask.

Questions should be guided by the information you want to learn from the survey, but should not push respondents toward particular answers.

Surveys can deliver useful insights into your company, but generally require a lot of work, both to set up and to analyze the results.

Similar data gathering methods include focus groups and interviews.

Let's take a closer look at focus groups.

A focus group brings together a number of people to discuss a product, service, or company with the help of a facilitator.

The individuals involved will usually all share some characteristic of interest to the company, such as age, location, or gender.

In the past, focus groups took place in person. However, they can now also take place online. Focus groups can generate deeper insights than surveys, as discussion between participants and the facilitator can go beyond a question-and-answer format.

The disadvantage is that focus groups usually reach a smaller audience than a survey due to the need for a human facilitator.

Another way of gathering data is through experiments.

You might think that experiments are only used by academics or scientists, but they have a place in the world of business, too.

In an experiment, companies measure some outcome of interest.

They then make an intervention that may affect the outcome, and measure the outcome again to see if the intervention caused any changes.

For example, a bank wants to see if replacing tellers and cashiers with machines affects customer transaction and wait times.

They can do this by measuring these timings before an automation project, and then again afterwards to see if there's a significant change.

To improve the experiment, they may automate some branches but not others.

They can then compare changes in transaction and waiting times between branches that were automated, and branches that were not.

This accounts for the possibility that some external factor, like an overall increase in customers, may impact waiting times in all branches.
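The comparison described above is often called a difference-in-differences comparison. The following is a minimal sketch of the arithmetic, not the bank's actual method: the waiting times are made-up illustrative values, and the function name `difference_in_differences` is hypothetical.

```python
from statistics import mean

# Hypothetical average waiting times in minutes, one value per branch.
# These numbers are invented purely for illustration.
automated_before = [12.0, 10.0, 14.0]
automated_after = [9.0, 8.0, 10.0]
control_before = [11.0, 13.0, 12.0]
control_after = [12.0, 14.0, 13.0]


def difference_in_differences(treated_before, treated_after,
                              untreated_before, untreated_after):
    """Change in the automated branches minus change in the other branches."""
    treated_change = mean(treated_after) - mean(treated_before)
    untreated_change = mean(untreated_after) - mean(untreated_before)
    return treated_change - untreated_change


effect = difference_in_differences(
    automated_before, automated_after, control_before, control_after
)
print(f"Estimated effect of automation on waiting time: {effect:+.1f} minutes")
```

Subtracting the change in the non-automated branches removes any shift that affected every branch, such as an overall increase in customers. The sketch does not test whether the difference is statistically significant.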

Experiments can deliver valuable real world information that's highly relevant to the company, but they need to be set up very carefully. For example, the bank needs to be sure that other factors that could increase or decrease customer wait times during the experiment are known and controlled for.

They also need to be sure that the branches selected for automation are similar to those not selected, and that both groups are broadly representative of the whole network.

An alternative to experimentation is observation.

In an observational study, no intervention is made.

The researcher simply observes some factor of interest and collects data on what happens.

For example, the bank could use observation to identify if waiting times in their branches are longer on particular days or at particular times of the day.
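As a rough sketch of how such observations might be summarized, the example below averages waiting times by day of the week. The records and the helper name `average_wait_by_day` are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (day of week, waiting time in minutes).
observations = [
    ("Monday", 14.0), ("Monday", 16.0),
    ("Tuesday", 8.0), ("Tuesday", 10.0),
    ("Friday", 18.0), ("Friday", 20.0),
]


def average_wait_by_day(records):
    """Group waiting times by day and return the mean for each day."""
    waits = defaultdict(list)
    for day, minutes in records:
        waits[day].append(minutes)
    return {day: mean(values) for day, values in waits.items()}


averages = average_wait_by_day(observations)
# The day with the highest average wait in the sample data.
longest_wait_day = max(averages, key=averages.get)
print(averages)
print("Longest average wait:", longest_wait_day)
```

The same grouping could be applied to the hour of the day to spot busy periods within a day.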

Gathering data through observation is particularly common in online businesses.

When people browse a website, cookies and other tracking tools can be used to identify what pages they visit, how they reach the site, how long they spend on it, and so on.
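To make this concrete, here is a small sketch of what tracked browsing data might look like and how it could be summarized. The `PageView` record and its fields are assumptions made for illustration, not the format of any particular tracking tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PageView:
    """One hypothetical tracked page view."""
    visitor_id: str
    page: str
    referrer: str          # how the visitor reached the page
    seconds_on_page: int


# Invented sample data for illustration only.
page_views = [
    PageView("v1", "/home", "search", 30),
    PageView("v1", "/pricing", "internal", 90),
    PageView("v2", "/home", "email", 45),
    PageView("v2", "/blog", "internal", 120),
    PageView("v3", "/home", "search", 15),
]

# Which pages are visited, and how visitors arrive.
visits_per_page = Counter(view.page for view in page_views)
visits_per_referrer = Counter(view.referrer for view in page_views)

# Total time each visitor spends on the site.
time_per_visitor = Counter()
for view in page_views:
    time_per_visitor[view.visitor_id] += view.seconds_on_page

print(visits_per_page.most_common(1))
print(dict(time_per_visitor))
```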

Information like this would be very difficult to gather for an offline business.

This kind of observational data is a big contributor to the increased amount of data that's present in the modern world.

While there are other ways of obtaining data, these are the most common in a business context.

When collecting your own data, you'll need to carefully plan the process before starting.

Bad data can ruin an analysis project before you even start, so it's important to get the data right.

In the next lesson, we'll see how you can obtain existing data from various sources.