Overview

In this lesson, we will further clean the dataset and create map points for each unique tournament entry in the dataset.

Summary

Lesson Goal

The goal of this lesson is to create map points for all entries.

Key Steps

  1. Remove any duplicate entries
  2. Create map points

Step 1: Remove any duplicate entries

Several actions are required to accomplish this step:

  • Use a Crosstab tool to group and rearrange the data with JSON_Name as the column headers and JSON_ValueString as the values for the new columns
  • Use a Unique tool to bring forward unique values based on the Tournament, Website, Address, Date to start, Date to finish, Venue, error_message, and status fields
  • Use a Select tool to bring forward only the Tournament, Website, Date to start, Date to finish, Venue, results_0_address_components_6_long_name, results_0_formatted_address, results_0_geometry_location_lat, and results_0_geometry_location_lng fields
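
For readers who want to see the logic of Step 1 outside the tool, the three actions above can be sketched in plain Python. This is only an illustration of the idea, not how the tools are implemented: the functions cross_tab, unique, and select are hypothetical helpers, the RecordID field and all sample values are invented, and only a few of the lesson's fields are used to keep the example short.

```python
# Hypothetical long-format rows: one row per record and JSON name/value pair.
# RecordID and all values are invented for illustration.
rows = [
    {"RecordID": 1, "Tournament": "Open A", "Venue": "Club 1",
     "JSON_Name": "status", "JSON_ValueString": "OK"},
    {"RecordID": 1, "Tournament": "Open A", "Venue": "Club 1",
     "JSON_Name": "results_0_geometry_location_lat", "JSON_ValueString": "41.88"},
    {"RecordID": 2, "Tournament": "Open A", "Venue": "Club 1",
     "JSON_Name": "status", "JSON_ValueString": "OK"},
    {"RecordID": 2, "Tournament": "Open A", "Venue": "Club 1",
     "JSON_Name": "results_0_geometry_location_lat", "JSON_ValueString": "41.88"},
    {"RecordID": 3, "Tournament": "Open B", "Venue": "Club 2",
     "JSON_Name": "status", "JSON_ValueString": "ZERO_RESULTS"},
]


def cross_tab(rows, name_key="JSON_Name", value_key="JSON_ValueString"):
    """Pivot long rows to wide rows, grouping by every other field."""
    wide = {}
    for row in rows:
        group = tuple(sorted(
            (k, v) for k, v in row.items() if k not in (name_key, value_key)
        ))
        # One output row per group; each JSON name becomes a column.
        wide.setdefault(group, dict(group))[row[name_key]] = row[value_key]
    return list(wide.values())


def unique(rows, keys):
    """Keep the first row seen for each combination of the key fields."""
    seen, kept = set(), []
    for row in rows:
        signature = tuple(row.get(k) for k in keys)
        if signature not in seen:
            seen.add(signature)
            kept.append(row)
    return kept


def select(rows, fields):
    """Bring forward only the named fields (missing values become None)."""
    return [{f: row.get(f) for f in fields} for row in rows]


wide = cross_tab(rows)
deduped = unique(wide, ["Tournament", "Venue", "status"])
trimmed = select(deduped, ["Tournament", "Venue", "results_0_geometry_location_lat"])
```

In this sample, records 1 and 2 describe the same tournament, so the unique step keeps only the first of them, and the record with no coordinates ends up with None in the latitude column.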

Step 2: Create map points

Several actions are required to accomplish this step:

  • Use a Filter tool to remove entries where the latitude field is empty
  • Use a Create Points tool to create map point spatial objects for each entry
  • Use a Select tool to bring forward all fields EXCEPT API key, URL, and DownloadHeaders
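
The actions in Step 2 can be sketched in plain Python as well. This is a rough illustration under stated assumptions: GeoJSON-style Point dictionaries stand in for the tool's spatial objects, the helper names and the SpatialObj field name are hypothetical, and the sample rows, key, and URL are invented.

```python
# Invented sample rows; field names follow the lesson, values are made up.
rows = [
    {"Tournament": "Open A",
     "results_0_geometry_location_lat": "41.88",
     "results_0_geometry_location_lng": "-87.63",
     "API key": "hypothetical-key",
     "URL": "https://example.invalid/a",
     "DownloadHeaders": "HTTP/1.1 200 OK"},
    {"Tournament": "Open B",
     "results_0_geometry_location_lat": "",
     "results_0_geometry_location_lng": "",
     "API key": "hypothetical-key",
     "URL": "https://example.invalid/b",
     "DownloadHeaders": "HTTP/1.1 200 OK"},
]

LAT = "results_0_geometry_location_lat"
LNG = "results_0_geometry_location_lng"


def split_by_latitude(rows, lat_field=LAT):
    """Return (rows with a latitude, rows without), like a true/false filter."""
    located, unlocated = [], []
    for row in rows:
        has_latitude = row.get(lat_field) not in (None, "")
        (located if has_latitude else unlocated).append(row)
    return located, unlocated


def create_point(row, lat_field=LAT, lng_field=LNG):
    """Build a GeoJSON-style Point; GeoJSON orders coordinates as [lng, lat]."""
    return {"type": "Point",
            "coordinates": [float(row[lng_field]), float(row[lat_field])]}


def drop_fields(row, unwanted=("API key", "URL", "DownloadHeaders")):
    """Bring forward every field except the unwanted ones."""
    return {k: v for k, v in row.items() if k not in unwanted}


located, unlocated = split_by_latitude(rows)
mapped = [drop_fields({**row, "SpatialObj": create_point(row)}) for row in located]
```

Keeping the unlocated rows in their own list, rather than discarding them, makes it easy to review which entries were returned without coordinates.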

Transcript

- So far, we've downloaded calendar data from the Chicago Point website, passed the address information through the Google API to generate geo-coordinates, and cached the data in order to minimize calls to the API.

Our goal in this lesson is to create map points for all entries so we can display this information on a map in our final report. We'll achieve this through two key steps. First, we'll remove any duplicate entries by applying cross tab and unique tools to the API output.

Next, we'll create map points so we can display a map in our final report. We'll start by removing duplicate values.

The data from the Google API can be read as JSON, or JavaScript Object Notation.

This is a standard file format well suited to transmitting structured data. We'll begin by using the cross tab tool to rearrange our fields.
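
To make the JSON structure more concrete, here is a minimal Python sketch of how a nested JSON response can be flattened into name and value pairs. The underscore-joined naming is an assumption chosen to mimic field names seen in this lesson, such as results_0_geometry_location_lat. The flatten function is hypothetical, and the payload is an invented, heavily abbreviated geocoding-style response.

```python
import json

# Invented, abbreviated payload; a real response has many more fields.
payload = """
{
  "results": [
    {
      "formatted_address": "123 Example St, Chicago, IL, USA",
      "geometry": {"location": {"lat": 41.88, "lng": -87.63}}
    }
  ],
  "status": "OK"
}
"""


def flatten(node, prefix=""):
    """Walk nested dicts and lists, returning (name, value-as-string) pairs."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)  # list positions become part of the name
    else:
        return [(prefix, str(node))]
    pairs = []
    for key, value in items:
        name = f"{prefix}_{key}" if prefix else str(key)
        pairs.extend(flatten(value, name))
    return pairs


pairs = flatten(json.loads(payload))
for name, value in pairs:
    print(name, "=", value)
```

Each pair corresponds to one row of name and value data, which is the long layout that the cross tab step then turns into columns.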

We'll connect a cross tab tool and group the data by all fields except JSON_Name and JSON_ValueString.

We'll make the JSON_Name field the new column headers, with the JSON_ValueString entries as the values for the new columns.

We'll then run the workflow.

We can see that our data appears horizontally and there are a lot of columns.

Let's tidy this up with unique and select tools.

We'll bring a unique tool onto the canvas and select only the tournament, website, address, date to start, date to finish, venue, error message, and status fields.

These fields capture most of the common information, so adding other fields would be superfluous. We'll run the workflow to return unique entries based on these criteria.

We'll then connect a select tool to the U node and reduce the data set to just tournament, website, date to start, date to finish, venue, results_0_address_components_6_long_name, results_0_formatted_address, and the results_0_geometry_location latitude and longitude fields.

The formatted address field contains the formal address details while the latitude and longitude fields contain geo-coordinates.

As mentioned previously, there are a lot of different fields with address information for each venue. Unfortunately, when doing this kind of analysis in the real world, you'll need to look at all these fields individually to determine which ones to keep. Let's run the workflow again to apply these changes.

At this point, we can move on to step two and create map points for all venues. First, we need to ensure that all our venues actually have coordinate information.

This offers us another opportunity to check the integrity of our data and fix any potential issues.

Let's bring a filter tool onto the canvas and filter by one of the geo-coordinate fields being not empty.

In this case, I'll choose latitude and run the workflow.

We can see there are 33 venues for which the Google API has not returned geo-coordinates. The other two entries here don't have associated venues at all.

In the real world, we may wish to manually enter details for these entries as appropriate. However, for our example, we'll simply exclude them.

Next, we'll navigate to the spatial tab and connect a create points tool to the true node of the filter tool.

We'll select the latitude and longitude fields and run the workflow again.

Now that we've created the centroids, or map points, for our map, we'll stop the lesson here.

In the next and final lesson, we'll finish up organizing the data set and develop our final report.
