13. Exporting the Model Object
Having successfully learned how to train and validate a range of predictive models, in this lesson you will learn how to export your preferred predictive algorithms for subsequent use.
- Model Objects contain the configuration settings for each model
- We can export these objects to a database file for use in future workflows
Exporting Model Objects
- To export the preferred Model Objects, the O output from each model needs to be combined into a single set
- We want this workflow to be flexible and easily adjusted for new datasets or information, since the preferred models could change in the future
- To account for this, we cross reference the list of most accurate models with the combined dataset of all Model Objects to return the preferred model objects
- The dataset of Model Objects can be exported to a database file with the Export Data tool
- Note that the database should only contain Name and Object fields
Over this course, we've created various predictive models based on a single data set and analyzed the results. We've generated summary confusion matrix data for each of these models, and we're now ready to choose our preferred models, so that we can deploy them to a new data set. The sort tool has ranked our models according to accuracy, minimizing false positives. The forest model came out on top; however, the second and third decision tree models also performed well. We'll make these our second and third alternatives, giving us three models for our new data set.
In this lesson, we'll prepare these three models for deployment to future data sets. We'll save them as model objects so that we can reuse them without recreating them every time. We'll accomplish this goal in three key steps. First, we'll separate the top three models from the other four.
Next, we'll isolate the model object fields, as they contain the data that we'll need to export. Finally, we'll export our models to an Alteryx database file, so we can use them in future workflows. We'll start by separating the top three models that we would like to use in future workflows. To that end, we'll connect a sample tool to the sort tool, ensure the First N Records radio button is selected, and set N equal to three.
We'll then connect a RecordID tool to the sample tool. This will help us put the models back in the correct order later in the workflow.
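For readers who think more easily in code than on the canvas, step one can be sketched in Python as follows. This is only an analogy for the sort, sample, and RecordID tools, not how Alteryx implements them. The model names and accuracy figures are invented for illustration, and the ranking here uses accuracy alone.

```python
# Hypothetical summary rows for the seven models.
# Names and accuracy values are illustrative, not results from the course.
summary = [
    {"Model_Name": "Decision_Tree_1", "Accuracy": 0.80},
    {"Model_Name": "Other_Model_1", "Accuracy": 0.75},
    {"Model_Name": "Forest", "Accuracy": 0.90},
    {"Model_Name": "Decision_Tree_3", "Accuracy": 0.87},
    {"Model_Name": "Other_Model_2", "Accuracy": 0.70},
    {"Model_Name": "Decision_Tree_2", "Accuracy": 0.88},
    {"Model_Name": "Other_Model_3", "Accuracy": 0.65},
]

# Sort tool analogue: rank the models from most to least accurate.
ranked = sorted(summary, key=lambda row: row["Accuracy"], reverse=True)

# Sample tool analogue: First N Records with N = 3.
first_three = ranked[:3]

# RecordID tool analogue: number the kept rows from 1 so the rank
# order can be restored after later steps shuffle the rows.
top_three = [{"RecordID": i, **row} for i, row in enumerate(first_three, start=1)]

for row in top_three:
    print(row["RecordID"], row["Model_Name"])
```

The RecordID column carries no information about the models themselves; it exists only so that the ranking survives the join performed later.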
We are now ready to move on to step two and isolate the model object fields for our preferred three models. We want to be able to deploy our models to a new data set. This means exporting the model object field that comes from each model tool.
To do this, we'll need to combine the model objects with a union tool.
We could just connect the top three models to the union tool; however, we would like to future-proof this workflow in case some parameters change. To that end, we'll cross-reference all the models with the list from the RecordID tool. We'll bring a union tool onto the canvas and connect the O output node from each of our seven models.
Again, we'll specify that the incoming connections for the union tool should be wireless.
We now need to find an automated way to cross-reference the full list of model objects with the three preferred models.
We can do this with a join tool. We'll bring a join tool onto the canvas, connecting one input node to the union and the other to the RecordID tool.
At this point, we only need the model names, object fields, and RecordID, so we'll select just the left RecordID and model names, together with the right model objects. We'll join the data sets on model name and name.
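The cross-referencing idea can be sketched in Python as a lookup by name. This is a rough analogy under stated assumptions: the field names `Model_Name`, `Name`, and `Object` mirror the ones mentioned in the lesson, while the model names and the placeholder object strings are hypothetical stand-ins for real model objects.

```python
# Union tool analogue: one row per model, holding its name and model object.
# The object values are placeholder strings, not real Alteryx model objects.
model_names = [
    "Forest", "Decision_Tree_1", "Decision_Tree_2", "Decision_Tree_3",
    "Other_Model_1", "Other_Model_2", "Other_Model_3",
]
all_objects = [{"Name": n, "Object": f"<object for {n}>"} for n in model_names]

# RecordID tool output: the three preferred models in rank order (hypothetical).
preferred = [
    {"RecordID": 1, "Model_Name": "Forest"},
    {"RecordID": 2, "Model_Name": "Decision_Tree_2"},
    {"RecordID": 3, "Model_Name": "Decision_Tree_3"},
]

# Join tool analogue: match Model_Name (left) to Name (right) and keep
# only matched rows, taking RecordID and Model_Name from the left side
# and Object from the right side.
objects_by_name = {row["Name"]: row["Object"] for row in all_objects}
joined = [
    {
        "RecordID": p["RecordID"],
        "Model_Name": p["Model_Name"],
        "Object": objects_by_name[p["Model_Name"]],
    }
    for p in preferred
    if p["Model_Name"] in objects_by_name
]
```

Because the preferred list is computed upstream rather than wired by hand, a change in the ranking changes which objects come out of the join without any change to the connections.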
We'll run the workflow to ensure the output is correct.
As before, this could take some time to process, so I'll cut out the wait time in this video. As we can see, the join tool has not respected our previous sort work, so we'll need to connect another sort tool to get the models back in the correct order. In the configuration window, we'll sort the models by RecordID, in ascending order.
At this point, we're ready to move on to step three and export our models to an Alteryx database file, so we can use them in future workflows. The data in this format must contain just two fields, name and object, so as a final step, we'll bring a select tool onto the canvas.
In the configuration window, we'll deselect the RecordID field and rename model name to name.
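The re-sorting and field selection can be sketched in Python as below. The rows are hypothetical and deliberately out of order to mimic what the join returned; the placeholder strings stand in for real model objects.

```python
# Hypothetical join output, out of rank order.
joined = [
    {"RecordID": 3, "Model_Name": "Decision_Tree_3", "Object": "<object C>"},
    {"RecordID": 1, "Model_Name": "Forest", "Object": "<object A>"},
    {"RecordID": 2, "Model_Name": "Decision_Tree_2", "Object": "<object B>"},
]

# Sort tool analogue: ascending by RecordID restores the ranking.
ordered = sorted(joined, key=lambda row: row["RecordID"])

# Select tool analogue: drop RecordID and rename Model_Name to Name,
# leaving exactly the two fields the export expects.
export_rows = [{"Name": row["Model_Name"], "Object": row["Object"]} for row in ordered]
```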
We're now ready to create the output database for our models. To that end, we'll bring an output data tool onto the canvas and connect it to the select tool.
We'll specify a directory for our top three models, name the file fitted_models, and change the file type to Alteryx database file.
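The Alteryx database format is specific to Alteryx and is written by the Output Data tool, so it is not reproduced here. As a loose analogy only, the same save-once, reuse-later idea looks like this in Python, using the standard library's `pickle` module as a stand-in file format. The rows, the temporary directory, and the `.pkl` extension are all assumptions made for the sketch.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical Name/Object rows; the objects are placeholder strings.
export_rows = [
    {"Name": "Forest", "Object": "<object A>"},
    {"Name": "Decision_Tree_2", "Object": "<object B>"},
    {"Name": "Decision_Tree_3", "Object": "<object C>"},
]

# Stand-in for the chosen output directory and the fitted_models file name.
output_dir = Path(tempfile.mkdtemp())
output_path = output_dir / "fitted_models.pkl"

# Save the models once...
with output_path.open("wb") as f:
    pickle.dump(export_rows, f)

# ...and load them back later without rebuilding them.
with output_path.open("rb") as f:
    restored = pickle.load(f)
```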
We'll run the workflow to see the output.
Again, this could take some time to process, so I'll cut the wait time. If we open up the specified directory, we can see that the database file has been created. Let's stop here and recap our lesson. First, we separated out the top three models, using sample and sort tools.
We then isolated the model object fields for these models. To future-proof this workflow, we cross-referenced all the model objects with a list of the top three models, using a join tool. We then re-sorted the model objects to make sure they were in the correct order. As a final step, we exported our models to an Alteryx database file, so we can use them in future workflows. In the next lesson, we'll fit these models to a new data set, to see how they can be applied to improve the grant approval process.