
IBM Watson Machine Learning: Build a multiclass classification model by running an AutoAI experiment

February 14, 2020

This video shows you how to build a multiclass classification model that evaluates and rates car purchases based on typical criteria. For this video, you’ll use a data set called Car Evaluation, which you’ll find in the IBM Watson Gallery. The data set is sourced from the University of California Irvine Machine Learning Repository. It includes six attributes a buyer would consider when purchasing a vehicle: the purchase price, the maintenance costs, the number of doors, the passenger capacity, the size of the luggage area, and the safety rating. The last column is the evaluation for each vehicle: acceptable, good, unacceptable, or very good.
Add this data set to the Watson Machine Learning project, and then view the project. On the Assets tab, you’ll see the data set.
Now you’re ready to add an AutoAI experiment to the project. This project already has the Watson Machine Learning service associated with it. If you haven’t done that yet, first watch the video showing how to run an AutoAI experiment based on a sample. Provide a name for the experiment, and then click Create.
Next, the AutoAI Experiment Builder displays. You first need to load the training data; in this case, the data set comes from the project. Select the CarEvaluation CSV file from the list. AutoAI reads the file and lists the columns it finds. In this case, you want to select Evaluation as the column to predict.
Now edit the experiment settings, starting with the Data Source settings. If you have a large data set, you can run the experiment on a subsample of rows, and you can configure how much of the data is used for training and how much is held out for evaluation. The default is a 90%-10% split, with 10% of the data reserved for evaluation. You can also select which columns from the data set to include when running the experiment. On the Prediction panel, you can select a prediction type. In this case, the Evaluation column includes four possible outcomes, so a multiclass classification model is most suitable. The default metric for multiclass classification is accuracy. If you’d like, you can choose the specific algorithms to consider for this experiment. On the General panel, you can review other details about the experiment.
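The 90%-10% train/evaluation split described above can be sketched with scikit-learn. This is a hypothetical stand-in for what AutoAI does for you internally, using made-up rows with column names borrowed from the Car Evaluation schema:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the Car Evaluation data set;
# in practice AutoAI reads the CSV file you selected from the project.
data = pd.DataFrame({
    "buying": ["vhigh", "low", "med", "high"] * 25,
    "safety": ["low", "med", "high", "med"] * 25,
    "evaluation": ["unacc", "good", "acc", "vgood"] * 25,
})

# Evaluation is the column to predict; the rest are features.
X = data.drop(columns=["evaluation"])
y = data["evaluation"]

# 90% for training, 10% held out for evaluation -- the default split.
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.1, random_state=42, stratify=y
)
print(len(X_train), len(X_holdout))  # 90 10
```

Stratifying on the label keeps the four evaluation classes represented in both partitions, which matters for a small, imbalanced data set.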
Save the settings, and then run the experiment. As you wait, the Pipeline leaderboard fills in with the generated pipelines, which use different estimators, such as the XGBoost classifier, and enhancements, such as hyperparameter optimization and feature engineering; the pipelines are ranked by the accuracy metric. Hyperparameter optimization automatically explores a search space of candidate hyperparameters, building a series of models and comparing them using the metric of interest. Feature engineering attempts to transform the raw data into the combination of features that best represents the problem, to achieve the most accurate prediction.
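The idea behind hyperparameter optimization can be illustrated with a simple grid search (AutoAI’s own optimizer is more sophisticated than this exhaustive sketch, and the estimator and search space here are made up for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Toy four-class data standing in for the car evaluation problem.
X, y = make_classification(n_samples=200, n_classes=4,
                           n_informative=6, random_state=0)

# A search space of candidate hyperparameters.
grid = {"max_depth": [3, 5, 7], "min_samples_leaf": [1, 5]}

# Build a model for each combination and compare the models
# by cross-validated accuracy -- the metric of interest here.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), grid,
                      scoring="accuracy", cv=3)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The winning combination is whichever scores best on the chosen metric, which mirrors how the leaderboard ranks pipelines by accuracy.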
It looks like AutoAI has chosen the LGBM Classifier as the top algorithm; it’s a gradient boosting framework that uses a leaf-wise (vertical) tree-based learning algorithm. Okay! The run has completed. View the progress map to see details of the run. It looks like Pipeline 4 is ranked as the #1 pipeline. Viewing that pipeline in the leaderboard, you can see the model evaluation measures and the ROC curve.

You can also compare the pipelines. This chart provides metrics for the four pipelines, viewed by cross-validation score or holdout score. You can also rank the pipelines by other metrics, such as F1 weighted, which averages each class’s F1 score (the harmonic mean of precision and recall), weighted by how often that class occurs. You can select an individual pipeline to review its model evaluation, which includes the ROC curve.

During AutoAI training, your data set is split into two parts: training data and holdout data. The training data is used by the AutoAI training stages to generate the model pipelines, and cross-validation scores are used to rank them. After training, the holdout data is used to evaluate the resulting pipeline models and to compute performance information such as ROC curves and confusion matrices. You can also view the precision-recall curve, threshold chart, model information, and feature importance.

This pipeline had the highest ranking, so you can save it as a machine learning model. Just accept the defaults, and save the model. Now view the model. The Overview tab shows a model summary and the input schema. On the Deployments tab, add a deployment. This will be a web service deployment with the specified name. When you’re ready, save the new deployment. When the model deployment is complete, view the deployment. The Overview tab shows the basic deployment information. On the Test tab, you can test the model prediction: either enter test input data or paste JSON input data, and click Predict.
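The F1 weighted metric mentioned above can be computed with scikit-learn. The labels and predictions below are made up to show the calculation:

```python
from sklearn.metrics import f1_score

# Hypothetical true labels and predictions for the four evaluation classes.
y_true = ["unacc", "unacc", "acc", "good", "vgood", "acc"]
y_pred = ["unacc", "acc",   "acc", "good", "vgood", "acc"]

# Weighted F1: the per-class F1 scores, averaged by class support
# (how many true examples each class has).
score = f1_score(y_true, y_pred, average="weighted")
print(round(score, 3))  # 0.822
```

Because rarer classes contribute less to the average, weighted F1 behaves differently from plain (macro) F1 on imbalanced data like the car evaluations, where "unacceptable" dominates.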
This shows that the first car evaluation is good with 95% probability, the second is acceptable with 99.9% probability, the third is unacceptable with 99.9% probability, and the fourth is very good with 99.9% probability. On the Implementation tab, you’ll find the scoring endpoint for future reference, along with code snippets for various programming languages to use this deployment from your application. You can also view the API specification from here. Back on the Assets tab in the project, you’ll find the AutoAI experiment and the model, and on the Deployments tab, you’ll find the deployment. Find more videos in the IBM Watson Data and AI Learning Center.
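A minimal sketch of what scoring from your own application might look like, assuming the Watson Machine Learning payload shape of `fields` plus `values` rows; the column names and response values below are hypothetical:

```python
# Hypothetical scoring payload: field names follow the Car Evaluation
# columns, and each entry in "values" is one car to score.
payload = {
    "input_data": [{
        "fields": ["buying", "maint", "doors", "persons", "lug_boot", "safety"],
        "values": [["low", "low", "4", "4", "big", "high"]],
    }]
}

# In an application you would POST this JSON to the scoring endpoint
# shown on the Implementation tab, e.g. with the requests library:
#   requests.post(scoring_url, json=payload, headers=auth_headers)

def top_predictions(response):
    """Pull the predicted class for each scored row out of a response."""
    pred = response["predictions"][0]
    label_idx = pred["fields"].index("prediction")
    return [row[label_idx] for row in pred["values"]]

# A made-up response of the shape the service returns: each row pairs
# a predicted class with per-class probabilities.
sample_response = {
    "predictions": [{
        "fields": ["prediction", "probability"],
        "values": [["vgood", [0.0, 0.0, 0.001, 0.999]]],
    }]
}
print(top_predictions(sample_response))  # ['vgood']
```

The same payload works from the Test tab’s JSON input, which is a convenient way to verify the format before wiring it into an application.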
