Intermediate 25 min

Model Evaluation and Interpretation

Now let’s evaluate the tuned model properly and save it for future use. A single accuracy number can hide per-class weaknesses, so we’ll also look at precision, recall, F1-score, and the confusion matrix.

Using the Best Model

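GridSearchCV has already done the final fit for us: with the default refit=True, it retrains the best parameter combination on the full training set and exposes the result as best_estimator_. Let’s use it for final evaluation: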
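
The original code cell did not survive on this page, so here is a minimal sketch of what this step might look like. The split, the step names ("preprocessor", "classifier"), the RandomForestClassifier, and the parameter grid are all assumptions made so the example runs on its own; substitute the objects from your earlier steps.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed setup, rebuilt here so the example is self-contained.
X, y = load_wine(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

pipeline = Pipeline([
    ("preprocessor", ColumnTransformer([("scale", StandardScaler(), list(X.columns))])),
    ("classifier", RandomForestClassifier(random_state=42)),
])
# Hypothetical grid; the tutorial's actual grid may differ.
param_grid = {
    "classifier__n_estimators": [50, 100],
    "classifier__max_depth": [None, 5],
}

grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
grid_search.fit(X_train, y_train)

# best_estimator_ is the full pipeline, refit on all of X_train.
best_model = grid_search.best_estimator_
print("Best parameters:", grid_search.best_params_)
print(f"Best CV accuracy: {grid_search.best_score_:.3f}")
print(f"Test accuracy:    {best_model.score(X_test, y_test):.3f}")
```

The test set is only touched once, after tuning, so the test accuracy is an honest estimate rather than a number the search was optimized against.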

Classification Report

Accuracy alone doesn’t tell the whole story. The classification report shows precision, recall, and F1-score for each class:
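
A sketch of how the report could be produced with scikit-learn’s classification_report. The pipeline below is a directly fitted stand-in for grid_search.best_estimator_ (an assumption, so the snippet runs by itself).

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, stratify=wine.target, random_state=42
)

# Stand-in for grid_search.best_estimator_ (assumed model and step names).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])
best_model.fit(X_train, y_train)

y_pred = best_model.predict(X_test)

# One row per class, followed by accuracy and macro/weighted averages.
report = classification_report(y_test, y_pred, target_names=list(wine.target_names))
print(report)
```

Precision answers "of the wines predicted as this class, how many were right?", while recall answers "of the wines that really are this class, how many did we find?". F1 is their harmonic mean.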

Confusion Matrix

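The confusion matrix shows exactly where the model makes mistakes. In scikit-learn’s layout each row is a true class and each column is a predicted class, so correct predictions sit on the diagonal and every off-diagonal cell is a specific kind of error: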
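
A minimal sketch, again using a directly fitted stand-in for the tuned pipeline (an assumption). Wrapping the matrix in a pandas DataFrame is just one way to label the rows and columns.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, stratify=wine.target, random_state=42
)

# Stand-in for grid_search.best_estimator_ (assumed model and step names).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])
best_model.fit(X_train, y_train)
y_pred = best_model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
cm_df = pd.DataFrame(
    cm,
    index=[f"true {name}" for name in wine.target_names],
    columns=[f"pred {name}" for name in wine.target_names],
)
print(cm_df)
```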

Feature Importance (Optional)

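For tree-based models, the fitted estimator exposes a feature_importances_ attribute that ranks features by how much they contributed to the splits. The values sum to 1, so they are best read as relative weights: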
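
A sketch assuming the final pipeline step is a RandomForestClassifier named "classifier" (both are assumptions). Because StandardScaler keeps the column order, the dataset’s feature names line up with the importances; with a more complex preprocessor you would need the transformed feature names instead.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, stratify=wine.target, random_state=42
)

# Stand-in for grid_search.best_estimator_ (assumed model and step names).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])
best_model.fit(X_train, y_train)

# Pull the fitted classifier out of the pipeline by its step name.
classifier = best_model.named_steps["classifier"]

# Impurity-based importances, sorted from most to least important.
importances = pd.Series(
    classifier.feature_importances_, index=wine.feature_names
).sort_values(ascending=False)
print(importances.head(5).round(3))
```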

Saving the Pipeline

Once we’re happy with the model, we save it. The whole pipeline (preprocessing + model) gets saved together:
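
One common way to do this is joblib.dump. The sketch below writes to a temporary directory, and the file name wine_pipeline.joblib is a hypothetical choice; the fitted pipeline is a stand-in for the tuned one.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# Stand-in for grid_search.best_estimator_ (assumed model and step names).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])
best_model.fit(X, y)

# Hypothetical location; in a real project pick a stable path.
model_path = Path(tempfile.mkdtemp()) / "wine_pipeline.joblib"

# The scaler and the classifier are serialized together in one file.
joblib.dump(best_model, model_path)
print(f"Saved pipeline to {model_path} ({model_path.stat().st_size} bytes)")
```

joblib files are pickle-based, so load them with the same scikit-learn version that saved them, and only load files from sources you trust.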

Loading and Using the Pipeline

Later, you can load the pipeline and use it for predictions:
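
A sketch of the round trip with joblib.load. The save step is repeated here only so the example runs on its own; the file name and the stand-in pipeline are assumptions.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, stratify=wine.target, random_state=42
)

# Setup: fit and save a stand-in pipeline (hypothetical file name).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
]).fit(X_train, y_train)
model_path = Path(tempfile.mkdtemp()) / "wine_pipeline.joblib"
joblib.dump(best_model, model_path)

# Later (for example in another script): load and predict on raw features.
loaded_model = joblib.load(model_path)
new_samples = X_test[:3]
predictions = loaded_model.predict(new_samples)
print("Predicted classes:", [wine.target_names[i] for i in predictions])
```

Note that new_samples holds raw, unscaled values. The loaded pipeline applies the fitted scaler before the classifier sees them.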

Why Save the Whole Pipeline?

Saving just the model would require you to:

  1. Remember which preprocessing was used
  2. Manually apply preprocessing to new data
  3. Keep preprocessing code in sync with the model

Saving the pipeline means:

  • One file contains everything
  • Preprocessing is automatic
  • The preprocessing always matches the model it was fitted with
  • The same file can be loaded directly in production code
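
To make the difference concrete, the sketch below compares calling the pipeline with doing its work by hand. The step names and the model are assumptions; the point is only that the manual route needs both fitted objects and the right order of operations.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Stand-in pipeline (assumed model and step names).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
]).fit(X_train, y_train)

scaler = best_model.named_steps["scaler"]
classifier = best_model.named_steps["classifier"]

# Manual route: you must remember to scale with the *fitted* scaler first.
manual_pred = classifier.predict(scaler.transform(X_test))

# Pipeline route: one call, preprocessing included.
pipeline_pred = best_model.predict(X_test)
print("Same predictions:", np.array_equal(manual_pred, pipeline_pred))

# Forgetting the scaler feeds raw values to a model fitted on scaled ones.
unscaled_pred = classifier.predict(X_test)
print("Predictions that change without scaling:", int((unscaled_pred != pipeline_pred).sum()))
```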

Prediction Playground

Try making predictions on custom values:
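
The interactive cell is not available here, so this is one possible sketch. It starts from the dataset’s average wine and overrides two features; the chosen values (13.5 for alcohol, 1100 for proline) are arbitrary illustrations, and the pipeline is a stand-in for the tuned one.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

wine = load_wine(as_frame=True)
X, y = wine.data, wine.target

# Stand-in for the tuned pipeline (assumed model and step names).
best_model = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
]).fit(X, y)

# Start from the average wine, then change the values you want to explore.
sample = X.mean().to_frame().T
sample["alcohol"] = 13.5    # arbitrary illustrative value
sample["proline"] = 1100.0  # arbitrary illustrative value

predicted = best_model.predict(sample)[0]
probabilities = best_model.predict_proba(sample)[0]

print("Predicted class:", wine.target_names[predicted])
for name, p in zip(wine.target_names, probabilities):
    print(f"  {name}: {p:.2f}")
```

Try changing other columns in sample and rerunning to see how the predicted probabilities shift.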

Summary

Congratulations! You’ve built a complete ML pipeline. Here’s what you learned:

  • Data Loading - Loaded and explored the Wine dataset
  • Baseline Model - Created a simple model for comparison
  • Preprocessing - Used ColumnTransformer for feature scaling
  • Pipelines - Combined preprocessing and model into a Pipeline
  • Cross-Validation - Used cross-validation for reliable evaluation
  • Hyperparameter Tuning - Optimized parameters with GridSearchCV
  • Evaluation - Evaluated with classification report and confusion matrix
  • Saving - Saved the pipeline for reuse

Next Steps

Now that you understand pipelines, you can:

  • Handle missing values - Add SimpleImputer to your pipeline
  • Work with real datasets - Apply this to your own data
  • Build regression pipelines - Same concepts apply to regression
  • Create custom transformers - Build your own preprocessing steps
  • Deploy to production - Use the saved pipeline in your applications
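
As a starting point for the first item, here is a sketch of a pipeline with SimpleImputer in front of the scaler. The Wine dataset has no missing values, so about 5% of the entries are blanked out artificially for illustration; the median strategy and the model are assumptions.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# Artificially remove roughly 5% of the values (illustration only).
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.05] = np.nan

pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),  # fill gaps per column
    ("scaler", StandardScaler()),
    ("classifier", RandomForestClassifier(random_state=42)),
])

# The imputer is fitted inside each fold, so no statistics leak from validation data.
scores = cross_val_score(pipeline, X_missing, y, cv=5)
print(f"CV accuracy with imputation: {scores.mean():.3f}")
```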

Thanks for completing this tutorial! You now have the skills to build production-ready ML pipelines.