What’s a Confusion Matrix?
A confusion matrix shows you exactly what your model got right and wrong. It’s a table that cross-tabulates predicted classes against actual classes.
The four categories:
- True Positives (TP): Correctly predicted positive
- True Negatives (TN): Correctly predicted negative
- False Positives (FP): Predicted positive but was negative (false alarm)
- False Negatives (FN): Predicted negative but was positive (missed it)
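In scikit-learn, these four counts can be read straight off a binary confusion matrix with `.ravel()`, which unpacks them in the order `tn, fp, fn, tp` (the labels below are a made-up toy example):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels: 1 = positive (has cancer), 0 = negative (healthy)
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 1, 0, 0, 0, 1, 0, 1])

# scikit-learn convention: rows = actual class, columns = predicted class.
# For binary labels, .ravel() unpacks the 2x2 matrix as tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, tn, fp, fn)  # 3 true positives, 3 true negatives, 1 false alarm, 1 miss
```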
Why It Matters
A single accuracy number doesn’t tell you:
- Are errors evenly distributed?
- Does the model miss one class more than another?
- What types of mistakes does it make?
A confusion matrix answers all of these.
Create a Confusion Matrix
Let’s build on our previous model:
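As a sketch of that step, assuming the previous model was a logistic regression on scikit-learn’s built-in breast cancer dataset (the dataset, model choice, and `random_state` are assumptions, not this page’s exact code):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

X, y = load_breast_cancer(return_X_y=True)
y = 1 - y  # flip labels so 1 = malignant (cancer), matching "positive = cancer"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=5000)  # extra iterations for convergence
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
print(cm)  # rows = actual class, columns = predicted class
```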
Visualize the Confusion Matrix
Let’s plot it to make it clearer:
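One common way to plot it is scikit-learn’s `ConfusionMatrixDisplay`; the counts below are made-up placeholders, so substitute your own matrix:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Example counts (replace with your own confusion matrix).
cm = np.array([[40, 3],    # actual healthy: 40 TN, 3 FP
               [2, 69]])   # actual cancer:   2 FN, 69 TP

disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                              display_labels=["healthy", "cancer"])
disp.plot(cmap="Blues")
plt.title("Confusion Matrix")
plt.savefig("confusion_matrix.png")
```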
Understanding the Errors
In a medical context, these errors have different costs:
False Positives (FP): Saying a healthy person has cancer
- Cost: Unnecessary worry, extra tests
- Bad, but not catastrophic
False Negatives (FN): Missing a real cancer case
- Cost: Cancer goes undetected, patient doesn’t get treatment
- Very bad - potentially life-threatening
For our model: We want to minimize false negatives. Missing cancer is worse than a false alarm.
Calculate Metrics from Confusion Matrix
All our metrics come from the confusion matrix:
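For example, with hypothetical counts (TP=69, TN=40, FP=3, FN=2), the standard metrics fall out directly from the four cells:

```python
# Metrics derived from the four confusion-matrix cells (example counts).
tp, tn, fp, fn = 69, 40, 3, 2

accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of predicted positives, how many were right
recall    = tp / (tp + fn)  # of actual positives, how many were found
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```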
Interactive: Which Error Is More Common?
Look at your confusion matrix. Which type of error is more common in your run?
- If FP > FN: Model is being cautious, predicting cancer more often
- If FN > FP: Model is missing cancer cases - this is worse for medical diagnosis
- If they’re balanced: Model makes both types of errors equally
For medical diagnosis: We’d rather have more false positives (false alarms) than false negatives (missed cancer).
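To do this check in code, read FP and FN off the matrix by position (placeholder counts below; rows = actual, columns = predicted, class 1 = cancer):

```python
import numpy as np

# Example confusion matrix (replace with your own run's output).
cm = np.array([[40, 3],
               [2, 69]])
fp = cm[0, 1]  # healthy predicted as cancer (false alarm)
fn = cm[1, 0]  # cancer predicted as healthy (miss)
print("FP:", fp, "FN:", fn)
```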
Why Confusion Matrix > Single Metric
A single accuracy score tells you: “95% correct”
A confusion matrix tells you:
- How many of each type of error
- Whether errors are balanced
- What the model struggles with
- Where to focus improvement efforts
Example: Two models both have 90% accuracy:
- Model A: 10 false negatives, 0 false positives
- Model B: 5 false negatives, 5 false positives
For medical diagnosis, Model B is better (fewer missed cancers), even though accuracy is the same.
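This comparison can be made concrete. Assuming a hypothetical test set of 100 patients with 30 actual cancer cases, both models score 90% accuracy, but recall exposes the difference:

```python
# Two models with identical accuracy but different error profiles.
# Assumed test set: 100 patients, 30 with cancer (positive class).
def metrics(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    return (tp + tn) / total, tp / (tp + fn)  # accuracy, recall

acc_a, rec_a = metrics(tp=20, tn=70, fp=0, fn=10)  # Model A: 10 missed cancers
acc_b, rec_b = metrics(tp=25, tn=65, fp=5, fn=5)   # Model B: 5 missed cancers

print(acc_a, acc_b)  # both 0.9 - accuracy cannot tell the models apart
print(rec_a, rec_b)  # recall shows Model B catches more of the cancers
```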
Key Takeaways
Before moving forward:
- Confusion matrix shows detail - More than just accuracy
- Different errors have different costs - Especially in medical problems
- All metrics come from confusion matrix - It’s the foundation
- Visualization helps - Plot it to see patterns
What’s Next?
In the next page, we’ll learn about cross-validation. This solves the problem of “one split might be lucky or unlucky” by testing multiple splits and getting more stable estimates.