Data Science Diagnostics

My Model is worse on the Test Set than the Train Set!

This problem is typically called “overfitting” or the “generalization gap”.

There are many possible causes for this problem, and we must identify the specific cause before we can address it.

Discover the cause of your problem with diagnostic tests

We can use diagnostic tests to probe your project and discover the cause of the performance mismatch.

Below are the categories of diagnostic tests that we can use to gather evidence for the specific cause in your project:

Diagnostic Test Categories

Generalization Gap Problem
Discover more about the biggest problem in machine learning, typically referred to simply as ‘overfitting’.
Dataset Split Procedure
Discover diagnostic checks to confirm that you followed best practices when splitting your dataset.
Split Size Sensitivity
Discover diagnostic checks to confirm the chosen train/test split size is appropriate for your data.
Gap Quantification Tests
Discover diagnostic tests to quantify the size of the gap in performance and determine whether it is a concern.
Challenge Performance Tests
Discover diagnostic tests to challenge and develop more robust estimates of model performance.
Data Distribution Tests
Discover diagnostic tests to determine whether train and test sets have the same distributions of data.
Model Performance Tests
Discover diagnostic tests to determine whether performance is consistent across train and test sets.
Prediction Error Tests
Discover diagnostic tests to determine whether hard-to-predict examples are consistent across train and test sets.
Residual Distribution Tests
Discover diagnostic tests to determine whether residual error distributions are consistent across train and test sets.
Model Overfitting Tests
Discover diagnostic tests to determine whether your model is overfitting the training dataset.
Model Robustness Tests
Discover diagnostic tests to determine whether your model is fragile to small changes.
Test Set Leakage
Discover diagnostic tests to determine whether knowledge of the test set has leaked into the train set.
Interventions
Discover ideas on how to fix your data and/or model once the cause has been identified.
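
To make the idea of a "gap" in performance concrete, here is a minimal sketch of how the difference between train and test scores might be computed. It is an illustration only: the helper names (accuracy, generalization_gap) and the toy labels and predictions are hypothetical and are not taken from any of the diagnostic tests listed above. Accuracy is assumed as the metric; any score where higher is better could be substituted.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))


def generalization_gap(train_score, test_score):
    """Train score minus test score (assumes higher scores are better).

    A positive value means the model scored worse on the test set.
    """
    return train_score - test_score


# Hypothetical labels and predictions, for illustration only.
train_true = [0, 1, 1, 0, 1, 0, 1, 1]
train_pred = [0, 1, 1, 0, 1, 0, 1, 1]
test_true = [0, 1, 1, 0]
test_pred = [0, 1, 0, 1]

train_acc = accuracy(train_true, train_pred)
test_acc = accuracy(test_true, test_pred)
gap = generalization_gap(train_acc, test_acc)

print(f"train={train_acc:.2f} test={test_acc:.2f} gap={gap:.2f}")
```

A single number like this says nothing about the cause on its own, and on a small test set it can be noisy, which is why the categories above probe the split, the data distributions, and the model separately.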