Lesson 6 · 9 min

Train / val / test — how to not fool yourself

Models that memorize their training data look great on it. The whole game is honest evaluation.

Three splits, three jobs

Training set — what the model fits to. Loss goes down here almost by construction, because this is the data the optimizer minimizes it on.
Validation set — held out from training. Used to tune hyperparameters and decide when to stop. The model never trains on it, but you do peek at its performance to make decisions.
Test set — held out from training AND held out from your decision-making. Used once to report final performance.
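
A minimal sketch of carving one dataset into the three splits described above. The function name three_way_split, the 70/15/15 proportions, and the fixed seed are illustrative assumptions, not something this lesson prescribes. A single random shuffle also assumes the examples are independent; time-ordered or grouped data usually needs a different splitting scheme.

```python
import random


def three_way_split(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then cut into disjoint train / val / test lists.

    Hypothetical helper for illustration; the fractions are assumptions.
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to < 1")

    shuffled = list(items)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)

    test = shuffled[:n_test]                # set aside, used once at the end
    val = shuffled[n_test:n_test + n_val]   # used for tuning and early stopping
    train = shuffled[n_test + n_val:]       # everything else is fit on
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

Splitting once, up front, and saving the result is what keeps the boundaries honest: re-shuffling between experiments can move test examples into training.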

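If you tune hyperparameters on the test set, you've leaked: your choices now depend on the very data that was supposed to judge them. The test score becomes optimistically biased and is no longer a fair estimate of generalization.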
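
A small sketch of the workflow that avoids this leak: pick the hyperparameter on the validation set, then touch the test set exactly once. Everything here is an assumption made for illustration: the toy 1-D data with 20% flipped labels, the hand-written k-nearest-neighbour classifier, the candidate values of k, and the 200/50/50 split sizes.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, data, k):
    """Fraction of (x, y) pairs in data that k-NN fitted on train gets right."""
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)


rng = random.Random(0)

# Toy data: the label is 1 when x > 0.5, with 20% of labels flipped as noise.
points = []
for _ in range(300):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.2:
        y = 1 - y
    points.append((x, y))

# Points are generated independently, so slicing is already a random split.
train, val, test = points[:200], points[200:250], points[250:]

# Step 1: choose k using the validation set only (odd k avoids tied votes).
candidate_ks = [1, 5, 15]
val_scores = {k: accuracy(train, val, k) for k in candidate_ks}
best_k = max(val_scores, key=val_scores.get)

# Step 2: evaluate on the test set once, with the k chosen above.
test_score = accuracy(train, test, best_k)

# With k=1 each training point is its own nearest neighbour, so the
# training score is perfect no matter how noisy the labels are.
print("train accuracy at k=1:", accuracy(train, train, 1))
print("validation scores:", val_scores)
print("chosen k:", best_k, "| test accuracy:", test_score)
```

The k=1 line is the memorization case from the start of the lesson: a perfect training score that says nothing about new data. If the loop in step 1 scored candidates on test instead of val, the final number would be the maximum over several noisy estimates, which is biased upward.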