Definition:Cross-Validation
Definition
Cross-validation is a technique in statistics in which:
- $(1): \quad$ Data are randomly partitioned into $2$ or more subsets
- $(2): \quad$ A model, for example a regression model, is fitted to all but one of these subsets
- $(3): \quad$ A prediction error of the fitted model when applied to the omitted set is calculated.
Each subset is omitted in turn, and a combined estimated prediction error is obtained.
This method is useful for testing the overall goodness of fit of a model and for detecting outliers.
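The procedure above can be sketched in code. The following is a minimal illustration only, assuming a simple linear regression model and mean squared error as the combined prediction error; neither choice is prescribed by the definition, and the names `fit_line` and `k_fold_cv_error` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b).

    Assumed model for illustration; any fitted model could be used instead.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Fall back to a constant model if all training x values coincide.
    b = sxy / sxx if sxx else 0.0
    a = mean_y - b * mean_x
    return a, b


def k_fold_cv_error(xs, ys, k, seed=0):
    """Combined mean squared prediction error over k random subsets."""
    if not 2 <= k <= len(xs):
        raise ValueError("k must satisfy 2 <= k <= number of observations")

    # Step (1): randomly partition the observation indices into k subsets.
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]

    total_sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Step (2): fit the model to all but the omitted subset.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Step (3): prediction error of the fitted model on the omitted subset.
        total_sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)

    # Each subset has been omitted in turn; combine into one estimate.
    return total_sq_err / len(xs)
```

Calling `k_fold_cv_error(xs, ys, k)` with `k` equal to the number of observations makes every subset a singleton.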
Leave-One-Out
The leave-one-out technique of cross-validation is the extreme situation in which the subsets into which the data are partitioned are singletons.
That is, each of the $n$ observations is omitted in turn, and a model is fitted to the remaining $n - 1$ observations.
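As a rough sketch of leave-one-out, the following assumes the simplest possible model, a constant equal to the mean of the retained observations, and squared error as the prediction error. Both are assumptions made for illustration, and the name `loo_squared_errors` is hypothetical.

```python
def loo_squared_errors(ys):
    """Squared leave-one-out prediction error for each observation.

    Assumed model: the mean of the remaining n - 1 observations.
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least 2 observations")

    total = sum(ys)
    errors = []
    for y in ys:
        # Fit the constant model to the n - 1 observations other than y.
        prediction = (total - y) / (n - 1)
        # Prediction error of that fitted model on the omitted observation.
        errors.append((y - prediction) ** 2)
    return errors


errors = loo_squared_errors([1.0, 1.0, 1.0, 10.0])
# errors == [9.0, 9.0, 9.0, 81.0]
```

In this example the last observation has by far the largest error, which is the sense in which the per-observation errors can point to a possible outlier.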
Also see
- Results about cross-validation can be found here.
Sources
- 1998: David Nelson: The Penguin Dictionary of Mathematics (2nd ed.): cross-validation
- 2008: David Nelson: The Penguin Dictionary of Mathematics (4th ed.): cross-validation