Definition:Cross-Validation

From ProofWiki

Definition

Cross-validation is a technique in statistics in which:

$(1): \quad$ Data are randomly partitioned into $2$ or more subsets
$(2): \quad$ A model, for example a regression model, is fitted to all but one of these subsets
$(3): \quad$ The prediction error of the fitted model, when applied to the omitted subset, is calculated.

Each subset is omitted in turn, and the resulting prediction errors are combined, for example by averaging, into a single estimated prediction error.
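
The procedure can be sketched in Python as follows. This is a minimal illustration only: the names `fit_line` and `k_fold_cv`, the choice of simple linear regression as the model, and the use of mean squared error as the combined prediction error are assumptions made for this example and are not part of the definition.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Fall back to a horizontal line if all x values coincide.
    b = sxy / sxx if sxx != 0 else 0.0
    a = mean_y - b * mean_x
    return a, b


def k_fold_cv(xs, ys, k, seed=0):
    """Estimate mean squared prediction error by k-fold cross-validation."""
    n = len(xs)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")

    # Step (1): randomly partition the indices into k subsets.
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[j::k] for j in range(k)]

    total_sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        # Step (2): fit the model to all subsets except the omitted one.
        train = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Step (3): accumulate prediction error on the omitted subset.
        total_sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)

    # Combine the per-subset errors into one estimate.
    return total_sq_err / n
```

For data lying exactly on a straight line, such as `xs = list(range(10))` and `ys = [2 * x + 1 for x in xs]`, the estimate returned by `k_fold_cv(xs, ys, 5)` should be zero up to rounding error, since every fitted line then predicts the omitted observations exactly.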

This method is useful for testing the overall goodness of fit of a model on data that were not used to fit it, and for detecting outliers.


Leave-One-Out

The leave-one-out technique of cross-validation is the extreme case in which the subsets into which the data are partitioned are singletons.

That is, each of the $n$ observations is omitted in turn, and a model is fitted to the remaining $n - 1$ observations.
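
As a sketch of the leave-one-out case, the following Python function omits each observation in turn and records the squared prediction error for it. The names `fit_line` and `leave_one_out_errors`, the simple linear regression model, and the squared-error measure are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Fall back to a horizontal line if all x values coincide.
    b = sxy / sxx if sxx != 0 else 0.0
    a = mean_y - b * mean_x
    return a, b


def leave_one_out_errors(xs, ys):
    """Return the squared prediction error for each omitted observation."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least 2 observations are required")

    errors = []
    for i in range(n):
        # Fit to the remaining n - 1 observations.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        # Predict the single omitted observation.
        errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return errors


# Example: points on the line y = 2x + 1, with one value altered by hand.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[4] = 30  # hypothetical outlier introduced for illustration
errors = leave_one_out_errors(xs, ys)
mean_error = sum(errors) / len(errors)  # combined estimate
suspect = max(range(len(errors)), key=errors.__getitem__)
```

In this constructed example the largest leave-one-out error belongs to the altered observation, which illustrates how the technique can flag a possible outlier. Averaging `errors` gives the combined estimated prediction error.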


Also see

  • Results about cross-validation can be found here.

