Definition:Bootstrapping

From ProofWiki

Definition

Bootstrapping is a method for obtaining information about the distribution of a population by taking a random sample of $n$ observations, then forming further random samples from that sample.


Bootstrap Sample

These secondary random samples are called bootstrap samples.

They typically also consist of $n$ observations.

They are obtained by sampling with replacement.
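
As an illustration only, drawing a single bootstrap sample can be sketched as follows. The function name `bootstrap_sample` and the data values are hypothetical, chosen for this sketch, and Python's standard `random` module stands in for whatever statistical package is actually used.

```python
import random

def bootstrap_sample(sample, rng):
    """Draw one bootstrap sample of len(sample) observations, with replacement."""
    n = len(sample)
    # rng.choices samples with replacement, so an observation
    # may appear more than once, or not at all, in the result.
    return rng.choices(sample, k=n)

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
sample = [2.1, 3.4, 3.9, 4.4, 5.0, 6.2, 7.7]  # hypothetical observations
resample = bootstrap_sample(sample, rng)
```

The resulting list has the same length as the original sample, and every element of it is one of the original observations.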


Motivation

Bootstrapping is useful when either:

there is insufficient information to specify the distribution of the population

or:

there is little analytic theory about properties of estimators.


Refinements are available to improve the estimates exemplified here.

In practice, a statistical software package is generally used, with a reliable random number generator.


Examples

Arbitrary Example

Let $B$ bootstrap samples be taken from a random sample of a population.

Let $m$ be the median of the population as a whole.

Let ${m_b}^*$ be the median of the $b$th bootstrap sample, where $b = 1, 2, \ldots, B$.

Then the bootstrap estimated standard error of $m$ is given by:

$\map {\operatorname {se} } m = \sqrt {\dfrac 1 {B - 1} \ds \sum_b \paren { {m_b}^* - {\overline m}^*}^2}$

where ${\overline m}^*$ is the mean of the ${m_b}^*$.
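
A minimal sketch of this calculation, assuming the bootstrap samples are drawn from a single observed sample held in a list, might look like the following. The names `bootstrap_medians` and `bootstrap_se`, the seed and the data are hypothetical and appear only for illustration.

```python
import random
import statistics

def bootstrap_medians(sample, B, rng):
    """Return the medians of B bootstrap samples drawn from sample."""
    n = len(sample)
    return [statistics.median(rng.choices(sample, k=n)) for _ in range(B)]

def bootstrap_se(medians):
    """Bootstrap estimated standard error of the median."""
    # statistics.stdev divides by B - 1, matching the formula above.
    return statistics.stdev(medians)

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
sample = [2.1, 3.4, 3.9, 4.4, 5.0, 6.2, 7.7]  # hypothetical observations
medians = bootstrap_medians(sample, 50, rng)
se = bootstrap_se(medians)
```

Here `medians` plays the role of the ${m_b}^*$, and `se` that of $\map {\operatorname {se} } m$.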


Approximate $95 \%$ confidence limits for the population median are given by the $0 \cdotp 025$ and $0 \cdotp 975$ quantiles of the set of $B$ values ${m_b}^*$.

In practice, good estimates of $\map {\operatorname {se} } m$ can be obtained with as few as $B = 50$ bootstrap samples, but $B = 1000$ or $B = 2000$ is needed for reliable estimates of confidence intervals.
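
The confidence limits described above can be sketched in the same way. This is an illustration under stated assumptions rather than a definitive implementation: the names `bootstrap_medians` and `percentile_limits` and the data are hypothetical, and the quantiles are computed by the default interpolation rule of Python's `statistics.quantiles`, which is only one of several conventions in use.

```python
import random
import statistics

def bootstrap_medians(sample, B, rng):
    """Return the medians of B bootstrap samples drawn from sample."""
    n = len(sample)
    return [statistics.median(rng.choices(sample, k=n)) for _ in range(B)]

def percentile_limits(medians):
    """Approximate 95% confidence limits from the bootstrap medians."""
    # With n=40 the cut points are spaced 0.025 apart, so the first
    # and last cut points are the 0.025 and 0.975 quantiles.
    cuts = statistics.quantiles(medians, n=40)
    return cuts[0], cuts[-1]

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
sample = [2.1, 3.4, 3.9, 4.4, 5.0, 6.2, 7.7]  # hypothetical observations
medians = bootstrap_medians(sample, 1000, rng)  # B = 1000
lower, upper = percentile_limits(medians)
```

With a sample as small as this one the limits are coarse, because the bootstrap medians can take only a few distinct values.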


Also see

  • Results about bootstrapping can be found here.

