Relative Frequency is a Probability Measure
Theorem
The relative frequency model is a probability measure.
Proof
We check all the Kolmogorov axioms in turn:
First Axiom
Let $n$ be the number of times a certain event $\omega$ is observed to happen.
Let $n'$ be the number of times $\omega$ is observed not to happen.
By the Law of Excluded Middle and the Principle of Non-Contradiction, $\omega$ either happened or did not, and not both.
Therefore $n + n'$ is the total number of observations.
It is supposed that at least one observation is actually made, so that $n + n' \ne 0$.
The relative frequency model says that the probability of $\omega$ occurring can be defined as:
- $\Pr \left({\omega}\right) = \dfrac n {n + n'}$
where the numerator and denominator are non-negative integers such that $n + n' \ge n$.
If $\omega$ is observed never to happen, then $n = 0$ and $\Pr \left({\omega}\right) = \dfrac 0 {0 + n'} = 0$.
If $\omega$ is observed to always happen, then $n' = 0$ and $\Pr \left({\omega}\right) = \dfrac n {n + 0} = 1$.
Otherwise, if $n, n' \ne 0$, from Mediant is Between:
- $0 = \dfrac 0 {0 + n'} < \dfrac n {n + n'} < \dfrac n {n + 0} = 1$
Thus $\Pr$ is bounded:
- $0 \le \Pr \left({\cdot}\right) \le 1$
$\Box$
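As a numerical illustration of this bound, the following minimal Python sketch computes the relative frequency from counts of occurrences and non-occurrences. The function name `relative_frequency` and the sample counts are assumptions made for illustration only; they do not come from the source.

```python
from fractions import Fraction


def relative_frequency(n: int, n_prime: int) -> Fraction:
    """Return n / (n + n') as an exact fraction.

    n       -- number of observations in which the event happened
    n_prime -- number of observations in which it did not happen
    """
    if n < 0 or n_prime < 0:
        raise ValueError("counts must be non-negative")
    if n + n_prime == 0:
        raise ValueError("at least one observation is required")
    return Fraction(n, n + n_prime)


# Hypothetical counts covering the three cases of the proof:
# never observed, always observed, and sometimes observed.
for n, n_prime in [(0, 5), (5, 0), (3, 7)]:
    p = relative_frequency(n, n_prime)
    assert 0 <= p <= 1
```

Exact fractions are used rather than floating-point numbers so that the comparisons with $0$ and $1$ are not affected by rounding.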
Second Axiom
Some outcome in $\Omega$ occurs at every one of the $n + n'$ observations, so:
- $\Pr \left({\Omega}\right) = \dfrac {n + n'}{n + n'} = 1$
$\Box$
Third Axiom
This is a proof by induction.
Basis for the Induction
The case $j = 2$ is verified as follows:
Let $A$ and $B$ be two pairwise disjoint events.
Let $p$ and $q$ be the number of times $A$ and $B$ have been observed, respectively.
Let $n$ be the total number of trials observed.
By the definition of pairwise disjoint, $A$ and $B$ never happened at the same time.
By the same reasoning as the proof for the first axiom, in all $n$ observations:
- $A$ happened $p$ times
- $B$ happened $q$ times
- $A \cup B$ happened $p + q$ times.
By the definition of the relative frequency model:
- $\Pr \left({A \cup B}\right) = \dfrac {p + q} n = \dfrac p n + \dfrac q n = \Pr \left({A}\right) + \Pr \left({B}\right)$
This is the basis for the induction.
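The two-event case can be checked on a small data set. The sketch below is illustrative only: the helper `pr`, the list `observations` (a hypothetical record of ten die rolls) and the events `A` and `B` are all assumptions, not part of the source.

```python
from fractions import Fraction


def pr(event: set, observations: list) -> Fraction:
    """Relative frequency of `event` in a non-empty list of observed outcomes."""
    hits = sum(1 for outcome in observations if outcome in event)
    return Fraction(hits, len(observations))


# Hypothetical record of n = 10 die rolls.
observations = [1, 3, 6, 2, 2, 5, 4, 6, 1, 3]

A = {1, 2}  # observed p = 4 times
B = {5, 6}  # observed q = 3 times

# A and B share no outcomes, so they never happen at the same observation.
assert A.isdisjoint(B)

# Additivity for two disjoint events: (p + q) / n == p / n + q / n.
assert pr(A | B, observations) == pr(A, observations) + pr(B, observations)
```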
Induction Hypothesis
Let $A_1, A_2, \ldots, A_j$ be $j$ pairwise disjoint events.
By the definition of the relative frequency model, $j$ is finite.
Assume:
- $\displaystyle \Pr \left({\bigcup_{i=1}^j A_i}\right) = \Pr \left({A_1}\right) + \Pr \left({A_2}\right) + \cdots + \Pr\left({A_j}\right)$
This is our induction hypothesis.
Induction Step
This is our induction step:
Let $A_1, A_2, \ldots, A_j, A_{j+1}$ be $j+1$ pairwise disjoint events.
Define $C = A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_j$.
Since $A_{j+1}$ is disjoint from each of $A_1, A_2, \ldots, A_j$, the events $C$ and $A_{j+1}$ are also disjoint.
By the base case:
- $\displaystyle \Pr \left({C \cup A_{j+1}} \right) = \Pr \left({C}\right) + \Pr \left({A_{j+1}} \right)$
By the definition of $C$ and the induction hypothesis, $\displaystyle \Pr \left({C}\right) = \sum_{i=1}^{j} \Pr \left({A_i}\right)$, so this equation becomes:
- $\displaystyle \Pr \left({\bigcup_{i=1}^{j+1} A_i} \right) = \sum_{i=1}^{j+1} \Pr \left({A_i} \right)$
By the definition of the relative frequency model, $j + 1$ is finite.
The result follows by the Principle of Mathematical Induction.
$\blacksquare$
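The structure of the induction can be mirrored in code by accumulating the union one event at a time. This is a minimal sketch under assumed data: the helper `pr`, the list `observations` and the list `events` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def pr(event: set, observations: list) -> Fraction:
    """Relative frequency of `event` in a non-empty list of observed outcomes."""
    hits = sum(1 for outcome in observations if outcome in event)
    return Fraction(hits, len(observations))


# Hypothetical record of 10 die rolls and four pairwise disjoint events.
observations = [1, 3, 6, 2, 2, 5, 4, 6, 1, 3]
events = [{1}, {2, 3}, {4}, {5, 6}]
assert all(a.isdisjoint(b) for a, b in combinations(events, 2))

# C plays the role of the union of A_1, ..., A_j in the induction step.
# Each pass adds the next event using the two-event case.
C = set()
total = Fraction(0)
for A_next in events:
    assert C.isdisjoint(A_next)       # C and the next event are disjoint
    C = C | A_next
    total += pr(A_next, observations)
    assert pr(C, observations) == total
```

In this example the events together cover every observed outcome, so the final value of `total` is $1$, in agreement with the second axiom.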
Note
This proof depends on the Law of Excluded Middle.
This rule is denied by the intuitionist school.
Sources
- Michael C. Gemignani: Calculus and Statistics (2001) $\S 1.3, \ \S 1.4$