# Variance as Expectation of Square minus Square of Expectation

## Theorem

Let $X$ be a random variable.

Then the variance of $X$ can be expressed as:

$\var X = \expect {X^2} - \paren {\expect X}^2$

That is, it is the expectation of the square of $X$ minus the square of the expectation of $X$.

## Proof

### Discrete Random Variable

We let $\mu = \expect X$, and take the expression for variance:

$\var X := \ds \sum_{x \mathop \in \Img X} \paren {x - \mu}^2 \map \Pr {X = x}$

Then:

$\begin{aligned}
\ds \var X &= \ds \sum_x \paren {x^2 - 2 \mu x + \mu^2} \map \Pr {X = x} \\
&= \ds \sum_x x^2 \map \Pr {X = x} - 2 \mu \sum_x x \map \Pr {X = x} + \mu^2 \sum_x \map \Pr {X = x} \\
&= \ds \sum_x x^2 \map \Pr {X = x} - 2 \mu \sum_x x \map \Pr {X = x} + \mu^2 && \text{Definition of Probability Mass Function: } \ds \sum_x \map \Pr {X = x} = 1 \\
&= \ds \sum_x x^2 \map \Pr {X = x} - 2 \mu^2 + \mu^2 && \text{Definition of Expectation: } \ds \sum_x x \map \Pr {X = x} = \mu \\
&= \ds \sum_x x^2 \map \Pr {X = x} - \mu^2
\end{aligned}$

Hence the result, as $\ds \sum_x x^2 \map \Pr {X = x} = \expect {X^2}$ and $\mu = \expect X$.

$\blacksquare$
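
As a sanity check on the discrete case, the following sketch compares the two expressions for a fair six-sided die, using exact rational arithmetic. The die, the dictionary `pmf` and the function names are illustrative assumptions and not part of the proof.

```python
from fractions import Fraction


def expectation(pmf, g=lambda x: x):
    """Return the sum of g(x) * Pr(X = x) over the image of X."""
    return sum(g(x) * p for x, p in pmf.items())


def variance_from_definition(pmf):
    """Variance as the expectation of (X - mu)^2."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)


def variance_from_theorem(pmf):
    """Variance as E[X^2] - (E[X])^2."""
    return expectation(pmf, lambda x: x ** 2) - expectation(pmf) ** 2


# Hypothetical example: a fair six-sided die, Pr(X = x) = 1/6 for x = 1, ..., 6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(variance_from_definition(die))  # 35/12
print(variance_from_theorem(die))     # 35/12
```

Both routes give $\dfrac {91} 6 - \paren {\dfrac 7 2}^2 = \dfrac {35} {12}$, as the theorem predicts.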

### Continuous Random Variable

Let $\mu = \expect X$.

Let $X$ have probability density function $f_X$.

As $f_X$ is a probability density function:

$\ds \int_{-\infty}^\infty \map {f_X} x \rd x = \map \Pr {-\infty < X < \infty} = 1$

Then:

$\begin{aligned}
\ds \var X &= \ds \expect {\paren {X - \mu}^2} && \text{Definition of Variance of Continuous Random Variable} \\
&= \ds \int_{-\infty}^\infty \paren {x - \mu}^2 \map {f_X} x \rd x && \text{Definition of Expectation of Continuous Random Variable} \\
&= \ds \int_{-\infty}^\infty \paren {x^2 - 2 \mu x + \mu^2} \map {f_X} x \rd x \\
&= \ds \int_{-\infty}^\infty x^2 \map {f_X} x \rd x - 2 \mu \int_{-\infty}^\infty x \map {f_X} x \rd x + \mu^2 \int_{-\infty}^\infty \map {f_X} x \rd x \\
&= \ds \expect {X^2} - 2 \mu^2 + \mu^2 && \text{Definition of Expectation of Continuous Random Variable, and } \ds \int_{-\infty}^\infty \map {f_X} x \rd x = 1 \\
&= \ds \expect {X^2} - \mu^2 \\
&= \ds \expect {X^2} - \paren {\expect X}^2
\end{aligned}$

$\blacksquare$
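
As an illustration of the continuous case, consider the assumed example of $X$ uniformly distributed on $\closedint 0 1$, so that $\map {f_X} x = 1$ for $0 \le x \le 1$ and $\map {f_X} x = 0$ otherwise. This example is not part of the proof; it only checks the formula against the definition.

$\ds \expect X = \int_0^1 x \rd x = \dfrac 1 2$

$\ds \expect {X^2} = \int_0^1 x^2 \rd x = \dfrac 1 3$

$\ds \expect {X^2} - \paren {\expect X}^2 = \dfrac 1 3 - \dfrac 1 4 = \dfrac 1 {12}$

Directly from the definition:

$\ds \var X = \int_0^1 \paren {x - \dfrac 1 2}^2 \rd x = \dfrac 1 {12}$

The two values agree.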

## Comment

This expression is often a more convenient way of calculating the variance than the first-principles definition. In particular, it is easier to program a computer to calculate it: the sums of $x$ and $x^2$ can be accumulated in a single pass, without first finding $\mu$ and then keeping a record of every deviation $x - \mu$. For this reason it is the more commonly encountered of the two expressions for variance.
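
The single-pass calculation that this expression permits might be sketched as follows, for the variance of a finite list of equally likely values. The function name `variance_single_pass` and the sample data are hypothetical choices for illustration. Note that in floating-point arithmetic the final subtraction can lose precision when the mean is large compared with the spread, so this is a sketch of the idea and not a recommended numerical routine.

```python
def variance_single_pass(values):
    """Return E[X^2] - (E[X])^2 for equally likely values, in one pass.

    Only three running quantities are kept: the count, the sum of the
    values and the sum of their squares. No deviations are stored.
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    if n == 0:
        raise ValueError("variance of an empty collection is undefined")
    mean = total / n
    return total_sq / n - mean * mean


# Hypothetical data: mean 5, mean of squares 29, so the variance is 29 - 25 = 4.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance_single_pass(data))  # 4.0
```

Because `values` is consumed once, the same function works on a generator or a stream that cannot be revisited.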