Variance of Discrete Random Variable from P.G.F.
Theorem
Let $X$ be a discrete random variable whose probability generating function is $\Pi_X \left({s}\right)$.
Then the variance of $X$ can be obtained from the value of the second derivative of $\Pi_X \left({s}\right)$ WRT $s$ at $s = 1$:
- $\operatorname{var} \left({X}\right) = \Pi''_X \left({1}\right) + \mu - \mu^2$
where $\mu = E \left({X}\right)$ is the expectation of $X$.
Proof
From the definition of the probability generating function:
- $\displaystyle \Pi_X \left({s}\right) = \sum_{x \ge 0} p \left({x}\right) s^x$
Differentiating this twice WRT $s$ gives us:
- $\displaystyle \Pi''_X \left({s}\right) = \sum_{x \ge 2} x \left({x-1}\right) p \left({x}\right) s^{x-2}$
from Differentiation of Power Series.
The sum can be extended to include $x = 0$ and $x = 1$, because in both cases the factor $x \left({x-1}\right)$ is zero and so the term vanishes.
So:
- $\displaystyle \Pi''_X \left({s}\right) = \sum_{x \ge 0} x \left({x-1}\right) p \left({x}\right) s^{x-2}$
Plugging in $s = 1$ gives:
- $\displaystyle \Pi''_X \left({1}\right) = \sum_{x \ge 0} x \left({x-1}\right) p \left({x}\right) 1^{x-2}$
- $\displaystyle \phantom{\Pi''_X \left({1}\right)} = \sum_{x \ge 0} x^2 p \left({x}\right) - \sum_{x \ge 0} x p \left({x}\right)$
- $\displaystyle \phantom{\Pi''_X \left({1}\right)} = E \left({X^2}\right) - E \left({X}\right)$
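Hence $E \left({X^2}\right) = \Pi''_X \left({1}\right) + \mu$.
The result follows by substituting this into the expression for variance:
- $\operatorname{var} \left({X}\right) = E \left({X^2}\right) - \left({E \left({X}\right)}\right)^2 = \Pi''_X \left({1}\right) + \mu - \mu^2$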
$\blacksquare$
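As an illustration, which is not part of the source proof, suppose $X$ has the Poisson distribution with parameter $\lambda$. This example assumes the standard p.g.f. $\Pi_X \left({s}\right) = e^{-\lambda \left({1-s}\right)}$ and the companion result $\mu = \Pi'_X \left({1}\right)$, neither of which is established on this page. Then:
- $\Pi'_X \left({s}\right) = \lambda e^{-\lambda \left({1-s}\right)}$, so $\mu = \Pi'_X \left({1}\right) = \lambda$
- $\Pi''_X \left({s}\right) = \lambda^2 e^{-\lambda \left({1-s}\right)}$, so $\Pi''_X \left({1}\right) = \lambda^2$
- $\operatorname{var} \left({X}\right) = \lambda^2 + \lambda - \lambda^2 = \lambda$
which agrees with the known variance of the Poisson distribution.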
Comment
So, to find the variance of a discrete random variable, there is no need to work through what might be a complicated and fiddly summation.
All you need to do is differentiate the p.g.f. twice, plug in $1$, and add $\mu - \mu^2$.
Assuming, of course, that you know what the p.g.f. is.
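The recipe can be checked numerically on a small case. The following is a minimal sketch in Python, not taken from the source: it assumes a fair six-sided die, stores its p.g.f. as a list of polynomial coefficients, and uses the hypothetical helper names `differentiate`, `evaluate` and `variance_from_pgf`. It also assumes the companion result $\mu = \Pi'_X \left({1}\right)$ for the expectation.

```python
from fractions import Fraction


def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where c_k multiplies s**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, s):
    """Evaluate the polynomial with the given coefficients at the point s."""
    return sum(c * s**k for k, c in enumerate(coeffs))


def variance_from_pgf(coeffs):
    """Return Pi''(1) + mu - mu**2, with mu = Pi'(1) (assumed companion result)."""
    first = differentiate(coeffs)
    second = differentiate(first)
    mu = evaluate(first, 1)
    return evaluate(second, 1) + mu - mu**2


# Illustrative assumption: a fair die, so p(x) = 1/6 for x = 1, ..., 6 and p(0) = 0.
die_pgf = [Fraction(0)] + [Fraction(1, 6)] * 6

var_pgf = variance_from_pgf(die_pgf)

# Direct computation E(X^2) - (E(X))^2 from the probability mass function, for comparison.
mean = sum(x * p for x, p in enumerate(die_pgf))
var_direct = sum(x**2 * p for x, p in enumerate(die_pgf)) - mean**2

print(var_pgf)     # 35/12
print(var_direct)  # 35/12
```

Exact fractions are used so that the two values can be compared without rounding error. The coefficient-list representation only covers random variables with finitely many values; a p.g.f. given in closed form would need symbolic differentiation instead.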
Sources
- Geoffrey Grimmett and Dominic Welsh: Probability: An Introduction (1986): $\S 4.3 \ (20)$