Definition:Probability Mass Function
Definition
Let $\left({\Omega, \Sigma, \Pr}\right)$ be a probability space.
Let $X: \Omega \to \R$ be a discrete random variable on $\left({\Omega, \Sigma, \Pr}\right)$.
Then the (probability) mass function or p.m.f. of $X$ is the function $p_X: \R \to \left[{0 .. 1}\right]$ defined as:
- $\forall x \in \R: p_X \left({x}\right) = \begin{cases} \Pr \left({\left\{{\omega \in \Omega: X \left({\omega}\right) = x}\right\}}\right) & : x \in \Omega_X \\ 0 & : x \notin \Omega_X \end{cases}$
where $\Omega_X$ is defined as $\operatorname{Im} \left({X}\right)$, the image of $X$.
That is, $p_X \left({x}\right)$ is the probability that the function $X$ takes the value $x$.
$p_X \left({x}\right)$ can also be written:
- $\Pr \left({X = x}\right)$
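As a minimal sketch of this definition, assume a finite probability space: two fair six-sided dice under the uniform measure, with $X$ the sum of the two faces. The names `omega`, `pr`, `X` and `pmf` are illustrative choices, not part of the definition.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two fair six-sided dice.
omega = list(product(range(1, 7), repeat=2))

# Assumed probability measure: uniform on the elementary events.
pr = {w: Fraction(1, len(omega)) for w in omega}

def X(w):
    """Discrete random variable: the sum of the two faces."""
    return w[0] + w[1]

def pmf(x):
    """p_X(x) = Pr({w in omega : X(w) = x}); this is 0 when x is not in Im(X)."""
    return sum((pr[w] for w in omega if X(w) == x), Fraction(0))

print(pmf(7))   # 1/6
print(pmf(2))   # 1/36
print(pmf(13))  # 0, since 13 is not in the image of X
```

Exact rational arithmetic via `Fraction` is used here so that the probabilities are not subject to floating-point rounding.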
Note that for any discrete random variable $X$, the following applies:
- $\displaystyle \begin{aligned} \sum_{x \in \Omega_X} p_X \left({x}\right) &= \Pr \left({\bigcup_{x \in \Omega_X} \left\{{\omega \in \Omega: X \left({\omega}\right) = x}\right\}}\right) && \text{by the definition of probability measure} \\ &= \Pr \left({\Omega}\right) \\ &= 1 \end{aligned}$
The latter is usually written:
- $\displaystyle \sum_{x \in \R} p_X \left({x}\right) = 1$
Thus it can be seen by definition that a probability mass function is an example of a normalized weight function.
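The normalization property can be checked on a small case. The sketch below assumes two tosses of a fair coin, with $X$ counting the heads; `omega`, `pr`, `image` and `p_X` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction

# Assumed sample space: two tosses of a fair coin, uniform measure.
omega = ["HH", "HT", "TH", "TT"]
pr = {w: Fraction(1, 4) for w in omega}

def X(w):
    """Discrete random variable: the number of heads in the outcome."""
    return w.count("H")

# Omega_X = Im(X), the set of values that X actually takes.
image = {X(w) for w in omega}

# Tabulate p_X on Omega_X; p_X is 0 everywhere else.
p_X = {x: sum((pr[w] for w in omega if X(w) == x), Fraction(0)) for x in image}

total = sum(p_X.values())
print(total)  # 1
```

Because the events $\left\{{\omega \in \Omega: X \left({\omega}\right) = x}\right\}$ for distinct $x$ are pairwise disjoint and cover $\Omega$, the weights add up to exactly $1$.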
Joint Probability Mass Function
Let $X: \Omega \to \R$ and $Y: \Omega \to \R$ both be discrete random variables on $\left({\Omega, \Sigma, \Pr}\right)$.
Then the joint (probability) mass function of $X$ and $Y$ is the function $p_{X, Y}: \R^2 \to \left[{0 .. 1}\right]$ defined as:
- $\forall \left({x, y}\right) \in \R^2: p_{X, Y} \left({x, y}\right) = \begin{cases} \Pr \left({\left\{{\omega \in \Omega: X \left({\omega}\right) = x \land Y \left({\omega}\right) = y}\right\}}\right) & : x \in \Omega_X \text { and } y \in \Omega_Y \\ 0 & : \text {otherwise} \end{cases}$
That is, $p_{X, Y} \left({x, y}\right)$ is the probability that the function $X$ takes the value $x$ at the same time that the function $Y$ takes the value $y$.
$p_{X, Y} \left({x, y}\right)$ can also be written:
- $\Pr \left({X = x, Y = y}\right)$
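A minimal sketch of the joint mass function, again assuming two fair dice under the uniform measure: here $X$ is taken to be the face of the first die and $Y$ the sum of both faces. All names in the code are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair six-sided dice with the uniform measure.
omega = list(product(range(1, 7), repeat=2))
pr = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Face shown by the first die."""
    return w[0]

def Y(w):
    """Sum of both faces."""
    return w[0] + w[1]

def joint_pmf(x, y):
    """p_{X,Y}(x, y) = Pr({w in omega : X(w) = x and Y(w) = y})."""
    return sum((pr[w] for w in omega if X(w) == x and Y(w) == y), Fraction(0))

print(joint_pmf(1, 2))  # 1/36, only the outcome (1, 1) qualifies
print(joint_pmf(1, 8))  # 0, although 1 is in Im(X) and 8 is in Im(Y)
```

The second call shows that $p_{X, Y} \left({x, y}\right)$ may be $0$ even when $x \in \Omega_X$ and $y \in \Omega_Y$: the corresponding event is then empty, and its probability is $0$.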
Similarly to the individual mass functions of $X$ and $Y$, we have:
- $\displaystyle \begin{aligned} \sum_{x \in \Omega_X \atop y \in \Omega_Y} p_{X, Y} \left({x, y}\right) &= \Pr \left({\bigcup_{x \in \Omega_X \atop y \in \Omega_Y} \left\{{\omega \in \Omega: X \left({\omega}\right) = x, Y \left({\omega}\right) = y}\right\}}\right) && \text{by the definition of probability measure} \\ &= \Pr \left({\Omega}\right) \\ &= 1 \end{aligned}$
The latter is usually written:
- $\displaystyle \sum_{x \in \R} \sum_{y \in \R} p_{X, Y} \left({x, y}\right) = 1$
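The sum over both arguments can be verified on a small case. This sketch assumes two tosses of a fair coin encoded as $0$ (tails) and $1$ (heads), with $X$ the result of the first toss and $Y$ the number of heads; the names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair coin tosses, 0 = tails, 1 = heads.
omega = list(product([0, 1], repeat=2))
pr = {w: Fraction(1, 4) for w in omega}

def X(w):
    """Result of the first toss."""
    return w[0]

def Y(w):
    """Total number of heads."""
    return w[0] + w[1]

image_X = {X(w) for w in omega}  # Omega_X
image_Y = {Y(w) for w in omega}  # Omega_Y

def joint_pmf(x, y):
    """p_{X,Y}(x, y) = Pr({w in omega : X(w) = x and Y(w) = y})."""
    return sum((pr[w] for w in omega if X(w) == x and Y(w) == y), Fraction(0))

# Sum over every pair (x, y) in Omega_X x Omega_Y.
total = sum(joint_pmf(x, y) for x in image_X for y in image_Y)
print(total)  # 1
```

Pairs such as $\left({0, 2}\right)$ contribute $0$ to the sum, so summing over $\Omega_X \times \Omega_Y$ gives the same result as summing over all of $\R^2$.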
Generalized Definition
Let $X = \left\{{X_1, X_2, \ldots, X_n}\right\}$ be a set of discrete random variables on $\left({\Omega, \Sigma, \Pr}\right)$.
Then the joint (probability) mass function of $X$ is the function $p_X: \R^n \to \left[{0 .. 1}\right]$ defined as:
- $\forall x = \left({x_1, x_2, \ldots, x_n}\right) \in \R^n: p_X \left({x}\right) = \Pr \left({X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n}\right)$
The properties of the two-variable case carry over analogously. In particular, $\displaystyle \sum_{x \in \R^n} p_X \left({x}\right) = 1$.
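The general case can be sketched in the same way. The example below assumes $n = 3$ tosses of a fair coin, with $X_i$ the indicator that toss $i$ shows heads; `variables` and `joint_pmf` are illustrative names.

```python
from fractions import Fraction
from itertools import product

n = 3

# Assumed sample space: n fair coin tosses, 0 = tails, 1 = heads.
omega = list(product([0, 1], repeat=n))
pr = {w: Fraction(1, 2 ** n) for w in omega}

# X_i(w) is the result of toss i; i=i binds the index at definition time.
variables = [lambda w, i=i: w[i] for i in range(n)]

def joint_pmf(x):
    """p_X(x) = Pr(X_1 = x_1, ..., X_n = x_n) for a tuple x of length n."""
    return sum(
        (pr[w] for w in omega
         if all(X_i(w) == x_i for X_i, x_i in zip(variables, x))),
        Fraction(0),
    )

print(joint_pmf((1, 0, 1)))  # 1/8
print(joint_pmf((1, 0, 2)))  # 0, since 2 is not a value of any X_i

# The joint mass function sums to 1 over all value tuples.
total = sum(joint_pmf(x) for x in product([0, 1], repeat=n))
print(total)  # 1
```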
Sources
- Geoffrey Grimmett and Dominic Welsh: Probability: An Introduction (1986): $\S 2.1$, $\S 3.1$