
Random Variables

There are situations where one does not want the information concerning each and every outcome of an experiment. Instead, one is more interested in high-level information. For instance, given a grayscale digital image where each pixel takes one of 256 intensity values, $\{ 0,1,2,\ldots,255 \}$, one may want to know how many pixels have a particular intensity, rather than which particular pixels have that intensity. The notion of random variables helps us extract such information.
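The pixel-counting example above can be sketched in a few lines of Python. This is only an illustration: the tiny `image` array and the helper name `intensity_histogram` are hypothetical and do not come from the dissertation.

```python
from collections import Counter


def intensity_histogram(image):
    """Count how many pixels take each intensity, ignoring pixel locations."""
    return Counter(value for row in image for value in row)


# Hypothetical 3x4 grayscale image with intensities in {0, ..., 255}.
image = [
    [0, 255, 128, 128],
    [0, 0, 64, 128],
    [255, 64, 0, 128],
]

histogram = intensity_histogram(image)
# How many pixels have intensity 128, regardless of where they are.
count_128 = histogram[128]
```

The histogram discards the pixel locations and keeps only the counts, which is the high-level information that the paragraph describes.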

The term random variable can be a little misleading [167]. A random variable (RV), denoted by $X$, is a mapping, or a function, that assigns a real number to each element in the sample space $\Omega$. Thus, an RV is a function, $X : \Omega \rightarrow \Re$, whose domain is the sample space and whose codomain is the set of real numbers [167]. The set of values actually taken by $X$ is typically a subset of $\Re$. When the sample space $\Omega$ is uncountable, or nondenumerable, not every subset of $\Omega$ constitutes an event to which we could assign a probability. This necessitates the definition of a class $\mathcal{F}$ of measurable subsets of $\Omega$. Furthermore, we require that, for every $x \in \Re$, the set $\{ \omega \in \Omega : X (\omega) \le x\}$ be an event, and hence a member of $\mathcal{F}$, so that we can define probabilities such as $P (X \le x)$. The triple $( \Omega, \mathcal{F}, P )$ is called the probability space associated with the RV $X$. In this dissertation, uppercase letters, e.g., $X$, denote RVs and lowercase letters, e.g., $x$, denote the values assigned by the RVs.

The cumulative distribution function (CDF) $F_X (\cdot)$ of an RV $X$ is

$\displaystyle F_X (x) = P (X \leq x).$     (4)

The CDF satisfies the following properties:
    $\displaystyle \forall x \in (-\infty, +\infty), \; 0 \leq F_X (x) \leq 1,$ (5)
    $\displaystyle F_X (x) \text{ is a nondecreasing function of } x,$ (6)
    $\displaystyle \lim_{x \rightarrow - \infty} F_X (x) = 0,$ (7)
    $\displaystyle \lim_{x \rightarrow + \infty} F_X (x) = 1.$ (8)
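As a rough numerical check of properties (5)-(8), the following sketch builds an empirical CDF from a small sample and evaluates it on a grid. The sample values and the helper name `empirical_cdf` are assumptions made for illustration only.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return F(x) = fraction of samples that are <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts the samples with value <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical observations of an RV.
samples = [3, 1, 4, 1, 5, 9, 2, 6]
F = empirical_cdf(samples)

grid = list(range(-2, 12))
values = [F(x) for x in grid]

# Properties (5) and (6): bounded in [0, 1] and nondecreasing.
bounded = all(0.0 <= v <= 1.0 for v in values)
nondecreasing = all(u <= v for u, v in zip(values, values[1:]))
```

Below the smallest sample the function evaluates to 0 and above the largest sample it evaluates to 1, which mirrors the limits in (7) and (8).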

The joint CDF $F_{X,Y} (\cdot)$ of two RVs $X$ and $Y$ is
$\displaystyle F_{X,Y} (x,y) = P (X \leq x, Y \leq y).$     (9)

A continuous RV is one whose CDF is a continuous function. A discrete RV has a piecewise-constant CDF. Most situations in image processing, and so also in this dissertation, entail the use of continuous RVs. Hence, from now on we focus on continuous RVs and, unless explicitly mentioned, we use the term RV to refer to a continuous RV.

The probability density function (PDF) $P_X (\cdot)$ of an RV $X$ is

$\displaystyle P_X (x) = \frac {d F_X (x)} {d x}.$     (10)

The PDF $P_X (\cdot)$ satisfies the following properties:
$\displaystyle \forall x, \; P_X (x) \ge 0,$ (11)
$\displaystyle \int_{\mathcal{S}_X} P_X (x) dx = 1,$ (12)

where $\mathcal{S}_X = \{ x \in \Re : P_X (x) > 0 \}$ is the support of $P_X (\cdot)$.
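Equation (10) and properties (11)-(12) can be checked numerically. The sketch below assumes a standard Gaussian CDF (written with `math.erf`) purely as a convenient test case; the helper names `pdf_from_cdf` and `integrate`, the step size, and the integration range are all illustrative choices rather than anything prescribed by the text.

```python
import math


def gaussian_cdf(x):
    """CDF of a Gaussian RV with zero mean and unit variance."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pdf_from_cdf(cdf, x, h=1e-4):
    """Central-difference approximation of dF/dx, as in Equation (10)."""
    return (cdf(x + h) - cdf(x - h)) / (2.0 * h)


def integrate(f, lo, hi, n=4000):
    """Trapezoidal rule on [lo, hi] with n equal subintervals."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step


def density(x):
    return pdf_from_cdf(gaussian_cdf, x)


# Property (11): the density is nonnegative (up to rounding error).
nonnegative = all(density(-8.0 + 0.1 * i) >= -1e-9 for i in range(161))

# Property (12): the density integrates to (approximately) one.
total_mass = integrate(density, -8.0, 8.0)
```

The interval $[-8, 8]$ stands in for the support here because the Gaussian mass outside it is negligible at this precision.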

The PDF of a discrete RV is a set of impulse functions located at the values taken by the RV. In this way, a discrete RV creates a mutually-exclusive and collectively-exhaustive partitioning of the sample space, where each part is $\Omega_x = \{ \omega \in \Omega : X(\omega) = x \}$. For instance, assuming that the intensity takes only integer values in $[0,255]$, we can define a discrete RV that maps each pixel in the image to its grayscale intensity. Then each part corresponds to the event that a pixel is assigned a particular intensity $x$.
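The partitioning induced by a discrete RV can be made concrete with a small sketch. Here the sample space is taken to be the set of pixel coordinates of a hypothetical 2x3 image, and `X` maps each pixel to its intensity; the image values and variable names are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical 2x3 grayscale image.
image = [
    [0, 255, 128],
    [0, 64, 128],
]

# Sample space: all pixel coordinates (row, column).
omega = [(r, c) for r in range(len(image)) for c in range(len(image[0]))]


def X(pixel):
    """Discrete RV mapping a pixel to its grayscale intensity."""
    r, c = pixel
    return image[r][c]


# Build Omega_x = {omega : X(omega) = x} for every intensity x that occurs.
partition = defaultdict(set)
for pixel in omega:
    partition[X(pixel)].add(pixel)

# Collectively exhaustive: the parts cover the whole sample space.
exhaustive = set().union(*partition.values()) == set(omega)
# Mutually exclusive: no pixel is counted in two parts.
exclusive = sum(len(part) for part in partition.values()) == len(omega)
```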

Here, the notation $P_X (\cdot)$ for the PDF of an RV $X$ uses a subscript to signify the associated RV. In the rest of this dissertation, for simplicity of notation, we may drop this subscript when it is clear which RV we are referring to. The joint PDF $P_{X,Y} (\cdot)$ of two RVs $X$ and $Y$ is [123]

$\displaystyle P_{X,Y} (x,y) = \frac {\partial^2 F_{X,Y} (x,y)} {\partial x \partial y}.$     (13)

The conditional distribution $F_{X\vert M} (\cdot)$ of an RV $X$ assuming event $M$ is

$\displaystyle F_{X\vert M} (x \vert M) = \frac { P (X \leq x, M) } { P (M) },$     (14)

when $P (M) \neq 0$. The conditional PDF $P_{X\vert M} (\cdot)$ of an RV $X$ assuming event $M$ is

$\displaystyle P_{X\vert M} (x \vert M) = \frac {d F_{X\vert M} (x \vert M)} {d x}.$     (15)
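As a worked instance of Equation (14), consider the event $M = \{ a < X \le b \}$. For $x > a$, $P (X \le x, M) = F_X (\min(x, b)) - F_X (a)$, and $P (M) = F_X (b) - F_X (a)$. The sketch below assumes a standard Gaussian RV and the hypothetical helper name `conditional_cdf`; neither choice comes from the text.

```python
import math


def gaussian_cdf(x):
    """CDF of a Gaussian RV with zero mean and unit variance."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def conditional_cdf(cdf, a, b):
    """Return F_{X|M} for the event M = {a < X <= b}, following Equation (14)."""
    p_m = cdf(b) - cdf(a)
    if p_m <= 0.0:
        raise ValueError("P(M) must be nonzero")

    def f(x):
        if x <= a:
            return 0.0  # {X <= x} and M cannot both occur
        # P(X <= x, a < X <= b) = F(min(x, b)) - F(a)
        return (cdf(min(x, b)) - cdf(a)) / p_m

    return f


# Condition a standard Gaussian RV on the event M = {X > 0}.
F_given_positive = conditional_cdf(gaussian_cdf, 0.0, math.inf)
value_at_one = F_given_positive(1.0)
```

The conditional CDF is itself a valid CDF: it is 0 to the left of $a$, rises monotonically, and reaches 1 at $b$.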

Let us now consider examples of a few important PDFs, many of which we will encounter in the subsequent chapters in this dissertation. Figure 2.1 shows the PDF and CDF for a discrete RV.

Figure 2.1: Discrete RVs: (a) The PDF and (b) the CDF for a discrete RV.

An example of a continuous PDF is the isotropic $d$-dimensional Gaussian PDF [123], also known as the Normal PDF:
$\displaystyle G (x) = \frac {1} {(\sigma \sqrt {2 \pi})^d} \exp \Bigg( - \frac {\Vert x - \mu \Vert^2} {2 \sigma^2} \Bigg),$     (16)

where $\mu$ and $\sigma$ are the associated parameters: the mean and the standard deviation, respectively. Figure 2.2 shows the PDF and CDF of a Gaussian RV.

Figure 2.2: Continuous RVs: (a) The PDF and (b) the CDF for a continuous (Gaussian) RV with $\mu = 0$ and $\sigma = 1$.

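A minimal sketch of evaluating Equation (16) follows. It assumes that, for $d > 1$, the squared term in the exponent is the squared Euclidean distance between the vectors $x$ and $\mu$, with a single $\sigma$ shared by all dimensions; the function name `gaussian_pdf` is hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Evaluate the d-dimensional Gaussian PDF of Equation (16).

    x and mu are sequences of length d; sigma is a positive scalar.
    """
    d = len(x)
    squared_distance = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    normalizer = (sigma * math.sqrt(2.0 * math.pi)) ** d
    return math.exp(-squared_distance / (2.0 * sigma ** 2)) / normalizer


# 1D case with mu = 0 and sigma = 1, as in Figure 2.2: peak value at x = 0.
peak_1d = gaussian_pdf([0.0], [0.0], 1.0)

# 2D case: under the assumptions above, the density factors into 1D densities.
joint_2d = gaussian_pdf([1.0, -2.0], [0.0, 0.0], 1.0)
product_1d = gaussian_pdf([1.0], [0.0], 1.0) * gaussian_pdf([-2.0], [0.0], 1.0)
```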
One example of a PDF derived from Gaussian PDFs is the Rician PDF [123]. If independent RVs $X_1$ and $X_2$ have Gaussian PDFs with means $\mu_1$ and $\mu_2$, respectively, and a common variance $\sigma^2$, then the RV $X = \sqrt {X_1^2 + X_2^2}$ has the Rician PDF:
$\displaystyle P ( x \vert \mu ) = \frac {x} {\sigma^2} \exp \Bigg( - \frac {x^2 + \mu^2} { 2 \sigma^2} \Bigg) I_0 \Bigg( \frac {x \mu} {\sigma^2} \Bigg),$     (17)

where $\mu = \sqrt {\mu_1^2 + \mu_2^2}$ and $I_0 (\cdot)$ is the modified Bessel function of the first kind of order zero. In practice, the Rician PDF results from independent additive Gaussian noise components in the real and imaginary parts of the complex MR data: the magnitude of the complex number follows a Rician PDF. The Rician PDF has close relationships with two other well-known PDFs: (a) the RV $((X_1/\sigma)^2 + (X_2/\sigma)^2)$ has a noncentral chi-square PDF [123] and (b) the Rician PDF reduces to a Rayleigh PDF [123] when $\mu = 0$. Figure 2.3 shows two Rician PDFs with different $\mu$ values and $\sigma = 1$. We can show that the Rician PDF approaches a Gaussian PDF as the ratio $\mu / \sigma$ tends to infinity [123].

Figure 2.3: Rician PDFs with parameter values (a) $\mu = 0.5, \sigma = 1$, and (b) $\mu = 5, \sigma = 1$. Note the similarity between the Rician PDF in (b) and the Gaussian PDF in Figure 2.2(a).

Two RVs are independent if their joint PDF is the product of the marginal PDFs, i.e.,

$\displaystyle P_{X, Y} (x, y) = P_X (x) P_Y (y).$     (18)

This is to say that knowing the value of one RV does not give us any information about the value of the other RV. In other words, the occurrence of some event corresponding to RV $X$ does not affect, in any way, the probability of events corresponding to RV $Y$, and vice versa. A set of RVs is mutually independent if their joint PDF is the product of the marginal PDFs, i.e.,
$\displaystyle P_{X_1,X_2,\ldots,X_n} (x_1, x_2, \ldots, x_n) = P_{X_1} (x_1) P_{X_2} (x_2) \ldots P_{X_n} (x_n).$     (19)

It is possible for every pair of RVs in a set to be independent without the entire set being mutually independent [167].
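A classic illustration of pairwise independence without mutual independence uses three binary RVs: two fair, independent bits $X_1$ and $X_2$, and $X_3 = X_1 \oplus X_2$ (exclusive or). The sketch below uses discrete probabilities in place of PDFs and is an assumed example rather than one taken from [167].

```python
from fractions import Fraction
from itertools import combinations, product

# Four equally likely outcomes (x1, x2, x3) with x3 = x1 XOR x2.
outcomes = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]
prob = Fraction(1, len(outcomes))


def p(constraints):
    """Probability that RV number i takes value v for every (i, v) in constraints."""
    return sum(
        prob for o in outcomes if all(o[i] == v for i, v in constraints.items())
    )


# Every pair factorizes into the product of its marginals.
pairwise = all(
    p({i: a, j: b}) == p({i: a}) * p({j: b})
    for i, j in combinations(range(3), 2)
    for a, b in product((0, 1), repeat=2)
)

# The full joint does not factorize into the three marginals.
mutual = all(
    p({0: a, 1: b, 2: c}) == p({0: a}) * p({1: b}) * p({2: c})
    for a, b, c in product((0, 1), repeat=3)
)
```

Here `pairwise` is true while `mutual` is false: for example, the outcome $(0, 0, 0)$ has probability $1/4$, whereas the product of the three marginals is $1/8$.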

Often, we deal with measures that characterize certain properties of PDFs. One such quantity is the expectation or mean of an RV $X$:

$\displaystyle E [X] = \int_{\mathcal{S}_X} x P(x) dx.$     (20)

The expectation represents the average of the observed values $x$ when a large sample is drawn from the PDF $P(X)$. It also represents the center of gravity of the PDF $P(X)$. For example, the mean of a Gaussian PDF is $\mu$. The expectation is a linear operator, i.e., given two RVs $X$ and $Y$ and constants $a$ and $b$,
$\displaystyle E [aX + bY] = a E [X] + b E [Y].$     (21)
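Linearity in Equation (21) holds even when $X$ and $Y$ are dependent. The sketch below checks this exactly on a small, made-up discrete joint distribution, where the integral in the definition of the expectation becomes a sum; the probabilities and the constants `a` and `b` are arbitrary assumptions.

```python
from fractions import Fraction

# Hypothetical joint distribution of two dependent RVs: (x, y) -> probability.
joint = {
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 5): Fraction(1, 2),
}


def expectation(f):
    """Expected value of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


a, b = 2, -3
lhs = expectation(lambda x, y: a * x + b * y)  # E[aX + bY]
rhs = a * expectation(lambda x, y: x) + b * expectation(lambda x, y: y)
```

Both sides evaluate to $-13/2$ here, with $E[X] = 5/4$ and $E[Y] = 3$.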

Deterministic functions $f(X)$ of an RV $X$ are also RVs [167]. The expected value of $Y = f(X)$ when the observations are derived from $P(X)$ is
$\displaystyle E_{P(X)} [Y] = \int_{\mathcal{S}_X} f(x) P(x) dx.$     (22)

The variance gives the variability or spread of the observations around the expectation:
$\displaystyle \mathop{\mbox{Var}}(X) = \int_{\mathcal{S}_X} (x - E [X])^2 P(x) dx.$     (23)

For example, the variance of a Gaussian PDF is $\sigma^2$.
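The integrals in Equations (20), (22), and (23) can be approximated numerically. The sketch below assumes a 1D Gaussian PDF with arbitrarily chosen parameters $\mu = 2$ and $\sigma = 1.5$, and replaces the support by a finite interval of ten standard deviations on either side of the mean; the helper names are hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """1D Gaussian PDF with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (
        sigma * math.sqrt(2.0 * math.pi)
    )


def integrate(f, lo, hi, n=6000):
    """Trapezoidal rule on [lo, hi] with n equal subintervals."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step


mu, sigma = 2.0, 1.5
lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma


def p(x):
    return gaussian_pdf(x, mu, sigma)


# Equation (20): E[X]
mean = integrate(lambda x: x * p(x), lo, hi)
# Equation (23): Var(X)
variance = integrate(lambda x: (x - mean) ** 2 * p(x), lo, hi)
# Equation (22) with f(x) = x^2: E[X^2]
second_moment = integrate(lambda x: x ** 2 * p(x), lo, hi)
```

Up to the integration error, `mean` recovers $\mu$, `variance` recovers $\sigma^2$, and `second_moment` equals `variance + mean ** 2`.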


Suyash P. Awate 2007-02-21