next up previous
Next: Markov Consistency Up: Technical Background Previous: Mutual Information


Markov Random Fields

Markov random fields (MRFs) are stochastic models that characterize the local spatial interactions in data. The last 40 years have seen significant advances in the mathematical analysis of MRFs as well as numerous application areas for MRFs ranging from physics, pattern recognition, machine learning, artificial intelligence, image processing, and computer vision. This has firmly established MRFs as powerful statistical tools for data analysis. This dissertation proposes an adaptive MRF image model and builds processes images relying on this model. This section gives a brief review of theory behind MRFs and some relevant MRF-based algorithms.

The first concept of the MRF theory came from the physicist Ernst Ising in the 1920s. Ising was trying to devise a mathematical model to explain the experimental results concerning properties of ferromagnetic materials. This dealt with local interactions between a collection of dipoles associated with such materials. He published the model in his doctoral thesis, which later became popular as the Ising model. The name Markov, however, is dedicated in the memory of the mathematician Andrei Markov who pioneered the work on Markov chains, i.e., ordered sequences of RVs where the conditional PDF of an RV given all previous RVs is exactly the same as the conditional PDF of the RV given only its preceeding RV. In other words, the next RV, given the present RV, is conditionally independent of all other previous RVs. This notion of conditional independence concerning chains of RVs generalizes to grids of RVs or random fields. Such random fields are called MRFs.

A random field [47,161] is a family of RVs ${\bf X} = \{ X_t \}_{t \in
\mathcal{T}}$, for some index set $\mathcal{T}$. For each index $t$, the RV $X_t$ is defined on some sample-space $\Omega$. If we let $\mathcal{T}$ be a set of points defined on a discrete Cartesian grid and fix $\Omega = \omega$, we have a realization or an instance of the random field, ${\bf X} (\omega) = {\bf x}$, called the digital image. In this case, $\mathcal{T}$ is the set of grid points in the image. For vector-valued images $X_t$ becomes a vector RV.

In the early 1970s, Spitzer, Preston, Hammersely, Clifford, and Besag were among the pioneers who rigorously analyzed the theory behind the stochastic models for systems of spatially-interacting RVs. The joint PDF $P ({\bf X})$ of all the RVs in the random field dictates the image-formation process. However, modeling this joint PDF is intractable because of the enormous dimensionality $\vert\mathcal{T}\vert$ that equals the number of pixels in the image. Early researchers advocated the use of the lower-dimensional conditional PDFs, one associated with each RV $X_t$, to model the statistical dependencies between RVs. Such PDFs were conditioned only on the values of a few RVs in the spatial proximity of the RV in concern, thereby making the analysis tractable. These ideas rely on the notion of a neighborhood, which we define next.

We can associate with the index set $\mathcal{T}$, a family of neighborhoods

$\displaystyle \mathcal{N}$ $\textstyle =$ $\displaystyle \{ \mathcal{N}_t \}_{t \in \mathcal{T}} \mathrm { such that}$  
$\displaystyle \mathcal{N}_t$ $\textstyle \subset$ $\displaystyle \mathcal{T},$  
$\displaystyle t$ $\textstyle \notin$ $\displaystyle \mathcal{N}_t, \mathrm { and}$  
$\displaystyle \Big( u \in \mathcal{N}_t \Big)$ $\textstyle \Leftrightarrow$ $\displaystyle \Big( t \in \mathcal{N}_u \Big).$ (70)

Then $\mathcal{N}$ is called a neighborhood system for the set $\mathcal{T}$. Indices in $\mathcal{N}_t$ constitute the neighborhood of index $t$. $\mathcal{N}_t$ is also referred to as the Markov blanket or Markov cover for index $t$. We define a random vector ${\bf Y}_t =
\{ X_t \}_{t \in \mathcal{N}_t}$ to denote image neighborhoods. Figure 2.8 shows a 3-pixel $\times $ 3-pixel square neighborhood.

Figure 2.8: A 3-pixel $\times $ 3-pixel square neighborhood. The center pixel is shaded different from its neighbors.
\begin{figure}\oneWidth {Figures/Cliques_3x3.eps} {0.25}
\end{figure}

Based on this general notion of a neighborhood, $X (\Omega,T)$ is a MRF if and only if

$\displaystyle \bigg( P (x_t) > 0, \forall t \bigg) \Rightarrow P ( x_1,x_2,\ldots,x_{\vert\mathcal{T}\vert} )$ $\textstyle >$ $\displaystyle 0, \mathrm { and}$ (71)
$\displaystyle \forall t,
P \Big( X_t \vert \{ x_u \}_{u \in \{ \mathcal{T} \setminus \{ t \} \}} \Big)$ $\textstyle =$ $\displaystyle P ( X_t \vert {\bf y}_t ).$ (72)

The first condition above is the positivity condition. The second one is the Markovity condition that implies the conditional independence of any RV ($X_t$), with respect to all other RVs not in its Markov cover ( $\mathcal{T} - \mathcal{N}_t$), given the values of RVs in its Markov cover ($\mathcal{N}_t$). This means that, given the the Markov cover of an RV, the remaining RVs carry no extra information about the RV. We define a random vector ${\bf Z}_t =
(X_t, {\bf Y}_t)$. We refer to the PDFs $P (X_t, {\bf Y}_t) = P ({\bf Z}_t)$ as Markov PDFs defined on the feature space $< {\bf z} >$.



Subsections
next up previous
Next: Markov Consistency Up: Technical Background Previous: Mutual Information
Suyash P. Awate 2007-02-21