Next: Engineering Enhancements for the
Up: Estimating Uncorrupted-Signal Markov Statistics
Previous: Inverse Problem: KL-Divergence Optimality
The inverse problem we have here is that of mixture-density parameter estimation--the parameter
here is the set
of the means of Gaussians that
defines the uncorrupted-signal Markov PDF. We propose to solve this using the EM
algorithm [43,104]. The EM algorithm computes a ML parameter estimate when the
data are incomplete, i.e., a part of the data remains unobserved or hidden. We now
describe the key idea behind the working of the EM algorithm.
The optimization formulation in (5.9) is a little unwieldy because it contains the
logarithm of a sum. If we knew which Gaussian component generated each observation, then we could
obtain the probability
by evaluating a single Gaussian: the one that
generated
. The EM approach gets rid of the summation that the logarithm applies
to. The key idea behind EM is that it assumes the existence of one hidden RV associated with
each observation
. The PDF of this hidden RV gives the probabilities for different
Gaussian components to have generated
. Let us call this RV
. The values of
are, however, never observed. The EM algorithm starts by assuming a joint PDF
of the observed and hidden RVs, i.e., the complete data. It defines the probability of
the observation
assuming that it came from the
-th Gaussian as
 |
|
|
(124) |
where
and
are the mean and covariance values, respectively, for the
-th
Gaussian. The goal of the EM algorithm is to iteratively find the ML estimate of the parameter
as
where
is the support of
. Each iteration comprises the E
(expectation) step and the M (maximization) step. The E step formulates an expectation of
the complete-data likelihood function over the PDF of the hidden RV conditioned on the observed data
and current parameter estimate. The M step maximizes this expectation with respect to the
parameter. After much simplification [19], the maximization performed in the
-th
iteration reduces to
 |
|
|
(126) |
where
is the
-th parameter estimate that is held constant and
is the free variable. The parameter updates guarantee no decrease
in the likelihood
of the observed data and, hence, the sequence of
estimates converge to a local maximum of the likelihood function.
An important element in this entire process of inferring the uncorrupted-signal Markov statistics is
the initial choice of the sample
for the EM
algorithm. We initialize
to comprise a small random
fraction of the entire set of observed neighborhood-intensities
, spread uniformly over the image domain
. This ensures the representation
of all important features in the image and produces an initial estimate close to the global maximum
of the likelihood function.
The EM updates, for density estimation using a sum of Gaussians, are as follows.
-
Let
be the parameter estimate at the
-th
iteration.
-
Use the lookup tables to compute
and
,
,
where
-
Compute
 |
|
|
(128) |
-
Use Bayes rule to evaluate
. Because we derive the initial set of observations
from the PDF
that is close to
, we can ignore the a priori
probabilities
--treat them equal for all
. Thus, we compute
 |
|
|
(129) |
-
Update the current parameter estimate using a gradient-ascent scheme using first-order finite
forward differences:
 |
|
|
(130) |
where the Jacobian is a diagonal matrix--each component of the vector neighborhood
is corrupted independently because of the conditional independence assumption on the noise
model--that can be computed numerically using the lookup table
. The
partial derivatives in the Jacobian are the reciprocal of the rate of change of the shift in the
mean of the Rician-corrupted Gaussians with respect to the change in the means of the input
Gaussian (for i.i.d. additive Gaussian noise the Jacobian is exactly identity). We have
numerically found that this derivative is always greater than unity, and approaches unity for
large SNR (where Rician noise behaves very similar to i.i.d. additive Gaussian noise). For low
SNR, however, the derivative can be much larger than unity and this may lead to numerically-large
updates. In practice, we treat the Jacobian as identity. This results in a projected-gradient
ascent strategy that is still guaranteed to converge.
-
If
, where
is a small threshold, then stop, otherwise go to
Step 3.
Next: Engineering Enhancements for the
Up: Estimating Uncorrupted-Signal Markov Statistics
Previous: Inverse Problem: KL-Divergence Optimality
Suyash P. Awate
2007-02-21