next up previous
Next: Restoration via Entropy Reduction Up: Image Restoration By Entropy Previous: Image Restoration By Entropy

Overview of Image Restoration

The literature on signal and image restoration is vast, and this chapter by no means aims at a comprehensive review. This section establishes the relationship of this work to several important, relevant areas of nonlinear image filtering. Nonlinear filtering approaches are typically based on either variational methods, leading to algorithms based on partial differential equations (PDEs), or statistical methods, leading to nonlinear estimation problems.

PDE-based image processing methods became widespread after the work of Perona and Malik [127], where they propose a modified version of the heat equation (calling it anisotropic diffusion) that adapted the diffusivity to image features. The anisotropic diffusion equation is also the first variation of an image energy [114,159] that favors piecewise-constant solutions (in 1D--the situation is somewhat more complex in multiple dimensions). Because such variational approaches prefer certain image geometries, we refer to these local geometric configurations as models. A multitude of nonlinear PDE models have been developed for a wide variety of images and applications [143,173], including the total variation model by Rudin et al. [145], PDE versions [26] of the Mumford and Shah [110] variational model, the cartoon-texture model by Vese and Osher [169], the coherence-enhancing flow by Weickert [174], and various algorithms based on level sets [118,153,164,117,26]. These nonlinear PDE models have proven to be very effective, but only for particular applications where the input data are well suited to the model's underlying geometric assumptions. Moreover, the parameter tuning is a challenge because it entails fuzzy thresholds that determine which image features are enhanced and which are smoothed away.

Statistical formulations have given rise to a wide variety of image filters. For instance, the median and other order-statistics on image neighborhoods can be quite effective [107]. Tomasi and Manduchi [166] describe a bilateral filter, which does a robust averaging in Gaussian-weighted image neighborhoods. A great deal of image processing work develops from a stochastic model of image structure given by Markov random fields (MRFs). Geman and Geman [61] exploit the equivalence between MRFs and Gibbs distributions to model images with Gibbs distributions, in which case the optimal image estimate is given as a fixed point of an iterative procedure that relies on neighborhood-dependent updates. Besag [16] and Owen [120] propose the ICM and ICE schemes, respectively, for Bayesian denoising of images in the light of a priori information. The conditional probabilities for image neighborhood configurations, namely cliques, play a similar role to the image energy in the variational approaches. The most widely-used models penalize intensity differences and simultaneously estimate hidden parameters that explicitly model intensity edges, which pushes the iterative process toward piecewise-constant solutions. The cliques in the MRF approach encode a set of probabilistic assumptions (priors) about the geometric properties of the signal, and thus they are effective only when the signal conforms sufficiently well to the prior. UINTA also exploits the Markov property of the images, but in a different context. Rather than imposing a particular model on the image, UINTA learns the relevant conditional PDFs from the input data and updates pixel intensities to decrease the randomness of these conditional PDFs. Unlike ICM and ICE, UINTA does not employ any priors in the restoration process and is fully unsupervised.

Figure 4.1 demonstrates the effects of such strong models on image filtering [*] by showing the effects of some of the prevalent nonlinear techniques on the Lena image. Anisotropic diffusion (Figure 4.1(c)) restores the cheeks but introduces spurious edges near the nose and the lips. Bilateral filtering [166] tends to smooth away fine textures resulting in their elimination, e.g., on the lips in Figure 4.1(d). Both of these algorithms entail two free parameters, i.e., scale and contrast, and require significant tuning. The coherence-enhancing diffusion forces specific elongated shapes in images, as seen in the enlarged nostril and the lips' curves in Figure 4.1(e). On the other hand, Figure 4.1(f) shows the curvature flow [118,153,117], which is very similar to the total variation strategy of [145], tends to shrink features by rounding them off. The Lena image, which appears to be a very typical grayscale photograph, does not adhere very well to the basic geometric models underlying these algorithms.

Figure 4.1: Comparison of UINTA with prevalent strategies. (a) Degraded Lena image: grayscale values range:0-100 grayscale unit (G.U.). Zoomed insets of: (b) the degraded image; (c) anisotropic diffusion: $K$=0.5 G.U.s, 20 iterations, (d) bilateral filtering: $\sigma_{\mathrm{domain}}$=3 pixels, $\sigma_{\mathrm{range}}$=12 G.U., (e) coherence-enhancing diffusion: $\sigma $=0.1 pixels, $\rho $=2 pixels, $\alpha $=0.0001, $C$=0.0001, 15 iterations, and (f) curvature flow: time step=0.2, 8 iterations.
\begin{figure}\threeAcross {UINTA/Lena_InputImage.eps} {UINTA/Lena_InputZoom2.ep...
...ps} {UINTA/OutputImage_Lena_Curvature_zoom.eps} {(d)} {(e)} {(f)}
\end{figure}

An alternative to filtering with variational models is to construct nonlinear transforms in the frequency domain. In this context, the wavelet literature addresses image denoising extensively. The current state-of-the-art wavelet denoising methods [133,152,128,160] treat the wavelet coefficients as random variables and model their a priori marginal/joint PDFs parametrically. They then estimate the coefficients of the noiseless image given the observed coefficients of the noisy image via various schemes such as Bayesian estimation. The limitations of these methods stem both from the choice of the particular wavelet decomposition basis and the parametric models imposed on the coefficients. A very recent work [132] aims at the blind removal of correlated Gaussian noise using Gaussian-scale-mixture signal models in the wavelet domain. It adapts to the noise statistics by estimating the noise covariance from the input image. The sparse-code shrinkage strategy [80] chooses the transformation based on the statistical properties of the data, using noiseless training data, in order to concentrate the energy in only a few components and then shrinking the sparse component values in a manner similar to wavelet-based methods.

Weissman et al. [175] propose the DUDE algorithm that addresses the problem of denoising data sequences generated by a discrete source and received over a discrete, memoryless channel. It assumes no knowledge of the source statistics and yet performs (asymptotically) as well as any denoiser (e.g., one that knows the source statistics), thereby making DUDE universal. DUDE assigns image values based on the similarity of neighborhoods gathered from image statistics, which resembles the construction of conditional probabilities in UINTA. However, the DUDE approach does not account for noise in the neighborhoods that are used to condition the probabilities for the reconstruction, and it is limited to discrete-valued signals. Motta et al. [109] extend DUDE to handle continuous-tone images, corresponding to large number of discrete intensity levels, with i.i.d. additive Gaussian noise.

The literature shows several statistically-based image processing algorithms that do rely on information theory. The mean-shift algorithm [60,155,27,32,10] modifies image intensities so that they move uphill on the PDF associated with the grayscale histogram of the image. At steady state (assuming appropriate windowing strategies) all samples converge to the nearest mode. The mean-shift procedure, thus, can be said to be a mode seeking process. However, the mean-shift algorithm operates only on image intensities (be they scalar or vector valued) and does not account for neighborhood structure in images. Thus, mean shift resembles a kind of data-driven thresholding process, particularly in the algorithm proposed by [32], in which the density estimate is static as the algorithm iterates. We show the mathematical relationship between the mean-shift procedure and entropy reduction, thereby establishing UINTA as a generalization of the mean-shift algorithm, which incorporates image neighborhoods to reduce the entropy of the associated conditional PDFs.

Buades et al. [22,23], in their work that was developed simultaneously with this dissertation, propose a nonlocal means (NL means) algorithm for image denoising that computes the denoised image intensity as a weighted average of a sample of image intensities, where the weights are derived from the neighborhoods of the pixels in the sample. Empirical analysis of their method shows that it produces denoised images having a low degree of correlation in the difference image between the noisy image and the denoised image. The intensity updates in their method are based on the expectation of the conditional Markov PDF $P (X_t \vert {\bf
y}_t)$ and closely resemble those in UINTA. However, their method contains a free parameter that defines the weights in the weighted-average update, unlike UINTA which automatically tunes this parameter optimally in a data-driven manner. Furthermore, UINTA is iterative and arrives at the intensity updates via an entropy-reduction scheme coupled with a stochastic-relaxation approach using MRFs. The NL means algorithm could be considered a special case of the UINTA algorithm involving a single iteration and a user-defined Gaussian kernel width. Empirical comparisons show that UINTA typically produces better results than NL means, both quantitatively and qualitatively, at the cost of increased processing time.


next up previous
Next: Restoration via Entropy Reduction Up: Image Restoration By Entropy Previous: Image Restoration By Entropy
Suyash P. Awate 2007-02-21