Next: Information Theory Up: Nonparametric Density Estimation Previous: Parzen-Window Convergence

High-Dimensional Density Estimation

Some key ideas in this dissertation entail nonparametric PDF estimation where the observations lie in high-dimensional spaces. With a sufficiently large sample size, the Parzen-window estimate can converge to an arbitrarily complex PDF. To guarantee convergence, however, the theory dictates that the sample size must increase exponentially with the dimensionality of the space. In practice, such a large number of samples is not normally available. Indeed, estimation in high-dimensional spaces is notoriously challenging because the available data populate such spaces very sparsely, a phenomenon known as the curse of dimensionality [155,150,156]. One reason behind this phenomenon is that high-dimensional PDFs can potentially be much more complex than low-dimensional ones, thereby demanding large amounts of data for a faithful estimate. There exists, however, inherent regularity in virtually all image data that we need to process [188,79,91,40]. This regularity makes the high-dimensional data lie on locally low-dimensional manifolds and, given some information about this locality, PDF estimation becomes much simpler. Figure 2.7 depicts this phenomenon. Despite theoretical arguments suggesting that density estimation beyond a few dimensions is impractical because sufficient data are unavailable, the empirical evidence from the literature is more optimistic [150,131,189,50,172]. The results in this dissertation confirm that observation.
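
The sparsity argument and the effect of low-dimensional structure can be illustrated with a small numerical sketch. It is not taken from the dissertation. It assumes NumPy, a fixed-radius ball standing in for a kernel window, uniform samples in the unit hypercube, and, as a deliberately simplistic stand-in for a manifold, a unit-length straight line segment embedded in the same space. The helper `fraction_within`, the sample size, and the radius are hypothetical choices made only for illustration.

```python
import numpy as np

def fraction_within(samples, center, radius):
    """Fraction of samples inside a ball of the given radius around center."""
    dist = np.linalg.norm(samples - center, axis=1)
    return float(np.mean(dist <= radius))

rng = np.random.default_rng(0)
n, radius = 2000, 0.25          # illustrative values, not from the source
ambient, manifold = {}, {}

for d in (1, 2, 5, 10, 20):
    # Case 1: samples fill the unit hypercube [0, 1]^d.
    cube = rng.uniform(0.0, 1.0, size=(n, d))
    ambient[d] = fraction_within(cube, np.full(d, 0.5), radius)

    # Case 2: samples lie on a unit-length line segment (intrinsically 1-D)
    # embedded in the same d-dimensional space.
    t = rng.uniform(0.0, 1.0, size=(n, 1))
    direction = np.ones((1, d)) / np.sqrt(d)   # unit vector along the diagonal
    line = t * direction
    manifold[d] = fraction_within(line, 0.5 * direction[0], radius)

    print(f"d={d:2d}  cube: {ambient[d]:.4f}  line: {manifold[d]:.4f}")
```

In the hypercube case the fraction of samples falling inside the window shrinks rapidly as the dimension grows, so a fixed-size sample leaves almost every window empty. In the line-segment case the fraction stays roughly constant, because the distances between samples depend only on the intrinsic one-dimensional coordinate.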

Figure 2.7: Neighborhoods (circles) in images and their locations (circles) on manifolds (dashed line) in the high-dimensional space. Different patterns in images are expected to produce neighborhoods lying on different manifolds.


Suyash P. Awate 2007-02-21