Next: Generating the feature image Up: Document Segmentation Previous: Gaussian and Laplacian Pyramids   Contents

Gabor Filters

Gabor filters have received considerable attention because the characteristics of certain cells in the visual cortex of some mammals can be approximated by these filters. In addition, these filters have been shown to possess optimal localization properties in both the spatial and frequency domains, and are thus well suited to texture segmentation problems [13,20]. Gabor filters have been used in many applications, such as texture segmentation, target detection, fractal dimension management, document analysis, edge detection, retina identification, image coding and image representation [21]. A Gabor filter can be viewed as a sinusoidal plane wave of a particular frequency and orientation, modulated by a Gaussian envelope. It can be written as:
\begin{displaymath}
h(x,y) = s(x,y)g(x,y)
\end{displaymath} (11)

where $s(x,y)$ is a complex sinusoid, known as the carrier, and $g(x,y)$ is a 2-D Gaussian-shaped function, known as the envelope. The complex sinusoid is defined as follows:
\begin{displaymath}
s(x,y) = e^{-j2\pi(u_0x+v_0y)}
\end{displaymath} (12)

The 2-D Gaussian function, written here without a normalizing constant so that it matches the filter expression and the frequency response given below, is defined as follows:
\begin{displaymath}
g(x,y) = e^{-\frac{1}{2}{(\frac{x^2}{\sigma_x^2}+\frac{y^2}{\sigma_y^2})}}
\end{displaymath} (13)

Thus the 2-D Gabor filter can be written as:
\begin{displaymath}
h(x,y) = e^{-\frac{1}{2}{(\frac{x^2}{\sigma_x^2}+\frac{y^2}{\sigma_y^2})}}e^{-j2\pi(u_0x+v_0y)} = g(x,y)\, e^{-j2\pi(u_0x+v_0y)}
\end{displaymath} (14)
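As an illustration of equation (14), the filter can be sampled on a pixel grid. The following is only a minimal sketch: the function name `gabor_kernel`, the `half_size` truncation parameter and the numerical values are assumptions made for this example and do not come from the text.

```python
import numpy as np

def gabor_kernel(u0, v0, sigma_x, sigma_y, half_size):
    """Sample h(x, y) of equation (14) on a square grid centred on the origin.

    half_size is a hypothetical truncation parameter: the kernel covers
    x, y in [-half_size, half_size] (in pixels).
    """
    coords = np.arange(-half_size, half_size + 1)
    # indexing="xy": x varies along columns, y varies along rows
    x, y = np.meshgrid(coords, coords, indexing="xy")
    # Gaussian envelope g(x, y), without a normalizing constant
    envelope = np.exp(-0.5 * (x**2 / sigma_x**2 + y**2 / sigma_y**2))
    # Complex sinusoidal carrier s(x, y), with the sign used in equation (12)
    carrier = np.exp(-1j * 2.0 * np.pi * (u0 * x + v0 * y))
    return envelope * carrier

# Illustrative values: 0.125 cycles/pixel along x, isotropic envelope
kernel = gabor_kernel(u0=0.125, v0=0.0, sigma_x=4.0, sigma_y=4.0, half_size=12)
```

Truncating the grid at three standard deviations (here `half_size = 3 * sigma_x`) is a common practical choice, since the envelope is close to zero beyond that range.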


The frequency response of the filter is $H(u,v) = G(u-u_0,v-v_0)$, where $G$ denotes the Fourier transform of $g$:
\begin{displaymath}
H(u,v) = 2 \pi \sigma_x \sigma_y \, e^{-2 \pi^2[(u-u_0)^2 \sigma_x^2 + (v-v_0)^2 \sigma_y^2]}
= \frac{1}{2 \pi \sigma_u \sigma_v} e^{- \frac{1}{2}[\frac{(u-u_0)^2}{\sigma_u^2} + \frac{(v-v_0)^2}{\sigma_v^2}]}
\end{displaymath} (15)

where $\sigma_u = \frac{1}{2 \pi \sigma_x}$ and $\sigma_v = \frac{1}{2 \pi \sigma_y}$.

This is equivalent to translating the Gaussian function by $(u_0,v_0)$ in the frequency domain. Thus the Gabor function can be thought of as a Gaussian function shifted in frequency to position $(u_0,v_0)$, i.e. at a distance of $\sqrt{u_0^2+v_0^2}$ from the origin and at an orientation of $\tan^{-1}\frac{v_0}{u_0}$ measured from the $u$ axis. In the two equations above, $(u_0,v_0)$ is referred to as the spatial center frequency of the Gabor filter. The parameters $\sigma_x$ and $\sigma_y$ are the standard deviations of the Gaussian envelope along the $x$ and $y$ directions, and they determine the filter bandwidth.
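The two forms of the frequency response in equation (15) can be checked against each other numerically. The sketch below evaluates both closed-form expressions on a frequency grid and locates the peak. The function names, the grid and the parameter values are illustrative assumptions, not part of the original text.

```python
import numpy as np

def response_spatial_form(u, v, u0, v0, sigma_x, sigma_y):
    # First form of equation (15), written with sigma_x and sigma_y
    return 2.0 * np.pi * sigma_x * sigma_y * np.exp(
        -2.0 * np.pi**2 * ((u - u0)**2 * sigma_x**2 + (v - v0)**2 * sigma_y**2))

def response_frequency_form(u, v, u0, v0, sigma_x, sigma_y):
    # Second form of equation (15), written with sigma_u and sigma_v
    sigma_u = 1.0 / (2.0 * np.pi * sigma_x)
    sigma_v = 1.0 / (2.0 * np.pi * sigma_y)
    return np.exp(-0.5 * ((u - u0)**2 / sigma_u**2 + (v - v0)**2 / sigma_v**2)) / (
        2.0 * np.pi * sigma_u * sigma_v)

# Illustrative parameters (cycles/pixel and pixels)
u0, v0, sigma_x, sigma_y = 0.1, 0.1, 4.0, 6.0
freqs = np.linspace(-0.5, 0.5, 101)
u, v = np.meshgrid(freqs, freqs, indexing="xy")

H_a = response_spatial_form(u, v, u0, v0, sigma_x, sigma_y)
H_b = response_frequency_form(u, v, u0, v0, sigma_x, sigma_y)

forms_agree = np.allclose(H_a, H_b)              # both forms give the same surface
row, col = np.unravel_index(np.argmax(H_a), H_a.shape)
peak = (u[row, col], v[row, col])                # grid point of maximum response
radial_frequency = np.hypot(u0, v0)              # distance of the peak from the origin
```

With these values the maximum sits at the grid point nearest $(u_0,v_0)$, and a larger $\sigma_x$ or $\sigma_y$ narrows the response along the corresponding frequency axis.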

Figure 6: Plot of the frequency response of the Gabor filter for different values of $u_0$, $v_0$ corresponding to four orientations: $0^\circ$, $45^\circ$, $90^\circ$ and $135^\circ$
\begin{figure*}
\centerline{\epsfig{figure=gabor.eps,width=0.55\textwidth}}
\end{figure*}

Figure 6 shows the frequency response of the Gabor filter for different values of $u_0$, $v_0$ corresponding to four orientations: $0^\circ$, $45^\circ$, $90^\circ$ and $135^\circ$.
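One way to obtain center frequencies for the four orientations is to fix a radial frequency and rotate it. This is only a sketch under two assumptions that the text does not state: all four filters share the same radial frequency (the value 0.125 cycles per pixel is illustrative), and the orientation angle is measured from the $u$ axis.

```python
import numpy as np

radial_frequency = 0.125          # cycles per pixel; illustrative value only
angles_deg = [0, 45, 90, 135]     # the four orientations named in the text

centre_frequencies = []
for angle in angles_deg:
    theta = np.deg2rad(angle)
    # (u0, v0) lies on a circle of radius radial_frequency at angle theta
    u0 = radial_frequency * np.cos(theta)
    v0 = radial_frequency * np.sin(theta)
    centre_frequencies.append((u0, v0))
```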

Figure 7: Gabor filter output for the simple lines pattern at four orientations: $0^\circ$, $45^\circ$, $90^\circ$ and $135^\circ$
\begin{figure*}
\centerline{\epsfig{figure=gabor_output.eps,width=0.55\textwidth}}
\end{figure*}

Figure 7 shows the output of the Gabor filter at four orientations when the input is an image containing lines at similar frequencies but at different angles. The filter at orientation $\theta$ responds strongly to the regions where the intensity variation is also along angle $\theta$. In this case, a simple smoothing operation followed by thresholding is enough to segment the image into four regions corresponding to the lines at the four orientations. By passing an image through a Gabor filter defined by the parameters $(u_0, v_0, \sigma_x, \sigma_y)$, we obtain all those components of the image that have their energies concentrated near the spatial frequency point $(u_0,v_0)$.
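The filter, smooth and threshold procedure described above can be sketched on a synthetic image. Everything specific here is an assumption made for illustration: the test image (four quadrants of sinusoidal gratings), the helper names `gabor_kernel` and `circular_convolve`, the FFT-based circular convolution, the Gaussian smoother, the parameter values and the threshold at half the maximum response. It is not the procedure used to produce Figure 7.

```python
import numpy as np

def gabor_kernel(u0, v0, sigma, half_size):
    # Isotropic special case of equation (14): sigma_x = sigma_y = sigma
    coords = np.arange(-half_size, half_size + 1)
    x, y = np.meshgrid(coords, coords, indexing="xy")
    envelope = np.exp(-0.5 * (x**2 + y**2) / sigma**2)
    return envelope * np.exp(-1j * 2.0 * np.pi * (u0 * x + v0 * y))

def circular_convolve(image, kernel):
    # Zero-pad the kernel to the image size, move its centre to index (0, 0),
    # then multiply the 2-D FFTs (circular convolution).
    padded = np.zeros(image.shape, dtype=complex)
    k = kernel.shape[0]
    padded[:k, :k] = kernel
    padded = np.roll(padded, shift=(-(k // 2), -(k // 2)), axis=(0, 1))
    return np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded))

size, F = 128, 0.125                      # image side and radial frequency (assumed)
angles_deg = [0, 45, 90, 135]
yy, xx = np.mgrid[0:size, 0:size]         # yy: row index, xx: column index
true_labels = 2 * (yy >= size // 2) + (xx >= size // 2)   # quadrant index 0..3

# Synthetic input: one grating orientation per quadrant
image = np.zeros((size, size))
for k, angle in enumerate(angles_deg):
    theta = np.deg2rad(angle)
    grating = np.cos(2.0 * np.pi * F * (np.cos(theta) * xx + np.sin(theta) * yy))
    image[true_labels == k] = grating[true_labels == k]

# Gaussian smoother (a Gabor kernel with zero frequency), normalized to sum to 1
smoother = gabor_kernel(0.0, 0.0, sigma=6.0, half_size=18).real
smoother /= smoother.sum()

masks = []
for angle in angles_deg:
    theta = np.deg2rad(angle)
    kernel = gabor_kernel(F * np.cos(theta), F * np.sin(theta), sigma=4.0, half_size=12)
    magnitude = np.abs(circular_convolve(image, kernel))      # filter output magnitude
    smoothed = circular_convolve(magnitude, smoother).real    # smoothing step
    masks.append(smoothed > 0.5 * smoothed.max())             # thresholding step

# Fraction of pixels where each mask agrees with its quadrant
accuracy = [float(np.mean(mask == (true_labels == k))) for k, mask in enumerate(masks)]
```

Each mask should cover roughly the quadrant whose grating matches the filter orientation, with disagreements confined to narrow bands along the quadrant boundaries where the filter window straddles two textures.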
2002-06-03