The output of the Gabor filter bank is a set of n filtered images. To obtain uniform and complete coverage of the frequency domain, Gabor filters at up to 6 scales and 30 orientations were used, giving 6 x 30 = 180 filtered images. Using all 180 filtered images is computationally expensive. Moreover, some of the filtered images have very little discriminatory power, while others contain very little information about the original image. Hence, a subset of the filtered images is sufficient for segmentation.
Consider a set of $n$ $d$-dimensional feature vectors $x_1, x_2, \ldots, x_n$ belonging to two classes. The two classes will be well separated in the feature space if the function:
\[
F = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2} \tag{18}
\]
is maximum. Here $\mu_1$, $\mu_2$ and $\sigma_1$, $\sigma_2$ are the means and standard deviations of the two classes, respectively.
Maximizing the above function ensures that the means of the two classes are well separated in the feature space and that the standard deviation within each class is small, i.e., the points belonging to the same class are compactly clustered around their class mean.
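As a small illustration of this trade-off, assume the criterion takes the usual Fisher-ratio form, squared difference of the class means divided by the sum of the class variances (an assumption here; the numbers below are made up for illustration). Two features with the same mean separation but different within-class spread then score very differently:

\[
\mu_1 = 2,\ \mu_2 = 6,\ \sigma_1 = \sigma_2 = 1:\quad \frac{(2-6)^2}{1^2 + 1^2} = \frac{16}{2} = 8
\]
\[
\mu_1 = 2,\ \mu_2 = 6,\ \sigma_1 = \sigma_2 = 4:\quad \frac{(2-6)^2}{4^2 + 4^2} = \frac{16}{32} = 0.5
\]

The first feature, whose classes are tightly clustered, would be ranked far above the second even though the class means are equally far apart.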
The discriminatory power of each feature is estimated by computing the value of the function in Eq. (18) for that feature. The ten best features, i.e., the 10 features with the highest values of this function, are used for segmentation.
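The ranking step described above can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes the criterion is the Fisher ratio (squared mean difference over summed class variances), that each of the 180 filter responses contributes one column of a feature matrix, and that a binary label per sample is available. The names `fisher_scores` and `select_top_features` and the synthetic data are hypothetical.

```python
import numpy as np

def fisher_scores(features, labels, eps=1e-12):
    """Per-feature separability score for a two-class problem.

    features: array of shape (n, d), one row per sample.
    labels:   array of shape (n,), containing 0 or 1.
    Assumed form: (mu1 - mu2)^2 / (sigma1^2 + sigma2^2), per column.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    class_a = features[labels == 0]
    class_b = features[labels == 1]
    mean_gap = (class_a.mean(axis=0) - class_b.mean(axis=0)) ** 2
    spread = class_a.var(axis=0) + class_b.var(axis=0)
    # eps guards against division by zero for constant features
    return mean_gap / (spread + eps)

def select_top_features(features, labels, k=10):
    """Return indices of the k highest-scoring features and all scores."""
    scores = fisher_scores(features, labels)
    order = np.argsort(scores)[::-1][:k]
    return order, scores

# Synthetic stand-in for 180 filter responses (illustrative only):
# the first 10 columns are shifted for class 1, the rest are pure noise.
rng = np.random.default_rng(0)
n, d = 400, 180
labels = np.repeat([0, 1], n // 2)
features = rng.normal(size=(n, d))
features[labels == 1, :10] += 3.0

top, scores = select_top_features(features, labels, k=10)
print(sorted(top.tolist()))
```

On this synthetic data the ten shifted columns receive much higher scores than the noise columns, so they are the ones selected. With real filter responses, the labels would have to come from annotated training regions, which is an assumption of this sketch.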
2002-06-03