
Segmentation using K-Means Algorithm

K-Means is a least-squares partitioning method that divides a collection of objects into K groups. The algorithm alternates between two steps until convergence:
  1. Compute the mean of each cluster.
  2. Compute the distance of each point from each cluster mean, and assign each point to the cluster whose mean is nearest.
  3. Repeat the above two steps until the sum of squared within-group errors can no longer be lowered.
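The loop above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the names `kmeans` and `sq_dist`, the `seed` and `max_iter` parameters, the plain (unweighted) squared Euclidean distance, and the handling of empty clusters are all assumptions made for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed=0, max_iter=100):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (labels, means, sse), where sse is the sum of squared
    within-group errors for the final assignment.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Start from a random assignment of points to clusters.
    labels = [rng.randrange(k) for _ in points]
    means = [None] * k
    prev_sse = float("inf")
    sse = prev_sse
    for _ in range(max_iter):
        # Step 1: compute the mean of each cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                means[c] = tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(dim)
                )
            elif means[c] is None:
                # Assumption: an initially empty cluster is seeded with a random point.
                means[c] = rng.choice(points)
        # Step 2: assign each point to the cluster with the nearest mean.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, means[c])) for p in points
        ]
        # Step 3: stop once the objective can no longer be lowered.
        sse = sum(sq_dist(p, means[lab]) for p, lab in zip(points, labels))
        if sse >= prev_sse:
            break
        prev_sse = sse
    return labels, means, sse
```

Calling `kmeans(points, 3)` on a list of per-pixel feature tuples would return one cluster index per pixel. Which index ends up attached to which group depends on the random initial assignment, so the indices themselves carry no meaning.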
The initial assignment of points to clusters can be done randomly. Over the course of the iterations, the algorithm tries to minimize the sum, over all groups, of the squared within-group errors, that is, the squared distances of the points from their respective group means. Convergence is reached when this objective function (the residual sum of squares) can no longer be lowered. The resulting groups are geometrically as compact as possible around their respective means.

Using the set of feature images, a feature vector $[e_1(a,b), e_2(a,b), \ldots, e_d(a,b)]$ is constructed for each pixel $(a,b)$, where d is the number of feature images used for the segmentation process. K-Means can then be used to segment the image into three clusters, corresponding to the two scripts and the background. For each additional script, one more cluster is added. Here, each feature is assigned a different weight, calculated from the feature importance as described in the previous Section. The distance between two vectors is computed using Equation 19.

Once the image has been segmented using the K-Means algorithm, the clustering can be improved by assuming that neighboring pixels have a high probability of falling into the same cluster. Thus, even if a pixel has been wrongly clustered, it can be corrected by looking at its neighboring pixels.
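
The per-pixel feature vectors, the weighted distance, and the neighborhood-based correction can be illustrated with a short Python sketch. Several parts are assumptions, since this section does not specify them: Equation 19 is not reproduced here, so `weighted_sq_dist` uses a weighted squared Euclidean distance as a stand-in, and the correction step is shown as a majority vote over a 3x3 window, which is only one possible way of "looking at the neighboring pixels". All function names are hypothetical.

```python
from collections import Counter


def feature_vectors(feature_images):
    """Combine d feature images (each a list of rows) into one d-vector per pixel.

    Returns a dict mapping (a, b) to (e_1(a,b), ..., e_d(a,b)).
    """
    rows, cols = len(feature_images[0]), len(feature_images[0][0])
    return {
        (a, b): tuple(img[a][b] for img in feature_images)
        for a in range(rows)
        for b in range(cols)
    }


def weighted_sq_dist(u, v, weights):
    """Weighted squared Euclidean distance (assumed stand-in for Equation 19)."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, u, v))


def majority_smooth(labels):
    """Relabel each pixel with the most common label in its 3x3 neighborhood.

    `labels` is a list of rows of cluster indices; a new grid is returned.
    """
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for a in range(rows):
        for b in range(cols):
            window = [
                labels[i][j]
                for i in range(max(0, a - 1), min(rows, a + 2))
                for j in range(max(0, b - 1), min(cols, b + 2))
            ]
            counts = Counter(window)
            best = max(counts.values())
            # Change a pixel only when another label strictly outnumbers its own.
            if counts[labels[a][b]] < best:
                out[a][b] = counts.most_common(1)[0][0]
    return out
```

A single pass is shown; whether to repeat the smoothing, and how large a neighborhood to use, are design choices this sketch leaves open.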
2002-06-03