|
CS7960:
|
|
Project 6
|
||
|
Name:
|
|
James Fishbaugh
|
||
|
Email:
|
|
jfishbau@cs.utah.edu
|
||
|
Normalized Graph Cuts – Graph-based Image Segmentation |
||||
|
Introduction
Segmenting an image into a number of labeled regions is a common task in many areas of computational science. Methods
for segmentation range from simple thresholding techniques, to Gaussian mixture models, to deformable models that evolve
over time. Here we consider a method which models an image as a weighted undirected graph and seeks to find similar components.
|
||||
|
Theory
We model an image as an undirected graph where the nodes of the graph represent pixels and the edges represent
relationship between pixels. The edges are weighted based on a similarity measurement between the pixels. We
further model the graph as an affinity matrix, which is a symmetric matrix of edge weights between every pixel
and every other pixel. Figure 1 shows an example graph and corresponding affinity matrix.
Figure 1: Example graph and affinity matrix. Image from Trucco and Verri.
Affinity Measures
There are many ways we can measure the similarity between two pixels. Here we consider two common methods, one based
on distance and one based on image intensity. A combination of measures can be combined by elementwise multiplication of
the resulting matrices.
Distance We can measure the similarity between pixels based on the spatial distance between them. We desire a measurement that falls off quickly with distance, so we define the affinity as an exponential function of distance as
where σd influences how quickly the function falls off with distance. A large value of σd
allows for similarity between pixels that are far away, whereas a small value says only neighboring pixels should be considered
similar. Figure 2 shows the impact of σd on the affinity matrix for a simple dataset.
Figure 2: Top row) Original and clustered data. Bottom row) Affinity matrix
based on distance with increasing value of σd
Intensity
We can measure the similarity between pixels in a grayscale image based on pixel intensity. Again, we want a measurement that falls off quickly, so we define the affinity as an exponential function of intensity difference as
where σI influences how the range of intensity differences considered similar.
Graph Cuts
We desire a method to split the affinity matrix into sub-matrices of similar regions. We want clusters of elements
that have high affinity values. If we desire the single best cluster, we can formulate this as an
optimization problem of the form
where w is a vector of weights indicating the probability an element belongs to the cluster, and
A is the affinity matrix. The best cluster is found by maximizing the objective function for
w. However, we see that increasing w will increase the objective function without
bound. Therefore, in order to obtain a meaningful value of w that maximizes the objective function,
we impose the constraint that wTw=1. We now have the constrained
objective function
where λ is a Lagrange multiplier.
We solve the problem by differentiating with respect to w and we have (ignoring constants)
which shows that the solution w is an eigenvector of the affinity matrix. Therefore, the best cluster
can be obtained by looking at the eigenvector corresponding to the largest eigenvalue. The values of w,
as mentioned previously, represent the probability that a point is a member of the cluster. A segmentation is obtained
by labeling only those points with sufficiently high probability.
Multiple clusters are obtained by considering the first k eigenvectors. For example, figure 3 shows the eigenvectors associated with the four highest eigenvalues from the example data in figure 2. Figure 3: First four eigenvectors for data in figure 2.
We can infer the number of desired clusters by investigating the distribution of eigenvalues. Figure 4 shows the
eigenvalues obtained from the example data in figure 2 with different values of σd. A good
choice of σd reveals that there are four significant clusters. However, the number of clusters is
not always clear from the eigenvalue distribution. It is best if the number of clusters is known apriori.
Figure 4: Eigenvalue distribution for increasing value of σd
for the data in figure 2.
Normalized Graph Cuts
Rather than performing a spectral clustering of the affinity matrix, we could consider cutting the graph in such a way
that the cost of the cut is small with respect to the affinity of the components. We can define the cost of a cutting
an undirected graph V into two components A and B as
where cut(A,B) is the sum of edge weights in V connecting elements in A and B, assoc(A,V) is the sum of all edge
weights in A to all other edges in V, and assoc(B,V) is defined similarly. In other words, we want a cut where the edges between the components have small weight, and
the edges in each component have high weight. The cut which minimizes this cost is referred to as a normalized cut.
A solution to this problem is difficult to determine, since we must consider every possible cut in order to obtain the best cut. Instead, we look for an approximation of the best cut. Let us define a vector y where element i is 1 if node i belongs to one component and -1 if it belongs to the other component. Let us further define a diagonal matrix D as
where A is the affinity matrix. Matrix D is referred to as the degree matrix.
This formulation leads to an optimization problem where we wish to find y that minimizes
which is categorized as an integer programming problem. However, rather than solving for integer values of y,
we instead find a real values. The solution is given by
where we choose the eigenvector associated with the second smallest eigenvector. Having reals rather than
integers necessitates a thresholding to determine which cluster each element belongs.
Multiple clusters are obtained by investigating the eigenvector with the third smallest eigenvalue, the eigenvector with the fourth smallest eigenvalue, and so forth. |
||||
|
Experiments
We begin by testing the graph cut framework on two synthetic images, shown in figure 5. Each image has dimensions of 25x25.
Figure 5: Synthetic images for testing. The image of three white shapes
on a black background is courtesy of Jonathan Bronson. Graph Cuts
We first attempt to segment the synthetic image with three regions. We add Gaussian noise with zero mean and variance 0.05 to make
the segmentation more challenging. The affinity matrix is shown on the left side of the middle row of figure 6. It represents a
combination of distance with σd=2.5 and intensity with σI=0.05. We were able to obtain decent results
with just intensity, but the segmentation proved more robust when distance information was included. We note
a block diagonal structure in the affinity matrix, which implies structure in the image. It doesn't, however, give us much information
about how many regions there are in the image.
The eigenvalue distribution, shown on the right side of the middle row of figure 6, is also not very informative. It is not clear from the eigenvalues that there are 3 regions in the image. The bottom row of figure 6 shows the first two eigenvectors. We observe that all non-zero weights from the first eigenvector represent the dark grey region of the original image, and that all non-zero weights from the second eigenvector represent the black region. The remaining region is all the pixels that have not been clustered by the first two eigenvectors. The final segmentation is shown on the right side of the top row of figure 6. Figure 6: Top row) Original and segmented image. Middle row) Affinity matrix and
eigenvalue distribution. Bottom row) Eigenvectors used for clustering.
Next, we attempt to segment four regions from the synthetic image of three white shapes on a black background. It is straightforward to
find two clusters – one representing the shapes, and one for the background – if we use intensity affinity only. However,
since we desire four regions, we also include distance measurements. We note that even without noise, we are unable to cluster the
shapes differently. The top row of figure 7 shows the noise free image and resulting segmentation into two clusters.
The reason for the poor segmentation is clear from the eigenvectors, shown on the bottom row of figure 7. We see that all shapes are clustered by the first eigenvector. By investigating the next two eigenvectors, we find many non-zero weights. However, all these weights represent pixels already clustered by the first eigenvalue. Figure 7: Top row) Original and segmented image. Middle row) Affinity matrix and
eigenvalue distribution. Bottom row) Eigenvectors used for clustering. Normalized Graph Cuts
Next, we revisit the synthetic image of three regions with noise using the normalized graph cuts framework, shown on the left side of the top
row of figure 8. Again we use affinity by both distance and intensity to increase the robustness of the segmentation. The bottom row of
figure 8 shows the eigenvectors used for clustering. We note that we only threshold on positive weights, whereas we used an absolute value
thresholding scheme in the previous section. It appears that sign information is important in the normalized graph cut framework. The
right hand side of the top row of figure 8 shows the final segmentation.
Figure 8: Top row) Original and segmented image. Middle row) Affinity matrix and
eigenvalue distribution. Bottom row) Eigenvectors used for clustering.
Finally, we revisit the difficult example that the graph cut framework failed to properly segment. Here, we add Gaussian noise of mean
zero and variance 0.1, shown on the left side of the top row of figure 9. Again, we use a combination of distance and
intensity measurements to attempt to segment four regions. The bottom row of figure 9 shows the eigenvectors obtained from the normalized
graph cut method. The first eigenvector is an interesting case, as we had to use a very small threshold in order to cluster the circle.
As with the previous example, we used all positive thresholds and did not consider negative weights. The final segmentation is shown on
the right side of the top row in figure 9. It appears that the normalized graph cut method is an improvement over graph cuts.
Figure 9: Top row) Original and segmented image. Middle row) Affinity matrix and
eigenvalue distribution. Bottom row) Eigenvectors used for clustering. |
||||
|
Conclusion
The graph cut framework provides a different interpretation for the problem of image segmentation. By representing the image as a connected
graph, where edge weights represent similarity between pixels, we can obtain a segmentation by splitting the graph. We have looked at two
methods that attempt to determine good places to cut the graph. Both methods work well for simple examples, however we saw that graph cuts
was unable to differentiate between shapes of the same intensity, though they were separated by a significant distance. In this case, the
normalized graph cut method is an improvement.
There are also a lot of small engineering decisions to be made in implementation. One such decision is to whether or not to automatically determine the number of clusters. For our implementation, we required this knowledge be known apriori. However, it is sometimes possible to determine the number of clusters from the eigenvalues. Another possibility is to keep iterating over eigenvectors until all points have been clustered. In our experiments, however, we noted that no solution to this problem is perfect. Another engineering decision is the choice of thresholds for both graph cuts and normalized graph cuts. Our implementation allowed the user to provide thresholds for each cluster they desire. We found that a single threshold was not sufficient, as each eigenvector required a specific threshold for accurate clustering. It is also possible to automatically determine thresholds, though we did not find a scheme that worked well in every case. A better option would be to use yet another clustering technique, such as k-means, to find the best cluster of weights of an eigenvector. However, at this point we have a nested clustering problem, providing evidence that other methods may be superior. Perhaps other affinity measures, such as texture, provide better results for more complex images. In fact, in order to obtain reasonable segmentation from a real image, we would need to add texture as one of our affinity measures. However for distance and intensity measurements, it seems that other methods are preferable. For example, the expectation maximization algorithm with a spatial prior, such as the icing model, provides a framework where both intensity and spatial location are used for segmentation. |
||||
|
Last Updated: April 30, 2010
|
||||