| Name: | Jonathan Bronson |
| Date: | April 28, 2010 |
Overview
The goal of this project is to implement and understand how the
Graph Cut and Normalized Graph Cut algorithms can be used for image
segmentation. Further, we perform an in-depth analysis of the qualitative
and quantitative differences between the two algorithms. A series of test
images is created to help facilitate this process.
Design Choices
Providing an algorithm for segmentation immediately raises the question: what do we mean by segmentation? In its purest sense, segmentation could simply mean dividing something into two or more components, or segments. In computer science, we typically want to segment based on some measure, or on how the segmentation affects the measure of the segments. In computer vision, we mean something even more precise. We start with the assumption that the image we are segmenting is already composed of meaningful components. Given adequate measures, we would like to segment the image into exactly those pieces of which it is intrinsically composed. For instance, we may wish to divide the image by color, texture, or shape. Chapter 16 of Trucco and Verri's book "Introductory Techniques for 3-D Computer Vision" provides further thought on the interpretation of segmentations.
Segmentation can also be seen as a form of clustering. If we decompose an image into pixels, we would like to group pixels that all come from similar objects. This would lead to our desired segmentation. There exist many different clustering techniques, K-means being perhaps the best known. In this report, we will be examining a spectral graph approach to segmentation.
A graph G is a set of vertices V connected by a set of edges E. The set notation for this is G = (V, E). One way to refer to an edge is by the pair of vertices that the edge connects. Thus, we can write the set of edges as E = {(a, b) : a, b in V}. If edge (a, b) is unique and distinct from edge (b, a), we call the edges 'directed edges', and the graph a 'Directed Graph'. Conversely, if (a, b) is equivalent to (b, a), we call the graph an 'Undirected Graph'. Further, we can attach weights to the edges of a graph to form a 'Weighted Graph'. Weighted graphs are incredibly powerful and can be conveniently represented in matrix notation. A graph of n vertices can be represented as an n x n matrix, in which the (i, j)th element holds the weight of the edge between vertex i and vertex j. A zero is used to indicate either no edge or a zero weight, which are for all practical purposes equivalent. Thus, an Unweighted Graph can be represented strictly with ones and zeros, indicating whether an edge is present or not. Note that the matrix of an undirected graph is always symmetric.
If our goal is to segment an image by clustering pixels, we need a measure of how similar or related pixels are. Only then can we hope to cluster in a meaningful way. There are many affinity measures used in Image Processing. Some of the most common are Intensity, Distance, Color, and Texture. For the purposes of this project, we will only utilize the Intensity and Distance measures between pixels.
Intensity: Affinity by intensity is probably the most obvious measure we can use. In some sense it is an approximation to texture, which is an extremely powerful measure for segmentation. To use it properly, we want pixels with similar intensities to have a large affinity, and pixels of higher contrast to have a small affinity. Trucco and Verri recommend the exponential form:

    aff(i, j) = exp( -(I(i) - I(j))^2 / (2 * sigma_I^2) )
Distance: Affinity by distance is slightly less intuitive than intensity. In natural images, the assumption is that objects that are physically connected will be nearby in the image. Thus, we would like pixels near one another to have a high affinity. A similar exponential form can be used for this measure:

    aff(i, j) = exp( -||x_i - x_j||^2 / (2 * sigma_d^2) )
In the above affinity equations, the two sigma terms, sigma_I and sigma_d, are scaling factors the user can tune for any given image. This helps account for the fact that smoothness and locality may be less valid assumptions in some images.
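As a concrete sketch of the two affinity measures above, the following builds the full pairwise affinity matrix for a small grayscale image. This is an illustration, not the project's actual code; the function name and the product combination of the two terms follow the description in this report, and pixels are flattened in row-major order as discussed later.

```python
import numpy as np

def affinity_matrix(image, sigma_d, sigma_i):
    """Combined distance/intensity affinity for an n x m grayscale image.

    Pixels are flattened in row-major order. The two exponential affinity
    terms are multiplied together, as this report does."""
    n, m = image.shape
    ys, xs = np.mgrid[0:n, 0:m]
    coords = np.column_stack([ys.ravel(), xs.ravel()]).astype(float)
    vals = image.ravel().astype(float)

    # squared Euclidean distance between every pair of pixel positions
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    # squared intensity difference between every pair of pixels
    i2 = (vals[:, None] - vals[None, :]) ** 2

    return np.exp(-d2 / (2 * sigma_d**2)) * np.exp(-i2 / (2 * sigma_i**2))
```

Note that this dense construction is only practical for tiny images like the ones used in this report, since the matrix has (n*m)^2 entries.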
The goal of the graph cut algorithm is simple and intuitive. Given a weighted graph, we would like to remove a set of edges that separates the graph into two or more connected components. Further, we would like to choose the cut so that the total weight of the removed edges is as low as possible, leaving the weights within the connected components as high as possible. This is a difficult problem, and in fact NP-complete. However, we can still find a useful solution using approximation techniques. The method we will use is a spectral decomposition.
The spectral interpretation of the graph cut offers a very interesting perspective, and gives us a continuous numerical approach to solving for the cut. The idea is that each pixel has an ideal weight that tells how much it 'belongs' in one cluster or another. By constraining the vector w of these weights to be of unit length, the following relationship holds:

    A w = lambda w

where A is the affinity matrix.
This powerful relationship tells us that the eigenvectors of the affinity
matrix make up the ideal weights, at least in a continuous sense.
This is the closest continuous analogue of a discrete cut. However, since we
can only perform discrete cuts, we must ultimately threshold these weights
in some intelligent way if we want a meaningful result.
Graph Cut Algorithm
1. Construct the (n*m)x(n*m) affinity matrix of the pixels in an nxm image.
2. Perform an eigenvalue decomposition.
3. For each eigenvector corresponding to one of the k largest eigenvalues, threshold it and assign its pixels to cluster k.
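The listing above can be sketched in code. This is a minimal illustration under stated assumptions, not the project's implementation: the fractional threshold `tau` is an invented stand-in for the heuristic threshold discussed below, and pixels not claimed by any cluster are left labeled -1.

```python
import numpy as np

def naive_graph_cut(W, k, tau=0.5):
    """Naive spectral cut: take eigenvectors of the affinity matrix W for
    the k largest eigenvalues and threshold each one (a heuristic)."""
    evals, evecs = np.linalg.eigh(W)          # W is symmetric
    order = np.argsort(evals)[::-1]           # largest eigenvalues first
    labels = np.full(W.shape[0], -1)
    for c in range(k):
        v = np.abs(evecs[:, order[c]])
        members = (v > tau * v.max()) & (labels == -1)
        labels[members] = c                   # assign unclaimed pixels
    return labels
```

On a block-structured affinity matrix, each leading eigenvector is concentrated on one block, so the thresholding recovers the blocks as clusters.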
The above formulation of the naive Graph Cut algorithm has a fatal flaw. A threshold must be picked, and determining this threshold is at best a heuristic and sometimes simply wrong. If the eigenvalues of the affinity blocks are very similar to one another, the corresponding eigenvectors will not form distinct clusters. What is missing is a notion of how the choice of cut affects the graph as a whole. The Normalized Graph Cut addresses this problem by finding a cut whose cost is normalized by the total affinity within each group the cut creates. This cost can be written formally as:

    Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)
Here, cut(A, B) is the sum of the weights of the edges removed by the cut, and assoc(A, V) is the total weight of the edges connecting nodes in the newly formed cluster A to all nodes in the graph, including those remaining within A itself. The same holds for B. This new formulation is harder to solve than the previous one, and so again we resort to an approximation algorithm. This approximation is again based on an optimization, this time of the criterion:

    min_y ( y^T (D - W) y ) / ( y^T D y )

where W is the affinity matrix, D is the diagonal degree matrix with D_ii = sum_j W_ij, and y is an indicator vector whose entries take one of two values and which satisfies y^T D 1 = 0.
With this new normalized criterion, we would like to find the vector y that minimizes it. It can be shown that the solution y is also a solution to the generalized eigenvalue system:

    (D - W) y = lambda D y
There are different strategies for solving generalized eigenvalue systems, depending on what one knows about the matrices. Since we know matrix D is both diagonal and invertible, we have a numerically stable approach by solving the regular eigenvalue system:

    D^(-1/2) (D - W) D^(-1/2) z = lambda z,   where z = D^(1/2) y
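Assuming D is diagonal with strictly positive entries (true whenever every pixel has some affinity to at least one other pixel), the substitution above can be sketched as follows; the function name is illustrative, not from the project code.

```python
import numpy as np

def normalized_cut_vectors(W):
    """Solve (D - W) y = lambda * D y via the equivalent symmetric system
    D^(-1/2) (D - W) D^(-1/2) z = lambda * z, then map back with
    y = D^(-1/2) z (equivalently, z = D^(1/2) y)."""
    d = W.sum(axis=1)                        # degree of each node
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # D is diagonal and invertible
    L_sym = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
    evals, Z = np.linalg.eigh(L_sym)         # eigenvalues in ascending order
    Y = D_inv_sqrt @ Z                       # generalized eigenvectors
    return evals, Y
```

The columns of Y solve the generalized system directly, and the smallest eigenvalue is always zero (with a constant eigenvector), which is why the second-smallest eigenvector carries the first meaningful cut.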
As a final step, similar to the naive Graph Cuts algorithm, we must
threshold the continuous values of this vector to obtain a discrete
cut of our graph. Again, strategies for this discretization can
radically change the resulting segmentations. For this project, I
utilize the k-means clustering algorithm on each eigenvector under
inspection. The two clusters are initialized to the minimum and maximum
values of the vector and then run to convergence. I use the midpoint
between the two cluster centers as the threshold for deciding which
pixels fall within the cluster.
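A minimal sketch of this thresholding scheme, assuming plain 1-D two-center k-means; it is not necessarily identical to the code used for the experiments.

```python
import numpy as np

def threshold_eigenvector(v, iters=100):
    """Two-cluster 1-D k-means on an eigenvector, initialized at its min
    and max, then thresholded at the midpoint of the final centers."""
    c = np.array([v.min(), v.max()], dtype=float)
    for _ in range(iters):
        # assign each component to its nearest center
        assign = np.abs(v[:, None] - c[None, :]).argmin(axis=1)
        # recompute centers (keep the old center if a cluster empties)
        new_c = np.array([v[assign == j].mean() if np.any(assign == j)
                          else c[j] for j in (0, 1)])
        if np.allclose(new_c, c):
            break
        c = new_c
    return v > (c[0] + c[1]) / 2.0   # boolean cluster-membership mask
```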
Normalized Graph Cut Algorithm
1. Construct the (n*m)x(n*m) affinity matrix of the pixels in an nxm image.
2. Solve the generalized eigenvalue system.
3. For each eigenvector corresponding to one of the k smallest eigenvalues, threshold it and assign its pixels to cluster k.
With the two affinity measures, intensity and distance, two types of tests are appropriate. The first is the ability to segment pixels of differing intensities. The second is the ability to segment pixels that are separated into distinct clusters by some distance. Additionally, we would like to see whether the segmentations in these tests are robust to noise. Since our image is composed of a regular grid of points, there is no meaningful segmentation by distance alone, so that test is omitted. Finally, we would like to compare the results of these segmentations for the regular Graph Cut algorithm with those of the Normalized Graph Cut.
Test 1: Intensity Regions
This first test evaluates the ability of the Graph Cut algorithms to properly segment regions of differing intensity. Strictly speaking, the distance term is not needed as an affinity measure here. However, both distance and intensity are used on this image to evaluate the difficulty of tuning the available parameters. For the Normalized Graph Cut, the images of the affinity matrices were scaled into a viewable range. This makes some images appear washed out; without this mapping, all of the affinity matrices would look almost entirely black.
| Graph Cut | | | |
|---|---|---|---|
| Input Image | d=10, i=0.5 | d=20, i=0.5 | d=40, i=0.5 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvalue Plots: | *(image)* | *(image)* | *(image)* |
The figures above show the effects of poorly chosen sigma parameters. The distance sigma increases from left to right, causing the distance affinity to have less and less effect. This can clearly be seen in the plots of the eigenvalues, as the values begin to compress into linear strips. The patterns appear as waves in the leftmost images because pixels are ordered in row-major order: pixels toward the edge of the image are far from most other pixels, while pixels near the center have a much shorter average distance.
| Normalized Graph Cut | | | |
|---|---|---|---|
| Input Image | d=3, i=0.01 | d=5, i=0.01 | d=7, i=0.01 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
Tuning parameters for the Normalized Graph Cut seems easier in some sense. A much broader range of values is acceptable, and even when bad parameters are given, the effects are not as detrimental. This is evident in the eigenvector plots, as they are generally more linear. The third eigenvector is misleading, since many of its components are already accounted for in the first and second vectors. Additionally, the first partitioning split cuts through this segment, forcing both negative and positive values into the vector.
Test 2: Intensity Regions (20% Noise)
This test is designed to evaluate the robustness of the Graph Cuts to noisy images, as well as to see how noise affects the tuning of the sigma parameters. The image is the same image from Test 1, with 20% correlated noise added.
| Graph Cut | | | |
|---|---|---|---|
| Input Image | d=60, i=0.5 | d=30, i=0.5 | d=30, i=0.4 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
The graph cut algorithm is surprisingly robust to noise. However, the user must account for noise by being more careful in parameter selection. In particular, the intensity sigma must be relaxed to account for the variation in intensities. Comparing the eigenvector plots to those of the noiseless version of the image, we can see the noise spreading the plots out.
| Normalized Graph Cut | | | |
|---|---|---|---|
| Input Image | d=1, i=0.03 | d=3, i=0.03 | d=5, i=0.03 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
It was clear from parameter exploration that the Normalized Graph Cut is much more robust to noise: a much larger set of parameter choices gave correct segmentations. Inspecting the eigenvector plots, we see that they are still more linear than the naive Graph Cut's. Gaining insight from the affinity matrix for this image is difficult. It is so faint because we are able to set the distance sigma much lower than for the naive Graph Cut, thanks to the Normalized Graph Cut's increased robustness.
Test 3: Shapes
This third test incorporates the distance affinity in a meaningful way. We could use an image of shapes of varying intensity, but that would be too easy, since intensity differences alone could separate them. Thus, only a combination of intensity affinity and distance affinity can properly segment this test image.
| Graph Cut | | | |
|---|---|---|---|
| Input Image | d=100, i=0.5 | d=50, i=0.4 | d=15, i=0.3 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
Unfortunately, the Graph Cut algorithm as implemented for this project was unsuccessful at segmenting this image the way we wanted. Segmenting the three shapes from the background was exceedingly easy; separating the distinct shapes from one another, however, proved impossible. It is unclear whether this failure is simply a result of my thresholding scheme. The practice of multiplying affinities together might also be part of the problem. It seems more intuitive that distinct affinities should be combined as a linear combination, as the distance between two pixels should have no bearing on the difference in color between them.
| Normalized Graph Cut | | | |
|---|---|---|---|
| Input Image | d=300, i=0.42 | d=55, i=0.37 | d=25, i=0.35 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
Where the naive Graph Cut algorithm failed, the Normalized Graph Cut had no problems. In fact, it behaved exactly as one would expect. With the distance affinity weighted weakly, the three shapes are clustered as one item; as we slowly decrease the sigma for distance, we gradually segment more and more finely. Inspecting the eigenvector plots of both of these trials reveals that thresholding for such images can be extremely challenging. The use of k-means surely made this job easier, and yet further improvements can be made.
Test 4: Shapes (20% Noise)
This final test follows the same theme as Test 2. We add 20% correlated noise to test image 3 and evaluate the ability, and the difficulty, of picking sigmas that properly segment the three shapes from the background.
| Graph Cut | | | |
|---|---|---|---|
| Input Image | d=100, i=5 | d=50, i=4 | d=15, i=3 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
Since the noiseless image was already too much for the naive Graph Cut to handle, we cannot expect it to do any better on the noisy image. However, at least the segmentation of the shapes from the background can still be achieved successfully in the presence of noise.
| Normalized Graph Cut | | | |
|---|---|---|---|
| Input Image | d=25, i=0.35 | d=25, i=0.4 | d=35, i=0.5 |
| *(image)* | *(image)* | *(image)* | *(image)* |
| Affinity Matrices: | *(image)* | *(image)* | *(image)* |
| Eigenvector Plots: | *(image)* | *(image)* | *(image)* |
The observation from the eigenvector plots of the previous image, that it should be difficult to segment, seems correct. Adding correlated noise to the image made tuning parameters for the Normalized Graph Cut much more difficult. It is unclear whether any set of parameters can properly segment this image; the values above were the best found by experimentation. Further restricting the distance affinity yields images similar to the background segmentation found in the naive Graph Cut's output.
Conclusions
Clearly, the Graph Cut techniques are powerful and capable of robust segmentation. The ability to use an arbitrary set of feature descriptors for segmentation is a huge advantage. For the simple examples in this report, distance and intensity were good enough; for real images, we would also want some kind of texture affinity. However, texture analysis is a topic in its own right.
One of the limiting factors of spectral techniques is their computational complexity. They require the decomposition of an (n*m)x(n*m) affinity matrix, which will be of high rank for real images. To skirt the issue for this project, we use very small images (the largest being 25x25), which still take several seconds to segment. Surely we could speed this up, but the asymptotic complexity will not change.
Perhaps the biggest drawback of this approach is the need for the scaling factors. From the user's perspective, choosing these parameters is non-intuitive and time-consuming. To be practical, an automated technique to pick parameters would be ideal. However, since there are multiple valid ways to segment complex images, there may be no one correct choice for an automated system. This is not a problem with the algorithm, but rather with the ambiguity inherent in natural images.
The assertion that the affinity measures should be multiplied together seems dubious. As mentioned in the experiments section, the intensity difference of two pixels should have absolutely no bearing on the distance between them. A linear combination of affinities seems more intuitive. The downside of such an approach is the additional scaling parameters, which would compete with the sigma factors. Users may already find parameter tuning difficult, and while a linear combination may allow more successful segmentations, the added complexity may only exacerbate the problem.
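The proposed alternative can be sketched as follows. The mixing weight `alpha` is a hypothetical parameter introduced for this illustration, not something used in this project's experiments.

```python
import numpy as np

def combined_affinity(d2, i2, sigma_d, sigma_i, alpha=0.5):
    """Linear combination of the two affinities, rather than their product.

    d2, i2: pairwise squared-distance and squared-intensity-difference
    matrices. alpha in [0, 1] trades distance against intensity."""
    aff_d = np.exp(-d2 / (2 * sigma_d**2))
    aff_i = np.exp(-i2 / (2 * sigma_i**2))
    return alpha * aff_d + (1 - alpha) * aff_i   # instead of aff_d * aff_i
```

With the product form, either term going to zero kills the affinity entirely; with the sum, a pair of pixels can remain similar under one measure even when the other measure rules them far apart.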
These algorithms are interesting because they do not specify a best way to threshold each eigenvector. In fact, the success or failure of the experiments may lie partly or entirely in the thresholding method used (k-means). An evaluation of different thresholding strategies might shed light on this, and would be worth the time.