| Name: | Jonathan Bronson |
| Date: | Feb 2, 2010 |
Overview
The goal of this project is the implementation of a method that
tries to find the centers and sizes of various blobby objects via
scale-space analysis. The project first requires the development
of a scale-space representation of the image, followed by the
detection of blob responses and their local maxima in scale space.
The next sections of this report go into more detail on each
step of the algorithm. Section 2 describes the construction of the
Gaussian scale space. Section 3 describes the use of the Laplacian
for generating blob responses. Section 4 goes into the details of
finding a maximum response across scale space. Section 5 provides
the results of my experiments for this project, and finally Section 6
wraps up this report with some conclusions.
Design Choices
For this project and future projects for this course, I have decided
to build a comprehensive framework application. This application will
allow me to easily open and save images using a GUI, and quickly access
all previous projects through their respective toolboxes. Using an MDI
display model allows me or another user to have any number of
images open at one time, and to compare the results of methods more easily.
My implementation is coded in C++, using Qt for the GUI and VISPack for image handling. I use CMake to automate the build process, and except for some limitations of VISPack, the application is completely cross-platform. The software has been successfully built on OpenSUSE, Ubuntu, and Windows 7.
There are many possible ways of implementing a Gaussian scale space. I implemented several approaches and kept the one that proved superior to the others. My first approach was to generate a unique Gaussian kernel for each desired sigma. This is easy to implement, but it is slow and wasteful as the scale increases. An alternative approach was to successively convolve the current image with a Gaussian of constant sigma. When the image reached the next desired scale, I saved that image as that scale's sample. This method was less wasteful, but altogether not any faster.
[Figure: the input image and its Gaussian scale-space levels at t = 0.0, 0.5, 1.0, 1.5, and 2.0]
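The approach I ultimately favored is a combination of the first two. I first compute the Gaussian of the smallest layer. Then, I take advantage of the fact that convolving two Gaussians results in a third Gaussian whose sigma equals the square root of the sum of the squares of the input sigmas. Knowing what the next level's sigma should be and what the previous level's sigma is, I calculate the sigma of the Gaussian needed to get there in a single convolution. This relationship can be expressed as

$$\sigma_{\text{step}} = \sqrt{\sigma_{\text{next}}^{2} - \sigma_{\text{prev}}^{2}}$$

where $\sigma_{\text{prev}}$ and $\sigma_{\text{next}}$ are the sigmas of the previous and next levels, and $\sigma_{\text{step}}$ is the sigma of the Gaussian applied between them.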
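The sigma needed for each incremental convolution follows directly from the sum-of-squares property described above. The following is a minimal sketch of that calculation together with a normalized 1D Gaussian kernel builder. The function names `incrementalSigma` and `gaussianKernel1D` and the kernel radius of three sigmas are assumptions made for illustration; they are not taken from the project's source or from VISPack.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Sigma of the Gaussian that takes an image already blurred to sigmaPrev
// up to sigmaNext in a single convolution (hypothetical helper name).
double incrementalSigma(double sigmaPrev, double sigmaNext) {
    if (sigmaNext < sigmaPrev) {
        throw std::invalid_argument("sigmaNext must be >= sigmaPrev");
    }
    // sigmaNext^2 = sigmaPrev^2 + sigmaStep^2, solved for sigmaStep.
    return std::sqrt(sigmaNext * sigmaNext - sigmaPrev * sigmaPrev);
}

// Normalized 1D Gaussian kernel. The radius of ceil(3 * sigma) is an
// assumption; the report does not state the kernel width it used.
std::vector<double> gaussianKernel1D(double sigma) {
    if (sigma <= 0.0) {
        return {1.0};  // degenerate case: identity kernel
    }
    const int radius = static_cast<int>(std::ceil(3.0 * sigma));
    std::vector<double> kernel(2 * radius + 1);
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        const double v = std::exp(-(i * i) / (2.0 * sigma * sigma));
        kernel[i + radius] = v;
        sum += v;
    }
    for (double& v : kernel) {
        v /= sum;  // make the weights sum to 1
    }
    return kernel;
}
```

For example, moving from a level at sigma 3 to a level at sigma 5 requires a single convolution with `gaussianKernel1D(incrementalSigma(3.0, 5.0))`, that is, a Gaussian of sigma 4. Because the Gaussian is separable, the same 1D kernel can be applied along rows and then along columns.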
| 0 | 1 | 0 |
| 1 | -4 | 1 |
| 0 | 1 | 0 |
There is one catch to computing these Laplacians. Gaussian kernels smear out a signal in a way that reduces peak intensities by a factor proportional to σ². This means that as we increase the scale, each image gets progressively darker, and comparison across scales becomes impossible. To account for this, we can simply multiply each scaled image by a factor of σ², using the σ that generated that level. Below are the same images from the previous section, with the discrete Laplacian kernel applied.
[Figure: the input image and its Laplacian responses at t = 0.0, 0.5, 1.0, 1.5, and 2.0]
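The discrete Laplacian and the σ² correction described above can be combined in a single pass over a blurred level. The sketch below assumes a hypothetical row-major `Image` struct as a stand-in for the VISPack image type, and it leaves border pixels at zero, which is also an assumption since the report does not describe its border handling.

```cpp
#include <cstddef>
#include <vector>

// Minimal row-major grayscale image; a hypothetical stand-in for the
// VISPack image type used in the actual project.
struct Image {
    int width;
    int height;
    std::vector<float> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    float& at(int x, int y) {
        return data[static_cast<std::size_t>(y) * width + x];
    }
    float at(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Apply the 4-neighbor discrete Laplacian kernel (0 1 0 / 1 -4 1 / 0 1 0)
// to a Gaussian-blurred level and multiply the response by sigma^2.
// Border pixels are left at zero (an assumption).
Image scaleNormalizedLaplacian(const Image& blurred, double sigma) {
    Image out(blurred.width, blurred.height);
    const float norm = static_cast<float>(sigma * sigma);
    for (int y = 1; y + 1 < blurred.height; ++y) {
        for (int x = 1; x + 1 < blurred.width; ++x) {
            const float lap = blurred.at(x - 1, y) + blurred.at(x + 1, y)
                            + blurred.at(x, y - 1) + blurred.at(x, y + 1)
                            - 4.0f * blurred.at(x, y);
            out.at(x, y) = norm * lap;
        }
    }
    return out;
}
```

Calling `scaleNormalizedLaplacian(level, sigma)` once per scale-space level, with the σ used to generate that level, yields responses that can be compared across scales.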
To speed up the search for maxima, I short-circuit the maximum comparison if the intensity level of the pixel is below some threshold. The reasoning is that a true maximum will have a fairly large signal response. By immediately disregarding values that don't, the computation is sped up and erroneous detections are avoided. Finally, for evaluation, a red circle is drawn at the peak signal location with a radius r as defined above. The result from the single blob image above is shown below:
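The thresholded search described above can be sketched as follows. This is a minimal illustration, not the project's code: the `Layer` and `Peak` structs, the function name `findScaleSpaceMaxima`, the strict 3x3x3 neighborhood in (x, y, scale), and the skipping of the lowest and highest scales and of image borders are all assumptions. All layers are assumed to have the same dimensions.

```cpp
#include <cstddef>
#include <vector>

// One response layer per scale, stored row-major (hypothetical layout).
struct Layer {
    int width;
    int height;
    std::vector<float> data;

    float at(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// A detected maximum: pixel position, scale index, and response value.
struct Peak {
    int x;
    int y;
    int scale;
    float response;
};

// Find strict local maxima over the 26 neighbors in (x, y, scale).
// Pixels below the threshold are rejected before any neighbor is read.
std::vector<Peak> findScaleSpaceMaxima(const std::vector<Layer>& layers,
                                       float threshold) {
    std::vector<Peak> peaks;
    const int n = static_cast<int>(layers.size());
    for (int s = 1; s + 1 < n; ++s) {
        const Layer& cur = layers[s];
        for (int y = 1; y + 1 < cur.height; ++y) {
            for (int x = 1; x + 1 < cur.width; ++x) {
                const float v = cur.at(x, y);
                if (v < threshold) continue;  // short-circuit weak responses
                bool isMax = true;
                for (int ds = -1; ds <= 1 && isMax; ++ds) {
                    for (int dy = -1; dy <= 1 && isMax; ++dy) {
                        for (int dx = -1; dx <= 1 && isMax; ++dx) {
                            if (ds == 0 && dy == 0 && dx == 0) continue;
                            if (layers[s + ds].at(x + dx, y + dy) >= v) {
                                isMax = false;
                            }
                        }
                    }
                }
                if (isMax) peaks.push_back({x, y, s, v});
            }
        }
    }
    return peaks;
}
```

Each returned `Peak` carries the scale index at which the maximum occurred, which is what a drawing routine would need to choose the circle radius.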
One Blob
Multiple Blobs
Sunflower 1
t = [3.0; 0.02; 4.0]
Sunflower 2
The image above is an example of such a case. As we climb to a Gaussian scale of t = 3.0, the space looks like a single object. This is similar to how ink disperses on a sheet of printer paper. This in turn forms a local maximum, and the detector labels the region as a single large blob.
In addition to blobs blurring together, non-circular blobs can produce responses of different intensities. Without thresholding, dark patches that have no definite shape may register as blobs of weak response. The results of discarding maxima below 0.5 (left) and below 0.75 (right) are shown below:
Another complication is eccentric blobs. In the Sunflower 1 image, the flowers near the top form very elliptical blobs. The filter, however, cannot know this, so we instead draw a circle that encompasses the ellipse. The issue is more severe in the Sunflower 2 image. The hat the child is wearing does not form even an elliptical blob, since the child's face is in the center. Instead, the detector finds several blobs around the rim of the hat. Finally, as we sample larger and larger scales, the cost becomes computationally prohibitive. For very high resolution images, it would be worth considering image pyramids or some other scheme to reduce image resolution at higher scales.