CS 7960 - Project 1

Scale-Space Selection

Name: Jonathan Bronson
Date: Feb 2, 2010

Table of Contents

  1. Introduction
  2. Scale-Space
  3. Laplacian Filtering
  4. Blob Detection
  5. Results
  6. Conclusion
             

1. Introduction

Overview
The goal of this project is to implement a method that finds the centers and sizes of blobby objects via scale-space analysis. The project first requires building a scale-space representation of the image, followed by detecting blob responses and their local maxima in scale space. The following sections describe each step of the algorithm in more detail. Section 2 describes the construction of the Gaussian Scale-Space. Section 3 describes the use of the Laplacian for generating blob responses. Section 4 details how a maximum response is found across Scale-Space. Section 5 provides the results of my experiments, and finally Section 6 wraps up this report with some conclusions.

Design Choices

For this project and future projects in this course, I have decided to build a comprehensive framework application. This application allows me to easily open and save images using a GUI, and to quickly access all previous projects through their respective toolboxes. Using an MDI display model allows me or another user to have any number of images open at one time and to compare the results of methods more easily.

My implementation is coded in C++, using Qt for the GUI and VISPack for image handling. I use CMake to automate the build process, and except for some limitations of VISPack, the application is completely cross-platform. The software has been successfully built on OpenSUSE, Ubuntu, and Windows 7.



2. Scale-Space

A Gaussian Scale-Space is three-dimensional. It consists of a series of images with increasing aperture size, such that as we ascend the layers, we are effectively using a coarser scale of the image. We can generate an arbitrary number of sample layers, but there is little benefit in sampling multiple times within the same order of magnitude. The user interface allows the user to specify the minimum and maximum scales, as well as the sampling interval, on a logarithmic scale. For each scale t, the corresponding Gaussian kernel has size:

σ = e^t
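This sampling scheme can be sketched minimally as follows; the function name and the tMin/tMax/dt parameters are my own, standing in for the values entered in the GUI:

```cpp
#include <cmath>
#include <vector>

// Sample sigma logarithmically: sigma = e^t for
// t = tMin, tMin + dt, ..., tMax (inclusive).
std::vector<double> scaleSigmas(double tMin, double tMax, double dt) {
    std::vector<double> sigmas;
    for (double t = tMin; t <= tMax + 1e-9; t += dt)
        sigmas.push_back(std::exp(t));
    return sigmas;
}
```

With tMin = 0.0, tMax = 2.0, and dt = 0.5, this yields the five sigmas used for the sample images in this section.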

There are many possible ways of implementing a Gaussian Scale-Space. I chose to implement several approaches and kept the one which proved superior to the others. My first approach was to generate a unique Gaussian kernel for each desired sigma. This is easy to implement, but becomes slow and wasteful as the scale increases. An alternative approach was to successively convolve the current image with a Gaussian of constant sigma. When the image reached the next desired scale, that image was saved as the scale's sample. This method was less wasteful, but altogether not any faster.

[Figure: the input image and its Gaussian scale-space samples at t = 0.0, 0.5, 1.0, 1.5, and 2.0]

The approach I ultimately favored is a combination of the first two. I first compute the Gaussian of the smallest layer. Then, I take advantage of the fact that convolving two Gaussians results in a third Gaussian with sigma equal to the square root of the sum of squares of the input sigmas. Knowing what the next level's sigma should be, and what the previous level's sigma is, I calculate the sigma of the Gaussian necessary to get there in a single convolution. This relationship can be expressed as

σi+1² = σi² + σreq²

where σreq is the size of the Gaussian we need to convolve level i with to obtain level i+1. In this way, the minimum amount of work is required to compute the whole Scale-Space, and no results are wasted. In practice, I found this third approach to be significantly faster than the other two.
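A minimal sketch of this incremental step (the function and parameter names are my own):

```cpp
#include <cmath>

// Sigma of the Gaussian needed to carry level i (sigmaPrev) up to
// level i+1 (sigmaNext), solving sigmaNext^2 = sigmaPrev^2 + sigmaReq^2.
double requiredSigma(double sigmaPrev, double sigmaNext) {
    return std::sqrt(sigmaNext * sigmaNext - sigmaPrev * sigmaPrev);
}
```

For example, carrying a level at σ = 3 up to σ = 5 requires a single convolution with a Gaussian of σ = 4.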



3. Laplacian Filtering

In order to detect a response from blobby objects, we take the Laplacian of each image. There are multiple options for this step as well. Given the raw input image, we could explicitly compute the Laplacian of the Gaussian as a kernel. The downside to this approach is that we can no longer use the optimized routine to build the Scale-Space. As an alternative, I simply convolve the Gaussian Scale-Space images with the discrete 3x3 Laplacian kernel:

   0   1   0
   1  -4   1
   0   1   0

There is one catch to computing these Laplacians. Gaussian kernels smear out a signal in a way that reduces peak intensities in proportion to σ². This means that as we increase the scale, each image gets progressively darker, and comparison across scales becomes impossible. To account for this, we can simply multiply each scaled image by a factor of σ², corresponding to the σ used to generate that level. Below are the same images from the previous section, with the discrete Laplacian kernel applied.


[Figure: the input image and its Laplacian-filtered scale-space samples at t = 0.0, 0.5, 1.0, 1.5, and 2.0]



4. Blob Detection

Examining the Laplacian-convolved images from the previous section, it should be clear that at a certain scale, the center of a blob will produce a local maximum in signal response. Since we have normalized each level by the factor σ², the ideal response will also be a local maximum in Scale-Space. To find these maxima, I scan through the pixels and only record values whose 26 neighbors are all lower. This is equivalent to making sure the pixel is the maximum of the 3x3x3 cube that surrounds it in Scale-Space. If a maximum is found, the σ corresponding to the scale the pixel came from indicates the size of the blob. That is, its radius is approximated as:

r = 1.5*σ

To speed up the search for maxima, I short-circuit the maximum comparison if the intensity level of the pixel is below some threshold. The reasoning is that if this is truly a maximum, it will have a fairly large signal response. By immediately disregarding values that don't, the computation is sped up and erroneous detections are avoided. Finally, for evaluation, a red circle is drawn at the peak signal location with a radius r as defined above. The result from the single blob image above is shown below:
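A minimal sketch of this maximum search, assuming the scale-space stack is a vector of row-major double images (all names here are my own):

```cpp
#include <vector>

// One detected blob: pixel location plus the level index whose sigma
// gives the radius r = 1.5 * sigma.
struct Blob { int x, y, level; };

// Scan a scale-space stack of w x h images for pixels that exceed
// `threshold` and are strictly greater than all 26 neighbors in the
// surrounding 3x3x3 cube. Boundary pixels and levels are skipped.
std::vector<Blob> findMaxima(const std::vector<std::vector<double>>& stack,
                             int w, int h, double threshold) {
    std::vector<Blob> blobs;
    for (int s = 1; s + 1 < (int)stack.size(); ++s)
        for (int y = 1; y < h - 1; ++y)
            for (int x = 1; x < w - 1; ++x) {
                double v = stack[s][y * w + x];
                if (v < threshold) continue;  // short-circuit weak responses
                bool isMax = true;
                for (int ds = -1; ds <= 1 && isMax; ++ds)
                    for (int dy = -1; dy <= 1 && isMax; ++dy)
                        for (int dx = -1; dx <= 1 && isMax; ++dx) {
                            if (ds == 0 && dy == 0 && dx == 0) continue;
                            if (stack[s + ds][(y + dy) * w + x + dx] >= v)
                                isMax = false;
                        }
                if (isMax) blobs.push_back({x, y, s});
            }
    return blobs;
}
```

The threshold test comes before the 26 comparisons, so most pixels are rejected with a single read.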


5. Results

One Blob

Multiple Blobs

Sunflower 1

t = [3.0; 0.02; 4.0]

Sunflower 2



6. Conclusion

This project has demonstrated the inherent usefulness of looking at an image in Scale-Space rather than just the smallest scale possible. Even using only simple filters, we are able to generate powerful algorithms that would otherwise require exhaustive computation. This project has also highlighted some of the challenges inherent to Scale-Space methods. Just as shapes can appear ambiguous to humans, nearby objects blur together at larger scales, creating (false?) maxima.

The image above is an example of such a case. As we climb to a Gaussian scale of t=3.0, the space looks like a single object. This is similar to how ink disperses on a sheet of printer paper. This in turn forms a local maximum, and the detector labels the region as a single large blob.

In addition to blobs blurring together, non-circular blobs can produce responses of different intensities. Without thresholding, dark patches which have no definitive shape may register as blobs with a weak response. Below are the results of thresholding maxima below 0.5 (left) and thresholding maxima below 0.75 (right):

Another complication is eccentric blobs. In the Sunflower1 image, the flowers near the top have very elliptical blobs. The filter, however, cannot know this, and instead we draw a circle that encompasses the ellipse. The issue is more severe in the Sunflower2 image. The hat the child is wearing does not even form an elliptical blob, since the child's face is in the center. Instead, the detector finds several blobs around the rim of the hat. Finally, as we sample larger and larger scales, the cost becomes computationally prohibitive. For very high resolution images, it would be worth considering the use of image pyramids or some other scheme to reduce image resolution at higher scales.
