| Name: | Jonathan Bronson |
| Date: | March 7, 2010 |
Overview
The goal of this project is to implement Fourier Harmonics for use
in invariant contour representation. After reading this document
it should become clear how an input chain-code representing the
discrete sampling of a contour can be transformed to a basis
that allows us to describe the contour in a way that is invariant
to scale, translation, and rotation. From this representation, we
can also reconstruct the original contour, in this new invariant frame.
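As a rough illustration of the input described above, the sketch below decodes a chain code into contour vertices stored as complex numbers. It assumes an 8-direction Freeman code with direction 0 pointing along +x and directions numbered counterclockwise; the report does not state its convention, and the names `FREEMAN_STEPS` and `decode_chain_code` are hypothetical.

```python
# Assumed convention (not stated in the report): Freeman 8-direction code,
# 0 = +x, numbered counterclockwise. Each step is a complex offset x + iy.
FREEMAN_STEPS = [1 + 0j, 1 + 1j, 0 + 1j, -1 + 1j, -1 + 0j, -1 - 1j, 0 - 1j, 1 - 1j]


def decode_chain_code(code, start=0j):
    """Return the contour vertices visited by the chain code, beginning at `start`."""
    points = [start]
    for direction in code:
        points.append(points[-1] + FREEMAN_STEPS[direction])
    return points


# A 2x2 pixel square traced counterclockwise; a closed contour ends where it began.
square = decode_chain_code([0, 0, 2, 2, 4, 4, 6, 6])
is_closed = square[-1] == square[0]
```

Storing each vertex as a complex number keeps the later Fourier arithmetic compact, since x and y travel together.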
Design Choices
Elliptic Harmonics are a way of representing closed contours as the superposition of ellipses of varying sizes. The coefficients of these ellipses are obtained by means of a Fourier expansion whose index runs from negative infinity to infinity. Reconstructing the contour exactly would require all of these coefficients. However, in practice, relatively few coefficients are needed. The full form of this expansion can be written as:

$$z(t) = \sum_{n=-\infty}^{\infty} c_n \, e^{i 2\pi n t / T}$$

In this notation, the parameterized curve position $z(t)$
and coefficients $c_n$ are complex numbers, and $T$ is the period
of the parameterization.
They can be interpreted as vectors on the real/imaginary plane. This interpretation
is intuitive, as it explains the position $z(t)$ geometrically as a
rotating vector, defined as the sum of other rotating vectors. Notice that since
the coefficient index n is in the exponent, higher degree coefficients correspond
to vectors rotating at increasingly higher frequencies.
Calculating Coefficients:
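Given a discrete sampling of a closed contour, converting to the Elliptic Harmonic
basis requires the computation of the $c_n$ coefficients. The solution
to these coefficients, for a contour $z(t)$ of period $T$, can be written as:

$$c_n = \frac{1}{T} \int_0^T z(t) \, e^{-i 2\pi n t / T} \, dt$$

In general, there is no closed-form solution to the above equation. However, Kuhl & Giardina showed that we can take advantage of the fact that our input curve is composed of piece-wise linear segments. Substituting this information in and simplifying gives us the following closed-form solution for $n \neq 0$:

$$c_n = \frac{T}{4\pi^2 n^2} \sum_{k=1}^{K} \frac{\Delta z_k}{ds_k} \left( e^{-i 2\pi n t_k / T} - e^{-i 2\pi n t_{k-1} / T} \right)$$

where $\Delta z_k / ds_k$ is the slope of the $k$-th segment, $ds_k$ is the length of the segment,
$t_k = \sum_{j \le k} ds_j$ is the arc length accumulated up to the end of segment $k$,
and $T = t_K$ is the total length of the contour. This
simplification is only possible because our input is a piece-wise linear contour
defined on a pixel grid.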
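The closed-form sum for a piece-wise linear contour can be sketched as follows. This is a minimal illustration rather than the report's actual implementation: it assumes the contour is given as complex vertices x + iy, uses arc length as the parameter, and computes the 0th-order coefficient separately as the length-weighted average of segment midpoints. The function name `elliptic_coefficients` and the dictionary layout are hypothetical.

```python
import cmath
import math


def elliptic_coefficients(points, order):
    """Return {n: c_n} for n = -order..order of a closed polygon.

    points: complex vertices x + iy; the last vertex joins back to the first.
    """
    count = len(points)
    deltas = [points[(k + 1) % count] - points[k] for k in range(count)]
    lengths = [abs(d) for d in deltas]          # ds for every segment
    total_length = sum(lengths)                 # period T of the contour

    # Arc length at the start of segment k is t[k]; at its end, t[k + 1].
    t = [0.0]
    for ds in lengths:
        t.append(t[-1] + ds)

    # c_0 is the mean position: length-weighted average of segment midpoints.
    coeffs = {
        0: sum((points[k] + deltas[k] / 2) * lengths[k] for k in range(count))
        / total_length
    }
    for n in range(-order, order + 1):
        if n == 0:
            continue
        w = 2 * math.pi * n / total_length
        acc = 0j
        for k in range(count):
            if lengths[k] == 0:
                continue                        # skip repeated vertices
            slope = deltas[k] / lengths[k]
            acc += slope * (cmath.exp(-1j * w * t[k + 1]) - cmath.exp(-1j * w * t[k]))
        coeffs[n] = total_length / (4 * math.pi ** 2 * n ** 2) * acc
    return coeffs


# Unit square traced counterclockwise from the origin.
square_coeffs = elliptic_coefficients([0j, 1 + 0j, 1 + 1j, 0 + 1j], order=3)
```

For the unit square, the four-fold symmetry forces every coefficient whose order is not congruent to 1 modulo 4 to vanish, which makes a convenient sanity check.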
Reconstruction:
Most of the time, we will only use the coefficients to compare, classify, and modify contours. However, if for no other reason than verification, we would also like to reconstruct our contour from these coefficients. There is nothing tricky about the reconstruction; using the coefficients $c_n$ up to order $N$, it can be expanded as:

$$z(t) = \sum_{n=-N}^{N} c_n \, e^{i 2\pi n t / T}$$
The only catch is that we must choose a sampling interval to draw the curve. For all examples in this document, a sample interval of 0.01 was chosen. This is much finer resolution than needed in most cases.
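A minimal sketch of the sampling step, assuming the coefficients are stored in a dictionary keyed by order n and the parameter is normalized to run from 0 to 1, so that a step of 0.01 yields 100 samples. The name `reconstruct` and the data layout are hypothetical.

```python
import cmath
import math


def reconstruct(coeffs, step=0.01):
    """Sample the truncated expansion at normalized parameter values in [0, 1)."""
    samples = int(round(1 / step))
    curve = []
    for j in range(samples):
        t = j / samples
        # Each term is a vector c_n rotating n times per trip around the contour.
        z = sum(c * cmath.exp(2j * math.pi * n * t) for n, c in coeffs.items())
        curve.append(z)
    return curve


# Two first-order terms alone trace an ellipse: semi-major axis
# |c_1| + |c_-1| = 2 along x, semi-minor axis |c_1| - |c_-1| = 1 along y.
ellipse = reconstruct({1: 1.5 + 0j, -1: 0.5 + 0j})
```

Computing `t` as `j / samples` rather than accumulating `step` avoids floating-point drift over many samples.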
Translation:
Obtaining a translation-invariant contour comes almost for free. This
is because the 0th-order coefficient $c_0$ is the center of gravity of the
contour. By setting $c_0 = 0$, the contour is moved to be centered
at the origin. Of course, for visualization purposes, we can arbitrarily
set this value to center our contour on an image. This
centering is performed on most of the images in this report.
Rotation:
Of the three invariants, rotation is the most difficult to obtain, but is still relatively simple to compute. Just as translation involved using the 0th order coefficient to adjust the center of mass, the 1st order coefficients can be used to achieve rotational invariance. They describe the dominant ellipse that represents the orientation of the contour. The semi-major axis of the ellipse is unique, and has two extreme points which can be used as a relative starting location for a parameterized contour. While this method suffers from an ambiguity as to which side of the axis the starting point lies on, its simplicity makes it appealing for demonstration purposes.
Since the coefficients/vectors that compose the ellipse are rotated by phasors to construct the contour, multiplication with additional phasors can rotate both the starting point of the parameterization on the contour and the orientation of the contour with respect to image-space.
The starting-point shift $\theta$ can be solved for
by noting that when the two vectors $c_1 e^{i\theta}$
and $c_{-1} e^{-i\theta}$
superimpose to form the maximal extent of the ellipse,
the angle between them is 0.
If we refer to the angular components of $c_1$
and $c_{-1}$ as $\phi_1$ and $\phi_{-1}$ respectively, then we can define $\theta$
as:

$$\theta = \frac{\phi_{-1} - \phi_1}{2}$$

Further, we can define the angle needed to rotate the whole contour as:

$$\psi = -\frac{\phi_1 + \phi_{-1}}{2}$$

Applying both to every coefficient gives $c_n' = c_n \, e^{i(n\theta + \psi)}$, which leaves $c_1'$ and $c_{-1}'$ with zero phase, so the semi-major axis lies along the x-axis.
In addition to aligning this semi-major axis, my implementation also minimizes the rotation within the space of the ambiguity. This avoids contours of well known shapes coming out upside down, even when input right-side up.
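The alignment step can be sketched as below, assuming the coefficients are stored in a dictionary keyed by order n that contains both first-order terms. It applies a starting-point phase shift and a whole-contour rotation derived from the phases of the two first-order coefficients, and it does not attempt the extra ambiguity handling described above. The name `normalize_rotation` is hypothetical.

```python
import cmath


def normalize_rotation(coeffs):
    """Align the first-order ellipse's semi-major axis with the x-axis."""
    phi_pos = cmath.phase(coeffs[1])
    phi_neg = cmath.phase(coeffs[-1])
    theta = (phi_neg - phi_pos) / 2     # shift of the starting point
    psi = -(phi_pos + phi_neg) / 2      # rotation of the whole contour
    return {n: c * cmath.exp(1j * (n * theta + psi)) for n, c in coeffs.items()}


# Start from already-aligned coefficients, rotate the contour by 0.7 rad and
# shift its starting point by 0.3 rad, then undo both.
base = {1: 1.5 + 0j, -1: 0.5 + 0j, 2: 0.2 + 0.1j}
moved = {n: c * cmath.exp(1j * (0.7 + 0.3 * n)) for n, c in base.items()}
restored = normalize_rotation(moved)
```

Because the two candidate starting points are half a turn apart, a different rotation or shift may land on the other end of the axis; this sketch leaves that choice unresolved.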
Scaling:
Similar to rotation, the 1st order coefficients can be used to define a proper scaling. Assuming the coefficients have already been made invariant to rotation, a uniform scaling can be achieved by scaling the semi-major axis to unity. Since the two principal vectors are aligned to the x-axis, dividing every coefficient by their joint magnitude scales the semi-major axis to unity:

$$c_n' = \frac{c_n}{s}, \quad \text{where } s = |c_1| + |c_{-1}|$$
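A short sketch of the scaling step, assuming a dictionary of coefficients keyed by order n. The semi-major axis of the first-order ellipse has length equal to the sum of the magnitudes of the two first-order coefficients, so dividing by that sum brings it to unity. The name `normalize_scale` is hypothetical.

```python
def normalize_scale(coeffs):
    """Divide every coefficient by the semi-major axis of the first-order ellipse."""
    semi_major = abs(coeffs[1]) + abs(coeffs[-1])
    return {n: c / semi_major for n, c in coeffs.items()}


scaled = normalize_scale({1: 3 + 0j, -1: 1 + 0j, 3: 0.4j})
unit_axis = abs(scaled[1]) + abs(scaled[-1])
```

Using magnitudes means the result does not depend on whether the rotation step has already been applied.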
[Figure: contour reconstructions using n = 1, 2, 4, 8, 16, and 1000 coefficients]
From the images above, it is clear that with only a few coefficients, the dominant
features of a closed contour emerge. By 15-20 coefficients, all distinguishable
features have been captured to within the pixel precision of the original scale.
The only downside to using too few coefficients is subtle fluctuations along edges
that can manifest as aliasing artifacts if scaled to a different pixel grid.
[Figure: three rows of four contour images each: Initial Contour, Scale & Rotationally Invariant, and 1st-Order Ellipse]
The above images illustrate the effects of rotational and scale invariance.
The coloring of the contours is a linear interpolation from black to white
to show the parameterization of the curve from 0 to 1. Notice that the rotationally
invariant contours start their parameterization at 0 degrees, and their semi-major
axes are all of equal size.
| | Vase | House | Turtle | Duck |
| Vase | - | 0.096 | 0.081 | 0.132 |
| House | 0.096 | - | 0.143 | 0.123 |
| Turtle | 0.081 | 0.143 | - | 0.162 |
| Duck | 0.132 | 0.123 | 0.162 | - |
Having a representation of the contours that is invariant to translation,
scale, and rotation allows us to compare these curves against each other in a
meaningful way. The above table shows the pairwise Sum of Squared Differences (SSD).
Thinking of each set of computed coefficients as a high-dimensional data
point, the calculated SSD can be thought of as the squared distance between the
two points. The larger the number, the more different the two contours are.
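One way to compute such a measure is sketched below. It assumes each contour's normalized coefficients are stored in a dictionary keyed by order n, and it takes the squared magnitude of each complex difference, which equals the sum of the squared real and imaginary differences. The report does not spell out its exact formula, so this is an assumption, and the name `ssd` is hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two coefficient sets.

    Orders missing from one set are treated as zero.
    """
    orders = set(a) | set(b)
    return sum(abs(a.get(n, 0) - b.get(n, 0)) ** 2 for n in orders)


first = {1: 1 + 0j, -1: 0.5 + 0j}
second = {1: 1 + 0j, -1: 0.5j}
distance = ssd(first, second)
```

Treating missing orders as zero lets contours described by different numbers of coefficients be compared without padding them first.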
| | Vase | House | Turtle | Duck |
| Vase | - | 0.097 | 0.081 | 0.132 |
| House | 0.097 | - | 0.143 | 0.124 |
| Turtle | 0.081 | 0.143 | - | 0.163 |
| Duck | 0.132 | 0.124 | 0.163 | - |
The above table shows the same SSD calculation but computed on contours described by 1000 coefficients rather than 20. Just as we saw diminishing returns for higher order coefficients in regard to the quality of the reconstruction, these higher order coefficients offer diminishing information on how different these contours are. Comparing these values to the previous table shows very little difference. This is excellent because it means we can get away with minimal computation.
The simplicity and robustness of Elliptic Harmonics make them an appealing method for contour representation. The results from the previous section show that for contours generated from rasterized images, we will typically require very few coefficients to obtain a reconstruction that is accurate to the pixel. Further, for shapes of extremely high similarity, we can compute arbitrarily many coefficients to improve the quality of the measure.
Elliptic Harmonics are not without their drawbacks. As mentioned in Section 3, while we can achieve rotational invariance, it is somewhat arbitrary. The method assumes the first-order ellipse provides the most information about the shape of the contour, but really it only describes the relative dimensions of the contour. One could imagine two contours with identical high-frequency oscillations, but very different low-frequency shapes. In such a case, rotations based on the first ellipse are likely to leave those oscillations out of phase, and the difference between the two will be large. A human being, however, will see this strong similarity of high-frequency pattern, and decide that the larger dimensional change does not make the shapes dissimilar. A better metric for comparison would be to treat each feature of the representation independently. Frequencies, phase-shifts, and amplitudes could be compared, as well as their relationships to one another. However, how these correspond to the ellipses is not clear and may prove problematic.