| Name: | Jonathan Bronson |
| Date: | March 23, 2010 |
Overview
The goal of this project is to implement a statistical shape model
based on the point distribution model (PDM) developed by Cootes and
Taylor. The following sections show how these point models,
coupled with PCA, allow us not only to describe the relationship
between existing shapes, but also to generate new, plausible shapes
from the same family which are not found in the original data set. We
apply this analysis to several data sets, including a simple set
of synthetic shapes and a more complicated set of real anatomical data
from the corpus callosum.
Design Choices
The idea of Active Shape Models was developed by Cootes and Taylor, building on their previous work with the Point Distribution Model (PDM). In such a model, a training shape (usually a contour) is sampled and represented as a series of landmark points. These landmarks are chosen to be distinctive enough that a deformed version of the contour would still contain the same landmarks, which ensures a one-to-one correspondence among all shapes in a dataset. Additionally, further samples may be added as interpolations of these landmark points. We refer to the full sampling of a contour as a shape vector:

$$\mathbf{x} = (x_1, y_1, x_2, y_2, \ldots, x_n, y_n)^T$$

where n is the number of samples of the shape. In this form, there is little information to explain the relationship between shapes of the same family.
Principal Component Analysis (PCA) is the mechanism that allows us to model shape variation in a more compact and meaningful way. Since we ensured a one-to-one correspondence between sample points, it follows that similar shapes will have high correlation between corresponding samples. Similarly, a deformable shape will have high correlation in connected regions across the whole set of samples. PCA provides a transformation to a parameter space, where each axis represents a mode of variation pulled directly from the correlation and covariance of the samples.
For a family of N shapes, we can obtain the covariance matrix $\Sigma$ of the mean-centered data by the formula:

$$\Sigma = \frac{1}{N} \sum_{i=1}^{N} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T$$
where $\mathbf{x}_i$ is the ith shape vector and $\bar{\mathbf{x}}$ is the mean shape vector. A singular value decomposition provides us with the eigenvalues and eigenvectors of this matrix, in the form:

$$\Sigma = V W V^T$$
where W is a diagonal matrix of the eigenvalues and V is a column matrix of the corresponding eigenvectors. We utilize V as a transformation to a basis that maximizes variance along each dimension. In other words, transforming a shape vector along the first eigenvector deforms that shape by the most prominent mode of variation across the training set; the second eigenvector gives the second most prominent mode, and so forth. By examining the eigenvalues, we can ignore the modes (eigenvectors) which contribute little or no importance. A good scheme for this is to accept only the eigenvectors corresponding to 90% or 95% of the variation. The cutoff t is easily found by summing the eigenvalues, in decreasing order, until their proportion of the total variance reaches the desired threshold $f_v$:

$$\frac{\sum_{i=1}^{t} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \geq f_v$$
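To make this procedure concrete, the following is a minimal NumPy sketch of the fitting step, assuming the training shapes are supplied as an array of corresponding (x, y) landmarks; the function name, array layout, and use of `np.linalg.svd` are illustrative choices, not taken from the original implementation.

```python
import numpy as np

def fit_pdm(landmarks, variance_threshold=0.95):
    """Fit a point distribution model to a set of training shapes.

    landmarks : array of shape (N, n, 2) -- N training shapes, each
                sampled at n corresponding (x, y) landmark points.
    Returns the mean shape vector, the retained eigenvectors (columns),
    and their eigenvalues.
    """
    N = landmarks.shape[0]
    # Flatten each shape into a vector (x1, y1, x2, y2, ..., xn, yn).
    X = landmarks.reshape(N, -1)

    # Mean-center the data.
    mean_shape = X.mean(axis=0)
    Xc = X - mean_shape

    # Covariance matrix of the mean-centered shape vectors.
    cov = (Xc.T @ Xc) / N

    # SVD of the symmetric covariance matrix: cov = V * diag(w) * V^T.
    V, w, _ = np.linalg.svd(cov)

    # Keep the smallest number of modes whose eigenvalues account for
    # the desired fraction of the total variance.
    cumulative = np.cumsum(w) / np.sum(w)
    t = int(np.searchsorted(cumulative, variance_threshold)) + 1

    return mean_shape, V[:, :t], w[:t]
```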
By multiplying mean-centered shape vectors by this reduced matrix $V_t$, containing only the most important eigenvectors, we can obtain a compressed and more meaningful representation of the data:

$$\mathbf{b} = V_t^T (\mathbf{x} - \bar{\mathbf{x}})$$

Finally, to help visualize the actual correlation between landmark points, we can compute the Correlation Matrix and display it as a color-coded image. The Correlation Matrix can be derived from the Covariance Matrix as follows:

$$R_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\,\Sigma_{jj}}}$$
Since each component is perfectly correlated with itself, the diagonal of this matrix is composed entirely of 1's. Because correlation values range from -1 to 1, a shifted color scale is needed to visualize them. For the images in this report, colors are scaled such that black, gray, and white correspond to -1, 0, and 1, respectively.
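A small sketch of how such an image might be produced from the covariance matrix, assuming Matplotlib is available; the grayscale mapping simply clamps the color scale to [-1, 1] as described above.

```python
import numpy as np
import matplotlib.pyplot as plt

def correlation_image(cov, filename="correlation_matrix.png"):
    """Convert a covariance matrix to a correlation matrix and save it as
    a grayscale image where black, gray, and white map to -1, 0, and +1."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)      # R_ij = S_ij / sqrt(S_ii * S_jj)
    plt.imshow(corr, cmap="gray", vmin=-1.0, vmax=1.0)
    plt.colorbar()
    plt.savefig(filename)
    plt.close()
```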


New shape instances can be generated, and existing ones reconstructed, by deforming the mean shape along the retained eigen axes:

$$\mathbf{x}' = \bar{\mathbf{x}} + \sum_{i=1}^{t} c_i \,\sigma_i \,\mathbf{v}_i$$

where $c_i$ is a scaling factor for eigen axis i, $\mathbf{v}_i$ is the ith eigenvector, and $\sigma_i = \sqrt{\lambda_i}$ is the standard deviation of the ith mode among the training data set. This is an ideal way of viewing the reconstruction, since over 95% of the training data falls within two standard deviations and over 99% lies within three standard deviations.
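A minimal sketch of this reconstruction, reusing the model returned by the `fit_pdm` sketch above; sweeping one coefficient while holding the rest at zero is one way to visualize a single mode, and the ±2 sigma range is an illustrative choice consistent with the statement above.

```python
import numpy as np

def synthesize_shape(mean_shape, eigvecs, eigvals, coeffs):
    """Generate a new shape vector x' = mean + sum_i c_i * sigma_i * v_i,
    where coeffs holds the scaling factors c_i (in units of standard
    deviations sigma_i = sqrt(lambda_i)) for each retained mode."""
    sigmas = np.sqrt(eigvals)
    return mean_shape + eigvecs @ (np.asarray(coeffs) * sigmas)

# Example: visualize the 1st mode by sweeping it from -2 to +2 standard
# deviations while keeping all other modes at zero.
# for c in np.linspace(-2.0, 2.0, 9):
#     coeffs = np.zeros(len(eigvals))
#     coeffs[0] = c
#     shape = synthesize_shape(mean_shape, eigvecs, eigvals, coeffs)
#     points = shape.reshape(-1, 2)   # back to (x, y) landmark pairs
```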
Synthetic Data: Squares
(a) Random Translation:
[Figure: Sample Shapes | Correlation Matrix | Sample Shapes (10% Noise) | Correlation Matrix (10% Noise)]
[Figure: 1st Mode | 2nd Mode | 1st Mode (10% Noise) | 2nd Mode (10% Noise)]
Examining the Correlation Matrix image, we see all landmarks are highly correlated within their own axis: the x components are all highly correlated with the other x components, and the y components with the other y components. This makes perfect sense, since translations in the x and y directions are independent. Further, the high correlation within an axis makes sense since we are performing rigid translations, and all landmarks move in synchrony. Unsurprisingly, the high correlation within each axis leads the major modes of variation to coincide with the major axes.
[Figure: Translation Eigenvalues | Translation Eigenvalues (10% Noise)]
Comparing the raw data with the 10% noisy data, there are several
clear differences. First, the correlation matrix, while generally the
same, has fluctuating shades of gray rather than consistent correlation.
This is of course a direct result of the noise, and should be expected.
A less intuitive side effect of the noise is the rotation of the major
axes of variation. Rather than being aligned with the x and y axes, they
are aligned more closely to the y = x and y = -x lines. Additionally, there
is a small shearing effect present, particularly in the second
mode of variation.
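The report does not include the data-generation code; the following is a plausible sketch of how the randomly translated squares with optional noise might be produced (the landmark spacing, translation range, and noise model are assumptions), and its output can be fed directly to the `fit_pdm` sketch above.

```python
import numpy as np

def make_translated_squares(num_shapes=100, points_per_side=4,
                            noise=0.0, rng=None):
    """Generate randomly translated unit squares as landmark arrays.

    noise : standard deviation of Gaussian perturbations as a fraction of
            the side length (e.g. 0.1 for the "10% noise" experiments).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Evenly spaced landmarks around the boundary of the unit square.
    t = np.linspace(0.0, 1.0, points_per_side, endpoint=False)
    ones, zeros = np.ones_like(t), np.zeros_like(t)
    base = np.concatenate([
        np.stack([t, zeros], axis=1),          # bottom edge
        np.stack([ones, t], axis=1),           # right edge
        np.stack([1.0 - t, ones], axis=1),     # top edge
        np.stack([zeros, 1.0 - t], axis=1),    # left edge
    ])

    shapes = []
    for _ in range(num_shapes):
        offset = rng.uniform(-1.0, 1.0, size=2)   # random rigid translation
        pts = base + offset + rng.normal(scale=noise, size=base.shape)
        shapes.append(pts)
    return np.stack(shapes)                       # (num_shapes, n, 2)
```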
(b) Random Scale:
[Figure: Sample Shapes | Correlation Matrix | Sample Shapes (10% Random Noise) | Correlation Matrix (10% Random Noise)]
[Figure: 1st Mode | 1st Mode (10% Noise) | 2nd Mode (10% Noise)]
The pattern found in the Correlation Matrix for the raw randomly scaled squares is less intuitive than for the translated shapes. Clearly, from the black and white regions, there is still strong correlation of mixed sign. The grouping of these correlations reflects which quadrant a landmark falls in. In the 1st quadrant, the x and y components are both positive, and thus positively correlated. Similarly, in the 3rd quadrant, the x and y components are both negative, and again positively correlated. Conversely, the 2nd and 4th quadrants have mixed signs and therefore produce a checkerboard pattern between x and y components, but continuous color for adjacent landmark points.
[Figure: Scale Eigenvalues | Scale Eigenvalues (10% Noise)]
For the noise-free samples, the only non-zero eigenvalue is the first
one, which entirely captures the scaling found in the sample set. Once
noise is introduced, the 2nd and 3rd modes begin to show subtle variation.
As seen in the figure above, these latter modes of variation are shearing
effects, capturing the noise present in the data.
(c) Random Translation & Scale:
[Figure: Sample Shapes | Correlation Matrix | Sample Shapes (10% Noise) | Correlation Matrix (10% Noise)]
[Figure: 1st Mode | 2nd Mode | 3rd Mode]
[Figure: 1st Mode (10% Noise) | 2nd Mode (10% Noise) | 3rd Mode (10% Noise)]
Combining both random translation and random scale into a single family of shapes proves quite robust, as these truly are linearly independent operations. This can be seen in that the 1st, dominant mode provides scaling, while the 2nd and 3rd provide translation in the x and y axes, respectively. The Correlation Matrix is, however, much more difficult to read; it is in fact a superposition of the previous two Correlation Matrices from experiments (a) and (b). Surprisingly, the 10% noise plays a less significant role in the outcome of the 2nd and 3rd modes of variation found from PCA. The axes are only slightly off, and the 2nd and 3rd modes have switched order. This second effect is most likely due not to the noise, but rather to the inability of a random sampling to favor translations in both dimensions evenly.
[Figure: Translation & Scale Eigenvalues | Translation & Scale Eigenvalues (10% Noise)]
(d) Random Rotation:
[Figure: Sample Shapes | Correlation Matrix | Sample Shapes (10% Noise) | Correlation Matrix (10% Noise)]
[Figure: 1st Mode | 2nd Mode | 1st Mode (10% Noise) | 2nd Mode (10% Noise)]
Clearly rotation is more difficult to handle than scaling and translation. If not immediately apparent, the problem becomes clearer when examining the captured modes of variation. At first glance the square appears to scale and slightly rotate simultaneously. If, however, we follow the path of a single landmark during deformation, we see it follows a linear path. Ultimately we are trying to represent a non-linear operation with a linear decomposition, and we simply cannot expect to properly capture the deformation using PCA. Higher-dimensional techniques might be worth trying for rotation-based deformations.
[Figure: Rotation Eigenvalues | Rotation Eigenvalues (10% Noise)]
Real Data: Corpus Callosum
The data used for this experiment comes from segmentations of the human corpus callosum from medical images. Each sample shape is represented with 12 landmarks. Ideally, we would use cubic interpolating splines to display these shapes; in the interest of time, however, a linear interpolation is used. The sample set contains 30 example shapes, aligned with anatomical AC-PC coordinates.
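For reference, displaying a 12-landmark contour with linear interpolation amounts to nothing more than closing the polygon, as in this small Matplotlib sketch (an illustration, not the report's plotting code):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_contour(landmarks, **kwargs):
    """Draw a closed contour through (x, y) landmarks using straight
    line segments, i.e. a linear interpolation between points."""
    closed = np.vstack([landmarks, landmarks[:1]])   # repeat the first point
    plt.plot(closed[:, 0], closed[:, 1], **kwargs)
    plt.axis("equal")
```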
[Figure: Sample Shapes | Correlation Matrix]
It is clear from the Correlation Matrix image that these shapes are
structured in a very specific way. In particular, notice the strong
positive and negative correlation along the tridiagonal of the matrix.
This makes intuitive sense, as our landmark points are ordered around
the contour of the corpus callosum, and nearby landmarks should be
correlated if the shape is to remain intact through deformation.
Similarly, another such diagonal is present orthogonal to the first.
This set of correlations implies that landmarks on one side of the
shape are correlated with landmarks on the other side, which also makes
intuitive sense, since the shape undergoes little shearing. For the
reconstructions below, the bounding box of the sample shapes was
adjusted to be closer to the origin.
[Figure: 1st Mode | 2nd Mode | 3rd Mode | 4th Mode]
Examining the modes of variation for the corpus callosum dataset doesn't offer the kind of clear insight it did for the synthetic examples. However, we can still see some basic patterns: the first mode is a kind of rotation, while the second is an expansion along the x-axis. Similarly, the third mode also appears to be a type of rotation, while the fourth looks like an expansion along the y-axis.
[Figure: Corpus Callosum Eigenvalues | Corpus Callosum Eigen CDF]
As one might expect for real-world data, the number of contributing eigenvectors extends much further than for the synthetic data. The first 5 modes capture over 90% of the variation, and by 15 modes there is virtually nothing left to contribute.
The biggest benefit of the Active Shape Model (ASM) approach is the ability to take a model generated from a training set and, through a minimization optimization, deform it to match a similar but different shape in an image. We've seen from the experiments above that the method is quite robust to noise and compact for linear data. Conversely, non-linear deformations require a further expansion of the basis to properly capture the modes of shape deformation. This should only be seen as a problem if the shape requires a large number of landmarks correlated in highly non-linear ways.
Examining the eigenmodes of variation proved insightful for linear operations, but less so for non-linear modes such as rotation. For medical data it seems unlikely that sample shapes will vary linearly in any way except scaling and translation; however, variation in translation (and rotation, for that matter) is usually removed prior to use anyway. If the application of ASM is to rigid mechanical parts, on the other hand, examining the modes of variation could prove incredibly insightful, though as these are man-made objects, manufacturers are probably already aware of such variation. Still, the power of ASMs is quite clear.