To measure the relationship between the data value and its derivatives
described in the last section, we use a three-dimensional histogram we
term a *histogram volume*. There is one axis for each of the
three quantities , , and , and each axis is divided into
some number of (one-dimensional) bins, causing the interior volume to
be divided into a three-dimensional array of bins. The histogram
volume has two defining characteristics:

- 1.
- Each bin in the histogram volume represents the combination of a small range of values in each of the three variables , , and .
- 2.
- The value stored in each bin signifies the number of voxels in the original volume within that same combination of ranges of these three variables.

Fig. 7 illustrated the position-independent relationship
between , , and
that characterized an ideal boundary. To
find that relationship, however, we afforded ourselves the luxury of
first knowing *where* the boundary was in the idealized dataset.
For instance, Fig. 4 was produced with knowledge
of where to place a path so as to cross through the boundary. In the
case of real volume data, however, the positions of the boundaries are
not known, but the same relationship between ,
and
needs
to be revealed by some measurement technique.

It is sufficient to measure , , and at each point of a uniform lattice. Fig. 8(a) shows a boundary being sampled continuously to produce smooth graphs of and versus . In Fig. 8(b), the sampling is along the same path, but is now discrete. The smooth graphs have been replaced by scatterplots, but the sequence of measurements traces out the same curves as before, indicating that discrete sampling and the resulting scatterplots are sufficient to illuminate the important relationships between and its derivatives. Finally, in Fig. 8(c), the boundary is sampled everywhere on a uniform grid. Though now the points are distributed differently -- many more hits have accumulated along where and are near zero -- the scatterplots trace out the save curves as before. By sampling everywhere, we no longer require knowledge of boundary location, and the ``global'' derivative characteristics of the boundary have been measured. This is precisely the sort of information relevant to opacity function generation.

The approach taken in this paper is to measure and its directional derivatives exactly once per voxel, at the original sample points of the dataset. One might be concerned that sampling merely at the original data points is not a sufficient sampling density to produce the curves seen in Figs. 7 and 8. However, with real volume data this will not be a problem, since the band-limiting in data acquisition assures there will always be some blurring, and since the boundaries of real objects tend to assume a variety of positions and orientations relative to the sampling grid.

One implementation issue in the histogram volume creation is how many bins to use. There is a trade-off between storage and processing requirements versus having sufficient resolution in the histogram volume to discern the patterns from which we generate the opacity functions. In our experiments, good results were obtained with histogram volumes of sizes between and bins, though there is no reason that the histogram volumes need to have equal resolution on each axis. Also, we have found it sufficient to use only 8 bits to represent the values in the histogram volume, scaling and/or clipping the number of hits to the range 0-255 if necessary.

A more subtle issue is what range of values to include along each
axis. Obviously, for the data value axis, the full range should be
included, since we intend to capture all the values at which
boundaries might occur. But along the axes for first and second
derivative, it makes sense to include something less than the full
range. Since derivative measures are by nature sensitive to noise,
including the full range of derivative values in the histogram volume
may cause the important and meaningful sub-range of values to be
compressed to a smaller number of bins, thereby hampering the later
step of detecting patterns in the histogram volume. We do not have an
*a priori* knowledge of the meaningful ranges of derivatives
values, so currently the derivative value ranges are set with an
educated guess. This is a matter in need of further research and
automation.

The most significant implementation issue is the method of measuring the first and second directional derivatives. The first derivative is actually just the gradient magnitude. From vector calculus [15] we have:

thus

(2) |

Unfortunately there is no similarly compact formula for , the second directional derivative along the gradient direction. Twice applying Eqn. 1 gives:

Or, using the Taylor expansion of along [5] gives:

where is the Hessian of , a matrix of second partial derivatives of [15]. Alternatively, we can use the Laplacian to approximate :

The approximation is exact only where the isosurfaces have zero mean surface curvature [5].

These three expressions for each suggest different implementations for the second derivative measure. Although this would benefit from more detailed study, we can still make useful observations regarding the comparative merits of each. While we have found the Hessian method to be the most numerically accurate, the others have proven sufficiently accurate in practice to make them appealing by virtue of their computation efficiency. The Laplacian computation is direct and inexpensive, but the most sensitive to quantization noise. The gradient of the gradient magnitude (Eqn. 3) is better, and its computational expense is lessened if the gradient magnitude has already been computed everywhere for the sake of volume rendering (e.g., as part of shading calculations).

By measuring the derivatives only at the original data points, the calculation of the first and second partial derivatives required in the above expressions is greatly facilitated by the use of discrete convolution masks applied at the data points; we have used standard central differences. Thus, our task is somewhat distinct from the usual problem of derivative measurement in volume rendering, where a primary concern is continuity of the derivative between sample points to allow for correct shading of interpolated data values [2,16].

The general algorithm for creating the histogram is straight-forward:

- 1.
- Initialize the histogram volume to all zeroes.
- 2.
- Make one pass through the volume looking for the highest values of and , and the lowest value of ; assume zero for the lowest value of . Set ranges on the histogram volume axes accordingly.
- 3.
- On a second pass through the volume,
- 3a.
- Measure , , and at each voxel,
- 3b.
- Determine which bin in the histogram volume corresponds to the measured combination of , , and , and
- 3c.
- Increment the bin's value.

It is possible to gain some insight into the object boundaries of the original dataset by simple visualization of the histogram volume after it has been calculated. One may be tempted to simply volume render the histogram volume from arbitrary views, but this usually turns out to be unrevealing due to the speckled and noisy nature of the histogram volume. A better way is to use summed-voxel of projections the histogram volume, projecting along either the or the axis, to produce scatterplots of versus or versus , respectively. This allows testing of the premise in Section 3.3 -- if there are boundaries in the original dataset that conform to the boundary model, there should be curves like that of Fig. 7 in the histogram volume.

Fig. 9 shows cross-sections and scatterplots for two synthetic datasets. The curves in the scatterplots are exactly the form seen in Fig. 8, and one can see an important property of the histogram volume -- for each pair of materials that share a boundary, there is a curve in the histogram volume. This property is again visible in Fig. 10, wherein various computed tomography datasets are being analyzed. Though the scatterplots are noisier, it clear that the histogram volume is successfully capturing information about the materials and their boundaries.

It should be noted that a related technique has been used in computer vision for feature identification. Panda and Rosenfeld [18] use two-dimensional scatterplots of data value and gradient magnitude to perform image thresholding for night vision applications. They, however, do not assume a boundary model, instead limiting their analysis of the scatterplot to identifying particular distributions within regions of low and/or high gradient magnitude.