

Validation on Simulated MR Images

This section validates the proposed approach on simulated brain MR images with a known ground truth. We use 1 mm isotropic T1-weighted images from the BrainWeb simulator [31] with varying noise levels and bias fields. Figure 6.1 shows a sample slice of the data along with its classification and the ground truth.

Figure 6.1: Qualitative analysis of the proposed algorithm with BrainWeb data [31] with $5 \%$ noise and a $40 \%$ bias field. (a) A coronal slice of the data. (b) The classification produced by the proposed method. (c) The ground truth.
\begin{figure}\threeAcrossLabels {MRI_Classification/BrianWeb_t1_icbm_normal_1mm...
...Web_simplifiedGroundTruth_Slice_86_Coronal.eps} {(a)} {(b)} {(c)}
\end{figure}

We first show results on simulated T1-weighted data without any bias field and with noise levels varying from $0 \%$ to $9 \%$, using the 2-class prior. The BrainWeb simulator defines the noise-level percentages with respect to the mean intensity of the brightest tissue class. Figures 6.2(a) and 6.2(b) plot the Dice metrics for the white-matter ($D_{\mathrm{white}}$) and gray-matter ($D_{\mathrm{gray}}$) classifications produced by the proposed algorithm and compare them with the corresponding values for the current state-of-the-art [94]. The proposed method is consistently better for the white matter. For the gray matter, its performance is slightly below the state-of-the-art at a few noise levels. We have found that this is caused by the 2-class prior, which biases the results against the gray matter relative to the scaled-atlas prior. With the scaled-atlas prior, the results are consistently better than the state-of-the-art at all noise levels. Section 6.5.2 shows that both priors perform equally well as measured by the average of the Dice metrics for the white matter and gray matter, i.e., $(D_{\mathrm{white}} + D_{\mathrm{gray}})/2$.
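As an illustration of the metrics plotted in these figures, the following minimal sketch computes $D_{\mathrm{white}}$, $D_{\mathrm{gray}}$, and their average from two label images. It assumes the standard Dice overlap, i.e., twice the size of the intersection divided by the sum of the sizes of the two sets. The function names and the integer label codes (`gray_label`, `white_label`) are hypothetical choices made for this example and are not taken from the text.

```python
import numpy as np

def dice(pred_mask, truth_mask):
    """Dice overlap 2|A and B| / (|A| + |B|) between two boolean masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    truth = np.asarray(truth_mask, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def dice_scores(pred_labels, truth_labels, gray_label=1, white_label=2):
    """Return (D_gray, D_white, average) for two integer label images."""
    pred = np.asarray(pred_labels)
    truth = np.asarray(truth_labels)
    d_gray = dice(pred == gray_label, truth == gray_label)
    d_white = dice(pred == white_label, truth == white_label)
    return d_gray, d_white, 0.5 * (d_gray + d_white)
```

The same functions apply unchanged to full 3D label volumes, because the masks are compared voxel by voxel regardless of array shape.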

Figure 6.2: Validation, and comparison with the state-of-the-art [94], on simulated T1-weighted data without any bias and with varying noise levels. Here, the proposed method uses the 2-class prior. Dice metrics for (a) white matter: $D_{\mathrm{white}}$, (b) gray matter: $D_{\mathrm{gray}}$, and (c) their average: $(D_{\mathrm{white}} + D_{\mathrm{gray}})/2$. Note: In the graphs, P: Proposed method, L: State-of-the-art method of Leemput et al. [94].
\begin{figure}\oneWidthLabel {MRI_Classification/graphs_BrainWeb_FgBg_Bias_0_WM....
...ssification/graphs_BrainWeb_FgBg_Bias_0_Average.ps} {0.475} {(c)}
\end{figure}

Figure 6.2(c) shows that, in terms of the Dice metric averaged over gray matter and white matter, the proposed algorithm consistently outperforms the state-of-the-art at all noise levels. Furthermore, its performance degrades more slowly with increasing noise than that of the state-of-the-art method. For $3 \%$ noise, which is typical for real MRI [94], the improvement in the average Dice metric is approximately $1.1 \%$. The performance gain at $9 \%$ noise is $3.8 \%$. The larger gain over the state-of-the-art at high noise levels should prove useful for classifying noisier fast-acquisition clinical MRI.

Figure 6.2 shows that for low noise levels, the performance of the parametric EM-based algorithm [94] drops dramatically. This is because it systematically assigns voxels close to the interface between gray matter and white matter to the class that has the larger intensity variability [94], which is inherently the gray-matter class. In such low-noise cases, partial voluming appears to dominate the MR-tissue intensity model, which then deviates significantly from the assumed Gaussian [94]. Hence, approaches enforcing Gaussian intensity PDFs on the classes, such as [94,146], would face a serious challenge in this case. In contrast, the proposed adaptive modeling strategy, which is based on nonparametric density estimation, does not suffer from this drawback. Figure 6.2 clearly depicts this advantage of the proposed method.

Strictly speaking, all methods that assign a partial-volume voxel to one specific class are, in a way, fundamentally flawed. The proposed method, however, approaches this problem in a more principled manner than the EM-based method [94]. A partial-volume voxel $t$ comprising a larger contribution from tissue-class $k$ will produce a ${\bf z}_t$ lying ``closer'' to the feature-space distribution of class $k$. The results show that the data-driven nonparametric estimation of all tissue-class PDFs, employing the same Parzen-window $\sigma $ for each class, prevents any undesirable biases (unlike [94]) in the classification.
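The effect of using one Parzen-window $\sigma $ for every class can be illustrated with a deliberately simplified sketch. It is not the proposed algorithm: it uses scalar intensities as features, a fixed set of hypothetical per-class samples, and an isotropic Gaussian kernel, and it omits the neighborhood features and the spatial priors. The function names `parzen_density` and `classify` are assumptions made for this example.

```python
import numpy as np

def parzen_density(z, samples, sigma):
    """Gaussian Parzen-window density estimate at feature z.

    samples has shape (n, d) or (n,) for scalar features.
    """
    samples = np.asarray(samples, dtype=float)
    samples = samples.reshape(len(samples), -1)
    z = np.asarray(z, dtype=float).reshape(-1)
    d = samples.shape[1]
    sq_dist = np.sum((samples - z) ** 2, axis=1)
    norm = (2.0 * np.pi * sigma ** 2) ** (d / 2.0)
    return np.mean(np.exp(-sq_dist / (2.0 * sigma ** 2))) / norm

def classify(z, class_samples, sigma, priors=None):
    """Return the index of the class with the largest prior-weighted density.

    The same sigma is used for every class, so no class is favored
    merely because its intensities are more variable.
    """
    k = len(class_samples)
    if priors is None:
        priors = np.full(k, 1.0 / k)  # assumption: uniform priors
    scores = [priors[i] * parzen_density(z, class_samples[i], sigma)
              for i in range(k)]
    return int(np.argmax(scores))
```

With hypothetical gray-matter samples near 0.3 and white-matter samples near 0.7, a partial-volume intensity of 0.6 * 0.3 + 0.4 * 0.7 = 0.46 is assigned to gray matter, and 0.4 * 0.3 + 0.6 * 0.7 = 0.54 is assigned to white matter, i.e., to the tissue that contributes more in each case.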

Figure 6.3: Validation, and comparison with the state-of-the-art [94], on simulated T1-weighted data with $40 \%$ bias and varying noise levels. We compare the performance by incorporating explicit bias correction and global sampling: same sample size (see text). Dice metrics for (a) white matter: $D_{\mathrm{white}}$, (b) gray matter: $D_{\mathrm{gray}}$, and (c) their average: $(D_{\mathrm{white}} + D_{\mathrm{gray}})/2$. Note: In the graphs, P: Proposed method, BC: Bias correction, GS: Global sampling: same sample size, L: State-of-the-art method of Leemput et al. [94].
\begin{figure}\oneWidthLabel {MRI_Classification/graphs_BrainWeb_FgBg_Bias_40_WM...
...sification/graphs_BrainWeb_FgBg_Bias_40_Average.ps} {0.475} {(c)}
\end{figure}

Figure 6.3 shows the validation results on BrainWeb data with a $40 \%$ bias field and varying noise levels. Even in the absence of an explicit bias-correction scheme, the method performs quite well on this biased data. To confirm the important role that the local-sampling Parzen-window density estimation strategy plays in enabling the automatic learning of the bias field, we perform two more experiments. In the first experiment, we add explicit bias correction to the proposed method by iteratively fitting a degree-4 polynomial [93] to the white-matter intensities. Figure 6.3 shows that this variant performs approximately as well as, but not significantly better than, the method without bias correction. In the second experiment, we replace the local-sampling scheme with a global-sampling scheme that draws the random Parzen-window sample (with the same sample size $\vert\mathcal{A}_t\vert$) uniformly over the image. Figure 6.3 shows that this scheme performs significantly worse at all noise levels in the absence of bias correction.
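The difference between the two sampling schemes can be sketched as follows. This is a hedged illustration rather than the implementation used in the experiments: the function names, the rounding of coordinates to the voxel grid, and the clipping at the image boundary are all assumptions. Because a bias field varies smoothly over the image, samples drawn near voxel $t$ share approximately the same bias as $t$, whereas samples drawn uniformly over the image do not.

```python
import numpy as np

def local_sample(center, shape, n, sigma_spatial, rng):
    """Draw n voxel coordinates from an isotropic Gaussian around center."""
    center = np.asarray(center, dtype=float)
    coords = rng.normal(loc=center, scale=sigma_spatial,
                        size=(n, center.size))
    coords = np.rint(coords).astype(int)
    # Assumption: out-of-image draws are clipped to the nearest valid voxel.
    return np.clip(coords, 0, np.asarray(shape) - 1)

def global_sample(shape, n, rng):
    """Draw n voxel coordinates uniformly over the whole image."""
    return np.stack([rng.integers(0, s, size=n) for s in shape], axis=1)
```

Both functions return an array of shape (n, d), so either sample can be passed to the same density-estimation code, with the sample size $\vert\mathcal{A}_t\vert$ held fixed.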

To study the sensitivity of the method to the variance parameter $\sigma_{\mathrm{spatial}}^2$ of the local-sampling Parzen-window Gaussian and to the Parzen-window $\sigma $ multiplicative factor $\alpha $, we measure the Dice metrics for the white matter and gray matter over a range of parameter values. We use the BrainWeb T1 data with $5 \%$ noise and a $40 \%$ bias field. Table 6.1 gives the results, which confirm that the classification performance is fairly robust to changes in the values of these two parameters, as explained in Section 3.5.2.

We can extend the proposed method in a straightforward manner to deal with multimodal data. Multimodal segmentation entails classification using MR images of multiple modalities, e.g., T1 and PD. The method treats the combination of images as an image of vectors, with the associated PDFs defined in the combined probability space. Figure 6.4 shows the classification results for multimodal data using T1 and PD images, both with and without a bias field. The results demonstrate that incorporating more information in the classification framework, via images of the two modalities T1 and PD, produces consistently better results than those using T1 images alone.
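A minimal sketch of the "image of vectors" construction is given below, assuming that the modalities are already co-registered on the same voxel grid. The function name `stack_modalities` is hypothetical.

```python
import numpy as np

def stack_modalities(*images):
    """Combine co-registered scalar images into one vector-valued image.

    Each voxel of the result holds one intensity per modality, so any
    density estimate over voxel features then lives in the combined space.
    """
    arrays = [np.asarray(im, dtype=float) for im in images]
    if any(a.shape != arrays[0].shape for a in arrays):
        raise ValueError("all modalities must share the same voxel grid")
    return np.stack(arrays, axis=-1)
```

For a T1 volume and a PD volume of shape (X, Y, Z), the result has shape (X, Y, Z, 2), and each voxel carries the pair (T1 intensity, PD intensity).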

Figure 6.4: Validation on simulated multimodal (T1 and PD) data with varying noise levels. Dice metrics for (a) white matter: $0 \%$ bias, (b) gray matter: $0 \%$ bias, and (c) their average: $0 \%$ bias. Dice metrics for (d) white matter: $40 \%$ bias, (e) gray matter: $40 \%$ bias, and (f) their average: $40 \%$ bias. Note: In the graphs, P: Proposed method, T1PD: Using both T1 and PD images.
\begin{figure}\twoWidthLabels {MRI_Classification/graphs_BrainWeb_FgBg_T1_PD_Bia...
...cation/graphs_BrainWeb_FgBg_T1_PD_Bias_40_Average.ps} {(c)} {(f)}
\end{figure}


Suyash P. Awate 2007-02-21