
1 Introduction

The aim of radiotherapy treatment planning (RTP) is to deliver an optimal dose of radiation to the target while sparing the normal tissues. RTP first requires contouring the target and organs at risk (OARs). Once these volumes have been defined, the optimal dose distribution for treating the tumour is determined according to the attenuation properties of the different tissues. Most radiotherapy treatments are planned using an X-ray computed tomography (CT) scan of the patient. The acquisition of a CT is fast and the tissue attenuation coefficients can easily be derived from the CT intensity values in Hounsfield units (HU). However, CT images have low soft tissue contrast, which can lead to large variations when delineating the organs, particularly those located in the brain, head and neck, or pelvic regions. Magnetic resonance (MR) imaging is often preferred over CT as a structural imaging modality, mainly for its excellent soft tissue contrast. Although increasingly used in clinical practice, the role of MR in RTP is currently limited by the fact that it does not readily provide electron density information, hampering the calculation of dose distributions. In this work, we propose to tackle the problem of RTP from MR images by developing a multi-atlas propagation framework to jointly delineate the OARs and estimate the tissue attenuation properties.

Multi-atlas propagation was first introduced for segmentation purposes [1, 2]. The technique relies on a database of atlases, each composed of an intensity image and a segmented image. To segment a target image, a first step consists of registering the atlas intensity images to the target intensity image and applying the resulting transformations to the associated segmented images. A second step consists of fusing the propagated segmented images to generate the target segmentation. The technique was later extended to the synthesis of images by propagating intensity images instead of segmented images, for example CT images [3]. Multi-atlas CT synthesis methods have been developed for RTP [4–6], but only a few have been applied outside of the brain [7, 8].

Even though atlas-based segmentation and CT synthesis methods have been successfully applied to RTP independently, a key requirement for expanding MR-based planning is to guarantee that the segmentations and pseudo CT generated from the MR images match each other, i.e. a voxel labelled as bone should have a bone density value in the pseudo CT image. This is not guaranteed if the segmentation and CT synthesis tasks are handled separately. Dowling et al. [8] proposed to combine CT synthesis and segmentation using a database of atlases, each composed of an MR, a CT and a segmented image. The atlas MR images were first registered to the target MR image and the same transformations were applied to both the atlas CT and segmented images. A local weighted voting method was then used to generate the target CT and segmented images. In this paper, we develop an iterative multi-atlas propagation framework that combines segmentation and CT synthesis in a single pipeline, with the aim of improving both segmentation and synthesis accuracy while guaranteeing consistent results.

2 Methods

2.1 Database Building

A multi-atlas database consisting, for each atlas, of a T2-weighted and a T1-weighted MR image, a CT image, and a segmented image was created by non-rigidly registering each subject's CT and T1 images to their T2 image, after resampling the MR images to isotropic resolution and correcting for intensity non-uniformity. The segmentations were obtained by manually contouring the T2 image. The number of atlases was artificially increased by left-right flipping the images.
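As an illustration of the left-right flipping step, the sketch below mirrors one atlas channel along an assumed left-right voxel axis; the file names, the axis index and the use of nibabel are assumptions for illustration, not part of the original pipeline description.

```python
import numpy as np
import nibabel as nib

def flip_left_right(in_path, out_path, lr_axis=0):
    """Mirror an image along the (assumed) left-right voxel axis to
    artificially double the number of atlases."""
    img = nib.load(in_path)
    flipped = np.flip(img.get_fdata(), axis=lr_axis)
    nib.save(nib.Nifti1Image(flipped, img.affine), out_path)

# Hypothetical atlas entry: all channels already resampled to the T2 space.
for channel in ("T2", "T1", "CT", "SEG"):
    flip_left_right(f"atlas01_{channel}.nii.gz", f"atlas01_{channel}_flip.nii.gz")
```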

2.2 Joint Iterative Segmentation and Image Synthesis

In the proposed iterative framework, a set of probabilistic segmentations and pseudo CT (pCT) images is jointly generated from the target’s MR images by registering the atlases to the target, and fusing the atlas segmentations and CT images according to the similarity between the target and atlases. Similarly to [9], this set is then combined with the target MR images and used to improve the subsequent multi-atlas registration and fusion. A diagram illustrating the proposed method is shown in Fig. 1.

Fig. 1. Joint segmentation and CT synthesis at iteration t. All the atlases are non-rigidly registered to the target. A local similarity measure between the mapped atlases and the target is used to jointly generate a pseudo CT and a segmented image.

Joint Iterative Label and Intensity Fusion. Let the target subject’s dataset at iteration t be denoted by \(\mathbb {I}_t = \lbrace I^{T2}, I^{T1}, I^{pCT}_{t-1}, I^{S}_{t-1} \rbrace \) with \(I^{T2}\) and \(I^{T1}\) the T2- and T1-weighted MR images, and \(I^{pCT}_{t-1}\) and \(I^{S}_{t-1}\) the pCT and segmented images obtained at iteration \(t-1\). Let the dataset of the nth atlas in the database be denoted by \(\mathbb {I}_n = \left\{ I^{T2}_n, I^{T1}_n, I^{CT}_n, I^{S}_n \right\} \) where \(I^{CT}_n\) corresponds to the real CT and \(I^{S}_n\) to the manual segmentation.

The first step of the method consisted of registering each atlas to the target subject. This inter-subject coordinate mapping was obtained using a symmetric global registration followed by a cubic B-spline parametrised non-rigid multi-channel registration, as implemented in NiftyReg [10]. The local normalised cross-correlation was used as the similarity measure for the MR channels, the sum of squared differences for the CT channel (exploiting the quantitative nature of the CT intensities), and the Kullback-Leibler divergence for the segmentation channels. At each iteration t, a new transformation \(\mathcal {T}_{n,t}\) mapping atlas n to the target was estimated, generating a set of images aligned to the target: \(\mathbb {J}_{n,t} = \left\{ J^{T2}_{n,t}, J^{T1}_{n,t}, J^{CT}_{n,t}, J^{S}_{n,t} \right\} \) where \(J^X_{n,t} = I^X_{n} \left( \mathcal {T}_{n,t}(x) \right) \).
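For concreteness, a minimal sketch of the two-stage mapping using the NiftyReg command-line tools is given below; the file names are placeholders, only a single (T2) channel is shown, and the per-channel similarity options of the multi-channel registration are omitted.

```python
import subprocess

def register_atlas_to_target(target_t2, atlas_t2, atlas_id, out_dir):
    """Symmetric global (affine) registration followed by a cubic B-spline
    non-rigid refinement with NiftyReg. Single-channel illustration only;
    the paper combines LNCC (MR), SSD (CT) and KLD (segmentation) channels."""
    aff = f"{out_dir}/{atlas_id}_affine.txt"
    cpp = f"{out_dir}/{atlas_id}_cpp.nii.gz"
    # Stage 1: global registration (reg_aladin).
    subprocess.run(["reg_aladin", "-ref", target_t2, "-flo", atlas_t2,
                    "-aff", aff,
                    "-res", f"{out_dir}/{atlas_id}_T2_aff.nii.gz"], check=True)
    # Stage 2: B-spline parametrised non-rigid registration (reg_f3d),
    # initialised with the affine transformation.
    subprocess.run(["reg_f3d", "-ref", target_t2, "-flo", atlas_t2,
                    "-aff", aff, "-cpp", cpp,
                    "-res", f"{out_dir}/{atlas_id}_T2_nrr.nii.gz"], check=True)
    # The resulting transformation (cpp) can then be used to resample the
    # atlas CT and segmentation channels into the target space.
    return cpp
```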

The pCT and probabilistic segmentations were then obtained by fusing the N atlases mapped to the target subject as follows:

$$\begin{aligned} I^{pCT}_t(x) &= \frac{\sum _{n=1}^{N} w_{n,t}(x) \, J^{CT}_{n,t}(x)}{\sum _{n=1}^{N} w_{n,t}(x)} , \\ I^{S_l}_t(x) &= \frac{\sum _{n=1}^{N} w_{n,t}(x) \, V^l_{n,t}(x)}{\sum _{n=1}^{N} w_{n,t}(x)} \quad \text {with} \quad V^l_{n,t}(x) = \begin{cases} 1 & \text {if } J^{S}_{n,t}(x)=l ,\ l \in \{1,\dots ,L\} \\ 0 & \text {otherwise} \end{cases} \end{aligned}$$
(1)

l indexes the labels and L is the total number of possible labels. \(V^l_{n,t}\) is the vote for label l produced by the nth atlas. If required, the categorical label \(\mathcal {L}\) at location x can be obtained as \(\mathcal {L}(x)=\arg \max _l I^{S_l}_t(x)\).
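As a concrete illustration of Eq. (1), the sketch below fuses the warped atlas CTs and label maps with per-voxel weights; the array layout (atlases stacked along the first axis) and zero-based label coding are assumptions made for the example.

```python
import numpy as np

def fuse_atlases(ct_stack, label_stack, weights, n_labels):
    """Voxel-wise fusion of N warped atlases (Eq. 1).
    ct_stack, label_stack, weights: arrays of shape (N, X, Y, Z);
    labels assumed to be coded 0..n_labels-1."""
    norm = weights.sum(axis=0)                       # sum_n w_{n,t}(x)
    pseudo_ct = (weights * ct_stack).sum(axis=0) / norm
    prob_seg = np.empty((n_labels,) + ct_stack.shape[1:])
    for l in range(n_labels):
        votes = (label_stack == l).astype(float)     # V^l_{n,t}(x)
        prob_seg[l] = (weights * votes).sum(axis=0) / norm
    categorical = prob_seg.argmax(axis=0)            # argmax_l I^{S_l}_t(x)
    return pseudo_ct, prob_seg, categorical
```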

The weighting factor \(w_{n,t}(x)\) was obtained by applying an exponential decay function to the rank \(r_{n,t}(x)\) of atlas n according to the local measure used to assess the similarity between atlas n and the target: \(w_{n,t}(x)=e^{-\beta _t r_{n,t}(x)}\). After each iteration, the registration of all the atlases improves and more atlases can contribute to the fusion. As a smaller \(\beta \) means that more atlases contribute to the average, we set \(\beta \) to decrease with the number of iterations (by 0.125 per iteration, starting from \(\beta _1=0.75\)).
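A sketch of this weighting scheme is given below, assuming the local similarities of the N atlases are stacked along the first axis; the convention that the most similar atlas receives rank 0 is an implementation choice, not stated in the paper.

```python
import numpy as np

def rank_weights(similarity_stack, iteration, beta1=0.75, step=0.125):
    """w_{n,t}(x) = exp(-beta_t * r_{n,t}(x)), with beta decreasing by
    `step` per iteration from beta1, and r the per-voxel similarity rank."""
    beta = beta1 - step * (iteration - 1)
    # Double argsort: for every voxel, rank each atlas by decreasing
    # similarity (rank 0 = most similar atlas).
    order = np.argsort(-similarity_stack, axis=0)
    ranks = np.argsort(order, axis=0)
    return np.exp(-beta * ranks)
```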

Note that at the first iteration both the inter-subject mapping and fusion steps were based on the MR images only.

Convolution-Based Local Similarity Measures. To locally select the atlases used in the fusion, a combination of two similarity measures computed between the target and atlases was used. The structural similarity [11] extended to irregular regions-of-interest (ROI) was computed on the MR and CT channels. The ROI-SSIM between images I and J at voxel x is given by

$$\begin{aligned} \text {ROI-SSIM}(I(x),J(x))=\frac{2 \mu _{I}(x)\mu _{J}(x)+C_1}{\mu _{I}^2(x) + \mu _{J}^2(x) + C_1} \frac{2 \sigma _{I, J}(x)+C_2}{\sigma _{I}^2(x)+\sigma _{J}^2(x)+C_2} . \end{aligned}$$
(2)

\(C_1\) and \(C_2\) are two constants used to improve the stability of the structural similarity [11]. Let \(\varOmega \) be a density function equal to 1 where the fields of view (FOV) overlap, and 0 otherwise. The means and standard deviations at voxel x were calculated using a Gaussian kernel \(G_{\sigma _G}\) with standard deviation \({\sigma _G}\) through density normalised convolution

$$\begin{aligned} \mu _{I}(x) = \frac{\left[ G_{\sigma _G} * (\varOmega \, I) \right] (x)}{\left[ G_{\sigma _G} * \varOmega \right] (x)} , \quad \sigma _{I,J}(x) = \frac{\left[ G_{\sigma _G} * (\varOmega \, I \, J) \right] (x)}{\left[ G_{\sigma _G} * \varOmega \right] (x)} - \mu _{I}(x) \, \mu _{J}(x) , \quad \sigma _{I}^2(x) = \sigma _{I,I}(x) , \end{aligned}$$

where \(*\) denotes the convolution operator and \(G_{\sigma _G}*\varOmega \) represents a density normalisation term that compensates for areas with missing information. As the values of ROI-SSIM are only valid within the bounds of the FOV, values outside the FOV were set to \(-\infty \).
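The local statistics of Eq. (2) can be sketched with Gaussian filtering as below; the kernel width and the values of C1 and C2 are placeholders, and scipy is assumed for the convolution.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def roi_ssim(I, J, fov, sigma=3.0, C1=1e-4, C2=1e-3):
    """ROI-SSIM (Eq. 2) with means, variances and covariance obtained by
    density-normalised Gaussian convolution over the overlapping FOV."""
    fov = fov.astype(float)
    density = np.maximum(gaussian_filter(fov, sigma), 1e-12)  # G_sigma * Omega

    def local_mean(X):
        return gaussian_filter(X * fov, sigma) / density

    mu_i, mu_j = local_mean(I), local_mean(J)
    var_i = local_mean(I * I) - mu_i ** 2
    var_j = local_mean(J * J) - mu_j ** 2
    cov_ij = local_mean(I * J) - mu_i * mu_j
    ssim = ((2 * mu_i * mu_j + C1) / (mu_i ** 2 + mu_j ** 2 + C1)) * \
           ((2 * cov_ij + C2) / (var_i + var_j + C2))
    return np.where(fov > 0, ssim, -np.inf)          # invalid outside the FOV
```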

A local fuzzy Dice similarity coefficient (DSC), defined per label l and summed over all labels, was used to assess the local overlap between the segmented images \(I^S\) and \(J^S\)

$$\begin{aligned} \text {LDSC}(I^S(x),J^S(x))= \sum _{l \in \{1..L\}} \frac{2 \min \left( \mu _{I^{S_l}}(x),\mu _{J^{S_l}}(x) \right) }{\mu _{I^{S_l}}(x) + \mu _{J^{S_l}}(x)} . \end{aligned}$$
(3)

\(\mu _{I^{S_l}}\) was obtained by convolving the segmentation density \(I^{S_l}\) with the Gaussian kernel \(G_{\sigma _G}\): \(\mu _{I^{S_l}}(x)=\left[ G_{\sigma _G}*I^{S_l}\right] (x)\).

The final local similarity measure (LSIM) computed at iteration t between the target and the nth atlas was obtained by summing the ROI-SSIM computed on the MR and CT channels and the LDSC computed on the segmentation channel: \(\text {LSIM}(\mathbb {I}_{t-1},\mathbb {J}_{n,t}) = \text {ROI-SSIM}(I^{T2},J^{T2}_{n,t}) + \text {ROI-SSIM}(I^{T1},J^{T1}_{n,t}) + \text {ROI-SSIM}(I^{pCT}_{t-1},J^{CT}_{n,t}) + \text {LDSC}(I^S_{t-1},J^S_{n,t})\).
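A sketch of the local fuzzy DSC (Eq. 3) and of the combined LSIM follows, reusing roi_ssim from the sketch above; the dictionary layout of the target and atlas datasets and the shared FOV mask are assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_dice(seg_i, seg_j, sigma=3.0, eps=1e-6):
    """Local fuzzy DSC (Eq. 3) between two segmentations given as
    (L, X, Y, Z) stacks of per-label densities."""
    ldsc = np.zeros(seg_i.shape[1:])
    for si, sj in zip(seg_i, seg_j):
        mu_i = gaussian_filter(si.astype(float), sigma)
        mu_j = gaussian_filter(sj.astype(float), sigma)
        ldsc += 2 * np.minimum(mu_i, mu_j) / (mu_i + mu_j + eps)
    return ldsc

def local_similarity(target, atlas, fov, sigma=3.0):
    """LSIM: ROI-SSIM on the T2, T1 and (pseudo) CT channels plus the
    local fuzzy DSC on the segmentation channel."""
    return (roi_ssim(target["T2"], atlas["T2"], fov, sigma)
            + roi_ssim(target["T1"], atlas["T1"], fov, sigma)
            + roi_ssim(target["pCT"], atlas["CT"], fov, sigma)
            + local_dice(target["seg"], atlas["seg"], sigma))
```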

3 Validation and Results

Data. The proposed framework was evaluated on 15 prostate cancer patients. Each subject had a T2-weighted (2D spin echo; TE/TR: 80/2500 ms; voxel size 1.46\(\times \)1.46\(\times \)5 \(\mathrm {mm}^{3}\)) and a T1-weighted (2D spin echo; TE/TR: 10/400 ms; voxel size 1.64\(\times \)1.64\(\times \)5 \(\mathrm {mm}^{3}\)) MR image, and a CT image (140 kVp; voxel size 0.98\(\times \)0.98\(\times \)1.5 \(\mathrm {mm}^{3}\)) acquired on the same day. The delineation of the OARs was performed by a qualified clinician.

The performance of the proposed methodology was compared with reference data for the 15 subjects following a leave-one-out cross-validation strategy. Four iterations of the pipeline were computed, referred to as local fusion iter t with \(t \in \{1,2,3,4\}\). An additional method (global fusion), equivalent to the first iteration but with global rather than per-voxel atlas selection, i.e. the weights \(w_{n,t}\) in Eq. (1) do not depend on x, was also implemented for comparison with the state of the art [2]. Examples of reference data and of results obtained with global fusion and with local fusion iter 1 and 4 are displayed in Fig. 2.

Fig. 2. The categorical segmentations of the prostate (red), bladder (orange), rectum (green) and femur heads (blue), and the pCTs were obtained from the T1-weighted and T2-weighted MR images of the patient. Note the increased smoothness of the segmentations and sharpness of the pCTs between global fusion and local fusion iter 4.

3.1 Segmentation Accuracy

The segmentation accuracy was assessed by computing the fuzzy DSC between the manual and atlas-based segmentations for the different OARs considered (bladder, prostate, rectum, and left (L) and right (R) femur heads). Results displayed in Fig. 3 show a statistically significant improvement for most OARs between the global and local fusion methods, and between the first and second iterations. Statistical significance was assessed using the paired one-tailed Wilcoxon signed-rank test, with a 5 % significance level. On average for the 4th iteration of the proposed method, the DSC was 0.90 for the bladder, 0.74 for the prostate, 0.76 for the rectum and 0.90 for both femur heads. Although the DSC does not significantly improve after the 2nd iteration, we observed an increase in the segmentation smoothness when we kept iterating, as illustrated in Fig. 4.
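For reference, the fuzzy DSC used for the evaluation can be computed per organ as in the sketch below, assuming a binary manual mask and a probabilistic atlas-based segmentation with values in [0, 1].

```python
import numpy as np

def fuzzy_dice(manual, probabilistic):
    """Fuzzy Dice score between a binary manual contour and a
    probabilistic segmentation of the same organ."""
    intersection = np.minimum(manual, probabilistic).sum()
    return 2.0 * intersection / (manual.sum() + probabilistic.sum())
```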

Fig. 3. Boxplots displaying the median, lower and upper quartiles, and minimum and maximum of the DSC calculated between the manual and atlas-based segmentations (left); and the MAE computed between the reference and pseudo CTs (right). The stars indicate a significant improvement between the current and previous measurements.

Fig. 4. 3D representation of the prostate manually delineated and of the probabilistic segmentations (thresholded at 0.5) obtained with the proposed method. Note how the smoothness of the segmentations increases with the number of iterations.

3.2 CT Synthesis Accuracy

The mean absolute error (\(\text {MAE}= \frac{1}{V} \sum _x{|I^{pCT}(x)-R^{CT}(x)|}\)) and the mean error (\(\text {ME}= \frac{1}{V} \sum _x{I^{pCT}(x)-R^{CT}(x)}\)) were calculated for every subject between the reference CT non-rigidly aligned to the MR (\(R^{CT}\)) and each of the pseudo CTs (\(I^{pCT}\)) in an ROI comprising V voxels. In both the full image and the bone region (delineated by thresholding the reference CT), the MAE showed a significant decrease in synthesis error between the global and local approaches, and between each iteration (Fig. 3). The low ME obtained for all the methods (−3 ± 10 HU on average in the full image and −2 ± 32 HU in the bone region) demonstrates an absence of bias in the methodology.
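A minimal sketch of the two error measures is given below, assuming the pseudo CT, the aligned reference CT and a binary ROI mask share the same voxel grid.

```python
import numpy as np

def mae_me(pseudo_ct, reference_ct, roi):
    """Mean absolute error and mean error (in HU) over the V voxels of a ROI."""
    diff = pseudo_ct[roi] - reference_ct[roi]
    return np.abs(diff).mean(), diff.mean()
```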

3.3 Dosimetry Calculations

Dose calculations were performed using the RayStation treatment planning system to assess the applicability of the proposed framework for RTP. We compared the cumulative dose volume histogram (DVH) computed on the pseudo CT obtained after the 4th iteration (\(I^{pCT}_4\)) to the DVH computed on the reference CT image, in the planning target volume (PTV) located in the prostate and in the OARs. The same contours were used for the pseudo and reference CT images. The DVHs displayed in Fig. 5 for a representative subject show close agreement between the doses calculated from the reference and pseudo CT images. For all the subjects, percentage differences were evaluated for several DVH points: D\(_{\text {98}\,\%}\), D\(_\text {mean}\) and D\(_{\text {2}\,\%}\) for the PTV, and D\(_\text {mean}\) and D\(_{\text {2}\,\%}\) for the OARs (bladder, rectum and femur heads). D\(_\text {x}\) is the dose given to x% of the structure volume and D\(_\text {mean}\) is the mean dose given to the evaluated volume. Results are displayed in Fig. 5. On average over all the DVH points considered, the percentage difference was within ±0.15 % for the PTV and all the OARs.
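A sketch of the DVH point extraction follows, assuming the per-voxel doses of a structure are available as an array and taking D_x as the (100 - x)th percentile of the voxel doses, i.e. the dose received by at least x% of the volume.

```python
import numpy as np

def dvh_points(dose_in_structure):
    """D98%, Dmean and D2% from the per-voxel doses of one structure."""
    d = np.asarray(dose_in_structure, dtype=float)
    return {"D98%": np.percentile(d, 2),    # dose received by 98% of the volume
            "Dmean": d.mean(),
            "D2%": np.percentile(d, 98)}    # near-maximum dose

def percentage_difference(metric_pseudo, metric_reference):
    """Percentage difference between pseudo-CT and reference-CT DVH metrics."""
    return 100.0 * (metric_pseudo - metric_reference) / metric_reference
```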

Fig. 5. Left: DVHs obtained for the reference CT (solid lines) and pseudo CT (dash-dot lines) in the PTV and OARs for a representative subject. Right: Boxplots of the percentage differences evaluated for several DVH points obtained for 15 subjects.

4 Discussion and Conclusion

This paper presents a joint segmentation and CT synthesis framework for MRI-only RTP able to automatically generate accurate pseudo CT images and OAR contours in the pelvic region. Solving the segmentation and synthesis tasks simultaneously not only ensures that the two solutions agree with each other, but the joint estimation also improves the accuracy of each task, as seen in Fig. 3, where an increase in segmentation overlap corresponds to a decrease in CT synthesis error.

When evaluating the segmentation accuracy obtained with their method, Dowling et al. [8] reported mean DSCs of 0.80, 0.86 and 0.84 for the prostate, bladder and rectum, respectively. Wong et al. [12] assessed the performance of several multi-atlas segmentation methods. With their recommended setting, the median DSC for the prostate, bladder, rectum and femurs was 0.84, 0.90, 0.77 and 0.95, respectively. Taking into account the lower resolution of the images used in this work (5 mm slice thickness versus 2 mm in [8] and 2.5 mm in [12]), we obtained comparable results with mean DSCs of 0.74, 0.89, 0.76 and 0.90.

Regarding the CT synthesis, the proposed method generates pseudo CTs as accurate as previously reported for the pelvic region, even though they were obtained from lower resolution MR images. Dowling et al. [8] reported an average MAE of 40.5 ± 8.2 HU while we obtained an average MAE of 42.9 ± 4.0 HU.

As future work, the probabilistic property of the segmentations obtained with the proposed framework could be used to automatically define margins when contouring organs and improve MR-based RTP.