Introduction

At present, the standard treatment for patients with locally advanced rectal cancer consists of a long course of neoadjuvant chemoradiation treatment (CRT) followed by surgical resection. As surgery is routinely performed in each patient, regardless of the response to treatment, response evaluation after CRT has so far not been a major issue. Nowadays there is, however, a trend towards minimally invasive treatments instead of standard surgery for well-responding patients [1–3]. Accurate response assessment then becomes relevant, as it may directly influence treatment planning. 18F-Fluorodeoxyglucose-Positron Emission Tomography (FDG-PET) and MRI have been most extensively studied for response evaluation, but these techniques suffer from limitations in the interpretation of fibrotic scar tissue and inflammation [4, 5]. Diffusion-weighted MR Imaging (DWI) is a functional imaging technique that analyses differences in the extracellular movement of water protons to discriminate between tissues of varying cellularity [6]. Different publications on DWI have shown its potentially beneficial role for the detection and characterisation of malignant tumours [7–9]. In addition, changes in tumour diffusion during and after treatment are indicative of tissue changes on a cellular level and may be used to evaluate treatment response [10, 11]. Previous studies in a variety of tumour types have suggested that quantitative interpretation of the apparent diffusion coefficient (ADC) can be used as a biomarker for response to treatment [12–15]. For rectal cancer patients specifically, a benefit for treatment response evaluation by measuring tumour ADC values before [16–19], during [16–18, 20, 21], and after chemoradiation treatment has been suggested [22, 23]. Nevertheless, as also previously pointed out in a review by Patterson et al. [10], there is no consensus yet on the true clinical value of ADC measurements for response assessment in rectal cancer. This is because the available literature consists of mainly small-scale studies with conflicting results. Moreover, in most studies, DWI evaluation was only performed by a single reader and ADC measurements by a variety of methods for region of interest (ROI) placement. Whereas some authors included the whole tumour volume [17–19, 22, 24], others included only a single tumour slice [16, 21] or small tumour samples [23], which may contribute to the large variety in reported ADC results. It remains unclear whether ROIs for ADC measurements should ideally incorporate the entire tumour volume or only a representative tumour section. Furthermore, none of the studies focusing on rectal tumour ADC have addressed the issue of interobserver variability, which is a non-negligible factor when considering the use of ADC as a potential marker for response in clinical practice.

The purpose of the current study is to assess the influence of ROI size and positioning on interobserver variability and ADC values when measuring tumour ADC before and after chemoradiation treatment in patients with locally advanced rectal cancer. We aim to determine which method offers the most reproducible results in order to provide a reference for further studies.

Materials and methods

Patients

This study retrospectively evaluated 46 patients who were treated for locally advanced rectal cancer between 2006 and 2010. Clinical patient data were retrieved from a patient database originating from a previous imaging study approved by the local institutional review board, for which the patients provided written informed consent. Thirty-four patients were male and 12 were female. Median age was 70 years (range 49–88). Inclusion criteria consisted of [a] histologically (biopsy) proven rectal adenocarcinoma, [b] locally advanced disease, defined on primary staging T2-weighted MRI by an experienced gastrointestinal radiologist as tumour in the distal rectum (≤5 mm from the anorectal junction), threatened or involved circumferential resection margins (≤2 mm margin between the tumour and mesorectal fascia) and/or positive nodal stage (≥1 suspicious nodes, i.e. >5 mm in size and/or heterogeneous signal intensity and/or irregular border), [c] treatment consisting of a long course of preoperative CRT (50.4 Gy radiation + 2 × 825 mg/m²/day capecitabine) followed by surgical resection and [d] availability of pre- and post-CRT MR imaging including DWI. Patients with non-resectable and/or metastatic disease were excluded. Mucinous tumours are known to have a very low cellular density and will therefore exhibit high ADC values [25]. As this may bias the study results, patients with predominantly mucinous appearing tumours (identified as predominantly high signal lesions on T2-weighted MRI) were also excluded.

MR imaging

Patients did not receive bowel preparation or spasmolytics. Imaging was performed at 1.5 T (Intera; Philips Medical Systems, Best, The Netherlands) using a phased array body coil. All patients underwent a pre-treatment MRI for primary tumour staging and a second, restaging MRI for response evaluation 6–8 weeks after completion of CRT. The imaging protocol consisted of standard 2D T2-weighted (T2W) fast spin-echo sequences (FSE) in three orthogonal directions and an axial DWI single-shot echo planar imaging sequence, according to the method of diffusion-weighted imaging with background body signal suppression (DWIBS), acquired with b-values of 0, 500 and 1000 s/mm² [26]. The sequence parameters are displayed in Table 1. The axial T2W and DWI sequences were angled in identical planes and were planned perpendicular to the tumour axis as defined on sagittal MRI. ADC maps in greyscale were automatically generated at the operating system, using a monoexponential decay model including all three b-values.
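To illustrate the monoexponential decay model mentioned above, S(b) = S0 · exp(−b · ADC), the following minimal Python sketch estimates the ADC of a single voxel by log-linear least squares over the three b-values. The function name `fit_adc` and the example signal intensities are hypothetical. The fitting procedure used by the scanner software is not described in this study and may differ from this sketch.

```python
import math

def fit_adc(b_values, signals):
    """Fit S(b) = S0 * exp(-b * ADC) by least squares on ln(S) versus b.

    b_values in s/mm^2, signals in arbitrary units (must be > 0).
    Returns (adc, s0) with adc in mm^2/s.
    """
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two matched (b, signal) pairs")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take the logarithm")

    log_s = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(log_s) / n

    # Ordinary least-squares slope and intercept of ln(S) against b.
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    if sxx == 0:
        raise ValueError("b-values must not all be identical")
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, log_s))
    slope = sxy / sxx

    adc = -slope                               # ADC is the negative slope
    s0 = math.exp(mean_y - slope * mean_b)     # intercept gives ln(S0)
    return adc, s0

# Hypothetical voxel: synthetic signals generated with ADC = 1.0e-3 mm^2/s.
b_values = [0, 500, 1000]
signals = [1000.0 * math.exp(-b * 1.0e-3) for b in b_values]
adc, s0 = fit_adc(b_values, signals)
```

With noise-free synthetic signals the fit recovers the ADC used to generate them. With real data, the three points will generally not lie on a single line in log space, and the least-squares slope is a compromise between them.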

Table 1 Sequence parameters

Image evaluation

The MR images were independently analysed by two radiological researchers (DMJL and TT), who performed tumour ADC measurements on the pre- and post-chemoradiation images. The readers were blinded to each other’s results, the clinical patient data and pathology reports. Mean tumour ADC was evaluated by manually drawing regions of interest (ROI) on the high b-value (b1000) diffusion images and copying them to the corresponding ADC map (Fig. 1). The mean ADC ± standard deviation (SD) and the number of pixels per ROI were recorded for each individual measurement. On the pre-treatment b1000 diffusion images, tumour was defined as a focal mass showing high signal intensity compared with the signal of the normal adjacent rectal wall and corresponding with the tumour (mass showing intermediate signal intensity) on the anatomical T2-weighted MRI. On the post-chemoradiation DWI, tumour was defined as focal areas of residual high signal on the b1000 images within the location of the primary tumour bed and/or corresponding with residual tumour on T2-weighted MRI (Fig. 2). The pre-treatment images were at the readers’ disposal when analysing the post-treatment images, in order to compare and identify the location of the tumour. When no remaining high signal could be visualised on DWI, three sample measurements were obtained of the rectal wall at the former location of the primary tumour, of which an example is illustrated in Fig. 3.

Fig. 1

Axial T2-weighted image (a), b1000 diffusion image (b) and ADC map (c) of a male patient with a tumour in the rectum. For the whole-volume and single-slice methods, ADC was measured by drawing freehand ROIs along the high signal intensity border of the tumour on the b1000 images (b) to cover the entire tumour area. ROIs were copied to the ADC map (c) to calculate ADC. For the solid sample method, tumour ADC was measured by drawing three oval- or round-shaped ROIs within the most solid tumour areas

Fig. 2

Axial pre- (a) and post-treatment (b) T2-weighted images of a male patient with a rectal tumour. After treatment, the tumour has undergone mainly fibrotic changes (arrowheads). On the corresponding b1000 diffusion image, an ROI was drawn along a well-defined area of high signal intensity within the fibrosis, suggestive of residual tumour. At histology, a residual ypT2 tumour was found

Fig. 3

Axial T2-weighted images of a male patient with a rectal tumour before (a) and after (b) chemoradiation treatment. After CRT, the rectal wall has normalised (arrowheads). On the corresponding b1000 diffusion image (c), no high signal was observed and ROIs were placed within the rectal wall at the location of the primary tumour to measure post-treatment ADC. At histology, the patient had undergone a complete response

ROI protocols

Mean tumour ADCs were measured according to three distinct ROI protocols: [a] ‘Whole-volume’, [b] ‘Single-slice’ and [c] ‘Solid tumour samples’. For the whole-volume method, freehand ROIs were drawn along the border of the high signal of the tumour on the b1000 images to cover the entire tumour area of each consecutive tumour-containing slice. Mean ADC (±SD) was obtained for each slice and ADC values were averaged to calculate the mean ADC of the whole tumour volume. For the single-slice method, a single freehand ROI was drawn in the same way (along the border of the tumour), but only on the single slice containing the largest available tumour area. For the third method, mean ADC was calculated from a sample of three round/oval-shaped ROIs that were placed within the most solid tumour part (as identified on T2W-MRI) of three independent tumour-containing slices, of which an example is illustrated in Fig. 1.
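The three ROI protocols can be sketched in Python as follows. This is a minimal illustration, not the software used in the study: each ROI is represented as a plain list of ADC pixel values, all function names and example numbers are hypothetical, and the whole-volume mean is taken as the unweighted average of the per-slice means, which is one reading of the description above.

```python
from statistics import mean

def whole_volume_adc(slice_rois):
    """Mean ADC over all tumour-containing slices.

    Assumption: per-slice means are averaged without weighting by ROI size.
    """
    return mean(mean(roi) for roi in slice_rois)

def single_slice_adc(slice_rois):
    """Mean ADC of the one slice whose freehand ROI holds the most pixels."""
    largest_roi = max(slice_rois, key=len)
    return mean(largest_roi)

def solid_sample_adc(sample_rois):
    """Mean ADC over the small round/oval sample ROIs (three in the protocol)."""
    return mean(mean(roi) for roi in sample_rois)

# Hypothetical ADC pixel values (x 10^-3 mm^2/s) for three tumour slices.
slices = [
    [1.0, 1.2],
    [0.9, 1.1, 1.0, 1.4],
    [0.8],
]
# Hypothetical sample ROIs placed in the most solid tumour parts.
samples = [[0.9, 0.9], [1.0, 0.8], [0.9]]

whole = whole_volume_adc(slices)     # average of slice means 1.1, 1.1 and 0.8
single = single_slice_adc(slices)    # mean of the four-pixel slice
solid = solid_sample_adc(samples)    # average of sample means 0.9, 0.9 and 0.9
```

Whether slice means should instead be weighted by the number of pixels per ROI is a design choice that changes the whole-volume value whenever slice areas differ.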

Statistical analyses

Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS, version 16.0; SPSS Inc., Chicago, IL, USA). Interobserver variability between the two readers was analysed for the pre- and post-CRT ADC measurements and for each individual ROI method according to the method of Bland and Altman and by calculating the intraclass correlation coefficient (ICC; 0.00–0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good and 0.81–1.00 excellent correlation). ADCs were averaged between the two observers for further analyses. A paired samples t-test was used to compare [a] the pre- and post-treatment ADCs and [b] the tumour ADC values obtained by the three different ROI methods. For each patient, the average variance was calculated over the different slice measurements, weighted with the number of pixels. The mean SD for each patient was calculated as the square root of the variance. The variance (mean for the whole patient group) of the different ROI measurement methods and for the pre- and post-CRT measurements was compared using the F-statistic with the total number of slices as the degrees of freedom. P values <0.05 were considered statistically significant.
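The agreement measures described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reproduction of the SPSS analysis: the text does not specify which ICC model was used, so a two-way random-effects, absolute-agreement, single-measures ICC (often written ICC(2,1)) is assumed here, and the Bland-Altman limits use the conventional mean difference ± 1.96 SD. All function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev

def bland_altman(reader_a, reader_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)   # conventional 95% limits of agreement
    return bias, bias - spread, bias + spread

def icc_2_1(ratings):
    """ICC(2,1) from rows of ratings: one row per patient, one column per reader.

    Assumption: two-way random effects, absolute agreement, single measures.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)                              # between patients
    ms_cols = ss_cols / (k - 1)                              # between readers
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def icc_label(icc):
    """Map an ICC to the agreement categories listed in the text."""
    for upper, label in ((0.20, "poor"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "good")):
        if icc <= upper:
            return label
    return "excellent"

def pixel_weighted_sd(slice_sds, pixel_counts):
    """Per-patient SD: square root of the pixel-weighted mean slice variance."""
    total = sum(pixel_counts)
    variance = sum(n * sd ** 2 for sd, n in zip(slice_sds, pixel_counts)) / total
    return math.sqrt(variance)

# Hypothetical ADC readings (x 10^-3 mm^2/s) of two readers in four patients.
reader_a = [1.0, 1.2, 0.9, 1.1]
reader_b = [1.1, 1.1, 1.0, 1.0]
bias, lower, upper = bland_altman(reader_a, reader_b)
icc = icc_2_1([list(pair) for pair in zip(reader_a, reader_b)])
```

Different ICC models (for example consistency rather than absolute agreement) can give noticeably different values for the same data, so the model should be stated whenever ICCs are compared between studies.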

Results

Patient and treatment characteristics

Twenty-seven patients underwent a low anterior resection, 15 an abdominoperineal resection and 4 more extended surgery. At histology 6 patients had a ypT0, 5 ypT1, 14 ypT2, 20 ypT3 and 1 a ypT4 status. Thirty-three patients had a ypN0, 9 ypN1 and 4 ypN2 status.

Effect of ROI methods

The mean tumour ADCs, SDs and total ROI sizes are displayed in Table 2 for the pre- and post-treatment measurements of each respective ROI protocol. Mean pre-treatment tumour ADC was significantly lower when measured by means of small sample ROIs, compared with the whole-volume (p < 0.001) or single-slice protocol (p < 0.001), respectively. For the post-CRT measurements there were no significant differences in tumour ADC between the whole-volume ROIs compared with the single-slice (p = 0.07) or small sample ROIs (p = 0.08), respectively, but the single-slice ROIs resulted in significantly higher ADCs compared with the small sample ROIs (p = 0.002). For the pre-CRT measurements, the variance (SD) of the small sample ROI measurements was significantly smaller than for the whole-volume ROIs (p < 0.001) and single-slice ROIs (p = 0.03), respectively. For the post-CRT measurements, the variance of the small sample ROIs was also smaller than that of the whole-volume ROIs (p = 0.003) and single-slice ROIs, although the latter difference was not statistically significant (p = 0.06). There were no significant differences in tumour ADC or variance between the whole-volume and single-slice approaches.

Table 2 Influence of choice of regions of interest (ROIs)

Interobserver variability

Intraclass correlation coefficients between the two readers are provided in Table 3 for the three ROI protocols. The interobserver reproducibility was excellent (ICC 0.91) for the pre-CRT whole-volume ADC measurements, and good (ICC 0.66) for the post-CRT measurements. For the single-slice and solid sample ROIs, the ICCs ranged from 0.42 to 0.65. Figure 4 displays the Bland-Altman plots for the whole-volume measurements performed pre- and post-CRT.

Table 3 Interobserver variability (measured as the intraclass correlation coefficient*) for the different ROI protocols

Fig. 4

Interobserver reproducibility for the whole-volume tumour ADC measurements performed pre- and post-chemoradiation treatment. Bland-Altman plots of the mean ADC of the two observers (x-axis) against the difference in ADC between the two observers (y-axis). The continuous lines represent the mean absolute difference (bias) in ADC between the two observers; the dashed lines represent the 95% confidence intervals of the mean differences (limits of agreement)

Discussion

The results of this study show that, when measuring ADC in patients with locally advanced rectal cancer, tumour ADC values and interobserver variability are highly dependent on methods of ROI analysis. ADC measurements obtained from the whole tumour volume are more reproducible than those obtained from single-slice or small sample measurements. In particular, pre-treatment whole-volume ADC measurements result in excellent interobserver reproducibility.

The number and size of the ROIs affected the interobserver agreement. When comparing the different ROI protocols, the single-slice and sample ROIs resulted in considerably poorer interobserver agreement (ICC 0.42–0.65) than the whole-volume ROIs (ICC 0.66–0.91), indicating that analysing a larger number of pixels results in more reproducible ADC values. Interobserver agreement for the whole-volume ADC measurements before treatment was excellent (ICC 0.91), but results after treatment were poorer (ICC 0.66). After chemoradiation, rectal tumours have often undergone massive fibrotic changes and defining a region of tumour residue within the fibrosis may be more difficult (Fig. 5). In cases where the tumour has completely regressed and the bowel wall has normalised or become fibrotically thickened, it can be even more challenging to correctly define an ROI (Fig. 3). After CRT, ADC measurements thus seem to be more affected by the interpretation skills of the reader than before CRT, when the tumour is generally better defined.

Fig. 5

Axial T2-weighted images of a male patient with a rectal tumour before (a) and after (b) chemoradiation treatment. An ill-defined residual area of hypointense signal intensity, indicative of fibrosis, is visible after CRT (arrowheads). On the corresponding diffusion image (c) there is still an area of high signal intensity, suggestive of residual tumour (arrows). Because of its irregular aspect and ill-defined borders, however, it is difficult to delineate an ROI, explaining the relatively poor interobserver agreement for the post-CRT ADC measurements. At histology, a ypT1 residual tumour was found

The choice of ROIs also significantly influenced the tumour ADC values. On pre-CRT MRI, the whole-volume and single-slice ROIs resulted in significantly higher tumour ADC values than the small sample ROIs. The small sample ROIs only included the most viable solid tumour parts, which may explain the lower ADC values. In this setting, areas of necrosis are likely to be excluded from the ADC measurements, while the presence of necrosis before onset of treatment is in fact believed to be an important indicator when aiming at evaluating response. A previous study by Roth and co-authors showed that whole-volume tumour ADC measurements were a better predictor of response than ROIs chosen only from viable regions of the tumour [18]. Although the focus in their study was on perfusion CT in patients with colorectal cancer, Goh et al. also found that, when obtaining pharmacokinetic parameters by applying different ROI sizes and positions, whole tumour volume measurements were the most reliable [27]. The above-described phenomenon may also explain why the whole-volume ADC measurements resulted in a larger variance and higher standard deviations, which is likely to reflect the heterogeneous nature of the tumour, including solid foci as well as areas of necrosis and fibrosis. Altogether these findings suggest that whole-volume measurements might be a better indicator of tumour viability and may therefore be more suitable for assessment of response. Furthermore, as was also stressed by Goh et al. [27], if variations in ROI substantially influence the measurements, efforts should be made to standardise their application for clinical use. Interestingly, we observed no significant differences in tumour ADC or SD between the whole-volume measurements and the single-slice approach, suggesting that the latter may also be used as a less time-consuming alternative. However, one should keep in mind that the single-slice method was subject to a much larger interobserver variability and whole-volume measurements thus remain the single most reliable method.

Our study is limited because of its retrospective nature and the relatively small patient numbers. Furthermore, it was sometimes difficult to position regions of interest due to susceptibility artefacts occurring around air-tissue interfaces. This was especially challenging after chemoradiation, in cases where only a limited or no residual tumour could be identified on DWI. Susceptibility artefacts might be minimised by applying rectal wall distension with intraluminal filling, which we have not done in the current study. The specific focus of this study was to determine the effect of ROI size and positioning on tumour ADC evaluation and not to assess the relation between ADC and response, as various previous authors have done [16–24]. As such, we chose not to include a correlation between ADC and histopathological parameters of response.

In conclusion, variations in ROI size and positioning have a significant effect on tumour ADC values and interobserver variability. The most reproducible results are obtained when measuring ADC of the whole tumour volume. Interobserver variability is larger after chemoradiation treatment than before. These issues should be taken into account when considering the use of ADC as a potential biomarker for response in clinical practice.