Background

Ultrasound, computed tomography (CT) and magnetic resonance imaging (MRI) are widely recommended in the diagnosis of gynaecologic cancer. These conventional imaging modalities offer high sensitivity, specificity and accuracy in the primary staging of the disease. However, their accuracy and specificity for detecting pelvic tumor recurrences or postsurgical residual disease remain low because they are limited in distinguishing disease from postsurgical changes [1, 2]. CT and MRI may be used for target volume delineation in radiotherapy (RT) treatment planning of gynaecologic carcinomas. However, tumor extension is difficult to define reliably with either modality, especially after surgery. Recently, 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography/computed tomography (PET/CT) has been recognized as a valuable tool for the diagnosis of primary and recurrent gynaecological cancer, enabling the optimization of RT treatment planning [3, 4].

The objective of this study is to assess the role of 18F-FDG PET/CT-based target volume delineation in recurrent or post-surgical residual gynaecologic cancer. We compared the gross tumor volumes (GTVs) defined manually by four experienced radiation oncologists on contrast-enhanced CT and on fused 18F-FDG PET/CT images, as well as the biological target volumes (BTVs) defined on PET/CT with a semi-automated delineation technique. In addition, we evaluated the inter- and intra-observer variability in GTV delineation using the above-mentioned methods.

Methods

Patients

This prospective study was approved by the institutional ethical committee. Signed informed consent was obtained from all patients participating in the study protocol. Between September 2006 and December 2008, 10 patients with histologically proven locally recurrent (n = 5) or post-surgical residual (n = 5) gynaecological cancer were included. Patients showed no evidence of lymph node or distant metastases. Local recurrences were observed at a median of 34 months (range, 9-62 months) after surgery in 4 patients and following postsurgical radio-chemotherapy in 1 patient. The median age was 64 years (range, 40-81 years). The clinical characteristics and referral patterns of the patient population are summarized in Table 1.

Table 1 Tumor characteristics and CT- and PET/CT-based GTVs (gross tumor volumes)

18F-FDG PET/CT

All 10 patients underwent a diagnostic whole-body 18F-FDG PET/CT scan performed in treatment planning conditions on a Biograph 16 PET/CT scanner (Siemens Healthcare, Erlangen, Germany). Patients fasted for at least 6 hours before the start of the examination. A forced-diuresis protocol was used in all patients for better differentiation between the tumor and the bladder. Thirty minutes after the 18F-FDG injection, each patient received 0.5 mg of furosemide per kilogram of body weight (maximum, 40 mg) followed by an infusion of 500 mL of physiologic saline through an intravenous line. One hour after the 18F-FDG injection and directly after voiding of the bladder, patients were placed in the scanning position.

First, a topogram was obtained from the skull to the upper region of the legs. Second, 18F-FDG PET data were acquired at 3 to 4 minutes per bed position (6 to 7 bed positions in total), following a low-dose CT scan used for attenuation correction. A diagnostic-quality contrast-enhanced CT scan was then performed.

18F-FDG PET, CT and fused 18F-FDG PET/CT images were displayed for review in axial, coronal, and sagittal planes. All studies were interpreted and reviewed with knowledge of the patient’s clinical history and the results of previous imaging studies, including MRI of the pelvis in all patients. A combined team of an experienced nuclear medicine physician and an experienced radiologist interpreted the 18F-FDG PET/CT images. A multimodality computer platform (Syngo Multimodality Workplace, Siemens Healthcare, Erlangen, Germany) was used for image review and interpretation. All 18F-FDG PET/CT studies showing at least one site of abnormal 18F-FDG uptake were characterized as malignant. Foci of increased 18F-FDG uptake, with intensity higher than that of surrounding tissues, in areas unrelated to physiologic or benign processes, were defined as malignant. Tumor uptake of all lesions was assessed quantitatively using the maximum standardized uptake value (SUVmax), derived by placing a region of interest (ROI) encompassing the tumor on each transaxial slice.

Manual contouring protocol

Four experienced radiation oncologists were asked to delineate the GTVs on axial slices of the CT (GTVCT) and the 18F-FDG PET/CT (GTVPET/CT), respectively. Recent T2-weighted contrast enhanced MRI images were also available as additional information for contouring and for fusion on Syngo multimodality software (Siemens Healthcare, Erlangen, Germany). All scans were contoured with knowledge of the additional diagnostic images and reports.

The contouring process consisted of the following steps. First, the radiation oncologists delineated the GTV on the contrast-enhanced CT images alone (GTVCT1), blinded to the images and reports of the 18F-FDG PET. Then, after at least two weeks, the observers contoured the GTV on the fused 18F-FDG PET/CT images (GTVPET/CT1). To assess the intra-observer variability, all observers were asked to contour the target volume a second time two months later on the CT images (GTVCT2) and once again two weeks later on the 18F-FDG PET/CT images (GTVPET/CT2). They were blinded to their previous contours as well as to those of the other observers. The radiation oncologists were all trained in target volume delineation on PET/CT and were free to adjust the window, level and contrast settings of the images.

Signal-to-background ratio-based (SBR) adaptive thresholding (GTVSBR)

For GTVSBR delineation, the maximum signal intensity of the tumor was defined as the mean activity of the hottest voxel and its eight surrounding voxels in a transverse slice, whereas the mean background activity was obtained from a manually drawn ROI distant from the tumor [5]. The SBR-thresholding technique has been described in a previous publication by our group [6]. The GTVSBR contours were checked visually before approval.
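The two quantities described above, the peak signal and the signal-to-background ratio, can be illustrated with a minimal Python sketch. The function names and the toy slice are hypothetical. The threshold formula `(a + b / SBR) * peak` is only an assumed general form for adaptive thresholding, and the coefficients `a` and `b` are placeholders that would have to come from a scanner-specific calibration; the actual relation used in this study is the one given in references [5, 6], not this sketch.

```python
from typing import Sequence

Slice2D = Sequence[Sequence[float]]


def peak_signal(slice_2d: Slice2D) -> float:
    """Mean of the hottest voxel and its in-plane neighbours (eight when not at an edge)."""
    rows, cols = len(slice_2d), len(slice_2d[0])
    # Locate the hottest voxel in the transverse slice.
    r0, c0 = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: slice_2d[rc[0]][rc[1]],
    )
    # Collect the 3 x 3 neighbourhood, clipped at the slice border.
    block = [
        slice_2d[r][c]
        for r in range(max(r0 - 1, 0), min(r0 + 2, rows))
        for c in range(max(c0 - 1, 0), min(c0 + 2, cols))
    ]
    return sum(block) / len(block)


def signal_to_background_ratio(slice_2d: Slice2D, background_roi: Sequence[float]) -> float:
    """Peak signal divided by the mean activity of a background ROI."""
    background = sum(background_roi) / len(background_roi)
    return peak_signal(slice_2d) / background


def adaptive_threshold(peak: float, sbr: float, a: float, b: float) -> float:
    """ASSUMED form: threshold = (a + b / SBR) * peak; a and b are placeholders."""
    return (a + b / sbr) * peak


# Toy transverse slice: hottest voxel 18 surrounded by eight voxels of 9.
toy_slice = [
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 9.0, 9.0, 9.0, 1.0],
    [1.0, 9.0, 18.0, 9.0, 1.0],
    [1.0, 9.0, 9.0, 9.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
]
toy_background = [2.0, 2.0, 2.0, 2.0]  # hypothetical background ROI values

peak = peak_signal(toy_slice)
sbr = signal_to_background_ratio(toy_slice, toy_background)
threshold = adaptive_threshold(peak, sbr, a=0.4, b=0.5)  # a, b: illustrative only
```

Averaging over the 3 x 3 neighbourhood rather than taking the single hottest voxel makes the peak estimate less sensitive to image noise, which is presumably why the definition is phrased this way.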

Contour analysis

The delineated contours for both delineation phases were analyzed separately. First, the volumes contoured by every observer for GTVCT and GTVPET/CT were calculated for every patient separately, and the composite and common volumes of GTVCT and GTVPET/CT were calculated. For each observer, the composite volume is the union of GTVCT1 and GTVPET/CT1 (the volume covered by either contour), while the common volume is their intersection (the volume covered by both contours). To assess the geographic mismatch between the GTVs delineated using the different segmentation techniques, the following overlap analyses were performed: (A) the overlap volume of GTVCT1 and GTVPET/CT1 relative to the CT-based GTV, termed overlap fraction (OF) CT1 [OFCT1]; (B) the overlap volume of GTVCT1 and GTVPET/CT1 relative to the PET/CT-based GTV, termed overlap fraction PET/CT1 [OFPET/CT1]; and (C) the overlap volume of GTVPET/CT1 and GTVSBR relative to GTVSBR, termed overlap fraction SBR [OFSBR].
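The volume and overlap definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software used in the study: contours are represented as sets of voxel indices, the composite volume is taken to be the union and the common volume the intersection of the two contours, and the function name, the toy contours and the voxel volume are all hypothetical.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]


def overlap_metrics(gtv_a: Set[Voxel], gtv_b: Set[Voxel], voxel_volume_cm3: float) -> Dict[str, float]:
    """Composite/common volumes and overlap fractions for two voxelised contours."""
    if not gtv_a or not gtv_b:
        raise ValueError("both contours must contain at least one voxel")
    common = gtv_a & gtv_b      # voxels inside both contours (assumed: common volume)
    composite = gtv_a | gtv_b   # voxels inside either contour (assumed: composite volume)
    return {
        "composite_cm3": len(composite) * voxel_volume_cm3,
        "common_cm3": len(common) * voxel_volume_cm3,
        "of_a": len(common) / len(gtv_a),  # overlap relative to the first contour
        "of_b": len(common) / len(gtv_b),  # overlap relative to the second contour
    }


# Toy one-dimensional "contours": 10 CT voxels and 8 PET/CT voxels, 6 shared.
gtv_ct = {(x, 0, 0) for x in range(0, 10)}
gtv_pet_ct = {(x, 0, 0) for x in range(4, 12)}
metrics = overlap_metrics(gtv_ct, gtv_pet_ct, voxel_volume_cm3=0.5)
```

With the first argument playing the role of GTVCT1 and the second of GTVPET/CT1, `of_a` corresponds to OFCT1 and `of_b` to OFPET/CT1; passing GTVPET/CT1 and GTVSBR instead gives OFSBR as `of_b`.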

Inter- and intra-observer variability was calculated using a two-way ANOVA model. Regression analysis was used to evaluate the difference between calculated volumes and the overlap between GTVs when using the different segmentation tools. Statistical analysis and curve fitting were performed using the PASW Statistics package, version 18.0 (IBM, Chicago, Illinois, USA). The level of statistical significance adopted was 0.05.
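The text does not state which reliability index was derived from the two-way ANOVA. As one possible illustration, the sketch below computes a two-way random-effects, single-rater intraclass correlation coefficient from the ANOVA mean squares, with patients as rows and observers as columns. The choice of this particular index, the function name and the toy ratings are assumptions and should not be read as the study's actual computation.

```python
from typing import Sequence


def icc_two_way_random(ratings: Sequence[Sequence[float]]) -> float:
    """Single-rater ICC from two-way ANOVA mean squares (rows = subjects, columns = raters)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into subject, rater and residual parts.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy ratings (hypothetical): 6 subjects scored by 4 raters.
toy_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_two_way_random(toy_ratings)
```

Because this index penalizes systematic differences between raters as well as random disagreement, it drops when one observer consistently draws larger volumes than the others, even if the ranking of patients is identical.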

Results

The contrast-enhanced CT scan as well as the 18F-FDG PET/CT were able to pinpoint the locally recurrent or residual cancer in the pelvis. The median SUVmax of the GTVs was 11.74 (range, 7.55-17.82). We did not observe any difference in PET signal between residual and recurrent tumors. Figure 1 presents the mean tumor volumes using the different manual and SBR delineation techniques. Error bars indicate the standard deviation (SD) of the mean. Wide variability of the GTVCT and GTVPET/CT was observed. The mean GTVCT1 (43.65 cm3, SD 4.84) was significantly larger than the mean GTVPET/CT1 (33.06 cm3, SD 5.24), p = 0.02. The smallest GTVCT1 and GTVPET/CT1 were found in patient #6 with 1.89 cm3 and 0.85 cm3, respectively. The largest GTVCT was found in patient #4 (120.39 cm3), while the largest GTVPET/CT was observed in patient #10 (101.93 cm3). Figure 2 presents an example of the GTVs contoured by each observer in each modality in a patient with a locally recurrent cervical cancer. The contouring of this case was hampered by the adjacent localization of the bladder and the rectum.

Figure 1 Comparison of mean tumor volumes using the different manual and SBR delineation techniques. Error bars indicate standard deviation (SD) on the mean. Results are shown for the gross tumor volume (GTV) delineated on CT (GTVCT1 and GTVCT2) and PET/CT-based GTVs obtained by manual delineation of contours (GTVPET/CT1 and GTVPET/CT2), and signal-to-background ratio (SBR)-based adaptive thresholding (GTVSBR).

Figure 2 18F-FDG PET with coregistered contrast-enhanced CT showed a local recurrence with an SUVmax of 16.16 in a 60-year-old patient 6 months after total hysterectomy, adnexectomy and pelvic lymphadenectomy for an endometrial cancer FIGO IIIA. (A) GTVCTs defined by four observers on axial, sagittal and coronal contrast-enhanced CT. (B) GTVPET/CTs defined by four observers on axial, sagittal and coronal 18F-FDG PET/CT. In panel B, inter-observer agreement was greater and the GTVPET/CTs were smaller than the GTVCTs.

Table 2 summarizes the comparative evaluation of the CT- and PET/CT-based GTVs. The mean composite volume was 46.15 cm3 (SD 5.42) and the mean common volume was 31.48 cm3 (SD 4.21). The mean OFCT1 was 0.63 (SD 0.04). The mean OFPET/CT1 was 0.90 (SD 0.03). In 2 patients, the GTVPET/CT of all observers was included entirely in the GTVCT and in 6 patients, GTVPET/CT added substantial tumor extension outside the GTVCT.

Table 2 Summary of the composite and common volumes of GTVCT1 and GTVPET/CT1 as well as overlap fractions (OFs) between the GTVCT1, GTVPET/CT1 and GTVSBR

Among the four experienced radiation oncologists, the ratio of the largest to the smallest GTV outlined for each of the 10 patients on the planning CT had a median of 1.87 (range, 1.21 to 3.27). When the 18F-FDG PET was included, this ratio was reduced to a median of 1.38 (range, 1.16 to 1.81). The ratio of largest to smallest GTV decreased in 9 of 10 patients when PET/CT was used for GTV delineation.
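The ratio reported here is simple arithmetic: for each patient, the largest of the four observers' volumes is divided by the smallest, and the per-patient ratios are then summarized by their median and range. A minimal sketch follows; the function name and the volumes are invented for illustration and are not data from this study.

```python
from statistics import median
from typing import Dict, List, Sequence


def largest_to_smallest_ratio(volumes_cm3: Sequence[float]) -> float:
    """Ratio of the largest to the smallest volume drawn for one patient."""
    return max(volumes_cm3) / min(volumes_cm3)


# Hypothetical GTVs (cm3) from four observers for three patients.
toy_volumes: Dict[str, List[float]] = {
    "patient_a": [10.0, 12.0, 15.0, 20.0],
    "patient_b": [30.0, 33.0, 36.0, 45.0],
    "patient_c": [5.0, 5.0, 6.0, 8.0],
}

ratios = {pid: largest_to_smallest_ratio(v) for pid, v in toy_volumes.items()}
median_ratio = median(ratios.values())
ratio_range = (min(ratios.values()), max(ratios.values()))
```

A ratio of 1 would mean all observers drew the same volume; the ratio says nothing about whether the contours coincide spatially, which is why the overlap analysis is reported separately.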

Evaluation of inter- and intra-observer variation

The median inter-observer reliability index for the GTVCT was 0.37 (range, 0.21-0.63) and for the GTVPET/CT was 0.48 (range, 0.32-0.71); p = 0.23. All physicians contoured each patient twice and the median intra-observer percentage of concordance for the GTVCT was 0.49 (range, 0.13-0.89) and for the GTVPET/CT was 0.65 (range, 0.30-0.92) (p = 0.18).

SBR-based auto-contour compared with manual delineation

The GTVs were delineated both manually and by editing the SBR-based auto-contour. The results concerning GTVSBR are shown in Table 1. The mean GTVSBR was 21.33 cm3 (SD 23.87), which is significantly smaller than the manually contoured GTVCT (p ≤ 0.005) and GTVPET/CT (p ≤ 0.005). In 6 patients the GTVSBR was included completely in all GTVCTs, and the mean OF between GTVSBR and GTVPET/CT was 0.97 (SD 0.02). Comparing the GTVSBR with the GTVPET/CTs, we observed that in 4 patients the GTVSBR was larger than the GTVPET/CT.

Discussion

CT and MRI have reasonable sensitivity but low specificity in identifying recurrent gynaecologic disease [1, 2]. Consequently, significant observer variation has been noted in contouring the GTVCT [7]. 18F-FDG PET/CT plays an increasingly important role in the staging and management of gynaecologic cancer, including RT treatment planning [3, 4]. 18F-FDG PET/CT has demonstrated a sensitivity and accuracy of more than 90%, with average specificity, in locally advanced or recurrent gynaecologic pelvic carcinoma. Furthermore, 18F-FDG PET/CT can help to distinguish between tumor recurrence and post-therapy changes [4, 8]. Kidd et al. have shown that cervical cancer patients treated with 18F-FDG PET/CT-guided IMRT had improved survival and decreased treatment-related toxicity compared with patients treated with non-IMRT radiotherapy [9].

This delineation study evaluated inter- and intra-observer variability of CT-based and 18F-FDG PET/CT-based target volume delineation in locally recurrent or postsurgical residual gynaecological cancer. The results were compared with an automated PET segmentation technique based on adaptive thresholding. In other cancer sites such as head and neck and lung, 18F-FDG PET/CT was reported to decrease inter- and intra-observer variability in tumor contouring [10]. Our results suggest that GTV delineation using 18F-FDG PET/CT could be superior to CT alone in this group of patients. GTVPET/CT was significantly smaller than the GTVCT, with a trend for reduced inter- and intra-observer variability using PET/CT. The inter-observer agreement was moderate for the GTVCT and substantial for the GTVPET/CT [11]. The inter-observer reliability was lower than the intra-observer reliability. This is in agreement with observations made by other authors [12]: observers tend to agree more with themselves than with each other. Inter- and intra-observer variability has been mostly investigated in lung cancer, and the increased observer reliability on 18F-FDG PET/CT in our study is in line with these findings [10]. Only one study, by our group, evaluated the inter-observer variability in PET/CT-based target volume delineation in the pelvis [13]. A trend of reduced inter-observer variability was observed in the delineation of the intraprostatic recurrence lesion using 18F-choline PET/CT. Until now, inter- and intra-observer variability in PET-based GTV delineation has not been evaluated in gynaecologic cancer.

Our study demonstrates that the GTVPET/CT was significantly smaller than the GTVCT with the implementation of a coregistered 18F-FDG PET/CT. When the GTVSBR volumes were analyzed and compared with the manually delineated target volumes, the GTVSBR was significantly smaller than the median GTVCT and GTVPET/CT. This was also reflected in the overlap analysis, where the overlap fraction increased from OFCT1 to OFPET/CT1 and OFSBR. Overall, the comparison of GTVs delineated in primary and recurrent cancer did not result in any significant differences.

The strengths of our study include the use of contrast-enhanced CT scans for GTVCT and GTVPET/CT determination and the fact that the exams were performed on a dedicated PET/CT scanner for virtual simulation and fused with a recent MRI. Nevertheless, the inter- and intra-observer variability was relatively high with both imaging modalities, highlighting the difficulty of determining the target volumes in this group of patients. An automated segmentation of the target volume using the adaptive thresholding technique could eventually help to reduce inter- and intra-observer variability. One potential limitation of our study is that the observers were at liberty to adjust the window, level and contrast settings of the images. This could have increased the inter- and intra-observer variability. However, all observers were experienced in PET/CT-based target volume delineation and were helped by a nuclear medicine physician. Another drawback of this study is the lack of comparison of the PET/CT results with pathologic findings after surgery.

The delineation of target volumes and organs at risk is a critical step in high-precision RT treatment planning. Good image quality and a reliable delineation protocol are important for accurate target volume delineation. One of the challenges of PET/CT-guided target volume delineation is the accurate segmentation of noisy and low-resolution functional PET images. This is particularly true in recurrent or residual gynaecological cancer, where vascular and urinary activity hampers target volume delineation. The result is a relatively high inter- and intra-observer variability. Various PET image segmentation techniques for target volume delineation have been developed and evaluated to overcome this drawback [11]. Among them, manual contouring by visual examination is the most commonly used method. The determination of an appropriate window and level for viewing the PET images is highly operator-dependent and is subject to high variability between operators [12].

Improved concordance in target volume delineation using PET/CT implies greater accuracy and can help to determine a more appropriate treatment plan. In our study, the inter-observer variability remained relatively high. Variability negatively impacts the quality of treatments delivered to cancer patients. Alternatively, an automatically segmented target volume could be considered. There is consensus on the need for highly objective and automatic segmentation methods, and various groups have observed that semi- or fully-automated delineation techniques reduce inter-observer variability and improve reproducibility [10]. The adaptive thresholding technique is one of the most widely used segmentation techniques for target volume determination in the clinical setting. However, knowledge of the true target volume in relation to the GTVSBR in gynaecologic tumors is needed for validation purposes. PET-based target volume delineation in gynaecologic tumors is currently not recommended outside clinical studies. It has to be emphasized that patients with recurrent or postsurgical residual gynaecologic cancer are challenging cohorts for reliable target volume delineation, so high inter- and intra-observer variability is more likely. In the absence of more accurate information on the target volume position in gynaecologic cancer, a composite of GTVCT and GTVPET/CT can be recommended to optimize the GTV definition.

Conclusions

This delineation study showed that GTVPET/CT was significantly smaller than GTVCT. The reduction was larger when the adaptive thresholding-based semi-automated contouring algorithm was used. GTVPET/CT added substantial tumor extension outside the GTVCT in 60% of the patients. The addition of a coregistered 18F-FDG PET/CT reduced the inter- and intra-observer variation in the delineation of gynaecological cancer; however, the difference was not significant. Target volume delineation may be improved with the inclusion of 18F-FDG PET/CT.