Assessment of tumour size in PET/CT lung cancer studies: PET- and CT-based methods compared to pathology
- Cite this article as:
- Cheebsumon, P., Boellaard, R., de Ruysscher, D. et al. EJNMMI Res (2012) 2: 56. doi:10.1186/2191-219X-2-56
Positron emission tomography (PET) may be useful for defining the gross tumour volume for radiation treatment planning and for response monitoring of non-small cell lung cancer (NSCLC) patients. The purpose of this study was to compare tumour sizes obtained from CT- and various more commonly available PET-based tumour delineation methods to pathology findings.
Retrospective non-respiratory gated whole body [18F]-fluoro-2-deoxy-D-glucose PET/CT studies from 19 NSCLC patients were used. Several (semi-)automatic PET-based tumour delineation methods and manual CT-based delineation were used to assess the maximum tumour diameter.
The 50%, adaptive 41% threshold-based and contrast-oriented delineation methods showed good agreement with pathology after removing two outliers (R2=0.82). An absolute SUV threshold of 2.5 also showed good agreement with pathology after the removal of 5 outliers (R2=0.79), but showed a significant overestimation of the maximum diameter (19.8 mm, p<0.05). Adaptive 50%, relative threshold level and gradient-based methods did not show any outliers, provided only small, non-significant differences in maximum tumour diameter (<4.7 mm, p>0.10), and showed fair correlation (R2>0.62) with pathology. Although the adaptive 70% threshold-based method underestimated tumour size compared to pathology (36%), it provided the best precision (SD: 14%) together with good correlation (R2=0.81). Good correlation between CT delineation and pathology was observed (R2=0.77). However, CT delineation showed a significant overestimation compared with pathology (3.8 mm, p<0.05).
PET-based tumour delineation methods provided tumour sizes in agreement with pathology and may therefore be useful to define the (metabolically most) active part of the tumour for radiotherapy and response monitoring purposes.
Keywords: Tumour delineation; Tumour diameter; FDG PET; Non-small cell lung cancer
Positron emission tomography (PET) is a functional imaging modality that provides information about metabolism, physiology and molecular biology of tumour tissue. [18F]-fluoro-2-deoxy-D-glucose (FDG) is the most widely used radiotracer and provides information on glucose metabolism. There is an increasing interest in using FDG PET not only to determine FDG uptake but also to determine the location and extent of the metabolically active part of the tumour. PET is being explored as a tool for, e.g., the definition of the gross tumour volume (GTV), location of the metabolically active part of the tumour for radiation oncology or monitoring response during chemotherapy [1, 2, 3]. For radiation treatment planning, the GTV is defined mainly on computed tomography (CT), which provides anatomical image data. CT imaging, however, has low contrast for soft tissue, making it difficult to differentiate between tumour and normal tissue due to their similar electron density values. It is hypothesised that PET could improve accuracy of GTV definition for radiation treatment planning. Accurate GTV definition is vital in order to generate a highly conformal radiation dose distribution, thereby sparing surrounding normal tissue and allowing a higher radiation dose to the most active part of the tumour. In addition to improving the accuracy of GTV definition, PET can indicate areas within the tumour that are metabolically more active or malignant, which may be used to define areas that need an additional radiotherapy boost.
Another important application of FDG PET is its use for the assessment of (early) treatment response and/or as a prognostic factor. So far, these applications have been mainly explored by studying FDG uptake quantitatively by means of standardised uptake values (SUV). Nevertheless, other parameters, such as the metabolic volume or total lesion glycolysis (product of SUV and metabolic volume), may provide additional valuable information both for prognosis and for treatment response monitoring. Recently, metabolic tumour volume and maximum metabolic tumour diameter have been shown to be independent prognostic factors for oesophageal cancer, but only when accurate PET-based tumour delineation methods are used. Moreover, for response monitoring studies it is important to know whether a difference in SUV or metabolic tumour volume in successive scans represents a true response or methodology-related variability. Therefore, accurate and reproducible metabolic volume delineation is important.
Several (semi-)automatic PET-based tumour delineation methods have been studied previously [7, 8, 9, 10, 11, 12]. It is generally accepted that pathology is the gold standard and should be used for validation of tumour delineation methods. However, only a limited number of (lung) tumour studies [4, 13, 14, 15, 16, 17] have been performed that compare data obtained from (semi-)automatic PET-based tumour delineation methods with pathological data. Therefore, the primary purpose of this study was to compare measured tumour sizes obtained from a broad spectrum of (more commonly available) PET-based tumour delineation methods to pathology findings. In addition, CT-based tumour size assessments were compared with pathology.
Patients, PET/CT and pathology
Included in this study were 19 consecutive patients (8 females and 11 males; weight 75±14 kg, range 42–100 kg) with histologically proven non-small cell lung cancer, who had undergone a diagnostic whole body PET/CT scan and underwent surgical resection of their primary lung tumour in the period from December 2003 to December 2004. All patients gave written informed consent prior to inclusion and the study was approved by the Medical Ethics Review Board of the Maastricht University Medical Center.
FDG data were acquired using a whole-body PET/CT scanner (Biograph, Somatom Sensation 16; Siemens, Erlangen, Germany). FDG was administered as an intravenous bolus (365±62 MBq). For each patient, plasma glucose levels were measured (mean 5.7±2.1 mmol·L-1, range 4.1-12.0 mmol·L-1) and all patients fasted for at least 6 h before scanning. PET data were reconstructed using ordered subsets expectation maximisation with 4 iterations and 18 subsets, followed by 5 mm full width at half maximum (FWHM) Gaussian smoothing, resulting in an image resolution of about 6.5 mm FWHM. All PET images had a matrix size of 128×128×178, corresponding to a voxel size of 5.31×5.31×5.00 mm3. CT data had an image matrix size of 512×512×178, corresponding to a voxel size of 0.98×0.98×5.00 mm3. CT data were used to correct for tissue attenuation. More acquisition and reconstruction details can be found elsewhere.
All patients underwent surgical resection of their primary tumour after approximately 47 d (range: 7 to 112 d). Directly after surgical resection, the maximum diameter of this tumour was measured by macroscopic examination in three dimensions using a calliper. Shrinkage of the tumour, estimated to be around 10%, was not considered as no preservation, fixation or inflation was applied prior to the diameter measurements. The obtained maximal diameters ranged from 1.5 to 7.0 cm (mean: 4.0±1.8 cm). The primary tumours were located in the superior (n=8), middle (n=1) or inferior (n=10) lobe.
Fixed threshold of 50% and 70% of maximum voxel value (VOI50, VOI70).
Adaptive threshold range of 41-70% of maximum voxel value (VOIA41, VOIA50, VOIA70). Same as above, except that it adapts the threshold relative to the local average background.
Contrast-oriented method (VOISchaefer). This method was recalibrated for the image characteristics used.
Background-subtracted relative-threshold level (RTL) method (VOIRTL). This is an iterative method based on a convolution of the point-spread function that takes into account the differences between various sphere sizes and the scanner resolution.
Gradient-based watershed segmentation method. This method uses two steps before calculating the VOI. First, it calculates a gradient image on which a ‘seed’ is placed in the tumour and another one in the background. Next, a watershed (WT) algorithm is used to grow the seeds in the gradient basins, thereby creating boundaries on the gradient edges. In our study, two different types of gradient basins were used. The first approach, indicated by GradWT1, assigns all voxels on the edge between the tumour and the background to the tumour [10, 11, 12]. The second approach, indicated by GradWT2, uses an upsampled image to reduce sampling effects. In addition, voxels that indicate an edge between the tumour and the background are assigned to either the tumour or the background, depending on which region has a value closest to the edge voxel value.
Absolute SUV (SUV2.5). Normalised (SUV) voxel intensities at a chosen absolute threshold (2.5) are used to delineate the tumour.
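The threshold-based methods in the list above can be illustrated with a minimal sketch. It is not the implementation used in the study: the exact adaptive formula is not given here, so the form `background + fraction * (max - background)` is an assumption, and the 1D uptake profile, the background value and all function names are hypothetical.

```python
def fixed_threshold(max_value, fraction):
    """Threshold as a fixed fraction of the maximum voxel value (e.g. 0.5 for VOI50)."""
    return fraction * max_value


def adaptive_threshold(max_value, background, fraction):
    """Assumed adaptive form: the fraction applies to the signal above local background."""
    return background + fraction * (max_value - background)


def delineate(voxels, threshold):
    """Return the set of voxel indices whose value is at or above the threshold."""
    return {idx for idx, value in voxels.items() if value >= threshold}


# Hypothetical 1D SUV profile through a lesion; keys are voxel positions.
profile = {0: 1.0, 1: 1.2, 2: 3.0, 3: 6.0, 4: 10.0, 5: 7.0, 6: 3.5, 7: 1.1}
peak = max(profile.values())
background = 1.0  # assumed local average background

voi50 = delineate(profile, fixed_threshold(peak, 0.50))
voia41 = delineate(profile, adaptive_threshold(peak, background, 0.41))
voia70 = delineate(profile, adaptive_threshold(peak, background, 0.70))
suv25 = delineate(profile, 2.5)  # absolute SUV threshold of 2.5

print(sorted(voi50), sorted(voia41), sorted(voia70), sorted(suv25))
```

In this toy profile the absolute SUV threshold includes the most voxels and the adaptive 70% threshold the fewest, which mirrors the direction of the over- and underestimation reported for SUV2.5 and VOIA70.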
More details on the methods used can be found in [10, 11, 12]. In addition to PET-based tumour delineation, the tumour was also delineated manually on the CT image by an expert physician. Window and level settings were varied at the discretion of the expert physician. The volume delineated on CT covered the whole primary tumour.
For the first analysis, the measured maximum tumour diameters were compared with the maximum diameter obtained from pathology. The maximum diameter was obtained from the derived tumour volume by measuring diameters in all possible directions, possibly including spans of regions inside the tumour that are e.g. necrotic or cystic. For each tumour delineation method, mean, median, minimum and maximum values of maximum diameter of the primary tumours were reported. In addition, for each delineation method, correlation of maximum diameter with corresponding pathology data was determined.
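One way to obtain a maximum diameter from a delineated volume is to take the largest distance between any two voxels in the VOI. The sketch below assumes centre-to-centre distances and a brute-force search over all voxel pairs; the study's actual procedure is not specified at this level of detail. The VOI is hypothetical, and only the PET voxel size comes from the text.

```python
import math
from itertools import combinations


def max_diameter_mm(voxel_indices, voxel_size_mm):
    """Largest centre-to-centre distance (mm) between any two voxels in the VOI."""
    centres = [
        tuple(i * s for i, s in zip(idx, voxel_size_mm)) for idx in voxel_indices
    ]
    if len(centres) < 2:
        return 0.0
    return max(math.dist(a, b) for a, b in combinations(centres, 2))


pet_voxel_mm = (5.31, 5.31, 5.00)  # PET voxel size reported in the text
# Hypothetical VOI given as (x, y, z) voxel indices.
voi = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 1, 1)}

diameter = max_diameter_mm(voi, pet_voxel_mm)
print(f"maximum diameter: {diameter:.1f} mm")
```

Because only the two end points matter, the measured span may cross voxels that are not part of the VOI, such as a necrotic or cystic centre, as described above. The pairwise search is quadratic in the number of voxels, so a large VOI would in practice be reduced to its surface voxels first.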
As suggested by Wanet et al., we also performed a second analysis that uses a logarithmic transformation of the data to reduce the magnitude of both skewness and kurtosis of the volume distributions, so that a nearly Gaussian distribution of the data was obtained. For this analysis, the difference in logarithmic transformed data was calculated as $\log(\text{diameter}_{\text{VOI}}) - \log(\text{diameter}_{\text{pathology}})$, where $\text{diameter}_{\text{VOI}}$ is the maximum tumour diameter of either manual CT delineation or various PET-based tumour volume delineation methods and $\text{diameter}_{\text{pathology}}$ is the maximal pathological diameter. Correspondence in maximal diameter for each tumour delineation method was also evaluated by Bland-Altman plot analysis. P-values were calculated using the two-tailed Wilcoxon signed-rank test between maximal diameters obtained from PET- or CT-based delineation and pathology. P-values lower than 0.05 were considered statistically significant. Outliers were identified visually as VOIs that showed an unrealistically large measured tumour volume.
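The log-transformed differences and the Bland-Altman summary described above can be sketched as follows. The diameters are made-up illustrative values, not study data. The natural logarithm and the conventional 1.96 SD limits of agreement are assumptions, and the Wilcoxon signed-rank test is left out to keep the sketch within the standard library.

```python
import math
import statistics


def log_differences(voi_diameters, pathology_diameters):
    """Per-lesion difference of log-transformed diameters, i.e. log(VOI / pathology)."""
    return [math.log(v / p) for v, p in zip(voi_diameters, pathology_diameters)]


def bland_altman(measured, reference):
    """Return bias and lower/upper limits of agreement (bias -/+ 1.96 SD)."""
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical maximum diameters in mm.
pathology = [15.0, 30.0, 40.0, 55.0, 70.0]
pet_voi = [18.0, 28.0, 43.0, 52.0, 75.0]

log_diff = log_differences(pet_voi, pathology)
bias, lower, upper = bland_altman(pet_voi, pathology)

print(f"mean log difference: {statistics.mean(log_diff):+.3f}")
print(f"bias: {bias:+.1f} mm, limits of agreement: {lower:+.1f} to {upper:+.1f} mm")
```

Working with log differences makes over- and underestimation symmetric around zero, which is the motivation given above for the transformation.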
During analyses VOI50, VOIA41, VOISchaefer and GradWT1 showed two outliers for which the generated VOI resulted in unrealistically large volumes, as assessed visually. In addition, SUV2.5 showed five outliers that had an unrealistically large measured volume. Therefore, all means and correlation analyses were corrected for these outliers.
Table 1 Mean, median, minimum and maximum values of maximum tumour diameter (mm) as obtained with different methods
Table 2 Linear regression data (R2) of maximum diameter obtained using several delineation methods and pathology
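The slope and R2 values quoted in this article come from linear regression of delineated against pathological maximum diameters. A minimal sketch of such a fit is given below. It assumes ordinary least squares with an intercept, which is not stated in the text, and uses hypothetical diameters.

```python
import statistics


def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns slope, intercept, R2."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical maximum diameters in mm (pathology on x, delineation on y).
pathology = [15.0, 30.0, 40.0, 55.0, 70.0]
delineated = [18.0, 28.0, 43.0, 52.0, 75.0]

slope, intercept, r2 = linear_fit(pathology, delineated)
print(f"slope: {slope:.2f}, intercept: {intercept:.1f} mm, R2: {r2:.2f}")
```

A slope above 1 indicates that the overestimation grows with tumour size, while R2 only describes how tightly the points follow the fitted line. A method can therefore correlate well with pathology and still be systematically biased.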
Obtaining accurate (metabolic) tumour boundaries may be important for treatment planning in radiotherapy and/or for use as prognostic factor and/or to monitor response during therapy, and may therefore have a direct impact on clinical outcome.
Tumour delineation methods are only suited for radiotherapy planning purposes if they correspond well with pathology. The present results indicate that VOI50, VOIA41 and VOISchaefer show good agreement with pathology after removing two outliers (R2: 0.82, slope: 1.00-1.06, Figure 1). These outliers were located close to high uptake regions, e.g. mediastinum and heart. Only tumour delineation methods that are able to distinguish between these adjacent normal tissues and the tumour should be selected for radiotherapy planning purposes. VOIA50, VOIRTL and GradWT2 did not show any outliers, provided only small, non-significant differences in maximum tumour diameter (<4.7 mm, p>0.10), and showed fair correlation (R2>0.62, slope 0.88-0.97) with pathology (Tables 1 and 2). These results correspond with those of previous studies [4, 13], in which small differences in maximum tumour diameter between an adaptive threshold method and pathology were reported, as well as good correlation between maximum diameters obtained from a percentage threshold-based method and pathology. The latter study also showed a reduction in inter-observer variability when PET-based (semi-)automatic tumour delineation methods were used. GradWT1 showed only moderate agreement with pathology (R2: 0.43) and overestimated the maximum diameter. In addition, poor agreement between GradWT1 and pathology was found for small tumours (≤2.5 cm diameter). As the sizes of the small tumours are less than three times the full-width-at-half-maximum, the influence of partial volume effects may be relatively high, causing this poor agreement with pathology. After modifications to the algorithm (i.e. GradWT2), the method provided more accurate tumour sizes (R2: 0.62). These findings are in line with a previous study that compared volumes derived with a gradient-based method to pathology.
SUV2.5 showed a good agreement with pathology after the removal of 5 outliers (R2: 0.79), but showed a significant overestimation in the maximum diameter (19.8 mm, p<0.001). Due to the large number of outliers and large overestimation in diameter size, it is not recommended to use this method for radiotherapy planning purposes.
Assessment of change in tumour size is important when monitoring tumour response during therapy. Although VOIA70 showed a large systematic underestimation (−36%) of diameter, this method was the most precise, with the smallest SD and coefficient of variation (COV) of 14 and 40%, respectively (Figure 1). For all other methods, SD and COV ranged from 19 to 188% and from 77 to 448%, respectively. In addition, good correlation (R2=0.81) between maximum diameters obtained from this method and pathology was observed (Table 2). Therefore, VOIA70 may be a good method for response monitoring in which relative changes in (the more active part of) the metabolic volume are considered.
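The definition of COV used in this study is not given in the text above. Assuming the usual definition, SD divided by the absolute mean, the rounded VOIA70 figures quoted (mean difference of −36%, SD of 14%) can be checked with a short calculation. The function name is hypothetical.

```python
def coefficient_of_variation(sd, mean):
    """COV in percent, assuming the usual SD / |mean| definition (not confirmed by the text)."""
    return 100.0 * sd / abs(mean)


# Rounded values quoted for VOIA70: SD of 14% and mean difference of -36%.
cov_voia70 = coefficient_of_variation(14.0, -36.0)
print(f"COV: {cov_voia70:.1f}%")
```

Under this assumption the result is close to the reported 40%. The remaining gap may reflect rounding of the quoted inputs or a different definition in the original analysis.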
Wu et al. showed that CT-based delineation provided better correlation (R2=0.87) with pathology than PET-based percentage threshold methods that did not correct for background activity (R2=0.77). This is in contrast to the present study, where CT-based delineation provided a slightly lower correlation of maximum diameter with pathology than VOI50 (R2: 0.77 and 0.82, respectively). In addition, CT-based delineation showed a moderate overestimation of maximum diameter compared with pathology (slope: 1.25 and 1.00 for CT and VOI50, respectively). Despite the good correlation, a drawback of manual CT delineation is the requirement of both a high resolution image (i.e. not a low dose CT) and an experienced observer. Even if delineation is performed by an experienced observer, manual CT delineation suffers from substantial interobserver variation. In addition, accuracy of manual CT delineation was shown to be dependent on the colour window settings used [15, 16]. Moreover, several conditions including chronic obstructive lung disease (COPD), cavitation, pleural fluid, necrosis, atelectasis and mucus plugs, which all occur frequently in lung cancer patients, obscure the exact boundaries of the tumour on CT, inducing errors in measured tumour volume and/or diameter size. In the present study, CT delineation was performed by only one expert physician. Although this may weaken the strength of any correlation with pathology, the results of this study were consistent with the results from another study where manual CT delineation was performed by two experienced observers.
For the patient shown in Figure 3, large differences in diameter were observed between CT- and PET-based methods. In this case, the primary tumour was located close to another suspicious mass within the lymph node. In addition, the primary tumour showed heterogeneous FDG uptake, and both air-containing cavitation and fluid level on the CT scan. Note that, for this typical example, the measured tumour volume obtained using PET-based delineation methods excluded the (non-metabolic) necrotic and cystic centre that was included in manual CT delineation. However, this necrotic and cystic centre was still included in all maximum diameter calculations. For this tumour, the maximum diameter obtained from PET-based methods was closer to that of pathology than the diameter from the corresponding CT delineation, further illustrating the conceptual differences between anatomical (CT) and metabolic (PET) volumes. As previously suggested, CT-based delineation is unable to differentiate between high and low activity regions, whereas PET can assist in quantifying and visualising heterogeneous tracer uptake across the tumour. PET-based delineation may therefore be useful to define the most active part of the tumour.
Some factors might limit accurate delineation of tumour volumes and corresponding maximum diameters using the commonly available PET-based (semi-)automatic tumour delineation methods described in this article. First, primary lesions could be surrounded by high uptake regions, e.g. from suspected locoregional metastases, heart and spine. Therefore, application of tumour delineation methods might be more valid for peripheral tumours and less valid for more centrally located tumours, unless a (manually adjustable) bounding box around the tumour is used to prevent delineation of surrounding high uptake regions. Second, the metabolic volume of tumours could show heterogeneous tracer uptake, which has been shown to have an impact on threshold-based delineation methods. Third, tumours located in the thorax could be affected by respiratory motion. However, a good correlation between pathology and PET data is observed in the present study, which might indicate that lung tumours were not strongly affected by these effects (at least not in this study). Nevertheless, a slight mismatch between PET and CT data can be observed in Figure 3. Fourth, it should be noted that trends observed in the present study may only be valid for primary lung tumours. For other locations, the local background surrounding the tumour is different, which could have an effect on the performance of the tumour volume delineation methods evaluated. Finally, tumour delineation methods are affected by several factors, such as scanner type, radiotracer, image noise and tumour characteristics [10, 11]. Additional evaluations with pathology and/or optimisation of systems or tumour delineation methods may therefore be required for other PET/CT systems.
The present study has some potential methodological limitations that might have influenced the results. First, it should be noted that deformations could occur between in-vivo CT imaging and ex-vivo pathology due to the softness of lung tissue. The method used in the present study involved neither inflation of the tissue after resection nor other deformation compensation techniques. All tumours were measured directly after surgery, without using preservation. Inflation is required to find the exact position of the lung tumour inside the lung. However, inflation is expected to influence mostly the surrounding lung tissue, as the tumours imaged in this study showed a relatively solid mass. The purpose of the current study was not to determine the exact position, but to measure the maximum diameter of the tumour. In addition, the results of the present study are in line with Siedschlag et al., where inflation was used. Therefore, deformations of the tumour after resection are presumed to be negligible. Ideally, however, a CT scan of the excised tumour should have been made to confirm that no deformations occurred. Second, no pathological data on the volume of the primary tumour were available. Therefore, only a comparison with maximum tumour diameter was made rather than with volume. Finally, it should be noted that in this study pathological correlation is available only for resectable lung tumours. The majority of patients who receive radiotherapy, however, suffer from unresectable lung tumours, for which accurate tumour volume delineation is critical for treatment. Obtaining the true volumes for this kind of tumour remains a challenge yet to be solved.
The maximum diameter derived from CT-based delineation was overestimated compared to pathology, especially at large tumour diameters. PET-based tumour delineation methods provided maximum diameter sizes in closer agreement with pathology. The PET-based 50%, adaptive 41% threshold-based and contrast-oriented (Schaefer) methods seem to be best suited for assessing tumour sizes (of the metabolically most active part) of primary lung tumours, as they provide the best correspondence with pathology data. However, these methods could have difficulty when the tumour is located close to high uptake regions. PET-based adaptive 50%, relative threshold level and gradient-based methods showed only a small, non-significant underestimation compared to pathology data and could distinguish between the tumour and these adjacent high uptake normal tissues, and are therefore recommended for radiotherapy purposes. An adaptive 70% threshold-based method may be best suited for response monitoring, as it provides the best precision and best correlation with pathology derived size without suffering from outliers.
This study was performed within the framework of the Center for Translational Molecular Medicine (CTMM), AIRFORCE project (grant 03O-103). Patsuree Cheebsumon was supported by a scholarship from the National Science and Technology Development Agency of the Royal Thai Government.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.