Development and validation of the PET-CT score for diagnosis of malignant pleural effusion

Purpose Although some parameters of positron emission tomography with 18F-fluorodeoxyglucose (18F-FDG) and computed tomography (PET-CT) are somehow helpful in differentiating malignant pleural effusion (MPE) from benign effusions, no individual parameter offers sufficient evidence for its implementation in the clinical practice. The aim of this study was to establish the diagnostic accuracy of a scoring system based on PET-CT (the PET-CT score) in diagnosing MPE. Methods One prospective derivation cohort of patients with pleural effusions (84 malignant and 115 benign) was used to develop the PET-CT score for the differential diagnosis of malignant pleural effusion. The PET-CT score was then validated in another independent prospective cohort (n = 74). Results The PET-CT parameters developed for discriminating MPE included unilateral lung nodules and/or masses with increased 18F-FDG uptake (3 points); extrapulmonary malignancies (3 points); pleural thickening with increased 18F-FDG uptake (2 points); multiple nodules or masses (uni- or bilateral lungs) with increased 18F-FDG uptake (1 point); and increased pleural effusion 18F-FDG uptake (1 point). With a cut-off value of 4 points in the derivation cohort, the area under the curve, sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio of the PET-CT score to diagnose MPE were 0.949 (95% CI: 0.908–0.975), 83.3% (73.6%–90.6%), 92.2% (85.7%–96.4%), 10.7 (5.6–20.1), and 0.2 (0.1–0.3), respectively. Conclusions A simple-to-use PET-CT score that uses PET-CT parameters was developed and validated. The PET-CT score can help physicians to differentiate MPE from benign pleural effusions. Electronic supplementary material The online version of this article (10.1007/s00259-019-04287-7) contains supplementary material, which is available to authorized users.


Introduction
Malignant pleural effusion (MPE) is frequently observed in multiple malignancies, with lung cancer being the most frequent underlying malignancy [1,2]. As the prognosis for patients with MPE is poor [3,4], an efficacious procedure that can establish a definite diagnosis as early as possible with a minimum risk and discomfort is highly desirable. Series examining the diagnostic rate for malignancy of pleural cytology has reported a mean sensitivity of about 60% (range 40-87%), which has highlighted the challenge of MPE diagnosis [5,6]. It is important to avoid subjecting frail patients to unnecessarily invasive procedures and to select just those who may benefit the most from such interventions.
Positron emission tomography (PET) with 1 8 Ffluorodeoxyglucose ( 18 F-FDG) was reported for the first time to be an effective tool in the evaluation of pleural diseases in 1997 by Bury et al. [7]. The subsequent four prospective studies showed that 18 F-FDG PET and computed tomography (CT) can be a useful method for the differential diagnosis of MPE [8][9][10][11]. However, these four prospective studies recruited small numbers of patients with pleural effusions, and the evidence supporting the discriminatory role of 18 F-FDG PET-CT was not validated in an independent population. One recent meta-analysis suggested that no individual 18 F-FDG PET imaging measurement appears to predict the probability of MPE enough to be recommended in the routine work-up for effusions of undetermined cause [12]. We undertook the present prospective study to develop a scoring system based on 18 F-FDG PET-CT findings (the PET-CT score) for clinicians to discriminate MPE from benign effusion. Following STARD guidelines, this study was conducted in accordance with the Declaration of Helsinki and approved by the institutional ethics committee of Beijing Chaoyang Hospital, Beijing, China (ID 2014-ke-98). Patients provided written informed consent.

Diagnostic criteria
The diagnosis of MPE was established if malignant cells were detected upon cytological examination of the pleural fluid or biopsy specimens that were obtained during the same admission with the PET-CT examination. Tuberculous pleural effusion was diagnosed when Ziehl-Neelsen stains or Lowenstein-Jensen cultures of pleural fluid, sputum, or pleural biopsy specimens were positive or when granulomas were found in the parietal pleural biopsies. A parapneumonic effusion was defined as any effusion associated with bacterial pneumonia, lung abscess, bronchiectasis, or empyema, when reported with the presence of pus within the pleural space. Other causes of pleural effusions followed well-established clinical criteria. The patients were followed up for at least 12 months to ensure the absence of malignant pleural processes if they were diagnosed to have benign effusion [13].

PET-CT imaging
The integrated 18 F-FDG PET-CT study was performed on a GE Discovery STE device using a standard protocol before invasive procedures were performed. All patients fasted for at least 6 h and had a blood glucose level of <200 mg/dL before 18 F-FDG administration. Whole body PET-CT scans were acquired 55-73 min (mean 63.2 ± 7.3 min) after intravenous injection of 3.7 MBq/kg of 18 F-FDG. A body scan from the skull base to the upper thighs was firstly performed and then was followed by a head scan. CT parameters for body scan were: 140 kV, 120 mA, and slice thickness of 3.75 mm. CT parameters for head scan were: 120 kV, 200 mA, and slice thickness of 3.75 mm. PET parameters were: 2.5 min/bed for body scan and 5 min/bed for head scan in 3-dimension mode. Attenuation-corrected PET images (voxel size: 3.9 × 3.9 × 3.3-mm for both body and head scan) were reconstructed using a 3-dimensional ordered-subset expectation maximization algorithm (14 subsets and 2 iterations for body scan, and 28 subsets and 2 iterations for head scan). Integrated PET and CT images were obtained automatically on AW VolumeShare2 (GE Healthcare).
One radiologist (TJ) and one nuclear physician (MFY) evaluated the PET-CT images together and a final consensus was obtained on all imaging findings. Both observers were blinded to the final diagnosis of pleural effusion. Pleural 18 F-FDG uptake was calculated by manually drawing regions of interest (ROIs) on PET and CT registered images slice by slice, and the maximum standardized uptake value (SUVmax) was selected to represent 18 F-FDG uptake. 18 F-FDG uptake of pleural fluid was evaluated on the registered slice with the deepest fluid. A circular ROI of 5 mm diameter was placed 5 mm to the parietal pleura, and the SUVmax of pleural fluid was recorded. To obtain a background value of 18 F-FDG uptake, mediastinal uptake was measured by placing an ROI on the superior vena cava and the SUVmean was recorded. All SUVs were normalized to body weight. Then, a target-tobackground ratio (TBR) was determined by calculating the ratio of the SUVmax of the pleura or pleural fluid, and the SUVmean of the mediastinum. Lung nodules and/or masses (diameter ≥ 8 mm) were identified, and the SUVmax was measured. Lymph nodes with high uptake (higher than that in the surrounding normal soft tissues) but without calcification and without attenuation higher than 70 HU were regarded as positive. Other organs were classified as positive when there was focal 18 F-FDG uptake, compared with the surrounding normal organ (tissue) or 18 F-FDG uptake that could not be explained by physiologic activity.
Continuous data of PET-CT features (pleural 18 F-FDG uptake, pleural effusion 18 F-FDG uptake, and pleural effusion depth) were first transferred to categorical data by means of drawing the receiver-operating-characteristic (ROC) curves and calculating the cut-off values. Pleural effusion depth was measured in centimeters for maximum anteroposterior depth of the effusion on chest CT scans. The cut-off value of the SUVmax of lung nodule was set as 2.5.

Statistical analysis
Categorical and continuous data were expressed as numbers (percentages) and means ± SD, respectively. Between-group comparisons were performed with X 2 , Fisher's exact, and Student's t tests, as appropriate. We calculated differences between groups by using ANOVA and univariate analysis, entering only variables with a p value < 0.01 into the multiple regression models. A logistic regression analysis with backward conditional method served to select those imaging variables entering the scoring system. Weight values to each variable were assigned proportionally to the magnitude of the logistic equation's coefficients. ROC curves were drawn and the areas under the curve (AUCs) were calculated to determine the diagnostic value of the PET-CT measurement or score, including sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio [14,15]; and AUCs were compared using z-statistic with the Hanley and McNeil procedure [16]. The optimum cut-off values were defined based on their maximum Youden index (sensitivity + specificity − 1). The parameters of diagnostic accuracy are reported together with their 95% confidence intervals (CIs). To verify the diagnostic accuracy of the PET-CT score in the validation cohort, cases with the PET-CT score above the cut-off value obtained in the derivation cohort were considered as positive results. Interobserver agreement about the PET-CT parameters that make up the PET-CT score was evaluated using κ statistics. The statistical significance level was set at 0.05 (two-tailed). All analyses were conducted with MedCalc and SPSS version 23.0 statistical software.

Study populations
As shown in Fig. 1, 41 patients in the derivation cohort and 39 in the validation cohort were excluded for the following reasons: (1) suspected benign effusion but follow-up <12 months; (2) with malignant primary disease but no definite diagnosis of the plerual effusion or pleura during the same admission with the PET-CT examination; and (3) no confirmed primary disease and no definite diagnosis of the effusion with a followup of 12 months. The characteristics and diagnostic work-up of the excluded patients were presented in Supplemental Table 1. Eventually, a total of 199 patients were included in the derivation cohort and 74 patients in the validation cohort, and their baseline characteristics are shown in Table 1. There were 113 males and 86 females in the derivation cohort, and the average age of the patients was 60.3 ± 16.4 years (range, 21-88 years). In the validation cohort, 43 patients were males and 31 females, and the average age was 61.7 ± 15.3 years (range, 18-90 years).
Patients in the derivation cohort were similar to those in the validation cohort in terms of the distribution of age, sex, and etiology of MPE. Differences in the etiological distribution of benign pleural effusions were found between the two cohorts (X 2 = 23.56, p < 0.001, Table 1). A total of 42.2% of the patients in the derivation cohort and 52.7% of the patients in the validation cohort were confirmed to have MPE; 57.8% of the patients in the derivation cohort and 47.3% of the patients in the validation cohort suffered from benign pleural effusion ( Table 1). The most frequent cause of MPE is lung cancer, followed by mesothelioma in both cohorts. The percentage of tuberculous pleural effusion in the derivation cohort was significantly higher than that in the validation cohort (49.6% vs. 20.0%, p = 0.002).
Diagnostic performance of the PET-CT score in the derivation cohort First, we explored the diagnostic accuracy of each parameter that comprises the PET-CT score in discriminating MPE from benign effusion (Table 4). Although none of these parameters were individually satisfactory for diagnostic purposes in a clinical setting there were meaningful results when examined as a group. At the best cut-off of 4 points, the PET-CT score yielded 83.3% sensitivity (95% CI: 73.6-90.6%), 92.2% specificity (85.7-96.4%), 10.7 positive likelihood ratio (5.6-20.1), 0.2 negative likelihood ratio (0.1-0.3), and AUC 0.949 (0.908-0.975) ( Table 5 and Fig. 2a and b), indicating that the PET-CT score provides acceptable differential diagnostic accuracy for patients with MPE and performs significantly better than any single PET-CT parameter.

Validation of the PET-CT score in the validation cohort
We performed another blinded validation study in an independent population (39 MPE and 35 benign effusions) to validate the diagnostic accuracy of a PET-CT score ≥ 4 in discriminating MPE from benign effusion. Figure 3 shows that the

Discussion
This is the largest prospective study that has investigated the diagnostic accuracy of PET-CT for patients with MPE. Based on a derivation study and a validation study, we propose a simple and feasible PET-CT scoring system with high reliability that can accurately discriminate MPE from benign effusion, as reflected by an AUC of 0.949. Our data show that from a maximum sum of 10 points of the PET-CT score, a score of ≥4 would prompt consideration of an MPE, whereas a score of <4 would mitigate against such a consideration. Since Bury et al. first reported the application of 18 F-FDG PET in diagnosing MPE in 1997 [7], the diagnostic accuracy of PET or PET-CT has been extensively studied; however, its exact clinical significance remains controversial. PET is based on the differential metabolism of normal and abnormal tissues, and the uptake of 18 F-FDG is usually accelerated in tumor cells.  Because some pleural inflammatory and infectious lesions can also induce increased 18 F-FDG uptake, the underlying diseases may display false-positive findings, leading to a low diagnostic accuracy for PET-CT. To improve diagnostic ability, various methods of interpreting PET-CT, including the use of SUV threshold, ratio of SUVof pleural lesions to that of mediastinum, and dual-time-point PET have been proposed [9,10,17,18]. However, the concern about false-negative results still remains. The results from a recent meta-analysis suggest that the pooled sensitivity and specificity are 81% and 74%, respectively [12]; thus, the diagnostic performance of PET-CT for MPE is not good enough to be recommended in the routine workup of pleural effusions of undetermined causes [5]. Although our findings support previous works indicating that some PET-CT parameters are somehow helpful in the diagnosis of MPE, any single parameter does not offer sufficient evidence for its implementation in the clinical practice. Our present study provides strong evidence to support that the PET-CT score is a valuable diagnostic tool in discriminating MPE from benign effusion, with a sensitivity and specificity of 83.3% and 92.2%, respectively, from the derivation cohort. A positive likelihood ratio of 10.7 with the PET-CT score suggests that patients with MPE have about 11-fold higher chance of having a PET-CT score ≥ 4 compared to those without the disease, and this is high enough for diagnostic purposes. Moreover, such good diagnostic performance found in the derivation cohort was verified prospectively in the validation cohort. The PET-CT score system developed by this study comprehensively integrated PET and CT information, and might be better than high end CT alone [19].
The five variables that comprise the PET-CT score are readily available on PET-CT scans and have highly significant associations with the diagnosis of MPE from multivariable analysis. 18 F-FDG PET-CT has been widely accepted for the diagnosis of lung cancer in patients with suspicious lung nodules/masses [20,21]. Most patients with MPE in this study have lung cancer (77.4% in the derivation cohort), and these patients usually show unilateral lung nodules and/or masses with increased 18 F-FDG uptake. In contrast, most patients with benign effusion suffered from tuberculosis and parapneumonic effusion/empyema (92.2%) without pulmonary nodules and masses. It is reasonable to infer that the presence of unilateral lung nodules and/or masses with increased 18 F-FDG uptake is the first single variable that makes up the PET-CT score with 3 points.
Compared with other imaging examinations, one distinct advantage of PET-CT is that it may simultaneously detect malignancies occurring in any organs. If PET-CT suggests  AUC area under the receiver operating characteristic curve, CI confidence interval, 18 F-FDG 18 F-fluorodeoxyglucose, NLR negative likelihood ratio, PLR positive likelihood ratio, TBR target-to-background ratio CI confidence interval, NLR negative likelihood ratio, PLR positive likelihood ratio primary or metastatic tumors outside the lung, then the possibility of MPE is greatly increased. Therefore, the finding of extrapulmonary malignancies indicates the malignant nature of pleural effusion and can also be offered 3 points. The majority (84.5%) of our patients with MPE had pleural thickening coupled with increased 18 F-FDG uptake; whereas 73.0% of our patients with benign effusion also exhibited pleural thickening. However, only 35.7% of these patients had increased 18 F-FDG uptake in the thickened pleura. Pleural thickening (≥3 mm) with increased 18 F-FDG uptake enters the PET-CT score with 2 points.
Multiple pulmonary nodules or masses with increased 18 F-FDG uptake are frequently seen in lung cancer with intrapulmonary metastasis, and sometimes can also be observed in benign diseases, such as pulmonary tuberculosis. Both malignant cells and inflammatory cells are associated with increased 18 F-FDG uptake, and in general the former is higher than the latter [22,23]. In the present study, multiple nodules or masses (uni-or bilateral lungs) with increased 18 F-FDG uptake or increased pleural effusion 1 8 F-FDG uptake are meaningful in differentiating MPE from benign effusion, although the coefficient is not very high (1 point each).
The most important strength of this study is its large sample size and the prospective nature of the consecutive cases in the derivation and validation cohorts. Another strength is that we conducted a comprehensive follow-up to ascertain definite etiological diagnosis of pleural effusion. However, our study had several limitations. First, in the derivation cohort, all consecutive patients with pleural effusions admitted to the Department of Respiratory and Critical Care Medicine of our hospital were recruited in the present study. All patients were persuaded to undergo PET-CT, and only those with a definite cause of pleural effusion were included in the final statistical analysis. It should be noted that some patients with a relatively easy clinical diagnosis, including 46 patients with parapneumonic effusion and three empyema, underwent PET-CT. Therefore, there might be a selection bias in the present study.
Second, we developed the PET-CT score in the high tuberculosis prevalence setting [24,25]. It has been reported that patients with tuberculosis may show intense 18 F-FDG uptake mimicking malignant mesothelioma [26]. A meta-analysis [27] showed that the accuracy of 18 F-FDG PET in diagnosing lung nodules is extremely heterogeneous, and that the use of PET is less specific in diagnosing malignancy in populations with endemic infectious lung disease compared to nonendemic regions. In the present study, we separately explored the discriminative properties of the PET-CT score in identifying MPE from tuberculous and from non-tuberculosis effusions, and found the diagnostic accuracy of the PET-CT score is almost robust. Although 70.2% tuberculous patients in this study showed intense pleural uptake, 20 (35.1%) of Fig. 3 Diagnostic accuracy of the PET-CT score for the diagnosis of patients with malignant pleural effusions (MPEs) in the validation cohort. In Panel a, the use of a cut-off value of 4 points of the PET-CT score shows high sensitivity and specificity for the differential diagnosis of MPE (n = 39) from benign pleural effusions (BPE, n = 35). In Panel b, receiver operating characteristic (AUC) curve shows the diagnostic performance of the PET-CT score from the validation cohort Fig. 2 Diagnostic accuracy of the PET-CT score for the diagnosis of patients with malignant pleural effusions (MPEs) in the derivation cohort. By using a cut-off value of 4 points, the PET-CT score shows high sensitivity and specificity for the differential diagnosis of MPE (N = 84) from the overall benign pleural effusions (BPE, N = 115) (a), tuberculous pleural effusion (TPE, N = 57) (c), and non-tuberculosis pleural effusion (non-TPE, N = 58) (e). The receiver-operatingcharacteristic (AUC) curve shows the diagnostic performance of the PET-CT score for the differential diagnosis of MPE from BPE (b), TPE (d), and non-TPE (f) them had only tuberculous lesions and another 31 patients (54.4%) had non-nodule pulmonary lesions (Supplemental Table 2). Therefore, according to the PET-CT score system developed by this study, their PET-CT score did not reach 4 points and then would not be considered as malignant. Therefore, the present study demonstrated the PET-CT score system can steadily identify pleural tuberculosis from malignance by a comprehensive analysis of pleura, effusion, pulmonary and extra-pulmonary lesions. Nevertheless, this finding warrants further studies.
Third, the small proportion of pleural malignant mesothelioma cases found in the present study and in our previous studies [24,28] reflects the low incidence of this tumor in China. Tumor metastasis outside the thoracic cavity is uncommon in mesothelioma; patients with mesothelioma usually can only obtain at most 3 points (pleural thickening [≥3 mm] with increased 18 F-FDG uptake [2 points] plus increased pleural effusion 18 F-FDG uptake [1 point]), possibly leading to false negative results. In addition, given that our hospital is a respiratory diseasepredominant general hospital, the majority of patients with MPE admitted to our institute are lung cancer patients, and that the PET-CT score is based mainly on findings outside the pleura, diagnostic accuracy of the PET-CT score in the subgroup of MPE patients caused by non-lung tumors needs further investigation.
In conclusion, we have provided evidence to support that the PET-CT score can be reliably used in the differential diagnosis of MPE. Further endorsement with prospective studies from populations with lower tuberculosis burden and/or higher incidence of non-lung malignancies, including pleural mesothelioma, would be beneficial before the PET-CT score is introduced into standard clinical practice.
Authors' contributions MFY, ZHT, ZW, and YYZ recruited the patients, collected, analyzed the data, and did the statistical analyses. ZW, LLX, and XJW were responsible for the management of patients and conducted the study. WL, XZW, WW, and YHZ conducted the study and analyzed the data. TJ and MFY interpreted PET-CT data. HZS planned and led the study, analyzed the data, interpreted the results and wrote the first draft of the article. All authors contributed to the article's revision, agreed to its submission, and had full access to the original data.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethical approval This study was conducted in accordance with the Declaration of Helsinki and approved by the institutional ethics committee of Beijing Chaoyang Hospital, Beijing, China (ID 2014-ke-98).
Open Access This article is distributed under the terms of the Creative Comm ons Attribution 4.0 International License (http:// creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.