Abstract
Objective
To develop and evaluate a radiomics approach for classifying histological subtypes and epidermal growth factor receptor (EGFR) mutation status in lung cancer on PET/CT images.
Methods
PET/CT images of lung cancer patients were obtained from public databases and used to build two datasets: one for classifying histological subtype (156 adenocarcinomas and 32 squamous cell carcinomas) and one for classifying EGFR mutation status (38 mutant and 100 wild-type samples). Seven types of imaging features were extracted from the PET/CT images. Two machine learning algorithms were used to predict histological subtype and EGFR mutation status: random forest (RF) and gradient tree boosting (XGB). Each classifier used either a single type or multiple types of imaging features. In the latter case, the optimal combination of the seven feature types was selected by Bayesian optimization. Performance was assessed by receiver operating characteristic analysis, area under the curve (AUC), and tenfold cross validation.
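The evaluation scheme described above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the helper names `stratified_folds` and `auc` are hypothetical, the fold assignment is a simple round-robin (the abstract does not say whether the folds were stratified), and only the class sizes are taken from the text.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so that each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class out to the folds in turn.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Class sizes of the histological-subtype dataset:
# 156 adenocarcinomas (label 0) and 32 squamous cell carcinomas (label 1).
labels = [0] * 156 + [1] * 32
folds = stratified_folds(labels, k=10)

for n, fold in enumerate(folds):
    minority = sum(labels[i] for i in fold)
    print(f"fold {n}: {len(fold)} samples, {minority} squamous cell carcinomas")
```

With only 32 samples in the minority class, each of the ten folds holds three or four of them, so per-fold AUC estimates are coarse. Pooling the held-out predictions across folds before calling `auc` is one common way to get a more stable estimate.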
Results
In the classification of histological subtypes, the AUC values of the various classifiers were as follows: RF, single type: 0.759; XGB, single type: 0.760; RF, multiple types: 0.720; XGB, multiple types: 0.843. In the classification of EGFR mutation status, the AUC values were: RF, single type: 0.625; XGB, single type: 0.617; RF, multiple types: 0.577; XGB, multiple types: 0.659.
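To make the comparison easier to read, the reported AUC values can be tabulated and the effect of combining feature types computed per classifier. This is a small illustrative sketch; the dictionary layout and the helper names `best_setting` and `gain_from_multiple` are assumptions, while the numbers are copied from the Results above.

```python
# AUC values as reported in the Results; keys are (task, classifier, feature setting).
auc_reported = {
    ("histology", "RF", "single"): 0.759,
    ("histology", "XGB", "single"): 0.760,
    ("histology", "RF", "multiple"): 0.720,
    ("histology", "XGB", "multiple"): 0.843,
    ("egfr", "RF", "single"): 0.625,
    ("egfr", "XGB", "single"): 0.617,
    ("egfr", "RF", "multiple"): 0.577,
    ("egfr", "XGB", "multiple"): 0.659,
}

def best_setting(task):
    """Return the (classifier, feature setting) pair with the highest AUC for a task."""
    candidates = {k[1:]: v for k, v in auc_reported.items() if k[0] == task}
    return max(candidates, key=candidates.get)

def gain_from_multiple(task, clf):
    """AUC change when moving from a single feature type to multiple types."""
    diff = auc_reported[(task, clf, "multiple")] - auc_reported[(task, clf, "single")]
    return round(diff, 3)

for task in ("histology", "egfr"):
    print(task, "best:", best_setting(task))
    for clf in ("RF", "XGB"):
        print(f"  {clf}: multiple minus single = {gain_from_multiple(task, clf):+.3f}")
```

In both tasks, combining feature types raised the AUC for XGB and lowered it for RF.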
Conclusions
The radiomics approach to PET/CT images, together with XGB and Bayesian optimization, is useful for classifying histological subtypes and EGFR mutation status in lung cancer.
Funding
The present study was supported by JSPS KAKENHI (Grant Numbers JP16K19883 and JP19K17232). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics declarations
Conflict of interest
No potential conflicts of interest were disclosed.
Ethical approval
Not applicable.
Appendix
Optimal hyperparameters of RF in the classification of histological subtype were as follows.
max_depth: 315.0
max_features: 0.2753668962684457
min_samples_leaf: 0.05751743706118054
min_samples_split: 0.058694587169320536
min_weight_fraction_leaf: 0.0953635294579193
n_estimators: 160.0
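The count-like settings in the list above (`max_depth`, `n_estimators`) are printed as floats, which is typical of values returned by a hyperparameter search over quantized ranges (an assumption about their origin). A minimal sketch of turning the list into keyword arguments follows; the helper name `to_estimator_kwargs` is hypothetical, and the parameter names are assumed to match those of scikit-learn's `RandomForestClassifier`.

```python
# Reported optimum for RF, histological subtype (values copied from the list above).
raw_rf = {
    "max_depth": 315.0,
    "max_features": 0.2753668962684457,
    "min_samples_leaf": 0.05751743706118054,
    "min_samples_split": 0.058694587169320536,
    "min_weight_fraction_leaf": 0.0953635294579193,
    "n_estimators": 160.0,
}

# Parameters that represent counts and therefore need integer values.
INT_PARAMS = {"max_depth", "n_estimators"}

def to_estimator_kwargs(raw, int_params=INT_PARAMS):
    """Cast count-like parameters to int; leave fractional parameters as floats."""
    return {k: int(v) if k in int_params else v for k, v in raw.items()}

rf_kwargs = to_estimator_kwargs(raw_rf)
print(rf_kwargs)
```

Assuming scikit-learn is installed, the result could then be passed on as `RandomForestClassifier(**rf_kwargs)`. The fractional values for `min_samples_leaf` and `min_samples_split` would be read by scikit-learn as fractions of the training set rather than absolute sample counts.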
Optimal hyperparameters of XGB in the classification of histological subtype were as follows.
colsample_bytree: 0.7027440354853365
gamma: 7.050213660755347
learning_rate: 0.7977229209740264
max_depth: 2.0
min_child_weight: 2.4182773699264475
n_estimators: 321.0
subsample: 0.9103390721909822
Optimal hyperparameters of RF in the classification of EGFR mutation status were as follows.
max_depth: 193.0
max_features: 0.6572677943338099
min_samples_leaf: 0.12015049574522058
min_samples_split: 0.13611438197316517
min_weight_fraction_leaf: 0.20518268083328728
n_estimators: 160.0
Optimal hyperparameters of XGB in the classification of EGFR mutation status were as follows.
colsample_bytree: 0.674785098701497
gamma: 11.61630121641136
learning_rate: 0.6246833186520859
max_depth: 7.0
min_child_weight: 6.224162304714124
n_estimators: 400.0
subsample: 0.7311708739539864
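For a side-by-side view, the two XGB configurations listed in this appendix can be gathered into one structure and printed per parameter. This is an illustrative sketch only: the names `xgb_raw` and `cast_counts` are hypothetical, and the parameter names are assumed to correspond to those of the XGBoost scikit-learn wrapper (`XGBClassifier`).

```python
# Reported optima for XGB in the two tasks (values copied from the appendix).
xgb_raw = {
    "histology": {
        "colsample_bytree": 0.7027440354853365,
        "gamma": 7.050213660755347,
        "learning_rate": 0.7977229209740264,
        "max_depth": 2.0,
        "min_child_weight": 2.4182773699264475,
        "n_estimators": 321.0,
        "subsample": 0.9103390721909822,
    },
    "egfr": {
        "colsample_bytree": 0.674785098701497,
        "gamma": 11.61630121641136,
        "learning_rate": 0.6246833186520859,
        "max_depth": 7.0,
        "min_child_weight": 6.224162304714124,
        "n_estimators": 400.0,
        "subsample": 0.7311708739539864,
    },
}

# Tree depth and number of boosting rounds are counts, so cast them to int.
INT_PARAMS = ("max_depth", "n_estimators")

def cast_counts(params):
    """Return a copy of params with count-like entries converted to int."""
    return {k: int(v) if k in INT_PARAMS else v for k, v in params.items()}

xgb_kwargs = {task: cast_counts(p) for task, p in xgb_raw.items()}

print(f"{'parameter':18s} {'histology':>22} {'egfr':>22}")
for name in sorted(xgb_kwargs["histology"]):
    print(f"{name:18s} {xgb_kwargs['histology'][name]!s:>22} {xgb_kwargs['egfr'][name]!s:>22}")
```

Assuming the `xgboost` package is installed, either dictionary could be passed as `XGBClassifier(**xgb_kwargs["egfr"])`.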
Cite this article
Koyasu, S., Nishio, M., Isoda, H. et al. Usefulness of gradient tree boosting for predicting histological subtype and EGFR mutation status of non-small cell lung cancer on 18F FDG-PET/CT. Ann Nucl Med 34, 49–57 (2020). https://doi.org/10.1007/s12149-019-01414-0
DOI: https://doi.org/10.1007/s12149-019-01414-0