Usefulness of gradient tree boosting for predicting histological subtype and EGFR mutation status of non-small cell lung cancer on 18F FDG-PET/CT

  • Original Article
  • Annals of Nuclear Medicine

Abstract

Objective

To develop and evaluate a radiomics approach for classifying histological subtypes and epidermal growth factor receptor (EGFR) mutation status in lung cancer on PET/CT images.

Methods

PET/CT images of lung cancer patients were obtained from public databases and used to build two datasets: one for classifying histological subtypes (156 adenocarcinomas and 32 squamous cell carcinomas) and one for classifying EGFR mutation status (38 mutant and 100 wild-type samples). Seven types of imaging features were extracted from the PET/CT images. Two machine learning algorithms were used to predict histological subtype and EGFR mutation status: random forest (RF) and gradient tree boosting (XGB). The classifiers used either a single type or multiple types of imaging features. In the latter case, the optimal combination of the seven types of imaging features was selected by Bayesian optimization. Performance was assessed with receiver operating characteristic analysis, area under the curve (AUC), and tenfold cross validation.
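The evaluation described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' code: the helper names `auc_score` and `stratified_folds` are hypothetical, the stratification of folds by class is an assumption (the abstract says only "tenfold cross validation"), and no assumption is made about whether AUC was pooled across folds or averaged per fold.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=10, seed=0):
    """Return a fold index (0..k-1) per sample, balancing each class over folds."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for rank, i in enumerate(idx):
            fold_of[i] = rank % k  # deal samples of this class round-robin
    return fold_of


# Class sizes from the Methods: 156 adenocarcinomas (0), 32 squamous cell carcinomas (1).
labels = [0] * 156 + [1] * 32
folds = stratified_folds(labels, k=10)

# Toy scores only, to show how auc_score behaves; these are not study data.
toy_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

With 32 samples in the minority class, each of the ten folds holds only three or four squamous cell carcinomas, which is one reason a per-fold AUC can be noisy on a dataset of this size.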

Results

In the classification of histological subtypes, the AUC values of the various classifiers were as follows: RF, single type: 0.759; XGB, single type: 0.760; RF, multiple types: 0.720; XGB, multiple types: 0.843. In the classification of EGFR mutation status, the AUC values were: RF, single type: 0.625; XGB, single type: 0.617; RF, multiple types: 0.577; XGB, multiple types: 0.659.
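As a reading aid, the short sketch below tabulates the AUC values reported above and computes how much the multiple-type XGB classifier gains over the best single-type classifier in each task. The dictionary keys and the helper `summarize` are illustrative names, not taken from the article; the numbers are copied from the Results.

```python
# AUC values copied from the Results paragraph.
auc = {
    "histological subtype": {
        "RF single": 0.759,
        "XGB single": 0.760,
        "RF multiple": 0.720,
        "XGB multiple": 0.843,
    },
    "EGFR mutation status": {
        "RF single": 0.625,
        "XGB single": 0.617,
        "RF multiple": 0.577,
        "XGB multiple": 0.659,
    },
}


def summarize(task_auc):
    """Return the best classifier and the gain of XGB multiple over the best single-type one."""
    best = max(task_auc, key=task_auc.get)
    best_single = max(task_auc["RF single"], task_auc["XGB single"])
    gain = round(task_auc["XGB multiple"] - best_single, 3)
    return best, gain


for task, values in auc.items():
    best, gain = summarize(values)
    print(f"{task}: best = {best}, gain over best single-type classifier = {gain:+.3f}")
```

The abstract reports point estimates only, so this arithmetic says nothing about whether the differences are statistically significant.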

Conclusions

The radiomics approach to PET/CT images, together with XGB and Bayesian optimization, is useful for classifying histological subtypes and EGFR mutation status in lung cancer.

Funding

The present study was supported by JSPS KAKENHI (Grant Numbers JP16K19883 and JP19K17232). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Corresponding author

Correspondence to Mizuho Nishio.

Ethics declarations

Conflict of interest

No potential conflicts of interest were disclosed.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Optimal hyperparameters of RF in the classification of histological subtype were as follows.

  • max_depth: 315.0

  • max_features: 0.2753668962684457

  • min_samples_leaf: 0.05751743706118054

  • min_samples_split: 0.058694587169320536

  • min_weight_fraction_leaf: 0.0953635294579193

  • n_estimators: 160.0
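The fractional values of min_samples_leaf and min_samples_split in the list above are easier to read as sample counts. The sketch below assumes these are parameters of scikit-learn's RandomForestClassifier, whose documentation states that a float value is treated as a fraction and converted with ceil(fraction * n_samples). The helper `as_sample_count` is hypothetical, and using the full dataset size (156 + 32 = 188) for n_samples is an assumption; within tenfold cross validation each training set would be smaller.

```python
import math


def as_sample_count(fraction, n_samples):
    """Convert a fractional hyperparameter to a sample count via ceil(fraction * n_samples)."""
    return math.ceil(fraction * n_samples)


# Assumption: n_samples is the full histological-subtype dataset (156 + 32).
n_samples = 156 + 32

min_samples_leaf = as_sample_count(0.05751743706118054, n_samples)
min_samples_split = as_sample_count(0.058694587169320536, n_samples)

print(min_samples_leaf, min_samples_split)
```

Under these assumptions a leaf must contain at least 11 samples, so individual trees stay shallow regardless of the very large max_depth value.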

Optimal hyperparameters of XGB in the classification of histological subtype were as follows.

  • colsample_bytree: 0.7027440354853365

  • gamma: 7.050213660755347

  • learning_rate: 0.7977229209740264

  • max_depth: 2.0

  • min_child_weight: 2.4182773699264475

  • n_estimators: 321.0

  • subsample: 0.9103390721909822

Optimal hyperparameters of RF in the classification of EGFR mutation status were as follows.

  • max_depth: 193.0

  • max_features: 0.6572677943338099

  • min_samples_leaf: 0.12015049574522058

  • min_samples_split: 0.13611438197316517

  • min_weight_fraction_leaf: 0.20518268083328728

  • n_estimators: 160.0

Optimal hyperparameters of XGB in the classification of EGFR mutation status were as follows.

  • colsample_bytree: 0.674785098701497

  • gamma: 11.61630121641136

  • learning_rate: 0.6246833186520859

  • max_depth: 7.0

  • min_child_weight: 6.224162304714124

  • n_estimators: 400.0

  • subsample: 0.7311708739539864
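Integer-valued hyperparameters such as max_depth and n_estimators are reported in this appendix as floats (for example 7.0 and 400.0), which would be consistent with a search routine that returns floats, although the source does not say so. A minimal sketch of casting them before use is given below. The helper `to_constructor_kwargs` is hypothetical, and the commented-out call assumes the scikit-learn style wrapper `xgboost.XGBClassifier`, which is not executed here.

```python
def to_constructor_kwargs(reported, integer_keys=("max_depth", "n_estimators")):
    """Copy reported hyperparameters, casting integer-valued floats to int."""
    kwargs = dict(reported)
    for key in integer_keys:
        value = kwargs[key]
        if float(value) != int(value):
            raise ValueError(f"{key}={value} is not integer-valued")
        kwargs[key] = int(value)
    return kwargs


# XGB values for EGFR mutation status, copied from the list above.
xgb_egfr_reported = {
    "colsample_bytree": 0.674785098701497,
    "gamma": 11.61630121641136,
    "learning_rate": 0.6246833186520859,
    "max_depth": 7.0,
    "min_child_weight": 6.224162304714124,
    "n_estimators": 400.0,
    "subsample": 0.7311708739539864,
}

xgb_egfr_kwargs = to_constructor_kwargs(xgb_egfr_reported)
print(xgb_egfr_kwargs["max_depth"], xgb_egfr_kwargs["n_estimators"])

# Assumed usage, requires the xgboost package:
# import xgboost
# model = xgboost.XGBClassifier(**xgb_egfr_kwargs)
```

The same helper applies unchanged to the RF lists, since they use the same two integer-valued keys.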

About this article

Cite this article

Koyasu, S., Nishio, M., Isoda, H. et al. Usefulness of gradient tree boosting for predicting histological subtype and EGFR mutation status of non-small cell lung cancer on 18F FDG-PET/CT. Ann Nucl Med 34, 49–57 (2020). https://doi.org/10.1007/s12149-019-01414-0

  • DOI: https://doi.org/10.1007/s12149-019-01414-0
