
Performance Comparison of Individual and Ensemble CNN Models for the Classification of Brain 18F-FDG-PET Scans


The high background glucose metabolism of normal gray matter on [18F]-fluoro-2-D-deoxyglucose (FDG) positron emission tomography (PET) of the brain results in a low signal-to-background ratio, which may increase the chance of missing important findings in patients with intracranial malignancies. To explore whether a deep learning classifier can help distinguish normal from abnormal findings on PET brain images, this study evaluated the performance of a two-dimensional convolutional neural network (2D-CNN) in classifying FDG PET brain scans as normal (N) or abnormal (A).

Methods: Two hundred eighty-nine brain FDG-PET scans (N, n = 150; A, n = 139), yielding a total of 68,260 images, were included. Nine individual 2D-CNN models, covering three different window settings for each of the axial, coronal, and sagittal axes, were trained and validated. The performance of these individual models and of ensemble models was evaluated and compared using a test dataset. Odds ratio, Akaike’s information criterion (AIC), area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy, and standard deviation (SD) were calculated.

Results: The optimal window setting for classifying normal and abnormal scans differed for each axis of the individual models. An ensemble model using different axes with an optimized window setting (window-triad) performed better than ensemble models using the same axis and different window settings (axis-triad). An increase in odds ratio and a decrease in SD were observed in both axis-triad and window-triad models compared with individual models, whereas improvements in AUC and AIC were seen only in window-triad models. An overall model averaging the probabilities of all individual models showed the best accuracy, 82.0%.

Conclusions: Ensembling data across different window settings and axes was effective in improving 2D-CNN performance parameters for the classification of brain FDG-PET scans. If prospectively validated with a larger cohort of patients, similar models could provide decision support in a clinical setting.
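The averaging step mentioned in the Results (an overall model that averages the probabilities of the individual models) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the probability values are invented, the 0.5 decision threshold is hypothetical, and the function names (`ensemble_probability`, `classify`, `accuracy`) do not come from the study. It shows only the general idea of probability averaging, not the authors' implementation.

```python
from statistics import mean


def ensemble_probability(model_probs):
    """Average the 'abnormal' probabilities that several models give one scan."""
    return mean(model_probs)


def classify(prob, threshold=0.5):
    """Label a scan abnormal (A) or normal (N); the threshold is an assumption."""
    return "A" if prob >= threshold else "N"


def accuracy(predictions, labels):
    """Fraction of scans whose predicted label matches the reference label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Invented example data: probability of 'abnormal' from three hypothetical
# models (axial, coronal, sagittal) for four scans.
scans = [
    {"label": "A", "probs": [0.90, 0.40, 0.70]},
    {"label": "N", "probs": [0.20, 0.60, 0.10]},
    {"label": "A", "probs": [0.45, 0.80, 0.55]},
    {"label": "N", "probs": [0.30, 0.35, 0.20]},
]

labels = [s["label"] for s in scans]

# Predictions of the single coronal model (index 1) versus the averaged ensemble.
coronal_preds = [classify(s["probs"][1]) for s in scans]
ensemble_preds = [classify(ensemble_probability(s["probs"])) for s in scans]

print("coronal accuracy: ", accuracy(coronal_preds, labels))
print("ensemble accuracy:", accuracy(ensemble_preds, labels))
```

In this toy data the coronal model misclassifies two scans that the averaged probabilities classify correctly, which is the kind of error cancellation that motivates ensembling. Real scans would also need a rule for combining slice-level outputs into a scan-level probability, which this sketch does not address.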







Abbreviations

2D-CNN: two-dimensional convolutional neural network
AIC: Akaike’s information criterion
AUC: area under curve
CSV: comma-separated value file
CT: computed tomography
MR: magnetic resonance
PET: positron emission tomography
PNG: Portable Network Graphics format
ROC: receiver operating characteristic
SD: standard deviation
SUV: standardized uptake value



Author information



Corresponding author

Correspondence to Guido A. Davidzon.

Ethics declarations

This retrospective study protocol was approved by the institutional review board and was found to comply with the standards of the Health Insurance Portability and Accountability Act.

Conflict of Interest

JKE and CZ are employed by and affiliated with DimensionalMechanics Inc.


Cite this article

Nobashi, T., Zacharias, C., Ellis, J.K. et al. Performance Comparison of Individual and Ensemble CNN Models for the Classification of Brain 18F-FDG-PET Scans. J Digit Imaging 33, 447–455 (2020).



Keywords

  • Deep learning
  • 2D-CNN
  • Ensemble
  • Brain
  • Cancer