Artificial Neural Network Analysis of Volatile Organic Compounds for the Detection of Lung Cancer

  • John B. Butcher
  • Abigail V. Rutter
  • Adam J. Wootton
  • Charles R. Day
  • Josep Sulé-Suso
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 650)


Lung cancer is a widespread disease, and systematic, non-invasive, early detection of this progressive and life-threatening disorder is vital for patient outcomes. In this work we combine familiar and less familiar artificial neural network techniques to help address this task. Our preliminary results suggest that improved, automated, early diagnosis of lung cancer, based on the classification of volatile organic compounds detected in the exhaled gases of patients, may be possible. The dataset comprised the naturally occurring concentrations of a range of volatile organic compounds in the exhaled gases of 20 lung cancer patients and 20 healthy individuals, measured under strictly controlled conditions using Selected Ion Flow Tube Mass Spectrometry (SIFT-MS). We investigated the performance of several artificial neural network architectures, each with complementary pattern recognition properties, drawn from the domains of supervised, unsupervised and recurrent neural networks. The networks were trained on a subset of the data and evaluated on unseen test data, yielding classification accuracies ranging from 56% to 74%. In addition, the topological ordering properties of the unsupervised networks’ clusters show promise for providing further diagnostic insights, for example into patients who may have been heavy smokers but have not so far presented with lung cancer. With data collected from a larger number of subjects across a long time period, an automated assistive tool for the diagnosis of lung cancer via breath analysis could soon be possible.
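The evaluation protocol described above (train on a subset, score on unseen subjects) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers standing in for compound concentrations, the number of compounds, the 25% hold-out fraction and the network size are arbitrary, and the classifier is a simple random-hidden-layer network with a ridge-regression readout rather than any of the architectures actually used in this work.

```python
import numpy as np

def make_synthetic_breath_data(n_per_class=20, n_compounds=6, shift=1.0, seed=0):
    """Synthetic stand-in for VOC concentrations; NOT the study's data."""
    rng = np.random.default_rng(seed)
    healthy = rng.normal(0.0, 1.0, size=(n_per_class, n_compounds))
    cancer = rng.normal(shift, 1.0, size=(n_per_class, n_compounds))
    X = np.vstack([healthy, cancer])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y

def split_train_test(X, y, test_fraction=0.25, seed=0):
    """Shuffle subjects and hold out a fraction as unseen test data."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

def train_random_hidden_layer_net(X, y, n_hidden=30, ridge=1.0, seed=0):
    """Fixed random tanh hidden layer; only the linear readout is trained."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))
    bias = rng.normal(size=n_hidden)
    H = np.tanh(X @ W_in + bias)
    targets = 2.0 * y - 1.0  # map labels {0, 1} to {-1, +1}
    # Ridge regression readout: solve (H^T H + ridge * I) w = H^T targets
    w_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ targets)
    return W_in, bias, w_out

def predict(model, X):
    """Return 1.0 where the readout is positive, else 0.0."""
    W_in, bias, w_out = model
    return (np.tanh(X @ W_in + bias) @ w_out > 0).astype(float)

def accuracy(y_true, y_pred):
    """Fraction of subjects classified correctly."""
    return float(np.mean(y_true == y_pred))

X, y = make_synthetic_breath_data()
X_train, y_train, X_test, y_test = split_train_test(X, y)
model = train_random_hidden_layer_net(X_train, y_train)
test_accuracy = accuracy(y_test, predict(model, X_test))
print(f"held-out accuracy on synthetic data: {test_accuracy:.2f}")
```

With only 40 subjects, a single hold-out split gives a noisy accuracy estimate, so the printed figure says nothing about the results reported here; it only shows the mechanics of keeping test subjects out of training.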


Lung cancer diagnosis · Volatile organic compounds · SIFT · Artificial neural network analysis



We gratefully acknowledge funding from the Slater & Gordon Health Projects & Research Fund/14/15 Round 1/A34489.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. School of Computing and Mathematics, Keele University, Staffordshire, UK
  2. Institute for Science & Technology in Medicine, Guy Hilton Research Centre, Keele University, Staffordshire, UK
  3. Foundation Year Centre, Keele University, Staffordshire, UK
  4. Oncology Department, Royal Stoke University Hospital, University Hospitals of North Midlands, Staffordshire, UK
