Establishing a survival probability prediction model for different lung cancer therapies

The Journal of Supercomputing


According to the Ministry of Health and Welfare (2017), cancer is the leading cause of death in Taiwan, with cancers of the trachea, bronchus, and lung being the most prevalent, so studying this disease is critically important. Taiwan’s National Health Insurance Research Database (NHIRDB), which covers 99.9% of residents, makes it possible to analyze comorbidities and predict the outcomes of clinical therapy. This study focuses on non-small cell lung cancer. We first obtained cancer registration indexes from two million individual patient records in the NHIRDB by screening for patients with a clinical diagnosis of ICD C33-34 (trachea, bronchus, and lung cancer). We then used these indexes to retrieve each patient's therapies and comorbidities, which served as input parameters for a predictive model of lung cancer survival probability. Linear and nonlinear data mining methods were employed to build prediction models of the effects of different therapies on the 3-year survival probability of lung cancer patients. The artificial neural network (ANN) model performed better than the logistic regression (LR) model. The best operating point of the ANN model on the ROC curve was at sensitivity = 77.6% and specificity = 76.8%, with AUROC = 83%.
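The evaluation metrics named in the abstract (sensitivity, specificity, AUROC, and a best point on the ROC curve) can be illustrated with a minimal sketch. The abstract does not say how the best point was chosen, so the use of Youden's J statistic below is an assumption, and the labels and scores are made-up toy values, not data from the study. The function names `roc_points`, `auroc`, and `best_point` are hypothetical helpers written for this illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (threshold, sensitivity, specificity)


def roc_points(y_true: List[int], scores: List[float]) -> List[Point]:
    """Sensitivity and specificity at every distinct score threshold.

    A case is predicted positive when its score is >= the threshold.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= thr)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < thr)
        points.append((thr, tp / pos, tn / neg))
    return points


def auroc(y_true: List[int], scores: List[float]) -> float:
    """AUROC as the probability that a random positive case outscores
    a random negative case (ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_point(points: List[Point]) -> Point:
    """Assumed criterion: maximize Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


# Toy example: 1 = positive outcome, 0 = negative outcome (hypothetical labels),
# scores = predicted probabilities from any classifier (LR, ANN, ...).
y_true = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.65, 0.6, 0.4, 0.3, 0.2, 0.1]

thr, sens, spec = best_point(roc_points(y_true, scores))
print(f"AUROC = {auroc(y_true, scores):.3f}")
print(f"best threshold = {thr}, sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Running the same functions on the predicted probabilities of two models, such as an LR and an ANN, gives one way to compare them on a common test set. Other criteria for the best point, such as the point closest to the top-left corner of the ROC plot, would be equally plausible readings of the abstract.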





  1. Taiwan Ministry of Health and Welfare (2017) 2017 cause of death statistics analysis

  2. Cassidy A et al (2007) Lung cancer risk prediction: a tool for early detection. Int J Cancer 120(1):1–6

  3. Young RP et al (2009) COPD prevalence is increased in lung cancer, independent of age, sex and smoking history. Eur Respir J 34(2):380–386

  4. Yu Y-H et al (2011) Increased lung cancer risk among patients with pulmonary tuberculosis: a population cohort study. J Thorac Oncol 6(1):32–37

  5. Grose D, Milroy R (2011) Chronic obstructive pulmonary disease: a complex comorbidity. J Comorbidity 1:45–50

  6. Wang S et al (2012) Impact of age and comorbidity on non-small-cell lung cancer treatment in older veterans. J Clin Oncol 30(13):1447–1455

  7. Iqbal U et al (2015) Is long-term use of benzodiazepine a risk for cancer? Medicine 94(6):e483

  8. Chen Y-C et al (2011) Taiwan’s National Health Insurance Research Database: administrative health care database as study object in bibliometrics. Scientometrics 86(2):365–380

  9. Habr-Gama A et al (2004) Operative versus nonoperative treatment for stage 0 distal rectal cancer following chemoradiation therapy: long-term results. Ann Surg 240(4):711

  10. Samuel OW et al (2017) An integrated decision support system based on ANN and Fuzzy_AHP for heart failure risk prediction. Expert Syst Appl 68:163–172

  11. StatSoft, Inc. (2001) STATISTICA (data analysis software system), version 6, vol 150

  12. MedCalc (2017) Comparison of proportions calculator

  13. MedCalc Software (2016) MedCalc statistics for biomedical research: software manual, p 295

  14. De Castro AK et al (2010) Applied hybrid model in the neuropsychological diagnosis of the Alzheimer’s disease: a decision making study case. Int J Soc Humanist Comput 1(3):331–345

  15. Jacquemet G et al (2016) L-type calcium channels regulate filopodia stability and cancer cell invasion downstream of integrin signalling. Nat Commun 7:13297

  16. Czejdo BD, Baszun M (2010) Remote patient monitoring system and a medical social network. Int J Soc Humanist Comput 1(3):273–281

  17. Bach PB et al (2003) Variations in lung cancer risk among smokers. J Natl Cancer Inst 95(6):470–478

  18. Cassidy A et al (2008) The LLP risk model: an individual risk prediction model for lung cancer. Br J Cancer 98(2):270

  19. Spitz MR et al (2007) A risk model for prediction of lung cancer. J Natl Cancer Inst 99(9):715–726

  20. Tammemägi MC et al (2013) Selection criteria for lung-cancer screening. N Engl J Med 368(8):728–736

  21. D’Amelio A Jr et al (2010) Comparison of discriminatory power and accuracy of three lung cancer risk models. Br J Cancer 103(3):423

  22. Etzel CJ et al (2008) Development and validation of a lung cancer risk prediction model for African-Americans. Cancer Prev Res 1(4):255–265

  23. Field JK, Raji OY, Duffy SW, Agbaje OF, Baker SG, Christiani DC, Cassidy A (2013) Predictive accuracy of the Liverpool Lung Project risk model. Ann Intern Med 158(7):568–569


Author information


Corresponding author

Correspondence to Chien-Yeh Hsu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lee, HA., Rau, HH., Chao, L.R. et al. Establishing a survival probability prediction model for different lung cancer therapies. J Supercomput 76, 6501–6514 (2020).
