Risk Stratification and Prognosis Using Predictive Modelling and Big Data Approaches

  • Chapter
Personalized and Precision Medicine Informatics

Part of the book series: Health Informatics ((HI))

Abstract

Predictive modeling is the application of supervised machine learning methods to risk assessment and stratification, diagnosis, prognosis, and therapeutics. As big biomedical data become more widely available, predictive modeling is increasingly used to apply those data to clinical medicine, public health, and biomedical research. This chapter describes key methods and application examples in the development, validation, dissemination, and deployment of clinical predictive models.

Corresponding author

Correspondence to Shyam Visweswaran.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Visweswaran, S., Cooper, G.F. (2020). Risk Stratification and Prognosis Using Predictive Modelling and Big Data Approaches. In: Adam, T., Aliferis, C. (eds) Personalized and Precision Medicine Informatics. Health Informatics. Springer, Cham. https://doi.org/10.1007/978-3-030-18626-5_7

  • DOI: https://doi.org/10.1007/978-3-030-18626-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18625-8

  • Online ISBN: 978-3-030-18626-5

  • eBook Packages: Medicine (R0)
