Predicting Medical Outcomes

Bellazzi, Riccardo; Dagliati, Arianna; Nicora, Giovanna

doi:10.1007/978-3-031-09108-7_11

Part of the book series: Cognitive Informatics in Biomedicine and Healthcare (CIBH)

Abstract

Clinical outcomes are measurable changes in health, function or quality of life that result from patients’ care. The ability to predict changes related to specific care actions, including the administration of drugs, therapeutic protocols, guidelines and technology-related interventions, is of great interest, particularly when such actions are implemented in the real world, after clinical trials. AI and machine learning promise methods and tools that add several important elements to traditional statistical modeling techniques: the capability of analyzing and synthesizing very large data sets, tools for handling non-linear relationships between variables, and strategies for incorporating prior knowledge from experts into the analysis. This chapter introduces the problem of predicting different types of clinical outcomes, ranging from binary responses to temporal trajectories, and gives an overview of AI approaches able to deal with such types of prediction. The chapter also discusses how to carefully assess prediction performance and finally provides some application examples in different clinical areas.

Notes

1. ImageNet is an image database that has been very important in advancing computer vision and deep learning research, also by means of a number of large image recognition challenges.
2. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device (accessed August 18, 2022).
3. Hematopoietic disorders are heterogeneous diseases that can be caused by problems with red blood cells, white blood cells, platelets, bone marrow, lymph nodes, and the spleen.

Author information

Authors and Affiliations

Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Riccardo Bellazzi, Arianna Dagliati & Giovanna Nicora
Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri, Pavia, Italy
Riccardo Bellazzi

Corresponding author

Correspondence to Riccardo Bellazzi.

Editor information

Editors and Affiliations

University of Washington, Seattle, WA, USA
Trevor A. Cohen
New York Academy of Medicine, New York, NY, USA
Vimla L. Patel
Columbia University, New York, NY, USA
Edward H. Shortliffe

About this chapter

Cite this chapter

Bellazzi, R., Dagliati, A., Nicora, G. (2022). Predicting Medical Outcomes. In: Cohen, T.A., Patel, V.L., Shortliffe, E.H. (eds) Intelligent Systems in Medicine and Health. Cognitive Informatics in Biomedicine and Healthcare. Springer, Cham. https://doi.org/10.1007/978-3-031-09108-7_11

DOI: https://doi.org/10.1007/978-3-031-09108-7_11
Published: 10 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09107-0
Online ISBN: 978-3-031-09108-7
eBook Packages: Medicine, Medicine (R0)
