Abstract
Current clinical practice guidelines for managing Coronary Artery Disease (CAD) account for general cardiovascular risk factors. However, they do not provide a framework that considers patient-specific characteristics. Using the electronic health records of 21,460 patients, we create data-driven models for personalized CAD management that significantly improve health outcomes relative to the standard of care. We develop binary classifiers to predict whether a patient will experience an adverse event due to CAD within a 10-year time frame. Combining the patients’ medical history and clinical examination results, we achieve an AUC of 81.5%. For each treatment, we also create a series of regression models based on different supervised machine learning algorithms. We estimate the outcome of interest, the time from diagnosis to a potential adverse event (TAE), with an average R² of 0.801. Leveraging combinations of these models, we present ML4CAD, a novel personalized prescriptive algorithm. Considering the recommendations of multiple predictive models at once, ML4CAD uses a voting mechanism to identify, for every patient, the therapy with the best expected TAE. We evaluate its performance by measuring the prescription effectiveness and robustness under alternative ground truths. We show that our methodology improves the expected TAE over the current baseline by 24.11%, increasing it from 4.56 to 5.66 years. The algorithm performs particularly well for the male (24.3% improvement) and Hispanic (58.41% improvement) subpopulations. Finally, we create an interactive interface, providing physicians with an intuitive, accurate, readily implementable, and effective tool.
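The voting idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the ML4CAD implementation: the abstract does not specify the voting rule, so the sketch assumes that each regression model votes for the treatment with its highest predicted TAE, that the treatment with the most votes is prescribed, and that ties are broken by the mean predicted TAE. The function names, the model names, the treatment labels, and the predicted values are all hypothetical.

```python
from collections import Counter
from statistics import mean


def prescribe_by_vote(predicted_tae):
    """Pick a treatment by majority vote across models.

    predicted_tae maps model name -> {treatment: predicted TAE in years}.
    Returns (treatment, share of models that voted for it).
    """
    # Each model votes for the treatment it predicts gives the longest TAE.
    votes = Counter(
        max(per_treatment, key=per_treatment.get)
        for per_treatment in predicted_tae.values()
    )
    top = max(votes.values())
    tied = [t for t, n in votes.items() if n == top]

    # Assumed tie-break: highest mean predicted TAE across all models.
    def mean_tae(treatment):
        return mean(p[treatment] for p in predicted_tae.values())

    best = max(tied, key=mean_tae)
    return best, votes[best] / len(predicted_tae)


def relative_improvement(baseline, new):
    """Percentage improvement of `new` over `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical predictions for one patient (years until an adverse event).
example = {
    "cart": {"CABG": 5.1, "PCI": 4.8, "medication": 4.2},
    "random_forest": {"CABG": 5.4, "PCI": 5.6, "medication": 4.0},
    "xgboost": {"CABG": 5.9, "PCI": 5.2, "medication": 4.4},
}

if __name__ == "__main__":
    treatment, agreement = prescribe_by_vote(example)
    print(treatment, round(agreement, 2))
    print(round(relative_improvement(4.56, 5.66), 2))
```

In this made-up example, two of the three models favor CABG, so it is the prescribed treatment with an agreement share of two thirds. Applying `relative_improvement` to the rounded means quoted in the abstract (4.56 and 5.66 years) gives roughly 24.1%, in line with the reported 24.11%; the small difference presumably comes from rounding of the reported means.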
Acknowledgements
The authors wish to thank the anonymous reviewers and the associate editor of the journal for their helpful comments on this manuscript. They also thank Theofanie Mela, MD (Massachusetts General Hospital), and Abeel A. Mangi, MD (Yale Medicine Department), for sharing clinical expertise, as well as Bill Adams, MD, and the Boston Medical Center for the use of its i2b2 database.
Funding
This research was supported by the National Science Foundation grant 6926678 [“SHB: Type II (INT): Collaborative Research: Algorithmic Approaches to Personalized Health Care”].
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Ethics approval
The Massachusetts Institute of Technology and Boston Medical Center Institutional Review Boards approved the study.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Availability of data and material
All datasets used in this study come from an academic medical center that is subject to the Health Insurance Portability and Accountability Act. Due to data protection laws, the dataset cannot be directly released to another organization. We invite readers who would like to gain access to the dataset to establish a data use agreement with the BMC.
Appendix
Acronym | Definition |
---|---|
AHA | American Heart Association |
ASA | Aspirin |
AUC | Area Under the ROC Curve |
BMC | Boston Medical Center |
BMI | Body Mass Index |
CABG | Coronary Artery Bypass Graft |
CAD | Coronary Artery Disease |
CART | Classification and Regression Trees |
DMLA | Degree of ML Agreement |
ECG | Electrocardiogram |
EMR | Electronic Medical Records |
FDA | US Food and Drug Administration |
HDL | High-Density Lipoprotein |
k-NN | k-Nearest Neighbors |
LDL | Low-Density Lipoprotein |
ML | Machine Learning |
OCT | Optimal Classification Trees |
ORT | Optimal Regression Trees |
PCI | Percutaneous Coronary Intervention |
PE | Prescription Effectiveness |
PR | Prescription Robustness |
ROC | Receiver Operating Characteristic |
TAE | Time from diagnosis to a potential Adverse Event |
List of all acronyms used in the manuscript, in alphabetical order, along with the corresponding definitions.
Cite this article
Bertsimas, D., Orfanoudaki, A. & Weiner, R.B. Personalized treatment for coronary artery disease patients: a machine learning approach. Health Care Manag Sci 23, 482–506 (2020). https://doi.org/10.1007/s10729-020-09522-4
DOI: https://doi.org/10.1007/s10729-020-09522-4