Predicting Cardiovascular Risk Level Based on Biochemical Risk Factor Indicators Using Machine Learning: A Case Study in Indonesia

  • Yaya Heryadi
  • Raymond Kosala
  • Raymond Bahana
  • Indrajani Suteja
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11432)


Early detection of cardiovascular risk level remains an important issue in healthcare. It is an important preventive measure against cardiovascular disease because it contributes significantly to reducing mortality rates and cardiovascular events. Before a prediction model of cardiovascular risk can be developed, the dominant predictor variables must be identified. Prominent studies have proposed a large number of predictor variables. The premise of this study is that, although some predictor variables may be universal, others may be associated with the local lifestyle that governs patient behavior. This paper presents a verification study of previous work on predicting cardiovascular risk level, using the lab records of Indonesian adult patients as the input dataset. To this end, the study selected dominant biochemical indicators as predictor variables and trained machine learning models as classifiers. Finally, the study compared the performance of several prominent classifier models: XGBoost, Random Forest, k-NN, Gradient Boosting, Artificial Neural Network (Multilayer Perceptron), Decision Tree, and Ada Boost. The results show that the XGBoost model achieved the best training and testing accuracy (0.965 and 0.964), compared with Random Forest (0.964 and 0.962), 5-NN (0.952 and 0.948), Gradient Boosting (0.948 and 0.940), Artificial Neural Network (0.945 and 0.933), Decision Tree (0.861 and 0.860), and Ada Boost (0.748 and 0.718).
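
As a small illustration of the comparison reported above, the following Python sketch tabulates the accuracies quoted in the abstract, ranks the models by testing accuracy, and computes the difference between training and testing accuracy for each model. The helper names `rank_by_test_accuracy` and `generalization_gaps` are illustrative and do not come from the paper, and the train-minus-test "gap" is an assumption of this sketch rather than a metric the authors report.

```python
# Reported (training, testing) accuracy per model, copied from the abstract.
REPORTED = {
    "XGBoost": (0.965, 0.964),
    "Random Forest": (0.964, 0.962),
    "5-NN": (0.952, 0.948),
    "Gradient Boosting": (0.948, 0.940),
    "Artificial Neural Network": (0.945, 0.933),
    "Decision Tree": (0.861, 0.860),
    "Ada Boost": (0.748, 0.718),
}


def rank_by_test_accuracy(results):
    """Return model names sorted from highest to lowest testing accuracy."""
    return sorted(results, key=lambda name: results[name][1], reverse=True)


def generalization_gaps(results):
    """Return training minus testing accuracy per model, rounded to 3 decimals.

    Hypothetical helper: the gap is an illustrative measure, not one
    reported in the paper.
    """
    return {
        name: round(train - test, 3)
        for name, (train, test) in results.items()
    }


if __name__ == "__main__":
    gaps = generalization_gaps(REPORTED)
    for name in rank_by_test_accuracy(REPORTED):
        train, test = REPORTED[name]
        print(f"{name:<26} train={train:.3f} test={test:.3f} gap={gaps[name]:.3f}")
```

On these figures, Ada Boost shows the largest difference between training and testing accuracy (0.030), while XGBoost and Decision Tree show the smallest (0.001).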


Keywords: Cardiovascular risk prediction, Machine learning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yaya Heryadi (1), corresponding author
  • Raymond Kosala (2)
  • Raymond Bahana (2)
  • Indrajani Suteja (3)
  1. Computer Science Department, BINUS Graduate Program – Doctor of Computer Science, Bina Nusantara University, Jakarta, Indonesia
  2. Computer Science Program, Binus International, Bina Nusantara University, Jakarta, Indonesia
  3. School of Information System, Bina Nusantara University, Jakarta, Indonesia
