Annals of Operations Research, Volume 263, Issue 1–2, pp 479–499

Predicting pediatric clinic no-shows: a decision analytic framework using elastic net and Bayesian belief network

  • Kazim Topuz
  • Hasmet Uner
  • Asil Oztekin
  • Mehmet Bayram Yildirim
Data Mining and Analytics


No-shows are becoming a major problem in primary care facilities: they create additional costs for the facility and adversely affect the quality of patient care. Accurate no-show prediction is an important input to any overbooking strategy. This study proposes a hybrid probabilistic prediction framework that integrates the elastic net (EN) variable-selection methodology with a probabilistic Bayesian belief network (BBN). The framework predicts the “no-show probability of the patient(s)” from demographics, socioeconomic status, current appointment information, and the appointment attendance history of the patient and the family. It is validated on ten years of data from a local pediatric clinic. The results show that the EN-based BBN framework performs comparably to the best approaches reported in the literature. More importantly, the methodology provides novel information on the interrelations among predictors and on the conditional probability of “no-shows.” The output of the model can be fed into an appointment scheduling system to support a robust overbooking strategy.
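The two-stage idea described above (elastic net to select predictors, then a probabilistic model over the selected predictors) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The data are a tiny, hand-built toy set, the predictor names `prior_no_show` and `rainy_day` are hypothetical, the elastic net is fitted with a plain proximal-gradient loop rather than a library solver, and the "network" is reduced to a single conditional probability table estimated from counts, which is only the simplest stand-in for a full BBN.

```python
import math

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_elastic_net_logistic(X, y, lam=0.1, alpha=0.5, lr=0.5, iters=500):
    """Proximal gradient descent on
    mean log-loss + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2)."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(iters):
        grad0, grad = 0.0, [0.0] * p
        for row, label in zip(X, y):
            z = intercept + sum(b * x for b, x in zip(beta, row))
            resid = 1.0 / (1.0 + math.exp(-z)) - label
            grad0 += resid / n
            for j in range(p):
                grad[j] += resid * row[j] / n
        intercept -= lr * grad0  # the intercept is not penalized
        for j in range(p):
            # Gradient step on the smooth part (log-loss + ridge term) ...
            smooth_step = beta[j] - lr * (grad[j] + lam * (1 - alpha) * beta[j])
            # ... followed by soft-thresholding for the lasso term.
            beta[j] = soft_threshold(smooth_step, lr * lam * alpha)
    return intercept, beta

def estimate_cpt(X, y, parent_idx):
    """P(no-show = 1 | parent values), estimated from raw counts."""
    counts = {}
    for row, label in zip(X, y):
        key = tuple(row[j] for j in parent_idx)
        cell = counts.setdefault(key, [0, 0])
        cell[0] += label
        cell[1] += 1
    return {key: missed / total for key, (missed, total) in counts.items()}

# Hypothetical toy data: two predictors coded -1/+1, four visits per cell.
# Only prior_no_show changes the no-show rate; rainy_day is pure noise.
FEATURES = ["prior_no_show", "rainy_day"]
X, y = [], []
for prior in (1, -1):
    for rainy in (1, -1):
        missed = 3 if prior == 1 else 1  # no-shows out of 4 visits
        for k in range(4):
            X.append([prior, rainy])
            y.append(1 if k < missed else 0)

# Stage 1: keep predictors whose elastic-net coefficient is non-zero.
_, beta = fit_elastic_net_logistic(X, y)
selected = [name for name, b in zip(FEATURES, beta) if b != 0.0]

# Stage 2: conditional probability of a no-show given the selected predictors.
parents = [FEATURES.index(name) for name in selected]
cpt = estimate_cpt(X, y, parents)
print(selected, cpt)
```

In this toy set the second predictor carries no information about the outcome, so the lasso part of the penalty keeps its coefficient at zero and only `prior_no_show` enters the probability table. A real application would learn a network structure over many selected predictors, which this sketch does not attempt.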


No-show prediction · Elastic net · Bayesian belief networks · Healthcare analytics



We are grateful to the two anonymous reviewers for their constructive comments and suggestions.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Kazim Topuz (1)
  • Hasmet Uner (2)
  • Asil Oztekin (3)
  • Mehmet Bayram Yildirim (1)

  1. Department of Industrial and Manufacturing Engineering, Wichita State University, Wichita, USA
  2. Department of Pediatrics, Kansas University School of Medicine, Wichita, USA
  3. Operations and Information Systems, Biomedical Engineering and Biotechnology Program, University of Massachusetts Lowell, Lowell, USA
