Abstract
The United States’ blood supply chain is experiencing market decline due to recent innovations in surgical practice, transfusion management, and hospital policy. These innovations strain US blood centers, resulting in cuts to surge capacities, consolidation, and reduced funding for research and outreach programs. In this study, we use data from a regional blood center to explore the application of contemporary machine learning algorithms for modeling donor retention. Such predictive models of donor retention can be used to design more cost-effective donor outreach programs. Using data from a large US blood center paired with random forest classifiers, we build a model of donor retention with a Matthews correlation coefficient of 0.851.
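For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a Matthews correlation coefficient (MCC) can be computed from binary predictions. The function names and the toy labels are illustrative assumptions, not the study's code or data; the convention of returning 0.0 when the denominator vanishes is one common choice and may differ from what the authors used.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = retained donor, 0 = lapsed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal sum is zero (assumed convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical labels for illustration only; not drawn from the blood center data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

mcc = matthews_corrcoef(*confusion_counts(y_true, y_pred))
print(f"MCC = {mcc:.3f}")
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction). Because it uses all four cells of the confusion matrix, it is often preferred over accuracy when the classes are imbalanced.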
Acknowledgements
Xiao Qin’s work is supported by the U.S. National Science Foundation under Grants IIS-1618669, OAC-1642133, and CCF-0845257.
Appendices
Appendix A: Symbols and Annotations
Appendix B: Dataset Features
Glossary
- BC: Blood Center (name blinded)
- ARC: The American Red Cross
- ANN: Artificial Neural Network
- CART: Classification & Regression Trees
- CBA: Classification Based Association
- CDC: The Centers for Disease Control and Prevention
- DRG: Diagnosis Related Groups
- FEMA: The Federal Emergency Management Agency
- FDA: The Food and Drug Administration
- GB: Gradient Boosting
- IRCS: Indian Red Cross Society
- kNN: k-Nearest Neighbors
- LDA: Linear Discriminant Analysis
- MCC: Matthews correlation coefficient
- NBCUS: The National Blood Collection and Utilization Survey
- PBM: Patient Blood Management
- RBC: Red Blood Cell
- RF: Random Forest
- RFM: Recency, Frequency, and Monetary Value
- SMOTE: Synthetic Minority Oversampling Technique
- SVM: Support Vector Machine
- UCI: University of California, Irvine
- US: United States
Cite this article
Kauten, C., Gupta, A., Qin, X. et al. Predicting Blood Donors Using Machine Learning Techniques. Inf Syst Front 24, 1547–1562 (2022). https://doi.org/10.1007/s10796-021-10149-1
DOI: https://doi.org/10.1007/s10796-021-10149-1