Predicting Blood Donors Using Machine Learning Techniques

Kauten, Christian; Gupta, Ashish; Qin, Xiao; Richey, Glenn

doi:10.1007/s10796-021-10149-1

Published: 17 July 2021

Volume 24, pages 1547–1562, (2022)

Information Systems Frontiers

Christian Kauten¹,
Ashish Gupta ORCID: orcid.org/0000-0003-4593-3261²,
Xiao Qin¹ &
Glenn Richey³

Abstract

The United States’ blood supply chain is experiencing market decline due to recent innovations in surgical practice, transfusion management, and hospital policy. These innovations strain US blood centers, resulting in cuts to surge capacities, consolidation, and reduced funding for research and outreach programs. In this study, we use data from a regional blood center to explore the application of contemporary machine learning algorithms for modeling donor retention. Such predictive models of donor retention can be used to design more cost-effective donor outreach programs. Using data from a large US blood center paired with random forest classifiers, we are able to build a model of donor retention with a Matthews correlation coefficient of 0.851.
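
For readers unfamiliar with the Matthews correlation coefficient (MCC) reported in the abstract, the following is a minimal sketch of how the metric can be computed from binary confusion-matrix counts. It is not the authors' code: the function names, the label encoding (1 = donor returned, 0 = donor lapsed), and the toy labels are assumptions made only for illustration, and returning 0.0 when the denominator vanishes is one common convention rather than part of the definition.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0.0 when a row or column of the
        # confusion matrix is empty and the ratio is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels, not data from the study:
# 1 = donor returned, 0 = donor lapsed.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

mcc = matthews_corrcoef(*confusion_counts(y_true, y_pred))
print(f"MCC = {mcc:.3f}")
```

The MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), and it uses all four cells of the confusion matrix, which is why it is often preferred over accuracy when the classes are imbalanced.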

Acknowledgements

Xiao Qin’s work is supported by the U.S. National Science Foundation under Grants IIS-1618669, OAC-1642133, and CCF-0845257.

Author information

Authors and Affiliations

Computer Science and Software Engineering, Samuel Ginn College of Engineering, Auburn University, Auburn, AL, 36849, USA
Christian Kauten & Xiao Qin
Department of Systems & Technology, Harbert College of Business, Auburn University, Auburn, AL, 36849, USA
Ashish Gupta
Department of Supply Chain Management, Harbert College of Business, Auburn University, Auburn, AL, 36849, USA
Glenn Richey

Corresponding author

Correspondence to Ashish Gupta.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Symbols and Annotations

Table 5 Symbols and annotations

Appendix B: Dataset Features

Table 6 Dataset Features: The engineered features of the donor retention dataset

Glossary

BC: Blood Center (Name blinded)
ARC: The American Red Cross
ANN: Artificial Neural Network
CART: Classification & Regression Trees
CBA: Classification Based Association
CDC: The Centers for Disease Control and Prevention
DRG: Diagnosis Related Groups
FEMA: The Federal Emergency Management Agency
FDA: The Food and Drug Administration
GB: Gradient Boosting
IRCS: Indian Red Cross Society
kNN: k-Nearest Neighbors
LDA: Linear Discriminant Analysis
MCC: Matthews correlation coefficient
NBCUS: The National Blood Collection and Utilization Survey
PBM: Patient Blood Management
RBC: Red Blood Cell
RF: Random Forest
RFM: Recency, Frequency, and Monetary Value
SMOTE: Synthetic Minority Oversampling Technique
SVM: Support Vector Machine
UCI: University of California, Irvine
US: United States

About this article

Cite this article

Kauten, C., Gupta, A., Qin, X. et al. Predicting Blood Donors Using Machine Learning Techniques. Inf Syst Front 24, 1547–1562 (2022). https://doi.org/10.1007/s10796-021-10149-1

Accepted: 20 May 2021
Published: 17 July 2021
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10796-021-10149-1
