Predicting Blood Donors Using Machine Learning Techniques

Information Systems Frontiers

Abstract

The United States’ blood supply chain is experiencing market decline due to recent innovations in surgical practice, transfusion management, and hospital policy. These innovations strain US blood centers, resulting in cuts to surge capacities, consolidation, and reduced funding for research and outreach programs. In this study, we use data from a regional blood center to explore the application of contemporary machine learning algorithms for modeling donor retention. Such predictive models of donor retention can be used to design more cost-effective donor outreach programs. Using data from a large US blood center paired with random forest classifiers, we are able to build a model of donor retention with a Matthews correlation coefficient of 0.851.
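
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a Matthews correlation coefficient can be computed for a binary classifier from confusion-matrix counts. The function names `confusion_counts` and `mcc_from_counts` and the toy labels are illustrative assumptions; they are not taken from the article and do not reproduce its data or its result.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count (tp, tn, fp, fn) for binary labels, where 1 marks the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here), e.g. when only one class is ever predicted.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical labels: 1 = donor returned, 0 = donor lapsed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(mcc_from_counts(*confusion_counts(y_true, y_pred)))  # 0.5
```

Unlike accuracy, the coefficient uses all four cells of the confusion matrix, which is why it is often preferred when the two classes are imbalanced.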




Acknowledgements

Xiao Qin’s work is supported by the U.S. National Science Foundation under Grants IIS-1618669, OAC-1642133, and CCF-0845257.

Author information

Corresponding author

Correspondence to Ashish Gupta.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Symbols and Annotations

Table 5 Symbols and annotations

Appendix B: Dataset Features

Table 6 Dataset Features: The engineered features of the donor retention dataset
Fig. 4 A heatmap visualization of the correlation matrix of the engineered dataset described by Table 6. The color (luminance) of a cross-section, as well as the area, measures the Pearson correlation between variables in the dataset

Fig. 5 Histogram plots illustrating the distributions of variables in the dataset

Glossary

BC

Blood Center (Name blinded)

ARC

The American Red Cross

ANN

Artificial Neural Network

CART

Classification & Regression Trees

CBA

Classification Based Association

CDC

The Centers for Disease Control and Prevention

DRG

Diagnosis-Related Groups

FEMA

The Federal Emergency Management Agency

FDA

The Food and Drug Administration

GB

Gradient Boosting

IRCS

Indian Red Cross Society

k-NN

k-Nearest Neighbors

LDA

Linear Discriminant Analysis

MCC

Matthews correlation coefficient

NBCUS

The National Blood Collection and Utilization Survey

PBM

Patient Blood Management

RBC

Red Blood Cell

RF

Random Forest

RFM

Recency, Frequency, and Monetary Value

SMOTE

Synthetic Minority Oversampling Technique

SVM

Support Vector Machine

UCI

University of California, Irvine

US

United States

About this article

Cite this article

Kauten, C., Gupta, A., Qin, X. et al. Predicting Blood Donors Using Machine Learning Techniques. Inf Syst Front 24, 1547–1562 (2022). https://doi.org/10.1007/s10796-021-10149-1

  • DOI: https://doi.org/10.1007/s10796-021-10149-1
