Predicting HIV Status Using Machine Learning Techniques and Bio-Behavioural Data from the Zimbabwe Population-Based HIV Impact Assessment (ZIMPHIA15-16)

  • Conference paper
Artificial Intelligence Trends in Systems (CSOC 2022)

Abstract

HIV and AIDS remain a significant global public health concern, with about 36 million people currently living with HIV. Several interventions have been implemented to strengthen the prevention of transmission, screening, and diagnosis in sub-Saharan African countries, including Zimbabwe. HIV prevalence remains high in Zimbabwe despite the significant progress made in previous years. As the country moves closer to attaining epidemic control, there is a need for targeted HIV interventions that focus on individuals at high risk of HIV. Most current HIV interventions are based on evidence about specific sub-population groups, which overlooks the diversity of individual risk levels within such groups. Therefore, this study applied a random forest classifier, a support vector machine, and logistic regression to predict HIV status using Zimbabwe Population-Based HIV Impact Assessment data, with the aim of identifying high-risk individuals and informing targeted interventions based on risk. Logistic regression outperformed the random forest classifier and the support vector machine, with a prediction accuracy of 85%, a recall of 98%, and an F1-score of 92%. However, the random forest classifier had the highest precision of the three models, at 87%. The support vector machine outperformed the random forest classifier on recall and F1-score, with a recall of 96% and an F1-score of 91%. Machine learning models can help identify individuals at high risk of contracting HIV and assist policymakers in developing targeted HIV prevention and screening strategies informed by socio-demographic and risk-behaviour data. However, this study used only socio-demographic and behavioural predictors to predict HIV status. Clinical predictors of HIV should be included to further improve HIV status prediction models before they are integrated into real-world healthcare settings.
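The metrics quoted in the abstract (accuracy, precision, recall, F1-score) are all derived from a binary confusion matrix. The sketch below is illustrative only and is not the authors' code: the function names and the confusion-matrix counts are hypothetical, and the abstract does not state which class is treated as positive. The second function rearranges the F1 definition, F1 = 2PR / (P + R), to estimate the precision implied by a reported recall and F1-score. Because the reported figures are rounded, the result is only approximate.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for the positive class.

    tp, fp, fn, tn are the four cells of a binary confusion matrix.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given recall R and F1."""
    # F1 * (P + R) = 2PR  =>  P * (2R - F1) = F1 * R
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("recall and F1 are inconsistent with any precision")
    return f1 * recall / denominator


# Hypothetical counts, chosen only to show how the metrics are computed.
example = classification_metrics(tp=90, fp=10, fn=10, tn=90)

# Recall and F1-score as reported in the abstract (rounded percentages).
lr_precision = implied_precision(recall=0.98, f1=0.92)
svm_precision = implied_precision(recall=0.96, f1=0.91)

if __name__ == "__main__":
    print(example)
    print(f"logistic regression, implied precision: {lr_precision:.3f}")
    print(f"support vector machine, implied precision: {svm_precision:.3f}")
```

Under these assumptions, the implied precision for logistic regression and the support vector machine comes out at roughly 86 to 87%, which is consistent with the abstract's statement that the random forest classifier had the highest precision at 87%.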

Data Availability

The ZIMPHIA 2015 dataset is publicly accessible. Permission to access the dataset can be sought from the PHIA Project Document Manager (https://phia-data.icap.columbia.edu/).

Funding

This secondary data analysis research received no external funding. However, ZIMPHIA 2015–2016 was supported by the President’s Emergency Plan for AIDS Relief (PEPFAR) through the Centers for Disease Control and Prevention (CDC) to ICAP at Columbia University under the terms of cooperative agreement #U2GGH001226. The funder did not have any role in the study design, data collection and analysis, decision to publish, or preparation of the current manuscript.

Author information

Corresponding author

Correspondence to Tafadzwa Dzinamarira.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Chingombe, I. et al. (2022). Predicting HIV Status Using Machine Learning Techniques and Bio-Behavioural Data from the Zimbabwe Population-Based HIV Impact Assessment (ZIMPHIA15-16). In: Silhavy, R. (eds) Artificial Intelligence Trends in Systems. CSOC 2022. Lecture Notes in Networks and Systems, vol 502. Springer, Cham. https://doi.org/10.1007/978-3-031-09076-9_24
