Abstract
HIV and AIDS remain a significant global public health concern, with about 36 million people currently living with HIV. Several HIV interventions have been implemented to strengthen transmission prevention, screening, and diagnosis in sub-Saharan African countries, including Zimbabwe. HIV prevalence remains high in Zimbabwe despite the significant progress made in recent years. As the country moves closer to attaining epidemic control, there is a need for targeted HIV interventions that focus on individuals at high risk of HIV. Most current HIV interventions are based on evidence about specific sub-population groups, which overlooks the diversity of individual risk levels within such groups. Therefore, this study applied a random forest classifier, a support vector machine, and logistic regression to predict HIV status using Zimbabwe Population-Based HIV Impact Assessment (ZIMPHIA) data, with the aim of identifying high-risk individuals and informing risk-based targeted interventions. Logistic regression outperformed the random forest classifier and the support vector machine, with a prediction accuracy of 85%, recall of 98%, and F1-score of 92%. However, the random forest classifier had the highest precision of the three models, at 87%. The support vector machine outperformed the random forest classifier on recall and F1-score, with a recall of 96% and an F1-score of 91%. Machine learning models can help identify individuals at high risk of contracting HIV and assist policymakers in developing targeted HIV prevention and screening strategies informed by socio-demographic and risk-behaviour data. However, this study used only socio-demographic and behavioural predictors to predict HIV status. Clinical predictors should also be included to further optimise HIV status prediction models and to support their integration into real-world healthcare settings.
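The four metrics reported above (accuracy, precision, recall, and F1-score) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the `ConfusionCounts` class and the example counts are hypothetical and are not taken from the ZIMPHIA data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for this sketch)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy(c: ConfusionCounts) -> float:
    """Share of all predictions that are correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive."""
    predicted_pos = c.tp + c.fp
    return c.tp / predicted_pos if predicted_pos else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of actual positives that the model recovers."""
    actual_pos = c.tp + c.fn
    return c.tp / actual_pos if actual_pos else 0.0


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts for illustration only; they do not come from the study.
example = ConfusionCounts(tp=80, fp=20, fn=10, tn=90)

if __name__ == "__main__":
    for name, fn in [("accuracy", accuracy), ("precision", precision),
                     ("recall", recall), ("F1", f1_score)]:
        print(f"{name}: {fn(example):.3f}")
```

Because F1 is the harmonic mean of precision and recall, a model with very high recall can still lead on F1 even when another model has slightly higher precision, which matches the pattern described in the abstract.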
Data Availability
The ZIMPHIA 2015 dataset is publicly accessible. Permission to access the dataset can be sought from the PHIA Project Document Manager (https://phia-data.icap.columbia.edu/).
Funding
This secondary data analysis research received no external funding. However, ZIMPHIA 2015–2016 was supported by the President’s Emergency Plan for AIDS Relief (PEPFAR) through the Centers for Disease Control and Prevention (CDC) to ICAP at Columbia University under the terms of cooperative agreement #U2GGH001226. The funder did not have any role in the study design, data collection and analysis, decision to publish, or preparation of the current manuscript.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chingombe, I. et al. (2022). Predicting HIV Status Using Machine Learning Techniques and Bio-Behavioural Data from the Zimbabwe Population-Based HIV Impact Assessment (ZIMPHIA15-16). In: Silhavy, R. (eds) Artificial Intelligence Trends in Systems. CSOC 2022. Lecture Notes in Networks and Systems, vol 502. Springer, Cham. https://doi.org/10.1007/978-3-031-09076-9_24
DOI: https://doi.org/10.1007/978-3-031-09076-9_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09075-2
Online ISBN: 978-3-031-09076-9
eBook Packages: Intelligent Technologies and Robotics (R0)