Refusal Bias in the Estimation of HIV Prevalence


In 2007, UNAIDS corrected estimates of global HIV prevalence downward from 40 million to 33 million based on a methodological shift from sentinel surveillance to population-based surveys. Since then, population-based surveys have been considered the gold standard for estimating HIV prevalence. However, prevalence rates based on representative surveys may be biased because of nonresponse. This article investigates one potential source of nonresponse bias: refusal to participate in the HIV test. We use the identity of randomly assigned interviewers to identify the participation effect and estimate HIV prevalence rates corrected for unobservable characteristics with a Heckman selection model. The analysis is based on a survey of 1,992 individuals in urban Namibia, which included an HIV test. We find that the bias resulting from refusal is not significant for the overall sample. However, a detailed analysis using kernel density estimates shows that the bias is substantial for the younger and the poorer population. Nonparticipants in these subsamples are estimated to be three times more likely to be HIV-positive than participants. The difference is particularly pronounced for women. Prevalence rates that ignore this selection effect may be seriously biased for specific target groups, leading to misallocation of resources for prevention and treatment.
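
For orientation, a probit model with sample selection of the kind introduced by Van de Ven and Van Praag (1981) is commonly written as below. This is a textbook sketch, not the article's own specification: the symbols $s_i$, $h_i$, $x_i$, $z_i$, $\gamma$, $\beta$, and $\rho$ are notation assumed here for illustration.

```latex
\begin{align*}
s_i &= \mathbf{1}\{ z_i'\gamma + u_i > 0 \}
    && \text{(participation in the HIV test)} \\
h_i &= \mathbf{1}\{ x_i'\beta + \varepsilon_i > 0 \}
    && \text{(HIV status, observed only if } s_i = 1\text{)} \\
(u_i, \varepsilon_i) &\sim N\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right) \\
\Pr(h_i = 1 \mid s_i = 1) &=
    \frac{\Phi_2(x_i'\beta,\; z_i'\gamma;\; \rho)}{\Phi(z_i'\gamma)} \\
\Pr(h_i = 1 \mid s_i = 0) &=
    \frac{\Phi_2(x_i'\beta,\; -z_i'\gamma;\; -\rho)}{\Phi(-z_i'\gamma)}
\end{align*}
```

In this sketch, $z_i$ would contain the covariates $x_i$ plus indicators for interviewer identity, which enter the participation equation only. When $\rho = 0$, both conditional probabilities reduce to $\Phi(x_i'\beta)$, so participants and nonparticipants with the same covariates have the same predicted prevalence.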



  1.

    Information about DHS+ surveys is available online (

  2.

    In repeat population-based surveys, a fourth source of bias may arise: attrition (Obare 2010). A fifth source of bias, inherent to household surveys, is the sampling frame, which includes only people residing in households.

  3.

    Estimates are from authors’ own calculations.

  4.

    The test results of 384 individuals were dropped from the sample because of fraudulent practices by one of the interviewers (Janssens et al. 2010). Because the interviewers were randomly assigned to households, this should not affect the results.

  5.

    The wealth index is calculated based on the first factor loadings of a principal component analysis of 28 assets and 7 dwelling characteristics, with missing values imputed.

  6.

    This can be calculated with the margins command in Stata version 11.0.

  7.

    The Staiger and Stock (1997) rule of thumb to assess the strength of instrumental variables is not applicable in the case of a Heckman model. Instead, we calculate the likelihood ratio for the first stage with and without instruments. This shows that including the instrumental variables substantially increases the ratio from 187 to 298.

  8.

    See Online Resource 1, sections 3 and 4, for the detailed regression results of the probit and Heckman model, respectively.

  9.

    Another way of correctly calculating standard errors is by bootstrapping. The bootstrapped confidence intervals are not reported in the table because in about 10 % of the bootstrap iterations, the probit and Heckman models do not converge and it cannot be ruled out that nonconvergence is selective. The other 90 % yield intervals that are very similar to the intervals calculated with the delta method.

  10.

    Online Resource 1, section 5, compares observed with predicted prevalence rates by stratum of probit-predicted HIV prevalence, which is presumably less sensitive to probit misspecification. The results show that the probit predictions are very similar to the observed rates in all strata. The Heckman predictions are increasingly higher than the probit estimates for each consecutive stratum, suggesting that the bias in the population prevalence increases with the propensity for HIV infection. However, refusal is most common in the first stratum with the lowest HIV infection rate. The section “A Detailed Look at Nonparticipants” explores the characteristics of nonparticipants in more detail.

  11.

    Although the estimates of the correlation between the participation and HIV-status error terms are large and negative at –.329, –.198, and –.622 for all individuals, males, and females, respectively, they are not significant at the 5 % level. For females, the p value is .061. A negative sign of this correlation indicates that individuals less likely to participate are more likely to be HIV-positive.

  12.

    Attrition rates, which may be selective with respect to HIV status (Obare 2010), were very similar for HIV-negative versus HIV-positive participants: 36 % versus 35 %.

  13.

    Section 6 of Online Resource 1 discusses how increases in sample size affect confidence intervals.

  14.

    The subgroups overlap because the Heckman model does not converge for our data when strict boundaries are drawn between subgroups.

  15.

    For comparison, Online Resource 1, section 7, shows the same plots including the densities from the probit model. The Heckman model shifts the kernel to the right compared with the probit model for the young and poor, but not the old and nonpoor subsamples.
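
The selection mechanism that the notes above refer to (a negative correlation between the unobservables driving participation and those driving HIV status) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reproduction of the article's estimation: the function name `simulate_refusal_bias` and every parameter value (`rho`, `base_index`, `participation_index`) are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_refusal_bias(n=100_000, rho=-0.6, base_index=-1.0,
                          participation_index=0.8, seed=0):
    """Simulate HIV status and test participation with correlated errors.

    Returns (true prevalence, prevalence among participants,
    prevalence among nonparticipants). All parameter values are
    illustrative assumptions, not estimates from the article.
    """
    rng = random.Random(seed)
    tested = tested_pos = refused = refused_pos = 0
    for _ in range(n):
        # Participation error u and status error e are standard normal
        # with correlation rho (built from two independent draws).
        u = rng.gauss(0.0, 1.0)
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        participates = participation_index + u > 0
        positive = base_index + e > 0
        if participates:
            tested += 1
            tested_pos += positive
        else:
            refused += 1
            refused_pos += positive
    true_prev = (tested_pos + refused_pos) / n
    return true_prev, tested_pos / tested, refused_pos / refused


if __name__ == "__main__":
    true_prev, part, nonpart = simulate_refusal_bias()
    print(f"true: {true_prev:.3f}  participants: {part:.3f}  "
          f"nonparticipants: {nonpart:.3f}")
```

With a negative `rho`, the prevalence observed among participants falls below the true prevalence, and the prevalence among nonparticipants lies above it. Setting `rho=0.0` removes the gap, which is the case in which an uncorrected probit would be adequate.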


  1. Aulagnier, M., Janssens, W., De Beer, I., van Rooy, G., Gaeb, E., Hesp, C., . . . Rinke de Wit, T. F. (2011). Incidence of HIV in Windhoek, Namibia: Demographic and socio-economic associations. PLoS ONE, 6(10), e25860. doi:10.1371/journal.pone.0025860

  2. Bärnighausen, T., Bor, J., Wandira-Kazibwe, S., & Canning, D. (2011). Correcting HIV prevalence estimates for survey nonparticipation using Heckman-type selection models. Epidemiology, 22, 27–35.


  3. Boerma, J. T., Ghys, P. D., & Walker, N. (2003). Estimates of HIV-1 prevalence from national population-based surveys as a new gold standard. The Lancet, 362, 1929–1931.


  4. Davidson, R., & MacKinnon, J. G. (2004). Econometric theory and methods. New York, NY: Oxford University Press.


  5. De Walque, D. (2009). Does education affect HIV status? Evidence from five African countries. World Bank Economic Review, 23, 209–233.


  6. Durrant, G. B., Groves, R. M., Staetsky, L., & Steele, F. (2010). Effects of interviewer attitudes and behaviors on refusal in household surveys. Public Opinion Quarterly, 74, 1–36. doi:10.1093/poq/nfp098


  7. Floyd, S., Molesworth, A., Dube, A., Crampin, A. C., Houben, R., Chihana, M., . . . Glynn, J. R. (2013). Underestimation of HIV prevalence in surveys when some people already know their status, and ways to reduce the bias. AIDS, 27, 233–242.

  8. Fortson, J. (2008). The gradient in sub-Saharan Africa: Socioeconomic status and HIV/AIDS. Demography, 45, 303–322.


  9. García-Calleja, J. M., Gouws, E., & Ghys, P. D. (2006). National population based HIV prevalence surveys in sub-Saharan Africa: Results and implications for HIV and AIDS estimates. Sexually Transmitted Infections, 82(Suppl. 3), iii64–iii70. doi:10.1136/sti.2006.019901


  10. Gouws, E., Mishra, V., & Fowler, T. B. (2008). Comparison of adult HIV prevalence from national population-based surveys and antenatal clinic surveillance in countries with generalized epidemics: Implications for calibrating surveillance data. Sexually Transmitted Infections, 84(Suppl. 1), i17–i23. doi:10.1136/sti.2008.030452


  11. Hamers, R. L., de Beer, I. H., Kaura, H., van Vugt, M., Caparos, L., & Rinke de Wit, T. F. (2008). Diagnostic accuracy of two oral fluid-based tests for HIV surveillance in Namibia. AIDS, 48, 116–118.


  12. Hogan, D. R., Salomon, J. A., Canning, D., Hammitt, J. K., Zaslavsky, A. M., & Bärnighausen, T. (2012). National HIV prevalence estimates for sub-Saharan Africa: Controlling selection bias with Heckman-type selection models. Sexually Transmitted Infections, 88(Suppl. 2), i17–i23. doi:10.1136/sextrans-2012-050636

  13. Janssens, W., de Beer, I., Coutinho, H. M., Van Rooy, G., van der Gaag, J., & Rinke de Wit, T. F. (2010). Estimating HIV prevalence: A cautious note on household surveys in poor settings. BMJ, 341, c6323. doi:10.1136/bmj.c6323


  14. Lachaud, J. P. (2007). HIV prevalence and poverty in Africa: Micro- and macro-econometric evidences applied to Burkina Faso. Journal of Health Economics, 26, 483–504.


  15. Lydié, N., Robinson, N. J., Ferry, B., Akam, E., De Loenzien, M., & Abega, S. (2004). Mobility, sexual behavior, and HIV infection in an urban population in Cameroon. Journal of Acquired Immune Deficiency Syndromes, 35, 67–74.


  16. Manski, C. F. (1989). Anatomy of the selection problem. Journal of Human Resources, 24, 343–360.


  17. Marston, M., Harriss, K., & Slaymaker, E. (2008). Non-response bias in estimates of HIV prevalence due to mobility of absentees in national population-based surveys: A study of nine national surveys. Sexually Transmitted Infections, 84(Suppl. 1), i71–i77. doi:10.1136/sti.2008.030353


  18. McNaghten, A. D., Herold, J. M., Dube, H. M., & St Louis, M. E. (2007). Response rates for providing a blood specimen for HIV testing in a population-based survey of young adults in Zimbabwe. BMC Public Health, 7, 145–150.


  19. Mishra, V., Barrere, B., Hong, R., & Khan, S. (2008). Evaluation of bias in HIV seroprevalence estimates from national household surveys. Sexually Transmitted Infections, 84(Suppl. 1), i63–i70. doi:10.1136/sti.2008.030411


  20. Mishra, V., Bignami-Van Assche, S., Greener, R., Vaessen, M., Hong, R., Ghys, P. D., . . . Rutstein, S. (2007). HIV infection does not disproportionately affect the poorer in sub-Saharan Africa. AIDS, 21(Suppl. 7), S17–S28.

  21. Mishra, V., Vaessen, M., Boerma, J. T., Arnold, F., Way, A., Barrere, B., . . . Sangha, J. (2006). HIV testing in national population-based surveys: Experience from the Demographic and Health Surveys. Bulletin of the World Health Organization, 84, 537–545.

  22. Montana, L. S., Mishra, V., & Hong, R. (2008). Comparison of HIV prevalence estimates from antenatal care surveillance and population-based surveys in sub-Saharan Africa. Sexually Transmitted Infections, 84(Suppl. 1), i78–i84. doi:10.1136/sti.2008.030106


  23. Obare, F. (2010). Nonresponse in repeat population-based voluntary counseling and testing for HIV in rural Malawi. Demography, 47, 651–665.


  24. O’Muircheartaigh, C., & Campanelli, P. (1999). A multilevel exploration of the role of interviewers in survey non-response. Journal of the Royal Statistical Society, 162, 437–446.


  25. Pison, G., Le Guenno, B., Lagarde, E., Enel, C., & Seck, C. (1993). Seasonal migration: A risk factor for HIV infection in rural Senegal. Journal of Acquired Immune Deficiency Syndromes, 6, 196–200.


  26. Reniers, G., Araya, T., Berhane, Y., Davey, G., & Sanders, E. J. (2009). Implications of the HIV testing protocol for refusal bias in seroprevalence surveys. BMC Public Health, 9(1), 163.


  27. Reniers, G., & Eaton, J. (2009). Refusal bias in HIV prevalence estimates from nationally representative seroprevalence surveys. AIDS, 23, 621–629.


  28. Staiger, D., & Stock, J. (1997). Instrumental variables regression with weak instruments. Econometrica, 65, 557–586.


  29. UNAIDS. (2007). AIDS epidemic update. Geneva, Switzerland: UNAIDS.


  30. UNAIDS/WHO. (2005). Guidelines for measuring national HIV prevalence in population-based surveys. Geneva, Switzerland: UNAIDS/WHO Working Group on Global HIV/AIDS and STI Surveillance.


  31. Van de Ven, W. P. M. M., & Van Praag, B. M. S. (1981). The demand for deductibles in private health insurance: A probit model with sample selection. Journal of Econometrics, 17, 229–252.


  32. Walker, N., Grassly, N. C., Garnett, G. P., Stanecki, K. A., & Ghys, P. D. (2004). Estimating the global burden of HIV/AIDS: What do we really know about the HIV pandemic? The Lancet, 363, 2180–2185.




This work was supported by the Dutch Ministry of Development Cooperation (Grant No. 13298) and the Dutch Organization of Scientific Research (NWO) (Rubicon Grant No. 446-08-004 to W.J.). The survey data used in this article were collected by the University of Namibia (UNAM) and the National Institute of Population (NIP), with technical assistance from PharmAccess International and the Amsterdam Institute of International Development. Special thanks are due to Ingrid De Beer, Gert van Rooy, and Christa Schier for organizing the fieldwork and providing detailed insights into the data collection process. We are also grateful to Chris Elbers, Angus Deaton, and Aico van Vuren for helpful discussions on technical aspects of the estimation. We would like to thank participants at the 2007 AIID workshop on “The Economic Consequences of HIV/AIDS,” the 2008 Tinbergen Annual Conference in Amsterdam, the 2009 CSAE Conference in Oxford, and the 2012 Scientific EUDN Conference in Paris for useful comments and suggestions. Finally, we thank the editors of this journal and three anonymous reviewers for the positive and constructive feedback on earlier versions of this article.

Author information



Corresponding author

Correspondence to Wendy Janssens.

Electronic supplementary material

Below is the link to the electronic supplementary material.


(DOCX 204 kb)


Cite this article

Janssens, W., van der Gaag, J., Rinke de Wit, T.F. et al. Refusal Bias in the Estimation of HIV Prevalence. Demography 51, 1131–1157 (2014).



  • HIV prevalence
  • Population-based survey
  • Refusal bias
  • Heckman selection model
  • Namibia