Hierarchical Bayes small area estimation for county-level health prevalence to having a personal doctor

Statistical Methods & Applications


The complexity of survey data and the availability of data from auxiliary sources motivate researchers to explore estimation methods that extend beyond traditional survey-based estimation. The U.S. Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System (BRFSS) collects a wide range of health information, including whether respondents have a personal doctor. While the BRFSS focuses on state-level estimation, there is demand for county-level estimation of health indicators using BRFSS data. A hierarchical Bayes small area estimation model is developed to combine county-level BRFSS survey data with county-level data from auxiliary sources, while accounting for various sources of error and nested geographical levels. To mitigate extreme proportions and unstable survey variances, a transformation is applied to the survey data. Model-based county-level predictions of the prevalence of having a personal doctor are constructed for all counties in the U.S., including those where BRFSS survey data were not available. An evaluation study is also presented that compares fitting the model using only the counties with large BRFSS sample sizes against fitting it using all the counties with BRFSS data.
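As a rough illustration of how an area-level model can combine direct survey estimates with auxiliary data, the following Python sketch implements a simple Fay-Herriot-style composite predictor. It is not the hierarchical Bayes model developed in the article: it omits the transformation, the nested geographical levels, and the prior distributions, and it treats the between-county model variance as known. The function names, the single auxiliary covariate, and all numbers are hypothetical.

```python
def fit_wls(y, x, weights):
    """Weighted least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(weights)
    xbar = sum(w * xi for w, xi in zip(weights, x)) / sw
    ybar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(weights, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def composite_predictions(direct, sampling_var, covariate, model_var):
    """Shrink each direct estimate toward a regression (synthetic) prediction.

    direct[i] is None for a county without survey data; such a county
    receives the synthetic prediction alone. model_var is the assumed
    (here: known) between-county variance of the model error.
    """
    sampled = [i for i, d in enumerate(direct) if d is not None]
    # Counties with noisier direct estimates get less weight in the fit.
    weights = [1.0 / (model_var + sampling_var[i]) for i in sampled]
    b0, b1 = fit_wls([direct[i] for i in sampled],
                     [covariate[i] for i in sampled],
                     weights)
    predictions = []
    for i, d in enumerate(direct):
        synthetic = b0 + b1 * covariate[i]
        if d is None:
            predictions.append(synthetic)
        else:
            # gamma is near 1 when the survey estimate is precise and
            # near 0 when its sampling variance is large.
            gamma = model_var / (model_var + sampling_var[i])
            predictions.append(gamma * d + (1.0 - gamma) * synthetic)
    return predictions


# Hypothetical inputs: three sampled counties and one unsampled county.
direct = [0.80, 0.70, 0.90, None]          # direct survey proportions
sampling_var = [0.0004, 0.01, 0.0025, None]  # their sampling variances
covariate = [0.85, 0.80, 0.88, 0.75]       # one auxiliary variable
print(composite_predictions(direct, sampling_var, covariate, model_var=0.001))
```

In this sketch a county with a large sampling variance is pulled strongly toward the regression prediction, while a county with a precise direct estimate stays close to it. A fully Bayesian treatment would instead place prior distributions on the regression coefficients and variance components and summarize their posterior distribution.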




This work was conducted under a CDC-Westat project. The authors thank Carol Pierannunzi, the CDC’s main contact for the project, for helpful discussions and comments. Dr. Li contributed to this work while she was a Senior Statistician at Westat. Disclaimer: The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.


The work described in this paper was conducted under contract with the Centers for Disease Control and Prevention (CDC Contract #HHSD2002013M53968B Order #75D30120F09442). The BRFSS data are confidential to CDC so cannot be shared.

Author information


Corresponding author

Correspondence to Andreea Erciulescu.


Appendix A Auxiliary data pool

See Table 5.

Table 5 Auxiliary data pool

Appendix B STAN code

1.1 Model specification

figure a

1.2 Model fit

figure b

About this article

Cite this article

Erciulescu, A., Li, J., Krenzke, T. et al. Hierarchical Bayes small area estimation for county-level health prevalence to having a personal doctor. Stat Methods Appl (2022).
