Computational Statistics, Volume 34, Issue 4, pp 1765–1778

Bootstrap ICC estimators in analysis of small clustered binary data

  • Bei Wang
  • Yi Zheng
  • Kyle M. Irimata
  • Jeffrey R. Wilson
Original Paper


Survey data are often obtained through a multilevel structure and, as such, require hierarchical modeling. While large-sample approximations provide a mechanism for constructing confidence intervals for the intraclass correlation coefficients (ICCs) in large datasets, challenges arise with small clusters and binary outcomes. In this paper, we examine two bootstrapping methods, cluster bootstrapping and split bootstrapping. We use these methods to construct confidence intervals for the ICCs (based on a latent variable approach) for small binary datasets obtained through a hierarchical structure with three or more levels. Our simulation study applies the two bootstrapping methods across 26 scenarios. We find that the latent variable method performs well in terms of coverage. The split bootstrapping method provides confidence intervals close to the nominal coverage when the ratio of the ICC for the primary cluster to the ICC for the secondary cluster is small, whereas cluster bootstrapping is preferred when the cluster size is larger and the ratio of the ICCs is larger. We illustrate the methods with a numerical example on teacher effectiveness.
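As a rough illustration of the two ingredients named in the abstract, the sketch below computes latent-variable ICCs from random-intercept variances and builds a percentile confidence interval by resampling whole primary clusters. It is not the authors' implementation: the function names (`latent_iccs`, `cluster_bootstrap_ci`), the variance-share definition of each ICC, the percentile interval, and the `statistic(labels, values)` callback are all assumptions made for illustration, and the split bootstrap is not shown.

```python
import math
import numpy as np

# Variance of the standard logistic distribution, used as the level-1
# residual variance under the latent variable formulation.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def latent_iccs(var_primary, var_secondary):
    """Latent-variable ICCs for a three-level random-intercept logistic model.

    Assumption: each ICC is that level's share of the total latent variance.
    Other conventions exist, and the paper's exact definition may differ.
    """
    total = var_primary + var_secondary + LOGISTIC_RESIDUAL_VAR
    return var_primary / total, var_secondary / total


def cluster_bootstrap_ci(cluster_ids, values, statistic,
                         n_boot=1000, level=0.95, seed=0):
    """Percentile CI obtained by resampling primary clusters with replacement.

    `statistic(labels, values)` is a hypothetical callback: it receives the
    relabelled cluster ids and the resampled outcomes and returns one number
    (for example, an ICC estimate from a refitted mixed model).
    """
    cluster_ids = np.asarray(cluster_ids)
    values = np.asarray(values)
    rng = np.random.default_rng(seed)
    # Row indices of each primary cluster, so clusters are drawn as whole units.
    groups = [np.flatnonzero(cluster_ids == c) for c in np.unique(cluster_ids)]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        picks = rng.integers(0, len(groups), size=len(groups))
        idx = np.concatenate([groups[k] for k in picks])
        # Relabel so that a cluster drawn twice counts as two distinct clusters.
        labels = np.concatenate(
            [np.full(len(groups[k]), j) for j, k in enumerate(picks)]
        )
        estimates[b] = statistic(labels, values[idx])
    alpha = 1.0 - level
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)
```

In practice the callback would refit the hierarchical logistic model on each resample and pass the estimated variance components to `latent_iccs`; with very few primary clusters, a percentile interval of this kind may still fall short of nominal coverage.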


Generalized linear mixed model; Small sample inference; Resampling scheme




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Mathematical and Statistical Sciences, Arizona State University, Tempe, USA
  2. Mary Lou Fulton Teachers College and School of Mathematical and Statistical Sciences, Arizona State University, Tempe, USA
  3. Department of Economics, Arizona State University, Tempe, USA
