Abstract
Studies of individuals sampled in unbalanced clusters have become common in health services and epidemiological research, but available tools for power and sample size estimation and for optimal design are currently limited. This paper presents and illustrates power estimation formulas for t-test comparisons of the effect of a cluster-level exposure on continuous outcomes in unbalanced studies, that is, studies with unequal numbers of clusters and/or unequal numbers of subjects per cluster in each exposure arm. Iterative application of these power formulas yields the minimal sample size needed and/or the minimal detectable difference. SAS subroutines implementing these algorithms are given in the Appendices. When feasible, power is optimized by having the same number of clusters in each arm, k_A = k_B, and (irrespective of the numbers of clusters in each arm) the same total number of subjects in each arm, n_A k_A = n_B k_B. Cost-beneficial upper limits for the number of subjects per cluster may be approximately (5/ρ) − 5 or less, where ρ is the intraclass correlation. The methods presented here for simple cluster designs may be extended to some settings involving complex hierarchical weighted cluster samples.
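The design guidance in the abstract can be illustrated with a minimal sketch. It is not the paper's SAS implementation and not its t-based formulas: it assumes a normal approximation, a common outcome standard deviation and a common intraclass correlation in both arms, and equal cluster sizes within each arm. The function names (design_effect, approx_power, cluster_size_cap) are hypothetical and chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(n, rho):
    """Variance inflation 1 + (n - 1) * rho for clusters of n subjects."""
    return 1.0 + (n - 1) * rho


def approx_power(delta, sigma, k_a, n_a, k_b, n_b, rho, alpha=0.05):
    """Approximate two-sided power for comparing two arm means.

    Assumption: normal approximation with known variance, used here in
    place of the t-based formulas the paper presents.
    Arm A has k_a clusters of n_a subjects; arm B has k_b clusters of n_b.
    """
    # Variance of each arm mean: sigma^2 * design effect / total subjects.
    var_a = sigma ** 2 * design_effect(n_a, rho) / (k_a * n_a)
    var_b = sigma ** 2 * design_effect(n_b, rho) / (k_b * n_b)
    se = sqrt(var_a + var_b)

    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / se
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def cluster_size_cap(rho):
    """Approximate upper limit (5 / rho) - 5 quoted in the abstract."""
    return 5.0 / rho - 5.0


if __name__ == "__main__":
    # Same 20 clusters of 20 subjects, split evenly or unevenly across arms.
    balanced = approx_power(0.5, 1.0, 10, 20, 10, 20, rho=0.05)
    unbalanced = approx_power(0.5, 1.0, 15, 20, 5, 20, rho=0.05)
    print(balanced, unbalanced, cluster_size_cap(0.05))
```

Under these simplifying assumptions, the variance of an arm mean can never fall below ρσ²/k however many subjects each cluster contributes. At n = (5/ρ) − 5 subjects per cluster it is already within a factor of 1.2 of that floor, which is one way to read the suggested limit. The paper's own derivation, which weighs costs, may differ.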
Hoover, D.R. Power for T-test comparisons of unbalanced cluster exposure studies. J Urban Health 79, 278–294 (2002). https://doi.org/10.1093/jurban/79.2.278