Abstract
Behavioral and psychological researchers have shown strong interest in investigating contextual effects (i.e., the influences of combinations of individual- and group-level predictors on individual-level outcomes). The present research provides generalized formulas for determining the sample size needed to investigate contextual effects, according to either the desired level of statistical power or the desired width of the confidence interval. These formulas are derived within a three-level random intercept model that includes one predictor/contextual variable at each level, so that they simultaneously cover the various kinds of contextual effects in which researchers may be interested. The relative influences of the indices included in the formulas on the standard errors of contextual effect estimates are investigated with the aim of further simplifying sample size determination procedures. In addition, simulation studies are performed to investigate the finite-sample behavior of the calculated statistical power. They show that sample sizes estimated from the derived formulas can be both positively and negatively biased because of the complex effects of unreliability of contextual variables, multicollinearity, and violation of the assumption of known variances. Thus, it is advisable to compare estimated sample sizes under various specifications of the indices and to evaluate their potential bias, as illustrated in the example.
Appendix: Derivation of Standard Errors of Contextual Effects Estimates
Without loss of generality, assume that the intercept \(\gamma _0\) is 0. In matrix notation, Eq. (1) can then be expressed as
\[ \varvec{Y}=\varvec{\tilde{X}}\varvec{\beta }+\varvec{\tilde{\epsilon }}. \]
Here, \({\varvec{{\beta }}}=(\gamma _1,\gamma _2,\gamma _3)'\), and \({{\varvec{{Y}}}}\) is an \((I\times J\times K)\times 1\) outcome vector whose elements are arranged as \(\varvec{Y}=(\varvec{Y'_{1}},\dots ,\varvec{Y'_{k}},\dots ,\varvec{Y'_{K}})'\), where \(\varvec{Y_{k}}=(\varvec{Y'_{1k}},\dots ,\varvec{Y'_{jk}},\dots ,\varvec{Y'_{Jk}})'\) and \(\varvec{Y_{jk}}=(Y_{1jk},\dots ,Y_{ijk},\dots ,Y_{Ijk})'\). Likewise, \(\varvec{\tilde{X}}=(\varvec{X},\varvec{{\bar{X}}_{.jk}}, \varvec{{\bar{X}}_{..k}})\) is an \((I\times J\times K)\times 3\) predictor matrix; \(\varvec{X}\) is an \((I\times J\times K)\times 1\) vector of the predictors \(X_{ijk}\), with its elements arranged in the same order as those of \(\varvec{Y}\). Next, \(\varvec{{\bar{X}}_{.jk}}\) is an \((I\times J\times K)\times 1\) vector of level-2 unit means, expressed as \(\varvec{{\bar{X}}_{.jk}} =({\bar{X}}_{.11},\dots ,{\bar{X}}_{.J1},\dots ,{\bar{X}}_{.1K},\dots ,{\bar{X}}_{.JK})' \otimes \varvec{1_{I}} \), where \(\otimes \) indicates the Kronecker product and \(\varvec{1_{I}}\) is an \(I\times 1\) vector of ones, so that the ordering matches that of \(\varvec{Y}\). Also, \(\varvec{{\bar{X}}_{..k}}\) is an \((I\times J\times K)\times 1\) vector of level-3 unit means, expressed as \(\varvec{{\bar{X}}_{..k}} =({\bar{X}}_{..1},\dots ,{\bar{X}}_{..k},\dots ,{\bar{X}}_{..K})' \otimes \varvec{1_{IJ}}\). In addition, \(\varvec{\tilde{\epsilon }}\) is the corresponding \((I\times J\times K)\times 1\) residual vector consisting of \({\tilde{e}}_{ijk}=e_{k}+e_{jk}+e_{ijk}\). From Eq. (7), it can be shown that \(\varvec{\tilde{\epsilon }}\) is distributed as \(\varvec{\tilde{\epsilon }}\sim N(\varvec{0},{\tilde{\varvec{\Sigma }}})\), where
\[ {\tilde{\varvec{\Sigma }}}=\varvec{I_K}\otimes \varvec{\Sigma } \]
and \(\varvec{\Sigma }\) is the \((I\times J)\times (I\times J)\) covariance matrix of the residuals within a single level-3 unit.
Here, we assume that \(\sigma _1^2\ge 0\), \(\sigma _2^2\ge 0\), and \(\sigma _3^2\ge 0\), and that the inverse matrix of \(\varvec{\Sigma }\) (denoted as \(\varvec{\Sigma ^{-1}}\)) exists. Let the diagonal elements of \(\varvec{\Sigma ^{-1}}\) be \(\sigma ^{(1)}\), the off-diagonal elements corresponding to pairs of observations in the same level-2 (and hence the same level-3) unit be \(\sigma ^{(2)}\), and the off-block elements corresponding to pairs of observations in the same level-3 unit but different level-2 units be \(\sigma ^{(3)}\). Comparing the left and right sides of the identity \(\varvec{\Sigma }\varvec{\Sigma ^{-1}}=\varvec{I}\), the following relations are obtained:
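The three-valued pattern of \(\varvec{\Sigma ^{-1}}\) can be checked numerically. The following is a minimal sketch, not the article's derivation: it assumes that `var1`, `var2`, and `var3` stand for the level-1, level-2, and level-3 residual variances (a labeling that this appendix does not state explicitly), and the function name `sigma_block` and the numeric values are hypothetical.

```python
import numpy as np

def sigma_block(I, J, var1, var2, var3):
    """IJ x IJ residual covariance within one level-3 unit.

    Assumption: var1, var2, var3 are the level-1, level-2, and level-3
    residual variances of e_ijk, e_jk, and e_k, respectively.
    """
    within_l2 = var1 * np.eye(I) + var2 * np.ones((I, I))
    return np.kron(np.eye(J), within_l2) + var3 * np.ones((I * J, I * J))

I, J = 3, 4
var1, var2, var3 = 0.7, 0.2, 0.1          # illustrative values only
Sigma = sigma_block(I, J, var1, var2, var3)
Sigma_inv = np.linalg.inv(Sigma)

s1 = Sigma_inv[0, 0]   # diagonal element
s2 = Sigma_inv[0, 1]   # same level-2 unit, different level-1 unit
s3 = Sigma_inv[0, I]   # same level-3 unit, different level-2 unit

# Rebuild the inverse from the three values alone.
rebuilt = ((s1 - s2) * np.eye(I * J)
           + (s2 - s3) * np.kron(np.eye(J), np.ones((I, I)))
           + s3 * np.ones((I * J, I * J)))

# Each row of Sigma sums to the same value, so each row of the inverse
# sums to its reciprocal.
row_sum = var1 + I * var2 + I * J * var3
row_sum_inv = s1 + (I - 1) * s2 + I * (J - 1) * s3
print(np.allclose(rebuilt, Sigma_inv), np.isclose(row_sum_inv, 1.0 / row_sum))
```

Because only three distinct values appear, the inverse never has to be stored in full; the row-sum identity is the same quantity that enters the factor \(f\) later in the appendix.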
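Using the generalized least squares estimator, the sampling distribution of \(\hat{\varvec{\beta }}\) can be expressed as
\[ \hat{\varvec{\beta }}\sim N\left( \varvec{\beta },\,({\tilde{\varvec{X}}}^{\prime }{\varvec{{\tilde{\Sigma }^{-1}}}}\tilde{\varvec{X}})^{-1}\right) . \]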
Then \(se({\hat{\gamma }}_2)\) and \(se({\hat{\gamma }}_3)\) can be evaluated as the square roots of the (2, 2) and (3, 3) elements, respectively, of \( ({\tilde{\varvec{X}}}^{\prime }{\varvec{{\tilde{\Sigma }^{-1}}}}\tilde{\varvec{X}})^{-1}=[{\tilde{\varvec{X}}}^{\prime }({\varvec{I_K}\otimes \varvec{\Sigma ^{-1}}})\tilde{\varvec{X}}]^{-1} =\varvec{\Sigma ^*}^{-1}\). Here, \(\varvec{\Sigma ^*}={\tilde{\varvec{X}}}^{\prime }({\varvec{I_K}\otimes \varvec{\Sigma ^{-1}}})\tilde{\varvec{X}}\).
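The construction of \(\tilde{\varvec{X}}\) and the evaluation of \(\varvec{\Sigma ^*}^{-1}\) can be sketched in NumPy. This is an illustrative sketch under stated assumptions rather than the article's code: the helper names (`design_matrix`, `residual_cov`, `gls_standard_errors`), the simulated predictor, and the variance values are hypothetical, and `var1`, `var2`, `var3` are assumed to be the level-1, level-2, and level-3 residual variances.

```python
import numpy as np

def design_matrix(x):
    """Stack X, level-2 unit means, and level-3 unit means; x has shape (K, J, I)."""
    K, J, I = x.shape
    return np.column_stack([
        x.reshape(-1),                                     # level-3 index slowest, level-1 fastest
        np.kron(x.mean(axis=2).reshape(-1), np.ones(I)),   # each level-2 mean repeated I times
        np.kron(x.mean(axis=(1, 2)), np.ones(I * J)),      # each level-3 mean repeated I*J times
    ])

def residual_cov(I, J, var1, var2, var3):
    """Assumed IJ x IJ covariance of the residuals within one level-3 unit."""
    within_l2 = var1 * np.eye(I) + var2 * np.ones((I, I))
    return np.kron(np.eye(J), within_l2) + var3 * np.ones((I * J, I * J))

def gls_standard_errors(x, var1, var2, var3):
    """Return (se, cov) for (gamma_1, gamma_2, gamma_3) with known variances."""
    K, J, I = x.shape
    X = design_matrix(x)
    Sigma_inv = np.linalg.inv(residual_cov(I, J, var1, var2, var3))
    # I_K (x) Sigma^{-1} is block diagonal, so accumulate one level-3 unit at a time.
    S = np.zeros((3, 3))
    for k in range(K):
        Xk = X[k * I * J:(k + 1) * I * J]
        S += Xk.T @ Sigma_inv @ Xk
    cov = np.linalg.inv(S)
    return np.sqrt(np.diag(cov)), cov

rng = np.random.default_rng(0)
K, J, I = 6, 4, 5                       # illustrative sample sizes only
x = rng.normal(size=(K, J, I))
se, cov = gls_standard_errors(x, 0.7, 0.2, 0.1)
print(se)                               # se of gamma_1, gamma_2, gamma_3
```

Accumulating over level-3 units avoids forming the full \((I\times J\times K)\times (I\times J\times K)\) matrix, which is the practical benefit of the Kronecker form.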
\(\sigma ^*_{11}\), the (1,1) element of \(\varvec{\Sigma ^*}\), can be calculated as
Here, \({\bar{X}}_{...}\) is the overall mean of the predictor. \(\sigma ^*_{22}\), the (2,2) element of \(\varvec{\Sigma ^*}\), can be calculated as
\(\sigma ^*_{33}\), the (3,3) element of \(\varvec{\Sigma ^*}\), can be calculated as
From the identity \(I\sum _{j}\sum _{k}{\bar{X}}_{.jk}^2=\sum _{i}\sum _{j}\sum _{k}X_{ijk}{\bar{X}}_{.jk}\), it follows that \(\sigma ^*_{12}\), the (1,2) element of \(\varvec{\Sigma ^*}\), is equal to \(\sigma ^*_{22}\). Likewise, \(\sigma ^*_{13}\) and \(\sigma ^*_{23}\), the (1,3) and (2,3) elements of \(\varvec{\Sigma ^*}\), respectively, are both equal to \(\sigma ^*_{33}\). Thus, the elements of \(\varvec{\Sigma ^*}\) show the following inclusion relation among levels:
\[ \varvec{\Sigma ^*}=\begin{pmatrix} \sigma ^*_{11} &amp; \sigma ^*_{22} &amp; \sigma ^*_{33}\\ \sigma ^*_{22} &amp; \sigma ^*_{22} &amp; \sigma ^*_{33}\\ \sigma ^*_{33} &amp; \sigma ^*_{33} &amp; \sigma ^*_{33} \end{pmatrix}. \]
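The equalities \(\sigma ^*_{12}=\sigma ^*_{22}\) and \(\sigma ^*_{13}=\sigma ^*_{23}=\sigma ^*_{33}\) hold for any predictor values, which can be confirmed numerically. The sketch below is only a check under assumptions: the simulated predictor, the sample sizes, and the variance values (taken as level-1, level-2, and level-3 residual variances of 0.7, 0.2, and 0.1) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K, J, I = 5, 3, 4                        # illustrative sample sizes only
x = rng.normal(size=(K, J, I))

# Columns: X, level-2 unit means, level-3 unit means (level-3 index slowest).
X = np.column_stack([
    x.reshape(-1),
    np.repeat(x.mean(axis=2).reshape(-1), I),
    np.repeat(x.mean(axis=(1, 2)), I * J),
])

# Assumed residual covariance within one level-3 unit.
within_l2 = 0.7 * np.eye(I) + 0.2 * np.ones((I, I))
Sigma = np.kron(np.eye(J), within_l2) + 0.1 * np.ones((I * J, I * J))

# Sigma* = X' (I_K (x) Sigma^{-1}) X
W = np.kron(np.eye(K), np.linalg.inv(Sigma))
S = X.T @ W @ X

print(np.isclose(S[0, 1], S[1, 1]))      # sigma*_12 equals sigma*_22
print(np.isclose(S[0, 2], S[2, 2]))      # sigma*_13 equals sigma*_33
print(np.isclose(S[1, 2], S[2, 2]))      # sigma*_23 equals sigma*_33
```

The pattern arises because multiplying \(\varvec{\Sigma ^{-1}}\) by a vector that is constant within level-2 (or level-3) units returns a vector that is again constant within those units, so the raw predictor can be replaced by its unit mean in the cross-products.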
Assume here that \({\bar{X}}_{...}=0\), without loss of generality. Since \(\sum _{i}\sum _{j}\sum _{k}(X_{ijk}-{\bar{X}}_{...})^2=\sum _{i}\sum _{j}\sum _{k}X_{ijk}^2=IJK\sigma _x^2\), Eq. (12) gives \(\eta _3^2=\frac{\sum _{k}{\bar{X}}_{..k}^2}{K\sigma _x^2}\). Then, \(\sum _{k}{\bar{X}}_{..k}^2\) can be expressed using \(\eta _3^2\) as
\[ \sum _{k}{\bar{X}}_{..k}^2=K\eta _3^2\sigma _x^2. \]
From this result and the relations (11)–(14), the following relations are also obtained:
Then, it is possible to further simplify \(\sigma ^{*}_{11}\), \(\sigma ^{*}_{22}\), and \(\sigma ^{*}_{33}\) as
where \(f=1+I(J-1)\rho _2+(I-1)\rho _1=1/[\sigma ^{(1)}+(I-1)\sigma ^{(2)}+I(J-1)\sigma ^{(3)}]\). From these results and derived structures of \(\varvec{\Sigma ^*}\), the standard errors \(se({\hat{\gamma }}_1)\), \(se({\hat{\gamma }}_2)\), and \(se({\hat{\gamma }}_3)\) can be calculated as
Here, \(\det (\cdot )\) denotes the determinant, and \(\det (\varvec{\Sigma ^{*}})=\sigma ^{*}_{33}(\sigma ^{*}_{22}-\sigma ^{*}_{33})(\sigma ^{*}_{11}-\sigma ^{*}_{22})\).
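The determinant can be verified directly from the inclusion relation among the elements of \(\varvec{\Sigma ^*}\). The following is a reader's sketch by cofactor expansion, using the shorthand \(a=\sigma ^{*}_{11}\), \(b=\sigma ^{*}_{22}\), and \(c=\sigma ^{*}_{33}\) (introduced here for brevity); the closed forms for the diagonal of the inverse are a consequence of that structure alone and are not necessarily written in the same form as the article's expressions. Subtracting the second row from the first leaves \((a-b,0,0)\), so that
\[ \det (\varvec{\Sigma ^{*}})=\det \begin{pmatrix} a &amp; b &amp; c\\ b &amp; b &amp; c\\ c &amp; c &amp; c \end{pmatrix} =(a-b)\det \begin{pmatrix} b &amp; c\\ c &amp; c \end{pmatrix} =c\,(b-c)\,(a-b). \]
Dividing the diagonal cofactors \(c(b-c)\), \(c(a-c)\), and \(b(a-b)\) by this determinant gives
\[ [\varvec{\Sigma ^*}^{-1}]_{11}=\frac{1}{a-b},\qquad [\varvec{\Sigma ^*}^{-1}]_{22}=\frac{a-c}{(a-b)(b-c)}=\frac{1}{a-b}+\frac{1}{b-c},\qquad [\varvec{\Sigma ^*}^{-1}]_{33}=\frac{b}{c\,(b-c)}=\frac{1}{b-c}+\frac{1}{c}, \]
and the standard errors are the square roots of these three quantities.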
Usami, S. Generalized Sample Size Determination Formulas for Investigating Contextual Effects by a Three-Level Random Intercept Model. Psychometrika 82, 133–157 (2017). https://doi.org/10.1007/s11336-016-9532-y