
Analysis of clustered data in community psychology: With an example from a worksite smoking cessation project

American Journal of Community Psychology

Abstract

Although it is common in community psychology research to have data at both the community, or cluster, level and the individual level, the analysis of such clustered data often presents difficulties for researchers. Since the individuals within a cluster cannot be assumed to be independent, the use of many traditional statistical techniques that assume independence of observations is problematic. Further, there is often interest in assessing the degree of dependence in the data resulting from the clustering of individuals within communities. In this paper, a random-effects regression model is described for analysis of clustered data. Unlike ordinary regression analysis of clustered data, random-effects regression models do not assume that each observation is independent, but instead assume that data within clusters are dependent to some degree. The degree of this dependency is estimated along with the usual model parameters, thus adjusting these effects for the dependency resulting from the clustering of the data. Models are described for both continuous and dichotomous outcome variables, and available statistical software for these models is discussed. An analysis of a data set where individuals are clustered within firms is used to illustrate features of random-effects regression analysis, relative to both individual-level analysis, which ignores the clustering of the data, and cluster-level analysis, which aggregates the individual data.
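
To make "the degree of dependency" concrete, the sketch below estimates an intraclass correlation (the share of outcome variance attributable to clusters) for a continuous outcome. It is only an illustration: it uses the simple one-way ANOVA moment estimator rather than the maximum likelihood random-effects regression the article describes, and the function names `anova_icc` and `design_effect` and the toy firm data are assumptions introduced here, not taken from the paper.

```python
from statistics import fmean


def anova_icc(clusters):
    """One-way ANOVA (moment) estimate of the intraclass correlation.

    `clusters` is a list of lists, one inner list of outcomes per cluster.
    Cluster sizes may differ. The estimate can be negative in small samples.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or min(sizes) < 1 or n_total <= k:
        raise ValueError("need >= 2 non-empty clusters with replication")

    grand_mean = fmean([y for c in clusters for y in c])
    means = [fmean(c) for c in clusters]

    # Between-cluster and within-cluster sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective cluster size; equals the common size when clusters are balanced.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denom = ms_between + (n0 - 1) * ms_within
    if denom == 0:
        raise ValueError("outcome has no variation")
    return (ms_between - ms_within) / denom


def design_effect(icc, cluster_size):
    """Variance inflation from clustering for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


if __name__ == "__main__":
    # Hypothetical outcomes for three firms (made-up numbers, not study data).
    firms = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    icc = anova_icc(firms)
    print(f"ICC estimate: {icc:.3f}")
    print(f"Design effect for clusters of 3: {design_effect(icc, 3):.3f}")
```

An estimate near zero suggests that an individual-level analysis loses little by ignoring clustering, while a large value indicates that standard errors from such an analysis would be too small. A random-effects model estimates the same quantity jointly with the regression coefficients.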




Additional information

Preparation of this article was supported by National Heart, Lung, and Blood Institute Grant R18 HL42987-01A1, National Institute of Mental Health Grant MH44826-01A2, and University of Illinois at Chicago Prevention Research Center Developmental Project CDC Grant R48/CCR505025.


About this article

Cite this article

Hedeker, D., McMahon, S.D., Jason, L.A. et al. Analysis of clustered data in community psychology: With an example from a worksite smoking cessation project. Am J Commun Psychol 22, 595–615 (1994). https://doi.org/10.1007/BF02506895


  • DOI: https://doi.org/10.1007/BF02506895
