Incidental Second-Level Dependence in Educational Survey Data with a Nested Data Structure

  • REVIEW ARTICLE
Educational Psychology Review

Abstract

Many national and international educational data collection programs offer researchers opportunities to investigate contextual effects related to student performance. In those programs, schools are often used in the first-stage sampling process and students are randomly drawn from selected schools. Classrooms are not part of the sampling design, so the incidental dependence of students within classrooms may violate the assumptions of statistical models; at the same time, this nesting offers educational researchers the opportunity to evaluate contextual effects. In this manuscript, we use the Early Childhood Longitudinal Study-Kindergarten dataset to demonstrate the impact of incidental dependence using a two-level model and a three-level model. We then illustrate, through a simulation, that both models can yield unbiased parameter estimates. However, two-level models tend to underestimate standard errors for fixed effects at the incidental level, and when the incidental level is ignored, the variance component of its random effect is divided between the flanking levels. In addition, we compare another method of modeling nested data, generalized estimating equations, with the model-based methods.
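
The way an ignored classroom level spreads its variance across the school and student levels can be sketched with a small simulation. This is a minimal illustration under stated assumptions, not the authors' simulation design: it assumes a balanced design with no predictors, uses hypothetical generating variances (0.2 school, 0.3 classroom, 0.5 student) and hypothetical sample sizes, and swaps in simple ANOVA (method-of-moments) estimators in place of fitted multilevel models. All function and variable names are illustrative.

```python
import numpy as np

def simulate_three_level(n_schools, n_classes, n_students,
                         var_school, var_class, var_resid, seed=0):
    """Balanced data: y = school effect + classroom effect + student residual."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, np.sqrt(var_school), size=(n_schools, 1, 1))
    v = rng.normal(0.0, np.sqrt(var_class), size=(n_schools, n_classes, 1))
    e = rng.normal(0.0, np.sqrt(var_resid),
                   size=(n_schools, n_classes, n_students))
    return u + v + e  # shape: (schools, classrooms, students)

def three_level_anova(y):
    """Method-of-moments variance components that keep the classroom level."""
    S, C, n = y.shape
    class_means = y.mean(axis=2)
    school_means = class_means.mean(axis=1)
    ms_resid = ((y - class_means[:, :, None]) ** 2).sum() / (S * C * (n - 1))
    ms_class = n * ((class_means - school_means[:, None]) ** 2).sum() / (S * (C - 1))
    ms_school = C * n * ((school_means - y.mean()) ** 2).sum() / (S - 1)
    return {
        "school": (ms_school - ms_class) / (C * n),
        "classroom": (ms_class - ms_resid) / n,
        "student": ms_resid,
    }

def two_level_anova(y):
    """Same data, but students are treated as nested directly in schools."""
    S = y.shape[0]
    flat = y.reshape(S, -1)          # drop the classroom dimension
    m = flat.shape[1]                # students per school
    school_means = flat.mean(axis=1)
    ms_within = ((flat - school_means[:, None]) ** 2).sum() / (S * (m - 1))
    ms_between = m * ((school_means - flat.mean()) ** 2).sum() / (S - 1)
    return {"school": (ms_between - ms_within) / m, "student": ms_within}

# Hypothetical design: 5000 schools, 4 classrooms each, 5 students per classroom.
y = simulate_three_level(5000, 4, 5, var_school=0.2, var_class=0.3, var_resid=0.5)
three = three_level_anova(y)
two = two_level_anova(y)

print("three-level:", {k: round(val, 3) for k, val in three.items()})
print("two-level:  ", {k: round(val, 3) for k, val in two.items()})
```

Under these assumed values, the three-level estimates should land near the generating variances, while the two-level school and student estimates should both exceed theirs (in expectation about 0.26 and 0.74) with the total staying close to 1.0. Most of the omitted classroom variance moves down to the student level, and a smaller share moves up to the school level.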

Notes

  1. The five existing publicly released datasets are Early Childhood Longitudinal Study: Kindergarten Class of 1998–1999 (ECLS-K), Education Longitudinal Study of 2002 (ELS), National Education Longitudinal Study: 1988 (NELS), Schools and Staffing Survey of 1999–2000 with Teacher Follow-up Study of 2000–2001 (SASS-TFS).

Author information

Corresponding author

Correspondence to Weimeng Wang.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 6 Percentage relative bias (%) of fixed-effect estimates and standard errors
Table 7 Estimates and standard error estimates of random-effect variance

About this article

Cite this article

Wang, W., Liao, M. & Stapleton, L. Incidental Second-Level Dependence in Educational Survey Data with a Nested Data Structure. Educ Psychol Rev 31, 571–596 (2019). https://doi.org/10.1007/s10648-019-09480-6

  • DOI: https://doi.org/10.1007/s10648-019-09480-6
