Omitted Variables in Multilevel Models

Kim, Jee-Seon; Frees, Edward W.

doi:10.1007/s11336-005-1283-0

Published: 11 November 2006

Psychometrika, Volume 71, pages 659–690 (2006)

Abstract

Statistical methodology for handling omitted variables is presented in a multilevel modeling framework. In many nonexperimental studies, the analyst may not have access to all requisite variables, and this omission may lead to biased estimates of model parameters. By exploiting the hierarchical nature of multilevel data, a battery of statistical tools is developed to test various forms of model misspecification as well as to obtain estimators that are robust to the presence of omitted variables. The methodology allows for tests of omitted effects at single and multiple levels. The paper also introduces intermediate-level tests; these are tests for omitted effects at a single level, regardless of the presence of omitted effects at a higher level. A simulation study shows, not surprisingly, that the omission of variables yields bias in both regression coefficients and variance components; it also suggests that omitted effects at lower levels may cause more severe bias than at higher levels. Important factors resulting in bias were found to be the level of an omitted variable, its effect size, and sample size. A real data study illustrates that an omitted variable at one level may yield biased estimators at any level and, in this study, one cannot obtain reliable estimates for school-level variables when omitted child effects exist. However, robust estimators may provide unbiased estimates for effects of interest even when the efficient estimators fail, and the one-degree-of-freedom test helps one to understand where the problem is located. It is argued that multilevel data typically contain rich information to deal with omitted variables, offering yet another appealing reason for the use of multilevel models in the social sciences.
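
The general idea that hierarchical structure can protect against omitted variables can be illustrated with a small simulation. The sketch below is not the estimators or tests developed in the paper; it is a toy two-level example under assumed values (200 groups of 10 units, a true slope of 1, and an omitted group-level variable `u` correlated with the predictor `x`). All function names and parameter values are hypothetical. It compares a pooled least-squares slope, which ignores the grouping, with a within-group (group-mean-centered) slope, which removes anything constant within a group.

```python
import numpy as np


def simulate(n_groups=200, n_per_group=10, beta=1.0, seed=0):
    """Two-level data with an omitted group-level variable u (assumed setup)."""
    rng = np.random.default_rng(seed)
    group = np.repeat(np.arange(n_groups), n_per_group)
    u = rng.normal(size=n_groups)                      # omitted level-2 variable
    x = 0.8 * u[group] + rng.normal(size=group.size)   # predictor correlated with u
    y = beta * x + u[group] + rng.normal(size=group.size)
    return group, x, y


def pooled_slope(x, y):
    """Least-squares slope of y on x, ignoring the group structure."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / (xc @ xc))


def within_slope(group, x, y):
    """Slope after subtracting group means from x and y (within-group estimator)."""
    counts = np.bincount(group)
    x_mean = np.bincount(group, weights=x) / counts
    y_mean = np.bincount(group, weights=y) / counts
    xw = x - x_mean[group]
    yw = y - y_mean[group]
    return float(xw @ yw / (xw @ xw))


if __name__ == "__main__":
    group, x, y = simulate()
    print("pooled slope:", pooled_slope(x, y))
    print("within slope:", within_slope(group, x, y))
```

Under these assumed values the pooled slope converges to 1 + cov(x, u)/var(x) = 1 + 0.8/1.64, roughly 1.49, while the within-group slope stays near the true value of 1 because group-mean centering removes `u`. A large gap between two such estimators is the intuition behind Hausman-type specification tests; the paper's own tests and robust estimators are more general than this sketch.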

Author information

Authors and Affiliations

University of Wisconsin, Madison
Jee-Seon Kim & Edward W. Frees
Department of Educational Psychology, University of Wisconsin, 1025 West Johnson Street, Madison, WI, 53706, USA
Jee-Seon Kim

Corresponding author

Correspondence to Jee-Seon Kim.

Additional information

This research was supported by the National Academy of Education/Spencer Foundation and the National Science Foundation, Grant Number SES-0436274.

About this article

Cite this article

Kim, JS., Frees, E.W. Omitted Variables in Multilevel Models. Psychometrika 71, 659–690 (2006). https://doi.org/10.1007/s11336-005-1283-0

Received: 08 January 2005
Accepted: 30 December 2005
Published: 11 November 2006
Issue Date: December 2006
DOI: https://doi.org/10.1007/s11336-005-1283-0
