Abstract
This chapter is a tutorial, with a worked demonstration, that guides the reader through the unique challenges of analyzing large-scale assessment data with multilevel modeling techniques. The reader will learn how to use relevant literature to specify an appropriate model that considers effects at various levels; how to address large-scale data complexities, including sampling weights and plausible values; and how to estimate such a model and interpret its results. The chapter gives a conceptual background and short description of each of these topics, followed by a demonstration using Programme for International Student Assessment (PISA) 2018 data. The demonstration serves as a concrete example of the analysis, covering model specification, estimation, and interpretation. In addition, annotated R syntax and R output for each aspect of the modeling are provided so that readers can conduct their own multilevel analyses with PISA data to answer their own research questions.
References
Bailey, P., Kelley, C., Nguyen, T., & Huo, H. (2021). WeMix: Weighted mixed-effects model using multilevel pseudo maximum likelihood estimation. R package version 3.1.8. https://CRAN.R-project.org/package=WeMix
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Caro, D. H., & Biecek, P. (2017). intsvy: An R package for analyzing international large-scale assessment data. Journal of Statistical Software, 81(7), 1–44. https://doi.org/10.18637/jss.v081.i07
Ertem, H. Y. (2021). Examination of Turkey’s PISA 2018 reading literacy scores within student-level and school-level variables. Participatory Educational Research, 8(1), 248–264. https://doi.org/10.17275/per.21.14.8.1
Ferrao, M. E., Costa, P. M., & Matos, D. A. S. (2017). The relevance of the school socioeconomic composition and school proportion of repeaters on grade repetition in Brazil: A multilevel logistic model of PISA 2012. Large-Scale Assessments in Education, 5(7), 1–13. https://doi.org/10.1186/s40536-017-0036-8
Lorah, J. A. (2018). Effect size measures for multilevel models: Definition, interpretation, and TIMSS example. Large-Scale Assessments in Education, 6(8). https://doi.org/10.1186/s40536-018-0061-2
Lorah, J. A. (2019). Estimating a multilevel model with complex survey data: Demonstration using TIMSS. Journal of Modern Applied Statistical Methods, 18(2), 2–14. https://doi.org/10.22237/jmasm/1604190360
Martin, M. O., & Mullis, I. V. S. (Eds.). (2012). Methods and procedures in TIMSS and PIRLS 2011. TIMSS & PIRLS International Study Center. https://timssandpirls.bc.edu/methods/
Ma, X., Ma, L., & Bradley, K. D. (2008). Using multilevel modeling to investigate school effects. In A. A. O’Connell & D. B. McCoach (Eds.), Multilevel modeling of educational data (pp. 59–110). Information Age Publishing Inc.
OECD. (2009). PISA data analysis manual: SPSS second edition. Oecd.org.
OECD. (2021a, July 30). PISA 2018 technical report, chapter 4: Sample design. Oecd.org. https://www.oecd.org/pisa/data/pisa2018technicalreport/
OECD. (2021b, July 30). How to prepare and analyse the PISA database. Oecd.org. https://www.oecd.org/pisa/data/httpoecdorgpisadatabase-instructions.htm
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Rogers, A. M., & Stoeckel, J. J. (2008). NAEP 2008 arts: Music and visual arts restricted-use data files data companion (NCES 2011–470). US Department of Education Institute of Education Sciences, National Center for Education Statistics.
Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Sage Publishing.
Appendix: R syntax
#R syntax for demonstration analysis

##############################
#1. Import two packages for analysis of complex survey data and set working directory

#intsvy package will be used to import data & compute descriptive statistics
library(intsvy)

#WeMix package will be used for estimating multilevel models
#this package allows sampling weights to be included in the model with the "mix" function
library(WeMix)

#set working directory (wd) where PISA data was saved
setwd("C:/Users/file path here")

##############################
#2. Save variable information and import data

#prints/saves variable labels and names of countries in a text file in working directory
pisa.var.label(folder = file.path(getwd(), "PISA 2018"),
               school.file = "CY07_MSU_SCH_QQQ.sav",
               student.file = "CY07_MSU_STU_QQQ.sav")

#selects & merges data; achievement & weight variables selected by default
mydata <- pisa.select.merge(folder = file.path(getwd(), "PISA 2018"),
                            school.file = "CY07_MSU_SCH_QQQ.sav",
                            student.file = "CY07_MSU_STU_QQQ.sav",
                            student = c("ESCS", "ST004D01T", "GFOFAIL", "MASTGOAL"),
                            school = c("W_SCHGRNRABWT", "SCHSIZE"),
                            countries = c("USA"))

##############################
#3. Data preparation

mydata$SCHSIZE_TH <- mydata$SCHSIZE / 1000 #Size in thousands of students
mydata$Male <- mydata$ST004D01T #recode gender in direction of effect

##############################
#4. Descriptive statistics
#using functions from intsvy package
pisa.mean.pv(pvlabel = "MATH", data = mydata)
pisa.mean(variable = "MASTGOAL", data = mydata)
pisa.mean(variable = "ESCS", data = mydata)
pisa.mean(variable = "Male", data = mydata)
pisa.mean(variable = "GFOFAIL", data = mydata)
pisa.mean(variable = "SCHSIZE_TH", data = mydata)
pisa.table(variable = "Male", data = mydata) #frequency distribution
pisa.ben.pv(pvlabel = "MATH", data = mydata) #percents at proficiency levels
pisa.mean.pv(pvlabel = "MATH", by = "Male", data = mydata) #for plausible values
pisa.rho(variable = c("PV1MATH", "MASTGOAL", "ESCS", "Male", "GFOFAIL", "SCHSIZE_TH"),
         data = mydata) #correlations
pisa.rho(variable = c("PV2MATH", "MASTGOAL", "ESCS", "Male", "GFOFAIL", "SCHSIZE_TH"),
         data = mydata) #correlations
pisa.rho(variable = c("PV3MATH", "MASTGOAL", "ESCS", "Male", "GFOFAIL", "SCHSIZE_TH"),
         data = mydata) #correlations

##############################
#5. Estimate multilevel models

#Math DV#
#listwise delete
newdata <- subset(mydata, select = c(PV1MATH, PV2MATH, PV3MATH, CNTSCHID, W_FSTUWT, W_SCHGRNRABWT,
                                     ESCS, Male, SCHSIZE_TH, GFOFAIL))
newdata <- na.omit(newdata)

#run models
M1a <- mix(PV1MATH ~ 1 + (1|CNTSCHID), data = newdata,
           weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
M1b <- mix(PV2MATH ~ 1 + (1|CNTSCHID), data = newdata,
           weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
M1c <- mix(PV3MATH ~ 1 + (1|CNTSCHID), data = newdata,
           weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
M2a <- mix(PV1MATH ~ 1 + ESCS + Male + SCHSIZE_TH + GFOFAIL + (1|CNTSCHID), data = newdata,
           weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
M2b <- mix(PV2MATH ~ 1 + ESCS + Male + SCHSIZE_TH + GFOFAIL + (1|CNTSCHID), data = newdata,
           weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
M2c <- mix(PV3MATH ~ 1 + ESCS + Male + SCHSIZE_TH + GFOFAIL + (1|CNTSCHID), data = newdata,
           weights = c("W_FSTUWT", "W_SCHGRNRABWT"))

#MASTGOAL DV#
#listwise delete
newdata <- subset(mydata, select = c(MASTGOAL, CNTSCHID, W_FSTUWT, W_SCHGRNRABWT,
                                     ESCS, Male, SCHSIZE_TH, GFOFAIL))
newdata <- na.omit(newdata)

#run models
M3 <- mix(MASTGOAL ~ 1 + (1|CNTSCHID), data = newdata,
          weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
M4 <- mix(MASTGOAL ~ 1 + ESCS + Male + SCHSIZE_TH + GFOFAIL + (1|CNTSCHID), data = newdata,
          weights = c("W_FSTUWT", "W_SCHGRNRABWT"))
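
The mathematics models in step 5 are written out once per plausible value, which becomes repetitive when all ten PISA 2018 plausible values are used. A minimal sketch of one way to generate the formulas programmatically is given here; the helper name `make_formula` and the object names are hypothetical and not part of the chapter's syntax. The `mix` call is left commented out because it needs the WeMix package and a data set that keeps all ten PV columns (the appendix subset keeps only three).

```r
# Sketch: build one model formula per plausible value (PV).
# n_pv = 10 follows the PV count the appendix states for PISA 2018.
n_pv <- 10
pv_names <- paste0("PV", seq_len(n_pv), "MATH")
predictors <- c("ESCS", "Male", "SCHSIZE_TH", "GFOFAIL")

# make_formula() is a hypothetical helper: it pastes the fixed effects
# (or an intercept only) together with the school random intercept.
make_formula <- function(dv, terms = NULL) {
  fixed <- if (length(terms) > 0) terms else "1"
  reformulate(c(fixed, "(1 | CNTSCHID)"), response = dv)
}

empty_formulas <- lapply(pv_names, make_formula)
full_formulas <- lapply(pv_names, make_formula, terms = predictors)

print(full_formulas[[1]])

# Assumed usage with the objects from the appendix (not run here):
# fits <- lapply(full_formulas, function(f)
#   mix(f, data = newdata, weights = c("W_FSTUWT", "W_SCHGRNRABWT")))
```

Keeping the fitted models in a list also makes the combination step easier, because the estimates and standard errors can be extracted with a single `sapply` call rather than by naming each model.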
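
##############################
#6. Combine results from multiple models to account for plausible values

#this demonstrates the procedure for the intercept from the empty model

#average of 3 intercept estimates represents combined coefficient
#(estimates are wrapped in c() because mean() averages only its first argument)
mean(c(M1a$coef, M1b$coef, M1c$coef))

#SE is computed with the following formula
#M is the number of PV; 3 for demonstration; PISA 2018 has 10 PV total
#total variance = mean of the squared SEs + (1+(1/M)) * variance of the estimates
#the combined SE is the square root of the total variance
M <- 3
sqrt(mean(c(M1a$SE, M1b$SE, M1c$SE)^2) + (1 + (1/M)) * var(c(M1a$coef, M1b$coef, M1c$coef)))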
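
Step 6 combines a single coefficient by hand. As a rough sketch, the same combination rules can be wrapped in a function that pools every coefficient at once: the pooled estimate is the mean across plausible values, the total variance is the mean of the squared standard errors plus (1 + 1/M) times the variance of the estimates, and the pooled SE is its square root. The function name `pool_pv` and the input numbers are made up for illustration; they are not results from the chapter.

```r
# Sketch: pool estimates across plausible-value models.
# est: numeric matrix, one row per plausible value, one column per coefficient
# se:  numeric matrix of the matching standard errors
pool_pv <- function(est, se) {
  est <- as.matrix(est)
  se <- as.matrix(se)
  stopifnot(identical(dim(est), dim(se)), nrow(est) >= 2)
  M <- nrow(est)                      # number of plausible values
  pooled <- colMeans(est)             # combined coefficients
  within <- colMeans(se^2)            # average sampling variance
  between <- apply(est, 2, var)       # variance across plausible values
  total <- within + (1 + 1 / M) * between
  data.frame(estimate = pooled, se = sqrt(total), row.names = colnames(est))
}

# Made-up numbers for illustration only (not PISA results)
est <- rbind(c(470, 30), c(472, 31), c(468, 29))
colnames(est) <- c("(Intercept)", "ESCS")
se <- rbind(c(3, 1), c(3, 1), c(3, 1))
pooled <- pool_pv(est, se)
print(pooled)

# Assumed usage with the fitted models from the appendix (not run here):
# pool_pv(rbind(M2a$coef, M2b$coef, M2c$coef),
#         rbind(M2a$SE, M2b$SE, M2c$SE))
```

Working on plain matrices keeps the function independent of any one modeling package, so it applies to any set of models that report coefficients and standard errors in the same order.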
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Lorah, J. (2022). Analyzing Large-Scale Assessment Data with Multilevel Analyses: Demonstration Using the Programme for International Student Assessment (PISA) 2018 Data. In: Khine, M.S. (eds) Methodology for Multilevel Modeling in Educational Research. Springer, Singapore. https://doi.org/10.1007/978-981-16-9142-3_7
DOI: https://doi.org/10.1007/978-981-16-9142-3_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-9141-6
Online ISBN: 978-981-16-9142-3
eBook Packages: Education (R0)