Analyzing Large-Scale Assessment Data with Multilevel Analyses: Demonstration Using the Programme for International Student Assessment (PISA) 2018 Data

Lorah, Julie

doi:10.1007/978-981-16-9142-3_7

Abstract

The present chapter provides a tutorial with an example demonstration designed to guide the reader through the complexities and unique challenges associated with analyzing large-scale assessment data through multilevel modeling techniques. The reader will learn how to use relevant literature to specify an appropriate model considering effects at various levels; how to address large-scale data complexities including sampling weights and plausible values; and how to estimate and interpret the results from such a model. The chapter will provide a conceptual background and short description of each of these topics, followed by an example demonstration using Programme for International Student Assessment (PISA) 2018 data. The demonstration will be used to provide a concrete example for analysis, including the process of model specification, estimation, and interpretation. In addition, annotated R syntax and R output associated with aspects of modeling will be provided so that readers can readily conduct their own multilevel analyses with PISA data in order to answer their own research questions.

References

Bailey, P., Kelley, C., Nguyen, T., & Huo, H. (2021). WeMix: Weighted mixed-effects model using multilevel pseudo maximum likelihood estimation. R package version 3.1.8. https://CRAN.R-project.org/package=WeMix
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Caro, D. H., & Biecek, P. (2017). intsvy: An R package for analyzing international large-scale assessment data. Journal of Statistical Software, 81(7), 1–44. https://doi.org/10.18637/jss.v081.i07
Ertem, H. Y. (2021). Examination of Turkey’s PISA 2018 reading literacy scores within student-level and school-level variables. Participatory Educational Research, 8(1), 248–264. https://doi.org/10.17275/per.21.14.8.1
Ferrao, M. E., Costa, P. M., & Matos, D. A. S. (2017). The relevance of the school socioeconomic composition and school proportion of repeaters on grade repetition in Brazil: A multilevel logistic model of PISA 2012. Large-Scale Assessments in Education, 5(7), 1–13. https://doi.org/10.1186/s40536-017-0036-8
Lorah, J. A. (2018). Effect size measures for multilevel models: Definition, interpretation, and TIMSS example. Large-Scale Assessments in Education, 6(8). https://doi.org/10.1186/s40536-018-0061-2
Lorah, J. A. (2019). Estimating a multilevel model with complex survey data: Demonstration using TIMSS. Journal of Modern Applied Statistical Methods, 18(2), 2–14. https://doi.org/10.22237/jmasm/1604190360
Martin, M. O., & Mullis, I. V. S. (Eds.). (2012). Methods and procedures in TIMSS and PIRLS 2011. TIMSS & PIRLS International Study Center. https://timssandpirls.bc.edu/methods/
Ma, X., Ma, L., & Bradley, K. D. (2008). Using multilevel modeling to investigate school effects. In A. A. O’Connell & D. B. McCoach (Eds.), Multilevel modeling of educational data (pp. 59–110). Information Age Publishing Inc.
OECD. (2009). PISA data analysis manual: SPSS second edition. Oecd.org.
OECD. (2021a, July 30). PISA 2018 technical report, chapter 4: Sample design. Oecd.org. https://www.oecd.org/pisa/data/pisa2018technicalreport/
OECD. (2021b, July 30). How to prepare and analyse the PISA database. Oecd.org. https://www.oecd.org/pisa/data/httpoecdorgpisadatabase-instructions.htm
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Rogers, A. M., & Stoeckel, J. J. (2008). NAEP 2008 arts: Music and visual arts restricted-use data files data companion (NCES 2011–470). US Department of Education Institute of Education Sciences, National Center for Education Statistics.
Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Sage Publishing.

Author information

Authors and Affiliations

Indiana University, Bloomington, IN, USA
Julie Lorah

Corresponding author

Correspondence to Julie Lorah.

Editor information

Editors and Affiliations

Curtin University, Bentley, WA, Australia
Myint Swe Khine

Appendix: R syntax

#R syntax for demonstration analysis

##############################
#1. Import two packages for analysis of complex survey data and set working directory

#intsvy package will be used to import data & compute descriptive statistics
library(intsvy)

#WeMix package will be used for estimating multilevel models
#this package allows inclusion of sampling weights in the model with the "mix" function
library(WeMix)

#set working directory (wd) where PISA data was saved
setwd("C:/Users/file path here")

##############################
#2. Save variable information and import data

#prints/saves variable labels and names of countries in a text file in working directory
pisa.var.label(folder = file.path(getwd(), "PISA 2018"),
               school.file = "CY07_MSU_SCH_QQQ.sav",
               student.file = "CY07_MSU_STU_QQQ.sav")

#selects & merges data; achievement & weight variables selected by default
mydata <- pisa.select.merge(folder = file.path(getwd(), "PISA 2018"),
                            school.file = "CY07_MSU_SCH_QQQ.sav",
                            student.file = "CY07_MSU_STU_QQQ.sav",
                            student = c("ESCS", "ST004D01T", "GFOFAIL", "MASTGOAL"),
                            school = c("W_SCHGRNRABWT", "SCHSIZE"),
                            countries = c("USA"))

##############################
#3. Data preparation

mydata$SCHSIZE_TH <- mydata$SCHSIZE/1000 #Size in thousands of students
mydata$Male <- mydata$ST004D01T #recode gender in direction of effect
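
The data-preparation step copies ST004D01T into Male without changing its values. As a minimal sketch on invented data, and assuming ST004D01T is coded 1 = female and 2 = male (worth confirming against the variable labels saved by pisa.var.label), a 0/1 indicator could be built so that its coefficient reads directly as a male-minus-female difference. The toy data frame and the name Male01 are hypothetical and not part of the chapter's syntax.

```r
# Toy data standing in for mydata; all values are invented for illustration
toy <- data.frame(ST004D01T = c(1, 2, 2, NA, 1),
                  SCHSIZE   = c(500, 1200, 1200, 800, 500))

# Assumed coding: 1 = female, 2 = male (verify against the codebook)
# Missing gender stays missing
toy$Male01 <- ifelse(toy$ST004D01T == 2, 1, 0)

# School size in thousands of students, as in the appendix
toy$SCHSIZE_TH <- toy$SCHSIZE / 1000

print(toy)
```

With a 1/2 coding the slope is the same as with a 0/1 coding; only the intercept changes, since it then refers to a value of zero that no student has.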

##############################
#4. Descriptive statistics
#using functions from intsvy package
pisa.mean.pv(pvlabel="MATH",data=mydata)
pisa.mean(variable="MASTGOAL",data=mydata)
pisa.mean(variable="ESCS",data=mydata)
pisa.mean(variable="Male",data=mydata)
pisa.mean(variable="GFOFAIL",data=mydata)
pisa.mean(variable="SCHSIZE_TH",data=mydata)
pisa.table(variable="Male",data=mydata) #frequency distribution
pisa.ben.pv(pvlabel="MATH",data=mydata) #percents at proficiency levels
pisa.mean.pv(pvlabel="MATH",by="Male",data=mydata) #for plausible values
pisa.rho(variable=c("PV1MATH","MASTGOAL","ESCS","Male","GFOFAIL","SCHSIZE_TH"), data=mydata) #correlations
pisa.rho(variable=c("PV2MATH","MASTGOAL","ESCS","Male","GFOFAIL","SCHSIZE_TH"), data=mydata) #correlations
pisa.rho(variable=c("PV3MATH","MASTGOAL","ESCS","Male","GFOFAIL","SCHSIZE_TH"), data=mydata) #correlations
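
To make the role of the sampling weights in these descriptive statistics concrete, the following base R sketch computes a weighted mean on invented values. It illustrates only the point estimate; it is not the intsvy implementation, and it does not reproduce the standard errors that survey software derives for complex samples. The vectors score and wt are hypothetical.

```r
# Invented scores and student weights, not PISA data
score <- c(400, 500, 600)
wt    <- c(1, 1, 2)   # the third student represents twice as many students

# Unweighted mean treats every sampled student equally
unweighted <- mean(score)

# Weighted mean lets each student count in proportion to the weight
weighted <- weighted.mean(score, wt)

c(unweighted = unweighted, weighted = weighted)
```

Here the weighted mean (525) is pulled toward the heavily weighted student, while the unweighted mean stays at 500.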

##############################
#5. Estimate multilevel models

#Math DV#
#listwise delete
newdata <- subset(mydata, select = c(PV1MATH, PV2MATH, PV3MATH, CNTSCHID, W_FSTUWT, W_SCHGRNRABWT,
                                     ESCS, Male, SCHSIZE_TH, GFOFAIL))
newdata <- na.omit(newdata)

#run models
M1a <- mix(PV1MATH~1+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
M1b <- mix(PV2MATH~1+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
M1c <- mix(PV3MATH~1+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
M2a <- mix(PV1MATH~1+ESCS+Male+SCHSIZE_TH+GFOFAIL+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
M2b <- mix(PV2MATH~1+ESCS+Male+SCHSIZE_TH+GFOFAIL+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
M2c <- mix(PV3MATH~1+ESCS+Male+SCHSIZE_TH+GFOFAIL+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))

#MASTGOAL DV#
#listwise delete
newdata <- subset(mydata, select = c(MASTGOAL, CNTSCHID, W_FSTUWT, W_SCHGRNRABWT,
                                     ESCS, Male, SCHSIZE_TH, GFOFAIL))
newdata <- na.omit(newdata)

#run models
M3 <- mix(MASTGOAL~1+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
M4 <- mix(MASTGOAL~1+ESCS+Male+SCHSIZE_TH+GFOFAIL+(1|CNTSCHID), data=newdata, weights=c("W_FSTUWT","W_SCHGRNRABWT"))
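
The empty models (M1a to M1c and M3) split the outcome variance into a between-school and a within-school part, and the intraclass correlation (ICC) is the between-school share of the total. The sketch below shows that arithmetic with a hypothetical helper named icc and invented variance components; it does not read anything from the fitted WeMix objects.

```r
# Intraclass correlation: share of total variance that lies between schools
icc <- function(var_between, var_within) {
  stopifnot(var_between >= 0, var_within > 0)
  var_between / (var_between + var_within)
}

# Invented variance components, not PISA results
icc(var_between = 2000, var_within = 6000)
```

With these made-up numbers the ICC is 0.25, meaning a quarter of the variance would sit between schools.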

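##############################
#6. Combine results from multiple models to account for plausible values

#this demonstrates the procedure for the intercept from the empty model

#average of 3 intercept estimates represents combined coefficient
#(the estimates must be wrapped in c(); mean() only averages its first argument)
mean(c(M1a$coef, M1b$coef, M1c$coef))

#SE is computed with the following formula
#M is the number of PV; 3 for demonstration; PISA 2018 has 10 PV total
#total variance = average squared SE + (1+1/M) * variance of the estimates across PVs
#the combined SE is the square root of the total variance
M <- 3
sqrt(mean(c(M1a$SE, M1b$SE, M1c$SE)^2) + (1+(1/M))*var(c(M1a$coef, M1b$coef, M1c$coef)))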
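
The combining step in section 6 can be wrapped in a small function that works for any number of plausible values. This is a minimal sketch, assuming the usual combining rules for multiply imputed estimates: the pooled estimate is the mean of the estimates, and the total variance is the mean squared standard error plus (1 + 1/M) times the variance of the estimates. The function name pool_pv and the input numbers are hypothetical.

```r
# Pool one coefficient across M plausible-value models
pool_pv <- function(est, se) {
  stopifnot(length(est) == length(se), length(est) > 1)
  M <- length(est)
  pooled_est  <- mean(est)     # average of the M estimates
  within_var  <- mean(se^2)    # average sampling variance (squared SEs)
  between_var <- var(est)      # variance of the estimates across PVs
  total_var   <- within_var + (1 + 1/M) * between_var
  c(estimate = pooled_est, se = sqrt(total_var))
}

# Invented numbers standing in for three intercepts and their SEs
res <- pool_pv(est = c(478, 480, 482), se = c(3, 3, 3))
res
```

For the fitted empty models, the inputs would be the three intercepts and their standard errors, for example est = c(M1a$coef, M1b$coef, M1c$coef) and se = c(M1a$SE, M1b$SE, M1c$SE), assuming those elements hold a single value each as they do in the appendix.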

About this chapter

Cite this chapter

Lorah, J. (2022). Analyzing Large-Scale Assessment Data with Multilevel Analyses: Demonstration Using the Programme for International Student Assessment (PISA) 2018 Data. In: Khine, M.S. (eds) Methodology for Multilevel Modeling in Educational Research. Springer, Singapore. https://doi.org/10.1007/978-981-16-9142-3_7

DOI: https://doi.org/10.1007/978-981-16-9142-3_7
Published: 11 April 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-9141-6
Online ISBN: 978-981-16-9142-3
eBook Packages: Education, Education (R0)
