Abstract
Contextual factors are known to strongly influence students’ reading performance. However, how key contextual factors jointly influence reading performance remains unclear and warrants further exploration. Walberg’s educational productivity theory and Bronfenbrenner’s ecological system theory both hold that human learning can only be understood by considering multiple factors combined into a unit or system. Building on these theories, the current study sought to identify the optimal set of key contextual factors that collaboratively influence fourth-grade students’ reading performance. Data from 183,428 students in 61 countries/regions were extracted from the Progress in International Reading Literacy Study (PIRLS) 2016 dataset. First, a support vector machine (SVM) was adopted to classify high-performing students (reading score above 550) and low-performing students (reading score below 475) on the basis of the contextual factors. Second, SVM recursive feature elimination (SVM–RFE) was applied to identify the key contextual factors capable of differentiating the two student cohorts. The findings indicate that 20 key contextual factors, selected from 106 contextual factors at the student, family and school levels, collectively differentiate high- and low-performing students, with implications for teaching and learning aimed at elementary school students’ reading performance.
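The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear kernel, trains the SVM by plain batch subgradient descent, and runs on synthetic stand-in data rather than PIRLS records. The function names (`label`, `train_linear_svm`, `svm_rfe`) and all hyperparameters are hypothetical; only the 550 and 475 score cut-offs come from the abstract.

```python
import random

def label(score):
    """+1 for high performers (> 550), -1 for low performers (< 475), else None."""
    if score > 550:
        return 1
    if score < 475:
        return -1
    return None  # the middle band belongs to neither cohort

def train_linear_svm(X, y, features, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss (assumed setup)."""
    w = {j: 0.0 for j in features}
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = {j: lam * w[j] for j in features}
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(w[j] * xi[j] for j in features) + b)
            if margin < 1:  # only margin violators contribute to the subgradient
                for j in features:
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        for j in features:
            w[j] -= lr * grad_w[j]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """Retrain, drop the feature with the smallest squared weight, repeat."""
    active = list(range(len(X[0])))
    eliminated = []
    while len(active) > n_keep:
        w, _ = train_linear_svm(X, y, active)
        weakest = min(active, key=lambda j: w[j] ** 2)
        active.remove(weakest)
        eliminated.append(weakest)
    return sorted(active), eliminated

# Synthetic stand-in data: features 0 and 1 carry signal, features 2-4 are noise.
rng = random.Random(0)
scores = [rng.uniform(300, 700) for _ in range(400)]
y = [label(s) for s in scores if label(s) is not None]
X = [[2 * yi + rng.gauss(0, 1), -2 * yi + rng.gauss(0, 1)]
     + [rng.gauss(0, 1) for _ in range(3)] for yi in y]

selected, eliminated = svm_rfe(X, y, n_keep=2)
print("kept:", selected, "eliminated in order:", eliminated)
```

On this synthetic data the two signal-carrying features should be the ones retained. In the study the same idea is applied at a much larger scale, narrowing 106 contextual factors down to 20.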
References
Alivernini, F. (2013). An exploration of the gap between highest and lowest ability readers across 20 countries. Educational Studies, 39(4), 399–417. https://doi.org/10.1080/03055698.2013.767187
Alivernini, F., Manganelli, S., & Lucidi, F. (2016). The last shall be the first: Competencies, equity and the power of resilience in the Italian school system. Learning and Individual Differences, 51, 19–28. https://doi.org/10.1016/j.lindif.2016.08.010
Araújo, L., & Costa, P. (2015). Home book reading and reading achievement in EU countries: The progress in international reading literacy study 2011 (PIRLS). Educational Research and Evaluation, 21, 422–438. https://doi.org/10.1080/13803611.2015.1111803
Areepattamannil, S., Freeman, J. G., & Klinger, D. A. (2010). Influence of motivation, self–beliefs, and instructional practices on science achievement of adolescents in Canada. Social Psychology of Education, 14, 233–259. https://doi.org/10.1007/s11218-010-9144-9
Dong, X., & Hu, J. (2019). An exploration of impact factors influencing students’ reading literacy in Singapore with machine learning approaches. International Journal of English Linguistics, 9(5), 52–65. https://doi.org/10.5539/ijel.v9n5p52
Baldi, P., Brunak, S., Chauvin, Y., Andersen, C. A. F., & Nielsen, H. (2000). Assessing the accuracy of prediction algorithms for classification: An overview. Bioinformatics, 16, 412–424. https://doi.org/10.1093/bioinformatics/16.5.412
Berninger, V. W., Nielsen, K. H., Abbott, R. D., Wijsman, E., & Raskind, W. (2008). Gender differences in severity of writing and reading disabilities. Journal of School Psychology, 46, 151–172. https://doi.org/10.1016/j.jsp.2007.02.007
Bowman, B., Donovan, M. S., & Burns, S. (2000). Eager to learn: Educating our preschoolers. National Research Council.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Wadsworth.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. https://doi.org/10.1023/A:1010933404324
Bronfenbrenner, U. (1979). The ecology of human development: Experiments by nature and design. Harvard University Press.
Burman, D. D., Bitan, T., & Booth, J. R. (2008). Sex differences in neural processing of language among children. Neuropsychologia, 46, 1349–1362. https://doi.org/10.1016/j.neuropsychologia.2007.12.021
Caro, D. H., Sandoval-Hernández, A., & Lüdtke, O. (2013). Cultural, social, and economic capital constructs in international assessments: An evaluation using exploratory structural equation modeling. School Effectiveness and School Improvement, 25, 433–450. https://doi.org/10.1080/09243453.2013.812568
Chang, C., & Lin, C. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2, 1–27. https://doi.org/10.1145/1961189.1961199
Chen, J., Zhang, Y., Wei, Y., & Hu, J. (2019). Discrimination of the contextual features of top performers in scientific literacy using a machine learning approach. Research in Science Education. Advance online publication. https://doi.org/10.1007/s11165-019-9835-y
Chen, J., Zhang, Y., & Hu, J. (2021). Synergistic effects of instruction and affect factors on high- and low-ability disparities in elementary students’ reading literacy. Reading and Writing: An Interdisciplinary Journal, 34(1), 199–230. https://doi.org/10.1007/s11145-020-10070-0
Cheung, W. M., Lam, J. W. I., Au, D. W. H., So, W. W. Y., Huang, Y., & Tsang, H. W. H. (2017). Explaining student and home variance of Chinese reading achievement of the PIRLS 2011 Hong Kong. Psychology in the Schools, 54, 889–904. https://doi.org/10.1002/pits.22041
Cordero, J. M., Santín, D., & Simancas, R. (2017). Assessing European primary school performance through a conditional nonparametric model. Journal of the Operational Research Society, 68, 364–376. https://doi.org/10.1057/jors.2015.42
Cortes, C., & Vapnik, V. (1995). Support–vector networks. Machine Learning, 20, 273–297. https://doi.org/10.1007/bf00994018
Creemers, B., & Kyriakides, L. (2010). School factors explaining achievement on cognitive and affective outcomes: Establishing a dynamic model of educational effectiveness. Scandinavian Journal of Educational Research, 54, 263–294. https://doi.org/10.1080/00313831003764529
Cui, X. J., Yang, Q. X., Li, B., Tang, J., Zhang, X. Y., Li, S., & Zhu, F. (2019). Assessing the effectiveness of direct data merging strategy in long term and large scale. Frontiers in Pharmacology, 10, 127. https://doi.org/10.3389/fphar.2019.00127
Dosenbach, N. U., Nardos, B., Cohen, A. L., Fair, D. A., Power, J. D., Church, J. A., & Schlaggar, B. L. (2010). Prediction of individual brain maturity using fMRI. Science, 329, 1358–1361. https://doi.org/10.1126/science.1194144
Eriksson, M., Ghazinour, M., & Hammarström, A. (2018). Different uses of Bronfenbrenner’s ecological theory in public mental health research: What is their value for guiding public mental health policy and practice? Social Theory & Health, 16, 414–433. https://doi.org/10.1057/s41285-018-0065-6
Finch, W. H., Hernández Finch, M. E., & French, B. F. (2016). Recursive partitioning to identify potential causes of differential item functioning in cross–national data. International Journal of Testing, 16, 21–53. https://doi.org/10.1080/15305058.2015.1039644
Fraser, B. J., Walberg, H. J., Welch, W. W., & Hattie, J. A. (1987). Syntheses of educational productivity research. International Journal of Educational Research, 11(2), 147–252. https://doi.org/10.1016/0883-0355(87)90035-8
Gabriel, F., Signolet, J., & Westwell, M. (2017). A machine learning approach to investigating the effects of mathematics dispositions on mathematical literacy. International Journal of Research & Method in Education, 41, 306–327. https://doi.org/10.1080/1743727X.2017.1301916
Gnaldi, M., Schagen, I., Twist, L., & Morrison, J. (2005). Attitude items and low ability students: The need for a cautious approach to interpretation. Educational Studies, 31, 103–113. https://doi.org/10.1080/03055690500095241
Gormley, W. T., Gayer, T., Phillips, D., & Dawson, B. (2005). The effects of universal pre–K on cognitive development. Developmental Psychology, 41, 872–884. https://doi.org/10.1037/0012-1649.41.6.872
Gorostiaga, A., & Rojo-Álvarez, J. L. (2016). On the use of conventional and statistical–learning techniques for the analysis of PISA results in Spain. Neurocomputing, 171, 625–637. https://doi.org/10.1016/j.neucom.2015.07.001
Graham, S., Liu, X., Aitken, A., Ng, C., Bartlett, B., Harris, K. R., & Holzapfel, J. (2017). Effectiveness of literacy programs balancing reading and writing instruction: A meta–analysis. Reading Research Quarterly, 53, 279–304. https://doi.org/10.1002/rrq.194
Greenwald, R., Hedges, L. V., & Laine, R. D. (1996). The effect of school resources on student achievement. Review of Educational Research, 66, 361–396. https://doi.org/10.2307/1170528
Gustafsson, J., & Balke, G. (1993). General and specific abilities as predictors of school achievement. Multivariate Behavioral Research, 28, 407–434. https://doi.org/10.1207/s15327906mbr2804_2
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46, 389–422. https://doi.org/10.1023/A:1012487302797
Hammerness, K. M., Darling-Hammond, L., Bransford, J., Berliner, D., Cochran-Smith, M., McDonald, M., & Zeichner, K. (2005). How teachers learn and develop. In L. Darling-Hammond & J. Bransford (Eds.), Preparing teachers for a changing world: What teachers should learn and be able to do (pp. 358–389). Jossey-Bass.
Hu, J. (2014). An analysis of the design process of a language learning management system. Control and Intelligent Systems, 42(1), 80–86. https://doi.org/10.2316/Journal.201.2014.1.201-2534
Huebner, C. E., & Meltzoff, A. N. (2005). Intervention to change parent–child reading style: A comparison of instructional methods. Journal of Applied Developmental Psychology, 26, 296–313. https://doi.org/10.1016/j.appdev.2005.02.006
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. Springer.
Jerrim, J., & Micklewright, J. (2014). Socio–economic gradients in children’s cognitive skills: Are cross–country comparisons robust to who reports family background? European Sociological Review, 30, 766–781. https://doi.org/10.1093/esr/jcu072
Jing, Y., Li, B., Chen, N., Li, X., & Hu, J. (2015). The discrimination of learning styles by bayes-based statistics: An extended study on ILS system. Control and Intelligent Systems, 43(2), 68–75. https://doi.org/10.2316/Journal.201.2015.2.201-2666
Kirsch, I., De Jong, J., Lafontaine, D., McQueen, J., Mendelovits, J., & Monseur, C. (2002). Reading for change: Performance and engagement across countries: Results of PISA 2000. Paris, France: OECD Publishing. http://www.oecd.org/education/school/programmeforinternationalstudentassessmentpisa/33690904.pdf
Kuhn, M. (2008). Building predictive models in R using the caret package. Journal of Statistical Software, 28, 1–26. https://doi.org/10.18637/jss.v028.i05
Kuhn, M., & Johnson, K. (2013). Applied predictive modelling. Springer.
La Paro, K. M., Pianta, R. C., & Stuhlman, M. (2004). The classroom assessment scoring system: Findings from the prekindergarten year. The Elementary School Journal, 104, 409–426. https://doi.org/10.1086/499760
Lam, T. Y., & Lau, K. C. (2014). Examining factors affecting science achievement of Hong Kong in PISA 2006 using hierarchical linear modeling. International Journal of Science Education, 36, 2463–2480. https://doi.org/10.1080/09500693.2013.879223
Law, Y. (2009). The role of attribution beliefs, motivation and strategy use in Chinese fifth–graders’ reading comprehension. Educational Research, 51, 77–95. https://doi.org/10.1080/00131880802704764
Leonard, J. (2011). Using Bronfenbrenner’s ecological theory to understand community partnerships: A historical case study of one urban high school. Urban Education, 46, 987–1010. https://doi.org/10.1177/0042085911400337
Li, H., & Sun, J. (2011). Predicting business failure using support vector machines with straightforward wrapper: A re–sampling study. Expert Systems with Applications, 38, 12747–12756. https://doi.org/10.1016/j.eswa.2008.01.003
Lien, H. Y. (2017). EFL college learners’ perceptions of self–selected materials for extensive reading. The English Teacher, 39, 194–204.
Liu, X., & Ruiz, M. E. (2008). Using data mining to predict K–12 students’ performance on large–scale assessment items related to energy. Journal of Research in Science Teaching, 45, 554–573. https://doi.org/10.1002/tea.20232
Machin, S., McNally, S., & Wyness, G. (2013). Educational attainment across the UK nations: Performance, inequality and evidence. Educational Research, 55, 139–164. https://doi.org/10.1080/00131881.2013.801242
Marsh, H. W., Lüdtke, O., Nagengast, B., Trautwein, U., Morin, A. J., Abduljabbar, A. S., & Köller, O. (2012). Classroom climate and contextual effects: Conceptual and methodological issues in the evaluation of group–level effects. Educational Psychologist, 47, 106–124. https://doi.org/10.1080/00461520.2012.670488
Mou, W. J., Liu, Z. Q., Luo, Y., Zou, M., Ren, C., Zhang, C. Y., & Tian, Y. P. (2014). Development and cross–validation of prognostic models to assess the treatment effect of cisplatin/pemetrexed chemotherapy in lung adenocarcinoma patients. Medical Oncology, 31, 59. https://doi.org/10.1007/s12032-014-0059-8
Mullis, I. V. S., Kennedy, A. M., Martin, M. O., & Sainsbury, M. (2006). PIRLS 2006 assessment framework and specifications (2nd ed.). TIMSS & PIRLS International Study Center, Boston College.
Mullis, I. V. S., Martin, M. O., Foy, P., & Drucker, K. T. (2012). PIRLS 2011 international results in reading. TIMSS & PIRLS International Study Center, Boston College.
Mullis, I. V. S., Martin, M. O., Foy, P., & Hooper, M. (2017). PIRLS 2016 international results in reading. TIMSS & PIRLS International Study Center, Boston College.
Mullis, I. V., Martin, M. O., Gonzalez, E. J., & Chrostowski, S. J. (2004). Findings from IEA’s trends in international mathematics and science study at the fourth and eighth grades. TIMSS & PIRLS International Study Center, Boston College.
Mullis, I. V. S., Martin, M. O., Kennedy, A. M., & Foy, P. (2007). PIRLS 2006 international report: IEA’s progress in international reading literacy study in primary schools in 40 countries. TIMSS & PIRLS International Study Center, Boston College.
Myrberg, E. (2007). The effect of formal teacher education on reading achievement of 3rd-grade students in public and independent schools in Sweden. Educational Studies, 33, 145–162. https://doi.org/10.1080/03055690601068311
OECD. (2009). PISA data analysis manual: SPSS (2nd ed.). OECD Publishing. https://doi.org/10.1787/9789264056275-en
O’Sullivan, J. T., & Howe, M. L. (1996). Causal attributions and reading achievement: Individual differences in low–income families. Contemporary Educational Psychology, 21, 363–387. https://doi.org/10.1006/ceps.1996.0027
Pham, B. T., Pradhan, B., Bui, D. T., Prakash, I., & Dholakia, M. (2016). A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of Uttarakhand area (India). Environmental Modelling & Software, 84, 240–250. https://doi.org/10.1016/j.envsoft.2016.07.005
Park, Y. (2011). How motivational constructs interact to predict elementary students’ reading performance: Examples from attitudes and self–concept in reading. Learning and Individual Differences, 21, 347–358. https://doi.org/10.1016/j.lindif.2011.02.009
Pianta, R. C., Hamre, B., & Stuhlman, M. (2003). Relationships between teachers and children. In W. M. Reynolds & G. E. Miller (Eds.), Handbook of psychology: Educational psychology (pp. 199–234). Hoboken: Wiley.
Ponzo, M. (2013). Does bullying reduce educational achievement? An evaluation using matching estimators. Journal of Policy Modeling, 35, 1057–1078. https://doi.org/10.1016/j.jpolmod.2013.06.002
Qiao, X., & Jiao, H. (2018). Data mining techniques in analyzing process data: A didactic. Frontiers in Psychology, 9, 2231. https://doi.org/10.3389/fpsyg.2018.02231
Raikes, H., Pan, B. A., Luze, G., Tamis-LeMonda, C. S., Brooks-Gunn, J., Constantine, J., & Rodriguez, E. T. (2006). Mother–child bookreading in low–income families: Correlates and outcomes during the first three years of life. Child Development, 77, 924–953. https://doi.org/10.1111/j.1467-8624.2006.00911.x
Reilly, D. (2015). Gender differences in reading from a cross-cultural perspective: The contribution of gender equality. In Proceedings of the International Convention of Psychological Science, Amsterdam, Netherlands. https://doi.org/10.13140/RG.2.2.18218.72647
Rindermann, H., Michou, C. D., & Thompson, J. (2011). Children’s writing ability: Effects of parent’s education, mental speed and intelligence. Learning and Individual Differences, 21, 562–568. https://doi.org/10.1016/j.lindif.2011.07.010
Sanzana, M. B., Garrido, S. S., & Poblete, C. M. (2015). Profiles of Chilean students according to academic performance in mathematics: An exploratory study using classification trees and random forests. Studies in Educational Evaluation, 44, 50–59. https://doi.org/10.1016/j.stueduc.2015.01.002
Saskia, K. B., Antonia, A. M. H., & van de Grift, W. J. C. M. (2019). The relationship among students’ reading performance, their classroom behavior, and teacher skills. The Journal of Educational Research, 112(1), 1–11. https://doi.org/10.1080/00220671.2017.1411878
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77, 454–499. https://doi.org/10.3102/0034654307310317
Song, L., Spier, E. T., & Tamis-LeMonda, C. S. (2013). Reciprocal influences between maternal language and children’s language and cognitive development in low-income families. Journal of Child Language, 41, 305–326. https://doi.org/10.1017/s0305000912000700
Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: Rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological Methods, 14, 323–348. https://doi.org/10.1037/a0016973
Tramonte, L., & Willms, J. D. (2010). Cultural capital and its effects on education outcomes. Economics of Education Review, 29(2), 200–213. https://doi.org/10.1016/j.econedurev.2009.06.003
Tse, S. K., & Xiao, X. Y. (2014). Differential influences of affective factors and contextual factors on high-proficiency readers and low-proficiency readers: A multilevel analysis of PIRLS data from Hong Kong. Large-Scale Assessments in Education, 2, 1–24. https://doi.org/10.1186/s40536-014-0006-3
Twist, L., Gnaldi, M., Schagen, I., & Morrison, J. (2004). Good readers but at a cost? Attitudes to reading in England. Journal of Research in Reading, 27, 387–400. https://doi.org/10.1111/j.1467-9817.2004.00241.x
Van Bergen, E., Snowling, M. J., De Zeeuw, E. L., Van Beijsterveldt, C. E., Dolan, C. V., & Boomsma, D. I. (2018). Why do children read more? The influence of reading ability on voluntary reading practices. Journal of Child Psychology and Psychiatry, 59, 1205–1214. https://doi.org/10.1111/jcpp.12910
Van Bergen, E., Van Zuijen, T., Bishop, D., & De Jong, P. F. (2016). Why are home literacy environment and children’s reading skills associated? What parental skills reveal. Reading Research Quarterly, 52, 147–160. https://doi.org/10.1002/rrq.160
Walberg, H. J. (1981). A psychological theory of educational productivity. In F. H. Farley & N. Gordon (Eds.), Psychology and education (pp. 81–110). McCutchan.
Walberg, H. J. (1984). Improving the productivity of America’s schools. Educational Leadership, 41, 19–27.
Wang, J. H., & Guthrie, J. T. (2004). Modeling the effects of intrinsic motivation, extrinsic motivation, amount of reading, and past reading achievement on text comprehension between U.S. and Chinese students. Reading Research Quarterly, 39, 162–186. https://doi.org/10.1598/rrq.39.2.2
Wei, Y., Yang, Q., Chen, J., & Hu, J. (2018). The exploration of a machine learning approach for the assessment of learning styles changes. Mechatronic Systems and Control, 46, 121–126. https://doi.org/10.2316/Journal.201.2018.3.201-2979
Weizman, Z. O., & Snow, C. E. (2001). Lexical output as related to children’s vocabulary acquisition: Effects of sophisticated exposure and support for meaning. Developmental Psychology, 37, 265–279. https://doi.org/10.1037/0012-1649.37.2.265
Wiium, N., & Wold, B. (2009). An ecological system approach to adolescent smoking behavior. Journal of Youth and Adolescence, 38(10), 1351–1363. https://doi.org/10.1007/s10964-008-9349-9
Rupley, W. H., Blair, T. R., & Nichols, W. D. (2009). Effective reading instruction for struggling readers: The role of direct/explicit teaching. Reading & Writing Quarterly, 25(2–3), 125–138. https://doi.org/10.1080/10573560802683523
Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., & Steinberg, D. (2008). Top 10 algorithms in data mining. Knowledge and Information Systems, 14, 1–37. https://doi.org/10.1007/s10115-007-0114-2
Xia, J., Broadhurst, D., Wilson, M., & Wishart, D. (2013). Translational biomarker discovery in clinical metabolomics: An introductory tutorial. Metabolomics, 9, 280–299. https://doi.org/10.1007/s11306-012-0482-9
Xiao, Y., Liu, Y., & Hu, J. (2019). Regression analysis of ICT impact factors on early adolescents’ reading proficiency in five high-performing countries. Frontiers in Psychology, 10, 1646. https://doi.org/10.3389/fpsyg.2019.01646
Zhang, F., Kaufman, H. L., Deng, Y., & Drabier, R. (2013). Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood. BMC Medical Genomics, 6, S4. https://doi.org/10.1186/1755-8794-6-s1-s4
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest.
Ethical standards
The data were extracted from the open access PIRLS 2016 dataset (URL: https://timssandpirls.bc.edu/pirls2016/international-database) and are also available upon request from the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Xin Dong and Yi Peng have contributed equally.
This research was supported by the Philosophical and Social Sciences Planning Project of Zhejiang Province in 2020 [grant number 20NDJC01Z], the Second Batch of 2019 Industry-University Collaborative Education Project of Chinese Ministry of Education [grant number 201902016038], the Fundamental Research Funds for the Central Universities of Zhejiang University, and the SUPERB College English Action Plan.
Cite this article
Hu, J., Dong, X. & Peng, Y. Discovery of the key contextual factors relevant to the reading performance of elementary school students from 61 countries/regions: insight from a machine learning-based approach. Read Writ 35, 93–127 (2022). https://doi.org/10.1007/s11145-021-10176-z
DOI: https://doi.org/10.1007/s11145-021-10176-z