Abstract
This study applied many-facet Rasch measurement (MFRM) to assess students’ knowledge-in-use in middle school physical science. A total of 240 students completed three knowledge-in-use classroom assessment tasks on an online platform. We developed transformable scoring rubrics to score students’ responses: a task-generic polytomous rubric (applicable to all three tasks), a task-specific polytomous rubric (one per task), and a task-specific dichotomous rubric (one per task). Three qualified raters scored the 240 students’ responses to the three tasks. The MFRM analyses reported student ability, item difficulty, rater severity, and interaction effects, which informed improvements to the assessment tasks and rubrics.
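The analyses described above rest on the many-facet extension of the Rasch model. In its partial-credit form (the notation below is a standard general formulation, not reproduced from the chapter), the log-odds of student n receiving category k rather than k−1 from rater j on item i are:

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \delta_i - \alpha_j - \tau_{ik}
```

where \(\theta_n\) is student ability, \(\delta_i\) item difficulty, \(\alpha_j\) rater severity, and \(\tau_{ik}\) the step difficulty of moving from category k−1 to k on item i. Restricting scores to k ∈ {0, 1} and dropping the step terms recovers the dichotomous case used with the task-specific dichotomous rubrics.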
Acknowledgements
This study is supported by the National Science Foundation (Grant Numbers 2101104, 2100964, 2201068; DRL-1903103, DRL-1316874, DRL-1316903, DRL-1316908). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Appendix: ConQuest Many-Facet Rasch Modeling Codes
MFRM analysis code (ConQuest) and annotations:

1. The final model in the section applying dichotomous MFRMs to analyze scores from the task-specific dichotomous rubrics. See the main findings in Table 13.4 and Fig. 13.4.

```
Datafile Fulldata01.Dat;
Format rater 5 responses 7-27 / rater 5 responses 7-27 / rater 5 responses 7-27 ! Criteria(21);
Label << NGSA240.Nam;
Set update=yes, warning=no;
Model rater + criteria;
Export parameters >> NGSA240.Prm;
Export reg >> NGSA240.Reg;
Export cov >> NGSA240.Cov;
Estimate ! Nodes=10, stderr=full;
Show parameters ! Estimates=latent, tables=1:2:4 >> NGSA240.Shw;
```

2. The final model (model 4) in the section applying partial-credit MFRMs to analyze scores from the task-specific polytomous rubrics. See the main findings in Table 13.5 and Fig. 13.6.

```
Datafile Fulldata Recode012.Dat;
Format rater 5 responses 7-16 / rater 5 responses 7-16 / rater 5 responses 7-16 ! Element(10);
Label << NGSA240r.Nam;
Set update=yes, warning=no;
Model rater + element + rater*element + rater*element*step;
Export parameters >> NGSA240r.Prm;
Export reg >> NGSA240r.Reg;
Export cov >> NGSA240r.Cov;
Estimate ! Nodes=10, stderr=full;
Show parameters ! Estimates=latent, tables=1:2:4 >> NGSA240r.Shw;
```

3. The final model (model 2) in the section applying partial-credit MFRMs to analyze scores from the task-generic polytomous rubric. See the main findings in Tables 13.8 and 13.9 and Fig. 13.7.

```
Datafile Fulldata Recode012Com.Dat;
Format rater 5 task 7 responses 9-12 / rater 5 task 7 responses 9-12 / rater 5 task 7 responses 9-12 ! Com(4);
Label << NGSA240r2.Nam;
Set update=yes, warning=no;
Model rater + task + com + rater*task + rater*com + rater*com*step;
Export parameters >> NGSA240r2.Prm;
Export reg >> NGSA240r2.Reg;
Export cov >> NGSA240r2.Cov;
Estimate ! Nodes=10, stderr=full;
Show parameters ! Estimates=latent, tables=1:2:4 >> NGSA240r2.Shw;
```
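As a complement to the ConQuest specifications, the following self-contained Python sketch simulates dichotomous scores under the three-facet model (student ability, item difficulty, rater severity). All parameter values and names here are hypothetical, chosen only to illustrate how a more severe rater depresses expected scores; this is an illustration of the model form, not the chapter's analysis.

```python
import math
import random

random.seed(0)

# Hypothetical facet parameters in logits: 5 students, 3 items, 2 raters.
theta = [-1.0, -0.5, 0.0, 0.5, 1.0]   # student ability
delta = [-0.5, 0.0, 0.5]              # item difficulty
alpha = [-0.3, 0.3]                   # rater severity (positive = more severe)

def p_correct(t, d, a):
    """Dichotomous MFRM: P(X = 1) = exp(t - d - a) / (1 + exp(t - d - a))."""
    z = t - d - a
    return math.exp(z) / (1.0 + math.exp(z))

# Simulate one scored response per (student, item, rater) combination.
data = [(n, i, j, 1 if random.random() < p_correct(theta[n], delta[i], alpha[j]) else 0)
        for n in range(5) for i in range(3) for j in range(2)]

# A more severe rater (higher alpha) awards 1s at a lower expected rate.
for j, a in enumerate(alpha):
    scores = [x for (_, _, r, x) in data if r == j]
    print(f"rater {j} (severity {a:+.1f}): mean observed score {sum(scores)/len(scores):.2f}")
```

Because severity enters the logit with a minus sign, the same student–item pair has a strictly lower success probability under the severe rater; MFRM software estimates these alpha terms from the data rather than assuming them.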
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
He, P., Zhai, X., Shin, N., Krajcik, J. (2023). Applying Rasch Measurement to Assess Knowledge-in-Use in Science Education. In: Liu, X., Boone, W.J. (eds) Advances in Applications of Rasch Measurement in Science Education. Contemporary Trends and Issues in Science Education, vol 57. Springer, Cham. https://doi.org/10.1007/978-3-031-28776-3_13
Print ISBN: 978-3-031-28775-6
Online ISBN: 978-3-031-28776-3