Abstract
This study applied many-facet Rasch measurement (MFRM) to assess students’ knowledge-in-use in middle school physical science. A total of 240 students completed three knowledge-in-use classroom assessment tasks on an online platform. We developed transformable scoring rubrics to score students’ responses: a task-generic polytomous rubric (applicable to all three tasks), a task-specific polytomous rubric (one per task), and a task-specific dichotomous rubric (one per task). Three qualified raters scored the 240 students’ responses to the three tasks. The MFRM analyses reported student ability, item difficulty, rater severity, and interaction effects, which informed improvements to the assessment tasks and rubrics.
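The analyses described above rest on the many-facet extension of the Rasch model. In its partial-credit form (the notation below is a standard general formulation, not reproduced from the chapter), the log-odds of student n receiving category k rather than k−1 from rater j on item i are:

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \delta_i - \alpha_j - \tau_{ik}
```

where \(\theta_n\) is student ability, \(\delta_i\) item difficulty, \(\alpha_j\) rater severity, and \(\tau_{ik}\) the step difficulty of moving from category k−1 to k on item i. Restricting scores to k ∈ {0, 1} and dropping the step terms recovers the dichotomous case used with the task-specific dichotomous rubrics.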
Acknowledgements
This study is supported by the National Science Foundation (Grant Numbers 2101104, 2100964, 2201068; DRL-1903103, DRL-1316874, DRL-1316903, DRL-1316908). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Appendix: ConQuest Many-Facet Rasch Modeling Codes
MFRM analysis code (ConQuest) and annotations:

1. The final model in the section applying dichotomous MFRMs to analyze scores from the task-specific dichotomous rubrics. See the main findings in Table 13.4 and Fig. 13.4.

```
Datafile Fulldata01.Dat;
Format rater 5 responses 7-27 / rater 5 responses 7-27 / rater 5 responses 7-27 ! Criteria(21);
Label << NGSA240.Nam;
Set update=yes, warning=no;
Model rater + criteria;
Export parameters >> NGSA240.Prm;
Export reg >> NGSA240.Reg;
Export cov >> NGSA240.Cov;
Estimate ! Nodes=10, stderr=full;
Show parameters ! Estimates=latent, tables=1:2:4 >> NGSA240.Shw;
```

2. The final model (model 4) in the section applying partial-credit MFRMs to analyze scores from the task-specific polytomous rubrics. See the main findings in Table 13.5 and Fig. 13.6.

```
Datafile Fulldata Recode012.Dat;
Format rater 5 responses 7-16 / rater 5 responses 7-16 / rater 5 responses 7-16 ! Element(10);
Label << NGSA240r.Nam;
Set update=yes, warning=no;
Model rater + element + rater*element + rater*element*step;
Export parameters >> NGSA240r.Prm;
Export reg >> NGSA240r.Reg;
Export cov >> NGSA240r.Cov;
Estimate ! Nodes=10, stderr=full;
Show parameters ! Estimates=latent, tables=1:2:4 >> NGSA240r.Shw;
```

3. The final model (model 2) in the section applying partial-credit MFRMs to analyze scores from the task-generic polytomous rubric. See the main findings in Tables 13.8 and 13.9 and Fig. 13.7.

```
Datafile Fulldata Recode012Com.Dat;
Format rater 5 task 7 responses 9-12 / rater 5 task 7 responses 9-12 / rater 5 task 7 responses 9-12 ! Com(4);
Label << NGSA240r2.Nam;
Set update=yes, warning=no;
Model rater + task + com + rater*task + rater*com + rater*com*step;
Export parameters >> NGSA240r2.Prm;
Export reg >> NGSA240r2.Reg;
Export cov >> NGSA240r2.Cov;
Estimate ! Nodes=10, stderr=full;
Show parameters ! Estimates=latent, tables=1:2:4 >> NGSA240r2.Shw;
```
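As a complement to the ConQuest specifications, the following self-contained Python sketch simulates dichotomous scores under the three-facet model (student ability, item difficulty, rater severity). All parameter values and names here are hypothetical, chosen only to illustrate how a more severe rater depresses expected scores; this is an illustration of the model form, not the chapter's analysis.

```python
import math
import random

random.seed(0)

# Hypothetical facet parameters in logits: 5 students, 3 items, 2 raters.
theta = [-1.0, -0.5, 0.0, 0.5, 1.0]   # student ability
delta = [-0.5, 0.0, 0.5]              # item difficulty
alpha = [-0.3, 0.3]                   # rater severity (positive = more severe)

def p_correct(t, d, a):
    """Dichotomous MFRM: P(X = 1) = exp(t - d - a) / (1 + exp(t - d - a))."""
    z = t - d - a
    return math.exp(z) / (1.0 + math.exp(z))

# Simulate one scored response per (student, item, rater) combination.
data = [(n, i, j, 1 if random.random() < p_correct(theta[n], delta[i], alpha[j]) else 0)
        for n in range(5) for i in range(3) for j in range(2)]

# A more severe rater (higher alpha) awards 1s at a lower expected rate.
for j, a in enumerate(alpha):
    scores = [x for (_, _, r, x) in data if r == j]
    print(f"rater {j} (severity {a:+.1f}): mean observed score {sum(scores)/len(scores):.2f}")
```

Because severity enters the logit with a minus sign, the same student–item pair has a strictly lower success probability under the severe rater; MFRM software estimates these alpha terms from the data rather than assuming them.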
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
He, P., Zhai, X., Shin, N., Krajcik, J. (2023). Applying Rasch Measurement to Assess Knowledge-in-Use in Science Education. In: Liu, X., Boone, W.J. (eds) Advances in Applications of Rasch Measurement in Science Education. Contemporary Trends and Issues in Science Education, vol 57. Springer, Cham. https://doi.org/10.1007/978-3-031-28776-3_13
Print ISBN: 978-3-031-28775-6
Online ISBN: 978-3-031-28776-3