Abstract
Formative assessments provide valuable data for teachers to make instructional decisions and help students actively manage their progress and learning. Multiple-choice questions (MCQs) and free-text open-ended questions are typically employed as formative assessments. While MCQs have the benefit of easy grading and straightforward visualization of student answers, they are limited in revealing diverse student ideas and reasoning beyond the given options. Open-ended tasks and free-text submissions, on the other hand, can elicit students’ perspectives more comprehensively, but analyzing such responses requires laborious work from instructors. In this work, we explore the use of mixed-methods formative assessments in a college-level CS class, in which we assign MCQs and ask students to explain their answers. We propose a clustering pipeline that categorizes students’ free-text explanations by leveraging the metadata the original MCQs provide. We find that using students’ MCQ choices to resolve co-reference in their explanations and adding those choices as features significantly improve clustering performance. Moreover, our work demonstrates that providing structure in the data collection process improves the clustering of free-text responses without changes to the clustering algorithm itself.
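The full pipeline is described in the paper itself; as a minimal sketch of the idea summarized above, one might embed each explanation with Sentence-BERT, prepend the text of the chosen option as a crude stand-in for neural co-reference resolution, append the choice as a one-hot feature, and then cluster. The libraries (sentence-transformers, scikit-learn), the model name, and the example data below are assumptions for illustration, not the authors' implementation.

    # Illustrative sketch only; data, option texts, and model choice are hypothetical.
    import numpy as np
    from sentence_transformers import SentenceTransformer  # SBERT embeddings
    from sklearn.cluster import KMeans

    # Each record: the MCQ option a student picked and their free-text explanation.
    responses = [
        {"choice": "B", "explanation": "It fails because the loop never terminates."},
        {"choice": "B", "explanation": "The condition is never false, so it runs forever."},
        {"choice": "C", "explanation": "It prints the wrong value on the last iteration."},
    ]
    # Text of each option, taken from the original MCQ (assumed here).
    option_text = {"B": "the while loop", "C": "the print statement"}

    def resolve_with_choice(rec):
        # Crude co-reference step: prepend the chosen option's text so pronouns
        # such as "it" in the explanation have an explicit referent.
        return f"{option_text[rec['choice']]}: {rec['explanation']}"

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode([resolve_with_choice(r) for r in responses])

    # Add the student's MCQ choice as a one-hot feature next to the text embedding.
    choices = sorted(option_text)
    one_hot = np.array([[1.0 if r["choice"] == c else 0.0 for c in choices]
                        for r in responses])
    features = np.hstack([embeddings, one_hot])

    # Cluster the combined representation; the number of clusters is illustrative.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
    print(labels)

Running this groups explanations that share both a chosen option and similar wording, which is the intuition behind using the MCQ metadata to structure the free-text responses.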
Cite this paper
Chen, X., Wang, X. (2022). Scaling Mixed-Methods Formative Assessments (mixFA) in Classrooms: A Clustering Pipeline to Identify Student Knowledge. In: Rodrigo, M.M., Matsuda, N., Cristea, A.I., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2022. Lecture Notes in Computer Science, vol 13355. Springer, Cham. https://doi.org/10.1007/978-3-031-11644-5_35