Explainability in Automatic Short Answer Grading

Schlippe, Tim; Stierstorfer, Quintus; Koppel, Maurice ten; Libbrecht, Paul

doi:10.1007/978-981-19-8040-4_5


Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 154)

Included in the following conference series:

International Conference on Artificial Intelligence in Education Technology


Abstract

Massive open online courses and other online study opportunities are providing easier access to education for more and more people around the world. To cope with the large number of exams to be assessed in these courses, AI-driven automatic short answer grading can recommend to teaching staff how many points to assign when evaluating free-text answers, leading to faster and fairer grading. But what would be the best way to work with the AI? In this paper, we investigate and evaluate different methods for explainability in automatic short answer grading. Our survey of over 70 professors, lecturers and teachers with grading experience showed that displaying the predicted points together with matches between the student answer and the model answer is rated better than the other tested explainable AI (XAI) methods with respect to trust, informative content, speed, consistency and fairness, fun, comprehensibility, applicability, use in exam preparation, and overall.
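To make the best-rated display concrete, the following is a minimal Python sketch of showing predicted points together with matches between a student answer and a model answer. It is not the authors' implementation: the function names, the bracket highlighting, and the use of exact, case-insensitive word overlap as the notion of a "match" are all assumptions made for illustration, and the predicted points are simply passed in rather than produced by a grading model.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"\w+", text.lower())


def highlight_matches(student_answer, model_answer):
    """Wrap every word of the student answer that also occurs
    in the model answer in square brackets (hypothetical format)."""
    model_tokens = set(tokenize(model_answer))
    # Split on non-word runs but keep them, so the text can be rebuilt.
    parts = re.split(r"(\W+)", student_answer)
    marked = [
        f"[{part}]" if part.lower() in model_tokens else part
        for part in parts
    ]
    return "".join(marked)


def match_coverage(student_answer, model_answer):
    """Fraction of distinct model-answer words found in the student answer."""
    model_tokens = set(tokenize(model_answer))
    if not model_tokens:
        return 0.0
    student_tokens = set(tokenize(student_answer))
    return len(model_tokens & student_tokens) / len(model_tokens)


def explain(predicted_points, max_points, student_answer, model_answer):
    """Build a short text explanation for the grader.
    predicted_points is assumed to come from some grading model."""
    marked = highlight_matches(student_answer, model_answer)
    coverage = match_coverage(student_answer, model_answer)
    return (
        f"Predicted points: {predicted_points}/{max_points}\n"
        f"Student answer: {marked}\n"
        f"Model answer words matched: {coverage:.0%}"
    )


if __name__ == "__main__":
    print(explain(2, 3,
                  "Plants use light to make sugar",
                  "Plants convert light into sugar"))
```

Exact word overlap is the simplest possible matching rule; a real system would likely also need to handle synonyms, inflection, and stop words, which this sketch deliberately leaves out.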


Notes

1. Also called sample answer or sample response in the literature.
2. https://docs.google.com/forms.


Author information

Authors and Affiliations

IU International University of Applied Sciences, Erfurt, Germany
Tim Schlippe, Quintus Stierstorfer, Maurice ten Koppel & Paul Libbrecht


Corresponding author

Correspondence to Tim Schlippe.

Editor information

Editors and Affiliations

Department of Curriculum and Instruction, The Education University of Hong Kong, Tai Po, Hong Kong
Eric C. K. Cheng
Swinburne University of Technology, Melbourne, VIC, Australia
Tianchong Wang
IU International University of Applied Sciences, Erfurt, Germany
Tim Schlippe
Department of Food Science & Technology, University of Patras, Agrinio, Greece
Grigorios N. Beligiannis


About this paper

Cite this paper

Schlippe, T., Stierstorfer, Q., Koppel, M.t., Libbrecht, P. (2023). Explainability in Automatic Short Answer Grading. In: Cheng, E.C.K., Wang, T., Schlippe, T., Beligiannis, G.N. (eds) Artificial Intelligence in Education Technologies: New Development and Innovative Practices. AIET 2022. Lecture Notes on Data Engineering and Communications Technologies, vol 154. Springer, Singapore. https://doi.org/10.1007/978-981-19-8040-4_5


DOI: https://doi.org/10.1007/978-981-19-8040-4_5
Published: 01 January 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8039-8
Online ISBN: 978-981-19-8040-4
eBook Packages: Engineering, Engineering (R0)
