
Automatic Task Requirements Writing Evaluation via Machine Reading Comprehension

  • Conference paper

Artificial Intelligence in Education (AIED 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12748)

Abstract

Task requirements (TR) writing is an important question type in the Key English Test and the Preliminary English Test. A TR writing question may include multiple requirements, and a high-quality essay must respond to each requirement thoroughly and accurately. However, limited teacher resources prevent students from receiving detailed grading instantly. The majority of existing automatic essay scoring systems give a holistic score but rarely provide reasons to support it. In this paper, we propose an end-to-end framework based on machine reading comprehension (MRC) to address this problem to some extent. The framework not only detects whether an essay responds to a requirement question, but also clearly marks where the essay answers the question. Our framework consists of three modules: a question normalization module, an ELECTRA-based MRC module, and a response locating module. We extensively explore state-of-the-art MRC methods. Our approach achieves an accuracy of 0.93 and an F1 score of 0.85 on a real-world educational dataset. To encourage reproducible results, we make our code publicly available at https://github.com/aied2021TRMRC/AIED_2021_TRMRC_code.
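To illustrate the core idea behind the MRC and response-locating modules, the sketch below shows how an extractive MRC model typically turns per-token start/end logits into either a located answer span or a "no response" decision (the SQuAD 2.0-style null-answer convention, where position 0 scores the no-answer option). This is an illustrative sketch only; the function and parameter names are hypothetical and not taken from the authors' released code.

```python
def best_span(start_logits, end_logits, max_len=30, null_threshold=0.0):
    """Return (start, end) token indices of the highest-scoring answer span,
    or None if the no-answer score (position 0) wins.

    start_logits/end_logits: per-token scores from an extractive MRC head,
    e.g. an ELECTRA model fine-tuned for question answering.
    """
    # SQuAD 2.0 convention: logits at position 0 score the "no answer" option.
    null_score = start_logits[0] + end_logits[0]

    best_pair, best_score = None, float("-inf")
    n = len(start_logits)
    for s in range(1, n):
        # Only consider spans up to max_len tokens long.
        for e in range(s, min(s + max_len, n)):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_pair, best_score = (s, e), score

    # If the null score (plus a tunable threshold) beats every span,
    # report that the essay does not respond to this requirement.
    if best_pair is None or null_score + null_threshold > best_score:
        return None
    return best_pair
```

For example, with toy logits that peak at tokens 2 (start) and 3 (end), `best_span([0.1, 0.0, 5.0, 0.0, 0.0], [0.1, 0.0, 0.0, 5.0, 0.0])` locates the span `(2, 3)`, while logits dominated by position 0 yield `None`, i.e. the requirement was not answered.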


Notes

  1. https://www.cambridgeenglish.org/exams-and-tests/key.

  2. https://www.cambridgeenglish.org/exams-and-tests/preliminary.

  3. https://huggingface.co.


Acknowledgment

This work was supported in part by National Key R&D Program of China, under Grant No. 2020AAA0104500 and in part by Beijing Nova Program (Z201100006820068) from Beijing Municipal Science & Technology Commission.

Author information

Corresponding author: Wenbiao Ding.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Xu, S., Xu, G., Jia, P., Ding, W., Wu, Z., Liu, Z. (2021). Automatic Task Requirements Writing Evaluation via Machine Reading Comprehension. In: Roll, I., McNamara, D., Sosnovsky, S., Luckin, R., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2021. Lecture Notes in Computer Science, vol 12748. Springer, Cham. https://doi.org/10.1007/978-3-030-78292-4_36


  • DOI: https://doi.org/10.1007/978-3-030-78292-4_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78291-7

  • Online ISBN: 978-3-030-78292-4

  • eBook Packages: Computer Science, Computer Science (R0)
