
Automatic Task Requirements Writing Evaluation via Machine Reading Comprehension

  • Conference paper

Artificial Intelligence in Education (AIED 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12748)

Abstract

Task requirements (TR) writing is an important question type in the Key English Test and the Preliminary English Test. A TR writing question may include multiple requirements, and a high-quality essay must respond to each requirement thoroughly and accurately. However, limited teacher resources prevent students from receiving detailed grading instantly. The majority of existing automatic essay scoring systems give a holistic score but rarely provide reasons to support it. In this paper, we propose an end-to-end framework based on machine reading comprehension (MRC) to address this problem to some extent. The framework not only detects whether an essay responds to a requirement question, but also clearly marks where the essay answers the question. Our framework consists of three modules: a question normalization module, an ELECTRA-based MRC module, and a response locating module. We extensively explore state-of-the-art MRC methods. Our approach achieves an accuracy of 0.93 and an F1 score of 0.85 on a real-world educational dataset. To encourage reproducible results, we make our code publicly available at https://github.com/aied2021TRMRC/AIED_2021_TRMRC_code.
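To illustrate the core idea behind the MRC and response-locating modules, the sketch below shows how an extractive MRC model typically turns per-token start/end logits into either a located answer span or a "no response" decision (the SQuAD 2.0-style null-answer convention, where position 0 scores the no-answer option). This is an illustrative sketch only; the function and parameter names are hypothetical and not taken from the authors' released code.

```python
def best_span(start_logits, end_logits, max_len=30, null_threshold=0.0):
    """Return (start, end) token indices of the highest-scoring answer span,
    or None if the no-answer score (position 0) wins.

    start_logits/end_logits: per-token scores from an extractive MRC head,
    e.g. an ELECTRA model fine-tuned for question answering.
    """
    # SQuAD 2.0 convention: logits at position 0 score the "no answer" option.
    null_score = start_logits[0] + end_logits[0]

    best_pair, best_score = None, float("-inf")
    n = len(start_logits)
    for s in range(1, n):
        # Only consider spans up to max_len tokens long.
        for e in range(s, min(s + max_len, n)):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_pair, best_score = (s, e), score

    # If the null score (plus a tunable threshold) beats every span,
    # report that the essay does not respond to this requirement.
    if best_pair is None or null_score + null_threshold > best_score:
        return None
    return best_pair
```

For example, with toy logits that peak at tokens 2 (start) and 3 (end), `best_span([0.1, 0.0, 5.0, 0.0, 0.0], [0.1, 0.0, 0.0, 5.0, 0.0])` locates the span `(2, 3)`, while logits dominated by position 0 yield `None`, i.e. the requirement was not answered.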


Notes

  1. https://www.cambridgeenglish.org/exams-and-tests/key.

  2. https://www.cambridgeenglish.org/exams-and-tests/preliminary.

  3. https://huggingface.co.


Acknowledgment

This work was supported in part by National Key R&D Program of China, under Grant No. 2020AAA0104500 and in part by Beijing Nova Program (Z201100006820068) from Beijing Municipal Science & Technology Commission.

Author information

Corresponding author: Wenbiao Ding.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Xu, S., Xu, G., Jia, P., Ding, W., Wu, Z., Liu, Z. (2021). Automatic Task Requirements Writing Evaluation via Machine Reading Comprehension. In: Roll, I., McNamara, D., Sosnovsky, S., Luckin, R., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2021. Lecture Notes in Computer Science, vol 12748. Springer, Cham. https://doi.org/10.1007/978-3-030-78292-4_36


  • DOI: https://doi.org/10.1007/978-3-030-78292-4_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78291-7

  • Online ISBN: 978-3-030-78292-4

  • eBook Packages: Computer Science, Computer Science (R0)
