Understanding Lexical Features for Chinese Essay Grading

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1042)


Essay grading is an important and difficult task in natural language processing. Most existing work focuses on grading non-native English essays, such as TOEFL essays. However, these methods do not transfer directly to Chinese essays, which require word segmentation and have different syntactic features. Because lexical features are important for essay grading, we study the expert evaluation standard and propose an interpretable lexical grading method for essays. We first study the vocabulary levels provided by experts and introduce a quantitative evaluation framework for lexical features. Based on these standards, we quantify a dataset of Chinese essays covering 12 school grades in primary and middle schools and propose a set of interpretable features. We then propose a Bi-LSTM network model for grading essays semantically; it accepts a sequence of word vectors as input and integrates an attention mechanism based on lexical richness. We evaluate our method on real datasets, and the experimental results show that it outperforms other methods on lexical grading of Chinese essays. In addition, our method gives interpretable results, which are helpful in practical applications.
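To make the idea of quantifying lexical features against expert vocabulary levels concrete, the following is a minimal sketch and not the paper's actual feature set, which the abstract does not specify. It assumes the essay has already been segmented into words and that a graded vocabulary is available as a word-to-level mapping; the function name `lexical_features`, the four-level scheme, and the toy vocabulary are all hypothetical.

```python
from collections import Counter


def lexical_features(tokens, level_of, levels=(1, 2, 3, 4)):
    """Compute simple lexical-richness features for one segmented essay.

    tokens   -- list of words (the essay after word segmentation)
    level_of -- dict mapping a word to its vocabulary level (assumed format)
    levels   -- the vocabulary levels to report (hypothetical 4-level scheme)
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    total = len(tokens)

    # Type-token ratio: number of distinct words divided by total words.
    features = {"type_token_ratio": len(set(tokens)) / total}

    # Share of tokens at each vocabulary level. Words missing from the
    # graded list map to None and are reported as "unlisted".
    counts = Counter(level_of.get(tok) for tok in tokens)
    for level in levels:
        features[f"level_{level}_share"] = counts.get(level, 0) / total
    features["unlisted_share"] = counts.get(None, 0) / total
    return features


# Toy graded vocabulary (illustrative levels, not taken from the paper).
toy_levels = {"我": 1, "喜欢": 1, "阅读": 2, "文学": 3}
essay = ["我", "喜欢", "阅读", "文学", "我", "喜欢", "小说", "阅读"]
feats = lexical_features(essay, toy_levels)
```

Each value in `feats` is a named proportion, so a grader can read directly how much of an essay's vocabulary falls at each level, which is one way such features can stay interpretable.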


Keywords: Essay grading, LSTM, Lexical richness, Interpretable



This work was supported by the National Key Research and Development Program of China under Grant No. 2018YFC0831401, the National Natural Science Foundation of China under Grant No. 91646119, the Major Project of NSF Shandong Province under Grant No. ZR2018ZB0420, and the Key Research and Development Program of Shandong Province under Grant No. 2017GGX10114. The scientific calculations in this paper were performed on the HPC Cloud Platform of Shandong University.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. School of Software, Ministry of Education, Shandong University, Jinan, China
  2. School of Computer Science and Technology, Ministry of Education, Shandong University, Jinan, China
  3. Engineering Research Center of Digital Media Technology, Ministry of Education, Shandong University, Jinan, China
