Using machine learning to classify reviewer comments in research article drafts to enable students to focus on global revision

Abstract

Reviewer comments on research article drafts, such as journal papers or dissertations, guide students through the revision process and help them improve the quality of their articles. Our goal is to make these comments more meaningful to the students’ revision process. Revision involves implicit cognitive processes, and ICT has the potential to make such processes explicit. Previous research into the cognitive processes involved in revision has shown that novices focus on local, sentence-level revision, while expert writers focus on global revision of ideas or restructuring of arguments. For better-quality writing, students should focus more on global revision. Reviewer comments can trigger either more meaningful global revision (content-related comments) or local revision (non-content-related comments). In this paper, a machine learning algorithm was applied to classify the comments on academic drafts in our laboratory as either content-related or non-content-related. Because reviewer comments on academic article drafts are usually short, this research applied a Support Vector Machine (SVM), one of the most common machine learning algorithms for short-text classification. Performance was evaluated using accuracy, precision, and recall for the non-content-related comments. Using cross-validation, the highest scores achieved were 86% for accuracy, 89% for recall, and 89% for precision. The results demonstrate the success of the automatic classification, which can be applied to filter out non-content-related comments so that students focus first on revising in response to the content-related comments. In this way, students can increase their awareness of the importance of global revision.
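The evaluation described in the abstract (cross-validation scored by accuracy, precision, and recall for the non-content-related class) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `k_fold_indices` and `evaluate`, the label strings, the toy labels, and the fold count are hypothetical and are not taken from the paper, and the SVM training step itself is left out.

```python
from typing import Iterator, List, Sequence, Tuple


def k_fold_indices(n_items: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous folds.

    The fold count k is an assumption; the abstract does not state it.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


def evaluate(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall), treating `positive` as the target class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical gold labels and classifier predictions (not the paper's data).
gold = ["non-content", "content", "non-content", "non-content",
        "content", "content", "non-content", "content"]
pred = ["non-content", "content", "content", "non-content",
        "non-content", "content", "non-content", "content"]

accuracy, precision, recall = evaluate(gold, pred, positive="non-content")
folds = list(k_fold_indices(len(gold), k=4))
```

In a full pipeline, a classifier would be trained on each fold's training indices and `evaluate` would be called on the predictions for the held-out indices. Scoring the non-content-related class as the positive class mirrors the abstract, since that is the class the filter is meant to remove.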

Keywords

Reviewer comment classification; Short text classification; Support vector machine; Academic article revision

Notes

Acknowledgements

This work was supported by the Japan Society for the Promotion of Science (KAKENHI) Grant Number 17K00479.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Information Science, Japan Advanced Institute of Science and Technology, Nomi City, Japan
  2. Research Center for Advanced Computing Infrastructure, Japan Advanced Institute of Science and Technology, Nomi City, Japan
