To What Extent Can Text Classification Help with Making Inferences About Students’ Understanding
In this paper we apply supervised machine learning algorithms to automatically classify the text of students’ reflective learning journals from an introductory Java programming module with the aim of identifying students who need help with their understanding of the topic they are reflecting on. Such a system could alert teaching staff to students who may need an intervention to support their learning.
Several classifier algorithms were validated on the training data set to find the best model in two situations: with equal cost for a positive or negative classification, and with cost-sensitive classification. Parameter tuning methods were used to identify the individual parameter values that maximise the performance of each algorithm. Precision, recall and F1-score, as well as confusion matrices, were used to understand the behaviour of each classifier and to choose the one with the best performance.
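The evaluation measures named above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the example labels, the convention that 1 means "needs help", and the cost values are all assumptions made for this sketch.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1, returning 0.0 where undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def total_cost(fp, fn, fp_cost=1.0, fn_cost=1.0):
    """Sum misclassification costs; equal costs by default."""
    return fp * fp_cost + fn * fn_cost


# Hypothetical labels: 1 = student needs help, 0 = student does not.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

# Equal-cost setting versus an assumed cost-sensitive setting in which
# missing a struggling student (false negative) is five times as costly.
equal_cost = total_cost(fp, fn)
weighted_cost = total_cost(fp, fn, fp_cost=1.0, fn_cost=5.0)
```

In the cost-sensitive setting, two classifiers with the same F1-score can differ in total cost, which is one reason to inspect the confusion matrix rather than a single summary score.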
The classifiers that obtained the best results in validation were then evaluated on a testing data set containing data not used for training.
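The separation between training and testing data can be sketched as follows. This is an illustration under stated assumptions: the `stratified_split` helper, the 20% test fraction, the seed and the class sizes are hypothetical, and the abstract does not say how the split was actually made.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into disjoint train and test lists.

    Each class contributes roughly `test_fraction` of its items to the
    test set, so an imbalanced class ratio is approximately preserved.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one item of every class in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced data: 10 journals flagged "needs help", 40 not.
labels = [1] * 10 + [0] * 40
train_idx, test_idx = stratified_split(labels)
```

Model selection and parameter tuning would use only `train_idx`; the items in `test_idx` are touched once, for the final evaluation.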
We believe that although the results could be improved with further work, our initial results show that machine learning could be applied to students’ reflective writing to assist staff in identifying those students who are struggling to understand the topic.