Abstract
Formative assessments are an important component of instruction and pedagogy, as they give students and teachers insight into how students are progressing in their learning and problem-solving tasks. Most formative assessments are currently coded and graded manually, which impedes the timely interventions that help students overcome difficulties. Automated evaluation of these assessments can facilitate more effective and timely interventions by teachers, allowing them to dynamically discern individual and class trends that they might otherwise miss. State-of-the-art BERT-based models dominate the NLP landscape but require large amounts of training data to attain sufficient classification accuracy and robustness. Unfortunately, educational data sets are often small and unbalanced, limiting any benefits that BERT-like approaches might provide. In this paper, we examine methods for balancing and augmenting training data consisting of students’ textual answers from formative assessments, and analyze their impact on the accuracy of BERT-based automated evaluations. Our empirical studies show that models trained with these techniques consistently outperform models trained on unbalanced and unaugmented data.
The assessment project described in this article was funded, in part, by the NSF Award # 2017000. The opinions expressed are those of the authors and do not represent views of NSF.
Notes
1. While there were 99 students in the study, not all students answered each question.
2. Here, Maj and Min refer to the number of available sentences from the majority and minority classes.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Cochran, K., Cohn, C., Hutchins, N., Biswas, G., Hastings, P. (2022). Improving Automated Evaluation of Formative Assessments with Text Data Augmentation. In: Rodrigo, M.M., Matsuda, N., Cristea, A.I., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2022. Lecture Notes in Computer Science, vol 13355. Springer, Cham. https://doi.org/10.1007/978-3-031-11644-5_32
DOI: https://doi.org/10.1007/978-3-031-11644-5_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11643-8
Online ISBN: 978-3-031-11644-5
eBook Packages: Computer Science (R0)