
Automated Assessment of Student Hand Drawings in Free-Response Items on the Particulate Nature of Matter


Abstract

Here, we describe the development and validation of an automatic assessment system that examines students’ hand-drawn visual representations in free-response items. The data were collected from 1,028 students in grades 2 through 11 in South Korea using two items from the Test About Particles in a Gas questionnaire (Novick & Nussbaum, 1981). Students’ free responses, which include hand drawings and writing, were coded along two dimensions: structural (particulate/continuous/other) and distributional (expanded/concentrated/other). Machine learning (ML) models were trained to assess the responses on the particulate nature of matter. To classify the hand drawings, a support vector machine operating on features from a pre-trained Inception-v3 model was trained, and its performance was evaluated. The assessment model yielded high machine-human agreement (MHA) (kappa = 0.732–0.926, accuracy = 0.820–0.942, precision = 0.817–0.941, recall = 0.820–0.942, F1 = 0.818–0.941, and area under the curve [AUC] = 0.906–0.990). Students’ written responses were tokenized, and a dictionary of scientific semantic scores was prepared. The final model for the overall assessment of both drawing and writing yielded high MHA (kappa = 0.800–0.881, accuracy = 0.859–0.956, precision = 0.865–0.957, recall = 0.859–0.956, F1 = 0.859–0.956, and AUC = 0.944–0.995), varying with the final classifier used. Performance also varied somewhat across school levels. This study suggests that artificial intelligence can be used to automate assessments of students’ representations of scientific concepts in free-response items, particularly those drawn in a pencil-and-paper format.
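The pipeline described in the abstract feeds each scanned drawing through a pre-trained Inception-v3 network used as a fixed feature extractor and trains a support vector machine on the resulting features, then reports machine-human agreement. The sketch below illustrates that idea only; it is not the authors’ code, and the input size, RBF kernel, dummy data, and 80/20 split are assumptions made for the example.

```python
# Minimal Inception-v3 -> SVM sketch; hyperparameters and data are illustrative.
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input

# Pre-trained Inception-v3 without its classification head; global average
# pooling turns each image into a 2048-dimensional feature vector.
extractor = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

# Stand-in data: random arrays in place of scanned drawings, with labels
# such as 0 = particulate, 1 = continuous, 2 = other (structural dimension).
rng = np.random.default_rng(42)
images = rng.uniform(0, 255, size=(60, 299, 299, 3)).astype("float32")
labels = rng.integers(0, 3, size=60)

features = extractor.predict(preprocess_input(images), verbose=0)
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=42)

# SVM as the classifying layer; probability=True enables the AUC below.
clf = SVC(kernel="rbf", probability=True, random_state=42)
clf.fit(X_train, y_train)

pred, proba = clf.predict(X_test), clf.predict_proba(X_test)
print("kappa    ", cohen_kappa_score(y_test, pred))
print("accuracy ", accuracy_score(y_test, pred))
print("precision", precision_score(y_test, pred, average="weighted"))
print("recall   ", recall_score(y_test, pred, average="weighted"))
print("F1       ", f1_score(y_test, pred, average="weighted"))
print("AUC      ", roc_auc_score(y_test, proba, multi_class="ovr",
                                 average="weighted"))
```

For the written responses, the abstract says only that the text was tokenized and scored against a dictionary of scientific semantic scores. Assuming Korean morphological analysis with KoNLPy (Park & Cho, 2014, in the reference list), a toy version of such a lookup might read as follows; the dictionary entries and weights are invented for illustration.

```python
from konlpy.tag import Okt  # Korean morphological analyzer from KoNLPy

okt = Okt()

# Hypothetical semantic dictionary: morpheme -> weight indicating how strongly
# it signals a particulate, evenly distributed conception. Entries are invented.
SEMANTIC_SCORES = {
    "입자": 1.0,    # "particle"
    "알갱이": 1.0,  # "grain"
    "퍼지다": 0.8,  # "spread out"
    "뭉치다": -0.8, # "clump together"
}

def score_response(text: str) -> float:
    """Sum dictionary weights over the stemmed morphemes of a written response."""
    return sum(SEMANTIC_SCORES.get(tok, 0.0) for tok in okt.morphs(text, stem=True))
```

Per the abstract, the final model combines the drawing and writing evidence in a single classifier; the combination rule and the classifiers compared are detailed in the full article.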


Notes

  1. http://www.image-net.org/index (Accessed: April 1, 2021).

References

  • Adadan, E., Irving, K. E., & Trundle, K. C. (2009). Impacts of multi-representational instruction on high school students’ conceptual understandings of the particulate nature of matter. International Journal of Science Education, 31(13), 1743–1775.

  • Adadan, E. (2013). Using multiple representations to promote grade 11 students’ scientific understanding of the particle theory of matter. Research in Science Education, 43(3), 1079–1105.

  • Ayas, A., Özmen, H., & Çalik, M. (2010). Students’ conceptions of the particulate nature of matter at secondary and tertiary level. International Journal of Science and Mathematics Education, 8(1), 165–184.

  • Benson, D. L., Wittrock, M. C., & Baur, M. E. (1993). Students’ preconceptions of the nature of gases. Journal of Research in Science Teaching, 30(6), 587–597.

  • Braun, H. I., Bennett, R. E., Frye, D., & Soloway, E. (1990). Scoring constructed responses using expert systems. Journal of Educational Measurement, 27(2), 93–108.

  • Chang, H. Y., & Tzeng, S. F. (2018). Investigating Taiwanese students’ visualization competence of matter at the particulate level. International Journal of Science and Mathematics Education, 16(7), 1207–1226.

  • Delgado, R., & Tibau, X. A. (2019). Why Cohen’s Kappa should be avoided as performance measure in classification. PLoS ONE, 14(9), e0222916.

  • Gabel, D. L., Samuel, K. V., & Hunn, D. (1987). Understanding the particulate nature of matter. Journal of Chemical Education, 64(8), 695.

  • Gerard, L. F., Ryoo, K., McElhaney, K. W., Liu, O. L., Rafferty, A. N., & Linn, M. C. (2016). Automated guidance for student inquiry. Journal of Educational Psychology, 108(1), 60–81.

  • Ghali, R., Ouellet, S., & Frasson, C. (2016). LewiSpace: An exploratory study with a machine learning model in an educational game. Journal of Education and Training Studies, 4(1), 192–201.

  • Gillespie, R. J. (1997). The great ideas of chemistry. Journal of Chemical Education, 74(7), 862–863.

  • Harrison, A. G., & Treagust, D. F. (2002). The particulate nature of matter: Challenges in understanding the submicroscopic world. In J. K. Gilbert, O. De Jong, R. Justi, D. F. Treagust, & J. H. Van Driel (Eds.), Chemical education: Towards research-based practice (pp. 189–212). Springer.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer.

  • Haudek, K. C., Prevost, L. B., Moscarella, R. A., Merrill, J., & Urban-Lurain, M. (2012). What are they thinking? Automated analysis of student writing about acid–base chemistry in introductory biology. CBE-Life Sciences Education, 11(3), 283–293.

  • Hogan, T. P., & Murphy, G. (2007). Recommendations for preparing and scoring constructed-response items: What the experts say. Applied Measurement in Education, 20(4), 427–441.

  • Hosmer, D. W., Jr., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (3rd ed.). Wiley.

  • Hurd, P. D. (1998). Scientific literacy: New minds for a changing world. Science Education, 82(3), 407–416.

  • Jescovitch, L. N., Scott, E. E., Cerchiara, J. A., Merrill, J., Urban-Lurain, M., Doherty, J. H., & Haudek, K. C. (2021). Comparison of machine learning performance using analytic and holistic coding approaches across constructed response assessments aligned to a science learning progression. Journal of Science Education and Technology, 30(2), 150–167.

  • Jin, X., Chi, J., Peng, S., Tian, Y., Ye, C., & Li, X. (2016). Deep image aesthetics classification using inception modules and fine-tuning connected layer. In 2016 8th International Conference on Wireless Communications & Signal Processing (WCSP) (pp. 1–6). IEEE.

  • Karacop, A., & Doymus, K. (2013). Effects of jigsaw cooperative learning and animation techniques on students’ understanding of chemical bonding and their conceptions of the particulate nature of matter. Journal of Science Education and Technology, 22(2), 186–203.

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097–1105.

  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.

  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

  • Lee, G.-G., & Ha, M. (2020). The present and future of AI-based automated evaluation: A literature review on descriptive assessment and other side. Journal of Educational Technology, 36(2), 353–382. (in Korean)

  • Liu, O. L., Rios, J. A., Heilman, M., Gerard, L., & Linn, M. C. (2016). Validation of automated scoring of science assessments. Journal of Research in Science Teaching, 53(2), 215–233.

  • Liu, X., & Lesniak, K. M. (2005). Students’ progression of understanding the matter concept from elementary to high school. Science Education, 89(3), 433–450.

  • Luckin, R., Holmes, W., Griffiths, M., & Forcier, L. B. (2016). Intelligence unleashed: An argument for AI in education. Pearson Education.

  • Maestrales, S., Zhai, X., Touitou, I., Baker, Q., Schneider, B., & Krajcik, J. (2021). Using machine learning to score multi-dimensional assessments of chemistry and physics. Journal of Science Education and Technology, 30(2), 239–254.

  • National Research Council [NRC]. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. National Academies Press.

  • Nehm, R. H., & Ha, M. (2011). Item feature effects in evolution assessment. Journal of Research in Science Teaching, 48(3), 237–256.

  • NGSS Lead States. (2013). Next generation science standards: For states, by states. National Academies Press.

  • Novick, S., & Nussbaum, J. (1981). Pupils’ understanding of the particulate nature of matter: A cross-age study. Science Education, 65(2), 187–196.

  • Nyachwaya, J. M., Mohamed, A.-R., Roehrig, G. H., Wood, N. B., Kern, A. L., & Schneider, J. L. (2011). The development of an open-ended drawing tool: An alternative diagnostic tool for assessing students’ understanding of the particulate nature of matter. Chemistry Education Research and Practice, 12(2), 121–132.

  • Opfer, J. E., Nehm, R. H., & Ha, M. (2012). Cognitive foundations for science assessment design: Knowing what students know about evolution. Journal of Research in Science Teaching, 49(6), 744–777.

  • Özmen, H. (2011). Effect of animation enhanced conceptual change texts on 6th grade students’ understanding of the particulate nature of matter and transformation during phase changes. Computers & Education, 57(1), 1114–1126.

  • Park, E. L., & Cho, S. (2014). KoNLPy: Korean natural language processing in Python. In Proceedings of the 26th Annual Conference on Human & Cognitive Language Technology, Chuncheon, Korea. (in Korean)

  • Pei, B., Xing, W., & Lee, H. S. (2019). Using automatic image processing to analyze visual artifacts created by students in scientific argumentation. British Journal of Educational Technology, 50(6), 3391–3404.

  • Russell, S., & Norvig, P. (2020). Artificial intelligence: A modern approach (4th ed.). Pearson Education.

  • Ryan, S. A., & Stieff, M. (2019). Drawing for assessing learning outcomes in chemistry. Journal of Chemical Education, 96(9), 1813–1820.

  • Shin, D., & Shim, J. (2021). A systematic review on data mining for mathematics and science education. International Journal of Science and Mathematics Education, 19, 639–659.

  • Smith, A., Leeman-Munk, S., Shelton, A., Mott, B., Wiebe, E., & Lester, J. (2019). A multi-modal assessment framework for integrating student writing and drawing in elementary science learning. IEEE Transactions on Learning Technologies, 12(1), 3–15.

  • Smith, C. L., Wiser, M., Anderson, C. W., & Krajcik, J. (2006). Implications of research on children’s learning for standards and assessment: A proposed learning progression for matter and the atomic-molecular theory. Measurement: Interdisciplinary Research & Perspective, 4(1–2), 1–98.

  • Sripathi, K. N., Moscarella, R. A., Yoho, R., You, H. S., Urban-Lurain, M., Merrill, J., & Haudek, K. (2019). Mixed student ideas about mechanisms of human weight loss. CBE-Life Sciences Education, 18(ar37), 1–17.

  • Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2818–2826).

  • Taber, K. S., & García-Franco, A. (2010). Learning processes in chemistry: Drawing upon cognitive resources to learn about the particulate structure of matter. The Journal of the Learning Sciences, 19(1), 99–142.

  • Treagust, D. F., Chandrasegaran, A. L., Crowley, J., Yung, B. H., Cheong, I. P. A., & Othman, J. (2010). Evaluating students’ understanding of kinetic particle theory concepts relating to the states of matter, changes of state and diffusion: A cross-national study. International Journal of Science and Mathematics Education, 8(1), 141–164.

  • Treagust, D. F., Chandrasegaran, A. L., Zain, A. N., Ong, E. T., Karpudewan, M., & Halim, L. (2011). Evaluation of an intervention instructional program to facilitate understanding of basic particle concepts among students enrolled in several levels of study. Chemistry Education Research and Practice, 12(2), 251–261.

  • Yarroch, W. L. (1985). Student understanding of chemical equation balancing. Journal of Research in Science Teaching, 22(5), 449–459.

  • Yilmaz, A., & Alp, E. (2006). Students’ understanding of matter: The effect of reasoning ability and grade level. Chemistry Education Research and Practice, 7(1), 22–31.

  • Zhai, X., He, P., & Krajcik, J. (2022). Applying machine learning to automatically assess scientific models. Journal of Research in Science Teaching. https://doi.org/10.1002/tea.21773

  • Zhai, X., Krajcik, J., & Pellegrino, J. W. (2021). On the validity of machine learning-based next generation science assessments: A validity inferential network. Journal of Science Education and Technology. https://doi.org/10.1007/s10956-020-09879-9

  • Zhai, X., Shi, L., & Nehm, R. H. (2020a). A meta-analysis of machine learning-based science assessments: Factors impacting machine-human score agreements. Journal of Science Education and Technology. https://doi.org/10.1007/s10956-020-09875-z

  • Zhai, X., Yin, Y., Pellegrino, J. W., Haudek, K. C., & Shi, L. (2020b). Applying machine learning in science assessment: A systematic review. Studies in Science Education, 56(1), 111–151.

  • Zhu, M., Lee, H. S., Wang, T., Liu, O. L., Belur, V., & Pallant, A. (2017). Investigating the impact of automated feedback on students’ scientific argumentation. International Journal of Science Education, 39(12), 1648–1668.

  • Zhu, M., Liu, O. L., & Lee, H. S. (2020). The effect of automated feedback on revision behavior and learning gains in formative assessment of scientific argument writing. Computers & Education, 143, 103668.


Acknowledgements

This research was supported by the 2020 Student-Directed Education Regular Program of Seoul National University.

Author information


Corresponding author

Correspondence to Hun-Gi Hong.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 6 Number (%) of student representations of matter for item 1 (highest ratio at each school level in bold)
Table 7 Number (%) of student representations of matter for item 2 (highest ratio at each school level in bold)
Table 8 Confusion matrix of the best classifying layer (SVM) for hand-drawn responses in the test dataset (rows: actual; columns: predicted; n = 206)
Table 9 Confusion matrix of the best classifying layers for integrated hand drawings and writing within an item in the test dataset (rows: actual; columns: predicted; n = 206)
Fig. 7 An example of a misclassified elementary school student response to item 1

Fig. 8 An example of a misclassified elementary school student response to item 2

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.



Cite this article

Lee, J., Lee, G.-G., & Hong, H.-G. Automated Assessment of Student Hand Drawings in Free-Response Items on the Particulate Nature of Matter. J Sci Educ Technol 32, 549–566 (2023). https://doi.org/10.1007/s10956-023-10042-3

