Machine Learning-Enabled Automated Feedback: Supporting Students’ Revision of Scientific Arguments Based on Data Drawn from Simulation

Journal of Science Education and Technology

Abstract

A design study was conducted to test a machine learning (ML)-enabled automated feedback system developed to support students’ revision of scientific arguments using data from published sources and simulations. This paper focuses on three simulation-based scientific argumentation tasks called Trap, Aquifer, and Supply. These tasks were part of an online science curriculum module addressing groundwater systems for secondary school students. ML was used to develop automated scoring models for students’ argumentation texts as well as to explore emerging patterns between students’ simulation interactions and argumentation scores. The study occurred as we were developing the first version of simulation feedback to augment the existing argument feedback. We studied two cohorts of students who used argument only (AO) feedback (n = 164) versus argument and simulation (AS) feedback (n = 179). We investigated how AO and AS students interacted with simulations and wrote and revised their scientific arguments before and after receiving their respective feedback. Overall, the same percentage of students (49% in each condition) revised their arguments after feedback, and the revised arguments received significantly higher scores in both feedback conditions, p < 0.001. Significantly more AS students (36% across the three tasks) reran the simulations after feedback than AO students (5%), p < 0.001. AS students who reran the simulation increased their simulation scores for the Trap task, p < 0.001, and for the Aquifer task, p < 0.01. AO students, who did not receive simulation feedback but reran the simulations, increased their simulation scores only for the Trap task, p < 0.05. For the Trap and Aquifer tasks, students who increased their simulation scores were more likely to increase their argument scores in revision than those who did not increase their simulation scores or did not revisit the simulations at all after simulation feedback was provided. This pattern was not found for the Supply task. Based on these findings, we discuss strengths and weaknesses of the current automated feedback design, in particular its use of ML.
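The abstract does not describe the scoring engine itself, only that supervised ML models were trained to score students’ argumentation texts and that machine scores were used to drive feedback. As a rough, hypothetical illustration of that general approach (text features, a supervised classifier, and a machine-human agreement check), the Python sketch below trains an off-the-shelf scikit-learn classifier on a tiny invented set of responses; the example data, rubric levels, and model choices are assumptions for illustration only and do not reflect the study’s actual scoring system.

# Minimal sketch (not the authors' system): scoring short argumentation texts
# with a bag-of-words model and checking agreement with human rubric scores.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.pipeline import make_pipeline

# Hypothetical responses with invented human rubric scores
# (0 = no claim/evidence, 1 = claim only, 2 = claim supported by simulation data).
texts = [
    "the water level goes down",
    "the well runs dry because the aquifer is pumped faster than it recharges",
    "i think it changes",
    "pumping lowered the water table in the simulation so the supply decreased",
    "no change happened",
    "the simulation showed the contaminant reached the well through the permeable gravel layer",
]
human_scores = [1, 2, 0, 2, 0, 2]

# Word and bigram TF-IDF features feeding a multinomial logistic regression classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, human_scores)

# Machine-human agreement is commonly summarized with quadratic weighted kappa
# (computed here on the training texts purely for illustration).
machine_scores = model.predict(texts)
print(cohen_kappa_score(human_scores, machine_scores, weights="quadratic"))

A real scoring model would, of course, be trained on many responses per task and validated on held-out, human-scored data before its scores were used to trigger feedback to students.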



Funding

This work was supported by the National Science Foundation under grant 1220756 (Amy Pallant) and grant 1418019 (Hee-Sun Lee).

Author information

Corresponding author

Correspondence to Hee-Sun Lee.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lee, HS., Gweon, GH., Lord, T. et al. Machine Learning-Enabled Automated Feedback: Supporting Students’ Revision of Scientific Arguments Based on Data Drawn from Simulation. J Sci Educ Technol 30, 168–192 (2021). https://doi.org/10.1007/s10956-020-09889-7

