Combining Machine Learning and Qualitative Methods to Elaborate Students’ Ideas About the Generality of their Model-Based Explanations

Journal of Science Education and Technology

A Correction to this article was published on 20 November 2020

Abstract

Assessing students’ participation in science practices presents several challenges, especially when aiming to differentiate meaningful (vs. rote) forms of participation. In this study, we sought to use machine learning (ML) for a novel purpose in science assessment: developing a construct map for students’ consideration of generality, a key epistemic understanding that undergirds meaningful participation in knowledge-building practices. We report on our efforts to assess the nature of 845 students’ ideas about the generality of their model-based explanations through the combination of an embedded written assessment and a novel data analytic approach that combines unsupervised and supervised machine learning methods with human-driven, interpretive coding. We demonstrate how unsupervised machine learning methods, when coupled with qualitative, interpretive coding, were used to revise our construct map for generality in a way that allowed for a more nuanced evaluation closely tied to empirical patterns in the data. We also explored the application of the construct map as a coding framework used as part of supervised machine learning methods, finding that it demonstrates some viability for use in future analyses. We discuss implications for the assessment of students’ meaningful participation in science practices in terms of their considerations of generality, the role of unsupervised methods in science assessment, and the combination of machine learning and human-driven approaches for understanding students’ complex involvement in science practices.

Notes

  1. The approach we used has been shown to lend greater stability to the k-means clustering solution, which can be influenced by the starting points for the algorithm: it uses the results from hierarchical clustering as the starting points for k-means (Bergman and El-Khouri 1999). In our technique, what is being clustered is the vector space representation of each document; in other words, the raw data for the clustering procedure is a row in a document-term table, with values ranging from zero to the maximum number of times any term appears across all documents. The default distance metric for the hierarchical clustering is cosine similarity. A minimal sketch of this two-step procedure appears after these notes.

  2. The R package we created and used (Rosenberg and Lishinski 2018) is available via GitHub for anyone seeking to carry out a similar two-step cluster analysis in R (R Core Team 2019); Sherin (2020) provides a very similar package in Python.

  3. LOOCV is equivalent to k-fold cross-validation when k is equal to the number of observations in the dataset (see the second sketch after these notes).
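
To make the two-step procedure in Note 1 concrete, the following is a minimal R sketch rather than the code used in the study (or in clustRcompaR): it assumes a hypothetical character vector `responses` of students’ written answers, uses quanteda only to build the raw-count document-term matrix, and treats the number of clusters and the linkage method as illustrative choices.

```r
library(quanteda)

# Raw-count document-term matrix: one row per written response.
dtm <- as.matrix(dfm(tokens(responses)))

# Step 1: hierarchical clustering of documents on cosine distance.
# (Assumes no all-zero rows; empty responses would need to be dropped first.)
normed   <- dtm / sqrt(rowSums(dtm^2))            # L2-normalise each row
cos_dist <- as.dist(1 - tcrossprod(normed))       # 1 - cosine similarity
hc       <- hclust(cos_dist, method = "average")  # linkage choice is illustrative

k      <- 4                                       # number of clusters (illustrative)
groups <- cutree(hc, k)

# Step 2: k-means seeded with the centroids of the hierarchical solution,
# which stabilises the result relative to random starting points.
centers <- do.call(rbind, lapply(split(as.data.frame(dtm), groups), colMeans))
km      <- kmeans(dtm, centers = centers, iter.max = 100)

table(hierarchical = groups, kmeans = km$cluster) # compare the two assignments
```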
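
Likewise, a minimal base-R sketch of the equivalence stated in Note 3: leave-one-out cross-validation is simply k-fold cross-validation with one fold per observation. The data frame `d`, its binary outcome `y`, and the logistic model are hypothetical and purely for illustration.

```r
# Hypothetical data: a data frame `d` with a 0/1 outcome column `y`
# and any number of predictor columns.
n     <- nrow(d)
preds <- numeric(n)

for (i in seq_len(n)) {                           # one fold per observation, so k = n
  fit      <- glm(y ~ ., data = d[-i, ], family = binomial)
  preds[i] <- predict(fit, newdata = d[i, , drop = FALSE], type = "response")
}

loocv_accuracy <- mean((preds > 0.5) == d$y)      # held-out accuracy across all n folds
```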

References

  • Allaire, J. J., & Chollet, F. (2019). keras: R interface to ‘Keras’. R package version 2.2.5.0. https://CRAN.R-project.org/package=keras

  • Anderson, D. J., Rowley, B., Stegenga, S., Irvin, P. S., & Rosenberg, J. M. (2020). Evaluating content-related validity evidence using a text-based, machine learning procedure. Educational Measurement: Issues and Practice. Advance online publication. https://doi.org/10.1111/emip.12314.

  • Beggrow, E. P., Ha, M., Nehm, R. H., Pearl, D., & Boone, W. J. (2014). Assessing scientific practices using machine-learning methods: how closely do they match clinical interview performance? Journal of Science Education and Technology, 23(1), 160–182.

  • Benoit, K., Chester, P., & Müller, S. (2019a). quanteda.classifiers: models for supervised text classification. R package version 0.1. http://github.com/quanteda/quanteda.svm

  • Benoit, K., Muhr, D., & Watanabe, K. (2019b). stopwords: multilingual stopword lists. R package version 1.0. https://CRAN.R-project.org/package=stopwords

  • Bergman, L. R., & El-Khouri, B. M. (1999). Studying individual patterns of development using I-states as objects analysis (ISOA). Biometrical Journal: Journal of Mathematical Methods in Biosciences, 41(6), 753–770.

  • Berland, L., & Crucet, K. (2016). Epistemological trade-offs: accounting for context when evaluating epistemological sophistication of student engagement in scientific practices. Science Education, 100(1), 5–29.

  • Berland, L. K., Schwarz, C. V., Krist, C., Kenyon, L., Lo, A. S., & Reiser, B. J. (2016). Epistemologies in practice: making scientific practices meaningful for students. Journal of Research in Science Teaching, 53(7), 1082–1112.

  • Bouchet-Valat, M. (2014). SnowballC: snowball stemmers based on the C libstemmer UTF-8 library. R package version 0.5.1.

  • Burrows, S., Gurevych, I., & Stein, B. (2015). The eras and trends of automatic short answer grading. International Journal of Artificial Intelligence in Education, 25(1), 60–117.

  • Chinn, C. A., & Malhotra, B. A. (2002). Epistemologically authentic inquiry in schools: a theoretical framework for evaluating inquiry tasks. Science Education, 86(2), 175–218.

  • Cohen, J. (1968). Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4), 213–220.

  • R Core Team. (2019). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. https://www.R-project.org

  • DeBarger, A. H., Penuel, W. R., & Harris, C. J. (2013). Designing NGSS assessments to evaluate the efficacy of curriculum interventions. Invitational Research Symposium on Science Assessment. Washington, DC: K-12 Center at ETS. Retrieved from http://www.k12center.org/rsc/pdf/debarger-penuel-harris.pdf

  • Dickes, A. C., Sengupta, P., Farris, A. V., & Basu, S. (2016). Development of mechanistic reasoning and multilevel explanations of ecology in third grade using agent-based models. Science Education, 100(4), 734–776.

  • Duncan, R. G., & Tseng, K. A. (2011). Designing project-based instruction to foster generative and mechanistic understandings in genetics. Science Education, 95(1), 21–56.

  • Ford, M. J. (2015). Educational implications of choosing “practice” to describe science in the next generation science standards. Science Education, 99(6), 1041–1048.

  • Ford, M. J., & Forman, E. A. (2006). Chapter 1: redefining disciplinary learning in classroom contexts. Review of Research in Education, 30(1), 1–32.

  • Fram, S. M. (2013). The constant comparative analysis method outside of grounded theory. The Qualitative Report, 18, 1.

  • Gerard, L. F., & Linn, M. C. (2016). Using automated scores of student essays to support teacher guidance in classroom inquiry. Journal of Science Teacher Education, 27(1), 111–129.

  • Giere, R. N. (1988). Explaining science: a cognitive approach. Chicago: University of Chicago Press.

  • Gobert, J. D., Sao Pedro, M., Raziuddin, J., & Baker, R. S. (2013). From log files to assessment metrics: measuring students’ science inquiry skills using educational data mining. The Journal of the Learning Sciences, 22(4), 521–563.

  • Gobert, J. D., Baker, R. S., & Wixon, M. B. (2015). Operationalizing and detecting disengagement within online science microworlds. Educational Psychologist, 50(1), 43–57.

  • Gotwals, A. W., & Songer, N. B. (2013). Validity evidence for learning progression-based assessment items that fuse core disciplinary ideas and science practices. Journal of Research in Science Teaching, 50(5), 597–626.

  • Greene, D., Hoffmann, A. L., & Stark, L. (2019). Better, nicer, clearer, fairer: a critical assessment of the movement for ethical artificial intelligence and machine learning. In Hawaii International Conference on System Sciences (HICSS), Maui, HI.

  • Harris, C. J., Krajcik, J. S., Pellegrino, J. W., & DeBarger, A. H. (2019). Designing knowledge-in-use assessments to promote deeper learning. Educational Measurement: Issues and Practice, 38(2), 53–67.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer.

  • Haudek, K. C., Osborne, J., & Wilson, C. D. (2019). Using automated analysis to assess middle school students’ competence with scientific argumentation. Paper presented at the annual meeting of the National Council on Measurement in Education (NCME), Toronto, Canada.

  • Helleputte, T. (2017). LiblineaR: linear predictive models based on the LIBLINEAR C/C++ library. R package version 2.10-18.

  • Hirschberg, J., & Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245), 261–266.

  • Inkinen, J., Klager, C., Juuti, K., Schneider, B., Salmela-Aro, K., Krajcik, J., & Lavonen, J. (2020). High school students’ situational engagement associated with scientific practices in designed science learning situations. Science Education, 104(4), 667–692.

  • Jiménez-Aleixandre, M. P., Bugallo Rodríguez, A., & Duschl, R. A. (2000). “Doing the lesson” or “doing science”: argument in high school genetics. Science Education, 84(6), 757–792.

  • Kelly, G. J. (2008). Inquiry, activity and epistemic practice. In R. A. Duschl & R. E. Grandy (Eds.), Teaching Scientific Inquiry (pp. 99–117). https://doi.org/10.1163/9789460911453_009.

  • Kolodner, J. L. (Ed.). (1993). Case-based learning. Dordrecht: Kluwer Academic Publishers.

  • Krajcik, J., McNeill, K. L., & Reiser, B. J. (2008). Learning-goals-driven design model: developing curriculum materials that align with national standards and incorporate project-based pedagogy. Science Education, 92(1), 1–32.

  • Krajcik, J., Reiser, B., Sutherland, L., & Fortus, D. (2011). IQWST: investigating and questioning our world through science and technology (middle school science curriculum materials). Greenwich: Sangari Active Science.

  • Krist, C. (2020). Examining how classroom communities developed practice-based epistemologies for science through analysis of longitudinal video data. Journal of Educational Psychology, 112(3), 420–443. https://doi.org/10.1037/edu0000417.

  • Kuhn, D. (2000). Metacognitive development. Current Directions in Psychological Science, 9(5), 178–181.

  • Landis, J. R., & Koch, G. G. (1977). An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics, 33(2), 363–374.

  • Laverty, J. T., Underwood, S. M., Matz, R. L., Posey, L. A., Carmel, J. H., Caballero, M. D., Fata-Hartley, C. L., Ebert-May, D., Jardeleza, S. E., & Cooper, M. M. (2016). Characterizing college science assessments: the three-dimensional learning assessment protocol. PLoS One, 11(9), e0162333.

  • NGSS Lead States. (2013). Next generation science standards: for states, by states. Washington, DC: National Academies Press.

  • Lehrer, R., & Schauble, L. (2006). Cultivating model-based reasoning in science education. In R. K. Sawyer (Ed.), The Cambridge handbook of the learning sciences (pp. 371–387). Cambridge University Press.

  • Lehrer, R., & Schauble, L. (2015). Developing scientific thinking. In L. S. Liben & U. Müller (Eds.), Cognitive processes. Handbook of child psychology and developmental science (Vol. 2, 7th ed., pp. 671–714). Hoboken, NJ: Wiley.

  • Lehrer, R., Schauble, L., & Petrosino, A. J. (2001). Reconsidering the role of experiment in science education. In Designing for science: implications from everyday, classroom, and professional settings (pp. 251–278).

  • Manz, E. (2012). Understanding the codevelopment of modeling practice and ecological knowledge. Science Education, 96(6), 1071–1105.

  • Manz, E. (2015). Representing student argumentation as functionally emergent from scientific activity. Review of Educational Research, 85(4), 553–590.

  • McNeill, K., Lizotte, D. J., Krajcik, J., & Marx, R. W. (2006). Supporting students’ construction of scientific explanations by fading scaffolds in instructional materials. The Journal of the Learning Sciences, 15(2), 153–191.

  • Morell, L., Collier, T., Black, P., & Wilson, M. (2017). A construct-modeling approach to develop a learning progression of how students understand the structure of matter. Journal of Research in Science Teaching, 54(8), 1024–1048.

  • National Research Council. (2012). A framework for K-12 science education: practices, crosscutting concepts, and core ideas. Washington, DC: National Academies Press.

  • National Research Council (2014). Developing assessments for the Next Generation Science Standards. Washington, DC: The National Academies Press. https://doi.org/10.17226/18409.

  • Nehm, R. H., Ha, M., & Mayfield, E. (2012). Transforming biology assessment with machine learning: automated scoring of written evolutionary explanations. Journal of Science Education and Technology, 21(1), 183–196.

  • Nelson, L. K. (2020). Computational grounded theory: a methodological framework. Sociological Methods & Research, 49(1), 3–42. https://doi.org/10.1177/0049124117729703.

  • Passmore, C., Schwarz, C. V., & Mankowski, J. (2017). Developing and using models. In C. V. Schwarz, C. Passmore, & B. J. Reiser (Eds.), Helping students make sense of the world using next generation science and engineering practices (pp. 109–135). Arlington, VA: NSTA Press.

  • Pei, B., Xing, W., & Lee, H. S. (2019). Using automatic image processing to analyze visual artifacts created by students in scientific argumentation. British Journal of Educational Technology, 50(6), 3391–3404.

  • Pellegrino, J. W. (2013). Proficiency in science: assessment challenges and opportunities. Science, 340(6130), 320–323.

  • Penuel, W. R., Turner, M. L., Jacobs, J. K., Van Horne, K., & Sumner, T. (2019). Developing tasks to assess phenomenon-based science learning: challenges and lessons learned from building proximal transfer tasks. Science Education, 103(6), 1367–1395.

  • Popper, K. R. (1959). The propensity interpretation of probability. The British Journal for the Philosophy of Science, 10(37), 25–42.

  • Reiser, B. J., Kim, J., Toyama, Y., & Draney, K. (2016). Multi-year growth in mechanistic reasoning across units in biology, chemistry, and physics. Paper presented at NARST, April 14, 2016.

  • Rosenberg, J. M., & Lishinski, A. (2018). clustRcompaR: easy interface for clustering a set of documents and exploring group-based patterns [R package]. https://github.com/alishinski/clustRcompaR

  • Ryu, S., & Sandoval, W. A. (2012). Improvements to elementary children’s epistemic understanding from sustained argumentation. Science Education, 96(3), 488–526.

  • Saldaña, J. (2016). The coding manual for qualitative researchers. Sage.

  • Sandoval, W. A. (2005). Understanding students’ practical epistemologies and their influence on learning through inquiry. Science Education, 89(4), 634–656.

  • Sandoval, W. A., & Millwood, K. A. (2005). The quality of students’ use of evidence in written scientific explanations. Cognition and Instruction, 23(1), 23–55.

  • Schwarz, C. V., Reiser, B. J., Davis, E. A., Kenyon, L. O., Archer, A., Fortus, D., & Krajcik, J. (2009). Developing a learning progression for scientific modeling: making scientific modeling accessible and meaningful for learners. Journal of Research in Science Teaching, 46(6), 632–654.

  • Schwarz, C. V., Passmore, C., & Reiser, B. J. (2017). Helping students make sense of the world using next generation science and engineering practices. Arlington, VA: NSTA Press.

  • Shaffer, D. W. (2017). Quantitative ethnography. Madison: Cathcart Press.

  • Sherin, B. (2013). A computational study of commonsense science: an exploration in the automated analysis of clinical interview data. The Journal of the Learning Sciences, 22(4), 600–638.

  • Shin, N., Stevens, S. Y., & Krajcik, J. (2010). Tracking student learning over time using construct-centred design. In Using analytical frameworks for classroom research (pp. 56–76). Routledge.

  • Tabak, I., & Reiser, B. J. (1999). Steering the course of dialogue in inquiry-based science. Paper presented at the annual meeting of the American Educational Research Association, Montreal, Canada.

  • Thagard, P. R. (1978). The best explanation: criteria for theory choice. Journal of Philosophy, 75(2), 76–92.

  • Wiley, J., Hastings, P., Blaum, D., Jaeger, A. J., Hughes, S., Wallace, P., Griffin, T. D., & Britt, M. A. (2017). Different approaches to assessing the quality of explanations following a multiple-document inquiry activity in science. International Journal of Artificial Intelligence in Education, 27(4), 758–790.

  • Wilson, M. (2004). Constructing measures: An item response modeling approach. London: Routledge.

  • Zangori, L., Forbes, C. T., & Schwarz, C. V. (2015). Exploring the effect of embedded scaffolding within curricular tasks on third-grade students’ model-based explanations about hydrologic cycling. Science & Education, 24(7–8), 957–981.

  • Zehner, F., Sälzer, C., & Goldhammer, F. (2016). Automatic coding of short text responses via clustering in educational assessment. Educational and Psychological Measurement, 76(2), 280–303.

  • Zhai, X., Yin, Y., Pellegrino, J. W., Haudek, K. C., & Shi, L. (2020). Applying machine learning in science assessment: a systematic review. Studies in Science Education, 56(1), 111–151.

Author information

Corresponding author

Correspondence to Joshua M. Rosenberg.

Ethics declarations

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee (Northwestern University #STU00034615 and Wright State University #FWA00002427) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Conflict of Interest

The authors declare that they have no conflicts of interest.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic Supplementary Material

ESM 1

(DOCX 17 kb)

Cite this article

Rosenberg, J.M., Krist, C. Combining Machine Learning and Qualitative Methods to Elaborate Students’ Ideas About the Generality of their Model-Based Explanations. J Sci Educ Technol 30, 255–267 (2021). https://doi.org/10.1007/s10956-020-09862-4
