Assessing Pre-Service Science Teachers’ Scientific Reasoning Competencies

  • Moritz Krell
  • Christine Redman
  • Sabrina Mathesius
  • Dirk Krüger
  • Jan van Driel


Scientific reasoning competencies are highlighted in science education policy papers and standards documents in various countries around the world, and pre-service science teachers are expected to develop them during teacher training as part of their professional competencies. In order to monitor the development of pre-service science teachers’ scientific reasoning competencies during their course of studies, and to enable evidence-based improvements of teacher training, instruments for the assessment of scientific reasoning competencies are needed. However, studies suggest that the validity of most currently available instruments for assessing scientific reasoning competencies is questionable. This study presents an English translation of a previously developed German multiple-choice instrument for assessing pre-service science teachers’ scientific reasoning competencies. A sample of N = 105 Australian pre-service science teachers participated in this study by completing the translated instrument. Quantitative (differential item functioning, Mantel–Haenszel statistic) and qualitative (think-aloud protocols) analyses provide validity evidence for the translated instrument. Furthermore, interpreting the data as an indicator of the participating Australian pre-service science teachers’ scientific reasoning competencies suggests that scientific reasoning needs a more explicit emphasis in Australian science teacher training.
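The Mantel–Haenszel statistic mentioned above tests whether an item functions differently for two groups (here, German and Australian test-takers) after matching examinees on total score. As a minimal illustration, the following Python sketch estimates the Mantel–Haenszel common odds ratio and the ETS delta for a single item from hypothetical, made-up count data; it is not the authors' analysis, which was run on the actual response data.

```python
import math

def mantel_haenszel_dif(strata):
    """Estimate the Mantel-Haenszel common odds ratio for one item.

    Each stratum (examinees matched on total test score) is a tuple
    (A, B, C, D): reference-group correct, reference-group incorrect,
    focal-group correct, focal-group incorrect counts.
    """
    # Common odds ratio: sum(A*D/N) / sum(B*C/N) over score strata.
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    alpha = num / den
    # ETS delta scale: negative values flag DIF favoring the reference group.
    delta = -2.35 * math.log(alpha)
    return alpha, delta

# Hypothetical counts for two score strata (illustrative only):
strata = [(10, 5, 8, 7), (20, 10, 15, 15)]
alpha, delta = mantel_haenszel_dif(strata)
print(f"alpha_MH = {alpha:.2f}, delta_MH = {delta:.2f}")
```

In the common ETS classification, items with |delta| below 1 are treated as showing negligible DIF, so comparable score interpretations across the two language versions require most items to fall in that range.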


Scientific reasoning competencies · Pre-service science teachers · Assessment · Test equivalence · Differential item functioning



The authors thank the German Federal Ministry of Education and Research for funding the projects Ko-WADiS/ValiDiS (grant numbers 01PK11004A/01PK15004A) and the Center for International Cooperation of Freie Universität Berlin for funding the research in Melbourne.



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Freie Universität Berlin, Berlin, Germany
  2. Melbourne Graduate School of Education, The University of Melbourne, Carlton, Australia
