Abstract
Growing evidence from recent curriculum documents and prior research suggests that reform-oriented science teaching practices promote students’ conceptual understanding, achievement, and motivation to learn, especially when students are actively engaged in constructing their ideas through scientific inquiry. However, it is difficult to identify the extent to which science teachers engage students in reform-oriented teaching practices (RTPs) in their classrooms. Accurately diagnosing the current status of science teachers’ implementation of RTPs requires a valid and reliable instrument; validity and reliability are the fundamental cornerstones of developing a robust measurement tool. This study was therefore motivated by the desire to address the limitations of existing statistical and psychometric analyses and to further examine the validation of the RTP survey instrument. The paper thus aims to calibrate the RTP items for science teachers using the Rasch model. The survey scale was adapted from the 2012 National Survey of Science and Mathematics Education (NSSME), in which a total of 3701 science teachers from 1403 schools across the USA participated. After the RTP items and persons were calibrated on the same scale, the RTP instrument represented the population of US science teachers well. Model-data fit, determined by Infit and Outfit statistics, fell within the accepted range (0.5–1.5), supporting the unidimensional structure of the RTPs. The ordered category thresholds and their probability curves showed that the five-point rating scale functioned well. These results support the use of the RTP measure from the 2012 NSSME in assessing teachers’ use of RTPs.
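As an illustration only (not code from the study, which analyzed a five-point rating scale), the Infit and Outfit mean-square statistics mentioned above can be sketched for the simpler dichotomous Rasch case; `fit_statistics` is a name introduced here for the example:

```python
def fit_statistics(responses, probabilities):
    """Infit and Outfit mean-squares for one item (dichotomous case).

    responses: observed 0/1 scores across persons
    probabilities: model-expected endorsement probabilities
    """
    variances = [p * (1.0 - p) for p in probabilities]
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probabilities)]
    # Outfit: unweighted mean of standardized squared residuals;
    # sensitive to unexpected responses far from a person's level.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    # Infit: information-weighted mean-square; sensitive to misfit
    # on responses close to a person's level.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit

# Responses that match the model expectation on average give
# mean-squares of 1.0; values within 0.5-1.5 are conventionally
# taken as productive for measurement.
print(fit_statistics([1, 0], [0.5, 0.5]))  # (1.0, 1.0)
```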
Notes
Theta represents the underlying ability of a particular person, on a scale typically ranging from −3 to 3, with 0.0 representing average ability.
Logit = \(\ln \frac{\varphi }{1 - \varphi }\), where \(\varphi\) represents the probability of correctly responding to an item.
Parameter invariance: The word parameter indicates population quantities, and the word invariance indicates that parameter values are identical in different population groups or across different measurement conditions (Rupp & Zumbo, 2006).
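As an illustrative sketch of the definitions in these notes (not code from the study), the logit transformation and the Rasch response probability it underlies can be written in Python; `rasch_probability` and `delta` (item difficulty) are names introduced here for the example:

```python
import math

def logit(phi: float) -> float:
    """Log-odds of probability phi, as defined in the note above."""
    return math.log(phi / (1.0 - phi))

def rasch_probability(theta: float, delta: float) -> float:
    """Rasch model: probability that a person of ability theta
    endorses an item of difficulty delta (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# A person of average ability (theta = 0) facing an item of average
# difficulty (delta = 0) has a 50% endorsement probability, i.e. a
# logit of 0; ability one logit above difficulty raises it to ~0.73.
p = rasch_probability(0.0, 0.0)
print(p, logit(p))  # 0.5 0.0
```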
References
Abell, S., Anderson, G., & Chezem, J. (2000). Science as argument and explanation: Exploring concepts of sound in third grade. In J. Minstrell & E. H. V. Zee (Eds.), Inquiry into inquiry learning and teaching in science (pp. 100–119). Washington, DC: American Association for the Advancement of Science.
American Association for the Advancement of Science (AAAS). (2009). Benchmarks for science literacy on-line. Retrieved from http://www.project2061.org/publications/bsl/online
Anderson, J., & Bobis, J. (2005). Reform-oriented teaching practices: A survey of primary school teachers. International Group for the Psychology of Mathematics Education, 2, 65–72.
Baghaei, P., & Amrahi, N. (2011). Validation of a multiple choice English vocabulary test with the Rasch model. Journal of Language Teaching and Research, 2, 1052–1060.
Banilower, E. R., Smith, P. S., Weiss, I. R., Malzahn, K. A., Campbell, K. M., & Weis, A. M. (2013). Report of the 2012 National Survey of Science and Mathematics Education. Chapel Hill, NC: Horizon Research.
Barak, M., & Shakhman, L. (2008). Reform-based science teaching: Teachers’ instructional practices and conceptions. Eurasia Journal of Mathematics, Science & Technology Education, 4(1), 11–20.
Bidwell, A. (2013, December). American students fall in international academic tests, Chinese lead the pack. Retrieved from http://www.usnews.com/news/articles/2013/12/03/american-students-fall-in-international-academic-tests-chinese-lead-the-pack
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: L. Erlbaum.
Boone, W. J., Staver, J. R., & Yale, M. S. (2014). Rasch analysis in the human sciences. Dordrecht: Springer.
Borko, H., Stecher, B. M., Alonzo, A., Moncure, S., & McClam, S. (2005). Artifact packages for measuring instructional practice: A pilot study. Educational Assessment, 10(2), 73–104.
Borko, H., Stecher, B., & Kuffner, K. (2007). Using artifacts to characterize reform-oriented instruction: The scoop notebook and rating guide. CSE technical report 707. National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
Brinthaupt, T. M., & Kang, M. (2014). Many-facet Rasch calibration: An example using the self-talk scale. Assessment, 21(2), 241–249.
Cobb, P., Wood, T., Yackel, E., Nicholls, J., Wheatley, G., Trigatti, B., & Perlwitz, M. (1991). Assessment of a problem-centered second-grade mathematics project. Journal for Research in Mathematics Education, 3–29.
Copur-Gencturk, Y., Hug, B., & Lubienski, S. T. (2014). The effects of a master’s program on teachers’ science instruction: Results from classroom observations, teacher reports, and student surveys. Journal of Research in Science Teaching, 51, 219–249.
Cronbach, L. J. (1980). Validity on parole: How can we go straight? New directions for testing and measurement: Measuring achievement over a decade. Paper presented at the proceedings of 1979 ETS invitational conference, San Francisco.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. New York: Psychology Press.
Hulin, C. L., Lissak, R. I., & Drasgow, F. (1982). Recovery of two-and three-parameter logistic item characteristic curves: A Monte Carlo study. Applied Psychological Measurement, 6, 249–260.
Ivanitskaya, L., Clark, D., Montgomery, G., & Primeau, R. (2002). Interdisciplinary learning: Process and outcomes. Innovative Higher Education, 27(2), 95–111.
Johnson, C. C., Kahle, J. B., & Fargo, J. (2007). Effective teaching results in increased science achievement for all students. Science Education, 91(3), 371–383.
Le, V. N., Lockwood, J., Stecher, B. M., Hamilton, L. S., & Martinez, J. F. (2009). A longitudinal investigation of the relationship between teachers’ self-reports of reform-oriented instruction and mathematics and science achievement. Educational Evaluation and Policy Analysis, 31(3), 200–220.
Linacre, J. M. (1999). Investigating rating scale category utility. Journal of Outcome Measurement, 3, 103–122.
Linacre, J. M. (2002). What do Infit and Outfit, mean-square and standardized mean? Rasch Measurement Transactions, 16(2), 878.
Liu, O. L., Lee, H. S., & Linn, M. C. (2010). An investigation of teacher impact on student inquiry science performance using a hierarchical linear model. Journal of Research in Science Teaching, 47, 807–819.
MacIsaac, D., & Falconer, K. (2002). Using RTOP to reform a secondary science teacher preparation program. American Association of Physics Teachers Announcer, 32(2), 130.
Manno, J. L. (2011). K-5 mentor teachers’ journeys toward reform-oriented science within a professional development school context (Doctoral dissertation). The Pennsylvania State University.
Mayer, D. (1999). Measuring instructional practice: Can policymakers trust survey data? Educational Evaluation and Policy Analysis, 21(1), 29–46.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: Macmillan.
Messick, S. (1995). Standards of validity and the validity of standards in performance assessment. Educational Measurement: Issues and Practice, 14(4), 5–8.
National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA: NCTM.
National Research Council. (1996). National science education standards. Washington, DC: National Academy Press.
National Research Council. (2007). Taking science to school: Learning and teaching science in grades K-8. R. A. Duschl, H. A. Schweingruber, & A. W. Shouse (Eds.), Committee on Science Learning, Kindergarten Through Eighth Grade; Board on Science Education, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
National Research Council. (2008). Assessing accomplished teaching: Advanced-level certification programs. M. D. Hakel, J. A. Koenig, & S. W. Elliott (Eds.), Committee on Evaluation of Teacher Certification by the National Board for Professional Teaching Standards; Board on Testing and Assessment, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
National Research Council. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. Washington, DC: The National Academies Press.
NGSS Lead States. (2013). Next generation science standards: For states, by states. Washington, DC: National Academies Press.
OECD. (2013). PISA 2012 assessment and analytical framework: Mathematics, reading, science, problem solving and financial literacy. New York: OECD Publishing.
Patton, M. Q. (1990). Qualitative evaluation and research methods (2nd ed.). Newbury Park, CA: Sage.
Rupp, A. A., & Zumbo, B. D. (2006). Understanding parameter invariance in unidimensional IRT models. Educational and Psychological Measurement, 66(1), 63–84.
Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and Mathematics, 102, 245–253.
Schroeder, C. M., Scott, T. P., Tolson, H., Huang, T. Y., & Lee, Y. H. (2007). A meta-analysis of national research: Effects of teaching strategies on student achievement in science in the United States. Journal of Research in Science Teaching, 44, 1436–1460.
Seatter, C. S. (2003). Constructivist science teaching: Intellectual and strategic teaching acts. Interchange, 34(1), 63–87.
Smith, E. V. (2001). Evidence for the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. Journal of Applied Measurement, 2(3), 281–311.
Speer, N. M., & Wagner, J. F. (2009). Knowledge needed by a teacher to provide analytic scaffolding during undergraduate mathematics classroom discussions. Journal for Research in Mathematics Education, 40(5), 530–562.
Spillane, J. P., & Zeuli, J. S. (1999). Reform and teaching: Exploring patterns of practice in the context of national and state mathematics reforms. Educational Evaluation & Policy Analysis, 21(1), 1–27.
Stecher, B., Borko, H., Kuffner, K., Wood, A., Arnold, S., Gilbert, M., & Dorman, E. (2005). Using classroom artifacts to measure instructional practices in middle school mathematics: A two-state field test (CSE technical report no. 662). Los Angeles, CA: University of California, National Center for Research on Evaluation, Standards and Student Testing (CRESST).
Stein, M., & Strutchens, M. (2001). Mathematical argumentation: Putting umph into classroom discussions. Mathematics Teaching in the Middle School, 7(2), 110–113.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, New Series, 103, 677–680.
Sussman, J., Beaujean, A. A., Worrell, F. C., & Watson, S. (2013). An analysis of cross racial identity scale scores using classical test theory and Rasch item response models. Measurement and Evaluation in Counseling and Development, 46, 136–153.
Tesio, L. (2003). Measuring behaviours and perceptions: Rasch analysis as a tool for rehabilitation research. Journal of Rehabilitation Medicine, 35, 105–115.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Weaver, D., Dick, T., Higgins, K., Marrongelle, K., Foreman, L., & Miller, N. (2005). OMLI classroom observation protocol. Portland, OR: RMC Research Corporation.
Wolfe, E. W., & Smith, E. V., Jr. (2007). Instrument development tools and activities for measure validation using Rasch models: Part II—Validation activities. Journal of Applied Measurement, 8, 204–234.
Wright, B. D., & Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch Measurement Transactions, 8(3), 370.
Zhu, W., Timm, G., & Ainsworth, B. (2001). Rasch calibration and optimal categorization of an instrument measuring women’s exercise perseverance and barriers. Research Quarterly for Exercise and Sport, 72, 104–116.
Acknowledgments
These data were collected by Horizon Research, Inc. under National Science Foundation Award Number DRL-1008228. Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the National Science Foundation or Horizon Research, Inc.
Cite this article
You, H. Rasch Validation of a Measure of Reform-Oriented Science Teaching Practices. J Sci Teacher Educ 27, 373–392 (2016). https://doi.org/10.1007/s10972-016-9466-3