Valid and Reliable Science Content Assessments for Science Teachers


Science teachers’ content knowledge is an important influence on student learning, which highlights an ongoing need for programs designed to support teacher learning of science and for assessments of those programs. Valid and reliable assessments of teacher science knowledge are needed to measure this crucial variable directly. This paper describes multiple sources of validity evidence and reliability evidence (Cronbach’s alpha greater than 0.8) for physical, life, and earth/space science assessments, which are part of the Diagnostic Teacher Assessments of Mathematics and Science (DTAMS) project. Validity was strengthened by systematic synthesis of relevant documents, extensive use of external reviewers, and field tests with 900 teachers during the assessment development process. Subsequent results from 4,400 teachers, analyzed with Rasch IRT modeling techniques, offer construct and concurrent validity evidence.
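For readers unfamiliar with the reliability statistic named above, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from a respondent-by-item score matrix. It is not the authors' analysis code: the function name `cronbach_alpha` and the toy score matrices are hypothetical and chosen only for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per respondent,
    one column per item). Illustrative helper, not from the paper."""
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: three items that always agree (maximal consistency).
consistent = [[0, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1]]
# Hypothetical data: two items that are unrelated across respondents.
unrelated = [[1, 1], [1, 0], [0, 1], [0, 0]]

alpha_consistent = cronbach_alpha(consistent)
alpha_unrelated = cronbach_alpha(unrelated)
```

Under this formula, items that always agree yield an alpha of 1 and unrelated items yield an alpha near 0, so a reported value above 0.8 indicates that the items on an assessment vary together strongly.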

Author information

Corresponding author

Correspondence to Thomas R. Tretter.

About this article

Cite this article

Tretter, T.R., Brown, S.L., Bush, W.S. et al. Valid and Reliable Science Content Assessments for Science Teachers. J Sci Teacher Educ 24, 269–295 (2013).
