Evaluating the Psychometric Qualities of the National Board for Professional Teaching Standards' Assessments: A Methodological Accounting

Jaeger, Richard M.

doi:10.1023/A:1008085128230

Published: June 1998

Volume 12, pages 189–210 (1998)
Journal of Personnel Evaluation in Education

Abstract

In 1991 the National Board for Professional Teaching Standards established a Technical Analysis Group (TAG) with responsibility for conducting research on the measurement quality of its innovative performance assessments of classroom teachers. The TAG's measurement research agenda focused on four principal areas of inquiry and development—(1) validation of the Board's assessments, (2) characterizing the reliability of the Board's assessments, (3) establishing standards of performance for awarding candidate teachers National Board Certification, and (4) investigation of the presence and degree of adverse impact and bias in the Board's assessments. Because the National Board's assessments differed materially from conventional tests that had been used in the past for assessing teachers' knowledge and skills (for example, the National Teacher Examinations), textbook approaches to evaluation of their measurement properties were largely inapplicable. New measurement methodology was thus required. This article contains a summary of the measurement strategies developed and employed by the TAG. Because investigations of the degree of adverse impact and bias in the National Board's assessments are described in another contribution to this journal issue, the article considers only the first three issues mentioned above. The article begins with a brief description of the structure of the National Board's assessments. A final section of the article identifies some remaining measurement dilemmas and provides suggestions for additional inquiry.

Author information

Authors and Affiliations

Center for Educational Research and Evaluation, University of North Carolina at Greensboro, Greensboro, NC, 27412
Richard M. Jaeger

About this article

Cite this article

Jaeger, R.M. Evaluating the Psychometric Qualities of the National Board for Professional Teaching Standards' Assessments: A Methodological Accounting. Journal of Personnel Evaluation in Education 12, 189–210 (1998). https://doi.org/10.1023/A:1008085128230

Issue Date: June 1998
DOI: https://doi.org/10.1023/A:1008085128230
