Abstract
The vast majority of measures have, at their core, a purpose of personal and social change. If test developers and users want measures to have personal and social consequences and impact, then it is critical to consider the consequences and side effects of measurement in the validation process itself. The consequential basis of test interpretation and use, as introduced in Messick’s (1989) progressive matrix model of unified validity theory, has been misunderstood by many measurement experts, test developers, researchers, and practitioners. The purposes of this paper were to (a) review Messick’s unified view of validity and clarify his consequential basis of test interpretation and use, (b) discuss the kinds of questions evoked by value implications and social consequences and their role in construct validity and score meaning, (c) present a reframing of Messick’s model and a new model of unified validity and validation, (d) bring the concept of multilevel measures under the same validation umbrella as individual differences measures, and (e) offer some thoughts and directions for more explicit consideration of value implications, intended social consequences, and unintended side effects of legitimate test interpretation and use. This paper has implications for the interpretation, use, and validation of both individual differences and multilevel measures in education, psychology, and health contexts.
References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37, 1–15.
Brennan, R. L. (2006). Perspectives on the evolution and future of educational measurement. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 1–16). Westport, CT: American Council on Education/Praeger.
Cizek, G. J., Rosenberg, S., & Koons, H. (2008). Sources of validity evidence for educational and psychological tests. Educational and Psychological Measurement, 68, 397–412.
Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443–507). Washington, DC: American Council on Education.
Cronbach, L. J. (1988). Five perspectives on validity argument. In H. Wainer & H. Braun (Eds.), Test validity (pp. 3–17). Hillsdale, NJ: Lawrence Erlbaum.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
Forer, B., & Zumbo, B. D. (2011). Validation of multilevel constructs: Validation methods and empirical findings for the EDI. Social Indicators Research. doi:10.1007/s11205-011-9844-3.
Hubley, A. M., & Zumbo, B. D. (1996). A dialectic on validity: Where we have been and where we are going. The Journal of General Psychology, 123, 207–215.
Janus, M. (2006). Early Development Instrument: An indicator of developmental health at school entry. Monograph from the proceedings of the International Conference on Measuring Early Child Development, Vaudreuil, Quebec.
Kane, M. (2006). Validation. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Washington, DC: American Council on Education and National Council on Measurement in Education.
Linn, R. L. (1997). Evaluating the validity of assessments: The consequences of use. Educational Measurement: Issues and Practice, 16, 14–16.
Linn, R. L. (2006). Validity of inferences from test-based educational accountability systems. Journal of Personnel Evaluation in Education, 19, 5–15.
Linn, R. L. (2008). Validation of uses and interpretations of state assessments. Washington, DC: Council of Chief State School Officers.
Linn, R. L. (2009). The concept of validity in the context of NCLB. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions and applications (pp. 195–212). Charlotte, NC: IAP—Information Age Publishing, Inc.
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports (Monograph Supplement), 3, 635–694.
Mehrens, W. A. (1997). The consequences of consequential validity. Educational Measurement: Issues and Practice, 16, 16–18.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35, 1012–1027.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York, NY: Macmillan.
Messick, S. (1995). Validity of psychological assessment. Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749.
Messick, S. (1998). Test validity: A matter of consequences. Social Indicators Research, 45, 35–44.
Messick, S. (2000). Consequences of test interpretation and use: The fusion of validity and values in psychological assessment. In R. D. Goffin & E. Helmes (Eds.), Problems and solutions in human assessment: Honoring Douglas N. Jackson at seventy (pp. 3–20). Boston: Kluwer Academic Publishers.
Popham, W. J. (1997). Consequential validity: Right concern–wrong concept. Educational Measurement: Issues and Practice, 16, 9–13.
Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401.
Shepard, L. A. (1997). The centrality of test use and consequences for test validity. Educational Measurement: Issues and Practice, 16, 5–8, 13, 24.
Willingham, W. W. (2002). Seeking fair alternatives in construct design. In H. I. Braun, D. N. Jackson, D. E. Wiley, & S. Messick (Eds.), The role of constructs in psychological and educational measurement. Mahwah, NJ: Lawrence Erlbaum.
Willingham, W. W., & Cole, N. J. (1997). Gender and fair assessment. Mahwah, NJ: Lawrence Erlbaum.
Zumbo, B. D. (2007). Validity: Foundational issues and statistical methodology. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics, vol. 26: Psychometrics (pp. 45–79). The Netherlands: Elsevier Science B.V.
Zumbo, B. D. (2009). Validity as contextualized and pragmatic explanation, and its implications for validation practice. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions and applications (pp. 65–82). Charlotte, NC: IAP—Information Age Publishing, Inc.
Zumbo, B. D., & Forer, B. (2011). Testing and measurement from a multilevel view: Psychometrics and validation. In J. A. Bovaird, K. Geisinger, & C. Buckendahl (Eds.), High stakes testing in education: Science and practice in K-12 settings [Festschrift to Barbara Plake]. Washington, DC: American Psychological Association Press (in press).
Cite this article
Hubley, A.M., Zumbo, B.D. Validity and the Consequences of Test Interpretation and Use. Soc Indic Res 103, 219–230 (2011). https://doi.org/10.1007/s11205-011-9843-4
DOI: https://doi.org/10.1007/s11205-011-9843-4