Social Indicators Research, Volume 45, Issue 1–3, pp 83–117

The Construct of Content Validity

  • Stephen G. Sireci

Abstract

Many behavioral scientists argue that assessments used in social indicators research must be content-valid. However, the concept of content validity has been controversial since its inception. The current unitary conceptualization of validity argues against use of the term content validity, but stresses the importance of content representation in the instrument construction and evaluation processes. However, by arguing against use of this term, the importance of demonstrating content representativeness has been severely undermined. This paper reviews the history of content validity theory to underscore its importance in evaluating construct validity. It is concluded that although measures cannot be “validated” based on content validity evidence alone, demonstration of content validity is a fundamental requirement of all assessment instruments.


Copyright information

© Kluwer Academic Publishers 1998

Authors and Affiliations

  • Stephen G. Sireci (1)
  1. University of Massachusetts – Amherst, Amherst, USA
