Psychometrika, Volume 66, Issue 4, pp 473–485

Psychometric engineering as art

Abstract

The Psychometric Society is “devoted to the development of Psychology as a quantitative rational science”. Engineering is often set in contradistinction with science; art is sometimes considered different from science. Why, then, juxtapose the words in the title: psychometric, engineering, and art? Because an important aspect of quantitative psychology is problem-solving, and engineering solves problems. And an essential aspect of a good solution is beauty; hence, art. In overview and with examples, this presentation describes activities that are quantitative psychology as engineering and art, that is, as design. Extended illustrations involve systems for scoring tests in realistic contexts. Allusions are made to other examples that extend the conception of quantitative psychology as engineering and art across a wider range of psychometric activities.

Key words

psychometrics, quantitative psychology, design

Copyright information

© The Psychometric Society 2001

Authors and Affiliations

  1. L. L. Thurstone Psychometric Laboratory, University of North Carolina, Chapel Hill
