Psychometric engineering as art

Thissen, David

doi:10.1007/BF02296190

Published: December 2001

Psychometrika, Volume 66, pages 473–485 (2001)

David Thissen¹

Abstract

The Psychometric Society is “devoted to the development of Psychology as a quantitative rational science”. Engineering is often set in contradistinction with science; art is sometimes considered different from science. Why, then, juxtapose the words in the title: psychometric, engineering, and art? Because an important aspect of quantitative psychology is problem-solving, and engineering solves problems. And an essential aspect of a good solution is beauty; hence, art. In overview and with examples, this presentation describes activities that are quantitative psychology as engineering and art, that is, as design. Extended illustrations involve systems for scoring tests in realistic contexts. Allusions are made to other examples that extend the conception of quantitative psychology as engineering and art across a wider range of psychometric activities.

Author information

Authors and Affiliations

L. L. Thurstone Psychometric Laboratory, University of North Carolina, CB #3270, Davie Hall, Chapel Hill, NC 27599-3270
David Thissen

Corresponding author

Correspondence to David Thissen.

Additional information

This article is based on the Presidential Address David Thissen gave at the 66th Annual Meeting of the Psychometric Society held in King of Prussia, Pennsylvania on June 24, 2001. The address was also given on July 16 at the 2001 International Meeting of the Psychometric Society held in Osaka, Japan.—Editor

Thanks to R. Darrell Bock, Paul De Boeck, Lyle V. Jones, Cynthia Null, Lynne Steinberg, and Howard Wainer for constructive comments on early drafts of this manuscript. And thanks to Val Williams, Mary Pommerich, Lee Chen, Kathleen Rosa, Lauren Nelson, Maria Orlando, Kimberly Swygert, Lori McLeod, Bryce Reeve, Fabian Camacho, David Flora, Viji Sathy, Michael Edwards, and Jack Vevea for their many contributions to some of the research that illustrates this commentary. Of course, any flaws in the argument or its presentation remain the author's.

About this article

Cite this article

Thissen, D. Psychometric engineering as art. Psychometrika 66, 473–485 (2001). https://doi.org/10.1007/BF02296190

Issue Date: December 2001
DOI: https://doi.org/10.1007/BF02296190
