Theoretically-Consistent Cognitive Ability Test Development and Score Interpretation

Beaujean, A. Alexander; Benson, Nicholas F.

doi:10.1007/s40688-018-0182-1

Published: 20 March 2018

Volume 23, pages 126–137, (2019)

Contemporary School Psychology

Abstract

Clinical cognitive ability assessment—and its corollary, score interpretation—are in a state of disarray. Many current instruments are designed to provide a bevy of scores to appeal to a variety of school psychologists. These scores are not all grounded in the attribute’s theory or developed from sound measurement or psychometric theory. Thus, for a given instrument, there can be substantial variation between school psychologists when interpreting scores from the same instrument. This is contrary to the very purpose of psychological assessment. As a contrast, we provide a sketch of theoretically driven test development and score interpretation. In addition, we provide examples of how this could be implemented using two theories of intelligence (Spearman’s two-factor and Cattell and Horn’s Gf-Gc) and measurement theory about the nature of psychological test scores. While different from what is often implemented by school psychologists, it is consistent with the guiding principles of evidence-based psychological assessment.

The Relationship Between Theories of Intelligence and Intelligence Tests

Theories of Intelligence

Notes

Unfortunately, David Wechsler never actually defined the attribute he was measuring with the “verbal” subtests on his instruments. Instead, it appears he included them because he wanted a cognitive ability test that differed from those already in existence circa the 1920s.
“My usual examination of subjects included, in addition to a short interview, administration of the Stanford-Binet or Yerkes Point Scale, and nearly always one or more of the available performance tests. It then occurred to me that an intelligence scale, combining verbal and nonverbal tests, would be a useful addition to the psychometrist’s armamentarium” (Wechsler 1981, p. 83).
A possible exception is the WJ IV, which uses principal component analysis-derived weights for the calculation of the General Intellectual Ability score. Although the results are “truly enough not identical with ‘g’ [they] are usually at any rate very good approximations to it” (Spearman 1946, p. 121).
There are other noted abilities contained within Gf-Gc theory (Horn and Blankson 2012), but Gf and Gc are believed to make the most important contributions to intellectual functioning.
Readers interested in information on how to calculate these statistics can consult Grice (2001).
We calculated reliability of the aggregate scores using the Guttman-Cronbach α.
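
As a rough illustration of the Guttman-Cronbach α mentioned in the last note, the sketch below computes α = k/(k − 1) × (1 − Σ component variances / variance of the aggregate) for an aggregate of k component scores. It is not the authors' code (the reference list cites R; Python is used here only for illustration). It assumes complete data with one row per examinee and one column per component score. The function name `cronbach_alpha` and the example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return the Guttman-Cronbach alpha for an aggregate of component scores.

    scores: list of rows, one row per examinee, each row holding the
    same k component (e.g., subtest) scores. Hypothetical helper,
    assuming no missing values.
    """
    n_examinees = len(scores)
    k = len(scores[0])
    if n_examinees < 2 or k < 2:
        raise ValueError("need at least two examinees and two components")

    # Sample variance of each component (column) across examinees.
    component_variances = [variance(column) for column in zip(*scores)]

    # Sample variance of the aggregate (row sum) across examinees.
    aggregate_variance = variance([sum(row) for row in scores])
    if aggregate_variance == 0:
        raise ValueError("aggregate scores have zero variance")

    return (k / (k - 1)) * (1 - sum(component_variances) / aggregate_variance)


# Made-up scores for five examinees on three components (illustration only).
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

print(round(cronbach_alpha(example_scores), 3))
```

Because α depends on the number of components as well as their covariances, a high value by itself does not show that the aggregate reflects a single attribute.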

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education [AERA/APA/NCME]. (2014). Standards for educational and psychological testing (4th ed.). Washington, DC: Authors.
Beaujean, A. A. (2018). Simulating data for clinical research: a tutorial. The Journal of Psychoeducational Assessment, 36, 7–20. https://doi.org/10.1177/0734282917690302.
Beaujean, A. A., & Sheng, Y. (2014). Assessing the Flynn effect in the Wechsler scales. Journal of Individual Differences, 35, 63–78. https://doi.org/10.1027/1614-0001/a000128.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061–1071. https://doi.org/10.1037/0033-295X.111.4.1061.
Borsboom, D., Cramer, A. O. J., Kievit, R. A., Scholten, A. Z., & Franić, S. (2009). The end of construct validity. In R. W. Lissitz (Ed.), The concept of validity: revisions, new directions, and applications (pp. 135–170). Charlotte: Information Age Publishing.
Braden, J. P., & Ouzts, S. M. (2005). Review of the Kaufman assessment battery for children, second edition. In B. S. Plake & J. C. Impara (Eds.), The sixteenth mental measurements yearbook (2nd ed., pp. 517–520). Lincoln: Buros Institute of Mental Measurements.
Bringmann, L. F., & Eronen, M. I. (2016). Heating up the measurement debate: what psychologists can learn from the history of physics. Theory & Psychology, 26, 27–43. https://doi.org/10.1177/0959354315617253.
Canivez, G. L., & Watkins, M. W. (2016). Review of the Wechsler intelligence scale for children-fifth edition: critique, commentary, and independent analyses. In A. S. Kaufman, S. E. Raiford, & D. L. Coalson (Eds.), Intelligent testing with the WISC-V (pp. 683–702). Hoboken: Wiley.
Carroll, J. B. (1996). A three-stratum theory of intelligence: Spearman’s contribution. In I. Dennis & P. Tapsfield (Eds.), Human abilities: their nature and measurement (pp. 1–17). Mahwah: Erlbaum.
Cattell, R. B. (1943). The measurement of adult intelligence. Psychological Bulletin, 40, 153–193. https://doi.org/10.1037/h0059973.
Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: a critical experiment. Journal of Educational Psychology, 54, 1–22. https://doi.org/10.1037/h0046743.
Cattell, R. B. (1987). Intelligence: its structure, growth, and action. New York: Elsevier.
Courville, T., Coalson, D. L., Kaufman, A. S., & Raiford, S. E. (2016). Does WISC-V scatter matter? In A. S. Kaufman, S. E. Raiford, & D. L. Coalson (Eds.), Intelligent testing with the WISC-V (pp. 209–228). Hoboken: Wiley.
Downing, S. M. (2006). Twelve steps for effective test development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of testing (pp. 3–25). Mahwah: Lawrence Erlbaum.
Finkelstein, L. (2005). Problems of measurement in soft systems. Measurement, 38, 267–274. https://doi.org/10.1016/j.measurement.2005.09.002.
Flanagan, D. P., & Alfonso, V. C. (2017). Essentials of WISC-V assessment (2nd ed.). Hoboken: Wiley.
Flanagan, D. P., Ortiz, S. O., & Alfonso, V. C. (2013). Essentials of cross-battery assessment (3rd ed.). Hoboken: Wiley.
Floyd, R. G., Bergeron, R., McCormack, A. C., Anderson, J. L., & Hargrove-Owens, G. L. (2005). Are Cattell-Horn-Carroll (CHC) broad ability composite scores exchangeable across batteries? School Psychology Review, 34, 329–357.
Frazier, T. W., & Youngstrom, E. A. (2007). Historical increase in the number of factors measured by commercial tests of cognitive ability: are we overfactoring? Intelligence, 35, 169–182. https://doi.org/10.1016/j.intell.2006.07.002.
Grace, J. B., & Bollen, K. A. (2008). Representing general theoretical concepts in structural equation models: the role of composite variables. Environmental and Ecological Statistics, 15, 191–213. https://doi.org/10.1007/s10651-007-0047-7.
Grégoire, J. (2013). Measuring components of intelligence: mission impossible? Journal of Psychoeducational Assessment, 31, 138–147. https://doi.org/10.1177/0734282913478034.
Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6, 430–450. https://doi.org/10.1037/1082-989X.6.4.430.
Groth-Marnat, G. (1999). Financial efficacy of clinical assessment: rational guidelines and issues for future research. Journal of Clinical Psychology, 55, 813–824.
Grove, W. M., & Vrieze, S. I. (2013). The clinical versus mechanical prediction controversy. In K. F. Geisinger, B. A. Bracken, J. F. Carlson, J. I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol. 2: Testing and assessment in clinical and counseling psychology (pp. 51–62). Washington, DC: American Psychological Association.
Hale, J. B., Fiorello, C. A., Kavanagh, J. A., Hoeppner, J.-A. B., & Gaither, R. A. (2001). WISC-III predictors of academic achievement for children with learning disabilities: are global and factor scores comparable? School Psychology Quarterly, 16, 31–55. https://doi.org/10.1521/scpq.16.1.31.19158.
Horn, J. L. (1963). Equations representing combinations of components in scoring psychological variables. Acta Psychologica, 21, 184–217. https://doi.org/10.1016/0001-6918(63)90048-9.
Horn, J. L. (1985). Remodeling old models of intelligence. In B. B. Wolman (Ed.), Handbook of intelligence (pp. 267–300). New York: Wiley.
Horn, J. L. (1989). Models of intelligence. In R. L. Linn (Ed.), Intelligence, measurement, theory and public policy (pp. 29–73). Urbana: University of Illinois Press.
Horn, J. L. (1991). Measurement of intellectual capabilities: a review of theory. In K. S. McGrew, J. K. Werder, & R. W. Woodcock (Eds.), Woodcock-Johnson psycho-educational battery-revised technical manual (pp. 197–232). Chicago: Riverside.
Horn, J. L., & Blankson, A. N. (2012). Foundations for better understanding of cognitive abilities. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: theories, tests, and issues (3rd ed., pp. 73–98). New York: Guilford Press.
Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized intelligence. Journal of Educational Psychology, 57, 253–270. https://doi.org/10.1037/h0023816.
Horn, J. L., & McArdle, J. J. (2007). Understanding human intelligence since Spearman. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: historical developments and future directions (pp. 205–247). Mahwah: Erlbaum.
Hunsley, J., & Mash, E. J. (2007). Evidence-based assessment. Annual Review of Clinical Psychology, 3, 29–51. https://doi.org/10.1146/annurev.clinpsy.3.022806.091419.
Jackson, J. S. H., & Maraun, M. (1996). The conceptual validity of empirical scale construction: the case of the sensation seeking scale. Personality and Individual Differences, 21, 103–110. https://doi.org/10.1016/0191-8869(95)00217-0.
Jensen, A. R. (1993). Psychometric g and achievement. In B. R. Gifford (Ed.), Policy perspectives on educational testing (pp. 117–227). New York: Kluwer Academic Publishers.
Jensen, A. R. (2002). Galton’s legacy to research on intelligence. Journal of Biosocial Science, 34, 145–172. https://doi.org/10.1017/s0021932002001451.
Kamphaus, R. W., Winsor, A. P., Rowe, E. W., & Kim, S. (2012). A history of intelligence test interpretation. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment (3rd ed., pp. 56–70). New York: Guilford.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73. https://doi.org/10.1111/jedm.12000.
Kaufman, A. S., & Kaufman, N. L. (2004). Kaufman assessment battery for children-second edition. Circle Pines: American Guidance Service.
Kaufman, A. S., Raiford, S. E., & Coalson, D. L. (2016). Intelligent testing with the WISC-V. Hoboken: Wiley.
Keith, T. Z., & Reynolds, M. R. (2010). Cattell-Horn-Carroll abilities and cognitive tests: what we’ve learned from 20 years of research. Psychology in the Schools, 47, 635–650. https://doi.org/10.1002/pits.20496.
Kingston, N. M., Scheuring, S. T., & Kramer, L. B. (2013). Test development strategies. In K. F. Geisinger, B. A. Bracken, J. F. Carlson, J. I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol. 1: test theory and testing and assessment in industrial and organizational psychology (pp. 165–184). Washington, DC: American Psychological Association.
Kline, P. (2000). The handbook of psychological testing (2nd ed.). London: Routledge.
Krause, M. S. (2012). Measurement validity is fundamentally a matter of definition, not correlation. Review of General Psychology, 16, 391–400. https://doi.org/10.1037/a0027701.
Krause, M. S. (2013). The data analytic implications of human psychology’s dimensions being ordinally scaled. Review of General Psychology, 17, 318–325. https://doi.org/10.1037/a0032292.
Littell, W. M. (1960). The Wechsler intelligence scale for children: review of a decade of research. Psychological Bulletin, 57, 132–156. https://doi.org/10.1037/h0044513.
Luecht, R. M., Gierl, M. J., Tan, X., & Huff, K. (2006). Scalability and the development of useful diagnostic scales. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco, CA.
Luria, A. R. (1973). The working brain: an introduction to neuropsychology. New York: Basic Books.
Maraun, M. D. (1998a). Measurement as a normative practice: implications of Wittgenstein’s philosophy for measurement in psychology. Theory & Psychology, 8, 435–461. https://doi.org/10.1177/0959354398084001.
Maraun, M. D. (1998b). The nexus misconceived: Wittgenstein made silly. Theory & Psychology, 8, 489–501. https://doi.org/10.1177/0959354398084004.
Mari, L., Carbone, P., & Petri, D. (2015). Fundamentals of hard and soft measurement. In A. Ferrero, D. Petri, P. Carbone & M. Catelani (Eds.), Modern measurements: Fundamentals and applications (pp. 203–262). Hoboken, NJ: Wiley-IEEE Press.
McDonald, R. P. (1999). Test theory: a unified treatment. Mahwah: Erlbaum.
McGrew, K. S. (2009). CHC theory and the human cognitive abilities project: standing on the shoulders of the giants of psychometric intelligence research. Intelligence, 37, 1–10. https://doi.org/10.1016/j.intell.2008.08.004.
McGrew, K. S., LaForte, E. M., & Schrank, F. A. (2014). Woodcock-Johnson IV technical manual. Rolling Meadows: Riverside.
Michell, J. (1999). Measurement in psychology: critical history of a methodological concept. New York: Cambridge University Press.
Michell, J. (2007). Measurement. In S. P. Turner & M. W. Risjord (Eds.), Philosophy of anthropology and sociology (pp. 71–119). Amsterdam: North Holland.
Michell, J. (2011). Qualitative research meets the ghost of Pythagoras. Theory & Psychology, 21, 241–259. https://doi.org/10.1177/0959354310391351.
Michell, J. (2012). Alfred Binet and the concept of heterogeneous orders. Frontiers in Psychology, 3(261), 1–8. https://doi.org/10.3389/fpsyg.2012.00261.
Petri, D., Mari, L., & Carbone, P. (2015). A structured methodology for measurement development. IEEE Transactions on Instrumentation and Measurement, 64, 2367–2379. https://doi.org/10.1109/TIM.2015.2399023.
Pfeiffer, S. I., Reddy, L. A., Kletzel, J. E., Schmelzer, E. R., & Boyer, L. M. (2000). The practitioner’s view of IQ testing and profile analysis. School Psychology Quarterly, 15, 376–385. https://doi.org/10.1037/h0088795.
R Development Core Team. (2017). R: a language and environment for statistical computing (version 3.3.3) [computer program]. Vienna: R Foundation for Statistical Computing.
Raiford, S. E. (2017). Essentials of WISC-V integrated assessment. Hoboken: Wiley.
Schneider, W. J. (2013). What if we took our models seriously? Estimating latent scores in individuals. Journal of Psychoeducational Assessment, 31, 186–201. https://doi.org/10.1177/0734282913478046.
Schneider, W. J., & McGrew, K. S. (2012). The Cattell-Horn-Carroll model of intelligence. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment (3rd ed., pp. 99–144). New York: Guilford.
Schrank, F. A., McGrew, K. S., & Mather, N. (2014). Woodcock-Johnson IV tests of cognitive abilities. Rolling Meadows: Riverside.
Sijtsma, K. (2012). Psychological measurement between physics and statistics. Theory & Psychology, 22, 786–809. https://doi.org/10.1177/0959354312454353.
Sijtsma, K. (2013). Theory development as a precursor for test validity. In R. E. Millsap, L. A. van der Ark, D. M. Bolt, & C. M. Woods (Eds.), New developments in quantitative psychology: presentations from the 77th annual psychometric society meeting (pp. 267–274). New York: Springer.
Spearman, C. E. (1927). The abilities of man: their nature and measurement. New York: Blackburn Press.
Spearman, C. E. (1931). Our need of some science in place of the word ‘intelligence’. Journal of Educational Psychology, 22, 401–410. https://doi.org/10.1037/h0070599.
Spearman, C. E. (1939). Thurstone’s work re-worked. Journal of Educational Psychology, 30, 1–16. https://doi.org/10.1037/h0061267.
Spearman, C. E. (1946). Theory of general factor. British Journal of Psychology, 36, 117–131. https://doi.org/10.1111/j.2044-8295.1946.tb01114.x.
Thomson, G. H. (1927). The tetrad-difference criterion. British Journal of Psychology. General Section, 17, 235–255. https://doi.org/10.1111/j.2044-8295.1927.tb00426.x.
Thurstone, L. L. (1935). The vectors of mind: multiple-factor analysis for the isolation of primary traits. Chicago: University of Chicago Press.
Tomarken, A. J., & Waller, N. G. (2003). Potential problems with “well fitting” models. Journal of Abnormal Psychology, 112, 578–598. https://doi.org/10.1037/0021-843X.112.4.578.
Wechsler, D. (1950). Cognitive, conative, and non-intellective intelligence. American Psychologist, 5, 78–83. https://doi.org/10.1037/h0063112.
Wechsler, D. (1975). Intelligence defined and undefined: a relativistic appraisal. American Psychologist, 30, 135–139. https://doi.org/10.1037/h0076868.
Wechsler, D. (1981). The psychometric tradition: developing the Wechsler adult intelligence scale. Contemporary Educational Psychology, 6, 82–85. https://doi.org/10.1016/0361-476X(81)90035-7.
Wechsler, D. (2014). Wechsler intelligence scale for children-fifth edition administration and scoring manual. Bloomington: NCS Pearson.

Author information

Authors and Affiliations

Department of Psychology & Neuroscience, Baylor University, One Bear Place #97334, Waco, TX, 76798-7334, USA
A. Alexander Beaujean
Department of Educational Psychology, Baylor University, One Bear Place #97301, Waco, TX, 76798, USA
Nicholas F. Benson

Corresponding author

Correspondence to A. Alexander Beaujean.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Beaujean, A.A., Benson, N.F. Theoretically-Consistent Cognitive Ability Test Development and Score Interpretation. Contemp School Psychol 23, 126–137 (2019). https://doi.org/10.1007/s40688-018-0182-1

Issue Date: 15 June 2019
