Classical Test Theory

  • Sivakumar Alagumalai
  • David D. Curtis

Abstract

Measurement involves the processes of description and quantification. Questionnaires and test instruments are designed and developed to measure conceived variables and constructs accurately. Validity and reliability are two important characteristics of measurement instruments. Validity consists of a complex set of criteria used to judge the extent to which inferences, based on scores derived from the application of an instrument, are warranted. Reliability captures the consistency of scores obtained from applications of the instrument. Traditional or classical procedures for measurement were based on a variety of scaling methods. Most commonly, a total score is obtained by adding the scores for individual items, although more complex procedures in which items are differentially weighted are used occasionally. In classical analyses, criteria for the final selection of items are based on internal consistency checks. At the core of these classical approaches is an idea derived from measurement in the physical sciences: that an observed score is the sum of a true score and a measurement error term. This idea and a set of procedures that implement it are the essence of Classical Test Theory (CTT). This chapter briefly outlines the underlying principles of CTT and how test developers use it to achieve measurement, as they have defined this term, and then discusses some of its limitations in order to lay a foundation for the examples of objective measurement that constitute much of the book.
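
As a rough illustration of the classical procedure summarised above, the sketch below sums item scores into a total score and computes one common internal consistency index, Cronbach's alpha. The abstract does not say which index the chapter uses, so the choice of alpha, the function names, and the small response matrix are all assumptions made for illustration only.

```python
from statistics import variance


def total_scores(responses):
    """Classical total score: the unweighted sum of each person's item scores."""
    return [sum(person) for person in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha for a persons-by-items matrix of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))

    Assumes at least two items, at least two persons, and total scores
    that are not all identical (otherwise the total variance is zero).
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to items-by-persons
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance(total_scores(responses))
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 5 persons answering 4 dichotomous items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

print("Total scores:", total_scores(responses))
print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))
```

In CTT terms, each total is an observed score, taken to be the sum of an unobservable true score and an error term. An index such as alpha is read as an estimate of how much of the observed score variance is attributable to true scores rather than to error.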

Key words

classical test theory; true score theory; measurement

Copyright information

© Springer 2005

Authors and Affiliations

  • Sivakumar Alagumalai (1)
  • David D. Curtis (1)

  1. Flinders University, Adelaide
