Quality & Quantity, Volume 47, Issue 4, pp 2341–2360

Relevance and advantages of using the item response theory

  • Silvana Ligia Vincenzi Bortolotti
  • Rafael Tezza
  • Dalton Francisco de Andrade
  • Antonio Cezar Bornia
  • Afonso Farias de Sousa Júnior
Article

Abstract

Item response theory (IRT), also known as latent trait theory, is used for the development, evaluation and administration of standardized measurements; it is widely used in psychology and education. The theory has been developed and expanded over more than 50 years and has contributed to the development of measurement scales for latent traits. This paper presents the fundamental concepts of IRT and proposes a practical example of scale construction to illustrate the feasibility, advantages and validity of IRT using a known measurement, height. The results obtained from this practical application confirm the effectiveness of IRT in evaluating latent traits.

Keywords

Item response theory; Latent trait; Measurement

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Silvana Ligia Vincenzi Bortolotti (1)
  • Rafael Tezza (2)
  • Dalton Francisco de Andrade (3)
  • Antonio Cezar Bornia (4)
  • Afonso Farias de Sousa Júnior (5)

  1. Federal Technological University of Paraná, Campus Medianeira, Medianeira, Brazil
  2. Department of Business Administration, Santa Catarina State University, Itacorubi, Florianópolis, Brazil
  3. Informatics and Statistics Department, Federal University of Santa Catarina, Trindade, Brazil
  4. Production Engineering Department, Federal University of Santa Catarina, Trindade, Brazil
  5. Air Force University, Rio de Janeiro, Brazil
