
Item Analysis: Methods for Fitting the Right Items to the Right Test

Mastering Modern Psychological Testing

Abstract

When developing a test, there are numerous procedures that are useful for assessing the quality and measurement characteristics of test items. Not all procedures are appropriate for all types of tests, and not all procedures will indicate the same level of quality about a particular item. Classical test theory characteristics such as item difficulty, item discrimination, and response distractors are useful, as are characteristics associated with qualitative analyses and the techniques associated with item response theory. The challenge for all test developers is to evaluate the results of these procedures against the intended use of the test and make item-selection decisions that will support and maximize the test’s overall effectiveness in measuring what it intends to measure.

The better the items, the better the test.
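Two of the classical test theory characteristics named in the abstract, item difficulty and item discrimination, can be illustrated with a minimal sketch. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the common upper-lower group approach, with the top and bottom 27% of examinees ranked by total score. The function names and the response data are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
def item_difficulty(responses, item):
    """Proportion of examinees who answered `item` correctly (the p value)."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-lower discrimination index: D = p_upper - p_lower.

    Examinees are ranked by total score (which here includes the item
    itself), and the top and bottom `fraction` of the group are compared.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return item_difficulty(upper, item) - item_difficulty(lower, item)


# Hypothetical scored responses: one row per examinee, one column per item.
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]

for i in range(len(scores[0])):
    p = item_difficulty(scores, i)
    d = item_discrimination(scores, i)
    print(f"item {i}: p = {p:.2f}, D = {d:.2f}")
```

A higher p value indicates an easier item, and a larger positive D indicates that high scorers answer the item correctly more often than low scorers. Whether a given value is acceptable depends on the intended use of the test, which is the point the abstract makes about item-selection decisions.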


Recommended Reading

  • Embretson, S., & Reise, S. (2000). Item response theory for psychologists. London, England: Taylor & Francis.


  • Johnson, A. P. (1951). Notes on a suggested index of item validity: The U-L index. Journal of Educational Psychology, 42, 499–504. This is a seminal article in the history of item analysis.


  • Kelley, T. L. (1939). The selection of upper and lower groups for the validation of test items. Journal of Educational Psychology, 30, 17–24. A real classic!



© 2021 Springer Nature Switzerland AG


Cite this chapter

Reynolds, C.R., Altmann, R.A., Allen, D.N. (2021). Item Analysis: Methods for Fitting the Right Items to the Right Test. In: Mastering Modern Psychological Testing. Springer, Cham. https://doi.org/10.1007/978-3-030-59455-8_7
