Issues in Test Administration and Development

  • Cynthia G. Parshall
  • Judith A. Spray
  • John C. Kalohn
  • Tim Davey
Part of the Statistics for Social and Behavioral Sciences book series (SSBS)

Abstract

The processes of test administration and development are both critical elements in any testing program. Chronologically, the development of any test occurs before its administration, and thus the two are more commonly paired as “test development and administration.” However, in discussing computerized testing programs, it is often useful to address the administration issues first and then turn to the development considerations. This is the approach followed in this chapter.

Keywords

  • Differential Item Functioning
  • American Psychological Association
  • Item Pool
  • Computerized Adaptive Test
  • Item Selection

Copyright information

© Springer Science+Business Media New York 2002

Authors and Affiliations

  • Cynthia G. Parshall, University of South Florida, Tampa, USA
  • Judith A. Spray, ACT, Inc., Iowa City, USA
  • John C. Kalohn, ACT, Inc., Iowa City, USA
  • Tim Davey, Educational Testing Service, Princeton, USA
