Are validity coefficients understated due to correctable defects in the GPA?
The predictive validity of preadmissions measures such as standardized test scores and high school grades may be understated because of correctable defects in both the freshman-year and cumulative grade point average (GPA). Measurement error in the criterion artificially depresses the size of observed validity coefficients. This study used item response theory (IRT) to develop a more reliable measure of performance, called an IRT-based GPA, and tested it in a predictive validity study using data from Stanford University. Results indicate that the IRT-based GPA is more predictable from preadmissions measures than the usual GPA.
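The claim that criterion measurement error depresses observed validity coefficients can be illustrated with the classical test theory attenuation formula, in which the observed correlation equals the error-free correlation multiplied by the square root of the criterion's reliability. The sketch below is only an illustration of that general relationship, not the IRT-based procedure used in the study; the function names and all numeric values are hypothetical.

```python
import math


def attenuated_validity(true_validity: float, criterion_reliability: float) -> float:
    """Observed predictor-criterion correlation when only the criterion
    (e.g., GPA) contains measurement error (classical test theory)."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return true_validity * math.sqrt(criterion_reliability)


def disattenuated_validity(observed_validity: float, criterion_reliability: float) -> float:
    """Inverse of the formula above: estimate the correlation that would be
    observed with a perfectly reliable criterion."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return observed_validity / math.sqrt(criterion_reliability)


if __name__ == "__main__":
    # Hypothetical error-free validity of 0.50; the reliabilities are made up.
    for reliability in (0.64, 0.81, 1.00):
        observed = attenuated_validity(0.50, reliability)
        print(f"criterion reliability {reliability:.2f} -> observed validity {observed:.3f}")
```

Under this simplified model, any gain in the reliability of the criterion raises the observed validity coefficient even though the predictors themselves are unchanged, which is the motivation for constructing a more reliable GPA.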
Keywords: High School, Measurement Error, Test Score, Validity Study, Standardize Test