Research in Higher Education, Volume 33, Issue 1, pp 71–84

Lies, damn lies, and statistics revisited: A comparison of three methods of representing change

  • Gary R. Pike
AIR Forum Issue


Numerous authors have argued that change is fundamental to the education process, and that the measurement of change is an essential element in efforts to assess the quality and effectiveness of postsecondary education. Despite widespread support for the concept of studying student growth and development, many researchers have been critical of existing methods of representing change. Intended for assessment practitioners and educational researchers, this study examines three methods of measuring change: (1) gain scores, (2) residual scores, and (3) repeated measures. Analyses indicate that all three methods are seriously flawed, although repeated measures offer the greatest potential for adequately representing student growth and development.
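The three approaches the abstract names can be illustrated with a small numeric sketch. The pretest/posttest scores below are invented purely for illustration and are not drawn from the study; the computations show, under standard definitions, how each method represents "change" for the same data.

```python
import statistics

# Hypothetical pretest and posttest scores for five students
# (illustrative values only, not data from the study)
pre = [52.0, 60.0, 45.0, 70.0, 58.0]
post = [58.0, 63.0, 55.0, 72.0, 66.0]

# (1) Gain scores: simple posttest-minus-pretest differences
gains = [y - x for x, y in zip(pre, post)]

# (2) Residual scores: each student's deviation from the posttest
#     value predicted by regressing posttest on pretest (OLS by hand)
mean_x = statistics.fmean(pre)
mean_y = statistics.fmean(post)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(pre, post))
         / sum((x - mean_x) ** 2 for x in pre))
intercept = mean_y - slope * mean_x
residuals = [y - (intercept + slope * x) for x, y in zip(pre, post)]

# (3) Repeated measures: treat occasion (pre vs. post) as a
#     within-subject factor; the simplest case is a paired t statistic
#     on the same difference scores
n = len(gains)
t_stat = statistics.fmean(gains) / (statistics.stdev(gains) / n ** 0.5)

print(gains)               # individual growth estimates
print(residuals)           # growth relative to pretest expectation
print(round(t_stat, 3))    # group-level test of mean change
```

Note how the gain score is a per-student quantity, the residual score re-expresses growth relative to what the pretest predicts (so its mean is zero by construction), and the repeated-measures approach tests change at the group level; the study's comparison of the three hinges on these differing interpretations.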





Copyright information

© Human Sciences Press, Inc. 1992

Authors and Affiliations

  • Gary R. Pike
    1. Center for Assessment Research & Development, University of Tennessee, Knoxville
