From testing to training: Evaluating automated diagnosis in statistics and algebra

  • Marc M. Sebrechts
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 608)


Much of the controversy over the role of testing in learning stems from testing's inability to differentiate among sources of error, its focus on outcome rather than on problem solving, and its failure to provide useful information for further learning. To a large extent, these difficulties arise from the constrained form of testing typically used (i.e., multiple choice). Intelligent diagnosis of less constrained problem-solving scenarios is one way to integrate testing and learning. GIDE, a goal-based diagnostic system, provides a step toward such integration. Initial empirical assessment in the domains of elementary statistics and algebra word problems suggests that GIDE's analysis retains information comparable to that provided by multiple-choice items and shows global agreement with human evaluators.
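To illustrate the kind of analysis a goal-based diagnostic system performs, the following is a minimal sketch (not GIDE's actual implementation; the goal names and matching scheme are hypothetical): a problem is decomposed into goals, a student's solution steps are matched against each goal, and feedback names the goal that failed rather than merely marking the final answer wrong.

```python
# Minimal sketch of goal-based diagnosis, in the spirit of (but not identical
# to) GIDE. All names here are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Goal:
    name: str        # label for the sub-goal (e.g., "sum", "mean")
    expected: float  # value that would satisfy this goal

def diagnose(goals, student_steps, tol=1e-6):
    """Match each student step to its goal; label it correct, buggy, or missing."""
    report = {}
    for goal in goals:
        if goal.name not in student_steps:
            report[goal.name] = "missing"   # the student never attempted this goal
        elif abs(student_steps[goal.name] - goal.expected) <= tol:
            report[goal.name] = "correct"
        else:
            report[goal.name] = "buggy"     # attempted, but with an erroneous value
    return report

# Example: computing the mean of [2, 4, 6] via two sub-goals.
goals = [Goal("sum", 12.0), Goal("mean", 4.0)]
student = {"sum": 12.0, "mean": 6.0}  # student summed correctly but divided wrongly
print(diagnose(goals, student))  # {'sum': 'correct', 'mean': 'buggy'}
```

Unlike a multiple-choice score, the per-goal report distinguishes a correct intermediate step from the specific step where the error arose, which is the kind of information the paper argues is lost in conventional testing.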






Copyright information

© Springer-Verlag Berlin Heidelberg 1992

Authors and Affiliations

  • Marc M. Sebrechts
  1. Department of Psychology, The Catholic University of America, Washington, DC, USA
