Automated Scoring of Handwritten Essays Based on Latent Semantic Analysis

  • Sargur Srihari
  • Jim Collins
  • Rohini Srihari
  • Pavithra Babu
  • Harish Srinivasan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)


Handwritten essays are widely used in educational assessments, particularly in classroom instruction. This paper concerns the design of an automated system that takes as input scanned images of handwritten student essays from reading comprehension tests and produces as output scores for the answers that are analogous to those provided by human scorers. The system integrates two technologies: optical handwriting recognition (OHR) and automated essay scoring (AES). The OHR system performs several pre-processing steps, such as forms removal, rule-line removal, and segmentation of text lines and words. The final recognition step, which is tuned to the task of reading comprehension evaluation in a primary education setting, is performed using a lexicon derived from the passage to be read. The AES system is based on latent semantic analysis, in which a set of human-scored answers is used to determine the scoring system parameters through machine learning. System performance is compared to scoring done by human raters. Testing on a small set of handwritten answers indicates that system performance is comparable to that of automatic scoring based on manual transcription.
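The latent semantic analysis step can be illustrated with a minimal sketch. The abstract does not give the term weighting, the number of latent dimensions, or the rule that maps similarity to a score, so everything below is an assumption made for illustration: raw word counts, a rank-2 truncated SVD, cosine similarity in the latent space, and a nearest-neighbour rule that returns the score of the most similar human-scored answer. The function names and the toy answers are hypothetical and are not taken from the paper.

```python
import re
import numpy as np

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_document_matrix(answers):
    """Rows are vocabulary terms, columns are answers, entries are raw counts."""
    vocab = sorted({w for a in answers for w in tokenize(a)})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(answers)))
    for j, a in enumerate(answers):
        for w in tokenize(a):
            counts[index[w], j] += 1.0
    return counts, index

def fit_lsa(counts, k):
    """Truncated SVD keeping at most k non-negligible singular values."""
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    k = min(k, int(np.sum(s > 1e-10)))
    # Rows of the third return value are the training answers in latent space.
    return u[:, :k], s[:k], vt[:k, :].T

def fold_in(text, index, u_k, s_k):
    """Project a new answer into the latent space: q^T U_k S_k^{-1}."""
    q = np.zeros(u_k.shape[0])
    for w in tokenize(text):
        if w in index:  # words outside the training vocabulary are ignored
            q[index[w]] += 1.0
    return (q @ u_k) / s_k

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def score_answer(text, train_answers, train_scores, k=2):
    """Assumed scoring rule: copy the score of the most similar training answer."""
    counts, index = term_document_matrix(train_answers)
    u_k, s_k, docs_k = fit_lsa(counts, k)
    q_k = fold_in(text, index, u_k, s_k)
    sims = [cosine(q_k, d) for d in docs_k]
    return train_scores[int(np.argmax(sims))]

# Hypothetical human-scored answers (toy data, not from the paper).
train_answers = [
    "Martha Washington helped soldiers during the war",
    "She helped the soldiers and sewed clothes in the war",
    "I like my dog",
    "My dog is a good dog",
]
train_scores = [4, 4, 0, 0]

print(score_answer("She helped soldiers in the war", train_answers, train_scores))
print(score_answer("My dog is good", train_answers, train_scores))
```

In a pipeline like the one described, the text passed to `score_answer` would come either from the handwriting recognizer or from a manual transcription, which is the comparison the abstract reports. A real system would likely need a larger training set and tuned parameters; the nearest-neighbour rule here is only one simple choice.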


Word Recognition; Singular Value Decomposition; Latent Semantic Analysis; Text Line; Training Corpus
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Sargur Srihari (1)
  • Jim Collins (1)
  • Rohini Srihari (1)
  • Pavithra Babu (1)
  • Harish Srinivasan (1)
  1. Center of Excellence for Document Analysis and Recognition (CEDAR), University at Buffalo, State University of New York, Amherst, U.S.A.
