Text Categorization for Assessing Multiple Documents Integration, or John Henry Visits a Data Mine

  • Peter Hastings
  • Simon Hughes
  • Joe Magliano
  • Susan Goldman
  • Kim Lawless
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6738)


A critical need for students in the digital age is to learn how to gather, analyze, evaluate, and synthesize complex and sometimes contradictory information across multiple sources and contexts. Yet reading is most often taught with single sources. In this paper, we explore techniques for analyzing student essays to give feedback to teachers on how well their students deal with multiple texts. We compare the performance of a simple regular expression matcher to Latent Semantic Analysis and to Support Vector Machines, a machine learning approach.
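
As a rough illustration of the simplest of the three approaches named above, the following is a minimal sketch of a regular expression matcher that tags essay sentences with concepts and reports which source texts those concepts come from. The concept labels, patterns, source names (`text_A`, `text_B`), and function names are all hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import re

# Hypothetical concepts and patterns, for illustration only.
CONCEPT_PATTERNS = {
    "jobs": re.compile(r"\b(jobs?|work|employment|factor(?:y|ies))\b", re.IGNORECASE),
    "transport": re.compile(r"\b(railroads?|trains?|canals?)\b", re.IGNORECASE),
}

# Hypothetical mapping from each concept to the source text that introduces it.
CONCEPT_SOURCE = {"jobs": "text_A", "transport": "text_B"}


def split_sentences(essay):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", essay) if s.strip()]


def categorize(essay):
    """Return a list of (sentence, sorted matching concept labels) pairs."""
    results = []
    for sentence in split_sentences(essay):
        labels = sorted(
            name for name, pattern in CONCEPT_PATTERNS.items()
            if pattern.search(sentence)
        )
        results.append((sentence, labels))
    return results


def sources_covered(essay):
    """Return the set of source texts whose concepts appear in the essay."""
    covered = set()
    for _, labels in categorize(essay):
        covered.update(CONCEPT_SOURCE[label] for label in labels)
    return covered


if __name__ == "__main__":
    essay = ("Many people moved to the city for jobs. "
             "The railroad made travel easier. I like pizza.")
    for sentence, labels in categorize(essay):
        print(labels, sentence)
    print(sorted(sources_covered(essay)))
```

An essay that draws on concepts from more than one source would show up here as more than one entry in `sources_covered`. A matcher like this depends entirely on hand-written patterns, which is presumably the kind of trade-off that motivates comparing it with Latent Semantic Analysis and Support Vector Machines.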


Keywords: Natural Language Processing, Machine Learning, Corpus Analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Peter Hastings (1)
  • Simon Hughes (1)
  • Joe Magliano (2)
  • Susan Goldman (3)
  • Kim Lawless (3)
  1. DePaul University, USA
  2. Northern Illinois University, USA
  3. University of Illinois, Chicago, USA
