Summary Generation and Evaluation in SumUM

  • Horacio Saggion
  • Guy Lapalme
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1952)


We describe and evaluate SumUM, a text summarization system that produces indicative-informative abstracts for technical papers. Our approach consists of the shallow syntactic and conceptual analysis of the source document and of the implementation of text re-generation techniques based on a study of abstracts produced by professional abstractors. In an evaluation of indicative content in a categorization task, we observed no differences with other automatic methods, while differences were observed in an evaluation of informative content. In an evaluation of text quality, the abstracts were considered acceptable when compared with other automatic abstracts.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Horacio Saggion (1)
  • Guy Lapalme (1)
  1. DIRO - Université de Montréal, Montréal, Québec, Canada
