Terms Derived from Frequent Sequences for Extractive Text Summarization

  • Yulia Ledeneva
  • Alexander Gelbukh
  • René Arnulfo García-Hernández
Conference paper

DOI: 10.1007/978-3-540-78135-6_51

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4919)
Cite this paper as:
Ledeneva Y., Gelbukh A., García-Hernández R.A. (2008) Terms Derived from Frequent Sequences for Extractive Text Summarization. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg

Abstract

Automatic text summarization helps the user to quickly understand large volumes of information. We present a language- and domain-independent statistical method for single-document extractive summarization, i.e., one that produces a text summary by extracting some sentences from the given text. We show experimentally that words that are parts of bigrams that repeat more than once in the text are good terms to describe the text’s contents, as are so-called maximal frequent sequences. We also show that using the frequency of a term as its weight gives good results (where we count only the occurrences of the term in repeating bigrams).
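As a rough illustration of the idea in the abstract, the following Python sketch selects terms from repeating bigrams, weights each term by its frequency, and extracts the highest-scoring sentences. It is not the authors' implementation. The function names (`split_sentences`, `tokenize`, `term_weights`, `summarize`), the naive regex-based sentence splitter and tokenizer, the rule that a word occurrence is counted once if either adjacent bigram repeats, and the scoring of a sentence as the sum of the weights of its distinct terms are all assumptions made for this example.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def term_weights(sentences, min_count=2):
    """Weight each word by how often it occurs inside a repeating bigram."""
    tokenized = [tokenize(s) for s in sentences]

    # Count bigrams within sentences (bigrams do not cross sentence borders).
    bigram_counts = Counter()
    for toks in tokenized:
        bigram_counts.update(zip(toks, toks[1:]))

    # An occurrence counts once if its left or right bigram repeats.
    weights = Counter()
    for toks in tokenized:
        for i, tok in enumerate(toks):
            left = i > 0 and bigram_counts[(toks[i - 1], tok)] >= min_count
            right = (i + 1 < len(toks)
                     and bigram_counts[(tok, toks[i + 1])] >= min_count)
            if left or right:
                weights[tok] += 1
    return weights


def summarize(text, num_sentences=1):
    """Return the top-scoring sentences in their original order."""
    sentences = split_sentences(text)
    weights = term_weights(sentences)
    # Sentence score (assumption): sum of weights of its distinct terms.
    scores = [sum(weights[t] for t in set(tokenize(s))) for s in sentences]
    # Ties are broken in favour of the earlier sentence.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(top[:num_sentences])]


if __name__ == "__main__":
    sample = ("Text summarization is useful. "
              "Extractive text summarization selects sentences. "
              "The weather is nice.")
    print(summarize(sample, num_sentences=2))
```

In the sample text, only the bigram "text summarization" repeats, so only "text" and "summarization" receive a non-zero weight and the third sentence scores zero. Maximal frequent sequences, the other term type mentioned in the abstract, are not covered by this sketch.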

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yulia Ledeneva (1)
  • Alexander Gelbukh (1)
  • René Arnulfo García-Hernández (2)

  1. Natural Language and Text Processing Laboratory, Center for Computing Research, National Polytechnic Institute, Mexico
  2. Instituto Tecnologico de Toluca, Mexico
