Terms Derived from Frequent Sequences for Extractive Text Summarization
- Cite this paper as:
- Ledeneva Y., Gelbukh A., García-Hernández R.A. (2008) Terms Derived from Frequent Sequences for Extractive Text Summarization. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg
Automatic text summarization helps users quickly grasp large volumes of information. We present a language- and domain-independent statistical method for single-document extractive summarization, i.e., producing a summary by extracting selected sentences from the given text. We show experimentally that words that are parts of bigrams occurring more than once in the text are good terms for describing the text's contents, as are so-called maximal frequent sequences. We also show that using term frequency as the term weight gives good results (where we count only the occurrences of a term within repeating bigrams).
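To make the idea of terms drawn from repeating bigrams concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the tokenizer, the regex-based sentence splitter, the function names (`split_sentences`, `term_weights`, `summarize`), and the sentence-scoring rule (sum of the weights of the distinct terms in a sentence) are all hypothetical choices that the abstract does not specify. Only the term-selection rule (words in bigrams that occur more than once) and the weight (occurrences of the term inside such bigrams) follow the abstract.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens (assumption: no stemming or stop-word removal).
    return re.findall(r"\w+", sentence.lower())


def term_weights(sentences):
    """Weight each word by how often it occurs inside a repeating bigram."""
    tokenized = [tokenize(s) for s in sentences]
    bigram_counts = Counter(
        bigram for toks in tokenized for bigram in zip(toks, toks[1:])
    )
    # A bigram "repeats" if it occurs more than once in the whole text.
    repeating = {bigram for bigram, n in bigram_counts.items() if n > 1}

    weights = Counter()
    for toks in tokenized:
        for i, tok in enumerate(toks):
            in_left = i > 0 and (toks[i - 1], tok) in repeating
            in_right = i + 1 < len(toks) and (tok, toks[i + 1]) in repeating
            if in_left or in_right:
                weights[tok] += 1  # count this occurrence once
    return weights


def summarize(text, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    weights = term_weights(sentences)
    # Scoring rule (assumption): sum the weights of distinct terms in a sentence.
    scores = [sum(weights[t] for t in set(tokenize(s))) for s in sentences]
    # Ties are broken in favor of the earlier sentence.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "Text summarization saves time. "
        "The weather is nice today. "
        "Good text summarization extracts sentences."
    )
    print(summarize(sample, k=2))
```

In the sample text, the only repeating bigram is ("text", "summarization"), so only those two words receive a weight, and the sentence about the weather scores zero. Maximal frequent sequences, the second kind of term mentioned in the abstract, are not covered by this sketch.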