Multi-document Automatic Text Summarization Using Entropy Estimates

Ravindra, G.; Balakrishnan, N.; Ramakrishnan, K. R.

doi:10.1007/978-3-540-24618-3_25


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2932)

Abstract

This paper describes a sentence-ranking technique that uses entropy measures in a multi-document summarization application for unstructured text. The method is topic-specific and uses a simple, language-independent training framework to calculate the entropies of symbol units. The document set is summarized by assigning entropy-based scores to a reduced set of sentences, obtained using a graph representation of sentence similarity. On the same data set, the method performs better than some common statistical techniques. Commonly used measures such as precision, recall and F-score are modified to form a new set of measures for comparing summarizers, and the rationale behind the modification is presented. Experimental results illustrate the relevance of the method in cases where language-specific dictionaries, translators and document-summary pairs for training are difficult to obtain.
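The abstract does not give the scoring formula or the graph construction, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "symbol units" are lowercase word tokens, that entropy is estimated as average unigram surprisal with add-one smoothing, and that the sentence set is reduced by keeping one sentence per connected component of a Jaccard-similarity graph. All function names, the similarity measure and the 0.5 threshold are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; an assumed stand-in for 'symbol units'."""
    return re.findall(r"\w+", text.lower())


def train_unigram(training_texts):
    """Count tokens over a topic-specific training corpus."""
    counts = Counter()
    for text in training_texts:
        counts.update(tokenize(text))
    return counts


def surprisal(token, counts):
    """-log2 p(token) in bits, with add-one smoothing (an assumption)."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by unseen tokens
    return -math.log2((counts[token] + 1) / (total + vocab))


def sentence_score(sentence, counts):
    """Average surprisal of the sentence's tokens, in bits per token."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(surprisal(t, counts) for t in tokens) / len(tokens)


def jaccard(a, b):
    """Token-set overlap between two sentences (assumed similarity measure)."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def reduce_sentences(sentences, threshold=0.5):
    """Link sentences whose similarity meets the threshold, then keep the
    first sentence (in input order) of each connected component."""
    n = len(sentences)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if jaccard(sentences[i], sentences[j]) >= threshold:
                parent[find(j)] = find(i)

    seen, kept = set(), []
    for i, sentence in enumerate(sentences):
        root = find(i)
        if root not in seen:
            seen.add(root)
            kept.append(sentence)
    return kept


def rank_sentences(sentences, counts):
    """Score the reduced sentence set and sort by descending score."""
    reduced = reduce_sentences(sentences)
    scored = [(sentence_score(s, counts), s) for s in reduced]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Whether high or low scores should be preferred when selecting summary sentences is not stated in the abstract; `rank_sentences` only orders the candidates, and the selection rule is left to the caller.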



Author information

Authors and Affiliations

Institute of Science, Supercomputer Education and Research Center, Bangalore, 560012, India
G. Ravindra & N. Balakrishnan
Dept. of Electrical Engineering, Institute of Science, Bangalore, 560012, India
K. R. Ramakrishnan


Editor information

Editors and Affiliations

Faculty of Sciences, ILLC - Department of Mathematics and Computer Science, University of Amsterdam, Plantage Muidergracht 24, 1018 TV, Amsterdam, The Netherlands
Peter Van Emde Boas
Faculty of Mathematics and Physics, Charles University, Prague
Jaroslav Pokorný
Institute of Informatics and Software Engineering Faculty of Informatics and Information technologies, Slovak University of Technology, Ilkovičova 3, 842 16, Bratislava
Mária Bieliková
Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, 182 07, Prague 8 Czech Republic
Július Štuller


About this paper

Cite this paper

Ravindra, G., Balakrishnan, N., Ramakrishnan, K.R. (2004). Multi-document Automatic Text Summarization Using Entropy Estimates. In: Van Emde Boas, P., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2004: Theory and Practice of Computer Science. SOFSEM 2004. Lecture Notes in Computer Science, vol 2932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24618-3_25


DOI: https://doi.org/10.1007/978-3-540-24618-3_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20779-5
Online ISBN: 978-3-540-24618-3
eBook Packages: Springer Book Archive
