Combining Summaries Using Unsupervised Rank Aggregation

Palshikar, Girish Keshav; Deshpande, Shailesh; Athiappan, G.

doi:10.1007/978-3-642-28601-8_32

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7182)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

We model the problem of combining multiple summaries of a given document into a single summary in terms of the well-known rank aggregation problem. Treating sentences in the document as candidates and summarization algorithms as voters, we determine the winners in an election where each voter selects and ranks k candidates in order of its preference. Many rank aggregation algorithms are supervised: they discover an optimal rank aggregation function from a training dataset in which each "record" consists of a set of candidate rankings and a model ranking. However, significant disagreements between model summaries created by human experts, as well as the high cost of creating them, make it interesting to explore unsupervised rank aggregation techniques. We use the well-known Condorcet methodology, including a new variation to improve its suitability. As voters, we include summarization algorithms from the literature and two new ones proposed here: the first is based on keywords and the second is a variant of the lexical-chain based algorithm in [1]. We experimentally demonstrate that the combined summary is often very similar (when compared using different measures) to the model summary produced manually by human experts.
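
The election described above can be illustrated with a minimal sketch of plain Condorcet-style aggregation over top-k sentence lists. This is not the authors' algorithm or their new variation, which the abstract does not specify. The function names, the Copeland-style count of pairwise wins, the tie-breaking by sentence index, and the rule that a sentence a voter did not select ranks below all sentences that voter did select are all assumptions made for illustration.

```python
from itertools import combinations


def prefers(ranking, a, b):
    """Return True if this voter ranks sentence a strictly above sentence b.

    Assumption: sentences the voter did not select sit below every selected
    sentence, and two unselected sentences are tied.
    """
    pos = {s: i for i, s in enumerate(ranking)}
    if a in pos and b in pos:
        return pos[a] < pos[b]
    return a in pos and b not in pos


def condorcet_aggregate(rankings, k):
    """Combine top-k rankings by counting pairwise majority wins.

    rankings: list of lists of sentence indices, best first (one per voter).
    Returns the k sentences with the most pairwise wins; ties are broken by
    the smaller sentence index (an arbitrary, hypothetical choice).
    """
    candidates = sorted({s for r in rankings for s in r})
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_votes = sum(prefers(r, a, b) for r in rankings)
        b_votes = sum(prefers(r, b, a) for r in rankings)
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1
    return sorted(candidates, key=lambda c: (-wins[c], c))[:k]


if __name__ == "__main__":
    # Three hypothetical summarizers, each selecting and ranking 3 sentences.
    voters = [[2, 5, 7], [5, 2, 9], [5, 7, 1]]
    print(condorcet_aggregate(voters, 3))  # prints [5, 2, 7]
```

In this toy election, sentence 5 beats every other candidate in pairwise contests, so it is the Condorcet winner. Sentences 2 and 7 follow with three and two pairwise wins respectively.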

References

1. Stokes, N.: Applications of Lexical Cohesion in the Topic Detection and Tracking Domain, Ph.D. thesis, National University of Ireland, Dublin (2004)
2. van Halteren, H., Teufel, S.: Examining the consensus between human summaries: initial experiments with factoid analysis. In: Proceedings of the HLT-NAACL 2003 on Text Summarization Workshop (HLT-NAACL-DUC 2003), vol. 5, pp. 57–64 (2003)
3. Jing, H., Barzilay, R., McKeown, K., Elhadad, M.: Summarization evaluation methods: Experiments and analysis. In: AAAI Symposium on Intelligent Summarization, pp. 60–68 (1998)
4. Hovy, E.H.: Automated text summarization. In: Mitkov, R. (ed.) The Oxford Handbook of Computational Linguistics, pp. 583–598 (2005)
5. Teufel, S., Moens, M.: Sentence extraction as a classification task. In: Proc. Workshop on Intelligent Scalable Summarization ACL/EACL Conference, pp. 58–65 (1997)
6. Kupiec, J., Pedersen, J., Chen, F.: A trainable document summarizer. In: Proc. 18th Int. ACM Conf. Research and Development in Information Retrieval (SIGIR), pp. 68–73 (1995)
7. Hovy, E., Lin, C.Y.: Automated text summarization in summarist. In: Maybury, M., Mani, I. (eds.) Advances in Automatic Text Summarization. MIT Press (1999)
8. Edmundson, H.P.: New methods in automatic extraction. Journal of the ACM 16(2), 264–285 (1968)
9. Matsumura, N., Ohsawa, Y., Ishizuka, M.: Pai: Automatic indexing for extracting assorted keywords from a document. In: Proc. AAAI 2002 (2002)
10. Matsuo, Y., Ishizuka, M.: Keyword extraction from a single document using word co-occurrence statistical information. International Journal on AI Tools 13(1), 157–169 (2004)
11. Ohsawa, Y., Benson, N.E., Yachida, M.: Keygraph: automatic indexing by co-occurrence graph based on building construction metaphor. In: Proc. Advanced Digital Library Conference (ADL 1998), pp. 12–18 (1998)
12. Palshikar, G.: Keyword Extraction from Single Document Using Centrality Measures. In: Ghosh, A., De, R.K., Pal, S.K. (eds.) PReMI 2007. LNCS, vol. 4815, pp. 503–510. Springer, Heidelberg (2007)
13. Bouras, C., Poulopoulos, V., Tsogkas, V.: PeRSSonal's core functionality evaluation: Enhancing text labeling through personalized summaries. Data and Knowledge Engineering 64(1), 330–345 (2008)
14. Zha, H.: Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. In: Proc. 25th Int. ACM Conf. Research and Development in Information Retrieval (SIGIR), pp. 113–120 (2002)
15. Wan, X., Yang, J., Xiao, J.: Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. In: ACL, pp. 552–559 (2007)
16. Morris, J., Hirst, G.: Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics 17, 21–48 (1991)
17. Alam, H., Kumar, A., Nakamura, M., Rahman, F., Tarnikova, Y., Wilcox, C.: Structured and unstructured document summarization: Design of a commercial summarizer using lexical chains. In: Proc. Seventh Int. Conf. Document Analysis and Recognition (ICDAR 2003), pp. 1147–1152 (2003)
18. Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. In: Proceedings of the ACL Workshop on Intelligent Scalable Text Summarization, pp. 10–17 (1997)
19. Mani, I., Bloedorn, E., Gates, B.: Using cohesion and coherence models for text summarization. In: Using Cohesion and Coherence Models for Text Summarization, pp. 69–76 (1998)
20. Nenkova, A.: Summarization evaluation for text and speech: issues and approaches. In: Ninth International Conference on Spoken Language Processing, INTERSPEECH 2006 (2006)
21. Lin, C.Y., Cao, G., Gao, J., Nie, J.Y.: An information-theoretic approach to automatic evaluation of summaries. In: Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL 2006), pp. 463–470 (2006)
22. Louis, A., Nenkova, A.: Automatic summary evaluation without human models. In: Proc. of Text Analysis Conference, TAC 2008 (2008)
23. Nenkova, A., Passonneau, R., McKeown, K.: The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Trans. Speech Lang. Process. 4 (2007)
24. Hovy, E., Lin, C.Y., Zhou, L., Fukumoto, J.: Automated summarization evaluation with basic element. In: Proceedings of the Fifth Conference on Language Resources and Evaluation (LREC 2006) (2006)
25. Radev, D.R., Tam, D.: Summarization evaluation using relative utility. In: Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003), pp. 508–511 (2003)
26. Fagin, R., Kumar, R., Sivakumar, D.: Efficient similarity search and classification via rank aggregation. In: Proc. of 2003 ACM SIGMOD Int. Conf. on Management of Data, pp. 301–312 (2003)
27. Dwork, C., Kumar, R., Naor, M., Sivakumar, D.: Rank aggregation methods for the web. In: Proc. of 10th Int. World Wide Web Conference, pp. 613–622 (2001)
28. Kang, B.Y.: A novel approach to semantic indexing based on concept. In: Proc. 41st Annual Meeting of Association of Computational Linguistics (ACL 2003), vol. 2, pp. 44–49 (2003)
29. Liu, K., Terzi, E., Grandison, T.: Manyaspects: a system for highlighting diverse concepts in documents. In: Proc. Int. Conf. Very Large Databases (VLDB), pp. 1444–1447 (2008)
30. Erkan, G., Radev, D.R.: LexRank: Graph-based lexical centrality as salience in text summarization. Journal of AI Research 22, 457–479 (2004)

Author information

Authors and Affiliations

Tata Research Development and Design Centre (TRDDC), Tata Consultancy Services Limited, 54B, Hadapsar Industrial Estate, Pune, 411013, India
Girish Keshav Palshikar, Shailesh Deshpande & G. Athiappan

Editor information

Editors and Affiliations

Center for Computing Research (CIC), National Polytechnic Institute (IPN), Mexico City, Mexico
Alexander Gelbukh

Cite this paper

Palshikar, G.K., Deshpande, S., Athiappan, G. (2012). Combining Summaries Using Unsupervised Rank Aggregation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_32

DOI: https://doi.org/10.1007/978-3-642-28601-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28600-1
Online ISBN: 978-3-642-28601-8
eBook Packages: Computer Science, Computer Science (R0)
