Abstract
We model the problem of combining multiple summaries of a given document into a single summary in terms of the well-known rank aggregation problem. Treating sentences in the document as candidates and summarization algorithms as voters, we determine the winners in an election where each voter selects and ranks k candidates in order of its preference. Many rank aggregation algorithms are supervised: they discover an optimal rank aggregation function from a training dataset in which each "record" consists of a set of candidate rankings and a model ranking. However, significant disagreements between model summaries created by human experts, as well as the high cost of creating them, make it worthwhile to explore unsupervised rank aggregation techniques. We use the well-known Condorcet methodology, including a new variation to improve its suitability. As voters, we include summarization algorithms from the literature and two new ones proposed here: the first is based on keywords and the second is a variant of the lexical-chain based algorithm in [1]. We experimentally demonstrate that the combined summary is often very similar (when compared using different measures) to the model summary produced manually by human experts.
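The election described above can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the new Condorcet variation, so the code below assumes a plain Copeland-style count of pairwise majority wins, treats any sentence a voter did not select as tied below that voter's k choices, and breaks score ties by sentence index. The function name `condorcet_aggregate` and the sample rankings are hypothetical.

```python
from itertools import combinations


def condorcet_aggregate(rankings, k):
    """Combine top-k sentence rankings by pairwise majority contests.

    rankings: one list per voter (summarizer), holding sentence indices
              in order of preference, best first.
    k:        number of sentences to keep in the combined summary.
    """
    candidates = sorted({c for r in rankings for c in r})

    def position(ranking, candidate):
        # Assumption: a sentence the voter did not select ranks below
        # every sentence the voter did select.
        return ranking.index(candidate) if candidate in ranking else len(ranking)

    # Copeland-style score: number of pairwise contests won by majority.
    score = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_votes = sum(position(r, a) < position(r, b) for r in rankings)
        b_votes = sum(position(r, b) < position(r, a) for r in rankings)
        if a_votes > b_votes:
            score[a] += 1
        elif b_votes > a_votes:
            score[b] += 1

    # Highest score first; ties broken by sentence index (an assumption).
    ranked = sorted(candidates, key=lambda c: (-score[c], c))
    return ranked[:k]


if __name__ == "__main__":
    # Three hypothetical summarizers, each selecting and ranking k = 3 sentences.
    voters = [[2, 5, 1], [5, 2, 7], [2, 1, 5]]
    print(condorcet_aggregate(voters, k=3))  # [2, 5, 1]
```

In this example sentence 2 beats every other sentence in a head-to-head majority vote, so it is the Condorcet winner and leads the combined summary. No training data or model summary is needed, which is the sense in which the aggregation is unsupervised.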
References
Stokes, N.: Applications of Lexical Cohesion in the Topic Detection and Tracking Domain, Ph.D. thesis, National University of Ireland, Dublin (2004)
van Halteren, H., Teufel, S.: Examining the consensus between human summaries: initial experiments with factoid analysis. In: Proceedings of the HLT-NAACL 2003 on Text Summarization Workshop (HLT-NAACL-DUC 2003), vol. 5, pp. 57–64 (2003)
Jing, H., Barzilay, R., McKeown, K., Elhadad, M.: Summarization evaluation methods: Experiments and analysis. In: AAAI Symposium on Intelligent Summarization, pp. 60–68 (1998)
Hovy, E.H.: Automated text summarization. In: Mitkov, R. (ed.) The Oxford Handbook of Computational Linguistics, pp. 583–598 (2005)
Teufel, S., Moens, M.: Sentence extraction as a classification task. In: Proc. Workshop on Intelligent Scalable Summarization ACL/EACL Conference, pp. 58–65 (1997)
Kupiec, J., Pedersen, J., Chen, F.: A trainable document summarizer. In: Proc. 18th Int. ACM Conf. Research and Development in Information Retrieval (SIGIR), pp. 68–73 (1995)
Hovy, E., Lin, C.Y.: Automated text summarization in summarist. In: Maybury, M., Mani, I. (eds.) Advances in Automatic Text Summarization. MIT Press (1999)
Edmundson, H.P.: New methods in automatic extraction. Journal of the ACM 16(2), 264–285 (1968)
Matsumura, N., Ohsawa, Y., Ishizuka, M.: Pai: Automatic indexing for extracting assorted keywords from a document. In: Proc. AAAI 2002 (2002)
Matsuo, Y., Ishizuka, M.: Keyword extraction from a single document using word co-occurrence statistical information. International Journal on AI Tools 13(1), 157–169 (2004)
Ohsawa, Y., Benson, N.E., Yachida, M.: Keygraph: automatic indexing by co-occurrence graph based on building construction metaphor. In: Proc. Advanced Digital Library Conference (ADL 1998), pp. 12–18 (1998)
Palshikar, G.: Keyword Extraction from Single Document Using Centrality Measures. In: Ghosh, A., De, R.K., Pal, S.K. (eds.) PReMI 2007. LNCS, vol. 4815, pp. 503–510. Springer, Heidelberg (2007)
Bouras, C., Poulopoulos, V., Tsogkas, V.: PeRSSonal’s core functionality evaluation: Enhancing text labeling through personalized summaries. Data and Knowledge Engineering 64(1), 330–345 (2008)
Zha, H.: Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. In: Proc. 25th Int. ACM Conf. Research and Development in Information Retrieval (SIGIR), pp. 113–120 (2002)
Wan, X., Yang, J., Xiao, J.: Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. In: ACL, pp. 552–559 (2007)
Morris, J., Hirst, G.: Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics 17, 21–48 (1991)
Alam, H., Kumar, A., Nakamura, M., Rahman, F., Tarnikova, Y., Wilcox, C.: Structured and unstructured document summarization: Design of a commercial summarizer using lexical chains. In: Proc. Seventh Int. Conf. Document Analysis and Recognition (ICDAR 2003), pp. 1147–1152 (2003)
Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. In: Proceedings of the ACL Workshop on Intelligent Scalable Text Summarization, pp. 10–17 (1997)
Mani, I., Bloedorn, E., Gates, B.: Using cohesion and coherence models for text summarization. In: Using Cohesion and Coherence Models for Text Summarization, pp. 69–76 (1998)
Nenkova, A.: Summarization evaluation for text and speech: issues and approaches. In: Ninth International Conference on Spoken Language Processing, INTERSPEECH 2006 (2006)
Lin, C.Y., Cao, G., Gao, J., Nie, J.Y.: An information-theoretic approach to automatic evaluation of summaries. In: Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL 2006), pp. 463–470 (2006)
Louis, A., Nenkova, A.: Automatic summary evaluation without human models. In: Proc. of Text Analysis Conference, TAC 2008 (2008)
Nenkova, A., Passonneau, R., McKeown, K.: The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Trans. Speech Lang. Process. 4 (2007)
Hovy, E., Lin, C.Y., Zhou, L., Fukumoto, J.: Automated summarization evaluation with basic element. In: Proceedings of the Fifth Conference on Language Resources and Evaluation (LREC 2006) (2006)
Radev, D.R., Tam, D.: Summarization evaluation using relative utility. In: Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003), pp. 508–511 (2003)
Fagin, R., Kumar, R., Sivakumar, D.: Efficient similarity search and classification via rank aggregation. In: Proc. of 2003 ACM SIGMOD Int. Conf. on Management of Data, pp. 301–312 (2003)
Dwork, C., Kumar, R., Naor, M., Sivakumar, D.: Rank aggregation methods for the web. In: Proc. of 10th Int. World Wide Web Conference, pp. 613–622 (2001)
Kang, B.Y.: A novel approach to semantic indexing based on concept. In: Proc. 41st Annual Meeting of Association of Computational Linguistics (ACL 2003), vol. 2, pp. 44–49 (2003)
Liu, K., Terzi, E., Grandison, T.: Manyaspects: a system for highlighting diverse concepts in documents. In: Proc. Int. Conf. Very Large Databases (VLDB), pp. 1444–1447 (2008)
Erkan, G., Radev, D.R.: LexRank: Graph-based lexical centrality as salience in text summarization. Journal of AI Research 22, 457–479 (2004)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Palshikar, G.K., Deshpande, S., Athiappan, G. (2012). Combining Summaries Using Unsupervised Rank Aggregation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_32
DOI: https://doi.org/10.1007/978-3-642-28601-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28600-1
Online ISBN: 978-3-642-28601-8
eBook Packages: Computer Science, Computer Science (R0)