Combining Summaries Using Unsupervised Rank Aggregation

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2012)

Abstract

We model the problem of combining multiple summaries of a given document into a single summary in terms of the well-known rank aggregation problem. Treating sentences in the document as candidates and summarization algorithms as voters, we determine the winners in an election where each voter selects and ranks k candidates in order of its preference. Many rank aggregation algorithms are supervised: they discover an optimal rank aggregation function from a training dataset in which each "record" consists of a set of candidate rankings and a model ranking. However, significant disagreements between model summaries created by human experts, together with the high cost of creating them, make it interesting to explore unsupervised rank aggregation techniques. We use the well-known Condorcet methodology, including a new variation to improve its suitability. As voters, we include summarization algorithms from the literature and two new ones proposed here: the first is based on keywords and the second is a variant of the lexical-chain-based algorithm in [1]. We experimentally demonstrate that the combined summary is often very similar (when compared using different measures) to the model summary produced manually by human experts.
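
The election described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper's new Condorcet variation is not described here, so the code below uses a plain pairwise-majority (Copeland-style) count. The function name `condorcet_aggregate`, the rule that a sentence missing from a voter's list ranks below every sentence that voter did list, and the tie-break by sentence id are all assumptions made for illustration.

```python
from itertools import combinations


def condorcet_aggregate(rankings, k):
    """Combine top-k rankings from several voters (summarizers).

    rankings: list of lists; each inner list holds one voter's chosen
              sentence ids, most preferred first.
    Returns the k sentence ids with the most pairwise-majority wins.
    """
    candidates = sorted({c for ranking in rankings for c in ranking})
    # Position of each ranked sentence for each voter.
    positions = [{c: i for i, c in enumerate(r)} for r in rankings]

    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_votes = b_votes = 0
        for pos in positions:
            # Assumption: an unranked sentence sits below all ranked ones.
            pa = pos.get(a, len(pos))
            pb = pos.get(b, len(pos))
            if pa < pb:
                a_votes += 1
            elif pb < pa:
                b_votes += 1
        # The sentence preferred by a majority of voters wins the pair.
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1

    # Assumption: ties in win count are broken by the smaller sentence id.
    ranked = sorted(candidates, key=lambda c: (-wins[c], c))
    return ranked[:k]


if __name__ == "__main__":
    # Three hypothetical summarizers, each ranking three sentences.
    voters = [[1, 2, 3], [2, 1, 4], [2, 3, 1]]
    print(condorcet_aggregate(voters, k=2))
```

With these three hypothetical voters, sentence 2 beats every other sentence in pairwise comparison, so it is the Condorcet winner and heads the combined summary. No training data or model summary is needed, which is the sense in which the aggregation is unsupervised.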

References

  1. Stokes, N.: Applications of Lexical Cohesion in the Topic Detection and Tracking Domain. Ph.D. thesis, National University of Ireland, Dublin (2004)

  2. van Halteren, H., Teufel, S.: Examining the consensus between human summaries: initial experiments with factoid analysis. In: Proceedings of the HLT-NAACL 2003 on Text Summarization Workshop (HLT-NAACL-DUC 2003), vol. 5, pp. 57–64 (2003)

  3. Jing, H., Barzilay, R., McKeown, K., Elhadad, M.: Summarization evaluation methods: Experiments and analysis. In: AAAI Symposium on Intelligent Summarization, pp. 60–68 (1998)

  4. Hovy, E.H.: Automated text summarization. In: Mitkov, R. (ed.) The Oxford Handbook of Computational Linguistics, pp. 583–598 (2005)

  5. Teufel, S., Moens, M.: Sentence extraction as a classification task. In: Proc. Workshop on Intelligent Scalable Summarization ACL/EACL Conference, pp. 58–65 (1997)

  6. Kupiec, J., Pedersen, J., Chen, F.: A trainable document summarizer. In: Proc. 18th Int. ACM Conf. Research and Development in Information Retrieval (SIGIR), pp. 68–73 (1995)

  7. Hovy, E., Lin, C.Y.: Automated text summarization in summarist. In: Maybury, M., Mani, I. (eds.) Advances in Automatic Text Summarization. MIT Press (1999)

  8. Edmundson, H.P.: New methods in automatic extraction. Journal of the ACM 16(2), 264–285 (1968)

  9. Matsumura, N., Ohsawa, Y., Ishizuka, M.: Pai: Automatic indexing for extracting assorted keywords from a document. In: Proc. AAAI 2002 (2002)

  10. Matsuo, Y., Ishizuka, M.: Keyword extraction from a single document using word co-occurrence statistical information. International Journal on AI Tools 13(1), 157–169 (2004)

  11. Ohsawa, Y., Benson, N.E., Yachida, M.: Keygraph: automatic indexing by co-occurrence graph based on building construction metaphor. In: Proc. Advanced Digital Library Conference (ADL 1998), pp. 12–18 (1998)

  12. Palshikar, G.: Keyword Extraction from Single Document Using Centrality Measures. In: Ghosh, A., De, R.K., Pal, S.K. (eds.) PReMI 2007. LNCS, vol. 4815, pp. 503–510. Springer, Heidelberg (2007)

  13. Bouras, C., Poulopoulos, V., Tsogkas, V.: Perssonal’s core functionality evaluation: Enhancing text labeling through personalized summaries. Data and Knowledge Engineering 64(1), 330–345 (2008)

  14. Zha, H.: Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering. In: Proc. 25th Int. ACM Conf. Research and Development in Information Retrieval (SIGIR), pp. 113–120 (2002)

  15. Wan, X., Yang, J., Xiao, J.: Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. In: ACL, pp. 552–559 (2007)

  16. Morris, J., Hirst, G.: Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Computational Linguistics 17, 21–48 (1991)

  17. Alam, H., Kumar, A., Nakamura, M., Rahman, F., Tarnikova, Y., Wilcox, C.: Structured and unstructured document summarization: Design of a commercial summarizer using lexical chains. In: Proc. Seventh Int. Conf. Document Analysis and Recognition (ICDAR 2003), pp. 1147–1152 (2003)

  18. Barzilay, R., Elhadad, M.: Using lexical chains for text summarization. In: Proceedings of the ACL Workshop on Intelligent Scalable Text Summarization, pp. 10–17 (1997)

  19. Mani, I., Bloedorn, E., Gates, B.: Using cohesion and coherence models for text summarization. In: Using Cohesion and Coherence Models for Text Summarization, pp. 69–76 (1998)

  20. Nenkova, A.: Summarization evaluation for text and speech: issues and approaches. In: Ninth International Conference on Spoken Language Processing, INTERSPEECH 2006 (2006)

  21. Lin, C.Y., Cao, G., Gao, J., Nie, J.Y.: An information-theoretic approach to automatic evaluation of summaries. In: Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL 2006), pp. 463–470 (2006)

  22. Louis, A., Nenkova, A.: Automatic summary evaluation without human models. In: Proc. of Text Analysis Conference, TAC 2008 (2008)

  23. Nenkova, A., Passonneau, R., McKeown, K.: The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Trans. Speech Lang. Process. 4 (2007)

  24. Hovy, E., Lin, C.Y., Zhou, L., Fukumoto, J.: Automated summarization evaluation with basic element. In: Proceedings of the Fifth Conference on Language Resources and Evaluation (LREC 2006) (2006)

  25. Radev, D.R., Tam, D.: Summarization evaluation using relative utility. In: Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM 2003), pp. 508–511 (2003)

  26. Fagin, R., Kumar, R., Sivakumar, D.: Efficient similarity search and classification via rank aggregation. In: Proc. of 2003 ACM SIGMOD Int. Conf. on Management of Data, pp. 301–312 (2003)

  27. Dwork, C., Kumar, R., Naor, M., Sivakumar, D.: Rank aggregation methods for the web. In: Proc. of 10th Int. World Wide Web Conference, pp. 613–622 (2001)

  28. Kang, B.Y.: A novel approach to semantic indexing based on concept. In: Proc. 41st Annual Meeting of Association of Computational Linguistics (ACL 2003), vol. 2, pp. 44–49 (2003)

  29. Liu, K., Terzi, E., Grandison, T.: Manyaspects: a system for highlighting diverse concepts in documents. In: Proc. Int. Conf. Very Large Databases (VLDB), pp. 1444–1447 (2008)

  30. Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. Journal of AI Research 22, 457–479 (2004)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Palshikar, G.K., Deshpande, S., Athiappan, G. (2012). Combining Summaries Using Unsupervised Rank Aggregation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_32

  • DOI: https://doi.org/10.1007/978-3-642-28601-8_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28600-1

  • Online ISBN: 978-3-642-28601-8

  • eBook Packages: Computer Science, Computer Science (R0)