Sentence Ordering for Coherent Multi-document Summary Generation

  • C. R. Chowdary
  • P. Sreenivasa Kumar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5071)

Abstract

Web queries often return a large number of documents, and the user is overwhelmed by the information. Query-specific extractive summarization of a selected set of retrieved documents helps the user get the gist of the information. Current extractive summary generation systems focus on extracting query-relevant sentences from the documents. However, the selected sentences are presented either in the order in which the documents were considered or in the order in which the sentences were selected. This approach does not guarantee a coherent summary. In this paper, we propose an incremental integrated graph to represent the sentences in a collection of documents. Sentences from the documents are merged into a master sequence to improve coherence and flow. The same ordering is used for sequencing the sentences in the extracted summary. User evaluations indicate that the proposed technique markedly improves user satisfaction with regard to the coherence of the summary.
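
The idea of merging sentences into a master sequence and reusing that order for the summary can be illustrated with a minimal sketch. This is not the paper's incremental integrated graph: the abstract gives no construction details, so the sketch substitutes a plain list for the graph and word-overlap (Jaccard) similarity for whatever sentence matching the authors use. The function names, the `threshold` parameter, and the insertion rule are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (assumed measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def merge_document(master, sentences, threshold=0.5):
    """Merge one document's sentences into the master sequence, in place.

    Assumed rule: a sentence that closely matches an existing master
    sentence is treated as a duplicate and only moves the cursor; any
    other sentence is inserted right after the cursor, so it stays next
    to the sentence that preceded it in its own document.
    """
    cursor = -1
    for s in sentences:
        best_idx, best_sim = -1, 0.0
        for i, m in enumerate(master):
            sim = jaccard(s, m)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_sim >= threshold:
            cursor = best_idx          # near-duplicate: anchor here
        else:
            cursor += 1
            master.insert(cursor, s)   # new content: place after anchor
    return master


def build_master(documents, threshold=0.5):
    """Process documents one at a time (the 'incremental' part)."""
    master = []
    for doc in documents:
        merge_document(master, doc, threshold)
    return master


def order_summary(selected, master):
    """Sort extracted sentences by their closest position in the master."""
    if not master:
        return list(selected)

    def position(s):
        sims = [jaccard(s, m) for m in master]
        return max(range(len(master)), key=sims.__getitem__)

    return sorted(selected, key=position)


docs = [
    ["a b c", "d e f"],
    ["a b c", "x y z", "d e f"],
]
master = build_master(docs)
summary = order_summary(["d e f", "x y z"], master)
```

Here the second document contributes one new sentence, which is slotted between the two sentences it shares with the first document, and the extracted sentences are then presented in master order rather than selection order.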

Keywords

Summarization; Coherent; Incremental Integrated Graph; Ordering of Sentences

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • C. R. Chowdary (1)
  • P. Sreenivasa Kumar (1)
  1. Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India
