The Great Importance of Cross-Document Relationships for Multi-document Summarization

  • Xiaojun Wan
  • Jianwu Yang
  • Jianguo Xiao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4285)

Abstract

Graph-based methods for multi-document summarization have been developed in recent years. They use the relationships between sentences in a graph-based ranking algorithm to extract salient sentences. This paper proposes to differentiate between the cross-document relationships and the within-document relationships between sentences for multi-document summarization. The two kinds of relationships are deemed to contribute unequally to the graph-based ranking algorithm. We apply the graph-based ranking algorithm to each kind of sentence relationship separately and explore their relative importance for multi-document summarization. Experimental results on the DUC 2002 and DUC 2004 data demonstrate the great importance of the cross-document relationships between sentences. Even a system based only on the cross-document relationships performs better than, or at least as well as, systems based on both kinds of relationships.
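
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' exact formulation, which this page does not give. It assumes bag-of-words cosine similarity as the sentence relationship, a PageRank-style iteration with a damping factor of 0.85, and two hypothetical weights, `w_intra` and `w_inter`, that scale within-document and cross-document edges respectively.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_sentences(sentences, w_intra=1.0, w_inter=1.0, damping=0.85, iters=100):
    """Score sentences given as (doc_id, text) pairs.

    w_intra and w_inter are illustrative names (assumptions): they weight
    edges between sentences of the same document and of different documents.
    """
    n = len(sentences)
    bags = [Counter(text.lower().split()) for _, text in sentences]

    # Affinity matrix: similarity scaled by the type of relationship.
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            same_doc = sentences[i][0] == sentences[j][0]
            weight = w_intra if same_doc else w_inter
            m[i][j] = weight * cosine(bags[i], bags[j])

    # Row-normalize so each sentence distributes its score over its neighbors.
    # Sentences with no neighbors keep an all-zero row and pass nothing on.
    for row in m:
        total = sum(row)
        if total > 0:
            row[:] = [v / total for v in row]

    # PageRank-style power iteration for a fixed number of steps.
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * m[j][i] for j in range(n))
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    docs = [
        ("d1", "the flood hit the city"),
        ("d1", "officials met on monday"),
        ("d2", "the flood damaged the city"),
        ("d2", "weather was sunny elsewhere"),
    ]
    # Cross-document relationships only: within-document edges are switched off.
    for (doc_id, text), score in zip(docs, rank_sentences(docs, w_intra=0.0)):
        print(f"{score:.4f}  [{doc_id}] {text}")
```

Setting `w_intra=0.0` corresponds to a system that uses only cross-document relationships, and `w_inter=0.0` to one that uses only within-document relationships. Intermediate values mix the two.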

Keywords

Longest Common Subsequence, Text Summarization, Summarization Method, Document Summarization, Diversity Penalty
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xiaojun Wan (1)
  • Jianwu Yang (1)
  • Jianguo Xiao (1)
  1. Institute of Computer Science and Technology, Peking University, Beijing, China
