Document-Aware Graph Models for Query-Oriented Multi-document Summarization

  • Furu Wei
  • Wenjie Li
  • Yanxiang He
Part of the Studies in Computational Intelligence book series (SCI, volume 346)

Abstract

Sentence ranking is the issue of greatest concern in document summarization. In recent years, graph-based summarization models and sentence ranking algorithms have drawn considerable attention from the extractive summarization community because they compute sentence significance recursively from the entire text graph that links sentences together, rather than relying on a single sentence alone. However, when dealing with multi-document summarization, existing sentence ranking algorithms often merge the set of documents into one large file, so the document dimension is ignored. In this work, we develop two alternative models to integrate the document dimension into existing sentence ranking algorithms: the one-layer (i.e. sentence layer) document-sensitive model and the two-layer (i.e. document and sentence layers) mutual reinforcement model. While the former implicitly incorporates the document’s influence in sentence ranking, the latter explicitly formulates the mutual reinforcement between sentences and documents during ranking. The effectiveness of the proposed models and algorithms is examined on the DUC query-oriented multi-document summarization data sets.
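
To make the one-layer idea concrete, the following is a minimal sketch of a query-biased, PageRank-style sentence ranker in which edges between sentences from different documents are re-weighted. It is an illustration under stated assumptions, not the formulation used in the chapter: the function name `rank_sentences`, the `cross_doc_weight` factor, the damping value, and the use of a normalised query-relevance vector as the jump distribution are all hypothetical choices made for this example.

```python
from typing import List


def rank_sentences(
    sim: List[List[float]],      # symmetric sentence-similarity matrix
    doc_of: List[int],           # doc_of[i] = index of the document containing sentence i
    query_rel: List[float],      # non-negative relevance of each sentence to the query
    cross_doc_weight: float = 1.5,  # assumed factor applied to cross-document edges
    damping: float = 0.85,          # assumed damping factor, as in PageRank
    iters: int = 100,
    tol: float = 1e-9,
) -> List[float]:
    n = len(sim)

    # Re-weight edges so that links crossing document boundaries count
    # differently from links inside one document (illustrative heuristic).
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                factor = cross_doc_weight if doc_of[i] != doc_of[j] else 1.0
                weights[i][j] = sim[i][j] * factor

    # Query prior, normalised to sum to 1 (uniform if no sentence is relevant).
    total = sum(query_rel)
    prior = [r / total for r in query_rel] if total > 0 else [1.0 / n] * n

    # Row-normalise into transition probabilities; isolated sentences jump to the prior.
    trans = []
    for i in range(n):
        row_sum = sum(weights[i])
        trans.append([w / row_sum for w in weights[i]] if row_sum > 0 else prior[:])

    # Power iteration until the scores stop changing.
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) * prior[j]
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores
```

Called with a similarity matrix, a sentence-to-document mapping, and per-sentence query relevance, the function returns one score per sentence, and the scores sum to 1. Setting `cross_doc_weight=1.0` reduces the sketch to a document-blind ranker, which is the baseline behaviour the abstract criticises.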

Keywords

Query-oriented multi-document summarization; document-sensitive sentence ranking; mutual-reinforcement sentence ranking

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Furu Wei (1, 2)
  • Wenjie Li (1)
  • Yanxiang He (3)
  1. Department of Computing, The Hong Kong Polytechnic University, Hong Kong
  2. IBM China Research Laboratory, Beijing, China
  3. Department of Computer Science and Technology, Wuhan University, Wuhan, China
