Multi-document Summarization Using a Clustering-Based Hybrid Strategy

  • Yu Nie
  • Donghong Ji
  • Lingpeng Yang
  • Zhengyu Niu
  • Tingting He
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)


In this paper we propose a clustering-based hybrid approach to multi-document summarization that integrates sentence clustering, local recommendation, and global search. For sentence clustering, we adopt a stability-based method that determines the optimal number of clusters automatically. For local recommendation, we weight the sentences in each cluster by the terms they contain. For global selection, we propose a global criterion that evaluates the overall performance of a summary. The sentences in the final summary are thus determined not only by the configuration of individual clusters but also by the overall performance of the summary. The approach achieves top-level performance on the DUC04 corpus.
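The interplay between local recommendation and global selection can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the term-weighting formula or the global criterion, so the average in-cluster term frequency used in `local_scores`, the term-coverage measure in `global_score`, and the `top_k` parameter are all assumptions chosen for illustration. The clusters are taken as given, and the stability-based estimation of the cluster number is not implemented here.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def local_scores(cluster):
    """Assumed local weighting: mean in-cluster sentence frequency of a sentence's distinct terms."""
    freq = Counter(t for s in cluster for t in set(tokenize(s)))
    scores = {}
    for s in cluster:
        terms = set(tokenize(s))
        scores[s] = sum(freq[t] for t in terms) / len(terms) if terms else 0.0
    return scores


def global_score(summary, corpus_terms):
    """Assumed global criterion: share of corpus term occurrences covered by the summary."""
    covered = {t for s in summary for t in tokenize(s)}
    total = sum(corpus_terms.values())
    if total == 0:
        return 0.0
    return sum(c for t, c in corpus_terms.items() if t in covered) / total


def summarize(clusters, top_k=2):
    """Pick one sentence per cluster.

    Each cluster recommends its top_k sentences by local score; the sentence
    kept is the candidate that gives the highest global score when added to
    the summary built so far.
    """
    corpus_terms = Counter(t for c in clusters for s in c for t in tokenize(s))
    summary = []
    for cluster in clusters:
        scores = local_scores(cluster)
        candidates = sorted(cluster, key=lambda s: -scores[s])[:top_k]
        best = max(candidates, key=lambda s: global_score(summary + [s], corpus_terms))
        summary.append(best)
    return summary


if __name__ == "__main__":
    # Toy clusters, invented for illustration only.
    clusters = [
        ["the cat sat on the mat", "a cat sat"],
        ["dogs bark loudly at night", "dogs bark"],
    ]
    for sentence in summarize(clusters):
        print(sentence)
```

In this sketch each cluster contributes exactly one sentence, so the final choice depends both on the cluster's own ranking and on how much the candidate adds to the summary as a whole.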


Cluster Number, Cluster Validation, Global Selection, Global Criterion, Local Recommendation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yu Nie (1)
  • Donghong Ji (1)
  • Lingpeng Yang (1)
  • Zhengyu Niu (1)
  • Tingting He (2)
  1. Institute for Infocomm Research, Singapore
  2. Huazhong Normal University, Wuhan, China
