Abstract
In this paper, we propose a method for multi-document summarization based on unsupervised clustering. First, the main topics are determined by a clustering strategy based on minimum description length (MDL), which can infer the optimal number of clusters. Then, the problem of multi-document summarization is formalized on the clusters using an entropy-based objective function.
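As a rough illustration of how an MDL criterion can infer the number of clusters, the sketch below scores each candidate cluster count by a two-part code length (negative log-likelihood plus a parameter cost) and keeps the minimum. Everything in it is an assumption made for illustration and is not taken from the paper: the 1-D toy data, the hard-assignment k-means step, the Gaussian cluster model, the variance floor, and the names `kmeans_1d`, `description_length`, and `select_k`. The paper's own clustering would operate on text representations rather than scalar values.

```python
import math


def kmeans_1d(xs, k, iters=100):
    """Plain 1-D k-means with deterministic quantile initialisation."""
    s = sorted(xs)
    centers = [s[int((i + 0.5) * len(s) / k)] for i in range(k)]
    groups = []
    for _ in range(iters):
        # Assign every point to its nearest center.
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: (x - centers[j]) ** 2)
            groups[nearest].append(x)
        # Recompute centers; an empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        converged = updated == centers
        centers = updated
        if converged:
            break
    return centers, groups


def description_length(xs, k, var_floor=1e-3):
    """Two-part code length: -log-likelihood + (params / 2) * log(n)."""
    n = len(xs)
    centers, groups = kmeans_1d(xs, k)
    log_lik = 0.0
    for mu, g in zip(centers, groups):
        if not g:
            continue  # empty clusters contribute nothing to the likelihood
        # Hypothetical variance floor to avoid degenerate zero-width clusters.
        var = max(sum((x - mu) ** 2 for x in g) / len(g), var_floor)
        weight = len(g) / n
        for x in g:
            log_lik += (math.log(weight)
                        - 0.5 * math.log(2 * math.pi * var)
                        - (x - mu) ** 2 / (2 * var))
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixture weights
    return -log_lik + 0.5 * n_params * math.log(n)


def select_k(xs, max_k):
    """Return the cluster count in 1..max_k with the shortest description."""
    return min(range(1, max_k + 1), key=lambda k: description_length(xs, k))


# Toy data with three well-separated groups (illustrative only).
data = [0.0, 0.1, 0.2, -0.1, -0.2,
        5.0, 5.1, 4.9, 5.2, 4.8,
        10.0, 10.1, 9.9, 10.2, 9.8]
best_k = select_k(data, max_k=6)
print(best_k)
```

The penalty term grows with the number of clusters, so adding a cluster only pays off when it shortens the encoding of the data by more than the cost of its extra parameters. That trade-off is what lets the criterion settle on a cluster count without a user-supplied value.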
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ji, P. (2006). Multi-document Summarization Based on Unsupervised Clustering. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_46
DOI: https://doi.org/10.1007/11880592_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45780-0
Online ISBN: 978-3-540-46237-8
eBook Packages: Computer Science, Computer Science (R0)