Abstract
This paper summarizes our work on comparative summarization and presents our results. Comparative summarization analyzes a set of input documents and produces summaries that capture the most significant differences between them. We experiment with two well-known methods, Latent Semantic Analysis and Latent Dirichlet Allocation, to obtain the latent topics of the documents. Comparing these topics reveals the main factual differences and lets us select the most significant sentences for the output summaries. Our algorithms are briefly explained in Section 2, and their evaluation on the TAC 2011 dataset with the ROUGE toolkit is presented in Section 3.
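The idea of comparing latent topics and then selecting sentences can be illustrated with a minimal LSA-style sketch. This is not the algorithm from the paper, whose details are not given in the abstract. The function names, the raw-count weighting, the whitespace tokenizer, and the scoring rule (topic novelty multiplied by singular value) are all assumptions made for illustration only.

```python
import numpy as np


def term_sentence_matrix(sentences, vocab):
    """Build a term-by-sentence matrix of raw counts (rows: terms, columns: sentences)."""
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(sentences)))
    for j, sentence in enumerate(sentences):
        for token in sentence.lower().split():
            if token in index:
                matrix[index[token], j] += 1.0
    return matrix


def latent_topics(matrix, k):
    """Truncated SVD. Columns of u are topics in term space, s holds their
    weights, and rows of vt give each topic's loading on every sentence."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def most_distinct_sentence(sents_a, sents_b, k=2):
    """Return (sentence, novelty) for the sentence of document A that loads most
    strongly on the topic of A least similar to any topic of document B.
    Assumed scoring rule: novelty * singular value (hypothetical, not from the paper)."""
    vocab = sorted({t for s in list(sents_a) + list(sents_b) for t in s.lower().split()})
    ua, sa, vta = latent_topics(term_sentence_matrix(sents_a, vocab), k)
    ub, _, _ = latent_topics(term_sentence_matrix(sents_b, vocab), k)

    # Topic vectors have unit length, so the dot product is the cosine similarity.
    # The sign of a singular vector is arbitrary, hence the absolute value.
    similarity = np.abs(ua.T @ ub)
    novelty = 1.0 - similarity.max(axis=1)

    # Prefer topics that are both novel and important within document A.
    topic = int(np.argmax(novelty * sa))
    best = int(np.argmax(np.abs(vta[topic, :])))
    return sents_a[best], float(novelty[topic])
```

Calling `most_distinct_sentence(doc_a_sentences, doc_b_sentences)` returns one sentence of the first document together with a novelty score between 0 and 1. A Latent Dirichlet Allocation variant would replace the SVD step with topic-word distributions and compare those instead.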
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Campr, M., Ježek, K. (2013). Topic Models for Comparative Summarization. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_71
DOI: https://doi.org/10.1007/978-3-642-40585-3_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science, Computer Science (R0)