Topic Models for Comparative Summarization

Campr, Michal; Ježek, Karel

doi:10.1007/978-3-642-40585-3_71

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8082)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper summarizes our work on comparative summarization and presents our results. Comparative summarization analyzes the input documents and creates summaries that capture the most significant differences between them. We experiment with two well-known methods (Latent Semantic Analysis and Latent Dirichlet Allocation) to obtain the latent topics of documents. These topics can be compared to identify the main factual differences, and the most significant sentences are then selected for the output summaries. Our algorithms are briefly explained in Section 2, and their evaluation on the TAC 2011 dataset with the ROUGE toolkit is presented in Section 3.
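
The general idea of comparing topics and then selecting sentences can be illustrated with a minimal sketch. This is not the authors' algorithm, which is only described in the full paper. It assumes that the latent topics of two document sets are already available as term-weight vectors (in practice they would come from LSA or LDA; here they are hand-written toy data), and the helper names `cosine`, `novelty`, and `select_sentences` are hypothetical. The sketch picks the topic of set A that is least similar to any topic of set B and returns the sentences of A that match it best.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def novelty(topic, other_topics):
    """1 minus the similarity to the closest topic of the other set."""
    return 1.0 - max((cosine(topic, o) for o in other_topics), default=0.0)

def select_sentences(sentences, topics_a, topics_b, k=1):
    """Return the k sentences of set A closest to A's most distinctive topic."""
    distinctive = max(topics_a, key=lambda t: novelty(t, topics_b))

    def score(sentence):
        # Represent the sentence as a bag of lowercase word counts.
        bag = {}
        for word in sentence.lower().split():
            bag[word] = bag.get(word, 0.0) + 1.0
        return cosine(bag, distinctive)

    return sorted(sentences, key=score, reverse=True)[:k]

# Toy input (assumed, not taken from the paper or the TAC 2011 data).
topics_a = [{"flood": 0.7, "rain": 0.3}, {"rescue": 0.6, "helicopter": 0.4}]
topics_b = [{"flood": 0.6, "rain": 0.4}, {"aid": 0.5, "donation": 0.5}]
sentences_a = ["heavy rain caused the flood", "a helicopter joined the rescue"]

summary = select_sentences(sentences_a, topics_a, topics_b, k=1)
print(summary)
```

In this toy input both sets share a flood topic, so the rescue topic is the distinctive one for set A and the sentence about the helicopter is selected.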

Author information

Authors and Affiliations

Department of Computer Science and Engineering, FAV, University of West Bohemia, 301 00, Plzen, Czech Republic
Michal Campr & Karel Ježek

Editor information

Editors and Affiliations

University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal & Václav Matoušek

About this paper

Cite this paper

Campr, M., Ježek, K. (2013). Topic Models for Comparative Summarization. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_71

DOI: https://doi.org/10.1007/978-3-642-40585-3_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science, Computer Science (R0)
