Abstract
The combination of a web document's content (its sentences) and users’ comments from social networks provides a viewpoint of the document towards a particular event. This paper proposes a framework named SoRTESum that takes advantage of information from Twitter, viz. its diversity and its reflection of document content, to generate high-quality summaries using a novel sentence similarity measure. The framework first formulates sentences and tweets in terms of the recognizing textual entailment (RTE) relation to incorporate social information. Next, they are modeled in a Dual Wing Entailment Graph, which captures the entailment relation and is used to calculate sentence similarity based on mutual reinforcement information. Finally, important sentences and representative tweets are selected by a ranking algorithm. By incorporating social information, SoRTESum obtained improvements of 0.51 %–8.8 % in ROUGE-1 over state-of-the-art unsupervised baselines (e.g., Random, SentenceLead, and LexRank) and results comparable with strong supervised methods (e.g., L2R and CrossL2R trained by RankBoost) for single-document summarization.
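The pipeline summarized above (score sentences and tweets against each other, then rank and select) can be illustrated with a rough sketch. This is not the SoRTESum implementation: the Jaccard token overlap below is only a stand-in for the RTE-derived similarity features, the scoring is a single cross-wing averaging pass rather than the paper's graph-based ranking, and all function names (`overlap`, `cross_wing_scores`, `summarize`) are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word tokens as a set (an assumed, crude stand-in for RTE features)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap(a, b):
    """Jaccard overlap between two texts; 0.0 when both are empty."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cross_wing_scores(items, other_wing):
    """Score each item by its average overlap with the items of the other wing."""
    if not other_wing:
        return [0.0] * len(items)
    return [
        sum(overlap(x, y) for y in other_wing) / len(other_wing)
        for x in items
    ]


def top_k(items, scores, k):
    """Return the k highest-scoring items, ties broken by original order."""
    order = sorted(range(len(items)), key=lambda i: (-scores[i], i))
    return [items[i] for i in order[:k]]


def summarize(sentences, tweets, k_sent=1, k_tweet=1):
    """Pick sentences supported by tweets, and tweets supported by sentences."""
    sent_scores = cross_wing_scores(sentences, tweets)
    tweet_scores = cross_wing_scores(tweets, sentences)
    return top_k(sentences, sent_scores, k_sent), top_k(tweets, tweet_scores, k_tweet)
```

Calling `summarize(sentences, tweets)` with a list of document sentences and a list of tweets returns the selected sentences and the selected tweets as two lists. Scoring each wing only against the other mirrors the idea that social information should reinforce document content, and vice versa, though a faithful version would replace `overlap` with entailment-based features.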
Notes
1. http://twitter.com - a microblogging system.
2. The term RTE was kept instead of similarity because all features were derived from the RTE task.
References
Dagan, I., Dolan, B., Magnini, B., Roth, D.: Recognizing textual entailment: rational, evaluation and approaches - erratum. Nat. Lang. Eng. 16(1), 105–105 (2010)
Erkan, G., Radev, D.R.: Lexrank: graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. 22, 457–479 (2004)
Gao, W., Li, P., Darwish, K.: Joint topic modeling for event summarization across news, social media streams. In: CIKM, pp. 1173–1182 (2012)
Hu, M., Sun, A., Lim, E.-P.: Comments-oriented blog summarization by sentence extraction. In: CIKM, pp. 901–904 (2007)
Hu, M., Sun, A., Lim, E.-P.: Comments-oriented document summarization: understanding document with readers’ feedback. In: SIGIR, pp. 291–298 (2008)
Hu, P., Sun, C., Wu, L., Ji, D.-H., Teng, C.: Social summarization via automatically discovered social context. In: IJCNLP, pp. 483–490 (2011)
Huang, L., Li, H., Huang, L.: Comments-oriented document summarization based on multi-aspect co-feedback ranking. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds.) WAIM 2013. LNCS, vol. 7923, pp. 363–374. Springer, Heidelberg (2013)
Lin, C.-Y., Hovy, E.H.: Automatic evaluation of summaries using n-gram co-occurrence statistics. In: HLT-NAACL, pp. 71–78 (2003)
Lu, Y., Zhai, C.X., Sundaresan, N.: Rated aspect summarization of short comments. In: WWW, pp. 131–140 (2009)
Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2), 159–165 (1958)
Nenkova, A.: Automatic text summarization of newswire: lessons learned from the document understanding conference. In: AAAI, pp. 1436–1441 (2005)
Nguyen, M.-T., Ha, Q.-T., Nguyen, T.-D., Nguyen, T.-T., Nguyen, L.-M.: Recognizing textual entailment in Vietnamese text: an experimental study. In: KSE (2015). doi:10.1109/KSE.2015.23
Nguyen, M.-T., Kitamoto, A., Nguyen, T.-T.: TSum4act: a framework for retrieving and summarizing actionable tweets during a disaster for reaction. In: Cao, T., Lim, E.-P., Zhou, Z.-H., Ho, T.-B., Cheung, D., Motoda, H. (eds.) PAKDD 2015. LNCS, vol. 9078, pp. 64–75. Springer, Heidelberg (2015)
Porter, M.F.: Snowball: a language for stemming algorithms (2011)
Wan, X., Yang, J.: Multi-document summarization using cluster-based link analysis. In: SIGIR, pp. 299–306 (2008)
Wei, Z., Gao, W.: Utilizing microblogs for automatic news highlights extraction. In: COLING, pp. 872–883 (2014)
Wei, Z., Gao, W.: Gibberish, assistant, or master? Using tweets linking to news for extractive single-document summarization. In: SIGIR, pp. 1003–1006 (2015)
Yang, Z., Cai, K., Tang, J., Zhang, L., Zhong, S., Li, J.: Social context summarization. In: SIGIR, pp. 255–264 (2011)
Acknowledgment
We would like to thank Preslav Nakov and Wei Gao for useful discussions and insightful comments on earlier drafts, and Chien-Xuan Tran for building the web interface. We also thank the anonymous reviewers for their detailed comments, which helped improve our paper. This work was partly supported by JSPS KAKENHI Grant number 3050941.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Nguyen, MT., Nguyen, ML. (2016). SoRTESum: A Social Context Framework for Single-Document Summarization. In: Ferro, N., et al. Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_1
DOI: https://doi.org/10.1007/978-3-319-30671-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30670-4
Online ISBN: 978-3-319-30671-1
eBook Packages: Computer Science, Computer Science (R0)