Multi-roles Graph Based Extractive Summarization
In this paper, we propose a multi-roles graph model for extractive single-document summarization. We assume that each text can be expressed by a set of important words, which we call roles. We design three roles (noun, verb, and numeral) and build a multi-roles graph from them to represent a text. We then project this graph into three single-role graphs according to the role of each node. Next, we extract important features from these four graphs by applying a modified PageRank algorithm and combine them with statistical features, such as sentence position and sentence length, to represent each sentence. Finally, we train a random forest model to learn which sentences to select when generating a summary. We evaluate our model on DUC2001 and DUC2002 and achieve a 13.9% improvement over the latest methods. We also obtain the best ROUGE-2 results compared with several classic methods.
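The feature-extraction steps described above can be illustrated with a minimal Python sketch. Several things here are assumptions, since the abstract does not specify them: sentences arrive already tagged as `(word, role)` pairs, two role words are linked when they occur in the same sentence, a single-role graph is the subgraph of the multi-roles graph restricted to one role, and plain weighted PageRank stands in for the modified PageRank the paper uses. The function names `build_graph`, `pagerank`, and `sentence_features` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ROLES = ("noun", "verb", "numeral")


def build_graph(sentences, keep_roles=ROLES):
    """Undirected co-occurrence graph over (word, role) nodes.

    Assumption: two role words are linked when they share a sentence,
    and the edge weight counts how many sentences they share.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sent in sentences:
        nodes = sorted({(w, r) for w, r in sent if r in keep_roles})
        for node in nodes:
            graph[node]  # register the node even if it has no edges
        for a, b in combinations(nodes, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain weighted PageRank by power iteration (not the paper's variant)."""
    nodes = list(graph)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            total = sum(graph[v].values())
            for u, weight in graph[v].items():
                new_rank[u] += damping * rank[v] * weight / total
        rank = new_rank
    return rank


def sentence_features(sentences):
    """Per sentence: four graph scores, then position and length."""
    # One multi-roles graph plus one projected graph per role.
    graphs = [build_graph(sentences)]
    graphs += [build_graph(sentences, (role,)) for role in ROLES]
    ranks = [pagerank(g) for g in graphs]
    features = []
    for position, sent in enumerate(sentences):
        nodes = {(w, r) for w, r in sent}
        scores = [sum(rank.get(node, 0.0) for node in nodes) for rank in ranks]
        features.append(scores + [position, len(sent)])
    return features


# Toy document; the role tags are supplied by hand for illustration.
doc = [
    [("company", "noun"), ("reported", "verb"), ("profit", "noun"), ("12", "numeral")],
    [("profit", "noun"), ("rose", "verb"), ("12", "numeral")],
    [("analysts", "noun"), ("expected", "verb"), ("less", "other")],
]
feats = sentence_features(doc)
```

Each row of `feats` could then be paired with a label (in summary or not) and passed to a classifier such as a random forest, which is the final step the abstract describes.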
Keywords: Summarization, Multi-roles graph, Classification, Random forest
The research was supported in part by NSFC under Grant Nos. 61572158 and 61602132, the Shenzhen Science and Technology Program under Grant Nos. JCYJ20160330163900579 and JSGG20150512145714247, the Research Award Foundation for Outstanding Young Scientists in Shandong Province (Grant No. 2014BSA10016), and the Scientific Research Foundation of Harbin Institute of Technology at Weihai (Grant No. HIT(WH)201412).