Multi-roles Graph Based Extractive Summarization

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10634)

Abstract

In this paper, we propose a multi-roles graph model for extractive single-document summarization. Our model assumes that each text can be expressed through a set of important words, which we call roles. We define three roles (noun, verb, and numeral) and build a multi-roles graph over these roles to represent a text. We then project this graph into three single-role graphs according to the role of each node. Next, we extract important features from these four graphs by applying a modified PageRank algorithm and combine them with statistical features, such as sentence position and sentence length, to represent each sentence. Finally, we train a random forest model to learn the pattern of selecting important sentences and generate summaries. To evaluate our model, we perform experiments on DUC2001 and DUC2002 and achieve a 13.9% improvement over recent methods. We also obtain the best ROUGE-2 results compared with several classic methods.
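To make the pipeline above concrete, here is a minimal sketch, assuming NLTK-style (word, POS) input with the universal tags NOUN/VERB/NUM standing in for the three roles. It is not the authors' implementation: standard weighted PageRank (via networkx) stands in for the paper's modified PageRank, and the co-occurrence window, feature layout, and all helper names (`build_multi_roles_graph`, `project_role`, `sentence_features`) are illustrative assumptions.

```python
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

# The three roles named in the abstract, mapped here to universal POS tags
# (an assumption; the paper's exact tagset is not given in the abstract).
ROLES = {"NOUN", "VERB", "NUM"}

def build_multi_roles_graph(tagged_sentences, window=2):
    """Build one graph over all role words.

    tagged_sentences: list of sentences, each a list of (word, pos) pairs.
    Edges link role words that co-occur within `window` positions
    (the window size is an illustrative choice).
    """
    g = nx.Graph()
    for sent in tagged_sentences:
        words = [(w.lower(), t) for w, t in sent if t in ROLES]
        for i, (w1, t1) in enumerate(words):
            for w2, t2 in words[i + 1 : i + 1 + window]:
                if w1 == w2:
                    continue  # skip self-loops
                g.add_node(w1, role=t1)
                g.add_node(w2, role=t2)
                weight = g[w1][w2]["weight"] + 1 if g.has_edge(w1, w2) else 1
                g.add_edge(w1, w2, weight=weight)
    return g

def project_role(graph, role):
    """Single-role projection: keep only nodes carrying the given role."""
    nodes = [n for n, d in graph.nodes(data=True) if d["role"] == role]
    return graph.subgraph(nodes).copy()

def sentence_features(tagged_sentences, graph):
    """Four graph features (word-score sums under weighted PageRank, used
    here as a stand-in for the modified PageRank) plus position and length."""
    graphs = [graph] + [project_role(graph, r) for r in sorted(ROLES)]
    scores = [nx.pagerank(g, weight="weight") if g.number_of_nodes() else {}
              for g in graphs]
    n = len(tagged_sentences)
    feats = []
    for i, sent in enumerate(tagged_sentences):
        toks = [w.lower() for w, _ in sent]
        row = [sum(s.get(w, 0.0) for w in toks) for s in scores]
        row += [i / max(n - 1, 1), len(toks)]  # position, sentence length
        feats.append(row)
    return feats

# Supervised sentence selection: y[i] = 1 if sentence i appears in the
# reference summary (training-label construction is not shown here).
clf = RandomForestClassifier(n_estimators=100, random_state=0)
# clf.fit(X_train, y_train)
# A summary is the top-k sentences ranked by clf.predict_proba(X)[:, 1].
```

At inference time, ranking sentences by the classifier's positive-class probability and keeping the top k mirrors the supervised selection step described above.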

Keywords

Summarization · Multi-roles graph · Classification · Random forest

Acknowledgments

The research was supported in part by NSFC under Grant Nos. 61572158 and 61602132, the Shenzhen Science and Technology Program under Grant Nos. JCYJ20160330163900579 and JSGG20150512145714247, the Research Award Foundation for Outstanding Young Scientists in Shandong Province (Grant No. 2014BSA10016), and the Scientific Research Foundation of Harbin Institute of Technology at Weihai (Grant No. HIT(WH)201412).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Harbin Institute of Technology, Shenzhen, China
  2. Harbin Institute of Technology, Weihai, China
