WS-Rank: Bringing Sentences into Graph for Keyword Extraction

  • Fan Yang
  • Yue-Sheng Zhu
  • Yu-Jia Ma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9932)

Abstract

Graph-based methods are among the most efficient unsupervised approaches to extracting keywords from a single web text. However, previous graph-based methods have rarely considered sentence importance. In this paper, we propose WS-Rank, a graph-based keyword extractor that brings sentences into the graph and treats them differently according to their importance. Candidate keywords are extracted through a voting mechanism between words and sentences. To evaluate our method, we compare it with TextRank, a graph-based method that uses only the logic distribution relationship between words. Experiments on 13,702 web texts show that WS-Rank achieves better results, with an average F-score of 25.20%.
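
The abstract does not give the update rule, so the following is only a minimal sketch of what a voting scheme between words and sentences could look like, not the WS-Rank formula itself. It assumes a bipartite word-sentence graph, a per-sentence importance weight supplied by the caller, and a TextRank-style damping factor; the names `rank_words`, `sentence_weights`, `damping` and `iterations` are hypothetical.

```python
from collections import defaultdict


def rank_words(sentences, sentence_weights=None, damping=0.85, iterations=50):
    """Rank words by mutual voting between words and sentences.

    sentences: list of token lists (already segmented).
    sentence_weights: assumed importance multiplier per sentence (default 1.0).
    Returns a list of (word, score) pairs, highest score first.
    """
    n = len(sentences)
    if n == 0:
        return []
    if sentence_weights is None:
        sentence_weights = [1.0] * n

    unique = [set(s) for s in sentences]
    vocab = sorted(set().union(*unique))
    if not vocab:
        return []

    # Map each word to the indices of the sentences that contain it.
    sents_of = defaultdict(list)
    for i, words in enumerate(unique):
        for w in words:
            sents_of[w].append(i)

    word_score = {w: 1.0 for w in vocab}
    for _ in range(iterations):
        # Words vote for sentences; each word splits its score evenly
        # among the sentences it appears in.
        sent_score = []
        for i, words in enumerate(unique):
            vote = sum(word_score[w] / len(sents_of[w]) for w in words)
            sent_score.append((1 - damping) + damping * sentence_weights[i] * vote)
        total = sum(sent_score)
        sent_score = [s * n / total for s in sent_score]  # keep scores bounded

        # Sentences vote back for words; each sentence splits its score
        # evenly among its distinct words.
        new_score = {}
        for w in vocab:
            vote = sum(sent_score[i] / len(unique[i]) for i in sents_of[w])
            new_score[w] = (1 - damping) + damping * vote
        total = sum(new_score.values())
        word_score = {w: s * len(vocab) / total for w, s in new_score.items()}

    # Sort by descending score, breaking ties alphabetically.
    return sorted(word_score.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = [
        ["graph", "keyword", "extraction"],
        ["graph", "sentence", "importance"],
        ["weather", "today"],
    ]
    # Assumed weights: the first two sentences are treated as more important.
    for word, score in rank_words(docs, sentence_weights=[2.0, 2.0, 0.5]):
        print(f"{word}\t{score:.3f}")
```

In this sketch a word that occurs in several highly weighted sentences collects more votes than a word confined to a low-weight sentence, which is the intuition the abstract describes. How sentence importance is actually determined is left to the paper.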

Keywords

Edge Weight; Word Segmentation; Chinese Text; Normal Sentence; Keyword Extraction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Abilhoa, W.D., de Castro, L.N.: A keyword extraction method from twitter messages represented as graphs. Appl. Math. Comput. 240, 308–325 (2014)
  2. Boudin, F.: A comparison of centrality measures for graph-based keyphrase extraction. In: International Joint Conference on Natural Language Processing (IJCNLP), pp. 834–838 (2013)
  3. Li, L., Su, C., Sun, Y., Xiong, S., Xu, G.: Hashtag biased ranking for keyword extraction from microblog posts. In: Zhang, S., Wirsing, M., Zhang, Z. (eds.) KSEM 2015. LNCS, vol. 9403, pp. 348–359. Springer, Heidelberg (2015). doi: 10.1007/978-3-319-25159-2_32
  4. Mihalcea, R., Tarau, P.: Textrank: Bringing order into texts. In: Association for Computational Linguistics (2004)
  5. Peng, L., Bin, W., Zhiwei, S., Yachao, C., Hengxun, L.: Tag-textrank: a webpage keyword extraction method based on tags. J. Comput. Res. Dev. 11, 014 (2012)
  6. Wan, X.: Timedtextrank: adding the temporal dimension to multi-document summarization. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 867–868. ACM (2007)
  7. Wan, X., Xiao, J.: Single document keyphrase extraction using neighborhood knowledge. AAAI 8, 855–860 (2008)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Communication and Information Security Lab, Shenzhen Graduate School, Institute of Big Data Technologies, Peking University, Shenzhen, China