MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 7094)

Abstract

This paper presents an unsupervised graph-based method for extracting keyphrases using semantic information. The proposed method has two stages. In the first stage, we extract maximal frequent sequences (MFS) and use them as the nodes of a graph. The weight of the edge between two nodes is set according to common statistical information and semantic relatedness. In the second stage, we rank the MFS with the traditional PageRank algorithm, augmented with ConceptNet: this external resource contributes an extra weight value between two MFS. The experimental results are competitive with traditional approaches developed in this area. MFSRank outperforms the baseline for the top 5 keyphrases in precision, recall, and F-score.
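
The ranking stage can be illustrated with a minimal sketch of PageRank over a weighted, undirected graph of candidate phrases. The abstract does not state how the statistical and semantic weights are combined, so the `combine` function, its `alpha` parameter, the candidate phrases, and every numeric weight below are assumptions made purely for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def combine(statistical, semantic, alpha=0.5):
    """Hypothetical linear mix of a statistical and a semantic edge weight."""
    return alpha * statistical + (1.0 - alpha) * semantic


def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph with positive edge weights.

    `edges` maps a pair (node_a, node_b) to the weight of the edge between them.
    """
    neighbors = defaultdict(dict)
    for (a, b), weight in edges.items():
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    nodes = sorted(neighbors)
    n = len(nodes)
    # Total weight leaving each node, used to normalize its contributions.
    out_weight = {v: sum(neighbors[v].values()) for v in nodes}

    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_score = {}
        for v in nodes:
            incoming = sum(
                score[u] * w / out_weight[u] for u, w in neighbors[v].items()
            )
            new_score[v] = (1.0 - damping) / n + damping * incoming
        score = new_score
    return score


# Toy candidate phrases standing in for extracted MFS (made-up values).
statistical = {
    ("keyphrase extraction", "maximal frequent sequence"): 0.8,
    ("keyphrase extraction", "semantic graph"): 0.5,
    ("keyphrase extraction", "pagerank"): 0.4,
    ("semantic graph", "pagerank"): 0.3,
}
# Stand-in for relatedness scores that an external resource would supply.
semantic = {
    ("keyphrase extraction", "maximal frequent sequence"): 0.6,
    ("keyphrase extraction", "semantic graph"): 0.7,
    ("keyphrase extraction", "pagerank"): 0.2,
    ("semantic graph", "pagerank"): 0.5,
}

edges = {pair: combine(statistical[pair], semantic[pair]) for pair in statistical}
scores = weighted_pagerank(edges)
top_k = sorted(scores, key=scores.get, reverse=True)[:5]
print(top_k)
```

In this sketch the semantic term only rescales edge weights before the standard iteration runs, which is one plausible reading of "adds an extra weight value between two MFS"; other combinations are equally compatible with the abstract.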

Keywords

  • Keyphrase Extraction
  • Maximal Frequent Sequences
  • Semantic Graphs

Chapter
  • DOI: 10.1007/978-3-642-25324-9_29
  • Chapter length: 7 pages

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

López, R.E., Barreda, D., Tejada, J., Cuadros, E. (2011). MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information. In: Batyrshin, I., Sidorov, G. (eds) Advances in Artificial Intelligence. MICAI 2011. Lecture Notes in Computer Science, vol 7094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25324-9_29

  • DOI: https://doi.org/10.1007/978-3-642-25324-9_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25323-2

  • Online ISBN: 978-3-642-25324-9

  • eBook Packages: Computer Science, Computer Science (R0)