
Keyword Extraction Using Graph Centrality and WordNet

Towards Extensible and Adaptable Methods in Computing

Abstract

Keywords are a condensed summary of a document. Although keyword extraction has been studied extensively, few approaches analyze the structure of a semantic network to identify the most important words in a document. In this research, we present one such method, which uses WordNet as its knowledge base to exploit the semantic relatedness of terms and thereby determine keywords. The method is based on graph centrality measures, which help to identify the central nodes of the network; these nodes are taken as the keywords of the document.
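The idea of ranking words by their centrality in a relatedness graph can be illustrated with a minimal sketch. The abstract does not say which centrality measures are used, so degree centrality is chosen here purely for illustration. The edge list `EDGES` is hypothetical and hand-written; in the chapter's setting the edges would be derived from WordNet relations between terms. The helper names `degree_centrality` and `top_keywords` are likewise assumptions, not the authors' implementation.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as edge pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # A node's score is the fraction of the other nodes it is linked to.
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_keywords(edges, k=3):
    """Return the k most central words; ties are broken alphabetically."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Hypothetical relatedness edges (assumption): stand-ins for links that
# would be derived from WordNet in the method described above.
EDGES = [
    ("graph", "network"),
    ("graph", "node"),
    ("graph", "centrality"),
    ("network", "node"),
    ("keyword", "document"),
    ("keyword", "graph"),
]

if __name__ == "__main__":
    for word in top_keywords(EDGES):
        print(word, round(degree_centrality(EDGES)[word], 2))
```

In this toy graph, "graph" is linked to four of the five other terms and therefore ranks first. Other centrality measures would slot into the same ranking step by replacing `degree_centrality`.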


Notes

  1. Stop-words are words in a language that carry little meaning on their own but are used to construct a sentence and bind it together.
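As a small illustration of the note above, stop-words are typically filtered out before candidate keywords are considered. The sketch below assumes a tiny hand-written stop-word list (`STOP_WORDS`) and a simple lowercase tokenizer; real stop-word lists are much longer, and neither the list nor the function name `remove_stop_words` comes from the chapter.

```python
import re

# Tiny illustrative stop-word list (assumption); practical lists are far longer.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "of", "to", "in",
    "and", "for", "it", "that", "but", "not", "do",
}


def remove_stop_words(text):
    """Lowercase the text, split it into alphabetic tokens, and drop stop-words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(remove_stop_words("The keywords are a summary of the document"))
```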


Author information

Corresponding author

Correspondence to Minni Jain.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Sharma, C., Jain, M., Aggarwal, A. (2018). Keyword Extraction Using Graph Centrality and WordNet. In: Chakraverty, S., Goel, A., Misra, S. (eds) Towards Extensible and Adaptable Methods in Computing. Springer, Singapore. https://doi.org/10.1007/978-981-13-2348-5_27

  • DOI: https://doi.org/10.1007/978-981-13-2348-5_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2347-8

  • Online ISBN: 978-981-13-2348-5

  • eBook Packages: Computer Science, Computer Science (R0)
