Keyphrase and Relation Extraction from Scientific Publications

  • R. C. Anju
  • Sree Harsha Ramesh
  • P. C. Rafeeque
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 705)


This paper presents a detailed approach to extracting keyphrases and the relations between them from scientific publications such as research papers, using conditional random fields (CRF). A keyphrase is a word or set of words that describes the close relationship between content and context in a particular document (Sharan and Siddiqi, International Conference on Advances in Computing Communications and Informatics (ICACCI), 2014) [1]. Keyphrases may be the topics of a document and represent its key ideas. Automatic keyphrase extraction plays a major role in automated systems such as summarization, query or topic generation, question answering, search engines, information retrieval, and document classification. The relations between the extracted keyphrases are also identified; two types of relation are considered: synonymy and hyponymy. The results show that the proposed system outperforms existing systems.


Keyphrase extraction; Topic extraction; Information extraction (IE); Summarization; Question answering (QA); Document classification


  1. Sharan, A., Siddiqi, S.: A supervised approach to distinguish between keywords and stopwords using probability distribution functions. In: 2014 International Conference on Advances in Computing Communications and Informatics (ICACCI) (2014)
  2. Python Vocabulary 0.0.5. Accessed 01 Feb 2017
  3. Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Conference on Empirical Methods in Natural Language Processing (2004)
  4. Witten, I.H., Paynter, G.W., Frank, E., Gutwin, C., Nevill-Manning, C.G.: KEA: practical automatic keyphrase extraction (1999)
  5. Turney, P.D.: Learning algorithms for keyphrase extraction. Information Retrieval-INRT, pp. 34–99 (2000)
  6. Bhaskar, P., Nongmeikapam, K., Bandyopadhyay, S.: Keyphrase extraction in scientific articles: a supervised approach. In: Proceedings of COLING 2012, pp. 17–24. Mumbai (2012)
  7. El-Beltagy, S.R., Rafea, A.: KP-Miner: participation in SemEval-2. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 190–193. ACL, Uppsala, Sweden (2010)
  8. Nguyen, T.D., Kan, M.Y.: WINGNUS: keyphrase extraction utilizing document logical structure. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 166–169. ACL, Uppsala, Sweden (2010)
  9. Lahiri, S., Mihalcea, R., Lai, P.H.: Keyword extraction from emails. In: Proceedings of 5th International Workshop on Semantic Evaluation, 2016, pp. 1–24. ACL, Cambridge University Press, UK (2010)
  10. Sarkar, K.: A hybrid approach to extract keyphrases from medical documents. Int. J. Comput. Appl. (0975-8887) 63(18) (2013)
  11. RAKE: Rapid Automatic Keyword Extraction. Accessed 23 Mar 2017
  12. AlchemyAPI. Accessed 20 Mar 2017
  13. Textacy. Accessed 13 Mar 2017
  14. spaCy. Accessed 25 Feb 2017
  15. NLTK. Accessed 27 Jan 2017
  16. Huang, C., Tian, Y., Zhou, Z., Ling, C.X., Huang, T.: Keyphrase extraction using semantic networks structure analysis. In: IEEE International Conference on Data Mining, pp. 275–284 (2006)
  17. Marsi, E., Ozturk, P.: Extraction and generalization of variables from scientific publications. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 505–511. Lisbon, Portugal (2015)
  18. Bordea, G., Buitelaar, P.: DERIUNLP: a context based approach to automatic keyphrase extraction. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 146–149. ACL, Uppsala, Sweden (2010)
  19. Palshikar, G.K.: Keyword extraction from a single document using centrality measures. In: 2nd International Conference PReMI 2007, LNCS 4815, pp. 503–510 (2007)
  20. Eichler, K., Neumann, G.: DFKIKeyWE: ranking keyphrases extracted from scientific articles. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 150–153. ACL, Uppsala, Sweden (2010)
  21. Haddoud, M., Mokhtari, A., Lecroq, T., Abdeddaim, S.: Accurate keyphrase extraction from scientific papers by mining linguistic information. CLBib (2015)
  22. Litvak, M., Last, M., Aizenman, H., Gobits, I., Kandel, A.: DegExt: a language-independent graph-based keyphrase extractor. Adv. Intell. Soft Comput. 86, 121–130 (2011)
  23. Litvak, M., Last, M.: Graph-based keyword extraction for single-document summarization. In: Proceedings of the 2nd Workshop on Multi-source Multilingual Information Extraction and Summarization, pp. 17–24. Manchester, UK (2008)
  24. Nallapati, R., Allan, J., Mahadevan, S.: Extraction of keywords from news stories. CIIR Technical Report, IR(345) (2013)
  25. CRF++. Accessed 28 Jan 2017
  26. Stanford CoreNLP. Accessed 27 Feb 2017

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • R. C. Anju (1)
  • Sree Harsha Ramesh (2)
  • P. C. Rafeeque (1)
  1. Government Engineering College, Palakkad, India
  2. Surukam Analytics Pvt. Ltd, Chennai, India
