Advances in Machine Learning and Data Science, pp 113-120
Keyphrase and Relation Extraction from Scientific Publications
Abstract
This paper presents a detailed approach to extracting keyphrases and the relations between them from scientific publications such as research papers, using conditional random fields (CRF). A keyphrase is a word or set of words that describes the close relationship between content and context in a particular document (Sharan, International Conference on Advances in Computing Communications and Informatics (ICACCI), 2014) [1]. Keyphrases may be the topics of a document and represent its key ideas. Automatic keyphrase extraction plays a major role in automated systems such as summarization, query or topic generation, question answering, search engines, information retrieval, and document classification. The relations between keyphrases are also extracted; two types are considered: synonyms and hyponyms. The results show that the proposed system outperforms existing systems.
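As an illustration of how keyphrase extraction can be framed as sequence labeling for a CRF, the sketch below converts tokens and known keyphrases into per-token labels and simple token features. The abstract does not state the labeling scheme or feature set used, so the BIO tags (`B-KP`, `I-KP`, `O`), the function names `bio_labels` and `token_features`, and the features themselves are assumptions chosen for illustration only.

```python
from typing import Dict, List, Sequence


def bio_labels(tokens: Sequence[str], keyphrases: Sequence[Sequence[str]]) -> List[str]:
    """Label each token B-KP, I-KP, or O given keyphrases as token lists.

    The BIO scheme is an assumed encoding, not one stated in the abstract.
    """
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer phrases first so a short phrase does not split a longer one.
    for phrase in sorted(keyphrases, key=len, reverse=True):
        target = [w.lower() for w in phrase]
        n = len(target)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            span_free = all(label == "O" for label in labels[start:start + n])
            if lowered[start:start + n] == target and span_free:
                labels[start] = "B-KP"
                labels[start + 1:start + n] = ["I-KP"] * (n - 1)
    return labels


def token_features(tokens: Sequence[str], i: int) -> Dict[str, object]:
    """Hypothetical per-token features of the kind a CRF toolkit could consume."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "suffix3": word[-3:].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


tokens = "We use conditional random fields for keyphrase extraction".split()
labels = bio_labels(tokens, [["conditional", "random", "fields"], ["keyphrase", "extraction"]])
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Each token/label pair, together with its features, would form one training row for a CRF toolkit; the trained model then predicts the labels for unseen text, and contiguous `B-KP`/`I-KP` runs are read off as keyphrases.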
Keywords
Keyphrase extraction, Topic extraction, Information extraction (IE), Summarization, Question answering (QA), Document classification
References
- 1. Sharan, A., Siddiqi, S.: A supervised approach to distinguish between keywords and stopwords using probability distribution functions. In: 2014 International Conference on Advances in Computing Communications and Informatics (ICACCI) (2014)
- 2. Python Vocabulary 0.0.5. http://pypi.python.org/pypi/vacabulary/0.0.5. Accessed 01 Feb 2017
- 3. Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Conference on Empirical Methods in Natural Language Processing (2004)
- 4. Witten, I.H., Paynter, G.W., Frank, E., Gutwin, C., Nevill-Manning, C.G.: KEA: practical automatic keyphrase extraction (1999)
- 5. Turney, P.D.: Learning algorithms for keyphrase extraction. Information Retrieval-INRT, pp. 34–99 (2000)
- 6. Bhaskar, P., Nongmeikapam, K., Bandyopadhyay, S.: Keyphrase extraction in scientific articles: a supervised approach. In: Proceedings of COLING 2012, pp. 17–24. Mumbai (2012)
- 7. El-Beltagy, S.R., Rafea, A.: KP-Miner: participation in SemEval-2. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 190–193. ACL, Uppsala, Sweden (2010)
- 8. Nguyen, T.D., Kan, M.Y.: WINGNUS: keyphrase extraction utilizing document logical structure. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 166–169. ACL, Uppsala, Sweden (2010)
- 9. Lahiri, S., Mihalcea, R., Lai, P.H.: Keyword extraction from emails. In: Proceedings of 5th International Workshop on Semantic Evaluation, 2016, pp. 1–24. ACL, Cambridge University Press, UK (2010)
- 10. Sarkar, K.: A hybrid approach to extract keyphrases from medical documents. Int. J. Comput. Appl. (0975-8887) 63(18) (2013)
- 11. Rake: Rapid Automatic Keyword Extraction. https://hackage.haskell.org/package/rake. Accessed 23 Mar 2017
- 12. AlchemyAPI. https://www.ibm.com/watson/alchemy-api.htm. Accessed 20 Mar 2017
- 13. Textacy. https://pypi.python.org/pypi/textacy. Accessed 13 Mar 2017
- 14. Spacy. https://pypi.python.org/pypi/spacy. Accessed 25 Feb 2017
- 15. NLTK. http://www.nltk.org. Accessed 27 Jan 2017
- 16. Huang, C., Tian, Y., Zhou, Z., Ling, C.X., Huang, T.: Keyphrase extraction using semantic networks structure analysis. In: IEEE International Conference on Data Mining, pp. 275–284 (2006)
- 17. Marsi, E., Ozturk, P.: Extraction and generalization of variables from scientific publications. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 505–511. Lisbon, Portugal (2015)
- 18. Bordea, G., Buitelaar, P.: DERIUNLP: a context based approach to automatic keyphrase extraction. In: Proceedings of the 5th International Workshop on Semantic Evaluation, ACL, pp. 146–149. Uppsala, Sweden (2010)
- 19. Palshikar, G.K.: Keyword extraction from a single document using centrality measures. In: 2nd International Conference PReMI 2007 LNCS 4815, pp. 503–510 (2007)
- 20. Eichler, K., Neumann, G.: DFKIKeyWE: ranking keyphrases extracted from scientific articles. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 150–153. ACL, Uppsala, Sweden (2010)
- 21. Haddoud, M., Mokhtari, A., Lecroq, T., Abdeddaim, S.: Accurate keyphrase extraction from scientific papers by mining linguistic information, CLBib (2015)
- 22. Litvak, M., Last, M., Aizenman, H., Gobits, I., Kandel, A.: DegExt: a language-independent graph-based keyphrase extractor. Adv. Intel. Soft Comput. 86, 121–130 (2011)
- 23. Litvak, M., Last, M.: Graph-based keyword extraction for single-document summarization. In: Proceedings of the 2nd Workshop on Multi-source Multilingual Information Extraction and Summarization, pp. 17–24. Manchester, UK (2008)
- 24. Nallapati, R., Allan, J., Mahadevan, S.: Extraction of keywords from news stories. CIIR Technical Report, IR(345) (2013)
- 25. CRF++. http://taku910.github.io/crfpp. Accessed 28 Jan 2017
- 26. Stanford CoreNLP. https://stanfordnlp.github.io/CoreNLP. Accessed 27 Feb 2017