Network-Enabled Keyword Extraction for Under-Resourced Languages

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10151)

Abstract

In this paper we discuss the advantages of network-enabled keyword extraction from texts in under-resourced languages. Network-enabled methods are briefly introduced, while the focus of the paper is on the difficulties that methods must overcome when dealing with content in under-resourced languages, which manifest mainly as a lack of natural language processing (NLP) resources: corpora and tools. Additionally, the paper discusses how to circumvent the lack of NLP tools with a network-enabled method such as the SBKE method.

Keywords

Network-enabled keyword extraction; Under-resourced languages; NLP tools; SBKE method

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Informatics, University of Rijeka, Rijeka, Croatia
