A cascaded framework for identification and extraction of antonym for Turkish language

  • Methodologies and Application
Soft Computing

Abstract

Identifying and extracting semantic relations are challenging tasks in Natural Language Processing. In this paper, we propose three models for two separate tasks: identifying antonyms and extracting them. In the first model, we develop two methods to identify antonyms. The first method is a probabilistic approach that calculates the probability that a given target/candidate pair is an antonym pair; two distinct scoring functions are proposed to select the correct candidate for each target word. The second method learns word embeddings and measures embedding similarity to identify antonym pairs. In the second model, we represent target/candidate pairs by a set of features suitable as input to a supervised machine learning algorithm. The first and second models are both especially well suited to the identification of antonymy. In the third and final model, we adopt a minimally supervised bootstrapping approach, which starts with a few antonym pairs and then iteratively produces both seeds and patterns. We consider this study a significant contribution toward enriching the lexicon of the Turkish language.
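
The bootstrapping idea in the third model can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bootstrap`, the `min_pattern_count` threshold, the representation of a pattern as the literal tokens between two words, and the toy English sentences are all assumptions made for illustration only. The sketch alternates between inducing patterns from known pairs and harvesting new pairs with those patterns.

```python
from collections import Counter

def bootstrap(sentences, seeds, rounds=2, min_pattern_count=2):
    """Hypothetical sketch: alternate pattern induction and pair harvesting."""
    # Store each pair in sorted order so (a, b) and (b, a) are the same pair.
    pairs = {tuple(sorted(p)) for p in seeds}
    patterns = set()
    tokenized = [s.lower().split() for s in sentences]

    for _ in range(rounds):
        # Step 1: a pattern is the short token span between two known antonyms.
        counts = Counter()
        for tokens in tokenized:
            for a, b in pairs:
                for x, y in ((a, b), (b, a)):
                    if x in tokens and y in tokens:
                        i, j = tokens.index(x), tokens.index(y)
                        if 0 < j - i - 1 <= 3:  # 1 to 3 tokens in between
                            counts[" ".join(tokens[i + 1:j])] += 1
        # Keep only patterns seen often enough (assumed threshold).
        patterns |= {p for p, c in counts.items() if c >= min_pattern_count}

        # Step 2: every match of a kept pattern yields a new candidate pair.
        for tokens in tokenized:
            for p in patterns:
                mid = p.split()
                n = len(mid)
                for i in range(1, len(tokens) - n):
                    if tokens[i:i + n] == mid:
                        pairs.add(tuple(sorted((tokens[i - 1], tokens[i + n]))))
    return pairs, patterns

# Toy corpus (invented for this example, not taken from the paper).
sentences = [
    "neither hot nor cold water was available",
    "it was neither good nor bad",
    "she felt neither happy nor sad",
    "the room was neither big nor small",
]
seeds = [("hot", "cold"), ("good", "bad")]
pairs, patterns = bootstrap(sentences, seeds)
print(sorted(pairs))
print(patterns)
```

In this toy run the two seeds induce the single pattern `nor`, which in turn harvests two further pairs. A realistic system would need to score patterns and candidates for reliability, since generic patterns also match word pairs that are not antonyms.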

Notes

  1. Türk Dil Kurumu (The Turkish Language Association).

  2. Vikisözlük: Özgür Sözlük.

  3. Türk Dil Kurumu (The Turkish Language Association).

  4. https://www.cmpe.boun.edu.tr/~hasim/.

  5. https://radimrehurek.com/gensim/.

  6. https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md.

Author information

Corresponding author

Correspondence to Tuğba Yıldız.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yıldız, T., Yıldırım, S. A cascaded framework for identification and extraction of antonym for Turkish language. Soft Comput 23, 7853–7864 (2019). https://doi.org/10.1007/s00500-018-3417-1

  • DOI: https://doi.org/10.1007/s00500-018-3417-1
