Comparing Distributional and Mirror Translation Similarities for Extracting Synonyms

  • Philippe Muller
  • Philippe Langlais
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6657)


Automated thesaurus construction by collecting relations between lexical items (synonyms, antonyms, etc.) has a long tradition in natural language processing. This has been done by exploiting dictionary structures or distributional context regularities (co-occurrence, syntactic associations, or translation equivalents) in order to define measures of lexical similarity or relatedness. Dyvik proposed using aligned multilingual corpora, defining similar terms as terms that often share their translations. We evaluate the usefulness of this similarity for the extraction of synonyms, compared to the more widespread distributional approach.
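The two notions of similarity contrasted above can be illustrated with a minimal sketch. The abstract does not specify the measures used in the paper, so everything here is an assumption made for illustration: the translation-sharing similarity is approximated by a Jaccard overlap of translation sets, the distributional similarity by a cosine over co-occurrence counts, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def mirror_similarity(translations, w1, w2):
    """Jaccard overlap of the translation sets of w1 and w2.

    Assumption: `translations` maps a word to the set of its aligned
    translations; this is a stand-in, not the paper's actual measure.
    """
    t1 = translations.get(w1, set())
    t2 = translations.get(w2, set())
    if not t1 or not t2:
        return 0.0
    return len(t1 & t2) / len(t1 | t2)


def distributional_similarity(contexts, w1, w2):
    """Cosine between the co-occurrence count vectors of w1 and w2.

    Assumption: `contexts` maps a word to a Counter of context features.
    """
    v1 = contexts.get(w1, Counter())
    v2 = contexts.get(w2, Counter())
    dot = sum(v1[f] * v2[f] for f in v1.keys() & v2.keys())
    norm1 = sqrt(sum(c * c for c in v1.values()))
    norm2 = sqrt(sum(c * c for c in v2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical toy data, invented for this example only.
translations = {
    "car": {"voiture", "automobile"},
    "automobile": {"automobile", "voiture", "auto"},
    "bank": {"banque", "rive"},
}
contexts = {
    "car": Counter({"drive": 3, "road": 2}),
    "automobile": Counter({"drive": 1, "road": 1, "engine": 1}),
    "bank": Counter({"money": 4, "river": 1}),
}

if __name__ == "__main__":
    for pair in [("car", "automobile"), ("car", "bank")]:
        print(pair,
              round(mirror_similarity(translations, *pair), 3),
              round(distributional_similarity(contexts, *pair), 3))
```

In this toy setting both measures rank "automobile" above "bank" as a candidate synonym of "car"; ranking candidates by such scores is one plausible way to produce the candidate lists that an evaluation would then compare.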


Semantic Similarity, Word Pair, Lexical Item, Mean Average Precision, Candidate List
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Edmonds, P., Hirst, G.: Near-Synonymy and lexical choice. Computational Linguistics 28(2), 105–144 (2002)
  2. Michiels, A., Noel, J.: Approaches to thesaurus production. In: Proceedings of Coling 1982 (1982)
  3. Kozima, H., Furugori, T.: Similarity between words computed by spreading activation on an english dictionary. In: Proceedings of the Conference of the European Chapter of the ACL, pp. 232–239 (1993)
  4. Niwa, Y., Nitta, Y.: Co-occurrence vectors from corpora vs. distance vectors from dictionaries. In: Proceedings of Coling 1994 (1994)
  5. Lin, D.: Automatic retrieval and clustering of similar words. In: Proceedings of Coling 1998, Montreal, vol. 2, pp. 768–774 (1998)
  6. Freitag, D., Blume, M., Byrnes, J., Chow, E., Kapadia, S., Rohwer, R., Wang, Z.: New experiments in distributional representations of synonymy. In: Proceedings of CoNLL, pp. 25–32 (2005)
  7. van der Plas, L., Tiedemann, J.: Finding synonyms using automatic word alignment and measures of distributional similarity. In: Proceedings of the COLING/ACL Poster Sessions, pp. 866–873 (2006)
  8. Wu, H., Zhou, M.: Optimizing synonyms extraction with mono and bilingual resources. In: Proceedings of the Second International Workshop on Paraphrasing. Association for Computational Linguistics, Sapporo (2003)
  9. Dyvik, H.: Translations as semantic mirrors: From parallel corpus to wordnet. In: The Theory and Use of English Language Corpora, ICAME 2002 (2002)
  10. Bannard, C., Callison-Burch, C.: Paraphrasing with bilingual parallel corpora. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pp. 597–604 (2005)
  11. Zhitomirsky-Geffet, M., Dagan, I.: Bootstrapping distributional feature vector quality. Computational Linguistics 35(3), 435–461 (2009)
  12. Weeds, J.E.: Measures and Applications of Lexical Distributional Similarity. PhD thesis, University of Sussex (2003)
  13. Barzilay, R., McKeown, K.R.: Extracting paraphrases from a parallel corpus. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (2001)
  14. Lin, D., Zhao, S., Qin, L., Zhou, M.: Identifying synonyms among distributionally similar words. In: Proceedings of IJCAI 2003, pp. 1492–1493 (2003)
  15. Curran, J.R., Moens, M.: Improvements in automatic thesaurus extraction. In: Proceedings of the ACL 2002 Workshop on Unsupervised Lexical Acquisition, pp. 59–66 (2002)
  16. Lonneke, P., Tiedemann, J., Manguin, J.: Automatic acquisition of synonyms for French using parallel corpora. In: Proceedings of the 4th International Workshop on Distributed Agent-Based Retrieval Tools (2010)
  17. Hagiwara, M., Ogawa, Y., Toyama, K.: Supervised synonym acquisition using distributional features and syntactic patterns. Journal of Natural Language Processing 16(2), 59–83 (2009)
  18. Ferret, O.: Testing semantic similarity measures for extracting synonyms from a corpus. In: Proceeding of LREC (2010)
  19. Turney, P.: A uniform approach to analogies, synonyms, antonyms, and associations. In: Proceedings of Coling 2008, pp. 905–912 (2008)
  20. Miller, G., Charles, W.: Contextual correlates of semantic similarity. Language and Cognitive Processes 6(1), 1–28 (1991)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Philippe Muller (1)
  • Philippe Langlais (2)
  1. IRIT, Univ. Toulouse & Alpage, INRIA, France
  2. DIRO, Univ. Montréal, Canada
