Unsupervised Cross-Lingual Mapping for Phrase Embedding Spaces

  • Abraham G. Ayana
  • Hailong Cao
  • Tiejun Zhao
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1130)

Abstract

Cross-lingual embedding offers an effective way to learn cross-lingual representations in a joint embedding space. Recent work has shown that cross-lingual phrase embeddings are important for inducing the phrase table of unsupervised phrase-based machine translation. However, most cross-lingual representations in the literature are either limited to word-level embeddings or rely on bilingual supervision to build a shared phrase embedding space. In this paper, we therefore explore ways to map the phrase embeddings of two languages into a common embedding space without supervision. Our model uses a three-step process: first, we identify phrases in a sentence by the mutual information of their component words and merge those words in a preprocessing stage; then we independently learn phrase embeddings for each language based on their distributional properties; finally, a fully unsupervised linear transformation method based on self-learning maps the phrase embeddings into a shared space. We extracted bilingual phrase translations as a gold standard to evaluate the system. Despite its simplicity, the proposed method shows promising results for phrase embedding mapping.
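The first preprocessing step described above, identifying phrases by the mutual information of adjacent words and merging their component words into a single token, can be sketched as follows. The normalized-PMI scoring and the threshold value here are illustrative assumptions, not the paper's exact configuration:

```python
import math
from collections import Counter

def merge_phrases(sentences, threshold=0.5, delimiter="_"):
    """Merge adjacent word pairs whose normalized PMI exceeds a threshold.

    A simplified sketch of mutual-information-based phrase detection:
    count unigrams and bigrams, score each adjacent pair by normalized
    PMI, and join high-scoring pairs with a delimiter so that later
    embedding training treats the phrase as one token.
    """
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def npmi(pair):
        # PMI normalized by -log p(x, y), pushing scores toward [-1, 1]
        p_xy = bigrams[pair] / n_bi
        p_x = unigrams[pair[0]] / n_uni
        p_y = unigrams[pair[1]] / n_uni
        return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

    merged = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and npmi((sent[i], sent[i + 1])) > threshold:
                out.append(sent[i] + delimiter + sent[i + 1])
                i += 2  # greedily consume both words of the merged pair
            else:
                out.append(sent[i])
                i += 1
        merged.append(out)
    return merged
```

After merging, each language's corpus is fed to a standard embedding trainer, so phrases receive vectors exactly as single words do.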

Keywords

Cross-lingual mapping · Word embedding · Phrase embedding · Machine translation · Mutual information · Linear transformation
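The final mapping step, an unsupervised linear transformation learned by self-learning, alternates between fitting a linear map on an induced dictionary and re-inducing that dictionary by nearest neighbors in the mapped space. A minimal sketch, under the common assumption that the inner fit is an orthogonal Procrustes problem solved in closed form with an SVD (the paper's exact initialization and stopping criteria are not reproduced here):

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal W minimizing ||XW - Y||_F, via SVD of X^T Y.

    X, Y: (n, d) arrays of source/target embeddings for dictionary pairs.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def self_learning(X_src, Y_tgt, n_iters=5):
    """Alternate mapping and dictionary induction (a simplified loop)."""
    W = np.eye(X_src.shape[1])
    for _ in range(n_iters):
        sims = (X_src @ W) @ Y_tgt.T
        idx = sims.argmax(axis=1)          # nearest target for each source
        W = procrustes_map(X_src, Y_tgt[idx])
    return W
```

Because W is a product of orthogonal factors, the mapping preserves distances within each embedding space, which is what makes nearest-neighbor dictionary induction meaningful at every iteration.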

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China