Unsupervised Cross-Lingual Sentence Representation Learning via Linguistic Isomorphism

  • Shuai Wang
  • Lei Hou
  • Meihan Tong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11776)

Abstract

Much recent research on learning cross-lingual word embeddings without parallel data has succeeded by exploiting word-level isomorphism among languages. However, unsupervised cross-lingual sentence representation, which aims to learn a unified semantic space without parallel data, has not been well explored. Although many cross-lingual tasks can be solved by learning a unified sentence representation across languages on top of cross-lingual word embeddings, the resulting performance is not competitive with supervised counterparts. In this paper, we propose a novel framework for unsupervised cross-lingual sentence representation learning that exploits linguistic isomorphism at both the word and sentence levels. After generating pseudo-parallel sentences from pre-trained cross-lingual word embeddings, the framework iteratively performs sentence modeling, word embedding tuning, and parallel sentence updating. Our experiments show that the proposed framework achieves state-of-the-art results on many cross-lingual tasks and also improves the quality of the cross-lingual word embeddings. The code and pre-trained encoders will be released upon publication.
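To make the pipeline concrete, below is a minimal sketch in Python of the iterative loop the abstract describes. Everything beyond the abstract is an assumption for illustration: the sentence encoder is approximated by a mean of word vectors, pseudo-parallel pairs are mined by cosine nearest neighbors, and the names mine_pseudo_parallel, train_encoder, and tune_embeddings are hypothetical placeholders, not the authors' implementation.

    # Illustrative sketch only: the paper's actual sentence model and
    # embedding-tuning procedures are not specified in this abstract.
    import numpy as np

    def sentence_vec(tokens, emb, dim):
        # Crude stand-in for a sentence encoder: average of word vectors.
        vecs = [emb[t] for t in tokens if t in emb]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    def mine_pseudo_parallel(src_sents, tgt_sents, src_emb, tgt_emb, dim):
        # Pair each source sentence with its cosine-nearest target sentence
        # in the shared cross-lingual embedding space.
        S = np.stack([sentence_vec(s, src_emb, dim) for s in src_sents])
        T = np.stack([sentence_vec(t, tgt_emb, dim) for t in tgt_sents])
        S /= np.linalg.norm(S, axis=1, keepdims=True) + 1e-9
        T /= np.linalg.norm(T, axis=1, keepdims=True) + 1e-9
        best = (S @ T.T).argmax(axis=1)
        return [(src_sents[i], tgt_sents[j]) for i, j in enumerate(best)]

    # Toy demo. Real cross-lingual word embeddings would come from an
    # unsupervised mapping method such as MUSE or VecMap.
    dim = 4
    rng = np.random.default_rng(0)
    emb = {w: rng.normal(size=dim) for w in ["cat", "dog"]}
    emb["gato"] = emb["cat"] + 0.01   # pretend the mapping aligned these
    emb["perro"] = emb["dog"] + 0.01

    src, tgt = [["cat"], ["dog"]], [["perro"], ["gato"]]
    print(mine_pseudo_parallel(src, tgt, emb, emb, dim))
    # expected: [(['cat'], ['gato']), (['dog'], ['perro'])]

    # The full framework would then alternate, schematically:
    # for _ in range(n_iters):
    #     pairs = mine_pseudo_parallel(src, tgt, src_emb, tgt_emb, dim)
    #     encoder = train_encoder(pairs)             # sentence modeling
    #     src_emb, tgt_emb = tune_embeddings(pairs)  # word embedding tuning
    # (train_encoder and tune_embeddings are hypothetical helpers; the
    # paper defines what happens in these two steps.)

Even in this toy form, the bootstrapping idea from the abstract is visible: better sentence representations yield better pseudo-parallel pairs, which in turn can refine the word embeddings, so the loop improves without any true parallel data.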

Keywords

Unsupervised learning · Cross-lingual · Sentence representation · Language model

Acknowledgement

This work is supported by NSFC key projects (U1736204, 61533018, 61661146007), the Ministry of Education and China Mobile Joint Fund (MCM20170301), and the THUNUS NExT Co-Lab.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Tsinghua University, Beijing, China
