
Dual Learning: Theoretical Study and an Algorithmic Extension

  • Original Research
  • Published in SN Computer Science

Abstract

Dual learning has been successfully applied in many machine learning applications, including machine translation and image-to-image translation. The high-level idea of dual learning is intuitive: if we map an x from one domain to another and then map it back, we should recover the original x. Although its effectiveness has been empirically verified, the theoretical understanding of dual learning is still limited. In this paper, we characterize sufficient conditions under which dual learning outperforms vanilla translators. Based on our theoretical analysis, we further extend dual learning by introducing more related mappings and propose multi-step dual learning, in which we leverage feedback signals from additional domains to improve the quality of the mappings. We show that multi-step dual learning has the potential to boost the performance of dual learning. Experiments on WMT'14 English \(\leftrightarrow\) German, MultiUN English \(\leftrightarrow\) French, and IWSLT'17 English \(\leftrightarrow\) Chinese translation verify our theoretical findings on dual learning, and the results on translations among English, French, and Spanish on MultiUN demonstrate the effectiveness of multi-step dual learning.
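To make the dual reconstruction signal concrete, the following is a minimal sketch of the two-step and multi-step feedback losses on toy continuous domains. The translator names (f_xy, f_yx, f_yz, f_zx), the linear models, and the MSE reconstruction error are illustrative assumptions, not the paper's method; in the machine translation setting the mappings are NMT models and the feedback is a likelihood over discrete sentences rather than a mean-squared error.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
DIM = 8

# Hypothetical translators between three continuous "domains" X, Y, Z.
# Real dual learning would use NMT models; linear maps keep the sketch runnable.
f_xy = nn.Linear(DIM, DIM)  # X -> Y
f_yx = nn.Linear(DIM, DIM)  # Y -> X
f_yz = nn.Linear(DIM, DIM)  # Y -> Z  (used only by the multi-step variant)
f_zx = nn.Linear(DIM, DIM)  # Z -> X  (used only by the multi-step variant)

params = (list(f_xy.parameters()) + list(f_yx.parameters())
          + list(f_yz.parameters()) + list(f_zx.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

x = torch.randn(32, DIM)  # a batch of unlabeled samples from domain X

# Two-step dual-learning feedback: X -> Y -> X should reconstruct x.
loss_dual = F.mse_loss(f_yx(f_xy(x)), x)

# Multi-step feedback through an additional domain: X -> Y -> Z -> X.
loss_multi = F.mse_loss(f_zx(f_yz(f_xy(x))), x)

loss = loss_dual + loss_multi  # the relative weighting is a free design choice
opt.zero_grad()
loss.backward()
opt.step()
```

The key point the sketch conveys is that both losses need only unlabeled samples from the source domain: the reconstruction error itself supplies the training signal, and the multi-step cycle simply routes that signal through one more domain before closing the loop.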


Notes

  1. Data available at http://www.statmt.org/wmt14/translation-task.html.

  2. http://opus.nlpl.eu/MultiUN.php.

  3. https://sites.google.com/site/iwsltevaluation2017/TED-tasks.

  4. https://github.com/fxsjy/jieba.

  5. http://data.statmt.org/news-crawl/.

  6. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu.perl.

  7. Signature: BLEU+case.mixed+lang.en-zh+numrefs.1+smooth.exp+test.iwslt17+tok.zh+version.1.5.1


Author information

Corresponding author

Correspondence to Zhibing Zhao.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “ACML 2020” guest edited by Masashi Sugiyama, Sinno Jialin Pan, Thanaruk Theeramunkong and Wray Buntine.


About this article

Cite this article

Zhao, Z., Xia, Y., Qin, T. et al. Dual Learning: Theoretical Study and an Algorithmic Extension. SN COMPUT. SCI. 2, 413 (2021). https://doi.org/10.1007/s42979-021-00799-y
