Abstract
Beam search is the most widely used decoding algorithm in machine translation. Its success, however, may be attributed to its inadvertent implementation of the Uniform Information Density (UID) hypothesis, which holds that humans prefer utterances that distribute information evenly across the linguistic signal while adhering to grammatical constraints. This paper presents Nucleus Beam Search, a novel machine translation decoding algorithm that pursues the UID objective directly. By combining nucleus filtering with beam search, our approach expands the search space without violating the UID hypothesis, enabling the generation of longer and more comprehensive translations. Experimental results show that Nucleus Beam Search outperforms traditional decoding algorithms in terms of BLEU, METEOR, ROUGE-L, and CIDEr scores. Nevertheless, our findings also suggest that information density is not the sole determinant of translation quality; beam width plays a significant role as well.
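The abstract describes combining nucleus (top-p) filtering with beam search but gives no pseudocode, so the following is a minimal sketch of one plausible reading: at each decoding step, each beam's next-token distribution is truncated to its nucleus before candidate expansion, and the top candidates by cumulative log-probability survive. The function names, the toy probability model, and the parameter defaults (`p`, `beam_width`) are illustrative assumptions, not the authors' implementation.

```python
import math

def nucleus_filter(probs, p=0.9):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches p (the 'nucleus')."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for tok, pr in ranked:
        kept.append((tok, pr))
        total += pr
        if total >= p:
            break
    return kept

def nucleus_beam_search(step_probs, beam_width=4, p=0.9, max_len=10, eos="</s>"):
    """step_probs(prefix) -> dict of next-token probabilities.
    Beams are (token tuple, cumulative log-probability) pairs."""
    beams = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:        # finished hypotheses carry over
                candidates.append((seq, score))
                continue
            # Expand only within the nucleus, unlike vanilla beam search,
            # which considers every token in the vocabulary.
            for tok, pr in nucleus_filter(step_probs(seq), p):
                candidates.append((seq + (tok,), score + math.log(pr)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(s and s[-1] == eos for s, _ in beams):
            break
    return beams

# Toy model: three first tokens, then end-of-sequence.
def toy(seq):
    if not seq:
        return {"a": 0.6, "b": 0.3, "c": 0.1}
    return {"</s>": 1.0}

best_seq, best_score = nucleus_beam_search(toy, beam_width=2, p=0.9)[0]
```

Under this reading, nucleus filtering prunes the low-probability tail that beam search would otherwise explore, which is how the combination can widen the effective beam without admitting hypotheses that violate the UID objective.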
Acknowledgements
This work is supported by Sichuan Science and Technology Program (2022ZHCG0007), and the Natural Science Foundation of Sichuan Province (2022NSFSC0503).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Chen, Z., Tao, R., Wang, Y. (2023). Nucleus Beam Search for Machine Translation Decoding. In: Huang, D.S., Premaratne, P., Jin, B., Qu, B., Jo, K.H., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol. 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_49
Print ISBN: 978-981-99-4751-5
Online ISBN: 978-981-99-4752-2