Abstract
Beam search is the most widely used decoding algorithm in machine translation. Its success, however, may be attributed to its inadvertent implementation of the Uniform Information Density (UID) hypothesis, which holds that humans prefer utterances that distribute information evenly across the linguistic signal while adhering to grammatical constraints. This paper presents Nucleus Beam Search, a novel machine translation decoding algorithm that pursues the UID objective directly. By combining nucleus filtering with beam search, our approach expands the search space without violating the UID hypothesis, enabling the generation of longer and more comprehensive translations. Experimental results show that Nucleus Beam Search outperforms traditional decoding algorithms in terms of BLEU, METEOR, ROUGE-L, and CIDEr scores. Nevertheless, our findings also suggest that information density is not the sole determinant of translation quality; beam width plays a significant role as well.
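The abstract describes combining nucleus (top-p) filtering with beam search but gives no pseudocode, so the following is a minimal sketch of one plausible reading: at each decoding step, each beam's next-token distribution is truncated to its nucleus before candidate expansion, and the top candidates by cumulative log-probability survive. The function names, the toy probability model, and the parameter defaults (`p`, `beam_width`) are illustrative assumptions, not the authors' implementation.

```python
import math

def nucleus_filter(probs, p=0.9):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches p (the 'nucleus')."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for tok, pr in ranked:
        kept.append((tok, pr))
        total += pr
        if total >= p:
            break
    return kept

def nucleus_beam_search(step_probs, beam_width=4, p=0.9, max_len=10, eos="</s>"):
    """step_probs(prefix) -> dict of next-token probabilities.
    Beams are (token tuple, cumulative log-probability) pairs."""
    beams = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:        # finished hypotheses carry over
                candidates.append((seq, score))
                continue
            # Expand only within the nucleus, unlike vanilla beam search,
            # which considers every token in the vocabulary.
            for tok, pr in nucleus_filter(step_probs(seq), p):
                candidates.append((seq + (tok,), score + math.log(pr)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(s and s[-1] == eos for s, _ in beams):
            break
    return beams

# Toy model: three first tokens, then end-of-sequence.
def toy(seq):
    if not seq:
        return {"a": 0.6, "b": 0.3, "c": 0.1}
    return {"</s>": 1.0}

best_seq, best_score = nucleus_beam_search(toy, beam_width=2, p=0.9)[0]
```

Under this reading, nucleus filtering prunes the low-probability tail that beam search would otherwise explore, which is how the combination can widen the effective beam without admitting hypotheses that violate the UID objective.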
Acknowledgements
This work is supported by Sichuan Science and Technology Program (2022ZHCG0007), and the Natural Science Foundation of Sichuan Province (2022NSFSC0503).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Chen, Z., Tao, R., Wang, Y. (2023). Nucleus Beam Search for Machine Translation Decoding. In: Huang, D.S., Premaratne, P., Jin, B., Qu, B., Jo, K.H., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol. 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_49
Print ISBN: 978-981-99-4751-5
Online ISBN: 978-981-99-4752-2