BBN’s low-resource machine translation for the LoReHLT 2016 evaluation

Setiawan, Hendra; Huang, Zhongqiang; Zbib, Rabih

doi:10.1007/s10590-017-9206-2

Published: 24 October 2017

Volume 32, pages 45–57, (2018)

Machine Translation

Abstract

We describe BBN’s contribution to the machine translation (MT) task in the LoReHLT 2016 evaluation, focusing on the techniques and methodologies employed to build the Uyghur–English MT systems in low-resource conditions. In particular, we discuss the data selection process, morphological segmentation of the source, neural network feature models, and our use of a native informant and related language resources. Our final submission for the evaluation was ranked first among all participants.

Notes

LDC catalogue number LDC2011T07.
http://dict.yulghun.com.

References

Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of the international conference on learning representations
Chiang D (2005) A hierarchical phrase-based model for statistical machine translation. In: ACL ’05 Proceedings of the 43rd annual meeting on association for computational linguistics, Association for Computational Linguistics, Stroudsburg, PA, pp 263–270. doi:10.3115/1219840.1219873
Chiang D, Knight K, Wang W (2009) 11,001 new features for statistical machine translation. In: NAACL ’09: proceedings of the 2009 human language technology conference of the North American chapter of the association for computational linguistics, pp 218–226
Devlin J (2009) Lexical features for statistical machine translation. Master’s thesis, University of Maryland
Devlin J, Matsoukas S (2012) Trait-based hypothesis selection for machine translation. In: NAACL HLT ’12 proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: human language technologies, Association for Computational Linguistics, Stroudsburg, PA, pp 528–532. http://dl.acm.org/citation.cfm?id=2382029.2382107
Devlin J, Zbib R, Huang Z, Lamar T, Schwartz R, Makhoul J (2014) Fast and robust neural network joint models for statistical machine translation. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (vol 1: Long Papers), Association for Computational Linguistics, Baltimore, Maryland, pp 1370–1380. http://www.aclweb.org/anthology/P14-1129
Gehring J, Auli M, Grangier D, Yarats D, Dauphin YN (2017) Convolutional sequence to sequence learning. arXiv:1705.03122
Grönroos SA, Virpioja S, Smit P, Kurimo M (2014) Morfessor flatcat: an HMM-based method for unsupervised and semi-supervised learning of morphology. In: Proceedings of COLING 2014, the 25th international conference on computational linguistics: Technical Papers, Dublin City University and Association for Computational Linguistics, Dublin, pp 1177–1185. http://www.aclweb.org/anthology/C14-1111
Ha TL, Niehues J, Waibel A (2016) Toward multilingual neural machine translation with universal encoder and decoder. In: Proceedings of the 13th international workshop on spoken language translation
Haghighi A, Blitzer J, DeNero J, Klein D (2009) Better word alignments with supervised ITG models. In: Proceedings of ACL, Association for Computational Linguistics, Suntec, pp 923–931. http://www.aclweb.org/anthology/P/P09/P09-1104
Johnson M, Schuster M, Le QV, Krikun M, Wu Y, Chen Z, Thorat N, Viégas F, Wattenberg M, Corrado G, Hughes M, Dean J (2016) Google’s multilingual neural machine translation system: enabling zero-shot translation. arXiv:1611.04558
Koehn P, Och FJ, Marcu D (2003) Statistical phrase-based translation. In: Proceedings of the 2003 human language technology conference of the North American chapter of the association for computational linguistics, Edmonton, pp 48–54
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51
Papineni K, Roukos S, Ward T, Zhu W (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics (ACL), Philadelphia, PA
Rosti AI, Ayan NF, Xiang B, Matsoukas S, Schwartz R, Dorr B (2007) Combining outputs from multiple machine translation systems. In: Proceedings of the 2007 human language technology conference of the North American chapter of the association for computational linguistics, Rochester, NY
Rosti AI, Zhang B, Matsoukas S, Schwartz R (2010) BBN system description for WMT10 system combination task. In: ACL 2010 joint fifth workshop on statistical machine translation and metrics MATR, Uppsala
Sennrich R, Haddow B, Birch A (2016) Edinburgh neural machine translation systems for WMT 16. In: Proceedings of the first conference on machine translation
Setiawan H, Huang Z, Devlin J, Lamar T, Zbib R, Schwartz R, Makhoul J (2015) Statistical machine translation features with multitask tensor networks. In: Proceedings of ACL, Association for Computational Linguistics
Shen L, Xu J, Weischedel R (2008) A new string-to-dependency machine translation algorithm with a target dependency language model. In: Proceedings of the 46th annual meeting of the association for computational linguistics (ACL), Columbus, Ohio, pp 577–585
Shen L, Xu J, Weischedel R (2010) String-to-dependency statistical machine translation. Comput Linguist 36(4):649–671
Stallard D, Devlin J, Kayser M, Lee YK, Barzilay R (2012) Unsupervised morphology rivals supervised morphology for Arabic MT. In: The 50th annual meeting of the association for computational linguistics, Proceedings of the conference, vol 2: short papers, Jeju Island, Korea, 8–14 July 2012, pp 322–327. http://www.aclweb.org/anthology/P12-2063
Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser Ł, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv:1609.08144
Xu H, Marcus M, Ungar L, Yang C (2017) Unsupervised morphology learning with statistical paradigms (Unpublished)

Acknowledgements

This work was supported by DARPA/I2O under the LORELEI program. The views, opinions, and/or findings contained in this article are those of the author and should not be interpreted as representing the official views or policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the Department of Defense.

Author information

Authors and Affiliations

Apple Inc., 1 Infinite Loop, Cupertino, CA, 95014, USA
Hendra Setiawan
Raytheon BBN Technologies, 10 Moulton Street, Cambridge, MA, 02138, USA
Zhongqiang Huang & Rabih Zbib

Corresponding author

Correspondence to Rabih Zbib.

Additional information

Work done while Hendra Setiawan was at Raytheon BBN Technologies.

This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR0011-15-C-0113. The views, opinions and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. Distribution Statement ‘A’ (Approved for Public Release by DARPA on Aug 29, 2017 (DISTAR Approval #28392), Distribution Unlimited).

About this article

Cite this article

Setiawan, H., Huang, Z. & Zbib, R. BBN’s low-resource machine translation for the LoReHLT 2016 evaluation. Machine Translation 32, 45–57 (2018). https://doi.org/10.1007/s10590-017-9206-2

Received: 29 September 2017
Accepted: 05 October 2017
Published: 24 October 2017
Issue Date: June 2018
DOI: https://doi.org/10.1007/s10590-017-9206-2
