A Bayesian non-linear method for feature selection in machine translation quality estimation

Machine Translation

Abstract

We perform a systematic analysis of the effectiveness of features for the problem of predicting the quality of machine translation (MT) at the sentence level. Starting from a comprehensive feature set, we apply a technique based on Gaussian processes, a Bayesian non-linear learning method, to automatically identify features leading to accurate model performance. We consider application to several datasets across different language pairs and text domains, with translations produced by various MT systems and scored for quality according to different evaluation criteria. We show that selecting features with this technique leads to significantly better performance in most datasets, as compared to using the complete feature sets or a state-of-the-art feature selection approach. In addition, we identify a small set of features which seem to perform well across most datasets.

Notes

  1. A fuzzy match score is the percentage of words shared between a segment to be translated and previously translated segments stored in a database; for such stored segments a correct translation is available and can be reused directly.

  2. http://www.sheffieldml.github.io/GPy/.

  3. http://www.quest.dcs.shef.ac.uk.

  4. http://www.dcs.shef.ac.uk/~lucia/resources.html.

  5. http://www.statmt.org/moses/?n=Moses.Baseline.

  6. This formulation is equivalent to sentence-level BLEU without a brevity penalty, with smoothed n-gram precision scores: we add one to both the numerator and the denominator of each precision term to avoid division by zero.

  7. These feature sets were made available by the task organisers at http://www.dcs.shef.ac.uk/~lucia/resources.html.

  8. The GP trained on the selected features consistently outperforms the linear model learned by RL.
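
The smoothed score described in note 6 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, whitespace tokenisation, and the maximum n-gram order of 4 are assumptions made for illustration, since the note states only that one is added to the numerator and denominator of each n-gram precision and that no brevity penalty is applied.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_sentence_bleu(hypothesis, reference, max_order=4):
    """Geometric mean of add-one smoothed n-gram precisions, no brevity penalty.

    max_order=4 and whitespace tokenisation are assumptions; the note
    does not specify either.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    log_precision_sum = 0.0
    for n in range(1, max_order + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: a hypothesis n-gram is credited at most as
        # many times as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        # Add one to numerator and denominator so the ratio is never 0/0
        # and its logarithm is always defined.
        log_precision_sum += math.log((matches + 1) / (total + 1))
    return math.exp(log_precision_sum / max_order)
```

With this sketch, a hypothesis identical to its reference scores 1.0, because every smoothed precision equals (m + 1) / (m + 1). The add-one terms also keep the logarithm defined when a short hypothesis contains no n-grams of a given order.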

Acknowledgments

This work has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under Grant agreement No. 296347 (QTLaunchPad). Dr Cohn is the recipient of an Australian Research Council Future Fellowship (Project number FT130101105).

Author information

Corresponding author

Correspondence to Kashif Shah.

Cite this article

Shah, K., Cohn, T. & Specia, L. A Bayesian non-linear method for feature selection in machine translation quality estimation. Machine Translation 29, 101–125 (2015). https://doi.org/10.1007/s10590-014-9164-x

  • DOI: https://doi.org/10.1007/s10590-014-9164-x
