Abstract
Although machine translation (MT) has been studied for decades, the output of state-of-the-art MT systems still contains errors for many language pairs. To cope with this, considerable effort has gone into post-editing MT output, either manually or automatically. Manual post-editing is more accurate but can become prohibitively costly when many changes are needed. Automatic post-editing demands less effort but can be less effective and may introduce new errors. One way to avoid unnecessary automatic post-editing, and the new errors it can introduce, is to select in advance only those machine-translated segments that really need to be post-edited. This paper therefore describes experiments on automatically identifying MT errors produced by a state-of-the-art phrase-based statistical MT system. Although our experiments used a statistical MT engine, we believe the approach can also be applied to other types of MT systems. The experiments investigated three well-known machine-learning algorithms: Naive Bayes, Decision Trees and Support Vector Machines. With the decision tree algorithm, wrong segments were identified with around 77% precision and recall using a small training corpus of only 2,147 error instances. Our experiments were performed on English-to-Brazilian Portuguese MT; although some of the features are language-dependent, the proposed approach is language-independent and can be easily generalized to other language pairs.
Notes
Translation gisting is “[...] where users just want to understand in their own native language(s) the main idea(s) of a document that only exists in a foreign language. For such needs, a perfect translation and the ensuing details are not as critical.” (Allen 2003, p. 300).
“[...] the guiding objective in minimal post-editing context is to make the least amount of comments possible for producing an understandable working document, rather than producing a high-quality document” (Allen 2003, p. 305).
Available at: http://www.dfki.de/~mapo02/hjerson/.
Available at: https://wiki.ufal.ms.mff.cuni.cz/user:zeman:addicter.
Available at: http://terra.cl.uzh.ch/terrorcat.html/.
The test-a set of the FAPESP corpus is available at: http://pers-www.wlv.ac.uk/~in1676/resources/fapesp/index.html.
TAEIP was trained using Moses (Koehn et al. 2007) on the entire FAPESP Brazilian Portuguese–English parallel corpus (Aziz and Specia 2011), as explained in Vieira and Caseli (2011). The PorTAl translation systems can be accessed at: http://www.lalic.dc.ufscar.br/portal.
The reference sentences are not used to train the ML algorithms, but they helped the human experts during the manual annotation of the training corpus (as explained in Sect. 3.2).
A multiword expression can be defined as any word combination for which the syntactic or semantic properties of the whole expression cannot be obtained from its parts (Sag et al. 2002).
In the morphological features genTokenNBefSys, genTokenNAftSys, numTokenNBefSys, numTokenNAftSys, poSTokenNBefSrc and poSTokenNAftSrc, the N represents the position of the token with respect to the CT. For example, for the W of size 5 in Fig. 4, they will be: genToken1BefSys (m), genToken2BefSys (NC), genToken1AftSys (m), genToken2AftSys (m), numToken1BefSys (pl), numToken2BefSys (NC), numToken1AftSys (sg), numToken2AftSys (sg), poSToken1BefSrc (det), poSToken2BefSrc (pr), poSToken1AftSrc (pr) and poSToken2AftSrc (det). See Table 2 to check the values of these features.
For the test instances where W is checked for errors, the LCT is also obtained from Sys.
The test-b set of the FAPESP corpus is available at: http://pers-www.wlv.ac.uk/~in1676/resources/fapesp/index.html. The source sentences of test-b were translated by the English-to-Brazilian Portuguese PB-SMT system of PorTAl (TAEIP).
The J48 Weka confidence parameter (-C) sets the confidence factor used by the confidence-based pruning mechanism of C4.5. More information can be found in Witten and Frank (2005).
The SMO Weka complexity parameter (-C) controls the building of the hyperplane between two target classes. More information can be found in Bottou and Lin (2007).
As explained in Sect. 3.3, the number of features varies with the size of W. It is 32 for 5-W, 40 for 7-W and 56 for 11-W.
The Brazilian WordNet is under development, but parts of it have already been made available at: http://www.nilc.icmc.usp.br/wordnetbr/index.html.
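The naming scheme of the morphological context features described in the note above (genTokenNBefSys, numTokenNAftSys, poSTokenNBefSrc and so on) can be sketched in Python. This is only an illustration and not the original implementation: the function name `context_feature_names` is hypothetical, and the sketch assumes that W is odd and centred on the CT, with (W − 1)/2 tokens on each side, which is consistent with the 5-W example but is not stated explicitly in the notes.

```python
def context_feature_names(window_size):
    """Return the morphological context feature names for a window W.

    Assumption (not from the source): W is odd and centred on the CT,
    so there are (W - 1) / 2 context tokens before and after it.
    """
    if window_size < 3 or window_size % 2 == 0:
        raise ValueError("window_size must be an odd integer >= 3")

    half = (window_size - 1) // 2

    # (feature prefix, sentence it is read from), as named in the note:
    # gender and number come from Sys, part of speech from Src.
    families = [("genToken", "Sys"), ("numToken", "Sys"), ("poSToken", "Src")]

    names = []
    for prefix, side in families:
        for direction in ("Bef", "Aft"):
            # N is the distance of the token from the CT (1 = adjacent).
            for n in range(1, half + 1):
                names.append(f"{prefix}{n}{direction}{side}")
    return names


if __name__ == "__main__":
    for name in context_feature_names(5):
        print(name)
```

For W of size 5 the sketch yields the twelve names listed in the note, from genToken1BefSys to poSToken2AftSrc. These are only part of the 32 features reported for 5-W; the remaining features are not covered here.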
References
Allen J (2003) Post-editing. In: Somers H (ed) Computers and translation: a translator’s guide. John Benjamins Publishing Company, Amsterdam, pp 297–317
Allen J, Hogan C (2000) Toward the development of a post-editing module for raw machine translation output: a controlled language perspective. Proceedings of the third international workshop on controlled language applications. Seattle, pp 62–71
Armentano-Oller C, Carrasco RC, Corbí-Bellot AM, Forcada ML, Ginestí-Rosell M, Ortiz-Rojas S, Pérez-Ortiz J, Ramírez-Sánchez G, Sánchez-Martínez F, Scalco MA (2006) Open-source Portuguese-Spanish machine translation. Proceedings of the 7th international workshop on computational processing of written and spoken Portuguese. Itatiaia, pp 50–59
Aziz W, Specia L (2011) Fully automatic compilation of Portuguese–English and Portuguese–Spanish parallel corpora. Proceedings of the 8th Brazilian symposium in information and human language technology (STIL 2011). Cuiabá, pp 234–238
Aziz W, de Sousa SCM, Specia L (2012) PET: a tool for post-editing and assessing machine translation. Eighth international conference on language resources and evaluation (LREC 2012). Istanbul, pp 3982–3987
Béchara H, Ma Y, van Genabith J (2011) Statistical post-editing for a statistical MT system. Proceedings of the thirteenth machine translation summit (MT Summit XIII). Xiamen, pp 308–315
Blatz J, Fitzgerald E, Foster G, Gandrabur S, Goutte C, Kulesza A, Sanchis A, Ueffing N (2004) Confidence estimation for machine translation. Twentieth international conference on computational linguistics. Proceedings, Geneva, pp 315–321
Bottou L, Lin CJ (2007) Support vector machine solvers. In: Bottou L, Chapelle O, DeCoste D, Weston J (eds) Large scale kernel machines. MIT Press, Cambridge, pp 301–320. http://leon.bottou.org/papers/bottou-lin-2006
Caseli HM (2007) Indução de léxicos bilíngües e regras para a tradução automática. PhD thesis, USP, São Carlos, São Paulo
Caseli HM, Nunes M, Forcada M (2006) Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation. Mach Transl 20(4):227–245. doi:10.1007/s10590-007-9027-9
Doddington G (2002) Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. HLT 2002: human language technology conference: proceedings of the second international conference on human language technology research. San Diego, pp 138–145
Elming J (2006) Transformation-based correction of rule-based MT. Eleventh annual conference of the European association for machine translation. Proceedings, Oslo, pp 219–226
Felice M, Specia L (2012) Linguistic features for quality estimation. Proceedings of the 7th workshop on statistical machine translation. Montreal, pp 96–103
Fishel M, Sennrich R, Popovic M, Bojar O (2012) TerrorCat. Proceedings of the 7th workshop on statistical machine translation. Montreal, pp 64–70
Font Llitjós A (2007) Automatic improvement of machine translation systems. PhD thesis, Carnegie Mellon University, Pittsburgh
George C, Japkowicz N (2005) Automatic correction of French to English relative pronoun translations using natural language processing and machine learning techniques. In: Computational linguistics in the North East (CLiNE’05), Ottawa
Gomes FT, Pardo TAS (2008) Trapezio–Translation post editor. In: Anais do Congresso da Academia Trinacional de Ciências (C3N), Foz do Iguaçu, Paraná, pp 1–10 http://www.icmc.usp.br/ taspardo/C3N2008-TassarioPardo
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software. SIGKDD Explor 11:10–18
Hastie T, Tibshirani R (1998) Classification by pairwise coupling. Ann Stat 26(2):451–471
John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. Eleventh conference on uncertainty in artificial intelligence. San Mateo, pp 338–345
Kawamorita C, Caseli HM (2012) Memórias de Tradução: auxiliando o humano a traduzir, trabalho apresentado no Encontro de Linguística de Corpus (ELC 2012), São Carlos, São Paulo, p 10. http://nilc.icmc.sc.usp.br/elc-ebralc2012/anais/completos/103989
Keerthi S, Shevade S, Bhattacharyya C, Murthy K (2001) Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Comput 13(3):637–649
Knight K, Chander I (1994) Automated post-editing of documents. Proceedings of the 12th national conference on artificial intelligence. Seattle, pp 779–784
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. Proceedings of the ACL 2007 demo and poster sessions. Prague, Czech Republic, pp 177–180
Krings HP (2001) Repairing texts—empirical investigations of machine translation post-editing processes. The Kent State University Press, Kent
Lagarda AL, Alabau V, Casacuberta F, Silva R, Díaz-de-Liaño E (2009) Statistical post-editing of a rule-based machine translation system. In: Human language technologies: the 2009 annual conference of the North American chapter of the association for computational linguistics, Proceedings of the conference, Boulder, pp 217–220
Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions and reversals. Sov Phys Dokl 10(8):707–710
Martins DBJ (2014) Pós-edição automática de textos traduzidos automaticamente de inglês para português do Brasil. Master’s thesis, Centro de Ciências Exatas e de Tecnologia - Programa de Pós-graduação em Ciência da Computação, Universidade Federal de São Carlos, São Paulo http://www.bdtd.ufscar.br/htdocs/tedeSimplificado/tde_busca/arquivo.php?codArquivo=7354
Martins DBJ, Caseli HM (2013) Anotação manual de erros de tradução automática em textos traduzidos de inglês para português do Brasil. Tech. Rep. NILC-TR-13-02, Série de Relatórios do NILC, Brazil http://www.nilc.icmc.usp.br/nilc/download/NILC-TR-13-02
Martins DBJ, Avanço LV, Nunes MGV, Caseli HM (2013) In: Hardie A, Love R (eds) Corpus linguistics. Lancaster, pp 189–192
Nießen S, Och FJ, Leusch G, Ney H (2000) An evaluation tool for machine translation. Proceedings of the second international conference on language resources and evaluation (LREC). Athens, pp 39–45
O’Brien S (2002) Teaching post-editing. Sixth EAMT workshop “Teaching machine translation”. Manchester, England, pp 99–106
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51
Och FJ, Ney H (2004) The alignment template approach to statistical machine translation. Comput Linguist 30(4):417–449
Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: ACL-2002: 40th annual meeting of the association for computational linguistics, Philadelphia, pp 311–318
Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf B, Burges C, Smola A (eds) Advances in kernel methods—support vector learning. MIT Press, Cambridge
Popovic M (2011) Hjerson. Prague Bull Math Linguist 96:59–67
Popovic M, Burchardt A (2011) From human to automatic error classification for machine translation output. EAMT 2011, proceedings of the 15th conference of the European association for machine translation. Belgium, pp 265–272
Potet M, Esperança-Rodier E, Blanchon H, Besacier L (2011) Preliminary experiments on using users’ post-editions to enhance a SMT system. EAMT 2011, proceedings of the 15th conference of the European association for machine translation. Belgium, pp 161–168
Quinlan JR (1993) C4.5. Morgan Kaufmann Publishers, San Mateo
Roturier J (2009) Deploying novel MT technology to raise the bar for quality: a review of key advantages and challenges. MT summit XII: proceedings of the twelfth machine translation summit. Ottawa, pp 1–8
Sag IA, Baldwin T, Bond F, Copestake A, Flickinger D (2002) Multiword expressions: a pain in the neck for NLP. In: Proceedings of the third international conference on computational linguistics and intelligent text processing (CICLing-2002), Springer, London (lecture notes in computer science), vol 2276, pp 1–15
Shah K, Cohn T, Specia L (2013) An investigation on the effectiveness of features for translation quality estimation. Proceedings of the XIV machine translation summit. Nice, pp 167–174
Simard M, Goutte C, Isabelle P (2007) Statistical phrase-based post-editing. In: Human language Technologies 2007: the conference of the North American chapter of the association for computational linguistics, Proceedings of the main conference, Rochester, pp 508–515
Specia L (2011) Exploiting objective annotations for measuring translation post-editing effort. In: EAMT 2011, proceedings of the 15th conference of the European association for machine translation, Leuven, pp 73–80
Stymne S (2011) BLAST: a tool for error analysis of machine translation output. Proceedings of the ACL-HLT 2011 system demonstrations. Portland, pp 56–61
Tillmann C, Vogel S, Ney H, Zubiaga A, Sawaf H (1997) Accelerated DP based search for statistical translation. European conference on speech communication and technology. Rhodes, pp 2667–2670
Vieira TL, Caseli HM (2011) PorTAl: Recursos e Ferramentas de Tradução Automática para o Português do Brasil. In: Eighth Brazilian symposium in information and human language technology (STIL), Cuiabá, pp 179–183
Vilar D, Xu J, D’Haro LF, Ney H (2006) Error analysis of statistical machine translation output. LREC-2006: fifth international conference on language resources and evaluation. Proceedings, Genoa, pp 22–28
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Zeman D, Fishel M, Berka J, Bojar O (2011) Addicter: what is wrong with my translations? Prague Bull Math Linguist 96:79–88
Acknowledgments
This project was developed with the support of grants #2011/03799-4, #2010/07517-0 and #2013/11811-0 from the São Paulo Research Foundation (FAPESP). We also thank Maria das Graças Volpe Nunes and Lucas Vinicius Avanço for their help in the corpus annotation process. This work is also part of the CAMELEON (CAPES-COFECUB #707-11) and AIM-WEST (FAPESP #2013/50757-0) projects.
Cite this article
de Jesus Martins, D.B., de Medeiros Caseli, H. Automatic machine translation error identification. Machine Translation 29, 1–24 (2015). https://doi.org/10.1007/s10590-014-9163-y
DOI: https://doi.org/10.1007/s10590-014-9163-y