Handbook of Natural Language Processing and Machine Translation

pp 745-843


Machine Translation Evaluation and Optimization

  • Bonnie Dorr, University of Maryland
  • Joseph Olive, Defense Advanced Research Projects Agency
  • John McCary, Defense Advanced Research Projects Agency
  • Caitlin Christianson, Defense Advanced Research Projects Agency


The evaluation of machine translation (MT) systems is a vital field of research, both for determining the effectiveness of existing MT systems and for optimizing their performance. This part describes the range of evaluation approaches used in the GALE community and introduces the evaluation protocols and methodologies used in the program. We discuss the development and use of automatic, human, task-based, and semi-automatic (human-in-the-loop) methods of evaluating machine translation, focusing on the human-mediated translation error rate (HTER), the evaluation standard used in GALE. We discuss the workflow associated with this measure, including post-editing, quality control, and scoring. We document the evaluation tasks, data, protocols, and results of recent GALE MT evaluations. In addition, we present a range of approaches for optimizing MT systems on the basis of different measures. We outline the requirements and specific problems of the different optimization approaches and describe how the characteristics of different MT metrics affect optimization. Finally, we describe recent and ongoing work on the development of fully automatic MT evaluation metrics that have the potential to substantially improve the effectiveness of evaluation and optimization of MT systems.