Validity of an Automatic Evaluation of Machine Translation Using a Word-Alignment-Based Classifier

  • Katsunori Kotani
  • Takehiko Yoshimi
  • Takeshi Kutsumi
  • Ichiko Sata
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5459)

Abstract

Because human evaluation of machine translation is thorough but expensive, automatic evaluation is often used in developing a machine translation system. From the viewpoint of evaluation cost, automatic methods fall into two types: those that compare system output against (multiple) reference translations, e.g., METEOR, and those that classify a machine translation as either machine-like or human-like on the basis of translation properties, i.e., classification-based methods. Previous studies showed that classification-based methods can evaluate translations properly. These studies constructed classifiers that learn linguistic properties of a translation, such as sentence length, syntactic complexity, and literalness of translation, and their classifiers achieved high classification accuracy. These studies, however, did not examine whether classification accuracy actually reflects translation quality. We therefore investigated whether classification accuracy depends on translation quality. The experimental results showed that our method correctly distinguishes degrees of translation quality.
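To make the classification-based approach concrete, below is a minimal sketch, not the authors' implementation, of reference-free MT evaluation as binary classification. It assumes scikit-learn's SVC for the SVM; the two features (sentence length and the fraction of alignable words) are illustrative stand-ins for the word-alignment-based properties the paper builds on, and the training sentences are invented.

    # Minimal sketch (not the authors' implementation) of classification-based
    # MT evaluation: a binary classifier labels a translation as machine-like (0)
    # or human-like (1) from simple sentence-level features.
    from sklearn.svm import SVC

    def extract_features(sentence, aligned_word_count):
        # Toy feature vector: sentence length and the fraction of words that an
        # external word aligner (e.g., GIZA++ [14]) could align to the source.
        tokens = sentence.split()
        length = len(tokens)
        alignment_ratio = aligned_word_count / length if length else 0.0
        return [length, alignment_ratio]

    # Invented training data: (translation, number of aligned words, label).
    train = [
        ("the system translate sentence word by word", 4, 0),      # machine-like
        ("the system translates each sentence literally", 6, 1),   # human-like
        ("evaluation of quality is perform by the human", 5, 0),
        ("translation quality is usually assessed by humans", 7, 1),
    ]

    X = [extract_features(s, a) for s, a, _ in train]
    y = [label for _, _, label in train]

    clf = SVC(kernel="linear")  # an SVM classifier, cf. Vapnik [13]
    clf.fit(X, y)

    # Score a new translation without a reference: the predicted class (or the
    # distance from the separating hyperplane) serves as a quality indicator.
    test = "the method distinguish machine translation from human one"
    print(clf.predict([extract_features(test, 6)]))

In this framework, a translation that the classifier cannot tell apart from human output counts as higher quality; the validity question the paper addresses is whether such classification accuracy in fact varies with graded differences in translation quality.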

Keywords

Machine translation evaluation · Translation property · Classification


References

  1. Corston-Oliver, S., Gamon, M., Brockett, C.: A Machine Learning Approach to the Automatic Evaluation of Machine Translation. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, pp. 148–155 (2001)
  2. Kulesza, A., Shieber, S.M.: A Learning Approach to Improving Sentence-level MT Evaluation. In: Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation, Baltimore, Maryland, pp. 75–84 (2004)
  3. Gamon, M., Aue, A., Smets, M.: Sentence-level MT Evaluation without Reference Translations: Beyond Language Modeling. In: Proceedings of the 10th European Association for Machine Translation Conference, Budapest, Hungary, pp. 103–111 (2005)
  4. Kotani, K., Yoshimi, T., Kutsumi, T., Sata, I., Isahara, H.: A Classification Approach to Automatic Evaluation of Machine Translation Based on Word Alignment. Language Forum 34, 153–168 (2008)
  5. Papineni, K.A., Roukos, S., Ward, T., Zhu, W.-J.: Bleu: A Method for Automatic Evaluation of Machine Translation. Technical Report RC22176 (W0109–022), IBM Research Division, Thomas J. Watson Research Center (2001)
  6. Doddington, G.: Automatic Evaluation of Machine Translation Quality Using N-gram Co-occurrence Statistics. In: Proceedings of the 2nd Human Language Technology Conference, San Diego, California, pp. 128–132 (2002)
  7. Banerjee, S., Lavie, A.: METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, pp. 65–72 (2005)
  8. Quirk, C.B.: Training a Sentence-level Machine Translation Confidence Measure. In: Proceedings of the 4th International Conference on Language Resources and Evaluation, Lisbon, Portugal, pp. 825–828 (2004)
  9. Albrecht, J.S., Hwa, R.: A Re-examination of Machine Learning Approaches for Sentence-level MT Evaluation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp. 880–887 (2007)
  10. Paul, M., Finch, A., Sumita, E.: Reducing Human Assessment of Machine Translation Quality to Binary Classifiers. In: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde, Sweden, pp. 154–162 (2007)
  11. Quinlan, J.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1992)
  12. Whitelock, P., Poznanski, V.: The SLE Example-Based Translation System. In: Proceedings of the International Workshop on Spoken Language Translation, Kyoto, Japan, pp. 111–115 (2006)
  13. Vapnik, V.: Statistical Learning Theory. Wiley Interscience, New York (1998)
  14. Och, F.J., Ney, H.: A Systematic Comparison of Various Statistical Alignment Models. Computational Linguistics 29(1), 19–51 (2003)
  15. Utiyama, M., Isahara, H.: Reliable Measures for Aligning Japanese-English News Articles and Sentences. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan, pp. 72–79 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Katsunori Kotani (1)
  • Takehiko Yoshimi (2)
  • Takeshi Kutsumi (3)
  • Ichiko Sata (3)

  1. Kansai Gaidai University, Osaka, Japan
  2. Ryukoku University, Shiga, Japan
  3. Sharp Corporation, Nara, Japan
