
A Diagnostic Evaluation Approach for English to Hindi MT Using Linguistic Checkpoints and Error Rates

Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7817)

Abstract

This paper addresses the diagnostic evaluation of machine translation (MT) systems for Indian languages, English-to-Hindi MT in particular, by assessing the performance of MT systems on relevant linguistic phenomena (checkpoints). We use the diagnostic evaluation tool DELiC4MT to analyze the performance of MT systems on various PoS categories (e.g. nouns, verbs). The current tool supports only word-level checkpoints, which may be less informative for evaluating translation quality than phrase-level checkpoints or checkpoints that deal with named entities (NEs), inflections, word order, etc. We therefore propose phrase-level checkpoints and NEs as additional checkpoints for DELiC4MT. We further use Hjerson to evaluate checkpoints based on word order and inflections, which are particularly relevant when Hindi is the target language. The experiments conducted with Hjerson produce overall (document-level) error counts and error rates for five error classes (inflectional errors, reordering errors, missing words, extra words, and lexical errors), thereby covering evaluation based on word order and inflections. The effectiveness of both approaches was tested on five English-to-Hindi MT systems.
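
As an illustration of the five error classes mentioned above, the sketch below classifies hypothesis words against a reference using a simple alignment. This is a hypothetical, simplified reimplementation for illustration only: Hjerson itself derives the classes from WER and PER alignments over full and base word forms, and the `classify_errors` helper and its tuple format are assumptions, not the tool's actual interface.

```python
from difflib import SequenceMatcher

def classify_errors(reference, hypothesis):
    """Count the five error classes over one sentence pair.

    reference / hypothesis are lists of (surface_form, base_form) tuples;
    the base forms are what let us tell inflectional from lexical errors.
    """
    ref_surf = [s for s, _ in reference]
    hyp_surf = [s for s, _ in hypothesis]
    counts = {c: 0 for c in
              ("inflection", "reorder", "missing", "extra", "lexical")}
    sm = SequenceMatcher(a=ref_surf, b=hyp_surf, autojunk=False)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            continue
        remaining_ref = list(reference[i1:i2])
        for surf, base in hypothesis[j1:j2]:
            match = next((t for t in remaining_ref if t[1] == base), None)
            if match is not None and match[0] != surf:
                counts["inflection"] += 1   # right lemma, wrong word form
                remaining_ref.remove(match)
            elif surf in ref_surf:
                counts["reorder"] += 1      # word occurs elsewhere in the ref
            elif remaining_ref:
                counts["lexical"] += 1      # substitutes some reference word
                remaining_ref.pop(0)
            else:
                counts["extra"] += 1        # inserted, no ref counterpart
        for surf, _ in remaining_ref:
            if surf not in hyp_surf:
                counts["missing"] += 1      # dropped from the hypothesis
    return counts
```

Dividing each count by the reference length yields per-class error rates of the kind reported at document level in the paper.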




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Balyan, R., Naskar, S.K., Toral, A., Chatterjee, N. (2013). A Diagnostic Evaluation Approach for English to Hindi MT Using Linguistic Checkpoints and Error Rates. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2013. Lecture Notes in Computer Science, vol 7817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37256-8_24


  • DOI: https://doi.org/10.1007/978-3-642-37256-8_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37255-1

  • Online ISBN: 978-3-642-37256-8

  • eBook Packages: Computer Science (R0)
