Human Evaluation of Online Machine Translation Services for English/Russian-Croatian

  • Sanja Seljan
  • Marko Tucaković
  • Ivan Dunđer
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 353)


This paper presents the results of a human evaluation of machine-translated texts for one language pair that is not closely related, English-Croatian, and for one closely related language pair, Russian-Croatian. In total, 400 sentences from the domain of tourist guides are analysed, i.e. 100 sentences for each language pair and for each of two online machine translation services, Google Translate and Yandex.Translate. Human evaluation is carried out with regard to the criteria of fluency and adequacy. To measure internal consistency, Cronbach’s alpha is calculated. Error analysis covers several categories: untranslated/omitted words, surplus words, morphological errors/wrong word endings, lexical errors/wrong translations, syntactic errors/wrong word order and punctuation errors. The paper closes with conclusions and suggestions for further research.
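As an illustration of the internal-consistency measure named in the abstract, the following is a minimal sketch of how Cronbach's alpha could be computed from a table of human ratings. The abstract does not describe the rating scale or how the scores were arranged, so several things here are assumptions: each evaluator is treated as one "item", each row is one translated sentence, and the 1-5 scores in `ratings` are invented purely for demonstration. The function name `cronbach_alpha` is also hypothetical and is not taken from the paper.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a table of ratings.

    scores: list of rows, one row per rated sentence; each row holds one
    score per evaluator (assumption: evaluators act as the "items").

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(row totals))
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two evaluators and two sentences")
    if any(len(row) != k for row in scores):
        raise ValueError("every row must have one score per evaluator")

    # Sample variance of each evaluator's scores across all sentences.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of the per-sentence score totals.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("row totals have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical fluency scores (1-5) from three evaluators for five sentences.
# These numbers are invented for illustration and are not data from the paper.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values close to 1 indicate that the evaluators' scores vary together across sentences. The same function could be applied separately to fluency and adequacy scores, and separately per service and language pair, if the ratings are stored that way.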


Keywords: machine translation; human evaluation; Google Translate; Yandex.Translate; English; Russian; Croatian; adequacy; fluency; error analysis; inter-evaluator agreement





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Faculty of Humanities and Social Sciences, Department of Information and Communication Sciences, University of Zagreb, Zagreb, Croatia
