Hybrid Machine Translation Oriented to Cross-Language Information Retrieval: English-Spanish Error Analysis

  • Juncal Gutiérrez-Artacho
  • María-Dolores Olvera-Lobo
  • Irene Rivera-Trigueros
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 930)


This study analyses the automatic translation of questions (intended as query inputs to a cross-language information retrieval system) and creates a taxonomy of the translation errors present in hybrid machine translation (HMT) systems.

Translations produced by HMT systems were analysed. From this analysis, a taxonomy is proposed that classifies errors as type 1, 2 or 3, weighted according to their level of importance. The results indicate that post-editing is an essential task in the automatic translation process.


Cross-language information retrieval; Hybrid machine translation systems; Translation errors; Post-editing



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Translation and Interpreting, Faculty of Translation and Interpreting, University of Granada, Granada, Spain
  2. Department of Information and Communication, Colegio Máximo de Cartuja, University of Granada, Granada, Spain
  3. CSIC, Unidad Asociada Grupo SCImago, Madrid, Spain
