Hybrid Machine Translation Oriented to Cross-Language Information Retrieval: English-Spanish Error Analysis

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 930)

Abstract

This study analyses the automatic translation of questions (intended as query inputs to a Cross-Language Information Retrieval system) and creates a taxonomy of the translation errors present in hybrid machine translation (HMT) systems.

Translations produced by HMT systems were analysed. Based on this analysis, a taxonomy is proposed that classes errors as type 1, 2 or 3, weighted according to their level of importance. The results indicate that post-editing is an essential task in the automatic translation process.
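
As a rough illustration of how a weighted type 1, 2 or 3 taxonomy could be turned into a single score per translated question, consider the sketch below. It is not the authors' method: the abstract gives neither the weight values nor which type is the most severe, so the weights, the function name `weighted_error_score` and the sample annotations are all hypothetical.

```python
from collections import Counter

# Hypothetical weights: the abstract does not state the actual values,
# nor which error type counts as the most important.
ERROR_WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0}


def weighted_error_score(error_types, weights=ERROR_WEIGHTS):
    """Sum the weights of the errors annotated in one translated question."""
    counts = Counter(error_types)
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown error types: {sorted(unknown)}")
    return sum(weights[t] * n for t, n in counts.items())


# Made-up annotations for one translated question: two type 1 errors, one type 3.
annotated_errors = [1, 1, 3]
print(weighted_error_score(annotated_errors))  # 2 * 1.0 + 1 * 3.0 = 5.0
```

Under these assumed weights, a higher score would mark a translation that needs more post-editing before it can serve as a retrieval query.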

Keywords

  • Cross-language information retrieval
  • Hybrid machine translation systems
  • Translation errors
  • Post-editing

Notes

  1. Nowadays, Systran has already implemented Neural Machine Translation in its MT systems.

Corresponding author

Correspondence to Irene Rivera-Trigueros.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Gutiérrez-Artacho, J., Olvera-Lobo, M.D., Rivera-Trigueros, I. (2019). Hybrid Machine Translation Oriented to Cross-Language Information Retrieval: English-Spanish Error Analysis. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S. (eds) New Knowledge in Information Systems and Technologies. WorldCIST'19 2019. Advances in Intelligent Systems and Computing, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-16181-1_18
