eBISS 2016: Business Intelligence pp 38-58

Computational Approaches to Translation Studies

Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 280)

Abstract

Translated texts, in any language, have unique characteristics that set them apart from texts originally written in the same language. Translation studies is a research field that focuses on investigating these characteristics. Until recently, research in computational linguistics, and specifically in machine translation, has been entirely divorced from translation studies. The main goal of this tutorial is to introduce some of the findings of translation studies to researchers interested mainly in machine translation, and to demonstrate that awareness of these findings can result in better, more accurate machine translation systems. (This chapter synthesizes material that has been previously published by the author and colleagues, in particular in Volansky et al. (2015); Rabinovich and Wintner (2015); Lembersky et al. (2011, 2012a, b, 2013); and Twitto et al. (2015).)

Notes

Acknowledgements

I am grateful to Noam Ordan for his immense help with the research reported here. Thanks are due to all my other collaborators on these works, including Gennadi Lembersky, Vered Volansky, Udi Avner, Naama Twitto and Ella Rabinovich. Special thanks are due to Agata Savary, not least for her continuous encouragement. I am grateful to the three anonymous reviewers whose constructive comments greatly improved the quality of the presentation. This research was supported by a grant from the Israeli Ministry of Science and Technology.

References

  1. Al-Shabab, O.S.: Interpretation and the Language of Translation: Creativity and Conventions in Translation. Janus, Edinburgh (1996)
  2. Arthur, D., Vassilvitskii, S.: K-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Philadelphia, PA, USA. Society for Industrial and Applied Mathematics, pp. 1027–1035 (2007). http://dl.acm.org/citation.cfm?id=1283383.1283494. ISBN 978-0-898716-24-5
  3. Avner, E.A., Ordan, N., Wintner, S.: Identifying translationese at the word and sub-word level. Digit. Scholarsh. Humanit. 31(1), 30–54 (2016). http://dx.doi.org/10.1093/llc/fqu047
  4. Baker, M.: Corpus linguistics and translation studies: implications and applications. In: Baker, M., Francis, G., Tognini-Bonelli, E. (eds.) Text and Technology: In Honour of John Sinclair, pp. 233–252. John Benjamins, Amsterdam (1993)
  5. Baroni, M., Bernardini, S.: A new approach to the study of translationese: machine-learning the difference between original and translated text. Lit. Linguist. Comput. 21(3), 259–274 (2006). http://llc.oxfordjournals.org/cgi/content/short/21/3/259?rss=1
  6. Ben-Ari, N.: The ambivalent case of repetitions in literary translation. Avoiding repetitions: a “universal” of translation? Meta 43(1), 68–78 (1998)
  7. Blum-Kulka, S.: Shifts of cohesion and coherence in translation. In: House, J., Blum-Kulka, S. (eds.) Interlingual and Intercultural Communication Discourse and Cognition in Translation and Second Language Acquisition Studies, vol. 35, pp. 17–35. Gunter Narr Verlag, Tübingen (1986)
  8. Blum-Kulka, S., Levenston, E.A.: Universals of lexical simplification. Lang. Learn. 28(2), 399–416 (1978)
  9. Blum-Kulka, S., Levenston, E.A.: Universals of lexical simplification. In: Faerch, C., Kasper, G. (eds.) Strategies in Interlanguage Communication, pp. 119–139. Longman, Harlow (1983)
  10. Brants, T., Xu, P.: Distributed language models. In: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts, Boulder, Colorado, May 2009, pp. 3–4. Association for Computational Linguistics (2009). http://www.aclweb.org/anthology/N/N09/N09-4002
  11. Brown, P.F., Cocke, J., Della Pietra, S.A., Della Pietra, V.J., Jelinek, F., Lafferty, J.D., Mercer, R.L., Roossin, P.S.: A statistical approach to machine translation. Comput. Linguist. 16(2), 79–85 (1990). ISSN 0891-2017
  12. Brown, P.F., Della Pietra, V.J., Della Pietra, S.A., Mercer, R.L.: The mathematics of statistical machine translation: parameter estimation. Comput. Linguist. 19(2), 263–311 (1993). ISSN 0891-2017
  13. Church, K.W., Hanks, P.: Word association norms, mutual information, and lexicography. Comput. Linguist. 16(1), 22–29 (1990). ISSN 0891-2017
  14. Frawley, W.: Prolegomenon to a theory of translation. In: Frawley, W. (ed.) Translation. Literary, Linguistic and Philosophical Perspectives, pp. 159–175. University of Delaware Press, Newark (1984)
  15. Gellerstam, M.: Translationese in Swedish novels translated from English. In: Wollin, L., Lindquist, H. (eds.) Translation Studies in Scandinavia, pp. 88–95. CWK Gleerup, Lund (1986)
  16. Gries, S.T., Wulff, S.: Regression analysis in translation studies. In: Oakes, M.P., Ji, M. (eds.) Quantitative Methods in Corpus-Based Translation Studies. Studies in Corpus Linguistics, vol. 51, pp. 35–52. John Benjamins, Philadelphia (2012)
  17. Grieve, J.: Quantitative authorship attribution: an evaluation of techniques. Lit. Linguist. Comput. 22(3), 251–270 (2007)
  18. Halverson, S.: The cognitive basis of translation universals. Target 15(2), 197–241 (2003)
  19. Ilisei, I., Inkpen, D.: Translationese traits in Romanian newspapers: a machine learning approach. Int. J. Comput. Linguist. Appl. 2(1–2) (2011)
  20. Ilisei, I., Inkpen, D., Corpas Pastor, G., Mitkov, R.: Identification of translationese: a machine learning approach. In: Gelbukh, A. (ed.) CICLing 2010. LNCS, vol. 6008, pp. 503–511. Springer, Heidelberg (2010). doi: 10.1007/978-3-642-12116-6_43. http://dx.doi.org/10.1007/978-3-642-12116-6. ISBN 978-3-642-12115-9
  21. Kenny, D.: Lexis and Creativity in Translation: A Corpus-Based Study. St. Jerome, Northampton (2001). ISBN 9781900650397
  22. Koehn, P.: Europarl: a parallel corpus for statistical machine translation. In: Proceedings of the Tenth Machine Translation Summit, pp. 79–86. AAMT (2005). http://mt-archive.info/MTS-2005-Koehn.pdf
  23. Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Herbst, E.: Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, Prague, Czech Republic, pp. 177–180. Association for Computational Linguistics, June 2007. http://www.aclweb.org/anthology/P07-2045
  24. Koppel, M., Ordan, N.: Translationese and its dialects. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, Oregon, USA, pp. 1318–1326. Association for Computational Linguistics, June 2011. http://www.aclweb.org/anthology/P11-1132
  25. Kurokawa, D., Goutte, C., Isabelle, P.: Automatic detection of translated text and its impact on machine translation. In: Proceedings of MT-Summit XII, pp. 81–88 (2009)
  26. Laviosa, S.: Core patterns of lexical use in a comparable corpus of English narrative prose. Meta 43(4), 557–570 (1998)
  27. Laviosa, S.: Corpus-Based Translation Studies: Theory, Findings, Applications. Approaches to Translation Studies. Rodopi, Amsterdam (2002). ISBN 9789042014879
  28. Lembersky, G., Ordan, N., Wintner, S.: Language models for machine translation: original vs. translated texts. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK, pp. 363–374. Association for Computational Linguistics, July 2011. http://www.aclweb.org/anthology/D11-1034
  29. Lembersky, G., Ordan, N., Wintner, S.: Adapting translation models to translationese improves SMT. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France, pp. 255–265. Association for Computational Linguistics, April 2012a. http://www.aclweb.org/anthology/E12-1026
  30. Lembersky, G., Ordan, N., Wintner, S.: Language models for machine translation: original vs. translated texts. Comput. Linguist. 38(4), 799–825 (2012b). http://dx.doi.org/10.1162/COLI_a_00111
  31. Lembersky, G., Ordan, N., Wintner, S.: Improving statistical machine translation by adapting translation models to translationese. Comput. Linguist. 39(4), 999–1023 (2013). http://dx.doi.org/10.1162/COLI_a_00159
  32. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982). doi: 10.1109/TIT.1982.1056489. ISSN 0018-9448
  33. Munday, J.: A computer-assisted approach to the analysis of translation shifts. Meta 43(4), 542–556 (1998)
  34. Olohan, M.: How frequent are the contractions? A study of contracted forms in the translational English corpus. Target 15(1), 59–89 (2003)
  35. Øverås, L.: In search of the third code: an investigation of norms in literary translation. Meta 43(4), 571–588 (1998)
  36. Papineni, K., Roukos, S., Ward, T., Zhu, W.-J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL 2002, Morristown, NJ, USA, pp. 311–318. Association for Computational Linguistics (2002). http://dx.doi.org/10.3115/1073083.1073135
  37. Pearson, K.: On lines and planes of closest fit to systems of points in space. Philos. Mag. 2(6), 559–572 (1901)
  38. Popescu, M.: Studying translationese at the character level. In: Angelova, G., Bontcheva, K., Mitkov, R., Nicolov, N. (eds.) Proceedings of RANLP 2011, pp. 634–639 (2011)
  39. Rabinovich, E., Wintner, S.: Unsupervised identification of translationese. Trans. Assoc. Comput. Linguist. 3, 419–432 (2015). https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/618. ISSN 2307-387X
  40. Rabinovich, E., Nisioi, S., Ordan, N., Wintner, S.: On the similarities between native, non-native and translated texts. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, pp. 1870–1881, August 2016. http://aclweb.org/anthology/P/P16/P16-1176.pdf
  41. Shlesinger, M.: Simultaneous interpretation as a factor in effecting shifts in the position of texts on the oral-literate continuum. Master’s thesis, Tel Aviv University, Faculty of the Humanities, Department of Poetics and Comparative Literature (1989)
  42. Teich, E.: Cross-Linguistic Variation in System and Text: A Methodology for the Investigation of Translations and Comparable Texts. Mouton de Gruyter, Berlin (2003)
  43. Tetreault, J., Blanchard, D., Cahill, A.: A report on the first native language identification shared task. In: Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, June 2013
  44. Toury, G.: In Search of a Theory of Translation. The Porter Institute for Poetics and Semiotics, Tel Aviv University, Tel Aviv (1980)
  45. Toury, G.: Descriptive Translation Studies and Beyond. John Benjamins, Amsterdam/Philadelphia (1995)
  46. Tsvetkov, Y., Twitto, N., Schneider, N., Ordan, N., Faruqui, M., Chahuneau, V., Wintner, S., Dyer, C.: Identifying the L1 of non-native writers: the CMU-Haifa system. In: Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 279–287. Association for Computational Linguistics, June 2013. http://www.aclweb.org/anthology/W13-1736
  47. Twitto, N., Ordan, N., Wintner, S.: Statistical machine translation with automatic identification of translationese. In: Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal, pp. 47–57. Association for Computational Linguistics, September 2015. http://aclweb.org/anthology/W15-3002
  48. van Halteren, H.: Source language markers in EUROPARL translations. In: Scott, D., Uszkoreit, H. (eds.) Proceedings of the 22nd International Conference on Computational Linguistics, COLING 2008, Morristown, NJ, USA, pp. 937–944. Association for Computational Linguistics (2008). ISBN 978-1-905593-44-6
  49. Vanderauwera, R.: Dutch Novels Translated into English: The Transformation of a ‘Minority’ Literature. Rodopi, Amsterdam (1985)
  50. Volansky, V., Ordan, N., Wintner, S.: On the features of translationese. Digit. Scholarsh. Humanit. 30(1), 98–118 (2015)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, University of Haifa, Haifa, Israel
