The CRITT Translation Process Research Database

  • Michael Carl
  • Moritz Schaeffer
  • Srinivas Bangalore
Part of the New Frontiers in Translation Studies book series (NFTS)


Since its founding 10 years ago, the Center for Research and Innovation in Translation and Translation Technology (CRITT) at the Copenhagen Business School has been involved in Translation Process Research (TPR). TPR data was initially collected with the Translog tool and released in 2012 as a Translation Process Research Database (TPR-DB). Since 2012, many more experiments have been conducted and more data has been added to the TPR-DB. In particular, a large amount of TPR data for post-editing machine translation was recorded within the CASMACAT project (Sanchis-Trilles et al. 2014), and the TPR-DB has been made publicly available under a Creative Commons license. At the time of this writing, the TPR-DB contains almost 30 studies of translation, post-editing, revision, authoring and copying tasks, recorded with Translog and with the CASMACAT workbench. Each study comprises between 8 and more than 100 recording sessions, and the studies together involve more than 300 translators. Currently, the data amounts to more than 500 hours of text production time gathered in more than 1400 sessions, with more than 600,000 translated words in more than 10 different target languages.

This chapter describes the features and visualization options of the TPR-DB. The database contains recorded logging data as well as derived and annotated information, assembled in seven kinds of simple and compound process and product units that are suited to investigating human and computer-assisted translation processes and advanced user modelling.


Empirical translation process research; Translation process research database



This work was supported by the CASMACAT project funded by the European Commission (7th Framework Programme). We are grateful to all contributors to the database for allowing us to use their data.


  1. Alves, F., & Vale, D. C. (2011). On drafting and revision in translation: A corpus linguistics oriented analysis of translation process data. Translation: Corpora, Computation, Cognition. Special Issue on Parallel Corpora: Annotation, Exploitation, Evaluation, 1(1), 105–122.
  2. Carl, M. (2012a). Translog-II: A program for recording user activity data for empirical reading and writing research. In The eighth international conference on language resources and evaluation (pp. 2–6). May 21–27, 2012, Istanbul, Turkey. Department of International Language Studies and Computational Linguistics.
  3. Carl, M. (2012b). The CRITT TPR-DB 1.0: A database for empirical human translation process research. In S. O’Brien, M. Simard, & L. Specia (Eds.), Proceedings of the AMTA 2012 workshop on post-editing technology and practice (WPTP 2012) (pp. 9–18). Stroudsburg, PA: Association for Machine Translation in the Americas (AMTA).
  4. Carl, M., & Kay, M. (2011). Gazing and typing activities during translation: A comparative study of translation units of professional and student translators. Meta, 56(4), 952–975.
  5. Jakobsen, A. L. (2002). Translation drafting by professional translators and by translation students. In G. Hansen (Ed.), Empirical translation studies: Process and product (pp. 191–204). Copenhagen: Samfundslitteratur.
  6. Jakobsen, A. L. (2011). Tracking translators’ keystrokes and eye movements with Translog. In C. Alvstad, A. Hild, & E. Tiselius (Eds.), Methods and strategies of process research: Integrative approaches in translation studies (Benjamins translation library, Vol. 94, pp. 37–55). Amsterdam: John Benjamins.
  7. Jakobsen, A. L., & Schou, L. (1999). Translog documentation. In G. Hansen (Ed.), Probing the process in translation methods and results (pp. 1–36). Copenhagen: Samfundslitteratur.
  8. Jakobsen, A. L. (2005). Instances of peak performance in translation. Lebende Sprachen, 50(3), 111–116.
  9. Germann, U. (2008). Yawat: Yet another word alignment tool. In Proceedings of the ACL-08: HLT demo session (Companion Volume) (pp. 20–23). Columbus, OH: Association for Computational Linguistics.
  10. Lacruz, I., & Shreve, S. (2014). Pauses and cognitive effort in post-editing. In post-editing of machine translation: Processes and applications. In S. O’Brien, M. Simard, L. Specia, M. Carl, & L. W. Balling (Eds.), Expertise in post-editing: Processes, technology and applications (pp. 246–274). Cambridge: Scholars Publishing.
  11. Leijten, M., & Van Waes, L. (2013). Keystroke logging in writing research: Using Inputlog to analyze and visualize writing processes. Written Communication, 30(3), 358–392.
  12. Sanchis-Trilles, G., Alabau, V., Buck, C., Carl, M., Casacuberta, F., Martinez, M. G., et al. (2014). Interactive translation prediction versus conventional post-editing in practice: A study with the CasMaCat workbench. Machine Translation, 28(3–4), 217–235.
  13. Vandepitte, S., Hartsuiker, R. J., & Van Assche, E. (2015). Process and text studies of a translation problem. In A. Ferreira & J. W. Schwieter (Eds.), Psycholinguistic and Cognitive Inquiries into Translation and Interpreting (pp. 127–143).

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Michael Carl (1)
  • Moritz Schaeffer (1, 2)
  • Srinivas Bangalore (3)
  1. Center for Research and Innovation in Translation and Translation Technology, Department of International Business Communication, Copenhagen Business School, Frederiksberg, Denmark
  2. Institute for Language, Cognition and Computation, University of Edinburgh, Edinburgh, UK
  3. Interactions Corporation, New Providence, USA
