Quality evaluation in community post-editing

Published in: Machine Translation 28, 237–262 (2014)

Abstract

Machine translation is increasingly being deployed to translate user-generated content (UGC). In many situations, post-editing is required to ensure that the translations are correct and comprehensible for the users. Post-editing by professional translators is not always feasible in the context of UGC within online communities, and so members of such communities are sometimes asked to translate or post-edit content on behalf of the community. How should we measure the quality of UGC that has been post-edited by community members? Is quality evaluation by community members a feasible alternative to professional evaluation techniques? This paper describes the outcomes of three quality evaluation methods for community post-edited content: (1) an error annotation performed by a trained linguist; (2) evaluation of fluency and fidelity by domain specialists; (3) evaluation of fluency by community members. The study finds correlations between the results of the domain specialist evaluation and the community evaluation for content machine-translated from English into German in an online technical support community. Interestingly, the community evaluators were more critical in their fluency ratings than the domain experts. Although the results of the error annotation seem to contradict those of the domain specialist evaluation, a higher number of errors in the error annotation appears to result in lower scores in the domain specialist evaluation. We conclude that, within the context of this evaluation, post-editing by community members is feasible, though with considerable variation across individuals, and that evaluation by the community is also a feasible proposition.
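
The correlation claim above can be made concrete with a small computation. A minimal sketch, assuming per-sentence mean fluency ratings from each evaluator group and using Spearman's rank correlation (the ratings below are invented for illustration; they are not the paper's data, and the paper's own correlation method is described in the full text):

    # Rank correlation between domain-specialist and community fluency
    # ratings for the same post-edited sentences. All numbers are
    # invented for illustration.
    from scipy.stats import spearmanr

    # Mean fluency rating per sentence on the 5-point scale, one list per group.
    expert_fluency = [4.2, 3.8, 4.6, 2.9, 3.5, 4.0]
    community_fluency = [3.9, 3.4, 4.3, 2.5, 3.1, 3.8]  # community rates more critically

    rho, p_value = spearmanr(expert_fluency, community_fluency)
    print(f"Spearman's rho = {rho:.2f} (p = {p_value:.3f})")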


Notes

  1. It is important to distinguish ‘community translation’ in the sense of translation in an online community from the concept of community translation also known as ‘public service translation’. In this paper, the term community translation refers to online community translation.

  2. https://www.evaluation.taus.net/about.

  3. http://www.qt21.eu/launchpad/.

  4. All evaluators in this study were domain specialists. Four of the seven evaluators (57%) were also trained linguists. See Sect. 2.2.2 for more detail.

  5. ST: source text.

  6. QC: quality control.

  7. http://www.de.community.norton.com/.

  8. http://www.accept-portal.eu.

  9. A “task” in this context denotes the combination of a subject line, the initial question posted in a forum thread and the post in that thread that was marked as the solution to the problem, as presented in the left panel of Fig. 1.

  10. http://www.community.norton.com/.

  11. Note that the guidelines were provided in German, the working language of the community. The guidelines are reproduced, in English translation, in the Appendix.

  12. http://www.accept-portal.eu.

  13. Available for download at http://www.ida.liu.se/~sarst/blast/.

  14. https://www.github.com/cfedermann/Appraise.

  15. In de Almeida’s categorisation, the latter is an extra category. Here it is integrated into accuracy errors, as mistranslations affect accuracy and thus the fidelity of the translation.

  16. There were two groups of post-editors editing comparable content (cf. Sect. 2.1).

  17. This includes errors present in the raw MT output that remained uncorrected and errors introduced by the post-editors.

  18. This was done to increase exposure of these sentences: the display order was rotated so that sentence 21 was the first sentence to be displayed, sentence 22 the second, and so on. This was deemed appropriate, as the average number of ratings in one sitting was eight. A minimal sketch of this rotation is given below.
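
A minimal sketch of the rotation described in note 18, assuming the sentences are held in a simple numbered list (the pool of 40 sentences is invented for illustration; the starting point of 21 follows the note):

    # Rotate the display order so that sentence 21 is shown first,
    # giving later sentences the same exposure as earlier ones.
    sentences = list(range(1, 41))         # hypothetical pool of 40 sentences
    start = sentences.index(21)            # rotation point from the note
    display_order = sentences[start:] + sentences[:start]
    print(display_order[:8])               # an average sitting of 8 ratings sees 21-28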

References

  • Baer N (2010) Trends in crowdsourcing: case studies from not-for-profit and for-profit organisations. ATA 2010, Oct 27–30, 2010, Denver, Colorado, USA. http://www.wiki-translation.com/tiki-download_wiki_attachment.php?attId=62. Accessed 6 Jan 2014

  • Banerjee P (2013) Domain adaptation for statistical machine translation of corporate and user-generated content. Dissertation, Dublin City University

  • Blanchon H, Boitet C, Huynh C (2009) A web service enabling gradable post-edition of pre-translations produced by existing translation tools: practical use to provide high-quality translation of an online encyclopedia. MT Summit XII, Beyond translation memories: new tools for translators workshop. Ottawa, Canada, pp 20–27

  • de Almeida G (2013) Translating the post-editor: an investigation of post-editing changes and correlations with professional experience across two Romance languages. Dissertation, Dublin City University

  • Désilets A, van der Meer J (2011) Co-creating a repository of best-practices for collaborative translators. In: O’Hagan M (ed) Linguistica Antverpiensia New Series: Themes in Translation Studies. Translation as a social activity, 10/2011, pp 27–45

  • Drugan J (2013) Quality in professional translation: assessment and improvement. Bloomsbury, London

  • Dugast L, Senellart J, Koehn P (2007) Statistical post-editing on SYSTRAN’s rule-based translation system. In: Proceedings of the second workshop on statistical machine translation. StatMT ’07. Association for Computational Linguistics. Prague, Czech Republic, pp 220–223

  • Flournoy RS, Callison-Burch C (2000) Reconciling user expectations and translation technology to create a useful real-world application. In: Proceedings of the Twenty-second International Conference on Translating and the Computer, 16–17 November 2000, London, United Kingdom

  • Federmann C (2010) Appraise: an open-source toolkit for manual phrase-based evaluation of translations. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC ’10), Valletta, Malta, May 2010

  • Guerberof A (2009) Productivity and quality in the post-editing of outputs from translation memories and machine translation. Localisation Focus 7(1):11–21

  • Hu C, Bederson BB, Resnik P (2010) Translation by iterative collaboration between monolingual users. In: Proceedings of Graphics Interface. GI ’10. Ottawa, Ontario, Canada, pp 39–46

  • Hu C, Bederson BB, Resnik P, Kronrod Y (2011) MonoTrans2: a new human computation system to support monolingual translation. In: Proceedings of the SIG-CHI Conference on Human Factors in Computing Systems. CHI ’11. Vancouver, BC, Canada, pp 1133–1136

  • Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: ACL 2007 Proceedings of demo and poster sessions. Prague, Czech Republic, pp 177–180

  • Koehn P (2010) Enabling monolingual translators: post-editing vs. options. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. HLT ’10. Los Angeles, California: Association for Computational Linguistics, pp 537–545

  • Koskinen K, Suojanen T, Tuominen T (forthcoming) User-centered translation. Translation Practices Explained series. Routledge, London

  • Kumaran A, Saravanan K, Maurice S (2008) wikiBABEL: community creation of multilingual data. In: Proceedings of the 4th International Symposium on Wikis. ACM, New York, NY, USA, pp 14:1–14:11

  • Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174

  • LDC (2005) Linguistic data annotation specification: assessment of fluency and adequacy in translations. Revision 1.5

  • MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, pp 281–297

  • McDonough Dolmaya J (2012) Analyzing the crowdsourcing model and its impact on public perceptions of translation. Translator 18(2):167–191

  • Mesipuu M (2012) Translation crowdsourcing and user-translator motivation at Facebook and Skype. Translation spaces, vol 1. John Benjamins, Amsterdam, pp 33–53

  • Mitchell L, Roturier J (2012) Evaluation of machine-translated user-generated content: a pilot study based on user ratings. In: EAMT 2012: Proceedings of the 16th Annual Meeting of the European Association for Machine Translation. Trento, Italy, pp 61–64

  • Mitchell L, Roturier J, O’Brien S (2013) Community-based post-editing of machine-translated content: monolingual vs. bilingual. In: Machine Translation Summit XIV, Workshop on Post-editing Technology and Practice. Nice, France, pp 35–43

  • O’Brien S (2011) Towards predicting post-editing productivity. Mach Transl 25(3):197–215

  • O’Brien S (2012) Towards a dynamic quality evaluation model for translation. J Specialised Transl 17:55–77

  • O’Hagan M (2011) Community translation: translation as a social activity and its possible consequences in the advent of Web 2.0 and beyond. In: O’Hagan M (ed) Linguistica Antverpiensia New Series: Themes in Translation Studies. Translation as a social activity, 10/2011, pp 111–128

  • Perrino S (2009) User-generated translation: the future of translation in a Web 2.0 environment. J Specialised Transl 12:55–78

  • Pielmeier H, Kelly N (2012) Translation production models. Common Sense Advisory report. http://www.commonsenseadvisory.com/Portals/_default/Knowledgebase/ArticleImages/121130_R_Translation_Production_Models_Preview.pdf. Accessed 16 Jan 2014

  • Pym A (2011) Translation research terms: a tentative glossary for moments of perplexity and dispute. In: Pym A (ed) Translation research projects 3 (Online). Intercultural Studies Group, Tarragona, pp 75–110. http://www.isg.urv.es/publicity/isg/publications/trp_3_2011/index.htm. Accessed 25 July 2014

  • Risku H, Windhager F, Apfelthaler M (2013) A dynamic network model of translatorial cognition and action. Transl Spaces 2:151–182

  • Roturier J, Bensadoun A (2011) Evaluation of MT systems to translate user generated content. In: Proceedings of the Thirteenth Machine Translation Summit. Xiamen, China, pp 244–251

  • Roturier J, Mitchell L, Grabowski R, Siegel M (2012) Using automatic machine translation metrics to analyze the impact of source reformulations. AMTA-2012: the Tenth Biennial Conference of the Association for Machine Translation in the Americas. San Diego, CA, pp 138–144

  • Roturier J, Mitchell L, Silva D (2013) The ACCEPT Post-Editing environment: a flexible and customizable online tool to perform and analyse machine translation post-editing. Machine Translation Summit XIV. Workshop on Post-editing Technology and Practice. Nice, France, pp 119–128

  • StataCorp LP (2014) kappa: interrater agreement. http://www.stata.com/manuals13/rkappa.pdf. Accessed 10 Jan 2014

  • Stymne S (2011) BLAST: a tool for error analysis of machine translated output. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Student Session. Portland, Oregon, pp 56–61

  • Symantec (2013) Browser-based client demonstrator and adapted post-editing environment and evaluation portal prototype. Deliverable 5.6, The ACCEPT Project (FP7/2007-2013 grant agreement no. 288769). http://www.accept.unige.ch/Products/D_5_6_Browser-based_client_demonstrator_and_adapted_post-editing_environment_and_evaluation_portal_prototypes.pdf. Accessed 15 Jan 2014

  • Tatsumi M, Aikawa T, Yamamoto K, Isahara H (2012) How good is crowd post-editing? Its potential and limitations. In: AMTA 2012 Workshop on Post-Editing Technology and Practice (WPTP 2012). Association for Machine Translation in the Americas (AMTA), San Diego, California, USA, pp 69–77

  • TAUS (2011) MT post-editing guidelines. https://www.taus.net/post-editing/machine-translation-post-editing-guidelines. Accessed 20 Apr 2013

  • Zaidan OF, Callison-Burch C (2011) Crowdsourcing translation: professional quality from non-professionals. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. Portland, Oregon, pp 1220–1229


Acknowledgments

This research was funded by the European Union’s 7th Framework Programme via the ACCEPT Project (Grant agreement: 288769).

Corresponding author

Correspondence to Linda Mitchell.

Appendices

Appendix 1: post-editing guidelines (translated from German)

Tips for post-editing:

  • Edit the text according to your interpretation so that it becomes more fluent and clearer.

  • For example, try to correct word order and spelling where they make the text difficult or impossible to understand.

  • Leave words, parts of sentences or punctuation unedited if they are acceptable.

  • If you are working with reference to the original text, make sure that no information has been added or deleted.

Appendix 2: fluency and fidelity scales for evaluators (translated from German)

Fluency (Sprachfluss)

How would you rate the linguistic quality of the translation? It is:

5 perfect
4 good
3 non-native
2 disfluent
1 incomprehensible

Fidelity (Vollständigkeit)

How much of the meaning contained in the source text is also expressed in the target translation?

5 all of it
4 most of it
3 much of it
2 little of it
1 none of it
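
Agreement between evaluators on ordinal scales like these is commonly summarised with a weighted kappa and interpreted against the Landis and Koch (1977) benchmarks cited in the references. A minimal sketch with invented ratings (not the paper's data), using scikit-learn's implementation:

    # Weighted kappa between two evaluators' 5-point fluency ratings.
    # The ratings are invented for illustration.
    from sklearn.metrics import cohen_kappa_score

    rater_a = [5, 4, 3, 4, 2, 5, 3, 4]
    rater_b = [4, 4, 3, 5, 2, 5, 2, 4]

    # Linear weights penalise near-misses on an ordinal scale less heavily
    # than outright disagreements.
    kappa = cohen_kappa_score(rater_a, rater_b, weights="linear")
    print(f"weighted kappa = {kappa:.2f}")  # 0.61-0.80 counts as 'substantial'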


Cite this article

Mitchell, L., O’Brien, S. & Roturier, J. Quality evaluation in community post-editing. Machine Translation 28, 237–262 (2014). https://doi.org/10.1007/s10590-014-9160-1
