Applying CNL Authoring Support to Improve Machine Translation of Forum Data

Lehmann, Sabine; Gottesman, Ben; Grabowski, Robert; Kudo, Mayo; Lo, Siu Kei Pepe; Siegel, Melanie; Fouvry, Frederik

doi:10.1007/978-3-642-32612-7_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7427)

Included in the following conference series:

International Workshop on Controlled Natural Language

Abstract

Machine translation (MT) is most often used for texts of publishable quality. However, there is increasing interest in providing translations of user-generated content in customer forums. This paper describes research towards addressing this challenge by automatically improving the quality of community forum data to improve MT results.

Author information

Authors and Affiliations

Acrolinx GmbH, Friedrichstr. 100, 10117, Berlin, Germany
Sabine Lehmann, Ben Gottesman, Robert Grabowski, Mayo Kudo, Siu Kei Pepe Lo, Melanie Siegel & Frederik Fouvry

Editor information

Editors and Affiliations

Department of Pathology, School of Medicine, Yale University, 300 George Street, 06511, New Haven, CT, USA
Tobias Kuhn
Institut für Computerlinguistik, Universität Zürich, Binzmühlestrasse 14, 8050, Zürich, Switzerland
Norbert E. Fuchs

About this paper

Cite this paper

Lehmann, S. et al. (2012). Applying CNL Authoring Support to Improve Machine Translation of Forum Data. In: Kuhn, T., Fuchs, N.E. (eds) Controlled Natural Language. CNL 2012. Lecture Notes in Computer Science, vol 7427. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32612-7_1

DOI: https://doi.org/10.1007/978-3-642-32612-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32611-0
Online ISBN: 978-3-642-32612-7
eBook Packages: Computer Science, Computer Science (R0)
