‘Garbage Let’s Take Away’: Producing Understandable and Translatable Government Documents: A Case Study from Japan

Miyata, Rei; Hartley, Anthony; Kageura, Kyo; Paris, Cécile

doi:10.1007/978-3-319-27237-5_16

Chapter
First Online: 01 January 2016

Abstract

Government departments increasingly communicate information to citizens digitally via web sites, and in many societies the linguistic diversity of these citizens is also growing. In Japan, a largely monolingual society, municipal governments now routinely address the need to provide practical and legal information to residents with limited Japanese by machine-translating their public service web sites into selected languages. Cost constraints often mean the translation is left unedited and, as a result, may be unclear, misleading or even incomprehensible. While machine translation from Japanese is particularly challenging because of the language's structural uniqueness, the general state of the art in the field is such that poor output is a universal problem. The solution we propose draws on recent advances in controlled authoring, document structuring and machine translation evaluation. It is realised as a prototype tool that enables non-professional writers to create documents in which both individual sentences and the overall flow are clear. The tool is designed to enhance machine-translatability into English without compromising the readability of the Japanese original. Its originality lies in an interactive sentence checker that is sensitive to context, namely the individual functional elements of a document template specialised for the public administration domain. Where natural Japanese sentences give poor translation results, we pre-process them internally into a form that yields acceptable machine translation output. Evaluation of the tool will target three concerns: its usability by non-professional authors; the acceptability of the Japanese document; and the comprehensibility of the English translation. We suggest that such an authoring framework could facilitate government communication with citizens in many societies beyond Japan.
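
The two mechanisms named in the abstract, a sentence checker whose rules depend on the functional element being written and an internal pre-processing step applied before machine translation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the element names (`title`, `step`), the three rules, the single rewrite pattern, and the function names `check` and `preprocess_for_mt` do not come from the chapter, and short English strings stand in for Japanese input.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    name: str
    elements: frozenset              # functional elements in which the rule is active
    violated: Callable[[str], bool]  # True when the sentence breaks the rule
    advice: str                      # message shown to the author


# Hypothetical rules and element names, for illustration only.
RULES = [
    Rule("short-title", frozenset({"title"}),
         lambda s: len(s) > 40,
         "Keep the title to 40 characters or fewer."),
    Rule("no-parentheses", frozenset({"title", "step"}),
         lambda s: re.search(r"[()（）]", s) is not None,
         "Move parenthetical remarks into a separate sentence."),
    Rule("single-sentence-step", frozenset({"step"}),
         lambda s: len(re.findall(r"[.。]", s)) > 1,
         "Describe one action per step."),
]


def check(sentence: str, element: str) -> List[str]:
    """Return the names of the rules the sentence breaks in this element."""
    return [r.name for r in RULES
            if element in r.elements and r.violated(sentence)]


# Hypothetical internal rewrite: applied only to the copy sent to the MT
# system, so the text shown to readers of the original is left unchanged.
INTERNAL_REWRITES = [(re.compile(r"\bLet's\b"), "Please")]


def preprocess_for_mt(sentence: str) -> str:
    """Apply every internal rewrite pattern to the sentence, in order."""
    for pattern, replacement in INTERNAL_REWRITES:
        sentence = pattern.sub(replacement, sentence)
    return sentence
```

In this sketch the same sentence can pass in one element and fail in another, which is the sense in which the checker is context-sensitive: `check("Bring your ID (passport). Sign the form.", "step")` reports two violations, while the same call with an element that has no active rules reports none.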

Notes

1. http://facebook.com/.
2. https://mixi.jp/.
3. https://myspace.com/.
4. https://twitter.com/.
5. http://now.ameba.jp/.
6. https://translate.google.com/.
7. http://www.city.shinjuku.lg.jp/foreign/english/guide/todoke/todoke_7.html. Accessed 11 June 2015.
8. See http://www.plainlanguage.gov/. For a similar UK initiative, but one not backed by legislation, see http://www.plainenglish.co.uk/.
9. http://www.smartny.com/maxit.html/.
10. http://www.acrolinx.com/.
11. http://www.hotdocs.com/.
12. http://www.exari.com/.
13. http://www.logicnets.com/.
14. http://www.ptc.com/products/arbortext/.
15. http://dita-jp.org/en/.
16. http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html.

Acknowledgments

This work was supported by the Research Grant Program of KDDI Foundation, Japan. The MT system J-SERVER Professional TransGateway V3 was offered by Kodensha Co. Paris’s stay in Japan to work with Miyata, Kageura and Hartley was funded by the Japan Society for the Promotion of Science and CSIRO.

Author information

Authors and Affiliations

Graduate School of Education, The University of Tokyo, Tokyo, Japan
Rei Miyata & Kyo Kageura
College of Intercultural Communication, Rikkyo University, Tokyo, Japan
Anthony Hartley
CSIRO, Data61, Sydney, Australia
Cécile Paris
University of Leeds, Leeds, UK
Anthony Hartley

Corresponding author

Correspondence to Rei Miyata.

Editor information

Editors and Affiliations

CSIRO Data61, Marsfield, New South Wales, Australia
Surya Nepal
Services Flagship, CSIRO Data61, Sydney, New South Wales, Australia
Cécile Paris
RMIT University, Melbourne, Victoria, Australia
Dimitrios Georgakopoulos

Cite this chapter

Miyata, R., Hartley, A., Kageura, K., Paris, C. (2015). ‘Garbage Let’s Take Away’: Producing Understandable and Translatable Government Documents: A Case Study from Japan. In: Nepal, S., Paris, C., Georgakopoulos, D. (eds) Social Media for Government Services. Springer, Cham. https://doi.org/10.1007/978-3-319-27237-5_16

DOI: https://doi.org/10.1007/978-3-319-27237-5_16
Published: 01 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27235-1
Online ISBN: 978-3-319-27237-5
eBook Packages: Computer Science, Computer Science (R0)
