Abstract
Government departments increasingly communicate information to citizens digitally via web sites, and in many societies the linguistic diversity of those citizens is also growing. In Japan, a largely monolingual society, municipal governments now routinely address the need to provide practical and legal information to residents with limited Japanese by machine-translating their public service web sites into selected languages. Cost constraints often mean that the translation is left unedited and, as a result, may be unclear, misleading or even incomprehensible. Although machine translation from Japanese is particularly challenging because of the language's structural uniqueness, the general state of the art in the field is such that poor output is a universal problem. The solution we propose draws on recent advances in controlled authoring, document structuring and machine translation evaluation. It is realised as a prototype tool that enables non-professional writers to create documents in which both the individual sentences and the overall flow are clear. The tool is designed to enhance machine-translatability into English without compromising the readability of the Japanese original. Its originality lies in an interactive sentence checker that is context-sensitive to the individual functional elements of a document template specialised for the public administration domain. Where natural Japanese sentences give poor translation results, we pre-process them internally into a form that yields acceptable machine translation output. Evaluation of the tool will target three concerns: its usability by non-professional authors; the acceptability of the Japanese document; and the comprehensibility of the English translation. We suggest that such an authoring framework could facilitate government communication with citizens in many societies beyond Japan.
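The two mechanisms named in the abstract, a sentence checker whose rules depend on the functional element of the document template, and an internal pre-processing step applied only to the machine translation input, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the element names (`step`, `explanation`), the two rules, the 50-character limit and the rewrite table are hypothetical and are not taken from the prototype tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    name: str
    violates: Callable[[str], bool]  # True when the sentence breaks the rule
    advice: str                      # message shown to the author


# Hypothetical rules, for illustration only.
TOO_LONG = Rule(
    "max-length",
    lambda s: len(s) > 50,
    "Split the sentence (over 50 characters).",
)
NOT_REQUEST = Rule(
    "request-form",
    lambda s: not s.endswith("ください。"),
    "End a step with a request form (...ください。).",
)

# Context sensitivity: each functional element of the template has its own
# rule set. The element names are assumptions, not the tool's template.
RULES_BY_ELEMENT: Dict[str, List[Rule]] = {
    "step": [TOO_LONG, NOT_REQUEST],
    "explanation": [TOO_LONG],
}

# Hypothetical internal rewrites: applied to the MT input only, so the
# Japanese text that the author wrote is left untouched.
INTERNAL_REWRITES: Dict[str, str] = {
    "お持ちください": "持ってきてください",
}


def check(element: str, sentence: str) -> List[str]:
    """Return advice for every rule of this element that the sentence breaks."""
    rules = RULES_BY_ELEMENT.get(element, [])
    return [f"{r.name}: {r.advice}" for r in rules if r.violates(sentence)]


def preprocess_for_mt(sentence: str) -> str:
    """Return a rewritten copy of the sentence to send to the MT system."""
    for source, target in INTERNAL_REWRITES.items():
        sentence = sentence.replace(source, target)
    return sentence


if __name__ == "__main__":
    sentence = "申請書を窓口に提出します。"
    print(check("step", sentence))         # flagged by the request-form rule
    print(check("explanation", sentence))  # same sentence, no rule broken
    print(preprocess_for_mt("印鑑をお持ちください。"))
```

In this sketch the same sentence is flagged inside a `step` element but accepted inside an `explanation` element, which is the sense in which the checking is context-sensitive. Keeping the rewrite table separate from the checker reflects the stated goal of improving translatability without changing the Japanese original that readers see.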
Notes
- 7. http://www.city.shinjuku.lg.jp/foreign/english/guide/todoke/todoke_7.html. Accessed 11 June 2015.
- 8. See http://www.plainlanguage.gov/. For a similar UK initiative, but one not backed by legislation, see http://www.plainenglish.co.uk/.
Acknowledgments
This work was supported by the Research Grant Program of KDDI Foundation, Japan. The MT system J-SERVER Professional TransGateway V3 was provided by Kodensha Co. Paris’s stay in Japan to work with Miyata, Kageura and Hartley was funded by the Japan Society for the Promotion of Science and CSIRO.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Miyata, R., Hartley, A., Kageura, K., Paris, C. (2015). ‘Garbage Let’s Take Away’: Producing Understandable and Translatable Government Documents: A Case Study from Japan. In: Nepal, S., Paris, C., Georgakopoulos, D. (eds) Social Media for Government Services. Springer, Cham. https://doi.org/10.1007/978-3-319-27237-5_16
DOI: https://doi.org/10.1007/978-3-319-27237-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27235-1
Online ISBN: 978-3-319-27237-5
eBook Packages: Computer Science, Computer Science (R0)