Abstract
We extend an automatically generated bilingual Japanese-Swedish dictionary with new translations discovered automatically in the multilingual online encyclopedia Wikipedia. The method generates over 50,000 translations of very high quality, most of which are not present in the original dictionary. We analyze what types of translations this simple method can produce. The majority are proper nouns, and other types of (usually) uninteresting translations are also generated. Even when these less interesting words are excluded, about 15,000 new translations remain. A comparison against logs of search queries submitted to the old dictionary shows that the new translations would significantly reduce the number of searches with no matching translation.
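The evaluation described in the abstract (counting translations absent from the original dictionary, then checking query logs for searches with no match) can be illustrated with a minimal sketch. It assumes the candidate translation pairs have already been extracted from Wikipedia. The function names, the data layout, and the toy entries are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical layout: a headword maps to its set of translations.
Dictionary = Dict[str, Set[str]]


def extend_dictionary(
    original: Dictionary, candidates: Iterable[Tuple[str, str]]
) -> Tuple[Dictionary, int]:
    """Merge candidate (source, target) pairs into a copy of the dictionary.

    Returns the merged dictionary and the number of pairs that were
    not already present in the original.
    """
    merged = {word: set(translations) for word, translations in original.items()}
    new_pairs = 0
    for source, target in candidates:
        known = merged.setdefault(source, set())
        if target not in known:
            known.add(target)
            new_pairs += 1
    return merged, new_pairs


def miss_rate(dictionary: Dictionary, queries: List[str]) -> float:
    """Fraction of logged queries that have no entry in the dictionary."""
    if not queries:
        return 0.0
    misses = sum(1 for query in queries if query not in dictionary)
    return misses / len(queries)


# Toy data for illustration only (not from the paper).
original = {"犬": {"hund"}}
candidates = [("犬", "hund"), ("東京", "Tokyo"), ("猫", "katt")]
query_log = ["犬", "東京", "猫", "鳥"]

merged, new_count = extend_dictionary(original, candidates)
rate_before = miss_rate(original, query_log)
rate_after = miss_rate(merged, query_log)
print(new_count, rate_before, rate_after)
```

On real data, the same comparison would be run over the full query log, with the miss rate before and after the extension serving as the measure of improvement.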
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sjöbergh, J., Sjöbergh, O., Araki, K. (2008). What Types of Translations Hide in Wikipedia? In: Tokunaga, T., Ortega, A. (eds) Large-Scale Knowledge Resources. Construction and Application. LKR 2008. Lecture Notes in Computer Science, vol 4938. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78159-2_6
DOI: https://doi.org/10.1007/978-3-540-78159-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78158-5
Online ISBN: 978-3-540-78159-2
eBook Packages: Computer Science, Computer Science (R0)