Abstract
Because new onomatopoeic words are often coined on the fly, existing dictionaries tend to contain too few of them. Furthermore, onomatopoeic words seldom appear in collections of newspaper articles, which have been used as corpora in natural language processing. In this work, we present a method for automatically acquiring lexical knowledge about Japanese onomatopoeic words from the WWW. Using this method, we automatically constructed an onomatopoeic dictionary containing 5,130 entries. By manually evaluating 487 newly acquired words that were not in the existing dictionary, we found that 266 of them were genuinely new onomatopoeic words; if words already in the existing dictionary are regarded as correct, the precision of our automatic acquisition was 83.6%.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Okumura, M., Okumura, A., Saito, S. (2006). Automatic Construction of a Japanese Onomatopoeic Dictionary Using Text Data on the WWW. In: Kop, C., Fliedl, G., Mayr, H.C., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2006. Lecture Notes in Computer Science, vol 3999. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11765448_20
DOI: https://doi.org/10.1007/11765448_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34616-6
Online ISBN: 978-3-540-34617-3
eBook Packages: Computer Science, Computer Science (R0)