Disambiguation of Japanese Onomatopoeias Using Nouns and Verbs
Japanese onomatopoeias are difficult for machines to recognize and translate into other languages owing to their uniqueness. Onomatopoeias that carry several meanings are especially hard for machine translation systems to distinguish and translate correctly. In this paper, we examine which features help to automatically disambiguate onomatopoeias that have two different meanings. We extracted nouns, adjectives, and verbs from sentences as features, ran a machine learning classification analysis, and compared how accurately each feature set separates the two meanings of an ambiguous onomatopoeia. We found that machine learning with nouns and verbs as features achieved an accuracy above 80 points. We further improved accuracy by excluding pronouns and proper nouns and by limiting verbs to those modified by the onomatopoeia. In future work, we plan to focus on the dependency between nouns and the verbs modified by an onomatopoeia, as we believe this approach will help machine translation systems translate Japanese onomatopoeias correctly.
Keywords: Machine Translation, Feature Selection Method, Proper Noun, Machine Translation System, Correct Meaning
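The feature selection described in the abstract can be illustrated with a minimal sketch. It makes several assumptions: the sentences are already tokenized and part-of-speech tagged, the tag names (`noun`, `verb`, `pronoun`, `proper`) and the romanized toy sentences are hypothetical, and a simple perceptron is swapped in as the classifier, since the abstract does not name the learning algorithm. The restriction of verbs to those modified by the onomatopoeia is omitted because it would require a dependency parse.

```python
from collections import Counter

KEEP_POS = {"noun", "verb"}                  # parts of speech used as features
DROP_NOUN_SUBTYPES = {"pronoun", "proper"}   # noun subtypes excluded as features


def extract_features(tokens):
    """Count nouns and verbs in a tagged sentence, skipping pronouns and proper nouns.

    tokens: list of (surface, pos, subtype) triples (hypothetical tag scheme).
    """
    feats = Counter()
    for surface, pos, subtype in tokens:
        if pos not in KEEP_POS:
            continue
        if pos == "noun" and subtype in DROP_NOUN_SUBTYPES:
            continue
        feats[f"{pos}:{surface}"] += 1
    return feats


def train_perceptron(examples, epochs=10):
    """Train a binary perceptron (a stand-in classifier, not the paper's).

    examples: list of (feature Counter, label) pairs with label in {+1, -1}.
    """
    weights = Counter()
    bias = 0
    for _ in range(epochs):
        for feats, label in examples:
            score = bias + sum(weights[f] * v for f, v in feats.items())
            if label * score <= 0:           # misclassified or on the boundary
                for f, v in feats.items():
                    weights[f] += label * v
                bias += label
    return weights, bias


def predict(model, feats):
    """Return +1 or -1 for a feature Counter."""
    weights, bias = model
    score = bias + sum(weights[f] * v for f, v in feats.items())
    return 1 if score > 0 else -1


# Toy data for the ambiguous onomatopoeia "gorogoro":
# +1 = rumbling sound, -1 = lazing around. Sentences are invented for illustration.
TRAIN = [
    ([("kaminari", "noun", "common"), ("gorogoro", "adverb", "general"),
      ("naru", "verb", "main")], 1),
    ([("watashi", "noun", "pronoun"), ("ie", "noun", "common"),
      ("gorogoro", "adverb", "general"), ("sugosu", "verb", "main")], -1),
    ([("sora", "noun", "common"), ("gorogoro", "adverb", "general"),
      ("naru", "verb", "main")], 1),
    ([("kyuujitsu", "noun", "common"), ("gorogoro", "adverb", "general"),
      ("neru", "verb", "main")], -1),
]

model = train_perceptron([(extract_features(t), y) for t, y in TRAIN])

unseen = [("Tanaka", "noun", "proper"), ("ie", "noun", "common"),
          ("gorogoro", "adverb", "general"), ("neru", "verb", "main")]
print(predict(model, extract_features(unseen)))
```

Dropping pronouns and proper nouns before counting keeps sentence-specific words such as personal names from becoming features, which is one plausible reading of why the abstract reports an accuracy gain from that exclusion.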