Abstract
In this paper, we examine whether Japanese WordNet can be expanded using AutoExtend, a method that produces embedded vectors based on dictionary structure. Distributed representations of words have recently proved effective in several kinds of NLP tasks. However, word embeddings built from the contexts of surrounding words have difficulty discriminating among the meanings of a word, because a single vector is produced for each word. AutoExtend, in contrast, takes the thesaurus structure of a dictionary into account and produces embedded vectors for meanings and concepts as well as for words; it has been proposed and applied to the English WordNet. We therefore apply AutoExtend to a Japanese dictionary, namely Japanese WordNet, to construct embedded vectors for lexemes and synsets as well as for words, taking into account the thesaurus structure of Japanese WordNet. The experimental results show that the embedded vectors constructed by AutoExtend can help find the corresponding meanings of words that are not registered in the dictionary.
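To make the last point concrete, the following is a minimal sketch of how an unregistered word might be matched to candidate synsets once words and synsets share one vector space. It is an illustration only, not the procedure evaluated in the paper: the cosine-similarity ranking, the function names, the synset identifiers, and the three-dimensional toy vectors are all assumptions introduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_synsets(word_vec, synset_vecs, top_k=3):
    """Return the top_k (synset_id, similarity) pairs closest to word_vec."""
    scored = [(sid, cosine(word_vec, vec)) for sid, vec in synset_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: real synset vectors would come from AutoExtend and
# would have the same dimensionality as the underlying word embeddings.
synset_vecs = {
    "synset:dog": [0.9, 0.1, 0.0],
    "synset:cat": [0.7, 0.3, 0.1],
    "synset:car": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a word that has no entry in the dictionary.
unregistered_word_vec = [0.8, 0.2, 0.05]

ranking = rank_synsets(unregistered_word_vec, synset_vecs, top_k=2)
for synset_id, score in ranking:
    print(f"{synset_id}\t{score:.3f}")
```

In this toy setting the word vector lies closest to the two animal synsets, so those would be proposed as candidate meanings; whether such a ranking is reliable on real data is what the experiments in the paper assess.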
Acknowledgment
Part of the research reported in this paper was supported by JSPS KAKENHI (JP19K00552) and by the NINJAL project “Development of and Research with a parsed corpus of Japanese” under JSPS KAKENHI (JP15H03210).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ko, D., Takeuchi, K. (2020). Evaluation of Embedded Vectors for Lexemes and Synsets Toward Expansion of Japanese WordNet. In: Nguyen, LM., Phan, XH., Hasida, K., Tojo, S. (eds) Computational Linguistics. PACLING 2019. Communications in Computer and Information Science, vol 1215. Springer, Singapore. https://doi.org/10.1007/978-981-15-6168-9_7
DOI: https://doi.org/10.1007/978-981-15-6168-9_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6167-2
Online ISBN: 978-981-15-6168-9
eBook Packages: Computer Science, Computer Science (R0)