A Term Normalization Method for Better Performance of Terminology Construction
The importance of research on knowledge management is growing due to recent issues with big data. The most fundamental steps in knowledge management are the extraction and construction of terminologies. Terms are often expressed in various forms, and these term variations become an obstacle that causes knowledge systems to extract unnecessary knowledge. To solve this problem, we propose a term normalization method that finds the normalized form (the original, standard form defined in dictionaries) of variant terms. The method employs two characteristics of terms: appearance similarity, which measures how similar the terms are, and context similarity, which measures how many clue words they share. Through experiments, we show the positive influence of both similarities on term normalization.
Keywords: Term Normalization, Terminology, Appearance Similarity
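
The two similarities described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual measures, so everything here is an assumption: appearance similarity is approximated with the character-matching ratio from Python's `difflib`, context similarity with the Jaccard overlap of clue-word sets, and the two are combined with a hypothetical weight `alpha`. The function names and the toy dictionary are invented for illustration and are not the authors' implementation.

```python
from difflib import SequenceMatcher


def appearance_similarity(a: str, b: str) -> float:
    """Surface-form similarity in [0, 1] (assumed stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def context_similarity(clues_a: set, clues_b: set) -> float:
    """Fraction of clue words the two terms share (Jaccard overlap, assumed)."""
    if not clues_a or not clues_b:
        return 0.0
    return len(clues_a & clues_b) / len(clues_a | clues_b)


def normalize(variant: str, variant_clues: set, dictionary: dict,
              alpha: float = 0.5) -> str:
    """Return the dictionary term with the highest combined score.

    `dictionary` maps each normalized term to its set of clue words.
    `alpha` is a hypothetical weight balancing the two similarities.
    """
    def score(term: str) -> float:
        return (alpha * appearance_similarity(variant, term)
                + (1 - alpha) * context_similarity(variant_clues, dictionary[term]))

    return max(dictionary, key=score)


# Toy data, invented for illustration only.
dictionary = {
    "database": {"query", "table", "storage"},
    "dataset": {"training", "sample", "label"},
}
best = normalize("data-base", {"query", "table", "index"}, dictionary)
print(best)
```

In this toy case the variant "data-base" is close to both dictionary entries in appearance, and the shared clue words ("query", "table") reinforce the choice of "database", which is the kind of complementary effect the abstract attributes to using both similarities.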