WSD Algorithm Applied to a NLP System

Abstract

The need for advanced free-text filtering is increasing. When searching for specific keywords, it is desirable to eliminate occurrences in which the word or words are used in an inappropriate sense. This capability could be exploited in internet browsers, resource discovery systems, relational databases containing free-text fields, electronic document management systems, data warehouse and data mining systems, etc. To address this problem, this paper presents a method for the automatic disambiguation of nouns, using the notion of Specification Marks and the noun taxonomy of the WordNet lexical knowledge base [8]. The method is applied to a Natural Language Processing (NLP) system. It resolves the lexical ambiguity of nouns in any sort of text. Although it relies on the semantic relations (hypernymy/hyponymy) and the hierarchical organization of WordNet, it does not require any training process, hand-coding of lexical entries, or hand-tagging of texts. The method was evaluated on both the Semantic Concordance Corpus (Semcor) [9] and Microsoft’s electronic encyclopaedia ("Microsoft 98 Encarta Encyclopaedia Deluxe"). The percentages of correct resolutions achieved on these two corpora were 65.8% for Semcor and 65.6% for the Microsoft encyclopaedia. These percentages show that similar results were obtained on corpora from different domains, which suggests that the proposed method can be applied successfully to any corpus.
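
To make the idea of disambiguating nouns through a hypernym taxonomy concrete, the following is a minimal sketch under stated assumptions. It is not the Specification Marks algorithm itself, whose details the abstract does not give. It substitutes a simpler heuristic: each candidate sense is scored by how deep its lowest common hypernym with the context nouns lies. The taxonomy, the sense labels, and the function names are all hypothetical toy data, not real WordNet entries.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (NOT real WordNet data): sense -> its hypernym.
HYPERNYM: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "fish": "animal",
    "bass_fish": "fish",
    "trout_fish": "fish",
    "salmon_fish": "fish",
    "artifact": "entity",
    "instrument": "artifact",
    "bass_instrument": "instrument",
    "guitar_instrument": "instrument",
    "drum_instrument": "instrument",
    "container": "artifact",
    "drum_container": "container",
}

# Hypothetical sense inventory: noun -> candidate senses.
SENSES: Dict[str, List[str]] = {
    "bass": ["bass_fish", "bass_instrument"],
    "trout": ["trout_fish"],
    "salmon": ["salmon_fish"],
    "guitar": ["guitar_instrument"],
    "drum": ["drum_instrument", "drum_container"],
}


def ancestors(sense: str) -> List[str]:
    """Return the hypernym chain [sense, parent, ..., root]."""
    chain = []
    node: Optional[str] = sense
    while node is not None:
        chain.append(node)
        node = HYPERNYM[node]
    return chain


def depth(sense: str) -> int:
    """Number of hypernym links between the sense and the root."""
    return len(ancestors(sense)) - 1


def lca_depth(a: str, b: str) -> int:
    """Depth of the most specific hypernym shared by senses a and b."""
    anc_a = set(ancestors(a))
    for node in ancestors(b):  # walking upward, the first shared node is the deepest
        if node in anc_a:
            return depth(node)
    return 0


def disambiguate(target: str, context: List[str]) -> Optional[str]:
    """Pick the sense of `target` that shares the deepest hypernyms with the context nouns."""
    best_sense, best_score = None, -1
    for sense in SENSES[target]:
        score = sum(
            max(lca_depth(sense, ctx_sense) for ctx_sense in SENSES[word])
            for word in context
            if word in SENSES  # nouns missing from the toy inventory are ignored
        )
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate("bass", ["guitar", "drum"]))
print(disambiguate("bass", ["trout", "salmon"]))
```

As in the method described above, this sketch needs no training data and no hand-tagged text: the only knowledge source is the taxonomy. A realistic version would read senses and hypernyms from WordNet rather than from a hand-built dictionary.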