MCD 2007: Mining Complex Data, pp. 82-92
Discovering Word Meanings Based on Frequent Termsets
Abstract
Word meaning ambiguity has always been an important problem in information retrieval and extraction, as well as in text mining (document clustering and classification). Knowledge discovery tasks such as automatic ontology building and maintenance would also profit from simple and efficient methods for discovering word meanings. The paper presents a novel text mining approach to discovering word meanings. The proposed context measures are expressed by means of frequent termsets. The presented methods have been implemented with efficient data mining techniques. The approach is domain- and language-independent, although it requires a part-of-speech tagger. The paper includes sample results obtained with the presented methods.
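To make the term "frequent termset" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that a text unit is a list of terms and that a termset is frequent when it occurs in at least `min_support` units. The function names, the brute-force enumeration, and the toy data are all hypothetical and chosen only for illustration; a real implementation would use an efficient miner such as Apriori.

```python
from collections import Counter
from itertools import combinations


def frequent_termsets(units, min_support, max_size=2):
    """Return {termset: support} for termsets occurring in >= min_support units.

    Brute-force enumeration up to max_size terms; illustrative only.
    """
    counts = Counter()
    for unit in units:
        terms = sorted(set(unit))  # count each term once per unit
        for size in range(1, max_size + 1):
            for combo in combinations(terms, size):
                counts[frozenset(combo)] += 1
    return {ts: n for ts, n in counts.items() if n >= min_support}


def contexts_of(word, termsets):
    """Frequent termsets that contain `word`, with the word itself removed."""
    return {
        ts - {word}: n
        for ts, n in termsets.items()
        if word in ts and len(ts) > 1
    }


# Toy text units (invented for this sketch, not taken from the paper).
units = [
    ["apple", "fruit", "tree"],
    ["apple", "fruit", "juice"],
    ["apple", "computer", "software"],
    ["apple", "computer", "keyboard"],
]

termsets = frequent_termsets(units, min_support=2)
print(contexts_of("apple", termsets))
```

In this toy data, "apple" co-occurs frequently with two unrelated terms, "fruit" and "computer". Distinct frequent contexts of this kind are the sort of signal that could point to separate meanings of a word, under the assumptions stated above.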
Keywords
Association rules, frequent termsets, homonyms, polysemy