CICLing 2005: Computational Linguistics and Intelligent Text Processing, pp. 280-292
Finding Instance Names and Alternative Glosses on the Web: WordNet Reloaded
Conference paper
Abstract
This paper presents an approach to extending existing lexical resources with instance names and alternative definitions acquired from textual documents. The experiments involve WordNet and approximately 300 million Web documents, but the method is more generally applicable. We leverage formally structured, human-validated resources on the one hand and data-driven instance names and definitions on the other, which opens the path to new applications of the reloaded resources.
Keywords
Noun Phrase; Word Sense Disambiguation; Unstructured Text; Lexical Concept; Semantic Lexicon
© Springer-Verlag Berlin Heidelberg 2005