Abstract
A user expresses an information need through words that have a precise meaning, but from the machine's point of view that meaning does not come attached to the word. A further step is needed to associate it with the words automatically. This requires techniques for processing human language as well as linguistic and semantic knowledge, which is stored in distinct, heterogeneous resources and plays an important role in every Natural Language Processing (NLP) step. Managing these resources is a challenging problem, as is correctly associating the URIs that come from the resources with the meanings of the words.
This work presents a service that, given a lexeme (an abstract unit of morphological analysis in linguistics, roughly corresponding to the set of forms taken by a single word), returns all the syntactic and semantic information collected from a list of lexical and semantic resources. The proposed strategy merges data originating from stable resources, such as WordNet, with data collected dynamically from evolving sources, such as the Web or Wikipedia. The strategy is implemented as a wrapper around a set of popular linguistic resources. The wrapper provides a single point of access to them that is transparent to the user, and addresses the computational linguistics problem of obtaining a rich set of linguistic and semantic annotations in a compact form.
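The wrapper idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Annotation`, `LexemeService`, `static_lexicon` and `dynamic_encyclopedia` are hypothetical, the two resources are in-memory stand-ins holding toy data rather than real WordNet or Wikipedia clients, and the URL is a placeholder. The sketch only shows how one lookup call might merge annotations from a stable and an evolving source while keeping track of where each annotation came from.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Annotation:
    """One piece of information about a lexeme, tagged with its origin."""
    source: str  # name of the resource that produced it (hypothetical names)
    kind: str    # e.g. "sense" or "link"
    value: str


# A resource is modelled as any callable mapping a lexeme to annotations.
Resource = Callable[[str], List[Annotation]]


def static_lexicon(lexeme: str) -> List[Annotation]:
    """Stand-in for a stable resource such as WordNet (toy data only)."""
    toy = {"bank": ["financial institution", "sloping land beside water"]}
    return [Annotation("static_lexicon", "sense", s) for s in toy.get(lexeme, [])]


def dynamic_encyclopedia(lexeme: str) -> List[Annotation]:
    """Stand-in for an evolving source such as Wikipedia (toy data only)."""
    toy = {"bank": ["http://example.org/wiki/Bank"]}  # placeholder URI
    return [Annotation("dynamic_encyclopedia", "link", u) for u in toy.get(lexeme, [])]


@dataclass
class LexemeService:
    """Single point of access that merges the output of every resource."""
    resources: List[Resource] = field(default_factory=list)

    def lookup(self, lexeme: str) -> Dict[str, List[Annotation]]:
        # Group annotations by kind; each one keeps its source label.
        merged: Dict[str, List[Annotation]] = {}
        for resource in self.resources:
            for ann in resource(lexeme.lower()):
                merged.setdefault(ann.kind, []).append(ann)
        return merged


service = LexemeService([static_lexicon, dynamic_encyclopedia])
result = service.lookup("Bank")
for kind, annotations in result.items():
    print(kind, [(a.source, a.value) for a in annotations])
```

Modelling each resource as a plain callable keeps the caller unaware of how many resources sit behind the service, which is one possible reading of the "transparent to the user" requirement; a real system would also need caching, error handling for unreachable sources, and a mapping between resource URIs and senses.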
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gentile, A.L., Basile, P., Iaquinta, L., Semeraro, G. (2008). Lexical and Semantic Resources for NLP: From Words to Meanings. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5179. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85567-5_35
DOI: https://doi.org/10.1007/978-3-540-85567-5_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85566-8
Online ISBN: 978-3-540-85567-5
eBook Packages: Computer Science, Computer Science (R0)