Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation

Specia, Lucia; Stevenson, Mark; das Graças Volpe Nunes, Maria

doi:10.1007/s10579-009-9107-y

Published: 19 November 2009

Volume 44, pages 295–313 (2010)

Language Resources and Evaluation

Lucia Specia¹,
Mark Stevenson² &
Maria das Graças Volpe Nunes³

Abstract

Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD), despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD that uses inductive logic programming to learn theories from first-order logic representations, which allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.

Notes

Definite clauses are first-order clauses containing one positive literal.
Horn clauses are first-order clauses that can contain at most one positive literal. A Horn clause with exactly one positive literal is a definite clause.
Here \(\wedge\) represents logical and, ⊧ represents "logically proves" (entails), and □ represents falsity.
A clause is satisfiable if there exists at least one model for it, i.e., there exists one interpretation (a set of ground facts) that assigns a true value for such clause.
See Sect. 4.1 for a discussion of OntoNotes' treatment of phrasal verbs such as “come back”.
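
The clause definitions in these notes can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: it handles only ground (variable-free) clauses, whereas the notes speak of first-order clauses, and it checks whether one given interpretation satisfies a clause rather than searching for a model. The names `Literal`, `is_horn`, `is_definite` and `satisfied_by`, and the atoms `sense_a`, `sense_b`, `cue_x` and `cue_y`, are hypothetical and do not come from the article.

```python
from typing import FrozenSet, NamedTuple, Set


class Literal(NamedTuple):
    atom: str       # a ground atom (hypothetical names, for illustration only)
    positive: bool  # False means the atom appears negated


# A clause is a disjunction of literals.
Clause = FrozenSet[Literal]


def count_positive(clause: Clause) -> int:
    """Number of positive (non-negated) literals in the clause."""
    return sum(1 for lit in clause if lit.positive)


def is_horn(clause: Clause) -> bool:
    """Horn clause: at most one positive literal."""
    return count_positive(clause) <= 1


def is_definite(clause: Clause) -> bool:
    """Definite clause: exactly one positive literal."""
    return count_positive(clause) == 1


def satisfied_by(clause: Clause, interpretation: Set[str]) -> bool:
    """True if the interpretation (the set of ground atoms taken to be true)
    makes at least one literal of the clause true."""
    return any((lit.atom in interpretation) == lit.positive for lit in clause)


# "sense_a if cue_x and cue_y", written as the disjunction
# sense_a OR not cue_x OR not cue_y.
rule: Clause = frozenset({
    Literal("sense_a", True),
    Literal("cue_x", False),
    Literal("cue_y", False),
})

# A clause with no positive literal: Horn, but not definite.
goal: Clause = frozenset({Literal("cue_x", False)})

# Two positive literals: not a Horn clause.
non_horn: Clause = frozenset({
    Literal("sense_a", True),
    Literal("sense_b", True),
})
```

Under this encoding, `rule` is falsified only by an interpretation in which both cues hold and `sense_a` does not; any other interpretation satisfies it, so the clause is satisfiable in the sense of the note above.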

References

Agirre, E., & Martínez, D. (2001). Knowledge sources for word sense disambiguation. In Proceedings of the 4th international conference on text speech and dialogue (TSD), Plzen (pp. 1–10).
Agirre, E., Marquez, L., & Wicentowski, R. (2007). In 4th International workshop on semantic evaluations (SemEval-07), Prague (pp. 48–53).
Agirre, E., & Rigau, G. (1996). Word sense disambiguation using conceptual density. In Proceedings of the 15th conference on computational linguistics (COLING-96), Copenhagen (pp. 16–22).
Agirre, E., & Stevenson, M. (2006). Knowledge sources for word sense disambiguation. In E. Agirre & P. Edmonds (Eds.), Word sense disambiguation: Algorithms, applications and trends. Dordrecht: Springer.
Bruce, R., & Guthrie, L. (1992). Genus disambiguation: A study in weighted performance. In 14th Conference on computational linguistics (COLING-92), Nantes (pp. 1187–1191).
Daelemans, W., Hoste, V., Meulder, F., & Naudts, B. (2003). Combined optimization of feature selection and algorithm parameter interaction in machine learning of language. In Proceedings of the 14th European conference on machine learning (ECML-03), Croatia (pp. 84–95).
Decadt, B., Hoste, V., Daelemans, W., & van den Bosch, A. (2004). GAMBL, genetic algorithm optimization of memory-based WSD. In Senseval-3: 3rd international workshop on the evaluation of systems for the semantic analysis of text, Barcelona (pp. 108–112).
Edmonds, P., Mihalcea, R., & Saint-Dizier, P. (2002). Proceedings of the workshop word sense disambiguation: Recent successes and future directions, Philadelphia.
Fellbaum, C. (1998). WordNet: An electronic lexical Database. Massachusetts: MIT Press.
Hovy, E. H., Marcus, M., Palmer, M., Pradhan, S., Ramshaw, L., & Weischedel, R. (2006). OntoNotes: The 90% solution. In Human language technology/North American association of computational linguistics conference (HLT-NAACL 06), New York (pp. 57–60).
Lee, Y. K., & Ng, H. T. (2002). An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation. In Proceedings of the conference on empirical methods in natural language processing (EMNLP), Philadelphia (pp. 41–48).
Lin, D. (1993). Principle based parsing without overgeneration. In Proceedings of the 31st meeting of the association for computational linguistics (ACL-93), Columbus (pp. 112–120).
Mihalcea, R. F. (2002). Word sense disambiguation with pattern learning and automatic feature selection. Natural Language Engineering, 8(4), 343–358. (Cambridge University Press).
Mihalcea, R. F., Chklovski, T., & Kilgarriff, A. (2004). The SENSEVAL-3 English lexical sample task. In SENSEVAL-3: 3rd international workshop on the evaluation of systems for semantic analysis of text (pp. 25–28).
Miller, G. A., Chodorow, M., Landes, S., Leacock, C., & Thomas, R. G. (1994). Using a semantic concordancer for sense identification. In ARPA human language technology workshop, Washington (pp. 240–243).
Muggleton, S. (1991). Inductive logic programming. New Generation Computing, 8(4), 295–318.
Muggleton, S. (1994). Inductive logic programming: Derivations, successes and shortcomings. SIGART Bulletin, 5(1), 5–11.
Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, 13, 245–286.
Muggleton, S., & De Raedt, L. (1994). Inductive logic programming: Theory and methods. Journal of Logic Programming, 19/20, 629–679.
Ng, H. T., & Lee, H. B. (1996). Integrating multiple knowledge sources to disambiguate word sense: An exemplar-based approach. In Proceedings of the 34th meeting of the association for computational linguistics (ACL-96), Santa Cruz (pp. 40–47).
Pradhan, S., Loper, E., Dligach, D., & Palmer, M. (2007). SemEval-2007 Task-17: English lexical sample, SRL and all words. In Proceedings of the fourth international workshop on semantic evaluations (SemEval-07), Prague (pp. 87–92).
Procter, P. (Ed.) (1978). Longman dictionary of contemporary English. Essex: Longman Group.
Ratnaparkhi, A. (1996). A maximum entropy part-of-speech tagger. In Proceedings of the conference on empirical methods in natural language processing, New Jersey (pp. 133–142).
Small, S., & Rieger, C. (1982). Parsing and comprehending with word experts (a theory and its realisation). In W. Lehnert, & M. Ringle (Eds.), Strategies for natural language processing. Hillsdale: Lawrence Erlbaum Associates.
Specia, L., Nunes, M. G. V., & Stevenson, M. (2007a). Learning expressive models for word sense disambiguation. In 45th Annual meeting of the association for computational linguistics (ACL-07), Prague (pp. 41–48).
Specia, L., Nunes, M. G. V., Srinivasan, A., & Ramakrishnan, G. (2007b). USP-IBM-1 and USP-IBM-2: The ILP-based systems for lexical sample WSD in SemEval-2007. In Proceedings of the 4th international workshop on semantic evaluations (SemEval-07), Prague (pp. 442–445).
Srinivasan, A. (1999). The aleph manual. Available at http://www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/, 1999.
Stevenson, M., & Wilks, Y. (2001). The interaction of knowledge sources in word sense disambiguation. Computational Linguistics, 27(3), 321–349.
Wilks, Y. (1978). Making preferences more active. Artificial Intelligence, 11(3), 197–223.
Yarowsky, D. (1995). Unsupervised word-sense disambiguation rivaling supervised methods. In Proceedings of the 33rd meeting of the association for computational linguistics (ACL-95), Cambridge (pp. 189–196).
Yarowsky, D., & Florian, R. (2002). Evaluating sense disambiguation across diverse parameter spaces. Natural Language Engineering, 8(2), 293–310.

Acknowledgments

We are grateful for the feedback provided by the anonymous reviewers of this paper. Mark Stevenson was supported by the UK Engineering and Physical Sciences Research Council (grants EP/E004350/1 and EP/D069548/1).

Author information

Authors and Affiliations

Research Institute for Information and Language Processing, University of Wolverhampton, Stafford Street, Wolverhampton, WV1 1SB, UK
Lucia Specia
Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
Mark Stevenson
Universidade de São Paulo, Caixa Postal 668, São Carlos, 13560-970, Brazil
Maria das Graças Volpe Nunes

Corresponding author

Correspondence to Mark Stevenson.

About this article

Cite this article

Specia, L., Stevenson, M. & das Graças Volpe Nunes, M. Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation. Lang Resources & Evaluation 44, 295–313 (2010). https://doi.org/10.1007/s10579-009-9107-y

Published: 19 November 2009
Issue Date: December 2010
DOI: https://doi.org/10.1007/s10579-009-9107-y
