Improving English verb sense disambiguation performance with linguistically motivated features and clear sense distinction boundaries

Language Resources and Evaluation

Abstract

This paper presents a high-performance broad-coverage supervised word sense disambiguation (WSD) system for English verbs that uses linguistically motivated features and a smoothed maximum entropy machine learning model. We describe three specific enhancements to our system’s treatment of linguistically motivated features which resulted in the best published results on SENSEVAL-2 verbs. We then present the results of training our system on OntoNotes data, both the SemEval-2007 task and additional data. OntoNotes data is designed to provide clear sense distinctions, based on using explicit syntactic and semantic criteria to group WordNet senses, with sufficient examples to constitute high quality, broad coverage training data. Using similar syntactic and semantic features for WSD, we achieve performance comparable to that of human taggers, and competitive with the top results for the SemEval-2007 task. Empirical analysis of our results suggests that clarifying sense boundaries and/or increasing the number of training instances for certain verbs could further improve system performance.

Notes

  1. http://www.senseval.org/.

  2. Verbs that are past participles and are not preceded by be or have verbs are semi-passive.

  3. The significance test mentioned here and in later discussions is a statistical hypothesis test in which the runs of the two systems (sysA and sysB) on N test instances are regarded as two experiments, each consisting of N independent trials. The null hypothesis is p_A = p_B, where p_X is the error rate of sysX (X = A or B), i.e., the estimated probability of making a mistake based on N observations. The significance test script we used was signif, written by Jeff Bilmes from UC Berkeley (1996).

  4. In our case, the ITA is the number of instances whose senses were agreed upon by two annotators divided by the total number of annotated instances. The Gold Standard consists of senses adjudicated by a third person (especially for the instances on which the two annotators disagreed). Since the adjudicated sense was usually the choice of one of the annotators, the tagger (i.e., annotator) accuracy compared to the Gold Standard is usually higher than the ITA.

  5. For more details about the systems developed with these other learning algorithms see Chen et al. (2007).

  6. We did not use tagger-Gold Standard agreement for our analysis because the data we used were created in summer 2007, and we did not calculate that number for them. The current OntoNotes data contains more instances for the 217 verbs and redefined sense groupings for some verbs that previously had extremely low ITAs, so the analysis could not be redone. However, we expect that a comparison with tagger-Gold Standard rates would give very similar results, since the distribution is quite similar and the agreement rate is on average 7% higher (based on the current OntoNotes data for the 217 verbs).
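
The two computations described in notes 3 and 4 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, and the test shown is a standard pooled two-proportion z-test that matches the description in note 3 (two runs of N independent trials, null hypothesis p_A = p_B). The signif script itself is not reproduced here and may differ in detail.

```python
import math


def inter_annotator_agreement(tags_a, tags_b):
    """ITA as defined in note 4: instances on which both annotators
    chose the same sense, divided by all annotated instances."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two non-empty tag lists of equal length")
    agreed = sum(1 for a, b in zip(tags_a, tags_b) if a == b)
    return agreed / len(tags_a)


def two_proportion_z_test(errors_a, errors_b, n):
    """Two-sided test of H0: p_A = p_B, where each system made
    errors_X mistakes on n test instances.

    Assumption: pooled normal approximation to the binomial; this is
    a textbook test, not necessarily what signif implements.
    Returns (z, p_value).
    """
    p_a = errors_a / n
    p_b = errors_b / n
    # Pooled error rate under the null hypothesis.
    pooled = (errors_a + errors_b) / (2 * n)
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if std_err == 0:
        # Both systems made no errors, or both failed everywhere.
        return 0.0, 1.0
    z = (p_a - p_b) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up sense tags and error counts, for illustration only.
    ita = inter_annotator_agreement(["s1", "s2", "s1", "s3"],
                                    ["s1", "s2", "s3", "s3"])
    z, p = two_proportion_z_test(errors_a=200, errors_b=150, n=1000)
    print(f"ITA = {ita:.2f}, z = {z:.2f}, p = {p:.4f}")
```

Treating the two runs as independent experiments follows the wording of note 3. A paired test such as McNemar's would instead use the fact that both systems are evaluated on the same instances.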

References

  • Agirre, E., & Edmonds, P. (2007). Word sense disambiguation: Algorithms and applications. Text, Speech and Language Technology Series (Vol. 33). Springer, Netherlands. ISBN: 978-1-4020-6870-6.

  • Berger, A. L., Della Pietra, S. A., & Della Pietra, V. J. (1996). A maximum entropy approach to natural language processing. Computational Linguistics, 22(1), 39–71.

  • Bikel, D. M. (2002). Design of a multi-lingual, parallel-processing statistical parsing engine. In Proceedings of HLT 2002, San Diego, CA.

  • Bikel, D. M., Schwartz, R., & Weischedel, R. M. (1999). An algorithm that learns what’s in a name. Machine Learning, 34(1–3). Special Issue on Natural Language Learning.

  • Cai, J. F., Lee, W. S., & Teh, Y. W. (2007). NUS-ML: Improving word sense disambiguation using topic features. In Proceedings of the 4th international workshop on Semantic Evaluations (SemEval 2007), Prague, Czech Republic (pp. 249–252).

  • Carpuat, M., & Wu, D. (2007). Improving statistical machine translation using word sense disambiguation. In Proceedings of the 2007 joint conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (pp. 61–72).

  • Chan, Y. S., Ng, H. T., & Chiang, D. (2007, June). Word sense disambiguation improves statistical machine translation. In Proceedings of the 45th annual meeting of the association of computational linguistics, Prague, Czech Republic. Association for Computational Linguistics (pp. 33–40).

  • Chen, J. (2006). Towards high-performance word sense disambiguation by combining rich linguistic knowledge and machine learning approaches. PhD Thesis, University of Pennsylvania.

  • Chen, J., Dligach, D., & Palmer, M. (2007). Towards large-scale high-performance English verb sense disambiguation by using linguistically motivated features. In Proceedings of the international conference on semantic computing (ICSC 2007). Irvine, CA.

  • Chen, J., & Palmer, M. (2005, October 11–13). Towards robust high performance word sense disambiguation of English verbs using rich linguistic features. In Proceedings of the 2nd international joint conference on natural language processing, Jeju Island, Korea.

  • Chen, S. F., & Rosenfeld, R. (1999). A Gaussian prior for smoothing maximum entropy models. Technical Report CMU-CS-99-108, CMU.

  • Chen, J., Schein, A., Ungar, L., & Palmer, M. (2006). An empirical study of the behavior of word sense disambiguation. In Proceedings of NAACL-HLT 2006, NY, 2006.

  • Dang, H. T. (2004). Investigations into the role of lexical semantics in word sense disambiguation. PhD Thesis, University of Pennsylvania.

  • Dang, H. T., & Palmer, M. (2005, June 26–28). The role of semantic roles in disambiguating verb senses. In Proceedings of the 43rd annual meeting of the association for computational linguistics, Ann Arbor, MI.

  • Dang, H. T., & Palmer, M. (2002). Combining contextual features for word sense disambiguation. In Proceedings of the SIGLEX/SENSEVAL workshop on WSD: Recent successes and future directions, in conjunction with ACL-02, Philadelphia.

  • Duffield, C. J., Hwang, J. D., Brown, S. W., Dligach, D., Vieweg, S. E., Davis, J., & Palmer, M. (2007). Criteria for the manual grouping of verb senses. In Linguistics annotation workshop, held in conjunction with ACL-2007, Prague, The Czech Republic.

  • Edmonds, P., & Cotton, S. (2001). SENSEVAL-2: Overview. In Proceedings of SENSEVAL-2: 2nd international workshop on evaluating WSD systems. ACL-SIGLEX, Toulouse, France.

  • Fellbaum, C. (1998). WordNet—an electronic lexical database. Cambridge, MA/London: The MIT Press.

  • Fellbaum, C., Delfs, L., Wolff, S., & Palmer, M. (2005). Word meaning in dictionaries, corpora, and the speaker’s mind. In G. Barnbrook, P. Danielsson, & M. Mahlberg (Eds.), Meaningful texts: The extraction of semantic information from monolingual and multilingual corpora (pp. 31–38). Birmingham, UK: Birmingham University Press.

  • Fellbaum, C., Palmer, M., Dang, H. T., Delfs, L., & Wolf, S. (2001, June 2, 3). Manual and automatic semantic annotation with WordNet. In SIGLEX workshop on WordNet and other lexical resources (NAACL-01), Invited talk, Pittsburgh, PA.

  • Gonzalo, J., Verdejo, F., Chugur, I., & Cigarran, J. (1998). Indexing with WordNet synsets can improve text retrieval. In Proceedings of the COLING/ACL’98 workshop on usage of WordNet for NLP, Montreal, Canada.

  • Hanks, P. (1996). Contextual dependencies and lexical sets. The International Journal of Corpus Linguistics, 1, 1.

  • Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., & Weischedel, R. (2006). OntoNotes: The 90% solution. In Proceedings of HLT-NAACL06, New York.

  • Ide, N., & Veronis, J. (1998). Introduction to the special issue on word sense disambiguation: The state of the art. Computational Linguistics, 24(1), 1–40.

  • Kipper, K., Korhonen, A., Ryant, N., & Palmer, M. (2006). Extensive classifications of English verbs. In Proceedings of the 12th EURALEX international congress, Turin, Italy.

  • Lappin, S., & Leass, H. (1994). An algorithm for pronominal anaphora resolution. Computational Linguistics, 20(4), 535–561.

  • Lee, Y. K., & Ng, H. T. (2002). An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation. In Proceedings of the conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 41–48).

  • Lee, Y. K., Ng, H. T., & Chia, T. K. (2004). Supervised word sense disambiguation with support vector machines and multiple knowledge sources. In Proceedings of SENSEVAL-3: Third international workshop on the evaluation of systems for the semantic analysis of text, Barcelona, Spain (pp. 137–140).

  • Levin, B. (1993). English verb classes and alternations: A preliminary investigation. Chicago: University of Chicago Press.

  • Lucke, J. F., & Embretson, S. (1984). The biases and mean squared errors of estimators of multinormal squared multiple correlation. Journal of Educational Statistics, 9(3), 183–192. doi:10.2307/1165005.

  • Marcus, M., Kim, G., Marcinkiewicz, M. A., MacIntyre, R., Ferguson, M., Katz, K., et al. (1994). The Penn Treebank: Annotating predicate argument structure. In Proceedings of the ARPA’94 HLT workshop.

  • McCallum, A. K. (2002). MALLET: A machine learning for language toolkit. http://www.cs.umass.edu/~mccallum/mallet.

  • Mihalcea, R., Chklovski, T., & Kilgarriff, A. (2004, July). The Senseval-3 English lexical sample task. In Proceedings of Senseval-3: The third international workshop on the evaluation of systems for the semantic analysis of text, Barcelona, Spain.

  • Navigli, R. (2006, July 17–18). Meaningful clustering of senses helps boost word sense disambiguation performance. In Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the ACL, Sydney, Australia (pp. 105–112).

  • Navigli, R., Litkowski, K. C., & Hargraves, O. (2007, June). SemEval-2007 Task 07: Coarse-grained English all-words task. In Proceedings of SemEval, held in conjunction with ACL 2007, Prague, Czech Republic.

  • Palmer, M., Babko-Malaya, O., & Dang, H. T. (2004). Different sense granularities for different applications. In Proceedings of the 2nd workshop on scalable natural language understanding systems (HLT/NAACL 2004). Boston, MA.

  • Palmer, M., Dang, H., & Fellbaum, C. (2007, June). Making fine-grained and coarse-grained sense distinctions, both manually and automatically. Journal of Natural Language Engineering, 13(2), 137–163.

  • Palmer, M., Fellbaum, C., Cotton, S., Delfs, L., & Dang, H. T. (2001, July 5–6). English tasks: All-words and verb lexical sample. In Proceedings of SENSEVAL-2: Second international workshop on evaluating word sense disambiguation systems. Toulouse, France.

  • Palmer, M., Gildea, D., & Kingsbury, P. (2005). The proposition bank: A corpus annotated with semantic roles. Computational Linguistics, 31(1). doi:10.1162/0891201053630264.

  • Philpot, A., Hovy, E., & Pantel, P. (2005). The omega ontology. In Proceedings of the ONTOLEX workshop at the International Conference on Natural Language Processing (IJCNLP05). Jeju Island, Korea.

  • Pradhan, S., Loper, E., Dligach, D., & Palmer, M. (2007, June). SemEval-2007 task-17: English lexical sample, SRL and all words. In Proceedings of SemEval, held in conjunction with ACL 2007, Prague, Czech Republic.

  • Ratnaparkhi, A. (1998). Maximum entropy models for natural language ambiguity resolution. Ph.D. Thesis, University of Pennsylvania.

  • Sanderson, M. (1994). Word sense disambiguation and information retrieval. In Proceedings of the 17th International ACM SIGIR, Dublin, Ireland.

  • Sanderson, M. (2000). Retrieving with good sense. Information Retrieval, 2(1), 49–69.

  • Snyder, B., & Palmer, M. (2004, July). The English all-words task. In Proceedings of Senseval-3: The third international workshop on the evaluation of systems for the semantic analysis of text. Barcelona, Spain.

  • Stokoe, C., Oakes, M. P., & Tait, J. (2003). Word sense disambiguation and information retrieval revisited. In Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, Toronto, Canada.

  • Yarowsky, D. (1993). One sense per collocation. In Proceedings of the 5th DARPA speech and natural language workshop.

  • Yarowsky, D., Cucerzan, S., Florian, R., Schafer, C., & Wicentowski, R. (2001). The Johns Hopkins SENSEVAL2 system description. In Proceedings of SENSEVAL-2: 2nd international workshop on evaluating WSD systems, Toulouse, France.

  • Yarowsky, D., & Florian, R. (2002). Evaluating sense disambiguation across diverse parameter spaces. Journal of Natural Language Engineering, 8(4), 293–310.

  • Yi, S.-t., Loper, E., & Palmer, M. (2007, April). Can semantic roles generalize across genres? In Proceedings of NAACL 2007, Rochester, NY.

  • Zhong, Z., Ng, H. T., & Chan, Y. S. (2008, October). Word sense disambiguation using OntoNotes: An empirical study. In Proceedings of EMNLP 2008, Waikiki, Honolulu, HI.

Acknowledgements

We gratefully acknowledge the support of the National Science Foundation Grant NSF-0415923, Word Sense Disambiguation, and Defense Advanced Research Projects Agency (DARPA/IPTO) under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022, subcontract from BBN, Inc. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Jinying Chen.

About this article

Cite this article

Chen, J., Palmer, M.S. Improving English verb sense disambiguation performance with linguistically motivated features and clear sense distinction boundaries. Lang Resources & Evaluation 43, 181–208 (2009). https://doi.org/10.1007/s10579-009-9085-0

  • DOI: https://doi.org/10.1007/s10579-009-9085-0
