Generating Context Templates for Word Sense Disambiguation

  • Samuel W. K. Chan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8272)

Abstract

This paper presents a novel approach to generating context templates for word sense disambiguation (WSD). Context information for an ambiguous word, in the form of feature vectors, is first classified into coarse-grained semantic categories by topic features using the latent Dirichlet allocation (LDA) algorithm. To further refine the sense tags, all feature vectors of the ambiguous word under the same topic are recast into a network. Various centrality measures are derived to identify the features, or context words, in the context templates that are most influential in the disambiguation. WSD is achieved by identifying the maximum pairwise similarities between the context encoded in the templates and the sentence. The correct sense of an ambiguous word is resolved by selecting the most activated template, without being trapped in a subjective linguistic quagmire. The approach is assessed on a corpus of more than 1,000,000 words. Experimental results show that the best measures perform comparably to the state of the art.
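
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the LDA step has already grouped the contexts by topic, uses weighted degree as a stand-in for the unspecified centrality measures, and substitutes Jaccard overlap for the pairwise similarity. All function names, sense labels, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_network(contexts):
    """Co-occurrence network: edge weight = number of contexts containing both words."""
    edges = Counter()
    for words in contexts:
        for a, b in combinations(sorted(set(words)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Weighted degree per word (assumed here; the paper derives several measures)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


def make_template(contexts, size):
    """Keep the `size` most central context words; ties are broken alphabetically."""
    degree = degree_centrality(build_network(contexts))
    ranked = sorted(degree, key=lambda w: (-degree[w], w))
    return set(ranked[:size])


def jaccard(a, b):
    """Set overlap, used as a simple stand-in for the paper's similarity."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def disambiguate(sentence_words, templates):
    """Return the sense whose template is most activated by the sentence."""
    return max(sorted(templates), key=lambda s: jaccard(templates[s], sentence_words))


# Hypothetical toy contexts for the ambiguous word "bank", already grouped by sense.
contexts_by_sense = {
    "bank/finance": [
        ["money", "loan", "account"],
        ["loan", "interest", "money"],
        ["account", "money", "deposit"],
    ],
    "bank/river": [
        ["river", "water", "shore"],
        ["water", "fishing", "river"],
        ["shore", "river", "boat"],
    ],
}
templates = {s: make_template(c, 3) for s, c in contexts_by_sense.items()}

sentence = ["he", "opened", "an", "account", "to", "deposit", "money"]
print(disambiguate(sentence, templates))
```

In this toy setting, the template for each sense keeps only the three best-connected context words, so a sentence activates a template through the words they share.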

Keywords

Sense tagging; network-based approach; latent Dirichlet allocation

Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Samuel W. K. Chan (1)
  1. The Chinese University of Hong Kong, Hong Kong SAR, Hong Kong
