Topic Modeling for Word Sense Induction

  • Johannes Knopp
  • Johanna Völker
  • Simone Paolo Ponzetto
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8105)


In this paper, we present a novel approach to Word Sense Induction which is based on topic modeling. Key to our methodology is the use of word-topic distributions as a means to estimate sense distributions. We provide these distributions as input to a clustering algorithm in order to automatically distinguish between the senses of semantically ambiguous words. The results of our evaluation experiments indicate that the performance of our approach is comparable to state-of-the-art methods whose sense distinctions are not as easily interpretable.
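The pipeline outlined in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes each occurrence of an ambiguous word is represented by averaging the topic distributions of its context words, and that the resulting vectors are grouped with a small k-means routine. The abstract specifies neither the aggregation step nor the clustering algorithm, so both are assumptions. The hand-written `WORD_TOPICS` table stands in for distributions that would normally come from a trained topic model, and all names (`WORD_TOPICS`, `context_vector`, `kmeans`, `induce_senses`) are hypothetical.

```python
from math import dist

# Hypothetical P(topic | word) vectors over 3 topics. In practice these would
# be read off a trained topic model rather than written by hand.
WORD_TOPICS = {
    "river":   [0.90, 0.05, 0.05],
    "water":   [0.80, 0.10, 0.10],
    "shore":   [0.85, 0.05, 0.10],
    "money":   [0.05, 0.90, 0.05],
    "loan":    [0.05, 0.85, 0.10],
    "account": [0.10, 0.80, 0.10],
}


def context_vector(context, word_topics):
    """Average the topic distributions of the known words in one context."""
    vectors = [word_topics[w] for w in context if w in word_topics]
    if not vectors:
        raise ValueError("no context word has a topic distribution")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(points, k, iterations=20):
    """Small deterministic k-means (assumed clustering step).

    Centroids start at the first point, then at the point farthest from the
    centroids chosen so far; standard assign/update rounds follow.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members (if it has any).
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def induce_senses(contexts, word_topics, k):
    """Cluster the contexts of one ambiguous word into k induced senses."""
    return kmeans([context_vector(c, word_topics) for c in contexts], k)


# Four toy contexts of the ambiguous word "bank" (target word removed).
CONTEXTS = [
    ["river", "water"],
    ["money", "loan"],
    ["shore", "river"],
    ["account", "money"],
]
labels = induce_senses(CONTEXTS, WORD_TOPICS, k=2)
print(labels)
```

On this toy input the two river-related contexts fall into one cluster and the two finance-related contexts into the other. Because every dimension of a context vector is a topic, the dominant topics of each cluster centroid can be read as a description of the induced sense, which is the kind of interpretability the abstract refers to.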


Keywords: word sense induction, topic models, lexical semantics





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Johannes Knopp (1)
  • Johanna Völker (1)
  • Simone Paolo Ponzetto (1)

  1. Data & Web Science Research Group, University of Mannheim, Germany
