In Search of Semantic Compositionality in Vector Spaces

  • Conference paper
Conceptual Structures: Leveraging Semantic Technologies (ICCS 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5662)

Abstract

Although geometric models of meaning are widely used in computational linguistics and information retrieval research, until recently they have been applied mostly to modeling lexical meaning. The ability to combine concepts, however, is an essential capacity of human language, and any semantic theory should be able to account for it.

Using Word Space Models (Schütze 1998) and Random Indexing (Sahlgren 2005), we explore the hypothesis that such models can capture compositional meaning. We apply a number of mathematical operations for vector composition (summation, component-wise product, tensor product, and convolution) to model semantic composition in a multiword unit identification task.
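The four composition operations named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names and toy vectors are hypothetical, the vectors are assumed to have equal dimensionality, and "convolution" is assumed here to mean circular convolution, which the abstract does not specify.

```python
def vec_sum(u, v):
    """Summation: add corresponding components."""
    return [a + b for a, b in zip(u, v)]


def component_product(u, v):
    """Component-wise product: multiply corresponding components."""
    return [a * b for a, b in zip(u, v)]


def tensor_product(u, v):
    """Tensor (outer) product, flattened row by row into len(u) * len(v) components."""
    return [a * b for a in u for b in v]


def circular_convolution(u, v):
    """Circular convolution (an assumed reading of 'convolution').

    Component i is the sum over j of u[j] * v[(i - j) mod n], so the result
    has the same dimensionality n as the inputs.
    """
    n = len(u)
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


# Toy word vectors, invented for illustration only.
u = [1, 2, 0]
v = [0, 1, 3]

print(vec_sum(u, v))               # [1, 3, 3]
print(component_product(u, v))     # [0, 2, 0]
print(tensor_product(u, v))        # [0, 1, 3, 0, 2, 6, 0, 0, 0]
print(circular_convolution(u, v))  # [6, 1, 5]
```

Summation, component-wise product, and circular convolution keep the dimensionality of the input vectors, whereas the tensor product of two n-dimensional vectors has n² components.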

This work was supported by the German Federal Ministry of Economics (BMWi) under the Theseus project (number 01MQ07019).

References

  • Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. ACM Press, Addison-Wesley (1999)

  • Baldwin, T., Bannard, C., Tanaka, T., Widdows, D.: An empirical model of multiword expression decomposability. In: Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment (2003)

  • Bannard, C., Baldwin, T., Lascarides, A.: A statistical approach to the semantics of verb-particles. In: Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan (2003)

  • Berry, M., Drmac, Z., Jessup, E.R.: Matrices, Vector Spaces, and Information Retrieval. SIAM Review 41(2), 335–362 (1999)

  • Clark, S., Pulman, S.: Combining symbolic and distributional models of meaning. In: Proceedings of the AAAI Spring Symposium on Quantum Interaction, Stanford, CA, pp. 52–55 (2007)

  • Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science 41(6), 391–407 (1990)

  • Dowty, D.R., Wall, R., Peters, S.: Introduction to Montague Semantics. Kluwer Academic Publishers, Dordrecht (1981)

  • Evert, S., Krenn, B.: Methods for the qualitative evaluation of lexical association measures. In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (2001)

  • Evert, S.: The statistics of word cooccurrences: word pairs and collocations. Ph.D. Thesis, University of Stuttgart (2004)

  • Firth, J.: A synopsis of linguistic theory 1930-1955. Studies in Linguistic Analysis, pp. 1–32. Longman (1957)

  • Frege, G.: Letter to Jourdain. In: Gabriel, G., et al. (eds.) Philosophical and Mathematical Correspondence, Chicago University Press 1980 (1914)

  • Gärdenfors, P.: Conceptual Spaces: The Geometry of Thought. The MIT Press, Cambridge (2004)

  • Jones, S., Sinclair, J.M.: English lexical collocations. Cahiers de Lexicologie 24 (1974)

  • Katz, G., Giesbrecht, E.: Automatic identification of non-compositional multiword expressions using Latent Semantic Analysis. In: Proceedings of the ACL/Coling Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties (2006)

  • Kintsch, W.: Predication. Cognitive Science 25(2) (2001)

  • Krenn, B.: The usual suspects: data-oriented models for the identification and representation of lexical collocations. In: Dissertations in Computational Linguistics and Language Technology, German Research Center for Artificial Intelligence and Saarland University, Saarbrücken, Germany (2000)

  • Landauer, T.K., Dumais, S.T.: A solution to Plato’s problem: the Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211–240 (1997)

  • Lin, D.: Automatic identification of noncompositional phrases. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (1999)

  • Manning, C., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press, Cambridge (1999)

  • Mitchell, J., Lapata, M.: Vector-based models of semantic composition. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 236–244 (2008)

  • Sag, I.A., Baldwin, T., Bond, F., Copestake, A., Flickinger, D.: Multiword expressions: a pain in the neck for NLP. In: Gelbukh, A. (ed.) CICLing 2002. LNCS, vol. 2276, p. 1. Springer, Heidelberg (2002)

  • Sahlgren, M.: An Introduction to Random Indexing. In: Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, TKE, Copenhagen, Denmark, August 16 (2005)

  • Schone, P., Jurafsky, D.: Is knowledge-free induction of multiword unit dictionary headwords a solved problem? In: Proceedings of Empirical Methods in Natural Language Processing, Pittsburgh, PA (2001)

  • Schütze, H.: Automatic word sense discrimination. Computational Linguistics 24(1), 97–124 (1998)

  • Smolensky, P., Legendre, G.: The Harmonic Mind: from Neural Computation to Optimality-Theoretic Grammar. MIT Press, Cambridge (2006)

  • Sowa, J.F.: Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks Cole Publishing Co., Pacific Grove (2000)

  • Widdows, D.: Semantic vector products: some initial investigations. In: Proceedings of the Second AAAI Symposium on Quantum Interaction (2008)

  • Widdows, D., Ferraro, K.: Semantic vectors: a scalable open source package and online technology management application. In: Proceedings of the Sixth International Language Resources and Evaluation (LREC 2008) (2008)


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Giesbrecht, E. (2009). In Search of Semantic Compositionality in Vector Spaces. In: Rudolph, S., Dau, F., Kuznetsov, S.O. (eds) Conceptual Structures: Leveraging Semantic Technologies. ICCS 2009. Lecture Notes in Computer Science, vol 5662. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03079-6_14

  • DOI: https://doi.org/10.1007/978-3-642-03079-6_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03078-9

  • Online ISBN: 978-3-642-03079-6

  • eBook Packages: Computer Science, Computer Science (R0)
