Filaments of Meaning in Word Space
Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning, and that the interesting semantic relations reside in the local structure of word spaces. We show that this local structure differs substantially in dimensionality and character from the global space, and that it shows potential for further semantic analysis using methods for local analysis of vector space structure rather than the globally scoped methods typically in use today, such as singular value decomposition or principal component analysis.
Keywords: Latent Semantic Analysis, Vector Space Model, Retrieval Practice, Left Tail, Random Indexing
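
The contrast the abstract draws between global and local dimensionality can be illustrated with a small sketch. This is not the authors' method: it uses local principal component analysis over nearest neighbours as a stand-in for "local analysis", and it runs on synthetic data (thin one-dimensional filaments scattered through a 100-dimensional space) rather than on a real word space. The function names `pca_dimension` and `local_dimension`, the neighbourhood size `k`, and the 90% variance threshold are all assumptions made for illustration.

```python
import numpy as np


def pca_dimension(points, variance_kept=0.9):
    """Count the principal components needed to retain `variance_kept` of the variance."""
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2
    total = explained.sum()
    if total == 0:
        return 0
    cumulative = np.cumsum(explained) / total
    # First index whose cumulative ratio reaches the threshold, as a count.
    count = int(np.searchsorted(cumulative, variance_kept)) + 1
    return min(count, len(cumulative))


def local_dimension(points, index, k=20, variance_kept=0.9):
    """Apply the same estimate to the k nearest neighbours of one point only."""
    distances = np.linalg.norm(points - points[index], axis=1)
    neighbours = np.argsort(distances)[:k]  # includes the point itself
    return pca_dimension(points[neighbours], variance_kept)


# Synthetic stand-in for a word space (an assumption, not real data):
# ten short line segments with random positions and directions in 100 dimensions.
rng = np.random.default_rng(0)
ambient_dim, n_filaments, per_filament = 100, 10, 50
segments = []
for _ in range(n_filaments):
    centre = rng.normal(scale=10.0, size=ambient_dim)
    direction = rng.normal(size=ambient_dim)
    direction /= np.linalg.norm(direction)
    t = rng.uniform(-1.0, 1.0, size=(per_filament, 1))
    noise = rng.normal(scale=0.001, size=(per_filament, ambient_dim))
    segments.append(centre + t * direction + noise)
space = np.vstack(segments)

global_dim = pca_dimension(space)
local_dims = [local_dimension(space, i) for i in range(0, len(space), 25)]

print("global estimate:", global_dim)
print("local estimates:", local_dims)
```

In this toy setting the global estimate is driven by how the filaments are spread through the space, while each local estimate reflects only the filament a point lies on. Whether real word spaces behave the same way is the empirical question the paper addresses; the sketch only shows how the two kinds of estimate can diverge.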