Extracting Semantic Representations from Large Text Corpora
Many connectionist language processing models have now reached a level of detail at which more realistic representations of semantics are required. In this paper we discuss the extraction of semantic representations from the word co-occurrence statistics of large text corpora and present a preliminary investigation into the validation and optimisation of such representations. We find that there is significantly more variation across the extraction procedures and evaluation criteria than is commonly assumed.
Keywords: Window Size, Target Word, Lexical Decision, Semantic Representation, Distance Ratio
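As a rough illustration of the kind of procedure the abstract refers to, the sketch below builds co-occurrence count vectors for each target word from a toy token list and compares two words by distance. The function names, the symmetric window, the use of raw counts, and the Euclidean distance are all assumptions made for this example; the abstract does not specify the paper's actual procedure, and it notes that such choices vary considerably.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(tokens, window=2):
    """Count the context words within `window` positions on either side of each target word.

    Hypothetical helper: a symmetric window with raw counts is an
    illustrative choice, not the procedure used in the paper.
    """
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[target][tokens[j]] += 1
    return counts


def euclidean(u, v):
    """Euclidean distance between two sparse count vectors (Counter objects)."""
    keys = set(u) | set(v)
    return sum((u[k] - v[k]) ** 2 for k in keys) ** 0.5


# Toy corpus; a real study would use a large corpus such as the BNC.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vectors = cooccurrence_vectors(tokens, window=2)
distance = euclidean(vectors["cat"], vectors["dog"])
```

Changing `window`, or replacing the raw counts with normalised probabilities, gives different vectors and distances, which is the kind of procedural variation the abstract highlights.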