From short-term memory to semantics: a computational model
Abstract
Clinical disorders of language, known as aphasias, impair comprehension of language in written and spoken forms. This impairment is due to the patient's inability to process the semantics that arise from the sequence-independent co-occurrence of words with the content of a short-term memory (STM) of preceding words. If W_i is the immediately forthcoming word in the input to the patient, then STM, in the context of this disorder, consists of a window, STMWin, that contains the k words immediately preceding W_i. We use a generative approach to model the semantics that ensue from the co-occurrence of W_i and STMWin, and view these semantics as the output of a random process with parameters θ. The model uses supervised learning to maximize the likelihood of θ, given labeled content in STMWin. Experimental validation on standard text classification data sets gives an accuracy that is comparable to, or better than, that obtained using support vector machines (SVMs).
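As a rough illustration of the window defined above, the following sketch pairs each word W_i with the k words that precede it and tallies co-occurrences without regard to position inside the window. It is not the model from the paper: the function names `stm_pairs` and `cooccurrence_counts` are hypothetical, the treatment of the first k words (a shorter window) is an assumption, and the generative model and its parameters θ are not reproduced here.

```python
from collections import Counter, deque


def stm_pairs(words, k):
    """Yield (w_i, window), where window holds up to k words preceding w_i.

    Assumption: at the start of the input the window is simply shorter
    than k; the abstract does not say how this case is handled.
    """
    window = deque(maxlen=k)  # oldest word is dropped automatically
    for w in words:
        yield w, tuple(window)
        window.append(w)


def cooccurrence_counts(words, k):
    """Count co-occurrences of each word with the words in its window.

    Position inside the window is ignored (sequence independent), and a
    word repeated within one window is counted once for that window.
    """
    counts = Counter()
    for w_i, window in stm_pairs(words, k):
        for prev in set(window):
            counts[(w_i, prev)] += 1
    return counts


if __name__ == "__main__":
    sentence = "the dog chased the cat".split()
    for w_i, window in stm_pairs(sentence, k=2):
        print(w_i, window)
    print(cooccurrence_counts(sentence, k=2))
```

Counts of this kind could serve as sufficient statistics for a simple generative estimate, but how the paper maps them to θ is not stated in the abstract.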
Keywords
Semantics, Memory, Learning, Language, Classification