
International Journal of Speech Technology, Volume 13, Issue 4, pp 201–218

Deliberate word access: an intuition, a roadmap and some preliminary empirical results

  • Michael Zock
  • Olivier Ferret
  • Didier Schwab
Article

Abstract

Words undoubtedly play a major role in language production, so finding them is of vital importance, be it for writing or for speaking (spontaneous discourse production, simultaneous translation). Words are stored in a dictionary, and the general belief holds that the more entries, the better. Yet to be truly useful, the resource should contain not only many entries and a lot of information about each of them, but also adequate navigational means to reveal the stored information. Information access depends crucially on the organization of the data (words) and on the access keys (meaning/form), two factors that have been largely overlooked. We present here some ideas on how an existing electronic dictionary could be enhanced to help a speaker or writer find the word s/he is looking for. To this end, we suggest adding to an existing electronic dictionary an index based on the notion of association, i.e. words co-occurring in a well-balanced corpus, the latter being supposed to represent the average citizen’s knowledge of the world. Before describing our approach, we briefly take a critical look at the work done by colleagues on automatic, spontaneous or deliberate language production, that is, computer-generated language, simulation of the mental lexicon, or WordNet (WN), to see how adequate it is with regard to our goal.
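As a rough illustration of what an association index built from co-occurrences might look like, the following Python sketch counts which words appear near each other in a corpus and then ranks candidate target words for a set of cue words. It is a minimal sketch under stated assumptions, not the authors' method: the function names `build_association_index` and `candidates`, the window size, the use of raw counts as association strength, and the four-sentence toy corpus are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def build_association_index(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it.

    Assumption: raw co-occurrence counts stand in for association strength.
    """
    index = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    index[word][tokens[j]] += 1
    return index


def candidates(index, cues, top_n=5):
    """Rank words by their summed co-occurrence count with all cue words."""
    scores = Counter()
    for cue in cues:
        scores.update(index.get(cue, {}))
    # The cue words themselves are not useful as suggestions.
    for cue in cues:
        scores.pop(cue, None)
    return [word for word, _ in scores.most_common(top_n)]


# Hypothetical toy corpus; a real index would be built from a large, balanced corpus.
toy_corpus = [
    "strong black coffee",
    "black coffee with sugar",
    "hot coffee cup",
    "hot tea cup",
]

index = build_association_index(toy_corpus)
print(candidates(index, ["black", "sugar"]))
```

A user who cannot recall a word would supply the cues that do come to mind, and the index would return the words most strongly associated with them as candidates to browse.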

Keywords

Lexical access; Index based on associations; Mental lexicon; Navigation



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Laboratoire d’Informatique Fondamentale (LIF), CNRS & Aix-Marseille Université, Marseille Cedex 9, France
  2. Vision and Content Engineering Laboratory, CEA, LIST, Fontenay-aux-Roses, France
  3. Laboratoire d’Informatique de Grenoble, équipe GETALP, Université Grenoble 2, Grenoble Cedex 9, France
