Lexique 2: A new French lexical database

  • Boris New
  • Christophe Pallier
  • Marc Brysbaert
  • Ludovic Ferrand

Abstract

In this article, we present a new lexical database for French: Lexique. In addition to classical word information such as gender, number, and grammatical category, Lexique includes a series of interesting new characteristics. First, word frequencies are based on two cues: a contemporary corpus of texts and the number of Web pages containing the word. Second, the database is split into a graphemic table with all the relevant frequencies, a table structured around lemmas (particularly interesting for the study of the inflectional family), and a table about surface frequency cues. Third, Lexique is distributed under a GNU-like license, allowing people to contribute to it. Finally, a metasearch engine, Open Lexique, has been developed so that new databases can be added very easily to the existing ones. Lexique can either be downloaded or interrogated freely from http://www.lexique.org.

Supplementary material

New-BRM-2004 links.txt (0 kb)
Supplementary material, approximately 340 KB.

Copyright information

© Psychonomic Society, Inc. 2004

Authors and Affiliations

  • Boris New (1, 2)
  • Christophe Pallier (3)
  • Marc Brysbaert (1)
  • Ludovic Ferrand (2)

  1. Royal Holloway, University of London, London, England
  2. CNRS and Université René Descartes, Paris, France
  3. INSERM, Paris, France
