The semantic organization of the animal category: evidence from semantic verbal fluency and network theory

  • Research Report
Cognitive Processing

Abstract

Semantic memory is the subsystem of human memory that stores knowledge of concepts or meanings, as opposed to life-specific experiences. How humans organize semantic information remains poorly understood. To better understand this issue, we conducted a verbal fluency experiment with 200 participants, aiming to infer the conceptual storage structure of the natural category of animals and represent it as a network. To this end, we formulated a statistical framework for co-occurring concepts that infers significant concept–concept associations and represents them as a graph. The resulting network was analyzed and enriched by means of a missing-link recovery criterion based on modularity. Both network models were compared to a thresholded co-occurrence approach. They were evaluated on a random subset of verbal fluency tests by comparing the network outcomes (linked pairs are clustering transitions and disconnected pairs are switching transitions) to the judgments of two expert human raters. Results show that the network models proposed in this study outperform the thresholded co-occurrence approach, and their outcomes are in high agreement with human evaluations. Finally, the interplay between conceptual structure and retrieval mechanisms is discussed.

Notes

  1. The absence of self-loops ensures that all entries on the main diagonal (the \(a_{ii}\) entries) are 0.

  2. Performing a hierarchical clustering directly on the adjacency matrix and setting a threshold in the dendrogram is among the most basic and common approaches used to find modules. Nevertheless, it must be acknowledged that inferred adjacency matrices from empirical data are often noisy or incomplete. This severely affects hierarchical clustering evaluation and misleads the selection of an accurate cutoff value for module detection.

  3. For any value of m, the GTOM output is a normalized overlap matrix with values between 0 and 1 that captures, for every pair of nodes, the interconnectedness they share.

  4. Every word was converted to its singular form and three pure synonyms were unified. Finally, one word that was not an animal was removed.

  5. While methodologies based on co-occurrences have been successfully used to study language networks (Solé et al. 2010), it is important to remark that syntactic constraints severely reduce the possible orderings of items in language, compared with verbal fluency outputs, where the position of concepts is unrestricted.

  6. For instance, l = 1 indicates that the two words are consecutive. In general, l = n indicates that there are n − 1 words between the two words under study.

  7. A more individualized approach could be taken by assessing individual test sizes instead.

  8. It is assumed that sequences, i.e. tests, do not contain repeated elements. In the unlikely event of finding a word repeated in a test, neighborhoods for all appearances are considered to obtain co-occurrences.

  9. It is straightforward to see that, when \(l=N-1\), \(P_{w_{i},w_{j}}^{(\le l)}= 2\sum_{i=1}^{N-1} \frac{N-i}{\left[N\atop 2\right]}=1\).

  10. Setting l = 1 would only consider associations between strictly consecutive words, which are more likely to be related than more distant concepts. However, because the order in which related concepts are named is highly variable, a large dataset would be required to capture most relationships. One way to overcome this issue is to increase the parameter l. Larger windows provide more candidates for establishing relationships between words but, at the same time, they reduce the significance of nearby concepts (method explained below) and are more likely to induce meaningless co-occurrences.

  11. For instance, a word named once would be automatically linked to any word named fewer than 32 times, considering that N = 31.57 and l = 2 in our dataset.

  12. Removing 39% of distinct words might seem a severe filtering, but these words represented only 3.5% of all word occurrences within the tests, as they were very low-frequency items. Such a small reduction of evidence is indeed one step ahead of previous works, where semantic distance approaches have been applied either to words said by a minimum of around 30% of participants or to the most named words (threshold set around 12 occurrences) (Henley 1969; Chan et al. 1993; Aloia et al. 1996; Schwartz and Baldo 2001; Prescott et al. 2006).

  13. Those words with no significant interactions were not included in the network (4 words) since they represented isolated words that prevent a network analysis. Additionally, the isolated pair eel-elver was also removed for the same reason, leaving a total of 236 nodes in the network.

  14. Raters had experience in evaluating verbal fluency tests in healthy controls and neurological patients. They were asked to judge whether each transition between two words was between animals from the same or different subcategories, and were given for guidance two articles with rules on how to evaluate clustering and switching (Troyer 2000; Villodre et al. 2006). Raters were blind to the results produced by the in-silico evaluations.

  15. These figures are close to the 423 distinct animals, with 175 named only once, obtained elsewhere from 21 participants during 10 min (Henley 1969), and might indicate the average size of the human lexicon in the category of animals.

  16. The information regarding modularity provided by this matrix is the presence or absence of discrete blocks along the diagonal. When there is no modularity in a network, as occurs in random graphs, no blocks appear regardless of the number of neighborhood expansions, until the whole graph forms a single module. For those networks where modularity emerges, the hierarchical clustering cutoff (0.58 in our data) must be selected to separate those blocks as well as possible in order to obtain a feasible partition of the network into modules.
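The identity in note 9 can be checked numerically. The sketch below is a minimal illustration, not the authors' code. It assumes that the bracketed denominator counts the N(N − 1) ordered pairs of positions, so that two given words lie exactly d positions apart in a uniformly random ordering of N distinct words with probability 2(N − d)/(N(N − 1)). The function name prob_within is hypothetical.

```python
from fractions import Fraction


def prob_within(n_words: int, max_dist: int) -> Fraction:
    """Probability that two given words lie at most max_dist positions apart
    in a uniformly random ordering of n_words distinct words.

    Assumption: the denominator is the number of ordered position pairs,
    n_words * (n_words - 1); max_dist plays the role of the parameter l.
    """
    if not 1 <= max_dist <= n_words - 1:
        raise ValueError("max_dist must be between 1 and n_words - 1")
    ordered_pairs = n_words * (n_words - 1)
    # There are (n_words - d) position pairs at distance exactly d,
    # each of which can be occupied by the two words in 2 orders.
    return sum(
        (Fraction(2 * (n_words - d), ordered_pairs) for d in range(1, max_dist + 1)),
        Fraction(0),
    )


p_adjacent = prob_within(5, 1)  # l = 1: consecutive words only
p_full = prob_within(5, 4)      # l = N - 1: the whole test
```

Under this assumption, two given words in a test of N = 5 words are consecutive with probability 2/5, and the cumulative probability reaches 1 at l = N − 1 = 4, as the note states.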

References

  • Albert R, Barabási AL (2002) Statistical mechanics of complex networks. Rev Modern Phys 74:47

  • Aloia M, Gourovitch M, Weinberger D, Goldberg T (1996) An investigation of semantic space in patients with schizophrenia. J Int Neuropsychol Soc 2(4):267–273

  • Alvarez B, Cuetos F (2007) Objective age of acquisition norms for a set of 328 words in Spanish. Behav Res Methods 39(3):377–383

  • Anderson JR (1976) Language, memory and thought. Lawrence Erlbaum, Hillsdale

  • Anderson JR, Pirolli PL (1984) Spread of activation. J Exp Psychol Learn Mem Cogn 10(4):791–798

  • Ardila A, Ostrosky-Solís F (2006) Cognitive testing toward the future: the example of semantic verbal fluency (animals). Int J Psychol 41(5):324–332

  • Arenas A, Fernández A, Fortunato S, Gómez S (2008) Motif-based communities in complex networks. J Phys A Math Theor 41(5):224001

  • Batagelj V, Mrvar A (2002) Pajek—analysis and visualization of large networks, vol 2265. Springer, Berlin

  • Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang D (2006) Complex networks: structure and dynamics. Phys Rep 424:175–308

  • Boguñá M, Krioukov D, Claffy K (2009) Navigability of complex networks. Nat Phys 5(1):74–80

  • Borge-Holthoefer J, Arenas A (2010a) Categorizing words through semantic memory navigation. Eur Phys J B 74(2):265

  • Borge-Holthoefer J, Arenas A (2010b) Semantic networks: structure and dynamics. Entropy 12(5):1264–1302

  • Bousfield W, Barclay W (1950) The relationship between order and frequency of occurrence of restricted associative responses. J Exp Psychol 40(5):643–647

  • Bousfield W, Sedgewick C (1944) An analysis of sequences of restricted associative responses. J Gen Psychol 30:149–165

  • Budson AE, Price BH (2005) Memory dysfunction. N Engl J Med 352(7):692–699

  • Chan A, Butters N, Paulsen J, Salmon D, Swenson M, Maloney L (1993) An assessment of the semantic network in patients with Alzheimer's disease. J Cogn Neurosci 5:254–261

  • Chouinard P, Goodale M (2010) Category-specific neural processing for naming pictures of animals and naming pictures of tools: an ALE meta-analysis. Neuropsychologia 48(2):409–418

  • Clauset A, Moore C, Newman M (2008) Hierarchical structure and the prediction of missing links in networks. Nature 453(7191):98–101

  • Clopper C, Pearson S (1934) The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26:404–413

  • Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Measure 20(1):37–46

  • Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychol Rev 82(6):407–428

  • Collins AM, Quillian MR (1969) Retrieval time from semantic memory. J Verbal Learn Verbal Behav 8(2):240–248

  • Collins AM, Quillian MR (1970) Does category size affect categorization time? J Verbal Learn Verbal Behav 9(4):432–438

  • Crowe S, Prescott T (2003) Continuity and change in the development of category structure: insights from the semantic fluency task. Int J Behav Dev 27(5):467–479

  • Danon L, Duch J, Arenas A, Díaz-Guilera A (2007) Large scale structure and dynamics of complex networks: from information technology to finance and natural science. World Scientific, Singapore, pp 93–113

  • Eguíluz V, Chialvo D, Cecchi G, Baliki M, Apkarian A (2005) Scale-free brain functional networks. Phys Rev Lett 94(1):018102

  • Erdős P, Rényi A (1960) On the evolution of random graphs. Publ Math Inst Hung Acad Sci 5:17–61

  • Ferrer i Cancho R, Solé RV (2001) The small world of human language. Proc R Soc Lond Ser B Biol Sci 268(1482):2261–2265

  • Galeote M, Peraita H (1999) Memoria semántica y fluidez verbal en demencias. Revista Española de Neuropsicol 1(2–3):3–17

  • Goñi J, Martincorena I, Corominas-Murtra B, Arrondo G, Ardanza-Trevijano S, Villoslada P (2010) Switcher-random-walks: a cognitive inspired strategy for random exploration on networks. Int J Bifurc Chaos 20(3):913–922

  • Griffiths TL, Steyvers M, Tenenbaum JB (2007) Topics in semantic representation. Psychol Rev 114(2):211–244

  • Gruenewald P, Lockhead G (1980) The free recall of category examples. J Exp Psychol Hum Learn Mem 6(3):225–240

  • Guimera R, Sales-Pardo M (2009) Missing and spurious interactions and the reconstruction of complex networks. Proc Natl Acad Sci USA 106:22073–22078

  • Hayes-Roth B (1977) Evolution of cognitive structures and processes. Psychol Rev 84(3):260–278

  • Henley NM (1969) A psychological study of the semantics of animal terms. J Verbal Learn Verbal Behav 8:176–184

  • Jeong H, Mason S, Barabási AL, Oltvai Z (2001) Lethality and centrality in protein networks. Nature 411(6833):41–42

  • Lerner A, Ogrocki P, Thomas P (2009) Network graph analysis of category fluency testing. Cogn Behav Neurol 22(1):45–52

  • Lezak M (1995) Neuropsychological assessment, 3rd edn. Oxford University Press, New York

  • Lund K, Burgess C (1996) Producing high dimensional semantic spaces from lexical co-occurrence. Behav Res Methods Instrum Comput 28(2):203–208

  • Mestres J, Gregori-Puigjané E, Valverde S, Solé R (2008) Data completeness-the Achilles heel of drug-target networks. Nat Biotechnol 26(9):983–984

  • Motter AE, de Moura AP, Lan YC, Dasgupta P (2002) Topology of the conceptual network of language. Phys Rev E 65:065102

  • Newman M (2003) The structure and function of complex networks. SIAM Rev 45:167–256

  • Noh JD, Rieger H (2004) Random walks on complex networks. Phys Rev Lett 92(11)

  • Overschelde JPV, Rawson K, Dunlosky J (2004) Category norms: an updated and expanded version of the Battig and Montague (1969) norms. J Mem Lang 50:289–335

  • Patterson K, Nestor PJ, Rogers TT (2007) Where do you know what you know? The representation of semantic knowledge in the human brain. Nat Rev Neurosci 8:976–987

  • Prescott TJ, Newton LD, Mir NU, Woodruff PW, Parks RW (2006) A new dissimilarity measure for finding semantic structure in category fluency data with implications for understanding memory organization in schizophrenia. Neuropsychology 20(6):685–699

  • Quillian MR (1967) Word concepts: a theory and simulation of some basic semantic capabilities. Behav Sci 12(5):410–430

  • Raskin S, Sliwinski M, Borod J (1992) Clustering strategies on tasks of verbal fluency in Parkinson’s disease. Neuropsychologia 30(1):95–99

  • Ravasz E, Somera A, Mongru D, Oltvai Z, Barabási AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297(5586):1551–1555

  • Rips LJ, Shoben EJ, Smith EE (1973) Semantic distance and the verification of semantic relations. J Verbal Learn Verbal Behav 12:1–20

  • Rogers T, Lambon Ralph M, Garrard P, Bozeat S, McClelland J, Hodges J, Patterson K (2004) Structure and deterioration of semantic memory: a neuropsychological and computational investigation. Psychol Rev 111(1):205–235

  • Rosch E (1974) Linguistic relativity. In: Silverstein A (ed) Human communication: theoretical perspectives. Halsted Press, New York

  • Rosch E (1975) Cognitive representations of semantic categories. J Exp Psychol Gen 104(3):192–233

  • Rosch E, Mervis CB (1975) Family resemblances: studies in the internal structure of categories. Cogn Psychol 7:573–605

  • Rosch E, Simpson C, Miller RS (1976) Structural bases of typicality. J Exp Psychol Hum Percept Perform 2(4):491–502

  • Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci USA 105(4):1118–1123

  • Schwartz S, Baldo J (2001) Distinct patterns of word retrieval in right and left frontal lobe patients: a multidimensional perspective. Neuropsychologia 39(11):1209–1217

  • Schwartz S, Baldo J, Graves RE, Brugger P (2003) Pervasive influence of semantics in letter and category fluency: a multidimensional approach. Brain Lang 87:400–411

  • Shannon P, Markiel A, Ozier O, Baliga N, Wang J, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498–2504

  • Sigman M, Cecchi G (2002) Global organization of the WordNet lexicon. Proc Natl Acad Sci USA 99(3):1742–1747

  • Sloman SA (1998) Categorical inference is not a tree: the myth of inheritance hierarchies. Cogn Psychol 35:1–33

  • Solé RV, Corominas-Murtra B, Valverde S, Steels L (2010) Language networks: their structure, function and evolution. Complexity

  • Sporns O, Chialvo D, Kaiser M, Hilgetag C (2004) Organization, development and function of complex brain networks. Trends Cogn Sci 8:418–425

  • Steyvers M, Tenenbaum JB (2005) The large-scale structure of semantic networks: statistical analyses and a model of semantic growth. Cogn Sci 29:41–78

  • Tröster A, Fields J, Testa J, Paul R, Blanco C, Hames K, Salmon D, Beatty W (1998) Cortical and subcortical influences on clustering and switching in the performance of verbal fluency tasks. Neuropsychologia 36(4):295–304

  • Troyer AK (2000) Normative data for clustering and switching on verbal fluency tasks. J Clin Exp Neuropsychol 22(3):370–378

  • Troyer AK, Moscovitch M, Winocur G (1997) Clustering and switching as two components of verbal fluency: evidence from younger and older healthy adults. Neuropsychology 11(1):138–146

  • Troyer AK, Moscovitch M, Winocur G, Alexander MP, Stuss D (1998a) Clustering and switching on verbal fluency: the effect of focal frontal- and temporal-lobe lesions. Neuropsychologia 36(6):499–504

  • Troyer AK, Moscovitch M, Winocur G, Leach L, Freedman M (1998b) Clustering and switching on verbal fluency tests in Alzheimer’s and Parkinson’s disease. J Int Neuropsychol Soc 4(2):137–143

  • Villodre R, Sánchez-Alfonso A, Brines L, Nunez A, Chirivella J, Ferri J, Noe E (2006) Fluencia verbal: estudio normativo piloto según estrategias de agrupación y saltos de palabras en población española de 20 a 49 años. Neurología 21(3):124–130

  • Voy B, Scharff J, Perkins AD, Saxton A, Borate B, Chesler E, Branstetter L, Langston M (2006) Extracting gene networks for low-dose radiation using graph theoretical algorithms. PLoS Comput Biol 2(7):e89

  • Wagner G, Pavlicev M, Cheverud J (2007) The road to modularity. Nat Rev Genet 8(12):921–931

  • Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge

  • Watts D, Strogatz S (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440–442

  • Wixted J, Rohrer D (1994) Analyzing the dynamics of free recall: an integrative review of the empirical literature. Psychon Bull Rev 1(1):89–106

  • Yip AM, Horvath S (2007) Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinform 8:22

Acknowledgments

We would like to acknowledge Ricard V. Solé, Jean Bragard and John F. Wesseling for helpful discussions, and Lluis Samaranch for his useful comments and for being rater 2. JG acknowledges the UTE project CIMA, BCM the James McDonnell Foundation, and SAT project MTM 2009-14409-C02-01. We also thank the referees for their thorough review and highly appreciate their comments and suggestions.

Author information

Corresponding author

Correspondence to Pablo Villoslada.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (RAR 134 kb)

Appendix

A Matlab (The Mathworks Inc., Natick, MA, USA) implementation of the methodology described in this study is available as electronic supplementary material. It starts from a verbal fluency dataset and, at the last step, obtains an enriched conceptual network. To ease the use of the code, all files contain step-by-step explanations and references to sections and equations of this manuscript where appropriate. The script batch_verbal_fluency.m is also helpful for understanding the process as a whole. The modular implementation of the different functions permits their independent use.

  • batch_verbal_fluency.m: general script that runs the whole process, from the verbal fluency data to the enriched conceptual network. It uses the functions described below.

  • count_words.m: function that counts the number of words of each verbal fluency test contained in the dataset.

  • get_rel_frequencies.m: function that gets the relative frequencies of each word included in the verbal fluency data.

  • getco_occurrences.m: function that counts the number of co-occurrences of every pair of words for a given maximum distance (parameter l).

  • get_statistical_co_occurrences.m: function that performs the statistical approach described in the paper for the network inference.

  • get_components.m: function that obtains the components of an undirected graph. This is used to obtain the giant component of the conceptual network.

  • computeGTOM.m: function that performs the modularity analysis using the Generalized Topological Overlap Measure.

  • enrich_newtork.m: function that performs the enrichment process of a network according to its modularity analysis (which is the output of computeGTOM in our case).

  • write_graph_links.m: function that writes pairs of words that are linked in a graph according to a dictionary into a file. Each line consists of a pair word,word.
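As a rough illustration of the co-occurrence counting step described for getco_occurrences.m, the following Python sketch counts, for every unordered pair of words, the number of tests in which the two words appear at most l positions apart. It is not a translation of the Matlab supplementary code. The function name count_cooccurrences, the list-of-lists input format and the choice to count a pair at most once per test are assumptions made for the example.

```python
from collections import Counter


def count_cooccurrences(tests, max_dist):
    """Count, per unordered word pair, the tests in which both words appear
    at most max_dist positions apart (max_dist plays the role of l).

    Assumption: a pair is counted at most once per test, even if a word
    is repeated and several of its neighbourhoods contain the other word.
    """
    if max_dist < 1:
        raise ValueError("max_dist must be at least 1")
    counts = Counter()
    for words in tests:
        pairs_in_test = set()
        for pos, word in enumerate(words):
            # Neighbourhood of this appearance: the next max_dist words.
            for other in words[pos + 1 : pos + 1 + max_dist]:
                if other != word:
                    pairs_in_test.add(frozenset((word, other)))
        counts.update(pairs_in_test)
    return counts


# Two toy tests (invented data, not taken from data.mat).
tests = [["dog", "cat", "lion", "tiger"], ["cat", "dog", "cow"]]
adjacent = count_cooccurrences(tests, 1)
window2 = count_cooccurrences(tests, 2)
```

In this toy data, dog and cat are consecutive in both tests, so their count is 2 with l = 1, whereas dog and lion only co-occur once the window is widened to l = 2.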

The verbal fluency data of the 200 subjects used in this study are available in the file data.mat, which can be loaded by typing load data.mat in a Matlab environment. The dictionaries of the 236 words included in the networks are available in dictionaries.mat (first column in Spanish, second column in English). In the case of Spanish, acute accents and diaereses were omitted and the letter ñ was replaced by n.

Finally, both the CN and ECN graphs have been included in Spanish (original language of the tests) and English (translation made by the authors). These files list all the pairs of words that are connected (i.e. the links of the graph) in comma-separated value format (.csv). They can be easily visualized as graphs with programs such as Pajek (Batagelj and Mrvar 2002) or Cytoscape (Shannon et al. 2003).

  • CN_spa.csv is the conceptual network (CN) with animals written in Spanish (graph with 236 nodes and 611 links).

  • CN_eng.csv is the conceptual network (CN) with animals written in English (graph with 236 nodes and 611 links).

  • ECN_spa.csv is the enriched conceptual network (ECN) with animals written in Spanish (graph with 236 nodes and 2357 links).

  • ECN_eng.csv is the enriched conceptual network (ECN) with animals written in English (graph with 236 nodes and 2357 links).
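The edge lists can also be inspected without a dedicated graph program. The Python sketch below uses only the standard library: it parses lines of the form word,word, builds an undirected adjacency map and extracts connected components with a breadth-first search, in the spirit of the description of get_components.m. The function names and the three-line sample are invented for illustration, and the actual .csv files may need different delimiter or header handling.

```python
import csv
import io
from collections import defaultdict, deque


def read_edges(text):
    """Parse 'word,word' lines into a set of undirected edges."""
    edges = set()
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        a, b = row[0].strip(), row[1].strip()
        if a and b and a != b:  # ignore empty fields and self-loops
            edges.add(frozenset((a, b)))
    return edges


def connected_components(edges):
    """Return the components of an undirected graph, largest first."""
    adjacency = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), {start}
        while queue:  # breadth-first search from the start node
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Invented sample in the 'word,word' format described above.
sample = "dog,cat\ncat,lion\neel,elver\n"
edges = read_edges(sample)
components = connected_components(edges)
giant = components[0]
```

In the sample, dog, cat and lion form the largest component, while the pair eel–elver is isolated from it. To inspect one of the supplementary files, its contents would be passed to read_edges in place of the sample string.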

About this article

Cite this article

Goñi, J., Arrondo, G., Sepulcre, J. et al. The semantic organization of the animal category: evidence from semantic verbal fluency and network theory. Cogn Process 12, 183–196 (2011). https://doi.org/10.1007/s10339-010-0372-x
