Journal of Ornithology, Volume 148, Supplement 1, pp 35–44

Neural systems for vocal learning in birds and humans: a synopsis



DOI: 10.1007/s10336-007-0243-0

Cite this article as:
Jarvis, E.D. J Ornithol (2007) 148: 35. doi:10.1007/s10336-007-0243-0


I present here a synopsis of a hypothesis that I derived on the similarities and differences between the vocal learning systems of vocal learning birds, used for learned song, and of humans, used for spoken language. This hypothesis states that vocal learning birds (songbirds, parrots, and hummingbirds) and humans have comparable specialized forebrain regions that are not found in their close vocal non-learning relatives. In vocal learning birds, these forebrain regions appear to be divided into two sub-pathways, a vocal motor pathway mainly used to produce learned vocalizations and a pallial–basal ganglia–thalamic loop mainly used to learn and modify the vocalizations. I propose that humans have analogous forebrain pathways within and adjacent to the motor and pre-motor cortices, respectively, used to produce and learn speech. Recent advances have supported the existence of the seven cerebral vocal nuclei in the vocal learning birds and the proposed brain regions in humans. The results in birds suggest that the reason why the forebrain regions are similar across distantly related vocal learners is that the vocal pathways may have evolved out of a pre-existing motor pathway that predates the ancient split from the common ancestor of birds and mammals. Although this hypothesis will require the development of novel technologies to be fully tested, the existing evidence suggests that there are strong genetic constraints on how vocal learning neural systems can evolve.


Keywords: Singing · Speaking · Evolution · Song nuclei · Auditory pathway


Vocal learning birds, songbirds in particular, have been extensively used as a model system to study neural mechanisms of vocal learning as it relates to speech acquisition in humans (Jarvis 2004a, b). This neurobiology sub-field began in the 1970s with the first discovery of a non-human vocal learning system, that of canaries (Serinus canaria) (Nottebohm et al. 1976). Since then, nearly a thousand papers have been published on vocal learning systems in birds (PubMed and Scirus searches; keywords, song–system–brain–avian). However, little attempt was made to link neural systems for vocal learning in birds with those for spoken language in humans (Doupe and Kuhl 1999). Making such links was hampered by several factors, including uncertainty about the telencephalic homologies between birds and mammals, lack of broadly agreed-upon definitions for song, speech, and language and of what makes language special, and lack of sufficient data and synthesis on the neural pathways for vocal learning across bird orders and for speech learning in humans.

Some of these limitations have been overcome in recent years. First, a revision of the nomenclature and understanding of the avian brain has resulted in a consensus view that birds and mammals have homologous pallidal, striatal, and pallial subdivisions in their cerebrums, of which the latter two contain the vocal learning regions (Reiner et al. 2004; Jarvis et al. 2005). However, the pallial subdivision in mammals, the cortex, is layered in its cellular organization whereas in birds it is nuclear, which makes comparisons difficult at the level of one-to-one homologies or analogies. Second, a greater understanding of birdsong behavior has allowed for more informative comparisons with human speech (Doupe and Kuhl 1999; Hauser et al. 2002; Okanoya 2007), although many open questions remain. For the sake of brain comparisons, I simply define ‘song’ in the vocal learning birds and ‘speech’ in humans as analogous behaviors, and ‘spoken language’ in humans as synonymous with speech. Third, gene expression mapping studies have led to important discoveries on the vocal neural systems across vocal learning bird orders (Jarvis et al. 2000) and brain imaging studies in humans have allowed a more accurate identification of brain areas for spoken language (Gracco et al. 2005). Based upon these advances, I derived a hypothesis on the similarities and differences of brain pathways for song in vocal learning birds and spoken language in humans. Here, I present a synopsis of that hypothesis, some of the evidence for it, and some new findings since it was first reported in 2004 (Jarvis 2004a, b).

Vocal learning

Vocal learning is the ability to modify the acoustic and/or syntactic structure of sounds produced, including imitation and improvisation. It is distinct from auditory learning, which is the ability to make associations with sounds heard, though vocal learning depends upon auditory learning (Konishi 1965). Vocal learning is one of the most critical behavioral substrates for spoken human language; with it, humans have the ability to imitate speech sounds heard individually and sequentially, and modify them through auditory feedback. Vocal learning, however, is not synonymous with spoken language, in that spoken language includes many other features such as grammar and recursion (Hauser et al. 2002). That is, different vocal learning species imitate and modify sounds to various degrees, with humans being the most prolific. Despite these differences, most, if not all, vertebrates are capable of auditory learning, but few are capable of vocal learning. The latter has been found in three distantly related groups of mammals (humans, bats, and cetaceans) and three distantly related groups of birds (parrots, hummingbirds, and songbirds) (Nottebohm 1972; Janik and Slater 1997). Recent studies have also discovered evidence for vocal learning in seals (Sanvito et al. 2007) and elephants (Poole et al. 2005). However, it is only in humans and the three vocal learning bird groups that the brain pathways for learned vocalization have been studied.

Vocal learning brain pathways in birds and humans

Only vocal learners (songbirds, parrots, hummingbirds, and humans) have brain regions in their cerebrums (or telencephalon) that control vocal behavior (Jurgens 1995; Jarvis et al. 2000). Non-vocal learners, including non-human primates and chickens, only have midbrain and medulla regions that control innate vocalizations (Wild 1997). Each vocal learning bird group contains seven comparable cerebral vocal brain nuclei: four posterior nuclei and three anterior nuclei (Fig. 1a–c; abbreviations in Table 1; Jarvis et al. 2000). These brain nuclei have been given different names in each bird group because of the possibility that each evolved their vocal nuclei independently of a common ancestor with such nuclei (Striedter 1994; Jarvis et al. 2000). In all three bird groups, the posterior nuclei form a posterior vocal pathway that projects from a nidopallial vocal nucleus (HVC, NLC, VLN) to an arcopallial vocal nucleus (RA, AAC dorsal part, VA), to midbrain (DM) vocal premotor and medulla (nXIIts) vocal motor neurons (Fig. 1a–c, black arrows; Striedter 1994; Durand et al. 1997; Vates et al. 1997; Gahr 2000); nXIIts projects to the muscles of the syrinx, the avian vocal organ. Vocal non-learning birds do not appear to have arcopallium projections to DM or nXIIts (Wild et al. 1997). The anterior nuclei (connectivity examined only in songbirds and parrots) form an anterior vocal pathway loop, where a pallial vocal nucleus (MAN, NAO) projects to a striatal vocal nucleus (Area X, MMSt), the striatal vocal nucleus to a nucleus of the dorsal thalamus (DLM, DMM), and the dorsal thalamus back to the pallial vocal nucleus (MAN, NAO) (Fig. 1a, b, white arrows; Durand et al. 1997; Vates et al. 1997). The parrot pallial MO vocal nucleus also projects to the striatal vocal nucleus (MMSt) (Durand et al. 1997). Connectivity of the songbird MO analogue has not yet been determined.
Fig. 1

Proposed comparable vocal and auditory brain areas among vocal learning birds (a–c) and humans (d). Left hemispheres are shown, as this is the dominant side for language in humans and for song in some songbirds. Yellow regions and black arrows indicate proposed posterior vocal pathways; red regions and white arrows indicate proposed anterior vocal pathways; dashed lines indicate connections between the two vocal pathways; blue indicates auditory regions. For simplification, not all connections are shown. The globus pallidus in the human brain, also not shown, is presumably part of the anterior pathway as in non-vocal pathways of mammals. Basal ganglia, thalamic, and midbrain (for the human brain) regions are drawn with dashed-line boundaries to indicate that they are deeper in the brain relative to the anatomical structures above them. The anatomical boundaries drawn for the proposed human brain regions involved in vocal and auditory processing should be interpreted conservatively and for heuristic purposes only. Human brain lesions and brain imaging studies do not allow one to determine functional anatomical boundaries with high resolution. Scale bar ∼7 mm. Abbreviations are in Table 1. Figure modified from Jarvis (2004b)

Table 1

Abbreviations used in the text and in Figs. 1, 2 and 3


Abbreviation  Word or phrase
LMAN          Lateral magnocellular nucleus of anterior nidopallium
AAC           Central nucleus of the anterior arcopallium
AACd          Central nucleus of the anterior arcopallium, dorsal part
MAN           Magnocellular nucleus of anterior nidopallium
AACv          Central nucleus of the anterior arcopallium, ventral part
MLd           Mesencephalic lateral dorsal nucleus
              Intermediate arcopallium
MMAN          Medial magnocellular nucleus of anterior nidopallium
              Caudal medial arcopallium
MMSt          Magnocellular nucleus of the anterior striatum
aCC           Anterior cingulate cortex
MO            Oval nucleus of the mesopallium
aINS          Anterior insula cortex
Am            Nucleus ambiguus
NAO           Oval nucleus of the anterior nidopallium
aT            Anterior thalamus
NCM           Caudal medial nidopallium
aSMA          Anterior supplementary motor area
              Caudal dorsal nidopallium
aSt           Anterior striatum
              Intermediate dorsal lateral nidopallium
Area X        Area X of the striatum
NIf           Interfacial nucleus of the nidopallium
NLC           Central nucleus of the lateral nidopallium
CM            Caudal mesopallium
nXIIts        Tracheosyringeal subdivision of the 12th nucleus
CSt           Caudal striatum
Ov            Nucleus ovoidalis
DLM           Medial nucleus of dorsolateral thalamus
PAG           Periaqueductal grey
DM            Dorsal medial nucleus of the midbrain
DMM           Magnocellular nucleus of the dorsomedial thalamus
RA            Robust nucleus of the arcopallium
DLPFC         Dorsal lateral prefrontal cortex
VA            Vocal nucleus of the arcopallium
              Face motor cortex
              Vocal nucleus of the anterior mesopallium
HVC           (A letter based name)
              Vocal nucleus of the anterior nidopallium
L2            Field L2
              Vocal nucleus of the anterior striatum
              Lateral nucleus of the anterior nidopallium
VLN           Vocal nucleus of the lateral nidopallium
              Lateral nucleus of the anterior mesopallium
              Vocal nucleus of the medial mesopallium
              Vocal nucleus of the medial nidopallium

The major differences among vocal learning birds are in the connections between the posterior and anterior vocal pathways (Jarvis and Mello 2000). In songbirds, the posterior pathway sends input to the anterior pathway via HVC to Area X; the anterior pathway sends output to the posterior pathway via lateral MAN (LMAN) to RA and medial MAN (MMAN) to HVC (Fig. 1c; Foster and Bottjer 2001). In contrast, in parrots, the posterior pathway sends input into the anterior pathway via ventral AAC (AACv, parallel of songbird RA) to NAO (parallel of songbird MAN) and MO; the anterior pathway sends output to the posterior pathway via NAO to NLC (parallel of songbird HVC) and to AAC (Fig. 1a; Durand et al. 1997).
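The songbird connectivity described in the text can be summarized as a small directed graph. The following Python sketch is purely illustrative and not part of the original analysis: the names `SONGBIRD_PROJECTIONS`, `reachable`, and `on_loop` are hypothetical, only projections named in the text are included, and LMAN and MMAN are collapsed into a single MAN node as a simplifying assumption.

```python
# Songbird vocal pathway projections named in the text, as a directed graph.
# Assumption of this sketch: LMAN and MMAN are collapsed into one "MAN" node.
SONGBIRD_PROJECTIONS = {
    # Posterior (vocal motor) pathway
    "HVC": ["RA", "Area X"],         # HVC -> Area X feeds the anterior pathway
    "RA": ["DM", "nXIIts"],
    "nXIIts": ["syrinx"],
    # Anterior pathway loop
    "MAN": ["Area X", "RA", "HVC"],  # LMAN -> RA and MMAN -> HVC are its outputs
    "Area X": ["DLM"],
    "DLM": ["MAN"],
}


def reachable(graph, start):
    """Return the set of nodes reachable from `start` by following projections."""
    seen = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def on_loop(graph, node):
    """True if following projections from `node` can lead back to `node`."""
    return node in reachable(graph, node)


if __name__ == "__main__":
    for nucleus in ("MAN", "Area X", "DLM", "RA"):
        print(nucleus, "on a loop:", on_loop(SONGBIRD_PROJECTIONS, nucleus))
    print("MAN reaches syrinx:", "syrinx" in reachable(SONGBIRD_PROJECTIONS, "MAN"))
```

Under these assumptions, MAN, Area X, and DLM each lie on a closed loop, whereas RA lies downstream of the loop and has no listed projection back into it. A comparable graph for parrots would need the different cross-connections described above (AACv to NAO and MO; NAO to NLC and AAC).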

In humans, imaging and lesion studies have revealed cortical, striatal, and thalamic regions that are active and necessary for learning and production of language (reviewed in Jarvis 2004a, b; and see below). However, ethical and practical issues prevent connectivity tract-tracing experiments on humans. Some post-mortem neuro-degeneration studies have been conducted in humans and many tract-tracing studies have been performed on adjacent non-vocal pathways in vocal non-learning mammals. Based upon these comparisons, it appears that the avian posterior vocal pathways are similar to mammalian motor cortico-brainstem pathways. In humans, I propose that an analogous posterior vocal pathway consists of the face motor cortex, which projects to nucleus ambiguus (Am) of the medulla (Fig. 1d; Kuypers 1958a); Am, the parallel of avian nXIIts, projects to the muscles of the larynx, the main mammalian vocal organ (Zhang et al. 1995; Jurgens 1998). Non-human primates, like chickens, do not have such a projection (Kuypers 1958a, b). See Jarvis (2004b) for a detailed description of analogous cell types.

The avian anterior vocal pathways are similar in connectivity to mammalian cortical–basal ganglia–thalamic–cortical loops (Bottjer and Johnson 1997; Durand et al. 1997; Jarvis et al. 1998; Perkel and Farries 2000). In this regard, I proposed that a strip of adjacent premotor cortex in humans that is required for speech learning and syntax production makes up the cortical part of a speech loop. This cortical strip extends from the anterior insula (aINS) through Broca’s area, the anterior dorsal lateral prefrontal cortex (aDLPFC), and the anterior pre-supplementary motor area (aSMA) to the anterior cingulate (aCC; Fig. 1d). This strip, I argue, is analogous to the avian pallial anterior vocal nuclei (i.e., parrot MO and NAO). As in non-human primates and in vocal learning birds, I proposed that this cortical strip projects to the anterior-most region of the striatum (aSt), the anterior striatum to the globus pallidus (GP), the pallidus to the anterior dorsal thalamus (aT), and the dorsal thalamus back up to the cortical strip (Fig. 1d), all regions required for speech learning and syntax (described below).

Because connections between the posterior and anterior vocal pathways differ between songbirds and parrots, comparisons between them and mammals will also differ. In mammals, layer 5 neurons of motor cortex have axon collaterals, one of which projects into the striatum and another to the medulla and spinal cord (Alexander and Crutcher 1990; Reiner et al. 2003). This pattern is different from the songbird, where a specific cell type of HVC, called X-projecting neurons, projects to Area X in the striatum separately from neurons of RA of the arcopallium that project to the medulla (Fig. 1c). This pattern is also different from the parrot, where AAC of the arcopallium has two anatomically separate neuron populations, AACd that projects to the medulla and AACv that projects to anterior pallial vocal nuclei NAO and MO (Fig. 1a; Durand et al. 1997). The output of mammalian anterior pathways is proposed to be the collaterals of layer 3 and upper layer 5 neurons that project to other cortical regions and the striatum (Reiner et al. 2003; Jarvis 2004b).

Functions of vocal brain areas in birds and humans

There are some gross similarities in behavioral deficits following lesions in specific brain areas of vocal learning birds (experimentally placed) and of humans (due to stroke or trauma). Lesions to songbird posterior nuclei HVC and RA (Nottebohm et al. 1976; Simpson and Vicario 1990), on the left side in canaries, cause deficits similar to those found after damage to the left human face motor cortex, namely muteness for learned vocalizations, i.e., for speech (Valenstein 1975; Jurgens et al. 1982; Jurgens 1995). Lesions to parrot NLC even cause deficits in producing the correct acoustic structure of learned human speech (Lavenex 2000). Lesions to the face motor cortex in chimpanzees and other non-human primates do not affect their ability to produce vocalizations (Kuypers 1958b; Jurgens et al. 1982; Kirzinger and Jurgens 1982). Lesions to avian nXIIts and DM and mammalian Am and PAG result in muteness in both vocal learners and non-learners (Brown 1965; Nottebohm et al. 1976; Seller 1981; Jurgens 1994, 1998; Esposito et al. 1999).

Lesions to songbird MAN cause deficits that are most similar to those found after damage to anterior parts of the human premotor cortex, namely disruption of imitation and/or induction of sequencing problems. In birds and humans, such lesions do not prevent the ability to produce learned song or speech. In humans, these deficits are called verbal aphasias and verbal amusias (Benson and Ardila 1996). Damage to the left side often leads to verbal aphasias, whereas damage to the right can lead to verbal amusias (Berman 1981). The deficits in humans, however, are more complex. Specifically, lesions to songbird LMAN (Bottjer et al. 1984; Nottebohm et al. 1990; Scharff and Nottebohm 1991; Kao et al. 2005) and to the human insula and Broca’s area (Mohr 1976; Benson and Ardila 1996; Dronkers 1996) lead to poor imitation while sparing, or even inducing, more stereotyped song or speech. In addition, lesions to Broca’s area and/or DLPFC (Benson and Ardila 1996) lead to poor syntax production in construction of phonemes into words and words into sentences. Lesions to DLPFC also result in uncontrolled echolalia imitation, whereas lesions to aSMA and anterior cingulate result in spontaneous speech arrest, lack of spontaneous speech, and/or loss of emotional tone in speech, but with imitation preserved (Nielsen and Jacobs 1951; Barris et al. 1953; Rubens 1975; Valenstein 1975; Jonas 1981). Lesions to songbird MMAN lead to a decreased ability in vocal learning and some disruption of syntax (Foster and Bottjer 2001).

Lesions to songbird Area X and to the human anterior striatum do not prevent the ability to produce already learned song or speech, but do result in disruption of vocal learning and disruption of some syntax in birds (Sohrabji et al. 1990; Scharff and Nottebohm 1991; Kobayashi et al. 2001) or verbal aphasias and amusias in humans (Mohr 1976; Bechtereva et al. 1979; Leicester 1980; Damasio et al. 1982; Alexander et al. 1987; Cummings 1993; Speedie et al. 1993; Lieberman 2000). Humans can have a combination of symptoms (Mohr 1976), perhaps because, as in non-human mammals, large cortical areas send projections that converge onto relatively smaller striatal areas (Beiser et al. 1997). Not many cases have been reported of lesions to the human globus pallidus leading to aphasias (Strub 1989), but the fact that this can occur suggests some link with a striatal vocal area in humans. In vocal learning birds, the pallidal neurons appear to be within the striatal vocal nucleus (Durand et al. 1997; Farries and Perkel 2002).

Similar to a preliminary report on songbird DLM (Halsema and Bottjer 1991), damage to anterior portions of the human thalamus leads to verbal aphasias (Graff-Radford et al. 1985). In humans, thalamic lesions can lead to temporary muteness followed by aphasia deficits that are sometimes greater than after lesions to the anterior striatum or premotor cortex. This greater deficit may occur because there is further convergence of inputs from the striatum to the globus pallidus and then from the globus pallidus to the thalamus (Beiser et al. 1997).

Results of lesion studies overlap with brain activation studies. In vocal learning birds, all seven comparable cerebral vocal nuclei display vocalizing-driven expression of egr-1, an immediate early gene (Jarvis and Nottebohm 1997; Jarvis et al. 1998, 2000; Jarvis and Mello 2000); expression of immediate early genes is responsive to changes in neural activity. Likewise, premotor neural firing has been found in several posterior and anterior vocal nuclei when a bird sings (McCasland 1987; Yu and Margoliash 1996; Hessler and Doupe 1999; Hahnloser et al. 2002). The firing in songbird HVC and RA correlates with sequencing of syllables and syllable structure, respectively, whereas firing in Area X and LMAN is much more varied and, in LMAN, it correlates with song variability. Stimulation of HVC with electrical pulses during singing temporarily disrupts song output, i.e., causes song arrest (Vu et al. 1998).

In humans, the face motor cortex is always activated with speech tasks (Petersen et al. 1988; Rosen et al. 2000; Gracco et al. 2005). For the proposed language strip, production of verbs and complex sentences can be accompanied by activation in all or a subregion of this strip (Fig. 1d) (Petersen et al. 1988; Poeppel 1996; Price et al. 1996; Crosson et al. 1999; Wise et al. 1999; Papathanassiou et al. 2000; Rosen et al. 2000; Palmer et al. 2001; Gracco et al. 2005). Activation in Broca’s area, DLPFC, and aSMA is higher when speech tasks are more complex, including learning to vocalize new words or sentences, sequencing words into complex syntax, producing non-stereotyped sentences, and thinking about speaking (Hinke et al. 1993; Poeppel 1996; Buckner et al. 1999; Bookheimer et al. 2000). Like vocal nuclei in birds, premotor speech-related neural activity has been found in Broca’s area (Fried et al. 1981). Further, low threshold electrical stimulation to the face motor cortex, Broca’s area, or the aSMA causes speech arrest or generation of phonemes or words (Jonas 1981; Fried et al. 1991; Ojemann 1991, 2003).

In non-cortical areas, speech production is accompanied by activation of the anterior striatum and the thalamus (Wallesch et al. 1985; Klein et al. 1994; Wildgruber et al. 2001; Gracco et al. 2005). Low threshold electrical stimulation to ventral lateral and anterior thalamic nuclei, particularly in the left hemisphere, leads to word repetition, speech arrest, speech acceleration, spontaneous speech, anomia, or verbal aphasia (Johnson and Ojemann 2000). The globus pallidus can also show activation during speaking (Wise et al. 1999). In non-human mammals and in birds, PAG and DM, and Am and nXIIts display premotor vocalizing neural firing (Larson 1991; Larson et al. 1994; Zhang et al. 1995; Dusterhoft et al. 2004) and/or vocalizing-driven gene expression (Jarvis et al. 1998, 2000; Jarvis and Mello 2000).

Since this hypothesis was proposed in 2004 (Jarvis 2004a, b), PET brain imaging studies by Brown and colleagues on humans revealed that, when humans sing or speak, activation specifically occurs in all the above-described brain areas (Brown et al. 2004, 2006, 2007). Further, they found that song learning in humans is accompanied by higher activation of the anterior premotor cortical and striatal regions relative to simple production of already well-learned songs (Brown et al. 2006). In birds, the presence of all seven vocal nuclei had been previously revealed with only one gene, egr-1. Since then, additional immediate early genes have been examined, and multiple genes reveal the seven vocal nuclei (Wada et al. 2006), where c-fos clearly shows a high contrast of activation (Fig. 2a, songbird; b, parrot). No other brain areas showed high levels of activation, indicating that the entire vocal systems of these species probably have been identified.
Fig. 2

Singing-driven c-fos mRNA expression in vocal learning species. a Accumulated c-fos mRNA (white) in adult male Zebra Finches (Taeniopygia guttata), a songbird, from a silent male and a male that sang while alone for 30 min. All seven cerebral vocal nuclei as well as DM of the midbrain show singing-induced gene expression. MLd shows expression that is due to the bird hearing itself sing. Figure modified from Wada et al. (2006). b c-fos expression in adult male Budgerigar (Melopsittacus undulatus), a parrot, vocal nuclei. All seven cerebral vocal nuclei previously identified with egr-1 show singing-induced c-fos expression. The AAC vocal nucleus has lower c-fos expression in Budgerigars than the other vocal nuclei. NCM and CM show expression that is due to the bird hearing itself sing. In situ hybridizations of the parrot brain sections were generated by Dr. Miriam Rivas. Anterior is to the right, dorsal is up. Abbreviations are in Table 1. Scale bar 2 mm

Taken together, the lesion and brain activation findings are consistent with the idea that songbird HVC and RA are more similar in their functional properties to face motor cortex than to any other human brain area, and that songbird MAN, Area X, and the anterior part of the dorsal thalamus are more similar in their properties to parts of the human premotor cortex, anterior striatum, and ventral lateral/anterior thalamus, respectively. The findings are consistent with the presence in humans of a posterior-like vocal motor pathway and an anterior-like vocal premotor pathway that are similar to the production and learning pathways of vocal learning birds. A difference between birds and humans appears to be the greater complexity of the deficits found after lesions in humans.

The auditory system

An auditory pathway is common among vocal learners and vocal non-learners (Jarvis 2004b). In brief, birds, reptiles, and mammals have relatively similar auditory pathways (Fig. 3) (Webster et al. 1992; Vates et al. 1996; Carr and Code 2000). The pathway begins with ear hair cells that synapse onto sensory neurons, which project to cochlear and lemniscal nuclei of the brainstem, which in turn project to midbrain and thalamic auditory nuclei. The thalamic nuclei in turn project to primary auditory cell populations in the pallium (avian L2, reptile caudal medial pallium, mammalian layer 4 of primary auditory cortex). Avian L2 then projects to other pallial regions and to the caudal striatum (CSt), forming a complex network. Mammalian layer 4 cells project to other layers of primary auditory cortex and to secondary auditory regions. In reptiles, after reaching the caudal medial pallium (Smeets and Gonzalez 1994), the remaining cerebral pathway connectivity is not known.
Fig. 3

Comparative and simplified connectivity among auditory pathways in reptiles, mammals, and birds, placed from left to right in order of the most recently evolved. The connectivity from CM to CSt in birds needs verification by retrograde tracing. Abbreviations are in Table 1. Figure reproduced from Jarvis (2004b) with permission

The source of auditory input into the vocal pathways of vocal learning birds is unclear. In songbirds, proposed routes include the HVC shelf into HVC, the RA cup into RA, Ov or CM into NIf, and from NIf dendrites in L2 (Wild 1994; Fortune and Margoliash 1995; Vates et al. 1996; Mello et al. 1998). However, the location of the vocal nuclei relative to the auditory regions differs among vocal learning groups. In songbirds, the posterior vocal nuclei are embedded in the auditory regions; in hummingbirds, they are situated more laterally, but still adjacent to the auditory regions; in parrots, they are situated far laterally and physically separate from the auditory regions (Fig. 1a–c). At a minimum, the auditory input must take different routes to enter the posterior vocal nuclei of each group.

In humans, primary auditory cortex information is passed to secondary auditory areas, which include Wernicke’s area (Fig. 1d). Damage to this area leads to auditory aphasias, sometimes called fluent aphasia. A patient can speak well, but produces highly verbal nonsense speech. One reason for this symptom is that the vocal pathways may no longer receive feedback from the auditory system. Bilateral damage to primary auditory cortex and Wernicke’s area also leads to full auditory agnosia, the inability to consciously recognize any sounds (speech, musical instruments, natural noises, etc.) (Benson and Ardila 1996). Information from Wernicke’s area has been proposed to be passed to Broca’s area through arcuate fibers in a caudal-rostral direction (Geschwind 1979), but for many years such a pathway had not been proven. Recently, this hypothesis was tested in experiments with stimulation electrodes in patients undergoing surgery, which revealed a functional bi-directional axon pathway between Wernicke’s and Broca’s areas (Matsumoto et al. 2004).

No one has tested whether lesions to avian secondary auditory areas result in fluent song aphasias. Yet, lesions to songbird NCM and CM result in a significant decline in the ability to form auditory memories of songs heard (MacDougall-Shackleton et al. 1998; Gobes and Bolhuis 2007). It is difficult to ascertain how non-human animals, including birds, perceive sensory stimuli, and therefore it is difficult to make comparisons with humans in regard to perceptual auditory deficits.

Evolution of vocal learning systems from a common motor pathway

Given that the auditory pathways in avian, mammalian, and reptilian species are similar, whether or not a given species is a vocal learner, this suggests that the auditory pathway in vocal learning birds and in humans was inherited from their common stem-amniote ancestor, thought to have lived ∼320 million years ago (Evans 2000). Having a cerebral auditory pathway would explain why non-human mammals, including dogs, exhibit auditory learning, including learning to understand the meaning of human speech, although with less facility than a human. For vocal learning pathways, because the connections of the anterior and posterior vocal pathways in vocal learning birds bear some resemblance to those of non-vocal pathways in both birds and mammals, pre-existing connectivity could have been a genetic constraint for the evolution of vocal learning (Durand et al. 1997; Farries 2001; Lieberman 2002; Jarvis 2004a, b). In terms of function, recent results suggest that vocal nuclei of vocal learning birds are embedded within at least seven brain areas active during the production of limb and body movements (Feenders, Liedvogel, Rivas, Zapka, Horita, Tremere, Hara, Wada, Mouritsen, and Jarvis, submitted). The same movement-associated brain areas are also found in vocal non-learning birds, such as Ring Doves (Streptopelia risoria). Like the vocal nuclei, their activation is independent of auditory input and correlates with the amount of movement performed. These findings led to a motor theory for the origin of vocal learning, whereby in the avian brain a pre-existing motor system in a vocal non-learner ancestor is proposed to consist of seven brain regions distributed across mesopallial, nidopallial, arcopallial, and striatal brain subdivisions, and separated into two pathways: an anterior pre-motor pathway that forms a pallial–basal ganglia–thalamic–pallial loop and a posterior motor pathway that sends descending projections to brainstem and spinal cord pre-motor neurons. Then, a mutational event or events might have caused descending projections of avian arcopallium neurons, which normally synapse onto non-vocal pre-motor neurons, to instead synapse onto vocal nXIIts motor neurons in vocal learners. Thereafter, cerebral vocal brain regions could have developed out of adjacent motor brain regions using the pre-existing connectivity. Such a mutational event would be expected to occur in genes that regulate synaptic connectivity of pallial motor neurons to α-motor neurons. This theory can also be applied to the proposed human posterior and anterior vocal pathways used for spoken language, as these regions are either embedded within or adjacent to motor and pre-motor pathways. Various parts of this hypothesis can be verified or falsified with connectivity, lesion, and brain activation experiments on adjacent brain areas in vocal non-learning birds, brain areas for vocal learning in other mammalian vocal learners, and gene manipulation experiments on genes that control pallial to brainstem neural connectivity in birds and mammals.


I thank Dr. Miriam Rivas for performing the in situ hybridizations of the parrot brain sections.

Copyright information

© Dt. Ornithologen-Gesellschaft e.V. 2007