In an influential paper on the memory circuits that subserve language processing, Ullman [1] developed the declarative/procedural (DP) model. According to this approach, the brain distinguishes between regions that process explicit world knowledge, such as facts and events, and subserve arbitrary associations of learned information, and regions that process implicit cognitive and motor skills, which are performed unconsciously. Ullman called the first group of regions the ‘declarative’ memory system and suggested that it primarily includes the hippocampus and temporal regions. He called the second group the ‘procedural’ memory system, which primarily includes the frontal lobe and basal ganglia, with a potential role for inferior parietal regions. In a later version of the model, Ullman [2] extended the procedural network to include a role for the cerebellum.
Based on neurocognitive data, Ullman [1] extended the DP model to account for language processing: he suggested that the procedural system subserves implicit and automatic aspects of language processing, such as the processing of grammar, whereas the declarative system subserves knowledge and processing of words, i.e. the lexicon. This suggestion is compatible with experimental data that show differential processing for regularly versus irregularly inflected past tense forms (e.g. played versus kept) [3, 4]. Regular inflections such as played are thought to be processed online by the application of a linguistic rule that automatically strips the suffix -ed so that the stem play can be processed. Irregular forms such as kept, on the other hand, are thought to be monomorphemic and to occupy lexical entries separate from those of their stems (keep). Based on this distinction, processing of regular past tense forms should involve the procedural system, whereas processing of irregular forms should involve the declarative system. Further evidence for the relevance of the DP model to language processing has been provided by studies on people with aphasia [5], as well as on people diagnosed with specific language impairment (SLI) [6]. A common finding in these studies was that people with impaired grammatical processing often had spared word learning capabilities. Hedenius et al. [7] recently showed that a grammatically impaired group of children with SLI was unable to consolidate learning of new non-linguistic sequences, in contrast to a group of typically developing children, who showed evidence of procedural learning. This finding was interpreted as evidence for a close relationship between sequence learning and rule-based grammatical processing and as an indication that the two types of impairment may have a common underlying neurological cause.
These findings suggest a close link at the brain level between non-linguistic skills and grammatical processing, both of which also appear to be independent of episodic world knowledge, including explicit language knowledge, i.e. the lexicon.
The Cerebellum in Language Processing
As already mentioned, Ullman [2] proposed a role for the cerebellum within the procedural system, which also extends to language processing. More specifically, he suggested that the cerebellum may be involved in the ‘error-based learning of the rules that underlie the regularities of complex structures’ (p. 247). The role of the cerebellum in language processing has been supported by several studies in recent years. These include evolutionary approaches suggesting that language processing in humans evolved as a by-product of the organisation of ‘syntactic’ behavioural sequences for problem solving during foraging, which are largely subserved by the cerebellum [8]. De Smet et al. [9] recently reviewed a number of patient studies and documented the effects of cerebellar damage on syntactic processing, including difficulties in processing morphemes, as well as the role that the cerebellum may have in conditions such as aphasia, alexia, dyslexia and dysgraphia. Similarly, recent reviews [10–12] list a number of functional neuroimaging studies that reported activation of the cerebellum in linguistic tasks, such as word generation, object naming, stem completion and semantic judgement; importantly, they also report case studies where damage to the right cerebellum led to symptoms typical of non-fluent aphasia, such as marked agrammatism and impairment in sentence construction, leading some researchers to describe a cerebellar-induced type of aphasia [13]. The available data also suggest a functional topography of the cerebellum, with language-related tasks engaging the right lateral posterior cerebellum, along with the left prefrontal regions, in right-handed participants [14–16]. This includes areas such as lobules VI and VII (Crus I/II) [17]. Importantly, left-handed participants appear to activate the left cerebellar homologues for the same tasks [15].
Recent findings [18] have further supported language lateralisation within the cerebellum, while suggesting that the cerebellum is less lateralised than typical language areas, such as the left inferior frontal gyrus (LIFG), and that its lateralisation changes little over time before puberty.
The role of the right cerebellum in language processing was further demonstrated in a recent study by Lesage et al. [19]. Lesage and colleagues applied repetitive transcranial magnetic stimulation to the right cerebellum, which delayed participants’ responses in an auditory task that required them to predict linguistic input based on a specific context. The authors suggested that the right cerebellum may be crucial for predicting linguistic input and linked this suggestion to the general predictive role that has been proposed for the cerebellum in motor control [20]. Similarly, Marvel and Desmond [21] conducted a functional magnetic resonance imaging (fMRI) study tapping verbal working memory in two conditions: a storage condition, where subjects were required to attend to a target letter, rehearse it silently over a delay period and match it to a probe letter, and a manipulation condition, where the subjects saw the target letter, counted two letters up and rehearsed the new letter until the probe appeared. For manipulation versus storage, the results revealed activation of regions that support motor planning and preparation, including the pre-motor cortex, the pre-supplementary motor area (pre-SMA) and the bilateral superior cerebellum, as well as regions shown to have a role in working memory, such as the dorsal prefrontal cortex, the insula and the right inferior cerebellum. Marvel and Desmond suggested that their findings signified a functional specialisation within the cerebellum for verbal working memory tasks: while the inferior cerebellum is important for maintaining and updating information in working memory, the superior cerebellum remains active for the ongoing manipulation of information. Consequently, the cerebellum emerges as an important region for inner speech, which in turn supports working memory.
To summarise, the role of the cerebellum in language processing has been demonstrated in a variety of studies on both healthy and impaired populations. Among the various language-related functions of the cerebellum, the evidence linking cerebellar damage to impaired syntactic processing and agrammatism further supports the role of the cerebellum in grammatical processing as part of the procedural network. We will now turn to the predictions of the DP model for non-native language processing, as well as the available evidence for structural changes in the bilingual brain as a function of learning a second language (L2).
The DP Model in L2 Processing
Ullman [1] suggested that the distinction between procedural and declarative aspects of language processing does not apply to late learners of an L2 and that reliance on the procedural system depends significantly on the age of acquisition (AoA) of the L2. This is due to maturational constraints on the procedural system, which make it harder for late learners of an L2 to access and utilise. To investigate the effects of AoA on L2 processing, Consonni and colleagues [22] tested balanced Italian–Friulian bilinguals and Friulian late learners of Italian (AoA, 3–6 years) in an fMRI experiment with a task combining comprehension and production: it required the generation of a verb or a noun in response to a description, in both Friulian and Italian. Both groups were highly proficient in both languages. Their results revealed comparable brain patterns across the two groups and for both languages, with a significant distinction between the processing of verbs and nouns. Consonni and colleagues concluded that, when proficiency is kept high, AoA is not a significant factor for comprehension and production in an L2; instead, high bilingual proficiency and exposure lead to the convergence of the neural substrates that process the two languages. However, it is important to note that the two groups were equally exposed to two languages since birth: the late group only received linguistic instruction in Italian from the age of 3 years, and prior naturalistic exposure to Italian cannot be ruled out. Additionally, subjects with an AoA of 3–6 years are rarely classified in the literature as ‘late’ learners of a language [23, 24]. The maturational constraints that Ullman suggested may not apply at that age [18]; moreover, Ullman defines ‘late’ language acquisition as acquisition that takes place after puberty [2]. Therefore, the suggestions by Consonni and colleagues must be treated with caution as far as the effects of AoA are concerned.
Although Ullman suggested that late AoA may be detrimental to the procedural acquisition of the L2, he also acknowledged that late L2 learners may become more native-like as a result of practice and experience in an L2 [2]. Recent evidence has suggested that highly proficient L2 learners of English can be native-like in demonstrating rule-based morphological processing [25]. In that study, L2 learners of English with 8 years of classroom instruction and native speakers were shown to process regular past tense inflection in English in a similar way in a self-paced reading task; more specifically, both groups showed evidence of rule-based decomposition of regularly inflected forms (played) embedded in grammatical sentences, whereas no such evidence was found for irregular forms (kept). According to the Skills Acquisition Theory [26], a shift from declarative to procedural knowledge in an L2 is feasible and depends on successful classroom instruction. In other words, the continuous instruction of a grammatical rule leads to the proceduralisation of the rule, i.e. the application of the rule without the involvement of a declarative component. It has been suggested [27] that proceduralisation, or native-like processing of grammatical rules, may not apply across the board but may be restricted to simple concatenative rules, such as the English past tense rule. Indeed, it has been shown that L2 learners with classroom exposure do not process abstract syntactic features in an L2 [28]; however, more recent findings suggest that abstract syntactic processing can eventually be established as a result of extensive naturalistic exposure to the L2 [29].
Given the available evidence for proceduralisation of L2 processing as a result of classroom or naturalistic exposure, it is interesting to see whether these changes in behavioural processing are accompanied by structural changes in the brain. In other words, it is worth investigating whether this ‘switch’ between the declarative and procedural networks is manifested in structural reorganisation of the relevant brain regions.
Structural Changes in the L2 Brain
Emerging neuroimaging evidence has suggested that the structure of the human brain can be altered as a result of learning an L2. Mechelli et al. [30] used voxel-based morphometry (VBM) analyses on the brains of native English speakers and age-matched and education-matched L2 learners of English and identified a region in the left inferior parietal cortex with significantly greater grey matter (GM) density in the L2 learners. Importantly, the GM density in that area was positively correlated with the L2 learners’ proficiency level and negatively correlated with their age of L2 acquisition, suggesting a dynamic reorganisation of the brain structure in this area as a function of learning and using an L2. The importance of L2 proficiency as a predictor of structural brain changes was also demonstrated by Stein and colleagues [31], who ran a longitudinal MRI study on English learners of L2 German. They acquired high-resolution anatomical brain images and a proficiency measurement from their subjects on two occasions: first, after an intense 3-week German language course (day 1), and second, 5 months after that date (day 2). Stein and colleagues found that both the subjects’ proficiency and their overall GM volume had increased from day 1 to day 2. A subsequent regression analysis showed that proficiency level was a good predictor of GM density in the LIFG, an area which has also been suggested to form part of the procedural network [2]. Importantly, it appears that this kind of cortical restructuring took place after only a short period of naturalistic exposure to the L2.
More evidence for the relationship between L2 learning and brain structure has been provided by VBM studies investigating how GM density in the bilingual brain is correlated with bilinguals’ performance in behavioural tasks. Grogan et al. [32] reported significant correlations between bilinguals’ performance in fluency tasks and the GM density in four areas: the left inferior temporal lobe and, bilaterally, the caudate nucleus, the cerebellum and the pre-SMA. These effects were common to performance in both the native language (L1) and the L2, and the correlations with the caudate nucleus were stronger for the L2 than for the L1. A subsequent study [33] suggested that vocabulary knowledge in an L2 is positively correlated with GM density in the inferior parietal lobe, bilaterally.
Notably, some of the brain regions that have been identified as crucial for bilingual processing, more specifically the LIFG, the cerebellum and the caudate nucleus, form part of what has been described as the procedural network in the DP model [1]. Despite the importance given by Ullman to this network for the processing of grammatical rules, none of the available studies has so far attempted to correlate changes in the brain structure of bilinguals with their performance in grammatical tasks. Since there is evidence for native-like processing of the past tense rule by L2 learners [25], and since the GM density of parts of the procedural memory system has been shown to be related to L2 performance in language tasks [32], it is worth investigating the relationship between bilinguals’ GM volume and their performance in a task tapping grammatical processing. We therefore compared structural images of native and highly proficient non-native speakers of English, who also performed a behavioural task involving processing of the English past tense rule. This was a masked priming task with past tense forms as primes and their corresponding present tense forms as targets. We chose this task because it has already been used in studies on morphological processing in L2 and is thought to tap morphological rule application while excluding semantic effects [34–36]. We predicted significant between-group differences in GM volume in the procedural system, as a result of its increased involvement in the proceduralisation of the L2. We also predicted that any bilingualism-induced effects on the procedural system would be reflected in the processing patterns of the L2 learners in the grammatical task.
Finally, although there is behavioural evidence for the effects of naturalistic exposure on L2 syntactic processing [29] and MRI evidence that even 5 months of exposure can lead to changes in cortical GM density [31], none of the available studies has investigated the effects of extensive naturalistic exposure on the structure of the bilingual brain. To investigate this, we re-ran our analysis on the L2 learners only, using their amount of naturalistic exposure as a predictor of GM volume across the brain.
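To make the logic of using exposure as a predictor of GM volume concrete, the following is a minimal sketch of a mass-univariate regression, in which the same simple linear model is fitted independently at every voxel. It is an illustration under stated assumptions, not the analysis pipeline used in this study: the function name `voxelwise_slope`, the array layout (subjects by voxels) and the unit of exposure are hypothetical, and a real VBM analysis would typically also include nuisance covariates and a correction for multiple comparisons.

```python
import numpy as np

def voxelwise_slope(gm, exposure):
    """Fit GM = b0 + b1 * exposure separately at each voxel.

    gm       : (n_subjects, n_voxels) array of GM volume values
               (hypothetical layout, one column per voxel)
    exposure : (n_subjects,) array, amount of naturalistic exposure
    Returns the slope b1 and its t statistic for every voxel.
    """
    gm = np.asarray(gm, dtype=float)
    x = np.asarray(exposure, dtype=float)
    n = x.shape[0]

    # Design matrix with an intercept column and the exposure predictor.
    X = np.column_stack([np.ones(n), x])

    # Ordinary least squares for all voxels at once.
    beta, _, _, _ = np.linalg.lstsq(X, gm, rcond=None)

    # Residual variance per voxel, with n - 2 degrees of freedom.
    resid = gm - X @ beta
    sigma2 = (resid ** 2).sum(axis=0) / (n - 2)

    # Standard error of the slope and the resulting t statistic.
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], beta[1] / se

# Synthetic example: voxel 0 grows with exposure, voxel 1 does not.
exposure = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
noise = np.array([0.01, -0.01, 0.0, -0.01, 0.01])
gm = np.column_stack([0.5 + 0.01 * exposure + noise, 0.6 + noise])
slopes, tvals = voxelwise_slope(gm, exposure)
```

In this synthetic example only the first voxel carries an exposure effect, so its slope and t statistic are large while those of the second voxel are close to zero.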