A hallmark of the central nervous system is its tremendous capacity for change as a product of experience. This change is the signature of learning, a concept that will be defined broadly here as the measurable behavioral advantages that emerge as a function of training over time. An enduring question in the field of psychology has been the extent to which frequent exposure to or expertise in a particular task produces more general cognitive advantages. That is, when is learning strictly task specific, and when does it transfer to other, qualitatively similar tasks?

Some of the earliest models for learning (e.g., computation theories of learning) confined the skills required to successfully complete any one of a number of qualitatively distinct tasks to localized regions or pathways in the brain. Although one problem with these localizationist models (see Poggio & Bizzi, 2004, for a review) is that they severely restricted the generalizability of learning, considerable experimental research has continued to show, especially in the domains of perceptual learning and skill acquisition, evidence for task-specific learning (Fahle, 2004, 2005; Fiorentini & Berardi, 1980; Maehara & Goryo, 2003; Speelman & Kirsner, 1997). Saffell and Matthews (2003), for example, demonstrated that participants who trained extensively on a direction discrimination task failed to transfer this training to a speed discrimination task, and vice versa. Similarly, Ball, Berch, Helmers, Jobe, Leveck, Marsiske et al. (2002) have demonstrated that individuals who are trained in visual search show little transfer between search performance and memory or reasoning tests.

Although there are many examples of nontransfer between two arguably similar tasks, there are increasingly more studies that purport to demonstrate a link between engagement in certain activities and general cognitive advantages. One area of study that continues to demonstrate associations between lifelong activities and general, effective cognitive functioning relates to the cognitive enrichment hypothesis (Hebb, 1947, 1949). According to this hypothesis, a wide variety of specific lifestyle factors have pervasive beneficial effects on cognitive functioning well into old age (Fratiglioni, Paillard-Borg, & Winblad, 2004). High levels of physical activity throughout the life span, for example, are associated with protection against cognitive decline (Yaffe, Barnes, Nevitt, Lui, & Covinsky, 2001). Similarly, elevated participation in mentally stimulating activities (Wilson, Bennett, Bienias, Mendes de Leon, Bienias, Morris, & Evans, 2005), social interaction (Bassuk, Glass, & Berkman, 1999), intellectually demanding employment (Potter, Helms, & Plassman, 2008), and video game play (e.g., Gopher, Weil, & Bareket, 1994; C. S. Green & Bavelier, 2003; see C. S. Green & Bavelier, 2008, for a review) all seem to be associated with a general improvement in cognitive outcomes (see Hertzog, Kramer, Wilson, & Lindenberger, 2009, for a review of some of these factors).

In a similar vein of thought, Bialystok, Craik, and Freedman (2007) have shown that fluency in two languages protects against symptoms of dementia into old age. In a cohort of 184 patients selected from a memory clinic in Toronto, comprising an approximately even distribution of bilinguals and monolinguals who were equated for various other social and cognitive factors, the onset of dementia occurred 4.1 years later in bilinguals. The impetus for this investigation was the exciting report by Bialystok, Craik, Klein, and Viswanathan (2004) of better conflict resolution by bilinguals than monolinguals, particularly among older participants, in a nonlinguistic interference paradigm (the Simon task). The dramatic implication of this result is that the requirement imposed on bilinguals to manage two languages renders long-term cognitive benefits that extend beyond the sphere of language. The empirical findings leading to this conclusion, however, have been unreliable in children (Bialystok, Martin, & Viswanathan, 2005; Martin-Rhee & Bialystok, 2008) and young adults (Bialystok, 2006; Bialystok, Craik, Grady, Chau, Ishii, Gunji, & Pantev, 2005; Costa, Hernández, Costa-Faidella, & Sebastián-Gallés, 2009) and have been understudied in older age groups. The purpose of the present review is to examine the extent to which a bilingual advantage is present on tasks that require the ignoring of irrelevant, nonlinguistic information.

Regulation of the language system

Much of the work on bilingual advantages on conflict resolution that has developed in the twenty-first century has been stimulated by D. W. Green’s (1998) inhibitory control theory. D. W. Green proposed that an inhibitory control mechanism mediates the suppression of task-dependent irrelevant language in bilinguals. According to this model, there is parallel activation of lexical items associated with a particular concept between languages. The assumption, then, is that a particular experience or thought activates semantically linked units in both languages. In order to retrieve the desired word, one of these lexical candidates (often called “lemmas”) needs to be inhibited. The model hypothesizes a supervisory attentional system (SAS) that responds reactively (via inhibition) in a manner directly proportional to the degree of parallel activation elicited by a particular experience. That is, if an irrelevant language is strongly activated, the amount of inhibition generated by the SAS will increase proportionally in order to suppress the irrelevant information. The SAS therefore allows for the successful retrieval of the relevant semantic unit for speech or language by resolving the conflict associated with two simultaneously activated semantic units, by virtue of inhibition.

The assumption that competition may arise between two semantic units owing to parallel activation has been validated to some extent by empirical research. Data that converge on this idea show that bilinguals are also slower on picture-naming tasks (Gollan, Montoya, Fennema-Notestine, & Morris, 2005), produce fewer words in verbal fluency tasks (Rosselli, Ardila, Araujo, Weekes, Caracciolo, Padilla, & Ostrosky-Solis, 2000), perform worse on lexical decision tasks (Ransdell & Fischler, 1987), and experience much more difficulty with lexical access, despite sometimes similar receptive vocabulary scores (Gollan & Acenas, 2004; Yan & Nicoladis, 2009; see Bialystok, 2009a, for a review). Importantly, what might unite all of these findings is the idea that a second, task-irrelevant language is interfering with the production of a relevant linguistic response.

Moreover, there is an asymmetrical cost for switching from a dominant language (L1) to a nondominant language (L2) that is consistent with D. W. Green’s (1998) reactive inhibition assumption. Meuter and Allport (1999), for example, have shown that bilinguals are slower to name digits in L1 if the preceding digit is named in L2, as compared to when bilinguals first name a digit in L1 and subsequently name a second digit in L2. This asymmetry has been taken to indicate that more inhibition is required to suppress the dominant language and that this inhibition is more persistent than the inhibition for L2. It has also since been shown, despite some earlier controversy (e.g., Finkbeiner, Almeida, Janssen, & Caramazza, 2006), that this inhibition operates independently of or above and beyond any inhibition that may have arisen because of the cue repetition characteristics of a cued language-switching paradigm or any response set that may have emerged for a specific pattern of stimuli (Philipp & Koch, 2009). Thus, there is compelling evidence that the production of one language in lieu of the other engages certain inhibitory processes and that, in line with D. W. Green’s assumption, the inhibition required to suppress L1 is stronger than that required to suppress L2.

A second assumption of this model, stemming from the first, is that the mechanism that resolves conflict between two simultaneously activated linguistic representations is not necessarily language specific. That is, there may be a common brain mechanism that mediates many instances of cognitive conflict. This is a possibility if one hypothesizes an executive control system, possibly located in the frontal lobes (Goldman-Rakic, 1996), that has widespread inhibitory processing capacities throughout the central nervous system (e.g., Miyake, Friedman, Emerson, Witzki, Howerter, & Wagner, 2000).

Are the inhibitory processes involved in language specific to language tasks?

Early evidence revealed that young bilinguals tend to outperform monolinguals on tasks requiring the suppression of irrelevant information that had at one time been relevant. This has been shown in the dimensional change card sort (Bialystok, 1999) and in tasks for which there is a large amount of to-be-ignored irrelevant input, as in detecting grammatical errors while ignoring irrelevant and anomalous semantic content (Bialystok, 1988). These bilingual advantages might be expected if the same inhibitory control mechanism were used for all tasks involving conflict resolution. In this case, the routine need of bilinguals to suppress irrelevant lemmas would fine-tune this central inhibitory control mechanism. It is conceivable, however, that these particular tasks might engage a language-specific inhibitory mechanism allowing for improved accuracy. In the context of detecting grammatical errors, it is relatively self-evident that this type of processing might engage language-specific mechanisms. In the dimensional change card sort, although this is perhaps less obvious, the presentation of geometric shapes (i.e., a square or a circle) colored either red or blue might activate well-developed inhibitory control mechanisms for language. The improved ability of bilinguals to switch from sorting on one dimension to another may have more to do with coding the physical properties of the stimuli linguistically and exploiting the well-developed inhibitory processes of a language control system, rather than a more efficient SAS owing to bilingualism. More recently, Bialystok, Craik, and Luk (2008) found that bilinguals showed a smaller Stroop effect than did monolinguals. Importantly, this advantage was probably due to the superior ability of bilinguals to eliminate the influence of the irrelevant word. More importantly, to demonstrate that bilingualism confers a general inhibitory control advantage would require the use of tasks that are not so obviously language driven as the Stroop task.

One such paradigm, which has only recently been used to explore bilingual-versus-monolingual differences, is the task-switching paradigm (Garbin, Sanjuan, Forn, Bustamante, Rodriguez-Pujadas, Belloch et al., 2010; Prior & MacWhinney, 2010). So long as neither the switching nor the tasks are linguistically mediated, this paradigm would seem to have face validity for this purpose. However, when testing the idea that developmentally early and frequent switching between two languages causes a generalized improvement in inhibitory control, we believe that neither language (content) nor switching (mental operation) should be involved when assessing whether the advantage is “general.” We will address these studies in a more detailed theoretical discussion later on (see the Task Switching, Language Switching, and Neurocognitive Mechanisms section), but—primarily for this reason, and also because there are, as yet, very few studies of nonlinguistic task switching—such studies are not included in our empirical review.

Nevertheless, a variety of purportedly nonlinguistic paradigms have been used to assess the hypothesis that bilinguals have acquired an inhibitory control advantage, and all of the tasks in these studies have entailed the presentation of to-be-ignored (or task-irrelevant) information. The most common approach (e.g., Bialystok et al., 2004) has been to administer the Simon task (Simon, 1969). The standard Simon task (see Fig. 1, top) is a two-alternative forced choice test in which participants are required to discriminate a target that appears to either the left or the right of a fixation stimulus, on the basis of two equiprobable characteristics (e.g., red or green), by way of a manual response. Typically, one of the two alternative response choices is assigned to each hand, and each hand is aligned with a target location to the left or right of fixation. Despite the fact that the location of the stimulus is irrelevant, response times (RTs) are often faster when there is a spatial correspondence between the location of the response and the location of the target (see Proctor & Reeve, 1990, and a special issue of Acta Psychologica dedicated to the Simon effect [Mordkoff & Hazeltine, 2011] for reviews). The critical point here is that the necessary condition to elicit the Simon effect is only that the location of the response must correspond (or not) with the location of the target (Wallace, 1971). As such, the effect is apparently not due to an anatomical relationship between the right hand and a right target, or vice versa, but to how the space–response relationship is cognitively represented. This task, which will hereafter be referred to as the “standard Simon task,” is thought to reflect the extent to which a prepotent motoric association to a task-irrelevant spatial dimension influences manual responding to the task-relevant feature dimension.

Fig. 1

An illustration of the Simon, spatial Stroop, and flanker interference tasks, respectively. Stimulus–response (S–R: Simon effect and spatial Stroop task) and stimulus–stimulus (S–S: flanker task, although arguably also representing an instance of S-R compatibility, see Egner, 2007) compatibility conditions are segregated by the midline. For the Simon and spatial Stroop tasks, when the task-irrelevant location of the task-relevant stimulus dimension (a to-be-discriminated color or arrow, respectively) corresponds with the location of the response, there is S–R compatibility. When there is noncorrespondence, there is S–R incompatibility. For the flanker task, when the task-irrelevant arrows are congruent with the direction of the central target arrow, there is S–S compatibility. When the task-irrelevant arrows are incongruent with the direction of the central target arrow, there is S–S incompatibility

A second task, closely resembling the Simon task but sometimes considered to be more difficult, is known as the “spatial Stroop task” or, occasionally, the “Simon arrow task” (e.g., Bialystok, 2006). In this task (see Fig. 1, middle), the target attribute, rather than being purely nonspatial (e.g., color), is a leftward or rightward arrow whose direction must be discriminated. In this task, the target arrow’s extracted form is a spatial attribute that will either be congruent or incongruent with the task-irrelevant location of the arrow.

The difference in RTs between trials on which the response and target onset positions are compatible (congruent trials) and on which the response and target onset positions are incompatible (incongruent trials) is known as the “Simon effect.” While this type of task, and variations thereof, could conceivably engage language-specific mechanisms to some extent, the Simon effect, which has been found in nonlinguistic species (Courtière, Hardouin, Burle, Vidal, & Hasbroucq, 2007; Urcuioli, Vu, & Proctor, 2005), is generally considered to be nonlinguistic.

A third approach to test this hypothesis has been to use the flanker task (Eriksen & Eriksen, 1974). This task (see Fig. 1, bottom), which has been embedded in the Attentional Network Test (ANT; Fan, McCandliss, Sommer, Raz, & Posner, 2002), has been used to examine inhibitory control processes in bilinguals (e.g., Carlson & Meltzoff, 2008; Costa et al., 2009; Costa, Hernández, & Sebastián-Gallés, 2008). In the flanker component of the ANT, a central target arrow points either left or right. The target arrow may be flanked by two arrows in close spatial proximity on each side. On half of such trials, the flanking arrows point in either the same (congruent trials) or the opposite (incongruent trials) direction as the target arrow. The difference in RTs between trials with congruent and incongruent arrows, which we will refer to as the “flanker effect,” is, much like the Simon effect, taken to index the ability to suppress irrelevant information (but see Kornblum, Hasbroucq, & Osman, 1990, for theoretical dissociations between tasks; or Egner, 2008, for empirical dissociations). The flanker and Simon effects will be collectively referred to as “interference effects.”
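To make the definition of an interference effect concrete, the following is a minimal sketch of how such a score might be computed for a single participant, assuming the effect is taken as the mean RT on incongruent trials minus the mean RT on congruent trials. The function name, the trial format, and the RT values are hypothetical and are not drawn from any of the studies reviewed here; individual studies may also apply trimming or error exclusions that are not modeled.

```python
from statistics import mean


def interference_effect(trials):
    """Return mean incongruent RT minus mean congruent RT, in ms.

    `trials` is an iterable of (condition, rt_ms) pairs, where condition
    is "congruent" or "incongruent". This sketch assumes that only
    correct-response trials are passed in.
    """
    trials = list(trials)  # allow generators to be scanned twice
    congruent = [rt for cond, rt in trials if cond == "congruent"]
    incongruent = [rt for cond, rt in trials if cond == "incongruent"]
    if not congruent or not incongruent:
        raise ValueError("need at least one trial of each type")
    return mean(incongruent) - mean(congruent)


# Hypothetical RTs for one participant (illustrative values only).
example_trials = [
    ("congruent", 480), ("incongruent", 530),
    ("congruent", 500), ("incongruent", 550),
    ("congruent", 520), ("incongruent", 540),
]

effect = interference_effect(example_trials)
print(effect)
```

The same computation applies to the Simon, spatial Stroop, and flanker tasks, because each defines its effect as a difference between incongruent and congruent RTs.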

All three of these tasks have been used to answer the question: Do bilinguals enjoy a task-general inhibitory control advantage? An affirmative answer to this question will be referred to as the bilingual inhibitory control advantage (BICA) hypothesis, which can be expressed as follows:

Frequent use of the inhibitory processes involved in language selection in bilinguals will result in more efficient inhibitory processes, which will confer general advantages on nonlinguistic interference tasks—that is, those requiring conflict resolution. These advantages will be reflected in reduced interference effects in bilinguals as compared to monolinguals. In other words, bilinguals should show an advantage over monolinguals on trials with response conflict.

A critical review of the literature that has used these tasks to answer this question is the principal aim of this article.

The studies

Other studies have examined inhibitory control processes in bilinguals using the Stroop task (e.g., Bialystok et al., 2008), spatial negative-priming (Treccani, Argyri, Sorace, & Della Sala, 2009), and inhibition-of-return (IOR) paradigms (Colzato et al., 2008). The Stroop task has not been examined in this review because of its close relationship to language. The relationship of an IOR paradigm to active inhibitory control processes, on the other hand, is a much more ambiguous case, given that opinions are highly divergent on the causes (Hunt & Kingstone, 2003; Klein, 2000; Souto & Kerzel, 2009) and effects (Abrams & Dobkin, 1994; Taylor & Klein, 2000) of IOR. Thus, it is difficult to discern what greater IOR for bilinguals as compared to monolinguals (Colzato et al., 2008; but see Hernández et al., 2010, for a nonreplication) might mean. On a historical note, the Colzato et al. investigation, showing greater IOR in bilinguals at long cue–target intervals, concluded that this language group possesses a superior ability to maintain action goals, whereas a greater spatial negative-priming effect in bilinguals (Treccani et al., 2009) has been taken as evidence in favor of BICA.

To date, eight studies have employed the standard Simon task or the Simon arrow task to examine inhibitory control processes in bilinguals relative to monolinguals (see Table 1). Four studies have examined flanker interference, either independently or embedded in the ANT. One further study has employed flanker interference with a Simon-like component (see note 2 on p. 7). These studies are all presented in Table 1, along with their key methodological features.

Table 1 Key information about all the experiments that have so far been published that address the bilingual executive control advantage hypothesis. Bold font denotes the tasks that are illustrated in the empirical review

Interference effects: overview

The magnitudes of interference effects in bilinguals relative to monolinguals are shown for all experiments in Fig. 2a. In this scatterplot, each dot represents the interference effects from a single experiment, with the x-axis projection representing the interference effect experienced by bilinguals and the y-axis projection representing the interference effect experienced by monolinguals. Data points above the diagonal line reflect a bilingual advantage on inhibitory control (smaller bilingual interference effect); data points below the line reflect a monolingual advantage on inhibitory control. The bilingual advantage for each experiment (interference score for monolinguals minus the interference score for bilinguals) is plotted as a function of the mean age of the participants in Fig. 2b. At this preliminary stage, we will comment on what these overall patterns suggest. It is important to note, however, that very few of these studies are identical in design. Although, at their core, they explore Simon and flanker interference, methodological differences may account for significant disparities among the data points. These variations will be discussed below, and in more detail in footnotes.
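The reading of the scatterplot described above can be sketched in a few lines. This is an illustration only: the function names and the experiment labels and values are hypothetical, chosen to show one point on each side of the diagonal and one on it.

```python
def bilingual_advantage(bilingual_effect_ms, monolingual_effect_ms):
    """Monolingual interference effect minus bilingual interference effect."""
    return monolingual_effect_ms - bilingual_effect_ms


def side_of_diagonal(bilingual_effect_ms, monolingual_effect_ms):
    """Locate a point (x = bilingual, y = monolingual) relative to y = x."""
    advantage = bilingual_advantage(bilingual_effect_ms, monolingual_effect_ms)
    if advantage > 0:
        return "above (bilingual advantage)"
    if advantage < 0:
        return "below (monolingual advantage)"
    return "on the diagonal (no difference)"


# Hypothetical (bilingual, monolingual) interference effects in ms.
experiments = {
    "Exp X": (30, 45),
    "Exp Y": (52, 50),
    "Exp Z": (40, 40),
}

for label, (bilingual, monolingual) in experiments.items():
    print(label,
          bilingual_advantage(bilingual, monolingual),
          side_of_diagonal(bilingual, monolingual))
```

A positive difference places the point above the diagonal, which is the convention used for the bilingual advantage throughout the figures.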

Fig. 2

a Top panel: The interference effect for monolinguals versus the interference effect for bilinguals, for each condition in all experiments. All values above the diagonal show an advantage for bilinguals. Conversely, all values below the diagonal show an advantage for monolinguals. (The data from Costa et al. (2008) are collapsed across all networks measured by the ANT.) Data from the studies presented in Table 1 have been included in this figure. Bottom panel: Because of some extreme scores (e.g., Bialystok et al., 2004), the ranges of the abscissa and ordinate make it difficult to visualize differences between language groups. The bottom figure is an inset of the area of the top panel in which most of the data lie. b Differences between bilinguals and monolinguals on the interference effect as a function of age. A positive value is indicative of an advantage for bilinguals on the interference effect (i.e., bilinguals encounter less conflict). Data from the studies presented in Table 1 are included in this figure

It is apparent from Fig. 2a that a few experiments have reported dramatically large interference effects and bilingual advantages, with the remaining data showing much smaller interference effects and, overall, little or no bilingual advantage. It is apparent from Fig. 2b, firstly, that the magnitudes of the interference effects between monolinguals and bilinguals are very similar for young adults and children. The absence of a bilingual advantage in these age groups is simply inconsistent with the proposal that bilingualism has a general positive effect on inhibitory control processes (i.e., BICA). Secondly, the magnitude of the interference effects seems to become markedly more pronounced in the middle-aged and old-aged participants. Importantly, for these age groups, the bilingual advantage appears to be robust. Although there is evidence to suggest that the magnitude of the Simon effect increases as a function of age, the standard Simon effect in older adults seems to peak at around 70 ms (Kubo-Kawai & Kawai, 2010; Van der Lubbe & Verleger, 2002). Thus, it is very puzzling that Simon effects for the monolingual language groups, in particular, were (as can be seen in Fig. 2a) sometimes around the 1,000–1,800 ms range (Bialystok et al., 2004; Bialystok, Martin, & Viswanathan, 2005).

Overall (global) RT effects: overview

As the following sections will show, the bilingual advantage on the interference effect is a relatively rare phenomenon, occurring only under a restricted set of conditions. A secondary focus that has emerged in this literature (see, e.g., Appendix A in Costa et al., 2009) is known as the overall or global RT advantage (“global advantage”). This advantage refers to the somewhat unanticipated finding that bilinguals typically outperform monolinguals on both congruent and incongruent trials (Bialystok et al., 2004; Costa et al., 2009). This serendipitous and robust finding is beyond the sphere of the inhibitory control model and is, so far, lacking a stable theoretical foundation. The hypothesis that bilinguals enjoy domain-general executive functioning advantages, as indexed by largely equivalent performance benefits on all conditions in nonlinguistic interference tasks, will be referred to as the “bilingual executive processing advantage” (BEPA) hypothesis. This advantage is apparent across all age groups, though in young adult bilinguals it is sometimes not obtained unless task difficulty is high (Bialystok, 2006; Costa et al., 2009). Figure 3a shows the global advantage for bilinguals. To avoid any contribution to the global advantage from bilingual advantages that may be present on the interference effect, the extant data are plotted for congruent trials (Fig. 3b), and in the graphic presentations that follow, the global effect will be presented using only congruent trials. For completeness, the data from incongruent trials are shown separately (Fig. 3c). With few exceptions (e.g., Bialystok et al., 2004; Bialystok et al., 2008), the bilingual advantage is equivalent for both congruent and incongruent trials (resulting in a null effect of language group on the interference effect).
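The claim that an equivalent advantage on congruent and incongruent trials yields a null group effect on interference can be made explicit with a short derivation. The notation is introduced here for illustration only and does not appear in the studies reviewed: RT_{g,c} denotes the mean RT of language group g (M for monolinguals, B for bilinguals) in condition c (congruent or incongruent).

```latex
\begin{align*}
I_g &= \mathrm{RT}_{g,\mathrm{inc}} - \mathrm{RT}_{g,\mathrm{con}},
      \qquad g \in \{M, B\}
      && \text{(interference effect)} \\
A_c &= \mathrm{RT}_{M,c} - \mathrm{RT}_{B,c},
      \qquad c \in \{\mathrm{con}, \mathrm{inc}\}
      && \text{(bilingual RT advantage)} \\
I_M - I_B
    &= \left(\mathrm{RT}_{M,\mathrm{inc}} - \mathrm{RT}_{M,\mathrm{con}}\right)
     - \left(\mathrm{RT}_{B,\mathrm{inc}} - \mathrm{RT}_{B,\mathrm{con}}\right) \\
    &= A_{\mathrm{inc}} - A_{\mathrm{con}}
\end{align*}
```

Under these definitions, the bilingual advantage on the interference effect is exactly the amount by which the advantage on incongruent trials exceeds the advantage on congruent trials. When the two are equal, the difference is zero, however large the global advantage may be.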

Fig. 3

a Top panel: Overall bilingual advantage on response times (RTs), collapsed across congruence, across all studies. A value above the diagonal is indicative of an advantage for bilinguals on global response times (i.e., bilinguals respond faster). The data from all studies presented in Table 1 (wherein the tasks represented by this figure are highlighted in bold) have been included in this figure. Bottom panel: Because of some extreme scores (e.g., Bialystok et al., 2004), the ranges of the abscissa and ordinate make it difficult to visualize differences between language groups. The bottom figure is an inset of the area of the top panel in which most of the data lie. b Top panel: Bilingual advantage on response times (RTs) across all studies, for congruent trials only. A value above the diagonal is indicative of an advantage for bilinguals on global response times (i.e., bilinguals respond faster). The data from all studies presented in Table 1 (wherein the tasks represented by this figure are highlighted in bold) have been included in this figure. Bottom panel: Because of some extreme scores (e.g., Bialystok et al., 2004), the ranges of the abscissa and ordinate make it difficult to visualize differences between language groups. The bottom figure is an inset of the area of the top panel in which most of the data lie. c Top panel: Bilingual advantage on response times (RTs) across all studies, for incongruent trials only. A value above the diagonal is indicative of an advantage for bilinguals on global response times (i.e., bilinguals respond faster). The data from all studies presented in Table 1 (wherein the tasks represented by this figure are highlighted in bold) have been included in this figure. Note that very few of these studies were identical in methodology (see the text for details). Bottom panel: Because of some extreme scores (e.g., Bialystok et al., 2004), the ranges of the abscissa and ordinate make it difficult to visualize differences between language groups. The bottom figure is an inset of the area of the top panel in which most of the data lie

Recall that the inhibitory control model predicts superior performance for bilinguals on incongruent trials specifically. It is here where broadly defined inhibitory processes, or something akin to an SAS, might be more efficient in suppressing task-irrelevant input. It is hard to imagine how more efficient inhibitory processes would confer a benefit on congruent trials. That the bilingual advantage on the interference effect emerges almost invariably in the presence of a global advantage casts doubt on the role of a centrally based inhibitory process developed to resolve all instances of possible conflict (see Costa et al., 2009, for similar theoretical assertions). Yet the fact that bilinguals outperform monolinguals in overall performance, so long as the task entails some level of conflict, strongly suggests that there is a cognitive advantage related to second-language learning (Bialystok, 2006; 2009a; Bialystok & Craik, 2010; Costa et al., 2009). As we will see later, and to reinforce an earlier point, current theoretical approaches appear to be at a loss to explain this robust phenomenon. We will return to this issue and offer an explicit proposal to explain the bilingual global advantage, after a more detailed examination of the empirical results.

Interference and global effects across the lifespan

Performance of elderly and middle-aged monolinguals and bilinguals on interference tasks

Four studies have examined interference effects in middle- or old-aged bilinguals, or in both, as compared to age-matched monolinguals (Bialystok et al., 2004 [these data were later reproduced in Bialystok, Martin, & Viswanathan, 2005; for comparison, their younger-sample data are presented in the next section]; Bialystok et al., 2008; Emmorey, Luk, Pyers, & Bialystok, 2009). These data are presented in Figs. 4 (left panel) and 5 (left panel) for the old-aged and middle-aged language groups, respectively. One of these studies (Bialystok et al., 2004; and see Bialystok, Martin, & Viswanathan, 2005, for additional discussion of these data) was a seminal investigation in this area, because of its report of significant advantages on the Simon effect for bilinguals relative to monolinguals.

Fig. 4

Left panel: Magnitude of the bilingual advantage on the interference effect for elderly adults (mean age > 60 years). Right panel: Magnitude of the global RT advantage (based on congruent RTs; see the text) from the same studies. The studies from which the data were derived appear between the two panels, and the letter identifiers correspond to the study and task information in Table 1. Positive values, in both cases, are indicative of an advantage for bilinguals

Fig. 5

Left panel: Magnitude of the bilingual advantage on the interference effect for middle-aged adults (mean age = 40–60 years). Right panel: Magnitude of the global RT advantage (based on congruent RTs; see the text) from the same studies. The studies from which the data were derived appear between the two panels, and the letter identifiers correspond to the study and task information in Table 1. Positive values, in both cases, are indicative of an advantage for bilinguals

For five reasons, some potentially relevant aspects of their experimental approaches will be outlined in detail. First, these studies generated a tremendous amount of interest in the possibility that bilinguals have more efficient inhibitory control processes than monolinguals. As such, conditions that are fruitful for observing these results ought to be better known, so as to encourage follow-up research. Second, as already noted (in the Interference effects: overview section), at least one aspect of the results from these studies is anomalous, in that the magnitudes of the interference effects are extraordinarily large. Third, interference effect differences between language groups are typically only reported in middle- and older-aged groups. Fourth, these empirical data have not been replicated or are only partially replicated under a very restricted set of conditions (e.g., Bialystok et al., 2008), and it is thus important to identify why there are empirical differences. Finally, the interesting explanation of these nonreplications, which for the most part have been conducted with younger participants, is that the bilingual advantage on inhibitory control processes becomes more apparent as inhibitory control processes decline with increasing age. A less interesting explanation is that one or more methodological features were present in the original studies but not in subsequent investigations. These two possibilities are not necessarily mutually exclusive.

Using the Simon task, Bialystok et al. (2004) published the first study to evaluate general (nonlinguistic) inhibitory control processes in bilinguals as compared to monolinguals. In a series of three experiments, it was reported that bilinguals showed a smaller Simon effect than monolinguals, and this was interpreted as providing strong support for the BICA hypothesis by showing that older bilinguals had superior inhibitory control processes, perhaps reflecting a greater immunity to the ubiquitous cognitive decline with normal aging that is seen in this important executive control function.

In Experiment 1, Bialystok et al. (2004) administered a standard Simon task to two language groups (monolingual and bilingual) comprising 20 participants, each of which was decomposed into subgroups on the basis of age (middle-aged participants, with an age ranging from 30 to 54 years, and elderly participants, with an age ranging from 60 to 88 years). For each participant in the monolingual group, there was a gender-matched participant in the bilingual group of the same age. All bilinguals had begun to learn a second language at the age of 6 years and were, for the most part, considered to be equally proficient in both languages (as indexed by the language background questionnaire). The monolinguals were all native Canadian residents, while the bilinguals were all native residents of Southern India. This confound raises some concern about potential cultural differences, or the possibility of one or more uncontrolled demographic factors that may have influenced the outcome of this study (e.g., Bialystok, 2001; Morton & Harper, 2007; see the section Hidden factors: the controversy surrounding the implementation of appropriate demographic controls, below). Nevertheless, participants were considered to be of a similar education background (given that they had all obtained bachelor’s degrees); all participants were selected from middle-class socioeconomic environments; and both groups performed similarly on Raven’s Standard Progressive Matrices, an index of general reasoning abilities and intelligence.

The first experimental design (Exp. 1) for the Simon task consisted of only 28 experimental trials, for which there was an even distribution of congruent and incongruent trials. This is, as the authors admit, an uncharacteristically small number of trials for a Simon task. Participants had to discriminate leftward- or rightward-presented squares on the basis of color (red or blue). The results revealed unusually large Simon effects. Middle-aged and old-aged monolinguals showed 535- and 1,713-ms Simon effects, respectively, whereas middle-aged and old-aged bilinguals showed 40- and 748-ms Simon effects, respectively. Clearly, middle-aged and old-aged bilinguals showed smaller Simon effects than monolinguals. Nevertheless, in general, bilinguals performed all aspects of the task (congruent and incongruent trials) more rapidly than monolinguals (e.g., middle-aged monolingual congruent RT = 770 ms, whereas middle-aged bilingual congruent RT = 497 ms; old-aged monolingual congruent RT = 1,437 ms, whereas old-aged bilingual congruent RT = 911 ms). The latter findings are not easily explained by BICA.
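The dissociation between the two measures can be made concrete with the Experiment 1 means reported above. The following Python sketch is illustrative only: the dictionary layout and the function name `bilingual_advantage` are assumptions introduced here, and the sign convention (monolingual minus bilingual, so that positive values favor bilinguals) is the one used in the figures of this review.

```python
# Experiment 1 means (ms) as reported in the text for Bialystok et al. (2004).
# The data layout and the function name are hypothetical, chosen for illustration.
EXP1 = {
    "middle-aged": {
        "monolingual": {"congruent_rt": 770, "simon_effect": 535},
        "bilingual": {"congruent_rt": 497, "simon_effect": 40},
    },
    "old-aged": {
        "monolingual": {"congruent_rt": 1437, "simon_effect": 1713},
        "bilingual": {"congruent_rt": 911, "simon_effect": 748},
    },
}


def bilingual_advantage(age_group, measure):
    """Monolingual minus bilingual value (ms); positive values favor bilinguals."""
    group = EXP1[age_group]
    return group["monolingual"][measure] - group["bilingual"][measure]


for age in EXP1:
    print(
        age,
        "| interference advantage (ms):", bilingual_advantage(age, "simon_effect"),
        "| global advantage on congruent RT (ms):", bilingual_advantage(age, "congruent_rt"),
    )
```

Both measures are positive in both age groups, so the smaller Simon effect in bilinguals is accompanied by faster responding even on trials that involve no conflict.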

Four conditions were tested in separate blocks in Experiment 2. The blocks consisted of 24 trials, and there were two blocks in each condition, for a total of 192 trials. One condition was a control condition in which participants had to discriminate centrally presented targets on the basis of color. A second condition, similar to that in Experiment 1, consisted of a color discrimination task with peripherally presented targets (allowing for a measure of the Simon effect). Two other conditions were identical in all respects to the previously defined conditions, with the following exception: Instead of a two-stimulus/two-response discrimination task, four stimuli were mapped onto two responses in an effort to increase the load of the stimulus–response mapping rules that the participants would have to hold in working memory. These conditions were introduced in a preset order and then reversed (allowing for 48 trials in each condition), to assess the possibility that bilinguals, instead of possessing a superior ability to ignore irrelevant input, simply enjoyed better working memory ability. In this case, keeping two colors instead of four colors in mind would theoretically impose less of a load on the working memory system. One possibility, then, was that the Simon effect would be even more pronounced in monolinguals when working memory demands were elevated.

Unlike in Experiment 1, four practice trials were provided before the two-choice discrimination conditions, and eight practice trials were provided for the four-choice discrimination conditions, to demonstrate the unique configuration of the stimuli. In these practice blocks, if an error was made, the trial was recycled into the program until all trials were completed without error. The experiment consisted of 94 participants [64 middle-aged adults (ranging from 30 to 58 years of age) and 30 older adults (ranging from 60 to 80 years of age)]. The groups were age- and gender-matched but differed socioculturally: The bilingual groups were composed of Cantonese–English residents of Hong Kong, Tamil–English residents of India, and French–English residents of Canada. All monolingual (English-speaking) participants resided in Canada. Despite these eclectic cultural backgrounds, the mean scores on the Cattell Culture Fair Intelligence Test, a nonverbal test of general intelligence, were similar. Both language groups also scored similarly on measures of working memory span.

As in Experiment 1, bilinguals were advantaged on the Simon effect as compared to monolinguals. Perhaps because of the extra practice, the RT differences in the interference effects between the two groups were substantially less than in Experiment 1. In the condition that most closely resembled Experiment 1, the magnitude difference between middle-aged bilinguals and monolinguals was 116 ms, and the respective difference was 371 ms in the elderly group. The bilingual advantage in the elderly group was significantly greater than that in the younger group. However, subsequent experimentation (Exp. 3 in Bialystok et al., 2004) on middle-aged bilinguals and monolinguals from the same communities, while replicating the bilingual advantage early in practice, also demonstrated that this advantage diminished to nonsignificance as a function of practice. Furthermore, Experiment 2 revealed that the costs of increased working memory load were greater for monolinguals than for bilinguals on the univalent (central stimulus) color discrimination task. When there was a one-to-one mapping of colors to hands, monolinguals and bilinguals performed equivalently on RT. When, however, there was a two-to-one mapping of colors to hands, bilinguals outperformed monolinguals on RTs (by 460 ms), indicating that the global bilingual advantage might not be restricted to conflict resolution tasks, so long as the working memory load is elevated. Nevertheless, these seminal data, particularly from Experiment 2, are consistent with a role for superior inhibitory control processes in bilinguals relative to monolinguals (i.e., BICA). Subsequent investigation, however, has cast considerable doubt on this interpretation.

Two studies, in addition to Bialystok et al. (2004), have examined interference effect differences between bilinguals and monolinguals in these age groups (Bialystok et al., 2008; Emmorey et al., 2009) while using a large number of experimental trials and prior increased practice. The advantage of more trials is that it mitigates any factors between groups that might relate to initial strategy recruitment or learning how to perform the task successfully. Of course, bilingual advantages in such processes would be interesting, but they would not support the BICA hypothesis.

Emmorey et al. (2009) administered a flanker task in which the target arrow was positioned in the center or to the left or right of center (see note 2). Middle-aged participants (mean age = 47.76 years) were instructed to indicate the direction in which the target arrow was pointing. The irrelevant arrows pointed in either the same direction as or the opposite direction from the target arrow. The 48 trials per block consisted of an even distribution of trials in which the flankers were either congruent or incongruent with the target arrow. Two blocks of trials were administered to three language groups [a bilingual, a monolingual, and a “bimodal” group (i.e., a group that was fluent in American Sign Language)], and 12 practice trials with feedback were provided to each participant before a block of experimental trials. Education level was taken as an index of socioeconomic status (SES), and participants in all groups were statistically equivalent on this measure. Both the bilingual and bimodal groups had a lifetime of experience in both languages, but their ages of language acquisition varied, despite the fact that most bilinguals had developed their second language early in childhood. Verbal reasoning abilities, age, and proficiency ratings (between the bilingual and monolingual groups) were balanced across groups.

Bilingual participants responded more rapidly on congruent and incongruent trials than did the other two groups (consistent with BEPA). The difference in the flanker effects between the monolingual and bilingual groups was not significant (although the RTs between these two groups on congruent and incongruent trials were only presented in figure form, an extraction of these data revealed a negligible, ~4-ms advantage for monolinguals on the interference effect).

Bialystok et al. (2008) conducted an investigation using the spatial Stroop task (see Figure 1, spatial Stroop). There were 48 participants, half bilingual and half monolingual, in the elderly group (mean age = 68 years; data from the young participants were reported in the previous section). While the language groups were equated on measures of working memory ability, a monolingual advantage was found on several verbal tasks. The bilingual language group comprised participants with heterogeneous linguistic backgrounds, with a wide range of second languages. In the elderly group, 20 bilinguals were immigrants, with all except 4 having arrived in Canada before the age of 12. Years of formal education were also compared within age groups; there were no statistical differences within these groups. Bialystok, Craik, and Luk administered 192 total trials (96 congruent and 96 incongruent) in the spatial Stroop task, separated by two blocks of 96 trials in two other conditions (see note 3). They demonstrated that bilinguals outperformed monolinguals on the interference effect but not on overall RTs (see Fig. 4, right panel). Closer inspection of the composite scores for the interference effect, however, reveals a puzzling pattern of results. Bilinguals performed, on average, 10 ms faster on incongruent trials relative to monolinguals (bilinguals = 741 ms, monolinguals = 751 ms). However, the mean difference between monolinguals and bilinguals on congruent trials was a 50-ms monolingual advantage (monolinguals = 691 ms, bilinguals = 741 ms). Collectively, these data show no global advantage. Nevertheless, the surprising tendency for monolinguals to respond about 50 ms faster on congruent and 10 ms slower on incongruent trials, as compared to bilinguals, results in a statistically significant advantage for bilinguals on the interference task (60-ms advantage).
Yet, seemingly against the predictions of BICA, this advantage cannot be attributed to the ability of bilinguals to outperform monolinguals on incongruent trials. Rather, it appears to be attributable to the exceptional finding that bilinguals were responding, on average, 50 ms slower than monolinguals on congruent trials (see note 4).
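The arithmetic behind this pattern can be traced with a short Python sketch. The means are those reported above for the elderly group in Bialystok et al. (2008); the names `MEANS` and `interference_effect` are assumptions introduced here for illustration, and differences are taken as monolingual minus bilingual so that positive values favor bilinguals.

```python
# Elderly-group mean RTs (ms) as reported in the text for Bialystok et al. (2008).
# The variable and function names are hypothetical.
MEANS = {
    "bilingual": {"congruent": 741, "incongruent": 741},
    "monolingual": {"congruent": 691, "incongruent": 751},
}


def interference_effect(group):
    """Incongruent minus congruent mean RT (ms) for one language group."""
    return MEANS[group]["incongruent"] - MEANS[group]["congruent"]


# Positive differences favor bilinguals (monolingual minus bilingual).
advantage_on_interference = interference_effect("monolingual") - interference_effect("bilingual")
advantage_on_incongruent = MEANS["monolingual"]["incongruent"] - MEANS["bilingual"]["incongruent"]
advantage_on_congruent = MEANS["monolingual"]["congruent"] - MEANS["bilingual"]["congruent"]

print("interference effects (ms):",
      {group: interference_effect(group) for group in MEANS})
print("bilingual advantage on the interference effect (ms):", advantage_on_interference)
print("of which from incongruent trials (ms):", advantage_on_incongruent)
print("of which from slower congruent trials (ms):", -advantage_on_congruent)
```

Of the 60-ms difference in interference effects, 10 ms comes from faster bilingual responses on incongruent trials and 50 ms from slower bilingual responses on congruent trials.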

Performance of monolingual and bilingual young adults on interference tasks

Seven studies, comprising eight sets of data, were extracted from various experiments investigating, between language groups, flanker interference/ANT (Costa et al., 2009; Costa et al., 2008; Luk, Anderson, Craik, Grady, & Bialystok, 2010), Simon (Bialystok, Craik, et al., 2005; Bialystok, Martin, & Viswanathan, 2005), and spatial Stroop (Bialystok et al., 2008; Bialystok & DePape, 2009) tasks. For studies in which the standard Simon and flanker tasks were used and in which there was a random but equal assortment of congruent and incongruent trials, the magnitude of the difference scores on the interference effect (left side of Fig. 6a) between bilingual and monolingual groups was never significant, or never maintained significance after practice (Bialystok, Craik, et al., 2005; Bialystok & DePape, 2009; Bialystok et al., 2008, Exp. 1; Bialystok, Martin, & Viswanathan, 2005, Exp. 3; Costa et al., 2009; Costa et al., 2008; see note 5).

Fig. 6

a Left panel: Magnitude of the bilingual advantage on the interference effect for young adults (mean age = 20–30 years). (The data from Costa et al. (2008) are from a single study in which the bilingual advantage was plotted on overall RTs and the interference effect on all networks (orienting, alerting, and executive) in the ANT and on the no-cue condition to illustrate the results on all ANT measures. The Bialystok, Craik, et al. (2005) study compared two bilingual groups (one Cantonese and one French) against a monolingual control group. Consequently, both the Cantonese and French groups are represented in this figure.) Right panel: Magnitude of the global RT advantage (based on congruent RTs; see the text) from the same studies. The studies from which the data were derived appear between the two panels, and the letter identifier corresponds to the study and task information in Table 1. Positive values, in both cases, are indicative of an advantage for bilinguals. b Left panel: Magnitude of the bilingual advantage on the interference effect for young adults (mean age = 20–30 years) on unconventional implementations of the spatial Stroop and Simon tasks and the ANT. The studies from which the data were derived appear between the two panels, and the letter identifier corresponds to the study and task information in Table 1. Although varieties of the ANT that employ neutral trials (e.g., Costa et al., 2008) or no neutral trials (Costa et al., 2009) but a 1:1 ratio of congruent to incongruent trials do not exactly constitute unconventional implementations, the relative proportions of incongruent trials afford an opportunity to observe any effect that conflict trials might have on modulating either interference or the global RT effect. All bilinguals and monolinguals from Bialystok (2006) are treated collectively, irrespective of video game history. Right panel: Magnitude of the global RT advantage (based on congruent RTs; see the text) from the same studies. 
Positive values, in both cases, indicate an advantage for bilinguals. In two cases (denoted by unfilled circles), a reverse interference effect was obtained (i.e., faster RTs on incongruent than on congruent trials were obtained in both language groups). In these cases, interpretation is difficult, but note that the same convention of subtracting the monolingual interference effect from the bilingual interference effect was used to obtain the bilingual advantage (i.e., positive values).

Noteworthy omissions from this figure are the data from Bialystok (2006) and selected data from Costa et al. (2009). These instances entailed unconventional uses of interference tasks, which are therefore discussed and plotted separately (see Fig. 6b). Bialystok (2006) administered even distributions of congruent and incongruent trials on both the standard Simon and spatial Stroop tasks while directly manipulating the frequency of intertrial response switches (i.e., how often a stimulus change occurred that required a response different from that on the preceding trial). That is, for each task, there was a fixed order of trial presentation in blocks in which there were many intertrial response switches (28 of 40 trials) and fewer intertrial response switches (15 of 40 trials), in an effort to examine the effects of intertrial response switching. The principal result here was that bilinguals outperformed monolinguals only on global RTs and only on the spatial Stroop task when there were many intertrial response switches (presumably, then, when task difficulty was highest; see note 6). Thus, again, in no condition was there a significant bilingual advantage on the interference effect.

Costa et al. (2009) parametrically manipulated the proportions of congruent trials in blocks of the ANT without neutral trials. There were four conditions, each comprising three blocks of trials: an 8%-, a 92%-, a 75%-, and a 50%-congruent condition. When extreme probability manipulations were used (e.g., 8% and 92%), a bilingual advantage was observed on neither the interference effect nor global RTs (see note 7). We doubt the usefulness of data from these extreme blocks, because it is difficult if not impossible to ascertain whether participants are ignoring or paying attention to and strategically taking advantage of the flankers whose direction predicts the correct response with 92% accuracy (e.g., when 8% of the trials are congruent, by responding in the direction opposite to the flanking arrows, participants could achieve 92% accuracy; conversely, when 92% of the trials are congruent, the same level of accuracy could be achieved by responding in the direction of the flanking arrows). When the probability of congruent trials was 75%, a bilingual advantage on the interference effect appeared in the first block of trials but disappeared for the remaining two blocks, whereas a statistically significant global advantage was apparent (see note 8). When the probability was 50%, there was no advantage for bilinguals on the interference effect, whereas a global advantage was apparent that was numerically greater than in the 75% condition (see note 9).
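The concern about the extreme blocks can be illustrated with a small Python sketch. It computes the accuracy available to a hypothetical participant who ignores the target entirely and responds only on the basis of the flankers. The function name `best_flanker_only_strategy` and the use of integer percentages are assumptions made for this illustration and are not part of the procedure of Costa et al. (2009).

```python
def best_flanker_only_strategy(percent_congruent):
    """Return (strategy, accuracy in %) for a responder who never looks at the target.

    'follow' means responding in the direction of the flankers, which is correct
    on every congruent trial; 'oppose' means responding in the opposite direction,
    which is correct on every incongruent trial.
    """
    follow_accuracy = percent_congruent
    oppose_accuracy = 100 - percent_congruent
    if follow_accuracy >= oppose_accuracy:
        return "follow", follow_accuracy
    return "oppose", oppose_accuracy


# The four congruency proportions described in the text.
for percent in (8, 92, 75, 50):
    strategy, accuracy = best_flanker_only_strategy(percent)
    print(f"{percent}% congruent: '{strategy}' the flankers -> {accuracy}% correct")
```

Only in the 50% condition do the flankers carry no information about the correct response, which is one reason to treat the 8% and 92% blocks with caution.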

In young adults, conventional and unconventional implementations of interference effects alike revealed little evidence to suggest that bilinguals show superior inhibitory control relative to monolinguals. On the other hand, there was a remarkably robust advantage for bilinguals on global RTs. Of the seven studies that examined this effect via standard interference tasks (Fig. 6a), four revealed an overall RT advantage for bilinguals (on the flanker task, Costa et al., 2009, Exp. 2, and Costa et al., 2008; on the Simon task, Bialystok, Craik, et al., 2005; on the spatial Stroop task, Bialystok & DePape, 2009), with all additional investigations showing numerical advantages for bilinguals on overall RT (Bialystok, Martin, & Viswanathan, 2005; Bialystok et al., 2008; Luk et al., 2010). Critically, in this age group, when a central arrow was presented (i.e., in cases where there was no apparent competition between automatically elicited task-irrelevant and task-relevant responses), bilinguals and monolinguals performed similarly (Bialystok et al., 2008; Bialystok, Martin, & Viswanathan, 2005), suggesting that either task difficulty or the introduction of response competition (Bialystok, 2009a; Bialystok & Craik, 2010), two ideas that are not necessarily mutually exclusive, leads to an overall RT advantage.

Performance of bilingual and monolingual children on interference tasks

Four studies comprising seven sets of experimental data have examined interference effect differences between young bilingual and monolingual children using either the flanker task (embedded in the ANT) or the Simon task (Bialystok, Martin, & Viswanathan, 2005; Carlson & Meltzoff, 2008; Martin-Rhee & Bialystok, 2008; Morton & Harper, 2007). The magnitude of the Simon effect differences between bilinguals and monolinguals is presented in the left panel of Fig. 7 for all studies. The RTs in Carlson and Meltzoff (2008), the only study using the flanker task in this age range, were not reported and therefore could not be included in this figure.

Fig. 7

Left panel: Magnitude of the bilingual advantage on the interference effect for young children (mean age < 10 years). Right panel: Magnitude of the global RT advantage (based on congruent RTs; see the text) from the same studies. The studies from which the data were derived appear between the two panels, and the letter identifier corresponds to the study and task information in Table 1. Positive values, in both cases, are indicative of an advantage for bilinguals.

No studies investigating the Simon effect in young children revealed significant differences in the magnitude of the effect between monolingual and bilingual groups (see Fig. 7, left panel; Bialystok, Martin, & Viswanathan, 2005; Martin-Rhee & Bialystok, 2008; Morton & Harper, 2007). Additionally, all experiments in this group comprised a maximum of 40 experimental trials. This is an exceptionally small number of trials for research on the Simon effect, a choice that was explicitly or implicitly justified as ensuring that the task would sustain the child’s attention. All studies, in addition, controlled for various and ultimately different potentially confounding variables.

Whereas the bilingual advantage on the interference effect was conspicuously absent in this age group, the global advantage materialized strikingly often. In two of the three investigations, comprising six experiments, the global RT advantage for bilinguals was observed five of six times (see Fig. 7, right panel; on the standard Simon task, Bialystok, Martin, & Viswanathan, 2005, Exps. 1 and 2, and Martin-Rhee & Bialystok, 2008, Exps. 1 and 2; on the spatial Stroop task, Martin-Rhee & Bialystok, 2008, Exp. 3). Importantly, this global RT advantage was usually not seen in the absence of response competition, so long as task demands were minimal (e.g., Martin-Rhee & Bialystok, 2008, for responses to centrally presented arrows).

Claiming that SES had been inadequately controlled in most previous studies, Morton and Harper (2007) directly controlled for it, producing the exceptional finding of neither an overall RT nor an interference effect advantage for bilinguals. Although their interpretation is somewhat controversial (see the next section for more information), Morton and Harper (2007) suggested that instantiating better controls over SES might have eliminated bilingual advantages.

One final note is required for the Carlson and Meltzoff (2008) findings. In this study, whose authors assiduously controlled for SES, a battery of tests (including the ANT) was administered to bilingual and monolingual children. This battery of nine tests included language-based executive tasks (e.g., Simon says) and delayed-gratification tasks, which were collectively analyzed along with the ANT to produce Composite Executive Function scores. These scores on linguistic and nonlinguistic tasks were aggregated and used as an index of executive functioning ability. On the measure of accuracy in the ANT, there was no statistical difference between the language groups. In this literature, however, there is seldom a statistical difference between language groups on accuracy (e.g., Costa et al., 2009; Costa et al., 2008; Emmorey et al., 2009). Without analyzing the RT data, it was not possible to determine whether bilinguals enjoyed superior performance.

Hidden factors: the controversy surrounding the implementation of appropriate demographic controls

Having thus demonstrated the empirical differences underlying bilingual and monolingual language groups on nonlinguistic interference tasks, before the theoretical issues can be tackled in more detail, one major assumption of the present article must be fully disclosed. This review has operated under the assumption that demographic factors have been sufficiently controlled in the research programs guiding bilingual research on inhibitory control. It is well recognized that there are a multitude of factors, aside from early exposure to a bilingual environment, that might play a crucial role in shaping the information-processing (or if you prefer, neurocognitive) systems responsible for behavior. When these factors are not well controlled, a primary concern is that some of them might contribute or lead directly to what would appear to be bilingual processing advantages, and indeed, concerns of this sort have permeated the bilingualism literature.

In the context of children, this possibility was expressed eloquently by Bialystok (2001). It is worth repeating here, as it applies throughout the life span:

The constellation of social, economic and political circumstances of life have a large bearing on how children will develop both linguistically and cognitively. If bilingual children differ from each other in these dimensions, as they surely do, then they will also differ in the way that their bilingualism has interacted with the highly variable dimensions of their linguistic and cognitive development. Therefore, any averaging of relevant developmental indices across the conditions for becoming bilingual will be confounded with an array of hidden factors that crucially influence development. (Bialystok, 2001, p. 7)

Thus, at any given time, there will be considerable uncertainty as to the degree to which certain understudied or unknown factors are associated with the measures that are taken to gauge certain components of information processing. This uncertainty, however, can be allayed by determining which other factors are associated with the measures of interest in the investigation, and then by either balancing the two language groups on these factors or regressing out the variance due to them.
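As a rough illustration of what "regressing out" a covariate involves, the following Python sketch removes the linear effect of an SES score from RT before the language groups are compared. Everything in it is an assumption made for illustration: the function name `residualize`, the single-covariate ordinary least squares fit, and above all the toy numbers, which are invented to show the logic and do not come from any study discussed here. An analysis of real data would more typically use ANCOVA or multiple regression.

```python
from statistics import mean


def residualize(y, x):
    """Residuals of y after removing the linear effect of x (simple OLS fit)."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Invented toy data: RT depends only on SES, and the bilingual group happens
# to have higher SES scores than the monolingual group.
ses = [1, 2, 3, 4, 5, 6]
rt = [600, 580, 560, 540, 520, 500]
group = ["mono", "mono", "mono", "bi", "bi", "bi"]


def group_difference(values):
    """Monolingual mean minus bilingual mean; positive favors bilinguals."""
    mono = [v for v, g in zip(values, group) if g == "mono"]
    bi = [v for v, g in zip(values, group) if g == "bi"]
    return mean(mono) - mean(bi)


raw_difference = group_difference(rt)
adjusted_difference = group_difference(residualize(rt, ses))
print("raw bilingual advantage (ms):", raw_difference)
print("advantage after removing SES (ms):", adjusted_difference)
```

In this contrived case, an apparent 60-ms bilingual advantage disappears once SES is taken into account, which is the scenario that critics of inadequately controlled comparisons have in mind.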

The most widespread, and arguably unanswered, criticism of the literature on bilingualism and executive function is an apparent failure to control sufficiently for SES. It is relatively clear that SES covaries with executive ability, where higher SES tends to be associated with better performance on measures of cognitive functioning (Mezzacappa, 2004). As such, it has been suggested that SES, rather than bilingualism, may account for the bilingual advantage (Mindt, Arentoft, Germano, D’Aquila, Scheiner, Pizzirusso et al., 2008). These views are not unfounded, despite some objection to them (e.g., Bialystok, 2009b). Rarely is SES controlled for directly in this literature. Occasionally, the highest achieved level of formal education (e.g., Bialystok et al., 2004; Bialystok et al., 2008) or selecting from middle-class neighborhoods (e.g., Bialystok, 2009b; Emmorey et al., 2009) is taken as an index of homogeneity in SES, but these measures are all relatively indirect.

Furthermore, Morton and Harper (2007, 2009), recognizing that in this literature SES had been poorly controlled, opted to replicate previous findings on the Simon effect in bilingual children. When this factor was controlled for, monolinguals experienced a significant advantage on global RTs, and a nonsignificant (approximately 70-ms) advantage on the Simon effect (see Fig. 7). Despite this result, it is somewhat difficult to fully endorse the implications that Morton and Harper (2007) draw from these findings. The most important reason is that they tested children, and it is relatively clear now that the bilingual advantage on interference effects is appreciably more elusive in young children (see the Performance of bilingual and monolingual children on interference tasks section). On the other hand, the near-ubiquitous bilingual advantage for global performance on tasks with interference was significantly reversed (to a monolingual advantage) in Morton and Harper’s (2007) study, which is difficult to reconcile with the idea that early switching between languages in bilinguals makes for a more efficient executive processing system.

A different challenge to these results was offered by Bialystok (2009b). She noted that the 6.5-year-old children in Morton and Harper’s (2007) study were approximately one-and-a-half years older than the children tested by Bialystok, Martin, and Viswanathan (2005), who found a global bilingual advantage in their sample. This age difference, she argued, might have been sufficient to overcome the initial difficulties experienced by monolinguals. Rebutting this challenge, however, is the finding from Martin-Rhee and Bialystok (2008) of a bilingual advantage on global RTs with 8-year-old children. Bialystok’s (2009b) criticism of Morton and Harper (2007), then, would entail the proposal of a very narrow window of development, with the global bilingual advantage being present at 5 years of age, reversing at 6.5 years of age, and reappearing at 8 years of age. Most importantly, however, and to stress the validity of the point made by Morton and Harper (2007) and Mindt et al. (2008), current investigations must ensure that SES is controlled for to a greater extent than it typically has been in this literature.

One further concern in this literature is the extent to which other environmental factors, perhaps confounded with SES, affect global RT and interference effect differences between language groups. It is now known, for example, that high computer use (Bialystok, Craik, et al., 2005), video game play (Bialystok, 2006), and expertise in music (Bialystok & DePape, 2009) produce global RT advantages that are similar to those shown by bilinguals. At this point, scant evidence has indicated that these groups might also experience a reduced interference effect as compared to nonplayers (see also Bailey, West, & Anderson, 2010, who showed no evidence that video game use reduces the Stroop effect).

The onus is now on current investigative work to ensure that these factors are not influencing experimental outcomes. A step in this direction has been taken by Costa et al. (2009). Here, video game play was balanced between language groups. Moreover, and to expand on this trend, the field would profit appreciably from the use of a comprehensive survey that assessed a host of life experiences that might be associated with executive control. In this way, it would be possible to rule out other environmental factors that might covary with bilingualism, and therefore possibly confound experimental outcomes.

Although this issue remains largely unsettled, the foregoing empirical results and the conditions under which they have been obtained will be considered in the remaining sections as if bilingualism, and not a combination of possibly uncontrolled demographic factors, is responsible. As we have shown, and will summarize in the next sections, the bilingual advantage on inhibitory control is a somewhat sporadic phenomenon, in contrast to the more robust global RT advantage.

To whatever extent the reader is concerned with the possibility that these bilingual advantages are caused by inadequately controlled demographic factors (the main one being SES), rather than by bilingualism per se, these advantages require some explanation, and we believe that the mechanisms discussed later are plausible under either causal attribution. Regardless, the extent to which bilingualism is the complete, partial, or apparent cause of these data is an area that warrants further investigative work, and we urge future investigators of the BICA/BEPA hypotheses to be assiduous in their efforts to match monolinguals and bilinguals on plausibly pertinent demographic factors.

When does a bilingual advantage materialize on the interference effect?

First and foremost, all studies taken collectively, unique design characteristics notwithstanding, reveal that interference effect advantages for bilinguals are relatively elusive in young adults and children, yet can be surprisingly large in middle-aged and elderly adults, despite not being consistently observed in these groups. This pattern raises serious concerns about the applicability of the inhibitory control model (e.g., D. W. Green, 1998) to nonlinguistic domains of inhibitory control, and it obviously undermines the BICA hypothesis. Only under a restricted set of experimental conditions in which there are sometimes unusual frequencies of intertrial compatibility switches, allowing for less exposure to conflict trials (consider, e.g., Costa et al., 2009; Costa et al., 2008), do young-adult bilinguals exhibit a short-term advantage over monolinguals on interference effects. These findings help dispel any notion of an enduring bilingual advantage on the interference effect.

It may also be helpful to draw on Linck, Hoshino, and Kroll (2008), whose results, for the reasons outlined in note 10, could not be visualized in the empirical section of this review. These authors demonstrated that young adult bilinguals, in general, exhibited reduced Simon effects relative to a monolingual control group. At first glance, these findings would appear to support BICA. The most striking feature of their data, however, was that the most inexperienced bilingual group (classroom learners who had never practiced their second language abroad) among three others showed the smallest Simon effect (most importantly, 25.1 ms, relative to 43.7 ms in monolinguals) when controlling for working memory span (on which bilinguals outperformed monolinguals). This inexperienced group also outperformed bilinguals (L1 = English, L2 = Spanish) who had performed in intermediate-level university language courses but who had the “advantage” of practicing their second language in courses in an L2 environment (Spain) for 3 months (Simon effect = 43.2 ms relative to 43.7 ms in monolinguals). This result is wholly unanticipated by BICA. Furthermore, an examination of proficient Japanese and Spanish bilinguals in a subsequent experiment revealed no evidence that increased L2 proficiency (as measured by performance on a picture-naming task) had any effect on the Simon effect (see note 11).

An oft-recognized but less commonly examined phenomenon is that the bilingual advantage in conflict resolution in middle-aged and elderly participants vanishes as a function of the number of trials to which the participants have been exposed. Even on the rare occasion when a bilingual advantage has been seen in young adults, it too disappears with further practice (Costa et al., 2009; Costa et al., 2008). In the handful of studies that have examined interference effects in middle-aged and elderly bilinguals and monolinguals, variability in the number of experimental and practice trials provides an opportunity to visualize the possible consequences of varying this parameter. When there are few experimental trials in middle-aged and elderly populations, large RT advantages can be seen on the interference effect in bilinguals relative to monolinguals (see Fig. 8). These RT advantages disappear rapidly, however, with practice. Unlike in middle-aged participants, for whom the difference in interference effects between language groups diminishes to the point of nonsignificance, the advantage in elderly individuals might be slightly more resilient (but see notes 3 and 4).

Fig. 8

Magnitude of the bilingual advantage on the interference effect for middle-aged adults (mean age = 40–60 years) and old-aged adults as a function of number of experimental trials. Positive values indicate an advantage for bilinguals on the interference effect, but clearly this advantage wanes as a function of the number of experimental trials.

None of the studies that have examined the Simon effect in young adults have contained as few experimental trials as those studies that have investigated bilingual advantages in middle-aged and old-aged adults (Bialystok et al., 2008). Bialystok et al. (2008) explained the absence of an interference effect advantage in their study of young adults by suggesting that young adults are at the zenith of their cognitive abilities and that the present measures were not sensitive to this effect. A conceivable alternative, however, is that a bilingual advantage on inhibitory control is present in young adults, but that it disappears so quickly with practice that one is not observed in a typical study. In other words, it is possible that the rate of disappearance, with practice, of the bilingual advantage on interference effects may vary with age. Indeed, some theories of cognitive aging specifically assume that as we age it becomes more difficult to reconfigure processing (Hasher & Zacks, 1984). Recall that young children, despite being fluent in two languages, show no bilingual advantages on the interference effect even with few experimental trials. Theoretically, then, children might be able to reconfigure relevant inhibitory control centers more rapidly than older adults. This would result in an increase in the short-lived interference effects with age, an increase that could be delayed by a well-oiled executive system.

An outstanding but important research question, however, must be whether the interference effects between the two language groups reach statistical equivalence, much as the findings from Bialystok et al. (2004, Exp. 3) suggest they may, even after a large number of experimental trials. If statistical differences between groups were reliable after many trials, much needed credence could be given to the possibility of an enduring, general cognitive advantage on inhibitory control processes in old age owing to bilingualism (i.e., Bialystok et al., 2007). Otherwise, we are confronted with the possibility that bilinguals only approach the task differently and that, with a minimal amount of experience on the task, monolinguals acquire this approach.

The data reviewed above, particularly the absence of a ubiquitous bilingual advantage in children and young adults, point to a rejection of the original form of the BICA hypothesis. With regard to a weakened form of the hypothesis, in which the bilingual advantage only becomes apparent in middle and old age, the evidence is at best inconclusive: That the advantage decreases so rapidly with practice (see Fig. 8), usually to nonsignificance, opens the door to a strategy difference rather than a structural advantage in the neural networks responsible for inhibitory control.

When does the bilingual advantage materialize on global RTs?

Having established that the bilingual advantage on the interference effect is rare rather than ubiquitous, and even when observed disappears with practice, it now seems appropriate to focus on the robustly observed advantage on global RTs and to begin developing a theoretical framework that might explain this phenomenon. The bilingual advantage on global RTs appears to materialize on any nonlinguistic interference task in children, middle-aged adults, and the elderly. The effect on global RTs is robust in these age groups, but the effect becomes more pronounced when task difficulty is elevated (Bialystok et al., 2004; Martin-Rhee & Bialystok, 2008). In young adults, the global RT advantage is detected ubiquitously on spatial Stroop and flanker interference tasks [especially when the frequency of intertrial switches (Bialystok, 2006; Costa et al., 2009) is high], though seemingly not in Simon tasks (Bialystok, 2006). The latter finding must be prefaced by the caveat that, to date, only one Simon task study has been reported in the literature comparing monolingual and bilingual young adults. Whereas it has already been shown in the conflict resolution literature on the bilingual advantage that conflict within a trial is not required in order to obtain a bilingual advantage, given that bilinguals outperform monolinguals on overall RT (Bialystok et al., 2004), there are studies showing that the magnitude of the global RT advantage might be more readily detected when there is a higher frequency of intertrial compatibility switches (Costa et al., 2009). The import of this observation for theory development cannot be overstated, principally because of recent advances in the area of “conflict monitoring,” which have identified a potentially domain-general neurocognitive system, to which studies of this sort are most pertinent.

Why does the bilingual advantage materialize on global RT? In search of a theoretical framework

In recent years, conflict monitoring has been a hot topic of research. At its roots, it has examined the extent to which intertrial compatibility switches affect performance. Importantly, a considerable amount of research has demonstrated that a complex network subsuming several higher-order cognitive domains might be driving these so-called “sequencing effects.” If an advantage for bilinguals were found in sequencing effects, the implication could be that the bilingual advantage, rather than being restricted to general inhibitory control processes (an idea that is contradicted by the data reviewed above), extends more generally to many cognitive domains. The complex network that explains sequencing effects has been referred to as the “conflict-monitoring system” (Botvinick, Braver, Barch, Carter, & Cohen, 2001; Botvinick, Nystrom, Fissell, Carter, & Cohen, 1999). Because the conflict-monitoring proposal is a promising theoretical construct to account for global RT differences (Bialystok, 2006; Costa et al., 2009; Costa et al., 2008), it and its prospective relation to bilingualism will be addressed next.

Classic conflict monitoring

The classic conflict-monitoring theory, proposed by Botvinick and colleagues, suggests that a particular area in the frontal lobe, the anterior cingulate cortex (ACC), detects conflict, allowing for online shifts of attentional control that are regulated by the dorsolateral prefrontal cortex, which causes trial-by-trial modulations of cognitive control over the suppression of task-irrelevant input. More specifically, when task-relevant and task-irrelevant input automatically elicit competing responses, the conflict-monitoring system detects this discrepancy, and the level of cognitive control is consequently elevated to reduce the influence of the task-irrelevant dimension on response selection. The neuroscientific understanding of the conflict-monitoring system affords an opportunity to extend cognitive theoretical constructs for behavioral phenomena to specific brain regions or centers.

The conflict-monitoring proposal has evolved as a result of earlier findings showing first-order sequencing effects in traditional interference tasks (Gratton, Coles, & Donchin, 1992). Thus, at the crux of the conflict-monitoring account is the empirically validated proposal that congruent and incongruent trial response times are affected differentially depending on whether the preceding trial is congruent or incongruent (e.g., Chen, Li, He, & Chen, 2009; Gratton et al., 1992; Stadler & Hogan, 1996; Stürmer, Leuthold, Soetens, Schröter, & Sommer, 2002). The conflict-monitoring system operates as follows: on incongruent trials, two competing responses are activated—one for the task-irrelevant input and one for the task-relevant input. In this instance, the conflict-monitoring system detects the discrepant activated responses and, consequently, increases the level of cognitive control in order to ensure that the task-appropriate response is selected. Following an incongruent trial, the increased level of cognitive control needed to suppress extraneous information remains activated, resulting in significantly reduced interference effects in the subsequent trial as cognitive control is extended to suppress the task-irrelevant attribute, irrespective of whether the task-irrelevant attribute is congruent or incongruent with the task-relevant attribute (e.g., Stürmer et al., 2002; Wühr & Ansorge, 2005). Conversely, following a congruent trial, increased cognitive control is not recruited by the conflict-monitoring system, and thereafter the level of cognitive control in place is low, allowing for the task-irrelevant attributes to exert a greater influence over response selection (but see Wühr & Ansorge, 2005, for an indication that the ACC [or an “ancillary monitoring mechanism”] may also play a role in the Simon task on congruent trials, and Bialystok, Craik, et al., 2005, for MEG evidence partially consistent with this). 
Consequently, when the current trial is incongruent and the preceding trial was congruent, the magnitude of the interference effect is magnified as compared to when the preceding trial was incongruent (additional studies illustrating this robust empirical phenomenon can be found in Akçay & Hazeltine, 2008; Funes, Lupiáñez, & Humphreys, 2010; Iani, Rubichi, Gherri, & Nicoletti, 2009; Kerns, Cohen, MacDonald, Cho, Stenger, & Carter, 2004; Ullsperger, Bylsma, & Botvinick, 2005). The pattern of sequence effects resulting in cognitive up-regulation is typically referred to as “conflict adaptation.”
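The first-order sequence analysis described above can be made concrete with a small computational sketch. What follows is only an illustration under stated assumptions, not the analysis procedure of any study cited here: the function name `sequential_interference`, the trial coding ("C" for congruent, "I" for incongruent), and the response times are all hypothetical, and the sketch ignores steps that real analyses typically include (e.g., excluding error trials or controlling for stimulus and response repetitions). It computes the interference effect (incongruent RT minus congruent RT) separately for trials that follow a congruent trial and trials that follow an incongruent trial; a smaller effect after incongruent trials is the pattern referred to as conflict adaptation.

```python
from statistics import mean


def sequential_interference(trials):
    """Interference effect (incongruent minus congruent mean RT, in ms),
    computed separately by the congruency of the preceding trial.

    `trials` is an ordered list of (congruency, rt_ms) pairs, where
    congruency is "C" or "I" (hypothetical coding). The first trial is
    not scored because it has no predecessor.
    """
    # One list of RTs per (previous congruency, current congruency) cell.
    cells = {("C", "C"): [], ("C", "I"): [], ("I", "C"): [], ("I", "I"): []}
    for (prev, _), (curr, rt) in zip(trials, trials[1:]):
        cells[(prev, curr)].append(rt)

    after_congruent = mean(cells[("C", "I")]) - mean(cells[("C", "C")])
    after_incongruent = mean(cells[("I", "I")]) - mean(cells[("I", "C")])
    return after_congruent, after_incongruent


# Made-up response times, for illustration only.
trials = [
    ("C", 400), ("C", 390), ("I", 470), ("I", 440),
    ("C", 420), ("I", 480), ("C", 410), ("C", 400),
]

after_c, after_i = sequential_interference(trials)
# Conflict adaptation: how much the interference effect shrinks
# when the preceding trial was incongruent.
conflict_adaptation = after_c - after_i
print(after_c, after_i, conflict_adaptation)
```

With these invented values, the interference effect is 80 ms after congruent trials and 25 ms after incongruent trials, a 55-ms reduction. Comparing this reduction between language groups is one way in which a group difference in sequencing effects could, in principle, be assessed.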

The conflict-monitoring proposal has similarly been called upon to explain how bilinguals attenuate the influence of one of two conflicting lemmas (e.g., Hernandez, Dapretto, Mazziotta, & Bookheimer, 2001) and the superior ability of bilinguals to switch between nonlinguistic tasks or “mental sets” (e.g., monolinguals show greater switch costs when switching between shape naming and color naming in a block of trials; Prior & MacWhinney, 2010). Costa et al. (2009) also drew a link to conflict monitoring, and Bialystok (2006) wrestled with a conceptually similar but somewhat distal idea of online monitoring. If a far-reaching, highly integrated system, like the conflict-monitoring system, were highly developed in bilinguals owing to a perpetual need to manage multiple languages, the theoretical implication would be that bilinguals would excel at most (primarily nonlinguistic) tasks that impose elevated demands on cognitive systems. This system would ultimately have the ability to account for global RT advantages in particular if, similar to what was proposed by the inhibitory control model for language, the conflict-monitoring account applied generally to instances in which online shifts of attentional control were required.

A link between conflict monitoring and bilingualism

The reasoning behind this theoretical assertion, if not already clear, is intuitive and plausible. We state it as follows:

Assuming that the conflict-monitoring system is adapted to detect any instance in which a conflict materialized, one could reasonably follow the same logical road map as D. W. Green (1998) to explain why bilinguals might possess a more advanced monitoring system. Thus, when two conflicting lemmas are activated simultaneously, the conflict-monitoring system will recognize the presence of two simultaneously active competing responses, adjust the level of cognitive control to aid in the resolution of competing representations, and signal relevant pathways to allow for task-appropriate response selection.

As we have shown earlier, there is little, or only sporadic, evidence to suggest a bilingual inhibitory control advantage on nonlinguistic interference tasks. However, if the advantage were owing to a general conflict-monitoring system in which one objective was to modulate processing in order to ensure an elevated level of cognitive control, such that response selection was universally improved in tasks for which a higher level of cognitive control was required, a global RT advantage would be expected. Similarly, if the conflict-monitoring system were involved in simultaneous language management, the frequent requirement for cognitive control in bilinguals would likely lead to an improvement in this area and in any other brain structure that contributed to cognitive control.

The discussion of conflict monitoring has so far centered on describing its role in modulating cognitive control on a trial-by-trial basis, where it is believed that a particular module (with an extensively studied neuroanatomical basis) is distinctly activated by within-trial conflict, resulting in the up-regulation of conflict control in subsequent trials. Some hypotheses have nevertheless extended beyond or outside the classical domain of conflict adaptation in the bilingual literature (e.g., Bialystok, 2006; Prior & MacWhinney, 2010), while others within the conflict-monitoring literature have proposed unique functional roles for this system (e.g., the idea that the ACC, in particular, responds to the perceived likelihood of error by learning about its likelihood; Brown & Braver, 2005; Rushworth, Buckley, Behrens, Walton, & Bannerman, 2007; but see Yeung & Nieuwenhuis, 2009). One such hypothesis was offered by Botvinick, Cohen, and Carter (2004), who suggested that any instance of conflict within a task might simply be an index of the degree to which the task is cognitively demanding:

A possible extension of this proposal is suggested by the claim that the ACC encodes information about effort. With this in mind, it is interesting to consider the hypothesis that conflict might serve as an index of the demand for mental effort. Consistent with this, it has been noted that the ACC becomes active in just those task settings that are experienced as cognitively difficult. (p. 545)

Presumably then, the detection of a demanding task would result in the up-regulation of cognitive control to ensure optimal performance. The somewhat controversial hypothesis that can be derived from this line of thinking is that the constant strain of language management on the conflict-monitoring system might strengthen the extent to which bilinguals can focus processing on task-relevant stimuli (via cognitive control). When the task is difficult (competing responses comprise one instance of this, but certainly not the only instance), bilinguals may then be able to exercise superior cognitive control over responding to the relevant attributes of the task. Converging evidence in favor of this proposal would be provided if a global advantage were detected on a variety of nonlinguistic and noninterference tasks in which the principal manipulations were to increase cognitive demands in the absence of explicit (flanker) or implicit (Simon) response conflict. Just such a finding was reported by Bialystok et al. (2004) when they found a global advantage with centrally presented stimuli (entailing no Simon-generated response conflict) when they increased the cognitive load by increasing the number of stimulus–response mappings.

Task switching, language switching, and neurocognitive mechanisms

Abutalebi and Green (2008) have linked the vast literature on language-switching tasks in bilinguals to the literature on (putatively nonlinguistic) task switching. Yet, until recently (e.g., Garbin et al., 2010; Prior & MacWhinney, 2010), nonlinguistic task switching had not been explored in monolingual and bilingual language groups. There is compelling evidence to suggest that the ACC or components thereof (e.g., Wang, Xue, Chen, Xue, & Donga, 2007; see Abutalebi & Green, 2007, 2008, for reviews) are involved in language-switching tasks, and it has been assumed that this particular structure might also be involved in task switching (e.g., Abutalebi & Green, 2007). It must, however, be noted that this notion of the ACC being involved during task switching is a rather dramatic departure from the original boundaries imposed on the conflict-monitoring system. As described in the section above on classic conflict monitoring, it was originally thought that within-trial conflict (incongruent and not congruent trials) activated the ACC, causing cognitive up-regulation by way of the dorsolateral prefrontal cortex (i.e., increased cognitive control; see Botvinick et al., 2001; Botvinick et al., 1999; see Egner, 2007, 2008, for reviews). However, recall that a legion of researchers have attempted to extend the role of the ACC beyond intratrial conflict (Botvinick et al., 2004; Brown & Braver, 2005; Rushworth et al., 2007; see Woodward, Metzak, Meier, & Holroyd, 2008, and Hyafil, Summerfield, & Koechlin, 2009, for evidence of a distinct role for the ACC in task switching). One possibility, then, is that during task or language switching, there might be some amount of proactive interference (Philipp, Kalinich, Koch, & Schubotz, 2008) that would, because of the conflict between the current and previous representations, recruit the ACC. A simplification of this idea might be that the ACC becomes active at any time when conflict resolution is required.

Recent neuroimaging data by Garbin et al. (2010) on task switching in monolinguals and bilinguals reveal an interesting result with respect to bilinguals on task switching and conflict monitoring. They presented bivalent stimuli [i.e., colored (red or blue) shapes (squares or circles)], along with a word cue signaling participants to make a discrimination response on the basis of the color or shape of the stimulus. The experiment comprised an even number of nonswitch (color–color or shape–shape) and switch trials (color–shape or shape–color). The behavioral data indicated that bilinguals showed no switch costs (switch trials relative to nonswitch trials), whereas monolinguals showed a significant switch cost. (Note that in their purely behavioral study, Prior & MacWhinney, 2010, reported a similar pattern: Switch costs were larger for monolinguals than for bilinguals; the departure from Garbin et al.’s pattern was that bilinguals did show a significant switch cost.)

The neuroimaging data revealed a somewhat unusual dissociation between monolinguals and bilinguals. In monolingual speakers, the ACC, the right inferior frontal gyri (IFG), and the left posterior parietal lobe were reported as showing increased levels of activation on switch relative to nonswitch trials. That the ACC was involved during this task for monolinguals provides some evidence that there might be some amount of conflict that is detected by the conflict-monitoring system, or at the very least, that elements of the conflict-monitoring system are involved in task switching (Hyafil et al., 2009). In bilinguals, however, this switch-modulated activation was confined to the left IFG [which has been related to language control (Abutalebi & Green, 2007)] and left putamen, and was not observed in the ACC. On the one hand, this suggests that the ACC, the mainstay of the conflict-monitoring system, plays little or no role in mediating cognitive set in bilinguals in task switching. Conversely, multiple language use seems to result in the selective activation of the left putamen and left IFG in bilinguals, which somehow attenuates (Prior & MacWhinney, 2010) or eliminates (Garbin et al., 2010) task switch costs. A general cognitive implication of this neuroimaging finding might be that these differences are mediated by differences in strategy. In other words, when dealing with the requirement to switch rules or minimize interference from irrelevant information, bilinguals recruit different modules than do monolinguals. If true, this general possibility deviates dramatically from the typical assumption in this literature that both groups are using the same modules, but that a module in bilinguals has been made more efficient by the large amounts of linguistically mediated exercise.

With respect to neuroscientific implications, although the study of neurocognitive mechanisms in monolinguals and bilinguals in task-switching paradigms is a relatively novel enterprise in language research, these incipient data point to nontransfer of ACC-related processes from language switching to task switching. They point, instead, to the possibility that multilanguage use configures the IFG to respond to more general task demands, while the ACC appears to have a more restricted use in bilinguals.

Given that the focus of this review is bilingual performance on nonlinguistic interference tasks, specifically, it is important to describe how these neuroscientific findings relate to this topic. Firstly, and behaviorally, bilinguals do not outperform monolinguals on nonswitch trials in a task-switching paradigm, whereas they do outperform monolinguals on all trial types in nonlinguistic interference tasks. This suggests that task repetition in task switching and intertrial compatibility repetitions in interference tasks engage different processes. Secondly, neurocognitive parallels have not been established between the demand to switch tasks and the demand (in nonlinguistic interference tasks) to ignore or suppress irrelevant inputs. Furthermore, and thirdly, how well task switching can be likened neuroscientifically to intertrial compatibility switches is unknown. To this end, more specifically, it is not clear whether the reason behind the bilingual global advantage on interference tasks is due to reduced switch costs from incongruent to congruent trials or from congruent to congruent trials in which the response on trial n (e.g., > > > > >) is opposite that on the preceding (n – 1) trial (e.g., < < < < <); nor is it clear, neuroscientifically and behaviorally, whether there is a difference between bilinguals and monolinguals when trial n – 1 repeats on trial n (see the First-order sequencing effects section). Thus, there is an open question as to how closely intertrial switching of congruence or response in nonlinguistic interference tasks relates to literal task switching—a literal switching of tasks during the experiment—in bilinguals. 
Although it appears that bilinguals perform differently than monolinguals on task switching and that this difference has a neurocognitive correlate, it is not clear whether the IFG also plays a substantive role for largely equivalent bilingual advantages on congruent and incongruent trials in interference tasks after sufficient practice.

Nonlinguistic interference tasks, conflict monitoring, and neurocognitive mechanisms

There are only limited data on nonlinguistic interference tasks within this burgeoning area, but surely persistent research will be instrumental to developing a comprehensive, cogent theoretical framework. We are aware of only two brain-imaging investigations that have directly explored differences in neurocognitive architecture between monolinguals and bilinguals on the Simon (Bialystok, Craik, et al., 2005) and flanker (Luk et al., 2010) tasks.

Bialystok, Craik, et al. (2005; the behavioral data of which were covered in the Performance of monolingual and bilingual young adults on interference tasks section) administered a Simon task to French bilinguals, Cantonese bilinguals, and English-speaking monolinguals and used magnetoencephalography (MEG) imaging to tease apart any differences in the task-related modulation of brain activity between language groups. Although all language groups recruited similar brain regions for the task on congruent and incongruent trials, faster responses in the bilingual groups were related to increased involvement of the ACC, superior frontal, and inferior frontal regions situated predominantly in the left hemisphere, whereas faster responding in monolinguals was associated with increased activation of the middle frontal area of the left hemisphere. Comparison of the performance data with the neuroimaging findings from the different language groups in this study reveals an interesting and important dissociation. In this study, French bilinguals and monolinguals did not differ on overall RTs (they appeared to perform congruent and incongruent trials with equivalent proficiency). In contrast, the Cantonese bilinguals outperformed French bilinguals and monolinguals on both trial types. Yet, in both bilingual groups the same, above-mentioned bilingual-centric brain regions were associated with faster responding.

These results are telling for several reasons. First, because both the Cantonese and French bilinguals engaged similar brain regions when performing the task successfully, yet only the Cantonese bilinguals outperformed the monolinguals, the results underscore the idea that something other than exposure to two languages may be responsible for the behavioral advantage (see the section Hidden factors: the controversy surrounding the implementation of appropriate demographic controls). Second, while there were clear neurocognitive similarities in the regions on which the French and Cantonese bilinguals relied for faster RTs relative to monolinguals, the involvement of these brain regions, per se, was not necessarily responsible for improved performance. Reinforcing this idea, Bialystok, Craik, et al. (2005) noted that the French and Cantonese bilinguals demonstrated different brain–behavior correlations, which is at least somewhat suggestive that how these regions are used, and not necessarily the regions themselves, conduces to general behavioral advantages.

Luk et al. (2010) collected fMRI data from mono- and bilingual participants while they performed a flanker task comprising five randomly intermixed trial types. On congruent and incongruent trials, a singleton target chevron would appear to the left or right of center of four horizontally flanking chevrons either matching (congruent) or mismatching (incongruent) the direction of the target chevron. On neutral trials, a red target chevron was centered and flanked on each side by two diamonds. On no-go trials (which required withholding of a response to the target chevron), the target chevron appeared to the left or right of center of four horizontally aligned Xs. On baseline trials, a single target chevron appeared (see note 12). Analyses of the behavioral data revealed no significant RT differences between language groups (there was an ~20-ms numerical advantage for bilinguals on congruent and incongruent trials, but this was not significant). Analyses comparing the brain–behavior relationship for congruent and incongruent trials against neutral trials revealed a striking pattern of results. Superior performance on congruent trials involved similar brain regions in both language groups. Increased activation levels in the bilateral middle occipital gyrus, left fusiform gyrus, left lingual gyrus, bilateral cerebellum, and right caudate and IFG was associated with superior performance on congruent trials. Divergence between language groups, however, was observed on incongruent trials. In bilinguals, superior incongruent performance was associated with increased activation in bilateral cerebellum, bilateral superior temporal gyri, left supramarginal gyri, bilateral postcentral gyri, and bilateral precuneus, whereas in monolinguals superior incongruent performance was associated with the same network that was identified with superior performance on congruent trials. 
Finally, activation of the left ACC, bilateral IFG, and right caudate nucleus was also associated with superior performance on incongruent trials, but analyses did not indicate that the involvement of these areas was unique to bilinguals. Again, however, it must be noted that bilinguals appeared to have contrasting activation patterns relative to monolinguals.

The most impressive aspect of the results from Luk et al. (2010) for the present purposes is that, relative to monolinguals, bilinguals appeared to activate different regions to respond to incongruent trials, whereas both language groups appeared to engage similar brain regions during congruent trials. This pattern seems to have reified already strong convictions in this literature that the bilingual advantage (in general) is driven by well-tuned inhibitory control processes (BICA). This variety of interpretation is commonplace today, but the locus of the advantage is not as often attributed directly to inhibitory processes. For example, Luk et al. (2010) explained that “these results support the proposition that bilingualism influences cognitive control of inhibition at the attention level, but not motor control of prepotent responses” (p. 356) and that “differential engagement of this more extensive set of regions during incongruent trials in the two groups suggests that bilinguals can recruit this control network for interference suppression more effectively than monolinguals, consistent with their tendency to show less interference in terms of RT” (p. 356; see note 13). This conclusion is, in our opinion, impetuous and narrowly focused to the extent that it emphasizes inhibitory control. The issue with this interpretation is the same one that has beset the previously delineated interpretation of behavioral data on nonlinguistic interference tasks: The BICA model accounts well for a bilingual advantage on incongruent trials; it is challenged, however, by a literature showing little to no bilingual advantage on the interference effect (i.e., bilingual advantages that are largely similar on congruent and incongruent trials, as evidenced by the earlier empirical review of this literature). 
Thus, while it appears that functionally distinct brain regions are involved on incongruent and congruent trials for bilinguals, in contrast to monolinguals, whether these pathways are necessary, in and of themselves, for the bilingual advantage on incongruent trials, specifically, is less obvious. The necessity of this region, and especially the necessity of this region somehow being linked to superior inhibitory control processes (i.e., BICA), is severely undermined by our review of the behavioral data that, on balance, show a largely symmetrical bilingual advantage for congruent and incongruent trials (BEPA).Footnote 14 We do not deny a role for inhibitory control processes in the brain, nor do we deny that they play an important part in language management; however, there is simply little to no direct evidence (neuroimagining, behavioral, or otherwise) that they play any special role in nonlinguistic interference tasks.

A less intrepid and far more parsimonious interpretation centers on the idea that there is a well-developed mechanism in bilinguals or, more likely, a network of mechanisms in the bilingual brain that mediates between congruent and incongruent trials (much as it might manage language selection), in a way that is different from the way in which the monolingual brain operates. This type of theoretical perspective has been developing in the literature, although somewhat vaguely (Bialystok, 2009a; Bialystok & Craik, 2010), but the bare bones of it are evinced nicely by Luk et al. (2010): “Unlike the bilinguals, monolinguals did not respond to facilitation and suppression of interference using different brain networks, leading to fewer neural resources being recruited when performing the flanker task” (p. 356). Indeed, Bialystok and colleagues have long been aware of the need to explain superior bilingual performance on congruent trials (at least as early as Bialystok, 2006). This is an interpretation that is much more consistent with BEPA, and it is akin to the one that we favor. Although there may be several ways in which this could be achieved neuroscientifically, we will take the liberty of hashing out this strong hypothesis with less of a focus on inhibitory pathways or inhibition-based processing sites.

Imagine that the system for performing nonlinguistic interference tasks in bilinguals interprets and selects pathways for inputs based on whether they contain conflict. In deference to the results from Luk et al. (2010) and the conflict-monitoring literature, perhaps something like the ACC (but there might also be a role for the IFG and other regions that have been identified in Luk et al.’s important investigation) causes inputs to be rerouted depending on the presence or absence of conflict; the detection of conflict activates a (domain-general) dedicated conflict resolution center. Increased activation in this center triggers a routing of the input to a domain-general pathway, well-adapted because of bilingualism, for conflict resolution. The absence of conflict precipitates activity in a brain region that has been configured specifically to deal with nonconflicting inputs. The division of labor between functionally distinct processing streams and the consequent freeing up of processing resources, rather than superior inhibitory control or the efficiency with which an inhibitory pathway can be recruited relative to noninhibitory pathways in bilinguals [Footnote 15], would then be responsible for the ubiquitous global RT advantage. In monolinguals, congruent and incongruent trials appear to be resolved in similar neurocognitive systems. Ancillary pathways are involved in both language groups for no-go trials, but not for monolinguals on incongruent trials. The suggestion is that these ancillary pathways for monolinguals lack more domain-general processing. 
This occurs because these pathways have not been adapted to such cognitive demands as dual language use [Footnote 16]. The advantage for bilinguals in the nonlinguistic interference task might have materialized because the bilingual brain possesses a system that can distribute inputs to separate processing centers, depending on the presence of within-trial conflict, and that has adapted a network of pathways to respond to more general instances of conflict because of experience with multiple languages.

If this were so, flanker interference tasks requiring spatial processing of target and irrelevant distractor stimuli (as when irrelevant distractor stimuli must be processed to complete the task successfully) would show greater congruency (congruent trial RT – neutral trial RT) and smaller incongruency (incongruent trial RT – neutral trial RT) effects for bilinguals as compared to monolinguals, despite an absence of language group differences on the interference effect. Precisely this was shown by Hernández, Costa, Fuentes, Vivas, and Sebastián-Gallés (2010) using the number Stroop task [Footnote 17] (Luk et al., 2010, also showed some behavioral evidence of this phenomenon). These bilingual advantages on facilitation and interference relative to a neutral condition, however, would not be observed if spatial processing were restricted in advance (Laberge, 1983), as when targets and distractors occupy fixed regions in a display (Costa et al., 2008). Nevertheless, bilinguals may outperform monolinguals on congruent and incongruent trials because of a language-mediated division of labor between congruent and incongruent information-processing streams (but see Bialystok, Craik, et al., 2005, pp. 46–47, and note 12). Alternatively, if this finding proves to be restricted to instances in which target location is variable, overall RT advantages may still be accounted for by the speed at which (in)congruent-relevant brain regions can be selected by the monitoring system (Luk et al., 2010), irrespective of inhibition.
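
The decomposition described in this paragraph can be made concrete with a short computation. The Python sketch below is illustrative only: the function name, the trial format, and every RT value are hypothetical and are not taken from any of the studies cited. It computes the congruency effect (congruent RT minus neutral RT), the incongruency effect (incongruent RT minus neutral RT), and the interference effect (incongruent RT minus congruent RT), and it shows how two groups could differ on the first two measures while producing identical interference effects.

```python
from statistics import mean


def congruency_effects(trials):
    """Summarize (condition, rt_ms) pairs into three difference scores.

    `condition` must be 'congruent', 'neutral', or 'incongruent'.
    All names and the data format are assumptions made for illustration.
    """
    by_condition = {"congruent": [], "neutral": [], "incongruent": []}
    for condition, rt in trials:
        by_condition[condition].append(rt)
    m = {cond: mean(rts) for cond, rts in by_condition.items()}
    return {
        "congruency": m["congruent"] - m["neutral"],      # facilitation
        "incongruency": m["incongruent"] - m["neutral"],  # cost
        "interference": m["incongruent"] - m["congruent"],
    }


# Hypothetical RTs (ms), chosen only to illustrate the predicted pattern.
monolingual = [("congruent", 470), ("congruent", 490),
               ("neutral", 490), ("neutral", 510),
               ("incongruent", 550), ("incongruent", 570)]
bilingual = [("congruent", 430), ("congruent", 450),
             ("neutral", 470), ("neutral", 490),
             ("incongruent", 510), ("incongruent", 530)]

mono_effects = congruency_effects(monolingual)
bil_effects = congruency_effects(bilingual)
print(mono_effects)
print(bil_effects)
```

Because the interference effect is simply the incongruency effect minus the congruency effect, a larger facilitation paired with a smaller cost can leave the interference effect unchanged, which is why a neutral baseline is needed to observe the predicted group difference.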

Remaining to be explained are the bilingual advantages in conflict resolution that sometimes appear on the earliest testing trials. We suggest that this may be due to an asymmetrical activation, in bilinguals, of congruent and incongruent pathways before testing begins. That is, if the uniquely active pathways for bilinguals on incongruent trials were due to multilanguage management, these pathways would likely be active in the experimental setting under which the task was explained linguistically and under which lemmas from both languages would be competing for selection. Several theoretical views have agreed that some mechanism must select the desired language for lexicalization (Costa, Miozzo, & Caramazza, 1999; D. W. Green, 1998; La Heij, 2005) and that exercising this mechanism may play some critical role in the fleeting bilingual advantage on the interference effect:

On this view, the origin of the bilingual advantage would not be so related to the specific processes engaged in resolving lexical competition between languages (inhibitory control or language specific lexically built-in mechanisms) but rather to the previous step of setting the language in which communication will proceed. (Costa et al., 2009, p. 144)

Thus, early on in a nonlinguistic interference task, unique bilingually related brain regions jointly involved in language selection and the processing of distracting inputs might be primed by the language setting under which the task is administered.

Empirical testing of conflict monitoring and theoretical implications

First-order sequencing effects

Although we favor the abovementioned theoretical construct of the bilingual advantage, its link to the conflict-monitoring system—or any system for that matter—has yet to be shown unequivocally. As such, additional research between language groups on sequencing effects seems warranted, given the potential that either components of or the entire conflict-monitoring system might have in domain-general responding (cf. Costa et al., 2009; Costa et al., 2008). Consider the conflict-monitoring system specifically. It is not entirely clear whether bilinguals show superior efficiency (as measured by an enhanced rate of information processing, by signaling relevant processing sites, by routing inputs to functionally specific pathways, or by some combination) in the conflict-monitoring system. This is due, in large part, to conceptualizations of monitoring that seem to extend beyond intertrial compatibility switching into the domain of intertrial response switching (e.g., Bialystok, 2006). At other times, the first-order sequencing effects lack an appropriate baseline, due to a failure to remove sequences in which trial n is a complete repetition of trial n – 1 (e.g., Costa et al., 2009; Costa et al., 2008) and because two-alternative forced choice tasks are typically considered insufficient for getting at the core of conflict monitoring (see the next section). If, however, an experimental design obtaining a purer measure of sequencing effects in the context of the conflict-monitoring theory were to be implemented, and if it were to show this advantage on sequencing effects, there would be some preliminary evidence that might begin to account for a body of work demonstrating a bilingual advantage on a wealth of cognitive assessment tools (i.e., BEPA; see Adesope, Lavin, Thompson, & Ungerleider, 2010; Carlson & Meltzoff, 2008).

Toward a sounder measure of conflict-monitoring differences between monolinguals and bilinguals

Behaviorally, too, the relationship between bilingualism and conflict monitoring in nonlinguistic interference tasks is poorly understood. This is primarily due to the development of experimental designs that have been insensitive to the principles of conflict monitoring and to other theoretical constructs, with lesser known neurocognitive correlates, that compete with it. One theory that has opposed conflict monitoring as a candidate explanation for sequential modulations of the Simon effect is the event-file theory (Hommel, 1998). A treatment of how the event-file theory (or feature integration theory) relates to sequential modulation can be found in Hommel, Proctor, and Vu (2004; or in Hommel, 2004, for a more general treatment of this theory). For now, a relatively coarse description will suffice. According to the event-file theory, only a limited number of event files (or transient memory traces) can be held simultaneously, and partial overlap between event files results in a time-consuming update to the previously constructed event file, because one component of a multicomponent event file has been activated. Consider, for instance, if a green stimulus in trial n automatically activated a response to a location right of fixation, and a red stimulus in trial n – 1 automatically activated a response to a location right of fixation. In this situation, the unconditional coding of task-irrelevant location information in the previous trial would be activated again on trial n. But, according to event-file theory, a mismatch on this dimension would necessarily result in a modification to the event file from n – 1. Thus, the feature code from the previous event file would need to be “unbound,” because it is a necessary component of the new event file. 
Alternatively, if there is no partial overlap (thus, complete repetition, or complete alternation as in the case when a corresponding trial is followed by a corresponding trial), processes completely unrelated to conflict monitoring and the feature integration account might affect RTs. It is thought that comparing the sequence congruent to congruent with incongruent to congruent, for example, might artificially inflate the switch cost because the sequence congruent to congruent consists exclusively of complete repetitions and alternations, whereas the sequence incongruent to congruent comprises partial matches (Hommel, 2004). Mechanisms related to priming (Christie & Klein, 2001) could just as easily account for switch cost differences when complete alternations and matches on first-order sequencing effects are compared to first-order sequences in which one dimension from trial n – 1 matches in trial n whereas the other mismatches.

Having described the feature integrationist account to some extent, it might now appear obvious that both conflict-monitoring and feature integration accounts make similar predictions on most (if not all) two-alternative forced choice tasks, and problematically, it is clear that other processes are likely involved in these tasks that are liable to produce sequential modulations on interference tasks. One such process might play a role in facilitating responses on a trial n on which there is a complete repetition or alternation of the S–R code (e.g., Wühr & Ansorge, 2005). A second concern is that it is virtually impossible to dissociate feature integration from conflict monitoring in two-alternative forced choice tasks. The reason for this is relatively straightforward. When there are only two response values per stimulus dimension, partial alternations and partial repetitions are perfectly confounded with transitions from congruent to incongruent trials, or vice versa (Egner, 2007; Funes et al., 2010). Thus, the sequential modulation can occur either because of the difficulty associated with “unbinding” an event file or because of an evaluatory mechanism regulating cognitive control on the basis of cognitive demand from one trial to the next.
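
The confound can be checked by brute force. The sketch below is a hypothetical illustration rather than a model of any particular experiment: it assumes that each trial can be coded as a pair (irrelevant location code, response code) and that a trial is congruent when the two codes match. It enumerates every possible pair of successive trials and cross-tabulates feature overlap against whether congruency switches or stays the same.

```python
from itertools import product

OVERLAP = {0: "complete alternation",
           1: "partial repetition",
           2: "complete repetition"}


def confound_table(n_values):
    """Count all trial-to-trial transitions by feature overlap and by
    congruency transition, for a task with `n_values` locations and
    `n_values` responses (a hypothetical coding scheme)."""
    trials = list(product(range(n_values), repeat=2))  # (location, response)
    table = {}
    for (loc1, resp1), (loc2, resp2) in product(trials, repeat=2):
        # Number of features repeated from trial n - 1 to trial n.
        repeated = int(loc1 == loc2) + int(resp1 == resp2)
        # Congruency switches when exactly one of the two trials is congruent.
        switched = (loc1 == resp1) != (loc2 == resp2)
        key = (OVERLAP[repeated], "switch" if switched else "stay")
        table[key] = table.get(key, 0) + 1
    return table


two_choice = confound_table(2)
four_choice = confound_table(4)
print(two_choice)
print(four_choice)
```

With two values per dimension, every partial repetition is a congruency switch and every complete repetition or complete alternation is a congruency repetition, so the two accounts cannot be separated. With four values, congruency switches also occur among complete alternations, which is what makes a feature-free comparison possible.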

The way to circumvent the issue of co-occurring mechanisms for either response priming or feature integration, both of which may modulate sequential effects, is to increase the number of stimulus–response relationships, which would unconfound feature integration and conflict monitoring. This would allow for first-order sequence analysis on a purely abstract level (congruence) if the rarer sequences on which there was a partial repetition (i.e., if a response repeats but the stimulus position changes) or a complete repetition (i.e., a repetition of both the stimulus location and the response) were excluded. The remaining trials, therefore, include those on which there is a difference only at the level of the processing relationship between task-irrelevant and task-relevant information (i.e., congruent to incongruent, incongruent to incongruent, etc.). A vast library of experimental designs illustrates precisely the types of steps that can be taken to eliminate the influence of co-occurring phenomena, and there is little reason why similar approaches could not be adopted to examine differences between language groups (Akçay & Hazeltine, 2007, 2008; Funes et al., 2010; Mayr, Awh, & Laurey, 2003; Stürmer et al., 2002; Ullsperger et al., 2005; Wühr & Ansorge, 2005; see Egner, 2007, 2008, for information on how to design these types of tasks). Thus, to the extent that conflict adaptation can occur in the absence of any dimensional overlap or repetition effects, this type of methodological and analytical approach would provide one of the purest measures. If an advantage were observed for bilinguals on this purer measure of conflict adaptation (the effect that the conflict-monitoring system is ostensibly reacting to), the implication would be, at the very least, that this system behaves differently in bilinguals. 
Reduced first-order sequencing effects in bilinguals would most likely be attributable to a more efficient conflict-monitoring system and not necessarily functionally distinct processing streams for incongruent and congruent trials. Of course, one of the drawbacks of introducing this type of approach is that an increase in the size of the response–stimulus set might correspond to an increase in cognitive load (which might or might not involve the conflict-monitoring system to some degree), which could theoretically be handled better by bilinguals than by monolinguals. One solution to this problem, however, might be to introduce stimuli that could reflexively activate responses [i.e., arrows (Ristic & Kingstone, 2006; Ristic, Friesen, & Kingstone, 2002) in, e.g., the spatial Stroop task (Bialystok, 2006)].
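
The analytical step proposed here can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions, not a reconstruction of any cited design: the function name, the (location, response, RT) trial format, and the RT values are hypothetical; the exclusion rule drops every transition in which either feature repeats, which is slightly broader than the text requires; and conflict adaptation is computed with the conventional contrast (cI − cC) − (iI − iC), where the lowercase letter codes the congruency of trial n – 1 and the uppercase letter that of trial n.

```python
from statistics import mean


def conflict_adaptation(trials):
    """Return (cI - cC) - (iI - iC) in ms from an ordered list of
    (location, response, rt_ms) trials. A trial is treated as congruent
    when location == response (a hypothetical coding scheme)."""
    cells = {"cC": [], "cI": [], "iC": [], "iI": []}
    for prev, cur in zip(trials, trials[1:]):
        prev_loc, prev_resp, _ = prev
        cur_loc, cur_resp, cur_rt = cur
        # Drop complete and partial repetitions: keep only transitions
        # in which neither the location nor the response repeats.
        if prev_loc == cur_loc or prev_resp == cur_resp:
            continue
        label = ("c" if prev_loc == prev_resp else "i") + \
                ("C" if cur_loc == cur_resp else "I")
        cells[label].append(cur_rt)
    if any(len(rts) == 0 for rts in cells.values()):
        raise ValueError("at least one sequence type has no usable trials")
    m = {label: mean(rts) for label, rts in cells.items()}
    return (m["cI"] - m["cC"]) - (m["iI"] - m["iC"])


# Hypothetical four-choice sequence; the final transition repeats the
# location and is therefore excluded from the analysis.
sequence = [(0, 0, 400), (1, 1, 410), (2, 3, 500),
            (0, 1, 480), (3, 3, 430), (3, 0, 999)]
adaptation = conflict_adaptation(sequence)
print(adaptation)
```

A practical analysis would also need to handle error trials and to compute the contrast separately for each participant before comparing language groups; those steps are omitted here for brevity.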

In practice, bilingual research could dissociate these competing theories for sequential modulation and arrive at a purer test of conflict adaptation between language groups. Assuming a performance advantage in bilinguals (as compared to monolinguals) on conflict adaptation (excepting trial sequences consistent with event-file theory), these findings would provide, to reiterate, relatively clear-cut evidence that processing efficiency is improved in this particular system, which, because of its close connection to many other structures in the brain, might imply widespread behavioral and cognitive advantages (a confirmation of the BEPA). Furthermore, although there may be relatively few a priori reasons to assume that bilinguals ought to outperform monolinguals on something akin to repetition priming (Pashler & Baylis, 1991) or perhaps feature integration, these components too could be studied on a trial-by-trial basis to examine numerical differences between language groups. To date, analyses of congruent–congruent and incongruent–incongruent sequences in bilingualism research have not distinguished between complete matches (e.g., trial n–1 = > > > > > and trial n = > > > > >) and complete mismatches (e.g., trial n – 1 = > > > > > and trial n = < < < < <) for congruence (or incongruence). Thus, to date, feature integration and conflict-monitoring accounts are perfectly confounded in the bilingualism literature.

Broader implications of conflict-monitoring advantages for domain-general bilingual advantages

Although behavioral and neurocognitive explorations of bilingual advantages must strive to determine the true relationship between domain-general processing systems and bilingualism, we pause briefly to discuss the implications of an advanced conflict-monitoring system owing to dual-language management. There is, to be sure, a wide variety of hypotheses surrounding a domain-general role for the conflict-monitoring system and its components. Whereas the literature has attributed several possible roles to the conflict-monitoring system, a second consequence of the bilingual advantage on global RTs (or conflict monitoring) might be that one or more components of the conflict-monitoring system, having been relied on frequently for managing multiple languages, confer advantages on other neurocognitive systems for which these same components play a major part. One system, for example, that relies on a component of the conflict-monitoring system is the locus coeruleus norepinephrine system (LCNE), which receives projections from prefrontal regions and the ACC (Aston-Jones & Cohen, 2005). The LCNE is a biphasic multifaceted system. Functionally, it has been hypothesized to regulate task-related decision processes, to facilitate the execution of appropriate behavior, to facilitate attentional filtering and, similarly, to increase attention to task-relevant processes, to name a few of its proposed contributions. To the extent that the efficiency of something like the ACC, which is tightly integrated with other processing systems (including the LCNE), has directly benefited from dual language use, we might also expect behavioral advantages on tasks engaging different systems but between which there is structural overlap.

The implications of interconnected neurocognitive systems are manifold. Among them is the possibility that if one or several modules in a general conflict-monitoring system were well developed because of L2 management, and if similar modules were involved in other brain circuits, advantages could extend to a variety of other cognitive domains. Moreover, it is not necessary, and perhaps it is unlikely, that only one or a spattering of regions in the brain would be affected by the acquisition of a second language. This would ultimately lead to exceptionally complex behavioral and neuropsychological interactions. As a result, while examining one feature of the system would provide invaluable insight as to whether such a system responded differently, or possibly more efficiently, in one group as compared to another, it would be enormously challenging, on the basis of one or possibly several features alone, to account for a pattern of results that has been produced by a tightly integrated system (e.g., Costa et al., 2009, alluded to something to this effect when they expressed the view that inhibitory control processes and conflict monitoring could interact in complex ways [Footnote 18]). Surely, as we have seen from recent neuropsychological data comparing bilinguals and monolinguals on these tasks, there are awe-inspiring differences between the activation patterns and regions involved in the bilingual brain relative to the monolingual brain.

Closing remarks and conclusions

From all of the evidence we have considered, it is at this juncture enormously challenging and perhaps premature to conclude that bilinguals have profited from a lifetime of multilanguage management so as to ensure that they have developed a more adept, general, and multifaceted inhibitory control system that is less subject to degeneration, specifically, from aging. Certainly, there is very little evidence to support the BICA hypothesis; BICA has nonetheless been endorsed, albeit with some recent softening of the idea through greater emphasis on the cognitive control of inhibition (e.g., Bialystok & Craik, 2010; Luk et al., 2010). When bilingual advantages on the interference effect appear in young-adult populations, they appear only briefly, early on, and dissipate very rapidly. That is, when detected (to be sure, such advantages have been detected only twice in flanker interference tasks: Costa et al., 2009; Costa et al., 2008), their onsets occur following the first 24 trials, increase to a pinnacle after three or four blocks (primarily because monolinguals become somewhat monotonically slower on congruent trials), and then abruptly vanish. To date, only one study has found an interference effect advantage that is even remotely close to what an inhibitory control model might predict (Bialystok et al., 2004); this study involved older bilinguals and, in showing the advantage on the interference effect (Exp. 2), also showed the instability of it (Exp. 3). In this case, although there was an overall RT advantage for bilinguals, they also appeared to benefit most on incongruent trials. Critically, recall that the interference effect disappeared with practice and was largest after only a few experimental trials without practice, pointing to the possibility that a reconfiguration of cognitive processes, rather than any enduring bilingual advantage on inhibitory control, might better characterize this finding.

Thus, although there is scant evidence in favor of the BICA hypothesis, there is clearer evidence to suggest that bilinguals enjoy a general processing advantage that can be detected early developmentally and that persists throughout life. This is clear from the robust advantage of bilinguals on global RTs in difficult tasks and nonlinguistic interference tasks (see Fig. 2b), which begins in childhood and lasts into old age. The relative ubiquity of the bilingual advantage in global RTs provides strong support for the BEPA hypothesis. This hypothesis places the locus of control not on inhibitory processes per se, but on a central executive system that has some capacity to regulate processing across a wide variety of task demands. A model of this sort might be able to better accommodate the ephemeral advantage on those tasks that induce interference but that are apparently nonlinguistic or less linguistic in nature (e.g., the Simon or flanker tasks) as a function of unusual conflict adaptation effects early on, differences in task learning, and beyond. It is here that something akin to a more global conflict-monitoring system (Costa et al., 2009; Costa et al., 2008) operates, not entirely as a function of whether conflict or congruence had been perceived on previous trials, but as a general executive system that improves in efficiency owing to the need to monitor linguistic representations competing for selection. The components of such a system, being intricately related to a number of modules that may have also developed through cross-language use, contribute to the regulation of cognitive control by delegating processing between quasi-independent pathways or brain regions. This type of BEPA-oriented theoretical framework might lead to bilingual advantages across a broad range of tasks in which the need for executive control is most pressing and in which processing can be neatly divided between separate processing streams. 
On this last point, however, while it is clear from the neuroscience that a lifetime of dual-language use results in neurocognitive differences between bilinguals and monolinguals, how these differences translate into behavioral differences—and even whether these differences reflect bilingual advantages—is poorly understood, and moving forward, much remains to be learned about these processes.