Introduction

The ability to exert goal-directed control over behaviour allows healthy individuals to flexibly adjust their actions in accordance with current needs and desires. However, occasional ‘slips of action’ reveal that goal-directed control competes with stimulus–response (S-R) habits. According to dual-system theories, the balance between the two ultimately determines behavioural output (de Wit and Dickinson 2009; Dickinson 1985). This notion can be illustrated with the example of cycling into town on a Sunday afternoon in order to have lunch. If one becomes distracted and temporarily loses sight of this goal, one may find oneself instead cycling habitually towards the office. Such slips of action are triggered by environmental stimuli via S-R associations: for example, the sight of the crossroads may trigger the response of turning left towards the office. Whereas goal-directed action has the advantage of flexibility, habitual responding requires less cognitive effort. Investigation of the balance between the underlying neural systems can further our understanding of functional as well as dysfunctional behaviour and is therefore of great societal importance, being relevant to mental health and psychopathology.

Previous animal research has implicated dopamine (DA) in both habitual and goal-directed behaviour. DA-enhancing drugs have been shown to accelerate the transition from goal-directed to habitual control with practice (Nelson and Killcross 2006), while lesions to the nigrostriatal DAergic pathway prevent habit formation (Faure et al. 2005). On the other hand, goal-directed action and outcome prediction may be supported by a DAergic circuit that includes ventromedial prefrontal cortex and nucleus accumbens (Cheer et al. 2007; Goto and Grace 2005; Hitchcott et al. 2007).

So far, direct evidence for a role of DA in goal-directed and habitual control in humans is still lacking, but there have been many claims that baseline DA levels play a role in maladaptive behaviour. For instance, DA is thought to play a role in habitual and ultimately compulsive drug-seeking (Belin and Everitt 2008; Everitt et al. 2001; Everitt and Robbins 2005; Vanderschuren et al. 2005). Similarly, DA may be involved in impulsive and compulsive food-seeking in obesity (Wang et al. 2001). Aberrant habit formation is also thought to play a central role in obsessive–compulsive disorder and Tourette syndrome (Gillan et al. 2011; Graybiel and Rauch 2000), which can be treated with DA receptor antagonists (McDougle et al. 1994), and in anorexia nervosa (Steinglass and Walsh 2006), a condition that has been associated with increased DA receptor activity (Frank et al. 2005).

In the present study, we investigated the role of DA in the balance between S-R learning and goal-directed action by reducing global DA synthesis and transmission through acute phenylalanine and tyrosine depletion (APTD) (Harmer et al. 2001; Montgomery et al. 2003; Robinson et al. 2010; Vrshek-Schallhorn et al. 2006) before testing healthy volunteers on a novel instrumental paradigm (de Wit et al. 2009a, 2007). In the initial instrumental learning stage of this paradigm, participants learned by trial-and-error that certain responses led to rewarding outcomes in the presence of different stimuli. In a subsequent outcome-devaluation test, some of these outcomes were devalued such that participants had to use their knowledge of the response–outcome (R-O) relationships to direct their choices towards still-valuable outcomes. Finally, in a slips-of-action test, participants were shown the stimuli from the original learning stage and were asked to selectively respond to stimuli that signalled the availability of still-valuable outcomes (Gillan et al. 2011). Dominant goal-directed control should be reflected in good selective responding. Conversely, if participants were strongly reliant on S-R associations, they should commit ‘slips of action’, reflected in a failure to withhold responses to stimuli that signalled now-devalued outcomes.

We have recently used this paradigm to show that OCD patients are relatively vulnerable to slips of action (Gillan et al. 2011). Furthermore, we found evidence for a goal-directed deficit in Parkinson’s disease patients that emerged with increasing disease severity (de Wit et al. 2011). The latter finding may be related to progressive DA depletion in the ventral corticostriatal circuit, but we should treat these findings with caution as Parkinson’s disease is associated with disruptions in additional neurotransmitter systems (Agid et al. 1993; Dubois et al. 1990), and medication effects may also have contributed to these findings.

The aim of the present study was to investigate the hypothesis that attenuated global DA levels cause an imbalance between goal-directed and habitual action control in healthy male and female volunteers, without the confounding effects of disease and with no restriction on receptor subtype. To this end, we adopted the dietary intervention of APTD and assessed the effect of decreased DA function on performance on the instrumental learning paradigm.

Materials and methods

Procedures were approved by the Hertfordshire Research Ethics Committee (08/H0311/25) and were in accord with the Helsinki Declaration of 1975. All participants gave written informed consent prior to commencing the study. Testing took place at the Wellcome Trust Clinical Research Facility at Addenbrooke’s Hospital, Cambridge.

Participants

Participants were recruited through local mail and poster advertisements. All participants had been pre-screened by telephone interview to ensure that they met the study criteria. Exclusion criteria were as follows: cigarette smoking, history of psychiatric disorder or neurological disorder, history of major illness, drug abuse, excessive alcohol intake and head injury resulting in unconsciousness. Participants with a first-degree relative with a history of Axis I psychiatric disorder, or who were currently taking psychoactive medication, were also excluded. A second screening on their first visit to the hospital consisted of a physical examination by a trainee clinician (HS) and by nursing staff in the research ward.

One participant withdrew because of feeling unwell following the amino acid drink. A total of 28 participants completed this study (14 male), aged between 19 and 49 years. Females were, on average, 26 years of age (SEM = 1.9), and males were, on average, 29 years (SEM = 1.9). We aimed to test all female participants outside of menses, but due to time constraints, 3 out of the 14 females were tested either during or in the week prior to menses. Furthermore, five females were taking a contraceptive pill at the time of testing. Full demographic details (as well as trait characteristics) have been reported in a previous publication (Robinson et al. 2010).

Acute phenylalanine/tyrosine depletion (APTD) procedure

On the day preceding their visit to the Clinical Research Facility, participants were instructed to follow a low-protein diet (less than 20-g protein) and then to fast from 7 p.m. onwards. All participants arrived on the test day at approximately 9.15 a.m. A baseline blood sample was obtained, following which the amino acid drink was given. For males, the TYR drink contained 15-g isoleucine, 22.5-g leucine, 17.5-g lysine, 5-g methionine, 17.5-g valine, 10-g threonine and 2.5-g tryptophan. The BAL drink contained the same but with the addition of 12.5-g tyrosine and 12.5-g phenylalanine. Female subjects received 20% less of each amino acid in order to account for a lower average body weight. The amino acids were dissolved in approximately 300 ml of water, and lemon flavouring was added to make the drink more palatable. Fourteen subjects (seven males) received the TYR drink, while the other 14 participants (seven males) received the BAL drink. Both the participant and the researcher were blind to which drink was being administered, and it was randomly assigned. After consuming the drink, the participants were given free time, but were asked to remain at the research facility. They had unlimited access to water, and at 12 p.m., they were given an apple to avoid hypoglycaemia. Approximately 4.5 h after consuming the BAL/TYR drink, a second blood sample was taken, and behavioural testing was then carried out. Once testing was completed, participants were given a meal and were allowed to go home.
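The drink compositions can be tabulated with a short calculation. The following is an illustrative sketch only: it assumes that the 20% reduction for female participants was applied uniformly to every amino acid, and the names MALE_TYR_G, MALE_BAL_G, scale_drink and total_grams are our own labels rather than part of the study protocol.

```python
# Amino acid quantities (grams) in the male TYR (depleting) drink, as listed in the text.
MALE_TYR_G = {
    "isoleucine": 15.0,
    "leucine": 22.5,
    "lysine": 17.5,
    "methionine": 5.0,
    "valine": 17.5,
    "threonine": 10.0,
    "tryptophan": 2.5,
}

# The BAL (balanced) drink contains the same mixture plus tyrosine and phenylalanine.
MALE_BAL_G = {**MALE_TYR_G, "tyrosine": 12.5, "phenylalanine": 12.5}

# Assumption: "20% less of each amino acid" means every quantity is multiplied by 0.8.
FEMALE_SCALE = 0.8


def scale_drink(drink, factor):
    """Return a copy of the drink with every quantity multiplied by factor."""
    return {name: round(grams * factor, 2) for name, grams in drink.items()}


def total_grams(drink):
    """Total amino acid load of a drink in grams."""
    return round(sum(drink.values()), 2)


FEMALE_TYR_G = scale_drink(MALE_TYR_G, FEMALE_SCALE)
FEMALE_BAL_G = scale_drink(MALE_BAL_G, FEMALE_SCALE)

for label, drink in [
    ("male TYR", MALE_TYR_G),
    ("male BAL", MALE_BAL_G),
    ("female TYR", FEMALE_TYR_G),
    ("female BAL", FEMALE_BAL_G),
]:
    print(label, total_grams(drink), "g")
```

Under these assumptions, the two drinks differ only in the tyrosine and phenylalanine content, which is the intended manipulation.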

Behavioural testing

We compared performance of the BAL and TYR groups on an instrumental learning paradigm (de Wit et al. 2007) and a digit span test (Wechsler 1981). In addition, participants received a number of tasks and measures of mood in a crossover design. These have been reported elsewhere (Robinson et al. 2010).

Instrumental paradigm description

The instrumental learning paradigm was programmed in Visual Basic 6.0 and was presented on an Advantech Paceblade computer. The paradigm was divided into three stages: instrumental learning, outcome-devaluation test and slips-of-action test. For detailed descriptions, we refer the reader to previous publications: learning stage and outcome-devaluation test (de Wit et al. 2007) and slips-of-action test (Gillan et al. 2011). In the following sections, we describe the basic features of these tasks (see Fig. 1 for a schematic depiction).

Fig. 1

Greyscale illustration of the three-stage instrumental paradigm. a Illustration of the three discrimination types: standard, congruent and incongruent. b Instrumental learning. In this example from the standard discrimination, participants are presented with grapes on the outside of the box. If the incorrect (L left) key is pressed, an empty box is revealed (and no points are earned). If the correct (R right) key is pressed, participants are rewarded with cherries on the inside of the box (and points). c Outcome-devaluation test. In this example, participants are presented with two open boxes with a melon and cherries inside. The cross superimposed on the cherries indicates that this fruit type is no longer worth any points. The correct response in this example would be to press the left key (which, during training, yielded the still-valuable melon outcome). d Slips-of-action test. In this example, the initial instruction screen shows that the pineapple and cherries outcomes will now lead to the subtraction of points, as indicated by the crosses. The other four outcomes are still valuable. Following the instruction screen, participants are presented with a rapid succession of the fruit stimuli (on the front doors of the boxes) and are asked to press the correct keys (Go) when a stimulus signals the availability of a still-valuable outcome inside the box, but to refrain from responding (No-Go) when the outcome inside the box has been devalued. In this particular example, participants should press when the apple stimulus is depicted on the front door, but refrain from responding on trials with the grape stimulus

Instrumental learning

Participants were instructed to earn as many points as they could by collecting food items from inside a box that was displayed on the screen. At the beginning of each trial, a closed box was shown on the screen, with a picture of a food item on the front. This food item acted as a discriminative stimulus, signalling which of two instrumental responses, either a right or left key press, would be rewarded with another food item and points (see Fig. 1b). Participants had to find out by trial and error which key to press for six different food pictures on the outside of the box. Whereas correct responses opened the box to reveal a food reward inside and points, the box was empty following incorrect responses, and no points were earned. In order to perform well during this stage, participants had to learn, therefore, which was the correct key to press for each stimulus on the outside of the box. However, they were also instructed to pay attention to what was inside the box as this would become important at a later stage of the game. Finally, faster correct responses earned more points (in the range from 1 to 5). The training consisted of six blocks. In each block, each of the six stimuli was presented twice in random order.
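The block structure of the training stage (six blocks, each presenting every one of the six stimuli twice in random order) can be sketched as follows. This is a minimal illustration and not the original Visual Basic 6.0 implementation: the stimulus labels are placeholders, the function name make_training_schedule is ours, and the mapping from response speed to points (1 to 5) is not modelled because its exact form is not specified here.

```python
import random

N_BLOCKS = 6             # six training blocks
REPEATS_PER_BLOCK = 2    # each stimulus shown twice per block
# Placeholder labels; the actual food pictures are described in the cited publications.
STIMULI = ["stim_1", "stim_2", "stim_3", "stim_4", "stim_5", "stim_6"]


def make_training_schedule(rng):
    """Return a list of blocks; each block is a randomly ordered list of stimuli."""
    schedule = []
    for _ in range(N_BLOCKS):
        block = STIMULI * REPEATS_PER_BLOCK
        rng.shuffle(block)  # random order within the block
        schedule.append(block)
    return schedule


schedule = make_training_schedule(random.Random(0))
print(len(schedule), "blocks of", len(schedule[0]), "trials")
```

With these parameters, each participant would complete 72 training trials, that is, 12 presentations of each stimulus.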

Participants were trained concurrently on three bi-conditional discriminations: congruent, standard and incongruent (see Fig. 1a). For each discrimination, one food picture on the front of the box would signal that the left response was correct, while another picture would signal that the right response was correct. On trials of the critical, standard discrimination, four different food pictures functioned as stimuli and as outcomes. In addition, we included a congruent discrimination that did not require outcome learning because each food item on the outside of the box (the stimulus) was identical to the food item inside the box (the outcome). Conversely, on trials of the incongruent discrimination, each food item functioned as stimulus and outcome for opposing responses. For example, an orange stimulus signalled that the right response would be rewarded with a pineapple outcome, while on other trials, the pineapple would function as a discriminative stimulus signalling that the left response would be rewarded with an orange outcome. In this case, goal-directed learning about the incongruent outcome is rendered disadvantageous because it interferes with activation of the correct response through the appropriate S-R associations. In this example, associating the orange outcome with the left response (O-R) that earned it would interfere with discriminative control by the orange stimulus over the right response (S-R). Therefore, performance on incongruent trials should rely solely on habitual control via S-R associations. In line with previous studies (de Wit et al. 2007, 2009; Dickinson and de Wit 2003), we should therefore expect to observe a ‘congruence effect’: performance should be superior on the standard and congruent discriminations relative to the incongruent discrimination because only the former two can benefit from goal-directed support (this should also be reflected in the performance on the outcome-devaluation test described in the following section). The incongruent discrimination therefore provides us with a baseline measure of S-R habit learning.
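The logic of the three discrimination types can be made concrete with a small data-structure sketch. Only the incongruent orange/pineapple pair is taken from the example in the text; the congruent and standard items below are placeholder labels, and the names Contingency, classify and conflicting_items are hypothetical helpers introduced purely for illustration.

```python
from typing import NamedTuple


class Contingency(NamedTuple):
    stimulus: str   # food picture on the front of the box
    response: str   # "left" or "right"
    outcome: str    # food picture inside the box


# Taken from the example in the text.
INCONGRUENT = [
    Contingency("orange", "right", "pineapple"),
    Contingency("pineapple", "left", "orange"),
]
# Placeholder items (assumed for illustration only).
CONGRUENT = [
    Contingency("food_A", "left", "food_A"),
    Contingency("food_B", "right", "food_B"),
]
STANDARD = [
    Contingency("food_C", "left", "food_D"),
    Contingency("food_E", "right", "food_F"),
]


def classify(pair):
    """Label a discrimination by how its stimuli and outcomes overlap."""
    stimuli = {c.stimulus for c in pair}
    outcomes = {c.outcome for c in pair}
    if all(c.stimulus == c.outcome for c in pair):
        return "congruent"
    if stimuli == outcomes:
        return "incongruent"
    if stimuli.isdisjoint(outcomes):
        return "standard"
    raise ValueError("unrecognised discrimination structure")


def conflicting_items(pair):
    """Items whose S-R response differs from the response that earns them (O-R)."""
    cued = {c.stimulus: c.response for c in pair}       # S-R mapping
    earned_by = {c.outcome: c.response for c in pair}   # O-R mapping
    return sorted(
        item for item in cued
        if item in earned_by and cued[item] != earned_by[item]
    )


for name, pair in [("congruent", CONGRUENT), ("standard", STANDARD), ("incongruent", INCONGRUENT)]:
    print(name, classify(pair), conflicting_items(pair))
```

In this sketch, only the incongruent pair yields items for which the S-R and O-R associations point to opposing responses, which is why outcome-based learning is expected to hinder rather than help performance on those trials.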

Outcome-devaluation test

Following the learning phase, the instructed outcome-devaluation test was conducted to assess R-O knowledge (see Fig. 1c). In this stage, participants were presented with two open boxes, which contained foods that had previously been collected. One food was previously earned by pressing left and the other by pressing right. However, one of the food items had a red cross superimposed on top of it, indicating that it was no longer worth any points. Participants were instructed to press the key that would allow them to collect the still-valuable food. The outcome-devaluation stage consisted of 12 trials, with 4 trials for each of the three discriminations, presented in random order. During the test stages, response feedback was no longer provided.

Slips-of-action test

This final test stage was designed to assess directly the balance between habitual and goal-directed control (see Fig. 1d). At the start of each of six blocks, all six food outcomes inside the boxes were shown on the screen, but a red cross was superimposed on two of these to indicate that these would now lead to subtraction of points. Subsequently, a series of closed boxes with the food stimuli on the front was shown in rapid succession. Participants were instructed to earn points by pressing the appropriate keys in order to open boxes that contained still-valuable outcomes (the four outcomes shown without a cross at the start of each block), but to refrain from responding if a box contained a now-devalued food item (the two outcomes shown with a cross superimposed at the start of each block). Each of the six stimuli was shown four times per block, and across blocks, each of the outcomes was devalued twice.
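One way to satisfy the constraints described above (six blocks, two outcomes devalued per block, each outcome devalued twice across blocks, each stimulus shown four times per block) is sketched below. The counterbalancing scheme actually used is not specified here, so the pairing procedure is an assumption, as are the placeholder stimulus and outcome labels and the function names.

```python
import random

OUTCOMES = ["outcome_%d" % i for i in range(1, 7)]
# Placeholder one-to-one pairing of stimuli with the outcomes they signal (assumed).
STIMULUS_TO_OUTCOME = {"stim_%d" % i: "outcome_%d" % i for i in range(1, 7)}
PRESENTATIONS_PER_BLOCK = 4


def make_devaluation_pairs(rng):
    """Return six pairs of devalued outcomes; each outcome appears in exactly two pairs."""
    pairs = []
    for _ in range(2):  # two passes over all six outcomes
        order = OUTCOMES[:]
        rng.shuffle(order)
        pairs.extend([(order[0], order[1]), (order[2], order[3]), (order[4], order[5])])
    rng.shuffle(pairs)  # randomise which pair is used in which block
    return pairs


def make_slips_blocks(rng):
    """Build the six test blocks with Go/No-Go labels for every trial."""
    blocks = []
    for devalued in make_devaluation_pairs(rng):
        stimuli = list(STIMULUS_TO_OUTCOME) * PRESENTATIONS_PER_BLOCK
        rng.shuffle(stimuli)
        trials = [
            (stim, "no-go" if STIMULUS_TO_OUTCOME[stim] in devalued else "go")
            for stim in stimuli
        ]
        blocks.append({"devalued": devalued, "trials": trials})
    return blocks


blocks = make_slips_blocks(random.Random(0))
print(len(blocks), "blocks,", len(blocks[0]["trials"]), "trials per block")
```

Under this scheme, each block contains 24 trials, of which 8 require withholding a response and 16 require responding.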

This test was used to directly assess relative habitual and goal-directed control. Strong response activation via direct S-R associations should lead to commission errors on trials with the devalued outcomes. Conversely, successful selective inhibition on the basis of outcome value should be indicative of dominant goal-directed control that is mediated by anticipation and evaluation of the consequent outcome.

Digit span test

In the backward version of the digit span test (Wechsler 1981), random sequences of numbers were read out by the experimenter, and participants were asked to repeat these in reverse order. The list was initially very short (only two numbers) but increased in increments of one number at each stage. Two trials per stage were administered. Testing was stopped after participants failed both trials of a given stage or when the final stage (stage 7; 8 numbers long) was completed, whichever came first.
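The administration and stopping rule can be expressed compactly as follows. Note that the scoring rule used here (one point per correctly reversed sequence, giving a maximum of 14) is an assumption made for illustration, and the function names are ours.

```python
N_STAGES = 7        # final stage is stage 7
FIRST_LENGTH = 2    # stage 1 uses sequences of two numbers


def sequence_length(stage):
    """Sequence length at a given stage (stage 1 -> 2 numbers, stage 7 -> 8 numbers)."""
    return FIRST_LENGTH + (stage - 1)


def score_digit_span(results):
    """Apply the stopping rule to per-stage results.

    results: list of (first_trial_correct, second_trial_correct) tuples,
             one per stage, in order of administration.
    Returns (score, stages_administered), where score counts correct trials
    (assumed scoring rule).
    """
    score = 0
    stages_administered = 0
    for first, second in results[:N_STAGES]:
        stages_administered += 1
        score += int(first) + int(second)
        if not first and not second:  # both trials failed: stop testing
            break
    return score, stages_administered


# Hypothetical participant who fails both trials at stage 3.
print(score_digit_span([(True, True), (True, False), (False, False), (True, True)]))
```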

Statistical analysis

All data were analysed using SPSS version 15.0. We conducted analyses of variance (ANOVA), which always included the between-subjects factors gender and APTD (referring to the groups that received either the BAL or the TYR drink). Bonferroni corrections were adopted for pairwise comparisons. All p-values are based on Greenhouse–Geisser sphericity corrections, and all significant (p < .05) higher-order interactions with APTD and gender are reported.

We focused our analyses of APTD’s effects on action control on standard trials, which lack the stimulus–outcome confound that is inherent to congruent and incongruent trials. However, we also report additional analyses to assess relative performance on the standard, congruent and incongruent trials. On congruent trials, active outcome retrieval was rendered unnecessary, while on incongruent trials, it was actually disadvantageous. For the mean performance values of the latter two, we refer to the supplemental table (Online Resource 1). Furthermore, complementary RT analyses are also included in the supplemental material (Online Resource 2).

Results

Biochemical effects of APTD treatment

Blood samples were not available for one female participant in the BAL and one in the TYR condition. For the remaining 26 participants, we calculated the ratio of TYR and PHE plasma concentrations combined, to those of other large neutral amino acids (LNAAs), at baseline and at approximately 4.5 h post-drink. These TYR/PHE:∑LNAAs ratios provide an index of TYR availability in the brain. A priori statistical analysis yielded a significant APTD*Time interaction (F(1,22) = 15.98, MSE = .009, p = .001). As can be seen in Fig. 2, TYR availability in the brain was unaffected by the BAL drink (F(1,11) = 1.53, MSE = .013), while the TYR drink led to a significant reduction (F(1,11) = 86.26, MSE = .005, p < .0005). Therefore, we established that the APTD treatment was successful in reducing DA precursor.
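The availability index can be written as (TYR + PHE) / ∑LNAAs, where the sum runs over the other large neutral amino acids. The sketch below illustrates the calculation only: the set of LNAAs entering the denominator is not listed here, and all plasma concentrations in the example are invented values rather than study data.

```python
def tyr_phe_lnaa_ratio(tyr, phe, other_lnaas):
    """Ratio of combined TYR and PHE concentration to the summed other LNAAs.

    tyr, phe: plasma concentrations (same units as other_lnaas)
    other_lnaas: iterable of concentrations of the competing LNAAs
    """
    total_other = sum(other_lnaas)
    if total_other <= 0:
        raise ValueError("summed LNAA concentration must be positive")
    return (tyr + phe) / total_other


# Invented example values (arbitrary units), for illustration only.
baseline = tyr_phe_lnaa_ratio(tyr=60.0, phe=55.0, other_lnaas=[120.0, 60.0, 230.0, 45.0])
post_drink = tyr_phe_lnaa_ratio(tyr=15.0, phe=10.0, other_lnaas=[300.0, 150.0, 500.0, 60.0])
print(round(baseline, 3), round(post_drink, 3))
```

A lower ratio after the drink than at baseline would indicate reduced availability of the DA precursors relative to the amino acids that compete with them for transport into the brain.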

Fig. 2

Biochemical effects of APTD treatment: average TYR/PHE:∑LNAAs ratios (and SEMs) are shown separately for the male and female participants (left versus right graph) in the BAL and TYR groups (empty versus filled dots), both pre-drink (T0) and post-drink (T4.5)

Effects of APTD on instrumental learning

Figure 3 displays percentages of correct responses on standard trials during the six blocks of training, with 50% indicating chance level performance. A priori analysis of standard trials yielded only a significant effect of block (F(5,12) = 14.38, MSE = 281.0, p < .0005). There were no significant effects involving APTD or gender (Fs < 1). Therefore, instrumental learning was not affected by APTD.

Fig. 3

Instrumental learning: average percentages of correct responses (and SEMs) on standard trials during the six blocks of instrumental training are shown separately for male and female participants (left and right panel) and for the BAL and TYR groups (empty and filled dots, respectively)

As expected on the basis of previous studies (de Wit et al. 2007, 2009a), our a priori overall analysis confirmed that participants acquired the congruent and standard discriminations at a faster rate than the incongruent (F(2,48) = 25.86, MSE = 311.1, p < .0005). A priori group analyses confirmed that this ‘congruence effect’ was significant in both the BAL group (F(2,24) = 24.87, MSE = 403.0, p < .0005) and in the TYR group (F(2,24) = 8.88, MSE = 403.0, p < .05). In order to specifically determine the effect of APTD on S-R learning, we conducted a separate a priori analysis of incongruent trials. This analysis failed to yield an effect of APTD (F < 1), providing evidence that S-R learning was not impaired.

Effects of APTD on outcome-devaluation test of R-O learning

APTD did not affect R-O learning. A priori analysis of performance on standard trials did not yield a significant effect of APTD (F < 1), nor an APTD*Gender interaction (F < 1). Average percentages of correct responses were 91% in the BAL group and 88% in the TYR group.

An additional a priori analysis that also included the congruent and incongruent discriminations revealed a significant three-way APTD*Discrimination*Gender interaction (F(2,48) = 10.53, MSE = 450.2, p < .0005). Breakdown of this interaction showed that there was a significant APTD*Discrimination interaction for females (F(2,24) = 10.78, MSE = 377.0, p = .001), but not for males (F(2,24) = 2.25, MSE = 523.3). Post-hoc independent sample t-tests established that female participants who had been given the TYR drink performed significantly worse on the incongruent discrimination than those in the BAL group (t = 4.89, p = .001), while performance on the standard (t = .28) and congruent discriminations (t = 1.55) was statistically indistinguishable. Furthermore, incongruent performance was even significantly below chance level for female subjects in the TYR group (t = 4.89, p = .001), suggesting reliance on S-R habits as opposed to goal-directed action control. In conclusion, the results of the outcome-devaluation test suggest that APTD did not impair R-O learning, but, if anything, strengthened habitual control in females.

Effects of APTD on the balance between goal-directed and habitual control in the slips-of-action test

To investigate the effect of APTD on the occurrence of slips of action, we calculated percentages of responses made (number of responses made/number of trials * 100) separately for trials on which valuable outcomes were signalled to be available versus trials on which devalued outcomes were signalled to be available. Perfect performance would be 100% responding towards valuable outcomes and 0% towards devalued outcomes.
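The response percentages defined above can be computed as in the following sketch. The trial representation (pairs of booleans) and the function names are our own illustrative choices, and the example data are invented.

```python
def percent_responded(n_responses, n_trials):
    """Percentage of trials on which a response was made."""
    return 100.0 * n_responses / n_trials


def slips_summary(trials):
    """Summarise responding separately for valuable and devalued trials.

    trials: iterable of (is_devalued, responded) boolean pairs.
    Returns (percent responses to valuable, percent responses to devalued).
    """
    counts = {False: [0, 0], True: [0, 0]}  # is_devalued -> [responses, trials]
    for is_devalued, responded in trials:
        counts[is_devalued][1] += 1
        counts[is_devalued][0] += int(responded)
    valuable = percent_responded(*counts[False])
    devalued = percent_responded(*counts[True])
    return valuable, devalued


# Invented example block: 16 valuable trials (14 responses), 8 devalued trials (2 responses).
example = (
    [(False, True)] * 14 + [(False, False)] * 2
    + [(True, True)] * 2 + [(True, False)] * 6
)
print(slips_summary(example))
```

In this representation, a slip of action corresponds to a response on a devalued trial, so a higher second value indicates more slips.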

As can be seen in Fig. 4, standard performance on the slips-of-action test was severely disrupted by APTD in females only. In line with this observation, a priori statistical analysis yielded a significant APTD*Devaluation*Gender interaction (F(1,24) = 5.67, MSE = 544.2, p < .05). Separate analyses of male and female performances confirmed that there was a significant APTD*Devaluation interaction in females (F(1,12) = 13.07, MSE = 378.0, p < .005), but not in males (F < 1). Post-hoc analyses of female participants revealed that responding towards valuable outcomes was unaffected (F(1,12) = 1.24, MSE = 248.1, p > .05), but APTD disrupted the ability to withhold responses towards devalued outcomes (F(1,12) = 12.11, MSE = 553.4, p < .005).

Fig. 4

Slips-of-action test: average percentages of responding (and SEMs) towards valuable outcomes versus devalued outcomes are shown separately for male and female participants, in the left and right graphs, respectively

An additional a priori analysis with the within-subject factor discrimination type yielded a significant Discrimination*Devaluation interaction (F(2,48) = 15.61, MSE = 305.5, p < .0005). Whereas separate post-hoc analyses of the congruent trials revealed only a significant main effect of devaluation (F(1,24) = 216.2, MSE = 263.9, p < .0005), incongruent performance revealed the same three-way Devaluation*APTD*Gender interaction as we reported for standard performance (F(1,24) = 7.54, MSE = 889.0, p = .01). APTD affected incongruent performance negatively only in female participants (F(1,12) = 9.90, MSE = 550.4, p < .01). Therefore, the detrimental effect of APTD on slips of action by female participants was specifically due to disrupted performance on standard and incongruent trials, on which successful performance required the ability to evoke the available outcome and its current value. In contrast, performance was intact on congruent trials, on which the decision to respond or not could be based on the value of the stimuli that were directly presented on the front of the boxes. Therefore, APTD did not abolish females’ ability to selectively withhold responses generally, but instead specifically affected response inhibition on the basis of anticipated outcome value.

Controlling for menses status and contraception in females

The role of gonadal hormones in the female-specific effect of APTD remains to be elucidated in future research. However, we have made a first attempt to control for the menses status of our female participants at the time of testing by conducting post-hoc analyses on performance on the slips-of-action test with the between-subjects factor menses status (with 3 out of 14 females having been tested during or in the week prior to menses). Hormonal regulation is also affected by the use of a contraceptive pill, so additional analyses were conducted to control for this factor (with 5 out of 14 females being on the pill at the time of testing). In these additional analyses, the effect of APTD on performance on the standard trials of the slips-of-action test (Fs(1,10) = 12.05 and 8.66, MSEs = 409.5 and 406.0, ps < .01) and incongruent trials (Fs(1,10) = 5.37 and 8.66, MSEs = 541.9 and 603.8, ps < .05) proved to be robust and did not interact with menses status or with pill use (Fs < 1).

Correlational analyses of DA availability and slips of action

We conducted an a priori Spearman correlational analysis to investigate whether performance on standard trials of the slips-of-action test (as reflected in a difference score of responding towards valuable minus devalued outcomes) was directly related to post-drink DA availability (as assessed in terms of TYR/PHE:∑LNAAs ratios). Whereas DA availability failed to predict slips of action in male participants (Rho = −.11, p = .7), DA availability in female participants correlated positively and significantly with the ability to base instrumental responding on current outcome value in the slips-of-action test (Rho = .58, p < .05). These results provide further support for a role of DA in determining the balance between goal-directed and habitual control, with acute APTD tipping the balance towards habitual responding.
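For readers unfamiliar with the measure, the difference score and a Spearman rank correlation can be computed as in the sketch below. The analyses reported here were run in SPSS; this pure-Python version (average ranks for ties, followed by a Pearson correlation on the ranks) is only an illustration, the function names are ours, and the example data are invented.

```python
from math import sqrt


def difference_score(pct_valuable, pct_devalued):
    """Responding towards valuable minus devalued outcomes (percentage points)."""
    return pct_valuable - pct_devalued


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented data: post-drink ratios and slips-of-action difference scores.
ratios = [0.05, 0.08, 0.12, 0.20, 0.25]
scores = [difference_score(v, d) for v, d in [(80, 70), (85, 40), (90, 50), (88, 20), (95, 10)]]
print(round(spearman_rho(ratios, scores), 2))
```

A positive coefficient in this setup means that higher precursor availability goes together with a larger difference between responding for valuable and devalued outcomes, that is, with fewer slips of action.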

Working memory and age

In order to exclude the possibility that APTD’s detrimental effects on goal-directed action were mediated by a working memory impairment, we analysed backward scores on the digit span task. A priori statistical analysis confirmed that the TYR and BAL groups performed equally well on this task (F < 1), with an equal average score of 9.1 (SEMs = 0.6 and 0.8, respectively). Furthermore, male and female participants performed equally well overall (F < 1), with average scores of 8.7 and 9.5 (SEMs = 0.6 and 0.8, respectively). Finally, there was no APTD*Gender interaction (F(1,24) = 1.47, MSE = 7.000). Therefore, over-reliance on habits at the expense of goal-directed action under APTD does not appear to be mediated by a detrimental effect on working memory.

Our participants varied in age, so we conducted similar analyses to rule out that age differences accounted for the effect of APTD. We established that male and female participants did not differ significantly in age (F < 1), nor was there a main effect of APTD (F < 1). Finally, there was no significant Gender*APTD interaction (F < 1).

Summary of main results

Lowering DA levels through APTD did not affect performance of male participants, but favoured habitual relative to goal-directed control in female participants. While APTD did not impair their ability to associate stimuli with the appropriate responses, nor learning about the response–outcome relationships, APTD did disrupt their performance on the slips-of-action test. More specifically, when females were required to base responding on the current value of a signalled outcome, on standard and incongruent trials of the slips-of-action test, they failed to withhold responses towards devalued outcomes. DA availability correlated positively with relative goal-directed action control on this test.

Discussion

This study demonstrates that putative DA depletion through APTD shifts the balance from goal-directed to habitual action control. This effect occurred in females only and was directly related to DA availability. To briefly summarise, in the initial learning stage, APTD did not affect their ability to use discriminative stimuli to guide instrumental choice. Therefore, S-R learning was not impaired. R-O learning was also intact, as reflected in successful performance on standard trials of the subsequent outcome-devaluation test. However, APTD had a significant negative effect on incongruent test performance. Below-chance performance on this subset of trials suggests that APTD rendered their performance more strongly reliant on S-R associations. These findings suggest that APTD, if anything, led to stronger habitual control. Indeed, when goal-directed action and S-R habits were brought into direct competition in the ‘slips-of-action’ test, we found that females failed to selectively inhibit responses that led to devalued outcomes. Therefore, we propose that APTD led to reliance on S-R habits at the expense of goal-directed action.

The role of dopamine in goal-directed actions and habits

Our conclusion that lowering DA levels through APTD shifts the balance towards habitual responding is in seeming conflict with a previous study (Faure et al. 2005), which demonstrated that 6-hydroxydopamine lesions of the nigrostriatal pathway in rats disrupt habit formation with extensive training. However, in the present study, we did not investigate the effect of extensive practice. Instead, we adopted a paradigm that allowed us to study relative goal-directed and habitual control within the relatively brief time window within which APTD is effective. Future research is required to investigate whether in humans, as in animals, low DA levels prevent the development of behavioural autonomy as a consequence of over-training (Wickens et al. 2007).

We should also note that Faure and colleagues specifically targeted the nigrostriatal pathway. This pathway has previously been implicated in habit formation, while goal-directed action appears to be subserved by a parallel pathway that includes ventral striatum and ventromedial prefrontal cortex. Evidence for this dual-system neural architecture, from both animal and human research, points to the importance of the particular pathway in which DA is affected (Balleine and O’Doherty 2010). Indeed, DA function in the ventral corticostriatal circuit has been implicated in goal-directed action and outcome prediction (Cheer et al. 2007; Day and Carelli 2007; Hitchcott et al. 2007; Hollerman et al. 2000; Pessiglione et al. 2006; Schultz 1998; Taylor et al. 2007; Waelti et al. 2001). Therefore, it is possible that the effect of APTD in the current study was mediated by DA depletion in this ventral corticostriatal circuit (McLean et al. 2004), and our finding that APTD leads to over-reliance on S-R learning is therefore not necessarily in conflict with previous animal lesioning research into DA function. Future research may adopt the technique of PET to determine the relative efficacy of APTD in reducing DA neurotransmission in these corticostriatal pathways (Leyton et al. 2004; Montgomery et al. 2003).

Finally, APTD likely led to considerably less DA depletion in humans than the 6-OHDA lesions in the study of Faure et al. (2005), which produced an almost complete depletion of nigrostriatal DA in rats. Therefore, it is possible that similarly profound depletion in humans would exert effects similar to those reported by Faure and colleagues. However, in a recent study into Parkinson’s disease, in which striatal DA is severely compromised, we also failed to find evidence for a deficit in S-R habit learning using our instrumental paradigm (de Wit et al. 2011).

Possible limitations of the present study

While APTD’s effectiveness in different brain regions is, at present, not fully understood, it does appear that it mainly influences DA levels. Although tyrosine is also a precursor for noradrenaline, previous animal and human research indicates that APTD affects DA selectively (McTavish et al. 1999a, b; Sheehan et al. 1996).

At first glance, it may seem that the impaired performance on the slips-of-action test could have been due to a general inhibitory deficit. However, previous research into response inhibition implicates mainly serotonergic and noradrenergic systems as opposed to DA (Eagle et al. 2008). Furthermore, performance on congruent test trials, on which participants could withhold responses simply on the basis of stimulus identity, was not impaired. Instead, APTD interfered specifically with performance on standard and incongruent trials, on which selective inhibition critically depended on the ability to accurately predict the outcome that was signalled to be available by the stimuli. Finally, impaired performance did not appear to be mediated by a general working memory deficit. In line with previous research, APTD did not affect working memory as assessed with the digit span test (Mehta et al. 2005). Therefore, it appears that in the present study, APTD compromised specifically the ability to perform in a goal-directed manner, with greater control over behaviour being conferred on the S-R habit system.

Possible basis of the observed gender difference

Females were more sensitive to the disruptive effects of APTD than males. Previous publications have also pointed to gender differences in cognitive performance following APTD treatment (e.g., Munafo et al. 2007; Robinson et al. 2010). In the present study, it appears unlikely that this was due to differential effectiveness of the APTD treatment, as amino acid dosages were adjusted to account for the lower body weight of females. Instead, the basis of this gender difference could be that females have significantly higher striatal DA synthesis capacity than males (Laakso et al. 2002; Haaxma et al. 2007).

We should point out that menstrual cycle phase and use of the contraceptive pill are important factors to consider in DA regulation and cognitive performance in females. Although the present study was not designed to address this issue, inclusion of these variables in our analyses did not affect our finding of a female-specific performance deficit as a consequence of APTD. Having established this gender difference, future research should further investigate the too commonly neglected role of gonadal hormones in the effects of dopamine reduction on action control.

The role of dopamine in psychopathologies

It is also possible that females’ sensitivity to the disruptive effects of APTD relates to differential vulnerability to psychopathologies that involve alterations of DA transmission (Cahill 2006; Seeman 1997; Wetherington 2007). Several gender differences have been observed in psychopathologies that involve impulsive and compulsive behaviours, although the contribution of DA remains to be elucidated. For example, recent epidemiological research suggests that following first-time cocaine use, females are more likely to develop dependence on the drug than males (O’Brien and Anthony 2005), and animal research also points to greater vulnerability in females (Lynch 2006; Roth and Carroll 2004). DA depletion in Parkinson’s disease (PD) may also affect the balance between goal-directed and habitual behaviour (de Wit et al. 2011). Greater vulnerability to this disease in males than in females has been hypothesised to be related to higher striatal DA levels in females (Haaxma et al. 2007). A gender bias has also been observed in vulnerability to impulse control disorders in medicated PD patients (Giladi et al. 2007). Finally, females’ vulnerability to depressive thinking styles and to ruminative tendencies (Strauss et al. 1997) may reflect a greater susceptibility to automaticity in thought as well as in outward action. In conclusion, the contribution of DA to gender differences in susceptibility to DA-dependent psychopathologies clearly warrants further investigation.

Conclusion

We present the first investigation of the role of global DA in the balance between habitual and goal-directed behavioural control in humans, using the safe dietary intervention of APTD. We provide evidence that APTD interferes with neither S-R nor R-O learning. However, APTD does appear to shift the balance towards reliance on habitual responding at the expense of flexible, goal-directed action. This detrimental effect of APTD was restricted to female volunteers. Future research is required to improve our understanding of this gender difference in dopaminergic regulation of habitual and goal-directed behaviour.