Introduction

Impulsivity has long been thought to play a role in the aetiology of drug dependence, but the mechanisms underpinning this association remain to be clarified. Prevailing evidence indicates that impulsivity and drug use are co-morbid (Stanford et al. 2009) and that impulsivity both pre-dates (Ersche et al. 2010; Tarter et al. 2004; Verdejo-García et al. 2008) and is exacerbated by drug exposure (Dallery and Locey 2005; Heil et al. 2006; Setlow et al. 2009; Winstanley 2007), suggesting their relationship is reciprocal (De Wit 2009; Everitt et al. 2008; Perry and Carroll 2008). The question, therefore, is how impulsivity influences drug-seeking/taking to enhance dependence.

One possibility is that impulsivity confers hypersensitivity to drug reinforcement which establishes higher rates of drug-seeking/taking. This claim is supported by the finding that impulsivity in rats, quantified by preference for small immediate reward over a delayed larger reward (delay discounting) is associated with higher rates of cocaine (Anker et al. 2009; Perry et al. 2005; Perry et al. 2008), alcohol (Poulos et al. 1995) and methylphenidate self-administration (Marusich and Bardo 2009). Similarly, impulsivity in rats, assessed by premature responding in the five-choice serial reaction time task (five-choice), is associated with reduced D2 receptor availability and higher rates of cocaine self-administration (Dalley et al. 2007; see also Le Foll et al. 2009). In turn, reduced D2 availability is associated with higher rates of cocaine self-administration in monkeys (Nader et al. 2006) and more positive subjective liking of methylphenidate in humans (Volkow et al. 1999; Volkow et al. 2002). Finally, human impulsivity indexed by the Barratt Impulsivity Scale (BIS; Stanford et al. 2009) is associated with reduced dopamine D2 receptor availability in the striatum (Lee et al. 2009), implying that BIS impulsivity should be linked to higher rates of self-administration.

The alternative possibility is that impulsivity does not influence drug reinforcement, but rather, facilitates automatic or habitual control of drug-seeking/taking behaviour by drug-associated stimuli. Evidence for this proposal comes from the finding that five-choice impulsive rats do not maintain higher rates of cocaine self-administration at baseline, but rather, show selective perseveration of self-administration punished by electric shock (Belin et al. 2008; Economidou et al. 2009). Arguably, such perseverative self-administration is not mediated by the response–outcome contingency, but rather, is mediated by a crystallized, or compulsive, stimulus–response (S-R) association which renders the behaviour resistant to challenge (Belin et al. 2009; Everitt et al. 2008).

Further complexity in this area has been raised by Diergaarde et al. (2008). In this study, five-choice impulsive rats showed a higher rate of nicotine self-administration (consistent with Dalley et al. 2007) but no perseveration of extinguished self-administration (inconsistent with Belin et al. 2008; Economidou et al. 2009), whereas delay discounting impulsive rats showed the converse pattern: equal rates of self-administration (inconsistent with Anker et al. 2009; Marusich and Bardo 2009; Perry et al. 2005; Perry et al. 2008; Poulos et al. 1995) and perseveration of extinguished self-administration (consistent with Belin et al. 2008; Economidou et al. 2009). It may be concluded from these mixed data that vulnerability to dependence is conferred by two dissociable traits, sensitivity to drug reinforcement and impulsive-automaticity, but the optimal methods for discriminating these two traits and assessing their unique impact on behaviour remain to be determined.

Human self-administration studies have produced similar mixed results. Although a number of human self-administration paradigms have been developed (Bisaga et al. 2007; Fischman and Foltin 1992; Haney 2009; Hart et al. 2001; Harvey et al. 2004; Lamb et al. 1991; Leeman et al. 2010; McKee et al. 2009; Panlilio et al. 2005; Ray et al. 2006; Silverman et al. 1994; Spiga et al. 2005), only three studies appear to have examined the relationship with impulsivity (Dallery and Raiff 2007; Mueller et al. 2009; Walsh et al. 2010). In two studies (Dallery and Raiff 2007; Mueller et al. 2009), smokers could earn money in accordance with the time for which they refrained from puffing on a cigarette or initiating smoking, and delay discounting impulsivity predicted higher rates of puffing and a shorter latency to initiate smoking, respectively. However, it is not clear whether impulsivity conferred hypersensitivity to tobacco reward or hyposensitivity to punishment in the form of money loss. By contrast, Walsh et al. (2010) found that cocaine-dependent participants had a greater hedonic response to cocaine and higher rates of cocaine self-administration compared to cocaine abusers, yet the two groups were equated for BIS impulsivity, suggesting impulsivity was orthogonal to the reinforcement value of cocaine. The current study was undertaken to address these unresolved questions concerning the relationship between impulsivity and drug self-administration.

In the current experiment, a sample of 100 relatively young adult smokers was recruited, containing an equal proportion of daily and non-daily smokers to ensure a broad variation in smoking uptake (cigarettes per week). Craving was measured with the questionnaire of smoking urges (Cox et al. 2001), and impulsivity was assessed with the BIS (Patton et al. 1995). Then, the rate of drug-seeking was assessed using a concurrent choice procedure (for reviews see Ahmed 2010; Hursh and Silberberg 2008). In each trial, participants could press one key to earn cigarette points or a second key to earn chocolate points, and each key had a 50% probability of yielding its respective reward in any given trial. Previous studies employing human instrumental learning tasks in which an arbitrary response yields tobacco puffs (Bühler et al. 2010; Perkins et al. 1994; Willner et al. 1995), cocaine pictures (Moeller et al. 2009) or cocaine (Walsh et al. 2010) have shown that response rate/percent is predicted by drug use severity and craving. Thus, smoking uptake and craving were expected to predict increased percent tobacco over chocolate responding in the concurrent choice task. The question at stake was whether BIS impulsivity would also be associated with increased percent choice of tobacco, suggesting hypersensitivity to drug reinforcement, or would be uncorrelated with this measure of drug-seeking, suggesting that impulsivity plays a dissociable role in the aetiology of dependence.

Drug-taking was then assessed in an ad libitum smoking session, where the number of puffs consumed was the critical measure. Previous studies using ad libitum smoking have shown that the number of puffs consumed is associated with smoking uptake and craving (Bisaga et al. 2007; Leeman et al. 2010; see also Brauer et al. 1996), and so uptake and craving were expected to predict increased puff number in the current study. The question at stake was whether BIS impulsivity would also be associated with increased drug-taking, suggesting greater sensitivity to drug reinforcement, or would be uncorrelated with drug-taking, suggesting a dissociable role in dependence.

The second objective was to determine the link between impulsivity and automatic control of drug-seeking/taking. Numerous methodologies have been claimed to isolate automatic or implicit processes in addiction, but most fall short of the strict criteria required for this conclusion (Debner and Jacoby 1994; Lieberman et al. 1998; Lovibond and Shanks 2002; Seth et al. 2008; Shanks and St. John 1994). According to one criterion, behaviour is demonstrably automatic if its performance is dissociated from the subjectively reported intention of the individual (Jacoby 1991). In applying this criterion to addiction, Tiffany (1990) proposed that drug-seeking/taking may be identified as automatic if this behaviour is uncorrelated with subjective craving. A similar point was made by Robinson and Berridge (1993) in their wanting–liking dissociation. On this rationale, the current study quantified the automaticity of drug-seeking/taking by its decoupling (null correlation) from subjective craving to smoke, using moderation analysis. The question at stake was whether BIS impulsivity would moderate the correlation between craving and drug-seeking/taking, suggesting a propensity to automatic control over these behaviours as proposed by Tiffany (1990).

Learning theory anticipates that drug-seeking and drug-taking should acquire different levels of automaticity. The basis for this suggestion comes from the animal devaluation procedure. In this procedure, rats are first trained on a drug or reward-seeking response before the incentive value of the outcome is devalued by satiety/taste aversion or revalued by abstinence. In the test that follows, rats have the opportunity to perform the drug/reward-seeking response in extinction to determine if they can use an expectation of the outcome to control their behaviour. Responding that is sensitive to the devaluation/revaluation treatment in the extinction test is identified as goal-directed insomuch as this behaviour must be mediated by rats’ knowledge of the response–outcome (R-O) contingencies established in training combined with knowledge of the current incentive value of the outcome acquired during the devaluation/revaluation treatment (Dickinson and Balleine 2010). By contrast, drug/reward-seeking is demonstrably automatic (or habitual) if its performance is insensitive to the devaluation/revaluation treatment. In this case, the behaviour is arguably elicited directly by the instrumental context as a S-R habit, without the animal retrieving knowledge of the outcome of that behaviour, i.e. the behaviour is decoupled from intention.

The development of habitual control over natural rewarded behaviour is favoured by overtraining or practice (Dickinson et al. 1995; Killcross and Coutureau 2003; Tricomi et al. 2009), by the availability of a single response option (Kosaki and Dickinson 2010) and by responses that are proximal to the ingestion of the reinforcer (Balleine et al. 1995; Corbit and Balleine 2003). Moreover, these factors appear to be important for the development of habitual drug-seeking (Dickinson et al. 2002; Glasner et al. 2005; Miles et al. 2003; Zapata et al. 2010) versus goal-directed drug-seeking (Hutcheson et al. 2001; Olmstead et al. 2001). The implication for the current study is that because the concurrent choice test of drug-seeking incorporates factors that favour goal-directed control (minimal training, multiple response options, distance from ingestion), whereas the ad libitum smoking test of drug-taking incorporates factors that favour habitual control (overtraining, single response option, proximity to ingestion), one would expect greater automatic control over drug-taking. Accordingly, it is anticipated that impulsivity could selectively moderate the correlation between craving and drug-taking, but not drug-seeking, suggesting that the propensity for automatic control is limited to behaviours that have received training conditions favourable to the development of automaticity.

Method and materials

Participants

One hundred smokers were recruited for the study. Half reported daily and half reported non-daily smoking to ensure broad variance in uptake. Ten participants consumed an outlying number of puffs (±3 SD) in the ad libitum consumption test: three participants consumed three or fewer puffs and seven participants consumed 27 or more puffs. These participants were excluded to avoid undue influence of outliers and to allow puff number to be transformed to a normal distribution, both of which are essential for the stability of regression analysis (Draper and John 1981). The remaining 90 participants had an average age of 21.4 (3.4, 18–40), smoked on 4.8 (2.5, .2–7) days per week, on which days they smoked 5.8 (4.0, 1–20) cigarettes, had smoked for 4.5 (3.5, .5–27) years, starting at an age of 16.9 (2.6, 11–31), and reported a smoking urges score of 3.4 (1.5, 1–6.8) for factor 1 and 1.5 (0.7, 1–4.3) for factor 2 (brackets contain SD followed by the range min–max). There were 47 males and 43 females and 44 daily and 46 non-daily smokers in the analysed sample.
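The ±3 SD exclusion rule can be sketched in Python as follows. This is a minimal illustration only: the function name `exclude_outliers`, the use of the sample SD and the example values are assumptions, and the text does not state whether the criterion was applied to raw or transformed puff counts.

```python
from statistics import mean, stdev


def exclude_outliers(values, n_sd=3.0):
    """Split values into (kept, dropped) using a mean +/- n_sd * SD rule.

    Assumption: the cut-offs are computed once from the full sample,
    using the sample standard deviation.
    """
    centre_point = mean(values)
    spread = stdev(values)
    lower = centre_point - n_sd * spread
    upper = centre_point + n_sd * spread
    kept = [v for v in values if lower <= v <= upper]
    dropped = [v for v in values if v < lower or v > upper]
    return kept, dropped


# Hypothetical puff counts, not data from the study.
puffs = [10] * 20 + [100]
kept, dropped = exclude_outliers(puffs)
```

With these hypothetical values only the single extreme count falls outside the cut-offs; the remaining scores would then be passed on to any normalising transformation.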

Apparatus and materials

Questionnaires established age, gender, smoking days per week, cigarettes smoked on smoking days, time since last cigarette, smoking years, age of smoking onset and questionnaire of smoking urges (QSU—Cox et al. 2001) using a revised scoring protocol (Cappelleri et al. 2007), which yielded factor 1, reflecting a desire to smoke for the rewarding consequences, and factor 2, reflecting anticipation of relief from negative withdrawal. The BIS version 11 was used to measure impulsivity (Stanford et al. 2009). This questionnaire contains three subscales: (1) Motor impulsivity, e.g. “I do things without thinking”, which assesses propensity for action without thought, (2) nonplanning impulsivity, e.g. “I plan tasks carefully”, which assesses capacity for purposive future action and (3) attentional impulsivity, e.g. “I don’t pay attention”, which assesses capacity for sustained attention. The concurrent choice task was generated with E-prime software (Psychology Software Tools, Inc. pstnet.com) on a standard PC.

Concurrent choice task (drug-seeking)

Immediately following these questionnaires, participants completed a concurrent choice task to quantify rate of drug-seeking. Participants were first presented with the on-screen instructions: “This is a game in which you imagine winning cigarettes and chocolate. In each round, either 1/4 of a cigarette or 1/4 of a chocolate bar will be available, but you will not be told which. Choose either the D or H key in each round to try and win the reward. You will only win if you select the correct key. Good luck. Press the space bar to begin”.

Each trial began with the central text, “Select a key” presented for between 1 and 5 s. Pressing either the D or H key in this period yielded the outcome text “You win 1/4 of a cigarette”, which overwrote the “Select a key” text at its termination, whereas pressing the other key yielded the outcome text “You win 1/4 of a chocolate bar”. These outcomes were presented for 2 s followed by a random inter-trial interval of 1,000 to 2,500 msec. The D and H keys were counterbalanced between participants in their production of the tobacco and chocolate outcome, and each key had only a 50% chance of yielding its outcome in any given trial. On non-rewarded trials, the text “You win nothing” was presented. Participants had no way of predicting which key would be reinforced in any given trial.

There were five blocks of 12 trials, and each block comprised two cycles of three tobacco and three chocolate outcomes scheduled randomly across six trials, such that no more than three of the same outcome could occur in succession within a block. Earned outcomes were summed across trials, and at the end of each 12-trial block, a “totalizer” screen reported the quantity of each reward type earned. Where whole cigarettes or chocolate bars had been earned, participants were instructed to move that many units from two boxes present on the table, which contained 15 Marlboro Lights cigarettes and 15 Cadbury Dairy Milk treat size chocolate bars (15 g), respectively, into “their box” present alongside. In this way, participants contacted the earned rewards, although they were aware that they would not keep them at the end of the experiment. The critical measure was the percentage of trials in which the tobacco key was selected.
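The trial schedule and the critical measure described above can be sketched in Python. This is an illustrative reconstruction, not the E-prime script used in the study: the function names are hypothetical, and re-drawing a block until no outcome occurs more than three times in succession is an assumed way of meeting the stated constraint.

```python
import random

OUTCOME_TEXT = {
    "tobacco": "You win 1/4 of a cigarette",
    "chocolate": "You win 1/4 of a chocolate bar",
}


def max_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def make_block(rng):
    """One 12-trial block: two cycles of 3 tobacco + 3 chocolate trials."""
    while True:
        block = []
        for _ in range(2):
            cycle = ["tobacco"] * 3 + ["chocolate"] * 3
            rng.shuffle(cycle)
            block.extend(cycle)
        if max_run(block) <= 3:  # assumed: constraint enforced by re-drawing
            return block


def make_schedule(n_blocks=5, seed=None):
    """Available outcome on every trial, grouped into blocks."""
    rng = random.Random(seed)
    return [make_block(rng) for _ in range(n_blocks)]


def trial_feedback(chosen, available):
    """Outcome text: a win only if the chosen reward is the available one."""
    return OUTCOME_TEXT[chosen] if chosen == available else "You win nothing"


def percent_tobacco(choices):
    """Critical measure: percentage of trials on which tobacco was chosen."""
    return 100.0 * sum(c == "tobacco" for c in choices) / len(choices)
```

Because each outcome is available on half of the trials, either key is rewarded on 50% of trials regardless of the participant's choices, so the percentage of tobacco choices reflects preference rather than differential reinforcement rate.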

Ad libitum smoking session (drug-taking)

Immediately following the concurrent choice task, participants were told that they were free to take a break in which they could smoke. They were sent outside the building for a fixed 10-min interval to smoke as much or as little as they wished and they recorded on a sheet of paper each puff consumed to measure drug-taking. Given the importance of overtraining in encouraging habitual behaviour, and the possibility that any alteration in the behavioural sequence would reengage intentional control (Daw et al. 2005), the objective of this ad libitum test was to be natural and avoid experimental intervention as far as possible, apart from the tick sheet on which each puff was recorded. Participants were paid £5 for participation, and the Nottingham School of Psychology Ethics committee approved the study.

Results

Data were normally distributed or transformed to a normal distribution prior to analysis (p values > 0.05). The index of smoking uptake was cigarettes smoked per week (smoking days per week × cigarettes smoked on smoking days). Relationships were first tested with Pearson correlations, followed by regression to isolate the contribution of the various predictors. Data were centred prior to moderation analysis. All figures reported the actual data (untransformed/uncentred) for ease of interpretation. A threshold of p < .05 defined significance in all analyses. Smoking urges factor 2 was omitted from the analysis as it showed no significant associations.
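As a small worked example of the uptake index and of centring, the following Python sketch may help. The function names and the input values are hypothetical and chosen only for illustration; centring is taken here to mean subtracting the sample mean.

```python
def cigarettes_per_week(smoking_days_per_week, cigarettes_per_smoking_day):
    """Uptake index: smoking days per week x cigarettes on smoking days."""
    return smoking_days_per_week * cigarettes_per_smoking_day


def centre(values):
    """Subtract the sample mean so that the centred scores average zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]


# Hypothetical participant: smokes on 5 days per week, 6 cigarettes per day.
uptake = cigarettes_per_week(5, 6)   # 30 cigarettes per week
centred = centre([1, 2, 6])          # [-2.0, -1.0, 3.0]
```

Centring leaves the spread of the scores unchanged and only shifts their origin, which is why the figures can report the uncentred data without altering the pattern of results.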

Predicting drug-seeking/taking

Figure 1a–d shows that drug-seeking and taking were predicted by cigarettes per week (uptake) and smoking urges factor 1 (craving) but not by BIS impulsivity scales (Fig. 1e–j). Table 1 shows the hierarchical regression undertaken to contrast these two sets of predictors. Drug-seeking or taking was entered as the dependent variable with the BIS scales as predictors at level 1, and these associations were again non-significant. By contrast, when cigarettes per week or smoking urges factor 1 was added as a predictor at level 2, the proportion of variance accounted for increased significantly. Thus, drug-seeking/taking were significantly more strongly predicted by uptake and craving than by BIS impulsivity, suggesting that BIS impulsivity does not increase sensitivity to drug reinforcement.

Fig. 1 a–d Scatterplots showing the extent to which cigarettes per week (uptake) and smoking urges factor 1 (craving) predict percent tobacco responses in the concurrent choice task (drug-seeking) and number of puffs consumed in the ad libitum smoking session (drug-taking). e–j Scatterplots showing the extent to which Barratt Impulsivity Scales (BIS) predict the drug-seeking and drug-taking measures

Table 1 Hierarchical regression with percent tobacco choice (drug-seeking) or number of puffs consumed (drug-taking) as the dependent variables

Multiple regression was undertaken to test whether the relationships between drug-seeking/taking and uptake/craving could be accounted for by other individual differences. Again, drug-seeking or taking was entered as the dependent variable with uptake or craving as the predictor, along with age, time since a cigarette, smoking years and age of smoking onset. In each of these four regressions, only the uptake and craving measures served as significant independent predictors, t values > 2.15, p values ≤ .03, whereas the other variables were all unreliable, t values < −1.43, p values > .15. These analyses indicate that the association between uptake/craving and drug-seeking/taking cannot be attributed to age, chronicity or recency of smoking. Finally, uptake/craving measures were not reliably correlated with the BIS impulsivity scales, r values < .19, p values > .08, suggesting these traits are orthogonal.

Moderation analysis

Moderation analysis was conducted to determine if impulsivity influenced the strength of the relationship between smoking urges factor 1 and drug-seeking/taking. For this purpose, hierarchical regression was conducted in which either percent tobacco choice (seeking) or number of puffs consumed (taking) was entered as the dependent variable. In the first level, centred values for the independent variable, smoking urges factor 1, were entered, along with centred values for the moderator variable (the three impulsivity scales in separate tests), creating six tests in total. In the second level of these tests, the centred values for the product of the independent and moderator variables were entered, and this interaction term was tested for significance in the change of the R². Table 2 shows the interaction term for each of these six tests. These analyses showed that BIS nonplanning impulsivity moderated the relationship between smoking urges factor 1 and drug-taking behaviour, as shown in Fig. 2a, that is, craving and drug-taking became decoupled with nonplanning impulsivity. By contrast, there were no other significant moderation effects. Figure 2b shows the non-significant moderation effect with drug-seeking, to highlight the selectivity of this effect to drug-taking.
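The two-level procedure can be sketched in plain Python as below. This is a minimal illustration under stated assumptions rather than the authors' analysis script: all function names are hypothetical, the interaction term is formed as the product of the centred variables, and the change in R² is evaluated with the standard F ratio for one added predictor (the p value would be read from an F distribution with the returned degrees of freedom).

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


def centre(values):
    """Subtract the sample mean from every score."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def moderation_test(craving, impulsivity, behaviour):
    """Level 1: craving + moderator. Level 2: add their product."""
    x, z = centre(craving), centre(impulsivity)
    xz = [a * b for a, b in zip(x, z)]  # assumed form of the interaction term
    _, r2_level1 = ols([x, z], behaviour)
    beta, r2_level2 = ols([x, z, xz], behaviour)
    df_resid = len(behaviour) - 4  # intercept plus three predictors
    r2_change = r2_level2 - r2_level1
    f_change = r2_change / ((1.0 - r2_level2) / df_resid)
    return {"r2_change": r2_change, "f_change": f_change,
            "df": (1, df_resid), "interaction_b": beta[3]}
```

A negative interaction coefficient in such a model would correspond to the pattern reported here, in which the craving slope shrinks as the moderator increases.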

Table 2 Moderation analyses examining the impact of BIS impulsivity on the association between smoking urges factor 1 and drug-seeking/taking (see text for details)

Fig. 2 a Simple slopes analysis showing the change in association between smoking urges factor 1 (craving) and number of puffs consumed in the ad libitum smoking session (drug-taking) as a function of three levels of BIS nonplanning impulsivity (low, median, high). The progressive reduction in the association between craving and drug-taking across levels of nonplanning impulsivity is indicative of a transition to automatic (non-intentional) control of drug-taking. b Simple slopes analysis with percent tobacco choice (drug-seeking) as the dependent measure, showing no comparable decoupling from craving across levels of nonplanning impulsivity

Simple slope analysis (Jose 2008) on Fig. 2a indicated that the relationship between smoking urges factor 1 and drug-taking was significant for the low BIS nonplanning group, t = 4.13, p < .001, and the median BIS nonplanning group, t = 2.88, p = .005, but not for the high BIS nonplanning group, t = .13, p = .90. These analyses indicate that drug-taking became progressively decoupled from subjective craving as BIS nonplanning impulsivity increased. Moreover, nonplanning impulsivity was not reliably correlated with age, smoking years, age of smoking onset or time since a cigarette, r values < .13, p values > .22, indicating that the development of automatic drug-taking could not be attributed to age, chronicity or recency of smoking. Finally, Levene’s test indicated that the three nonplanning groups (low, median, high) showed no difference in the variance of either puff number or smoking urges factor 1, F values < 1, indicating that the loss of correlation between puff number and smoking urges factor 1 could not be attributed to differential constraint on the variance of these measures imposed by ceiling or floor effects.
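The logic of a simple slopes analysis can be illustrated with a short Python sketch. In a regression containing craving, a moderator and their product, the slope of craving at a given moderator value is the craving coefficient plus the interaction coefficient multiplied by that value. The function name, the coefficients and the moderator values below are hypothetical and are not estimates from this study; how the low, median and high levels were defined is not stated in the text.

```python
def simple_slope(b_craving, b_interaction, moderator_value):
    """Slope of craving on behaviour at one (centred) moderator value."""
    return b_craving + b_interaction * moderator_value


# Hypothetical coefficients and centred moderator levels.
levels = {"low": -2.0, "median": 0.0, "high": 2.0}
slopes = {name: simple_slope(1.0, -0.5, value) for name, value in levels.items()}
# slopes == {"low": 2.0, "median": 1.0, "high": 0.0}
```

With a negative interaction coefficient, the craving slope falls toward zero as the moderator rises, which is the form of decoupling described for the high nonplanning group.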

Discussion

In this study, a sample of smokers who varied in their uptake of smoking reported their craving to smoke and impulsivity before tobacco-seeking and taking behaviour were measured experimentally. The results showed that impulsivity was not associated with higher rates of drug-seeking/taking, contrary to the view that impulsivity confers hypersensitivity to drug reinforcement. In contrast, uptake and craving were associated with higher rates of drug-seeking and taking, which could not be accounted for by age, recency or chronicity of smoking, and this uptake/craving trait was orthogonal to (uncorrelated with) impulsivity. These findings suggest that drug uptake is mediated by hypersensitivity to drug reinforcement, which establishes greater intentional drug-seeking/taking reflected in subjective craving, whereas impulsivity plays a dissociable role in the aetiology of dependence. The finding that nonplanning impulsivity moderated the association between craving and drug-taking suggests, in accordance with Tiffany’s (1990) criteria, that nonplanning impulsivity confers a propensity for automatic (non-intentional) control over drug-taking. At the same time, the null moderation effect with drug-seeking suggests that the propensity to automatic control in nonplanning impulsivity extends only to behaviours that have undergone training that favours automatic control. The overall conclusion, therefore, is that two orthogonal vulnerabilities, hypersensitivity to drug reinforcement and propensity to automatic control, play dissociable roles in the aetiology of dependence.

This dual-process account of dependence vulnerability reconciles two paradoxical literatures. One set of literature suggests that dependence is mediated by sensitivity to drug reinforcement, which establishes greater intentional drug choice, indexed by the hedonic response to drugs (Scherrer et al. 2009; Volkow et al. 2009; Walsh et al. 2010), craving (Allen et al. 2008; Killen and Fortmann 1997), positive expectancies (Campbell and Oei 2010; Herd et al. 2009; Leventhal and Schmitz 2006), instrumental response rate (Dalley et al. 2007; Le Foll et al. 2009; Moeller et al. 2009; Nader et al. 2006; Walsh et al. 2010) and economic demand (Bickel et al. 2000; MacKillop et al. 2010). By contrast, the other set of literature suggests that dependence is mediated by automaticity, or the loss of intentional control over behaviour, indexed by effects of drugs on impulsive responding (Dallery and Locey 2005; Heil et al. 2006; Setlow et al. 2009; Winstanley 2007), habit formation (Dickinson et al. 2002; Jedynak et al. 2007; Nelson and Killcross 2006), fronto-executive impairment (Dom et al. 2005; Garavan and Hester 2007; Goldstein et al. 2009; London et al. 2000) and perseveration of self-administration under punishment (Belin et al. 2008; Deroche-Gamonet et al. 2004; Economidou et al. 2009; Pelloux et al. 2007; Vanderschuren and Everitt 2004) and extinction (Diergaarde et al. 2008). It is believed that these two dissociable traits, hypersensitivity to drug reinforcement and propensity to automatic control, have independent genetic substrates (Breitling et al. 2009; Dalley and Everitt 2009; Furberg et al. 2010; Sherva et al. 2008) and play a differential role in the uptake and clinical perseveration of drug use, respectively (Belin et al. 2009; Everitt et al. 2008; Goldstein et al. 2009).

The current data are at odds with studies which have shown impulsive rats to acquire higher rates of self-administration (Anker et al. 2009; Dalley et al. 2007; Marusich and Bardo 2009; Perry et al. 2005; Perry et al. 2008; Poulos et al. 1995). One explanation for this discrepancy is suggested by the finding that the effect of impulsivity on the rate of self-administration in animals is dependent upon the dose, length of training and reinforcement rate (see in particular Marusich and Bardo 2009). Thus, it is possible that the current drug-seeking/taking schedules were not optimized to detect an effect of impulsivity. However, because the ad libitum smoking test approximated “normal” human self-administration, the absence of an impulsivity effect here rather calls into question the generality of this effect in animal self-administration.

The second source of variance between these studies is the tests of impulsivity employed. Whereas the current study used the BIS, the animal designs used delay discounting or five-choice, and these assays are likely to tap only partially overlapping arrays of traits (de Wit et al. 2007). Two human studies which have indexed impulsivity with delay discounting have shown an association with higher rates of smoking in the laboratory (Dallery and Raiff 2007; Mueller et al. 2009). However, as noted, smoking in these studies was offset against money loss, so higher rates could have been driven by hyposensitivity to punishment, i.e. automaticity, rather than sensitivity to nicotine reinforcement (Belin et al. 2008; Economidou et al. 2009). On the other hand, Walsh et al. (2010) found that although dependence status predicted rate of cocaine self-administration, BIS impulsivity did not, corroborating the current finding. Thus, although there is converging evidence that BIS impulsivity does not mark sensitivity to drug reinforcement, it remains possible that other assays of impulsivity do so; although in humans, this remains to be established in the absence of punishment.

Higher rates of drug-seeking in individuals with greater uptake arguably highlight the importance of goal-directed learning in dependence vulnerability. On this goal-directed account, when deciding which key to press in the concurrent choice task, participants retrieved a representation of the two outcomes, “You win 1/4 of a cigarette” and “You win 1/4 of a chocolate bar”, and the incentive value ascribed to each outcome provided evaluative feedback to determine the propensity to select the associated response (R-O; de Wit and Dickinson 2009). Arguably, this retrieval of incentive value added a bias to the selection of the higher valued outcome (Baum 1974), above and beyond the ongoing tabulation of local reinforcement rates which ensured a high degree of switching and general tendency towards equal distribution between the two responses (Herrnstein 1961; Staddon 1992). The claim that retrieval of the incentive value of tobacco determined preferential responding for this outcome, rather than this outcome simply establishing a stronger motoric propensity through the law of effect (Thorndike 1911) or S-R/reinforcement learning (Hull 1943), comes from the finding that this preference was associated with subjective craving. Strictly speaking, this correlational evidence does not support causal inferences. However, Cartesian accounts, which view craving as an epiphenomenon, are less parsimonious because they cannot explain why conscious experience should arise if not to play a causal role in action selection. More direct evidence, however, comes from the finding that selection of the two responses can be devalued in an extinction test (Hogarth and Chase 2011) indicating that choice is influenced by retrieval of a representation of the current incentive value of the outcome. Overall, therefore, these data converge on the view that the uptake of drug use is mediated by the hypervaluation of the drug as the outcome of goal-directed drug-seeking.

Goal-directed learning may also play a role in drug-taking, at least in low nonplanning impulsive participants. On this view, when low nonplanning impulsive participants raised the cigarette to their lips in the ad libitum test, this response was governed by a representation of the reward value of the inhalation, indexed by craving. By contrast, the decoupling of craving and drug-taking in high nonplanning impulsive participants suggests a transition to automatic control in these individuals. On this view, when high nonplanning impulsive individuals lifted the cigarette to their lips to inhale, this response was not governed by an expectation of the rewarding consequences, reflected in craving, but rather, was elicited directly by Pavlovian stimuli embedded within the smoking sequence (Ostlund et al. 2009), for example, by the sight of the cigarette, the time since the previous puff, the pneumatic parameters of the previous puff or interoceptive signals of taste, nicotine or blood oxygen (Nemeth-Coslett and Griffiths 1984a,b; Rose 2006).

There are three forms of automatic learning that could govern smoking behaviour in high nonplanning impulsive participants. By classic S-R/reinforcement learning (Hull 1943), nicotine reinforcement strengthens the association between stimuli (S) embedded within the smoking sequence and the puffing response (R), enabling those S to elicit the R directly. By this S-R account, the high nonplanning impulsive smoker is a true automaton because no cognitive or motivational process mediates stimulus control of drug-taking. By contrast, two-process learning theory (Rescorla and Solomon 1967) suggests that cues embedded within the smoking sequence acquire through Pavlovian conditioning the capacity to elicit an excitatory motivational state (Om), such as physiological arousal (Carter and Tiffany 1999), akin to that elicited by the inhalation itself, which modulates S-R associations to evoke the puff response. On this view, a momentary arousal state mediates the cue effect on drug-taking (S-Om-R), but the taking response is nevertheless autonomous because the response is not governed by an explicit representation of the outcome.

Finally, according to S-Os-R theory (Balleine and O’Doherty 2010; de Wit and Dickinson 2009; Ostlund and Balleine 2008), stimuli (S) embedded in the smoking sequence elicit a representation of the sensory features of the associated inhalation outcome (Os), which in turn elicits the puffing response (R) through a bidirectional instrumental (Os-R) association. On this view, high nonplanning impulsive smokers were cognitive agents because an expectation of the inhalation outcome mediated the puff response. However, this expectation incorporated only the sensory features of inhaling, which elicited the puff response automatically in the sense that the representation of the current value of that outcome (Ov) was not retrieved. The essential difference between these three accounts lies in the quality of the O representation retrieved to govern the puff response. The outcome representation is either entirely absent (S-R), restricted to a general motivational state (S-Om-R), or restricted to a sensory percept (S-Os-R). All three proposals converge on the view that automatic action ceases to be controlled by a representation of the current incentive value of the outcome (Ov).

Support for an S-Os-R account of automatic drug use comes from three studies. Hogarth et al. (2010) paired a stimulus (S+) with the receipt of the tobacco outcome “You win 1/4 of a cigarette”, and then tested the ability of this stimulus to transfer control over puffing behaviour in an ad libitum smoking session. The results showed that across the smoking session, puff probability and craving declined, reflecting a decrease in the incentive value of smoking with satiety. By contrast, the capacity of the S+ to enhance puff probability and craving remained unchanged, suggesting that the cue controlled puffing automatically, without making contact with the current incentive value of smoking (for corroboratory evidence, see Drobes and Tiffany 1997; Maude-Griffin and Tiffany 1996; Tiffany et al. 2000; Tiffany et al. 2007; Waters et al. 2004). Importantly, because the S+ was not associated with the puffing response in training, this transfer of control could not be mediated by the formation of a direct S-R association (Colwill and Rescorla 1988) and must have involved retrieval of a representation of the outcome, either Om or Os.

To discriminate these two possibilities, Hogarth et al. (2007) conducted an outcome-specific Pavlovian-to-instrumental transfer procedure (for a review, see Holmes et al. 2010). In this design, initial discrimination training established two stimuli as predictors of tobacco or money reward, before instrumental training established two responses to earn these same outcomes. In the transfer test, each stimulus selectively augmented performance of the response that was associated with the same outcome, demonstrating that the transfer of control was mediated by each stimulus retrieving a representation of its associated outcome, which in turn elicited its associated response, i.e. transfer of control was mediated by S-O-R learning. Finally, in the third study of this series (Hogarth and Chase 2011, Experiment 2), this transfer effect was shown to be insensitive to devaluation, corroborating animal studies (Colwill and Rescorla 1990; Corbit et al. 2007; Holland 2004; Rescorla 1994). Thus, the outcome representation retrieved by the S, which controlled action selection, was demonstrably restricted to the sensory percept of the outcome (Os) and did not make contact with the current incentive value of the outcome (Ov). The implication is that high nonplanning impulsivity conferred a predisposition to this form of automatic S-Os-R based control of puffing behaviour, decoupling this behaviour from outcome value (Ov) indexed by craving. The predominance of this automatic form of learning is potentially responsible for the intransigence of drug use to intentional regulation.

It is important to note that automaticity was restricted to drug-taking, whereas drug-seeking remained coupled to craving irrespective of impulsivity. This finding accords with animal studies which have found that insensitivity to devaluation (habit formation) is favoured by overtraining, restricted response options and the proximity of the response to ingestion (Corbit and Balleine 2003; Dickinson et al. 1995; Kosaki and Dickinson 2010). Thus, the selective decoupling of drug-taking from craving across nonplanning impulsivity may have arisen because the ad libitum consumption test contained procedural variables that favoured automatic control, i.e. the smoking sequence was overtrained, singular and proximal to ingestion. By contrast, the concurrent choice test may have remained coupled to craving irrespective of impulsivity because the responses measured in this test had received training favourable to goal-directed control, i.e. minimal training of multiple response options that were distal from ingestion. The implication is that automatic and goal-directed learning operate concurrently, but differentially influence behaviours depending upon their conditions of training (Balleine and O’Doherty 2010; Dickinson and Balleine 2010; Killcross and Coutureau 2003). The dominance of habit learning, however, may extend to behaviours whose training was unfavourable to habit formation, given sufficient neurotoxic insult (Jedynak et al. 2007; Killcross and Coutureau 2003; Nelson and Killcross 2006; Zapata et al. 2010), which could drive wide-ranging automaticity in more clinically severe populations.

Several observations support the notion that automaticity plays a more prominent role in the longitudinal clinical perseveration of drug use. First, the DSM-IV (1994) criteria for diagnosis of dependence favour constructs related to automaticity, such as perseveration despite costs and binging or chain consumption, rather than constructs to do with drug valuation, such as craving and expectancy. Indeed, we have recently found that BIS impulsivity was most strongly associated with these two automatic constructs of the DSM (Chase and Hogarth 2011). Second, the impact of public health information on cessation is dependent upon individual differences in the depth of cognitive processing of this information (Hammond et al. 2003) and education level of the individual (Heyman 2003), suggesting impaired cognition is important for clinical perseveration. Third, impulsivity has been associated with markers for clinical perseveration rather than uptake (Flory and Manuck 2009), shows greater co-morbidity with substance misuse in adult versus adolescent populations (Biederman et al. 1997), is more strongly associated with treatment retention than with drug use severity (Moeller et al. 2001; see also Schmitz et al. 2009) and has been associated with poorer quit rates after controlling for craving, dependence, age and affect (Doran et al. 2004). Although these findings are suggestive, the predominance of automatic learning in clinical perseveration remains to be tested using paradigms that fully support this conclusion.

It is difficult to explain why the BIS nonplanning scale alone was associated with automatic drug-taking. Across the BIS literature as a whole, the three scales (nonplanning, motor, attentional) show a somewhat chaotic pattern of associations with different diagnoses and psychometric assessments. However, several observations highlight nonplanning impulsivity as a decisive predictor, beyond the other scales, and help build a notion of the mechanisms underpinning impulsive-automaticity. Specifically, nonplanning impulsivity has been associated with reduced volume of the right middle orbitofrontal cortex (OFC; Matsuo et al. 2009), reduced OFC volume and addiction in schizophrenics (Schiffer et al. 2010; which may exacerbate the reduced medial OFC volume found in smokers, Kühn et al. 2010), delay discounting (de Wit et al. 2007), a unique genetic substrate (Benko et al. 2010) and greater susceptibility to impairment of cued reaction time following tryptophan (5HT) depletion (Cools et al. 2005). Perhaps most curiously, Ersche et al. (2010) examined the impulsivity of siblings of drug users to determine whether impulsivity pre-exists drug exposure and found that siblings differed from controls only with respect to nonplanning impulsivity. By contrast, drug users differed from controls on all three scales, suggesting that the motor and attentional components are exacerbated by drug exposure. One can only speculate about how automatic consummatory behaviour accompanying nonplanning impulsivity may have conferred vulnerability to initiate drug use. One possibility is that automatic consummatory patterns more readily generalise from natural reward (food, sweets) to alcohol, tobacco and illicit drugs because the individual does not exert intentional control to stop this generalisation, easing the gateway into drug use, but this remains speculative.

To conclude, the study found that impulsivity was not associated with increased frequency of drug-seeking or drug-taking behaviour, whereas individual differences in uptake and craving were. Rather, nonplanning impulsivity was associated with a decoupling of drug-taking from craving, suggesting this behaviour was automatic. However, this automaticity did not extend to drug-seeking, suggesting that nonplanning impulsivity only augments automatic control of behaviours which have undergone training favourable to automaticity. Overall, the study supports a dual-process account of dependence vulnerability wherein uptake is mediated by hypervaluation of the drug as an instrumental goal, whereas clinical perseveration is mediated by the formation of automatic or habitual drug consumption. It will be important for future research to assess whether the balance between these two learning processes changes as a function of drug class and chronicity of drug use.