Human attentional abilities are known to be unreliable (e.g., Rensink, O’Regan, & Clark, 1997; Shapiro, Driver, Ward, & Sorensen, 1997) and inherently unstable (e.g., Cheyne, Solman, Carriere, & Smilek, 2009; Robertson, Manly, Andrade, Baddeley, & Yiend, 1997). Failures of attention have been associated with more traffic fatalities and injuries than have alcohol, drugs, speed, or fatigue (Knowles & Tay, 2002). Given the frailty of attention and the potential severity of its failures, there has been a growing effort to develop methods of improving attention across a broad range of tasks. Recent claims have been made that (1) mindfulness and meditation training improve performance on the attention network test (ANT; e.g., Jha, Krompinger, & Baime, 2007; see Tang & Posner, 2009, for a review); (2) training attentional control and attention switching improves performance on attention-switching tasks (see Gopher, 1992; Tang & Posner, 2009, for reviews); (3) playing action video games improves performance on tasks assessing visual and spatial attention (e.g., Green & Bavelier, 2003, 2007, 2009; Greenfield, deWinstanley, Kilpatrick, & Kaye, 1994); (4) adding a moderately attention-demanding task improves performance on temporal and spatial attention tasks (e.g., Gil-Gómez de Liaño, Botella, & Pascual-Ezama, 2011; Olivers & Nieuwenhuis, 2005, 2006; Smilek, Enns, Eastwood, & Merikle, 2006); (5) taking a walk in a natural setting improves directed-attention abilities (e.g., Berman, Jonides, & Kaplan, 2008); (6) instructing people to adopt a more passive rather than an active attention strategy increases the efficiency of attention shifts during search (Smilek et al., 2006; Watson, Brennan, Kingstone, & Enns, 2010); and (7) manipulating participants’ mood improves temporal attention (e.g., Jefferies, Smilek, Eich, & Enns, 2008).

Recently, there has been a particular focus on improving performance on sustained-attention tasks, in which individuals must maintain a relatively narrow focus of attention for protracted periods (e.g., Manly et al., 2004; Mrazek, Smallwood, & Schooler, 2012; Valentine & Sweet, 1999). A substantial amount of this work has been done in the context of a continuous-performance task known as the sustained attention to response task (SART; Robertson et al., 1997; see also Smilek, Carriere, & Cheyne, 2010). In this task, failures to withhold button pressing to an infrequent no-go stimulus are scored as errors of commission and are used to index sustained attention ability, with more errors indicating poorer sustained attention ability. Using the SART to index sustained attention ability, researchers have attempted to assess potential improvements in sustained attention performance by (1) having participants engage in “mindful breathing” (Mrazek et al., 2012), (2) inducing a positive rather than a negative mood (Smallwood, Fitzgerald, Miles, & Phillips, 2009), (3) providing a self-alertness training strategy (O’Connell et al., 2008), and (4) presenting periodic auditory alerts to bring attention back on task (Manly et al., 2004).

Most attempts to improve sustained attention have yielded modest results as measured by performance on the SART (i.e., a small reduction in error rates). However, it is sometimes unclear from these studies whether even these modest reductions in errors are truly due to improved sustained attention, or whether they are the result of strategic changes in responding along the speed–accuracy trade-off curve (cf. Helton, 2009; Helton, Kern, & Walker, 2009). This particular concern applies primarily to studies using the SART rather than to those using more traditional go–no-go tasks. Whereas traditional go–no-go tasks index sustained attention by measuring errors of omission on infrequent go trials, the SART indexes sustained attention by measuring errors of commission on infrequent no-go trials. Because the critical error in the traditional go–no-go tasks is a nonresponse, speed–accuracy trade-offs are presumably less of a problem in such tasks, and the prevalence of the error can reasonably be used as a measure of sustained attention. However, because the critical error in the SART is the result of an inappropriately made response, there is the possibility that the error could be caused by speed–accuracy trade-offs (i.e., impulsive responding) rather than by a failure of sustained attention.

Together with others (Helton, 2009; Helton et al., 2009), we assume that in the context of the SART it is in principle possible to separate the influences of sustained attention from strategic shifts along the speed–accuracy dimension. While it is possible to explain speed–accuracy trade-offs using the concept of “attention”—for instance, positing that shifts along the speed–accuracy trade-off curve reflect strategic shifts in attention, either to generating a motor response (i.e., speed) or to stimulus identification (i.e., accuracy; we thank a reviewer for raising this possibility)—this is not the type of “attention” that researchers have referred to when using the SART. Instead, researchers using the SART explain commission errors in terms of people’s ability to “sustain attention” to the overall task, a concept that refers to the overall resources allocated to the task at hand over its duration (Robertson et al., 1997). Given this conceptualization of SART performance, it is presumably possible for individuals to hold constant sustained attention to a task while shifting along the speed dimension of the speed–accuracy trade-off curve, responding either more quickly or more slowly, depending on the strategy employed by the individual. In other words, we maintain that it is possible to separate the influences of sustained attention and speed–accuracy trade-offs.

Importantly, in many studies that have used the SART to document improvements in sustained attention, it is not possible to evaluate whether the performance improvements reflected improved sustained attention or strategic shifts along the speed–accuracy trade-off curve. This is because in the majority of the studies (e.g., Mrazek et al., 2012; O’Connell et al., 2008; Smallwood et al., 2009), sustained attention performance was indexed by SART commission errors, but mean response time (RT) data were not included with the mean error data. In the absence of a report of the RT data, it is impossible to assess potential speed–accuracy trade-offs. That this concern is justified is suggested by the cases in which mean RTs have been reported: there, the observed reductions in sustained-attention errors were accompanied by slower RTs (see Manly et al., 2004), suggesting speed–accuracy trade-off effects.

Further highlighting the need to consider speed–accuracy trade-offs in the SART, we recently reported that participants made fewer errors on an auditory than on a visual version of the SART, but that this error reduction was entirely explained by slower RTs in the auditory condition (Seli, Cheyne, Barton, & Smilek, 2012a). Additionally, we found that a simple alteration in instructions emphasizing a slow-and-accurate rather than a fast-and-accurate strategy produced substantial improvements in SART performance (Seli, Cheyne, & Smilek, 2012b): Instructing participants to respond slowly cut commission errors roughly in half, from a mean of 48 % with standard instructions to a mean of 25 % with slowing instructions. The decrease in errors across conditions was accompanied by longer RTs on go trials, increasing from a mean of roughly 350 ms with standard instructions to a mean of roughly 460 ms with slowing instructions, and a detailed analysis of the RTs revealed that the change in errors was almost entirely accounted for by the change in RTs. Also relevant is the finding that sustained attention performance in the SART, as measured by a reduction in errors, improves with age, but that this error reduction is entirely accounted for by robust response slowing with increasing age (Carriere et al., 2010). In accordance with these findings, Head, Russell, Dorahy, Neumann, and Helton (2012) recently reported that participants who were proficient in text-speak responded more quickly to a text-speak variant of the SART, but also produced more commission errors. Such changes in performance appear to reflect adjustments in response strategies for dealing with attention-demanding tasks, rather than modifications of attentional ability per se.

Speed–accuracy trade-offs have been documented in countless studies in experimental psychology (Garrett, 1922; Hick, 1952; Woodworth, 1899; see Pachella, 1974, and Sperling & Dosher, 1986, for reviews). Here we explore the role of this well-known experimental confound in studies using the SART as a measure of sustained attention. The sustained-attention literature using the SART has indeed seen its fair share of debate over the extent to which performance reflects differences in where individuals choose to respond along the speed–accuracy trade-off curve (see Helton, 2009; Helton et al., 2009; Peebles & Bothell, 2004; Seli et al., 2012b). However, despite this longstanding issue, researchers concerned with sustained attention and, in particular, with sustained-attention training have not typically considered the potential impact of speed–accuracy trade-offs on the results of their studies using the SART. Furthermore, as far as we know, while the possible role of speed–accuracy trade-offs in the SART has been identified before (Helton, 2009; Helton et al., 2009; Peebles & Bothell, 2004), no one examining sustained attention performance using the SART has yet manipulated response delay independently of individual differences in criterion setting in order to evaluate speed–accuracy trade-offs. Rather, speed–accuracy trade-offs have been detected by correlational analyses. That is, it has been observed that when participants speed up, they make more errors, and when they slow down, they make fewer errors (e.g., Head et al., 2012; Helton et al., 2009; Peebles & Bothell, 2004; Seli et al., 2012b). Although these data certainly suggest that speed–accuracy trade-offs exist, given the correlational nature of the data, a number of other variables may also contribute to this outcome. For example, any manipulation that encourages caution or an emphasis on accuracy over speed might lead both to improvements in performance and to more measured responding (e.g., Seli et al., 2012b). Thus, to gain a better understanding of the role of speed–accuracy trade-offs in the SART, experimental manipulations are required to break this interdependence.

Although the concern regarding the role of speed–accuracy trade-offs in the SART is particularly important when considering studies designed to improve sustained attention performance, the concern also applies to the large body of research that has emerged using the SART purely as a measure of sustained attention. As we have noted elsewhere (Seli et al., 2012a; Seli et al., 2012b), the SART has been used to index sustained attention abilities in numerous contexts, including situations in which people experience negative mood (e.g., Smallwood, O’Connor, Sudbery, & Obonsawin, 2007) and stress-related burnout (e.g., van der Linden, Keijsers, Eling, & van Schaijk, 2005), as well as in clinical conditions such as depression (Farrin, Hull, Unwin, Wykes, & David, 2003), attention-deficit/hyperactivity disorder (e.g., Greene, Bellgrove, Gill, & Robertson, 2009), cortical lesions (e.g., Molenberghs et al., 2009), and schizophrenia (e.g., Chan et al., 2009). A clear demonstration that speed–accuracy trade-offs influence SART performance would raise the possibility that the results of studies using the SART may in some cases reflect speed–accuracy trade-offs rather than strictly reflecting sustained attention ability.

The present study

Given the importance of the SART in recent investigations of sustained attention, the purpose of the present study was to systematically explore the effects of speed–accuracy trade-offs in this task. The study was designed to extend our previous finding that SART commission errors are reduced when participants are instructed to respond slowly (Seli et al., 2012b). Although the results of our previous study provided some insight into the effects of response delay on SART performance, one limitation of that study was that the instructions were nonspecific, in that they simply encouraged “slow” responding without specifying a precise response tempo. This lack of instructional precision almost certainly resulted in individual differences in the interpretation of the instructions. That is, instructions to respond either “as quickly and accurately as possible” or “as slowly and accurately as possible” imply a somewhat arbitrary pacing and are quite susceptible to individual differences in interpretation. In addition, participants might have interpreted the slowing instructions to mean that they ought to be more cautious, thus both affecting movement along the speed dimension of the speed–accuracy trade-off curve and motivating the participants to pay more attention (thus varying sustained attention). To overcome these problems, in the present study we systematically manipulated response delay along the speed–accuracy trade-off curve by linking responses to a precisely timed metronome.

Participants in the present study completed either the standard SART or a modified version of the SART in which their responses were locked to one of three tempos. Participants in the standard-SART condition were instructed to respond to each “go” digit as quickly as possible and to withhold responses to each “no-go” digit. In the other three conditions, which together constituted the sustained metronome-modulated attention-to-response task (SMMART), participants were instructed to coordinate their responses on go trials with metronome tones presented 400, 600, or 800 ms after the onset of each digit (we chose these delays because we wanted equally spaced intervals across a large range of the interstimulus interval). Participants were further instructed to withhold responding to the no-go digit (i.e., 3). The instructions made no reference to the concept of “attention,” nor did they in any way refer to the extent to which participants should be “cautious” when responding.

By varying response delay across a wider range of the speed–accuracy trade-off curve, the SMMART allowed for a precise assessment of the effects of different response delays on the frequency of commission errors. Notably, in the SART, the rhythmically presented digit serves two functions: First, the digit carries identity information specifying whether or not a response ought to be made; second, the onset of the digit specifies when a response ought to be initiated. In other words, the digit serves both as an informational stimulus, indicating whether the trial is a go or a no-go trial, and as a response-initiation stimulus, indicating that a response should be made as soon as possible. The unique aspect of the SMMART is that the informational and response-initiation aspects of the stimuli are separated in time, with the digit still serving as the informational stimulus, but with the onset of the metronome serving as the response-initiation stimulus. This feature of the SMMART allowed us to systematically vary the interval between the informational and response-initiation stimuli, which in effect served as a direct manipulation of the time allowed for processing the digit (captured by the speed dimension of the speed–accuracy trade-off curve). Thus, rather than simply measuring RTs on go trials, as had been done in previous studies using the SART, and then performing correlational analyses, we experimentally varied the time allotted for processing the digit as the primary independent variable and measured commission errors as the primary dependent variable of interest. In the context of the SMMART, RTs on go trials were used only as a check on the success of our experimental manipulation of response delay. To assess other aspects of performance, we also evaluated omission errors, as has been done in previous studies (e.g., Cheyne et al., 2009).

Method

Participants

The participants were 200 University of Waterloo psychology undergraduate students (58 male, 142 female) with self-reported normal or corrected-to-normal visual acuity, who participated in a session lasting approximately 30 min. Participation was voluntary, and participants received course credit. Fifty participants were assigned to each condition. The results from participants whose omission rates were 10 % or greater (i.e., those who failed to respond to at least 56 of the 560 go trials) or whose error rates were greater than three standard deviations from the condition mean (using a recursive outlier analysis within each condition) were removed from the data set. Of the original 200 participants, the data from seven were removed in this manner.
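For concreteness, the recursive outlier analysis might be implemented as in the minimal sketch below. The study does not specify the algorithm beyond the description above, so the stopping rule and the sample-SD convention here are our assumptions.

```python
import numpy as np

def recursive_outlier_trim(error_rates, n_sd=3.0):
    """Recursively flag values more than n_sd standard deviations from the
    mean of the retained values, recomputing the mean and SD after each
    pass, until no retained value exceeds the criterion. Returns a boolean
    mask of participants to keep."""
    x = np.asarray(error_rates, dtype=float)
    keep = np.ones(x.size, dtype=bool)
    while True:
        m, sd = x[keep].mean(), x[keep].std(ddof=1)  # recompute on survivors
        flagged = keep & (np.abs(x - m) > n_sd * sd)
        if not flagged.any():
            return keep
        keep &= ~flagged
```

Applying this mask within each condition, together with the 10 % omission criterion, would reproduce the style of exclusion procedure described above.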

Materials

Stimulus presentation was controlled by either a Dell Latitude D800 laptop or a Lenovo ThinkPad T420 laptop. Displays were presented on a Viewsonic G225F 21-in. CRT screen. All programs used in the present study were constructed with E-Prime software (Psychology Software Tools Inc., Pittsburgh, PA).

Measures

Mean RTs were calculated for all responses made during go trials. If no response was made to a go stimulus (the digits 1–2 and 4–9), this was coded as an omission. Responses to the no-go stimulus (3) were coded as errors.
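This scoring scheme can be summarized in a few lines (a hypothetical sketch for clarity, not the study's actual analysis code):

```python
def code_trial(digit, responded, no_go=3):
    """Classify one trial's outcome under the scoring scheme above."""
    if digit == no_go:
        return "commission_error" if responded else "correct_withhold"
    return "go_response" if responded else "omission"
```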

The SART

On each SART trial, a single digit (1–9) was presented in the center of a computer monitor for 250 ms, followed by an encircled “x” mask for 1,350 ms, for a total trial duration of 1,600 ms. Typically in the SART, each digit is presented for 250 ms, followed by a mask presented for 900 ms (for a total trial duration of 1,150 ms). However, in piloting with the standard 1,150-ms trial duration, we noticed that some responses made by participants in the 800-ms SMMART condition appeared to carry over to the next trial (resulting in an omission on the current trial and a very fast response on the subsequent trial), presumably because the trials terminated too quickly after the onset of the metronome. To eliminate this problem, we extended the trial duration across all conditions to 1,600 ms to allow sufficient time to make responses within the boundaries of each trial.

Each of the digits was presented equally often across a total of 630 trials. On each trial, the digit was chosen randomly from the set and presented in white against a black background. The size of the digits also varied randomly across trials, with the font sizes sampled equally often from five possibilities (120, 100, 94, 72, and 48 points). Participants were instructed to respond as quickly as possible to go digits and to withhold responses to the no-go digit. They were further instructed to place equal emphasis on responding quickly and responding accurately. Displays were viewed at a distance of approximately 50 cm. Following 18 practice trials, which included the presentation of two no-go digits (the digit 3), were 630 uninterrupted experimental trials, which included the presentation of 70 no-go digits (i.e., 1/9 of all trials were no-go trials).
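One way to generate a trial sequence with these properties is sketched below. This is an illustrative Python sketch only: the study used E-Prime for stimulus presentation, and whether the digits were balanced by shuffling (as here) or by some other constrained randomization is our assumption.

```python
import random

DIGITS = range(1, 10)                # digits 1-9, each shown equally often
FONT_SIZES = [120, 100, 94, 72, 48]  # point sizes, sampled per trial
NO_GO = 3
N_TRIALS = 630                       # 70 presentations of each digit
DIGIT_MS, MASK_MS = 250, 1350        # 1,600-ms total trial duration

def make_sart_trials(seed=None):
    """Return a shuffled, balanced list of SART trial specifications."""
    rng = random.Random(seed)
    digits = list(DIGITS) * (N_TRIALS // 9)  # balanced: 70 of each digit
    rng.shuffle(digits)
    return [{"digit": d,
             "font_pt": rng.choice(FONT_SIZES),
             "is_no_go": d == NO_GO,
             "digit_ms": DIGIT_MS,
             "mask_ms": MASK_MS}
            for d in digits]
```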

The SMMART

All details of the SMMART were identical to those mentioned in the description of the SART, with one important exception. Namely, in the SMMART, a metronome tone was presented 400, 600, or 800 ms after the onset of each digit, and participants were instructed to respond synchronously with the onset of the metronome tone on each go trial (and to withhold their responses in each no-go trial). Participants were further instructed to place equal emphasis on responding synchronously with the metronome and responding accurately.
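Building on the hypothetical sketch above, the SMMART differs only in the added tone-onset parameter (again illustrative, not actual experiment code):

```python
def make_smmart_trials(tone_delay_ms, seed=None):
    """Identical to the SART, except that a metronome tone sounds
    tone_delay_ms (400, 600, or 800) after digit onset; participants
    respond in synchrony with the tone on go trials."""
    trials = make_sart_trials(seed)
    for t in trials:
        t["tone_onset_ms"] = tone_delay_ms  # the response-initiation signal
    return trials
```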

Prior to beginning the tasks, participants in both the SART and SMMART conditions were provided with brief demonstrations on how to properly complete their tasks. Specifically, the experimenter completed 18 SART or SMMART trials while the participant watched. This demonstration was included because in piloting the SMMART, the mean RTs produced by some participants indicated that they may not have understood the task instructions (e.g., one participant in the 800-ms SMMART condition produced a mean RT of 321 ms). Hence, the demonstration was added to ensure participants’ understanding of the tasks.

Procedure

Participants were randomly assigned to one of four conditions: (1) the standard-SART condition, (2) the 400-ms SMMART condition, (3) the 600-ms SMMART condition, or (4) the 800-ms SMMART condition. Each set of instructions was visually presented on the monitor and was read aloud by the experimenter. All participants were instructed to respond to go stimuli by pressing the spacebar on the keyboard and to withhold responses when they saw the no-go stimulus. If a participant had any questions about the instructions, the experimenter provided clarification.

Results

Parsing the RT distribution: Proportion of RTs within 100-ms intervals

As an initial manipulation check of response delay, we examined, for each condition, the proportion of go trials on which RTs fell within each of 16 intervals, from 1 to 1,600 ms (each interval was 100 ms in duration), as well as the proportion of omissions (i.e., failures to respond within the temporal limits of go trials). As can be seen in Fig. 1a and b, the proportions of RTs falling in the 201- to 300-ms interval under the standard-SART and 400-ms SMMART conditions were far greater than those under the 600- and 800-ms SMMART conditions. Additionally, the proportions of RTs under the 600-ms SMMART condition peaked in the 501- to 600-ms interval, whereas in the 800-ms SMMART condition, they peaked in the 701- to 800-ms interval.

Fig. 1. (a) Proportions of go RTs for the standard-SART condition falling within each of sixteen 100-ms intervals, plus omissions. (b) Proportions of go RTs for the 400-, 600-, and 800-ms SMMART conditions falling within each of sixteen 100-ms intervals, plus omissions.
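For readers wishing to apply this style of analysis to their own data, the binning can be expressed compactly. This is a sketch: the bin edges follow the 1- to 1,600-ms intervals described above, and the exact boundary convention (open vs. closed edges) is our assumption.

```python
import numpy as np

def rt_bin_proportions(rts_ms, n_go_trials, bin_ms=100, max_ms=1600):
    """Proportion of go trials whose RT falls in each 100-ms interval
    (1-100, 101-200, ..., 1,501-1,600 ms), plus the omission rate
    (go trials with no response within the trial)."""
    rts = np.asarray(rts_ms, dtype=float)
    edges = np.arange(0, max_ms + bin_ms, bin_ms)  # 0, 100, ..., 1600
    counts, _ = np.histogram(rts, bins=edges)
    bin_props = counts / n_go_trials
    p_omission = 1.0 - rts.size / n_go_trials
    return bin_props, p_omission
```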

Go-trial RTs

As a further manipulation check, we analyzed the mean go-trial RTs across the four conditions to determine whether the RTs for each group roughly matched the relative time at which the metronome was presented for that group. The mean RTs, which are presented in Fig. 2, were analyzed with a one-way analysis of variance (ANOVA) with four levels of the between-subjects factor (standard-SART and 400-, 600-, and 800-ms SMMART conditions). The analysis revealed a significant effect of condition, F(3, 189) = 175.37, MSE = 8,122.81, p < .001. All post hoc analyses were conducted using Fisher’s LSD tests. RTs differed significantly across all conditions (all ps < .02), with the fastest RTs in the 400-ms SMMART condition, followed by the standard-SART, 600-ms SMMART, and 800-ms SMMART conditions, in that order. These manipulation checks confirmed that participants were indeed following the SMMART instructions and that we were successful in manipulating our independent variable of response delay.

Fig. 2. Mean go-trial RTs for the standard-SART and the 400-, 600-, and 800-ms SMMART conditions. Error bars represent standard errors.
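The same pipeline (a one-way between-subjects ANOVA followed by Fisher's LSD post hoc tests) applies to the error and omission analyses reported below. A minimal Python sketch, assuming per-participant means in a long-format data frame (variable names are ours, not the study's):

```python
import numpy as np
import pandas as pd
from itertools import combinations
from scipy import stats

def oneway_anova_lsd(df, dv, group="condition"):
    """One-way between-subjects ANOVA on dv, followed by Fisher's LSD
    (pairwise t tests using the pooled error term of the ANOVA)."""
    cells = {name: g[dv].to_numpy() for name, g in df.groupby(group)}
    f, p = stats.f_oneway(*cells.values())
    # Pooled MSE and its degrees of freedom (the ANOVA error term)
    n_total = sum(len(v) for v in cells.values())
    df_err = n_total - len(cells)
    mse = sum(((v - v.mean()) ** 2).sum() for v in cells.values()) / df_err
    print(f"F({len(cells) - 1}, {df_err}) = {f:.2f}, MSE = {mse:.3g}, p = {p:.4g}")
    for (a, va), (b, vb) in combinations(cells.items(), 2):
        se = np.sqrt(mse * (1 / len(va) + 1 / len(vb)))
        t = (va.mean() - vb.mean()) / se
        p_pair = 2 * stats.t.sf(abs(t), df_err)
        print(f"{a} vs. {b}: t({df_err}) = {t:.2f}, p = {p_pair:.4g}")

# Example usage with hypothetical data:
# df = pd.DataFrame({"condition": ["SART"] * 50 + ["SMMART-400"] * 50,
#                    "mean_rt": np.random.normal(400, 60, 100)})
# oneway_anova_lsd(df, dv="mean_rt")
```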

No-go errors

Mean no-go errors, which are presented in Fig. 3, were analyzed with a one-way ANOVA with four levels of the between-subjects factor (standard-SART and 400-, 600-, and 800-ms SMMART conditions). The analysis revealed a significant effect of condition, F(3, 189) = 46.64, MSE = .033, p < .001. Again, Fisher’s LSD tests were used for all post hoc analyses. Significantly fewer errors were committed in the 600-ms SMMART condition than in the standard-SART condition, p < .001, and the 400-ms SMMART condition, p < .001. Additionally, participants in the 800-ms SMMART condition made significantly fewer errors than did participants in all other conditions (all ps < .02). There was no significant difference in error rates between the standard-SART and 400-ms SMMART conditions (p > .05).

Fig. 3. Mean proportions of no-go errors for the standard-SART and the 400-, 600-, and 800-ms SMMART conditions. Error bars represent standard errors.

Go-trial omissions

In a parallel analysis, we examined the mean proportions of go-trial omissions, which are presented in Fig. 4. This analysis yielded a significant effect of condition, F(3, 189) = 5.26, MSE = .000, p = .002. Fisher’s LSD tests revealed that significantly fewer omissions were produced in the 600- and 800-ms SMMART conditions than in the standard-SART condition (both ps < .01). In addition, participants in the 800-ms SMMART condition produced significantly fewer omissions than did those in the 400-ms SMMART condition (p = .02), and participants in the 600-ms condition produced marginally fewer omissions than did those in the 400-ms condition (p = .06). There was no significant difference in omission rates between the standard-SART and 400-ms SMMART conditions (p = .28).

Fig. 4. Mean proportions of omissions for the standard-SART and the 400-, 600-, and 800-ms SMMART conditions. Error bars represent standard errors.

Discussion

In the present study, we extended our prior line of work by systematically varying response delay along the speed–accuracy trade-off curve to evaluate the role of such trade-offs in the SART. That the RTs rather closely matched the metronome onsets suggests that our response delay manipulation was largely successful. Our examination of the mean no-go error rates showed a decrease in errors as a function of response delay: Manipulating response delay produced a speed–accuracy trade-off, with participants in the 400-ms condition producing the most errors, those in the 600-ms condition producing fewer errors, and those in the 800-ms condition producing still fewer errors. Perhaps the most noteworthy result of the error analysis was that sustained attention performance was substantially improved (i.e., error rates were decreased to a mere 6 %) by delaying responses to an RT range of roughly 800 ms. Also noteworthy is the finding that omission rates were lower in the 600- and 800-ms SMMART conditions than in the standard-SART and 400-ms SMMART conditions. Given that omissions are commonly used alongside commission errors to index performance (with more omissions indicating poorer performance), these results provide additional support for the claim that SART performance improves with longer response delays. Finally, and most importantly, the results of the present study demonstrated the effects of experimentally manipulating response delay on error reduction in sustained-attention tasks such as the SART, and they clearly indicated that the SART is indeed susceptible to speed–accuracy trade-offs.

Concluding remarks

The present findings have important implications for researchers who, using the SART to index sustained attention ability, seek to improve sustained attention performance, because any apparent benefit of an intervention could be mediated by a simple slowing strategy. In view of the present results, it is a matter of some concern that researchers examining interventions aimed at improving sustained attention performance have not routinely reported RT changes along with error measures. This is not to say that delaying one’s responses might not be a useful coping strategy for reducing errors on laboratory tasks, or even for improving everyday attentional performance. It is important to be aware, however, that such improvements may be independent of changes in sustained attention ability. We therefore strongly encourage researchers to report mean RT data in any study aimed at improving sustained attention performance, so that any possible effects of speed–accuracy trade-offs can be taken into account when drawing inferences from the data.

Another reason to seriously consider changes in response delay following attentional training is that some or all training methods may affect sustained attention not directly, as intended, but indirectly, by modulating response tempo. In such cases, induced changes in response tempo might incidentally increase effective attention to the task by, for example, allowing more time for decisions. This might well be a beneficial coping strategy to compensate for inherent attention deficits, but it would not be a remediation of attention per se. These are complex issues that will require sophisticated designs and multivariate analyses to sort out the benefits and costs, if any, of different training regimes, but they also have the potential to enrich not only our understanding of the effects of attention training, but also our understanding of the interactive roles of attentional processes and alternative coping strategies in continuous-performance tasks.

We also note, in closing, that our findings have implications for studies beyond those assessing the effects of attention training, in which the SART has been used to index sustained attention ability. Indeed, the SART has been used in numerous clinical and applied settings (e.g., Chan et al., 2009; Farrin et al., 2003; Greene et al., 2009; Molenberghs et al., 2009; Smallwood et al., 2007; van der Linden et al., 2005). Our findings raise the possibility that the results of some of these studies may in fact reflect speed–accuracy trade-offs rather than purely reflecting sustained attention performance. Although the present work cannot directly speak to the role of speed–accuracy trade-offs in these particular studies, the results highlight the danger of ignoring such trade-offs. While our intention is not to invalidate the SART as a measure of sustained attention ability, our results suggest that researchers should be careful when administering this task, and that future research should perhaps either employ other, less problematic tasks when exploring sustained attention or use statistical procedures (e.g., residualizing commission errors on RTs) to control for the effects of speed–accuracy trade-offs.
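As an illustration of the residualizing procedure suggested above, one could regress per-participant commission-error rates on mean go RTs and analyze the residuals. This is a sketch under the assumption of a linear speed–accuracy relation; the function name and variables are ours.

```python
import numpy as np
from scipy import stats

def residualize_errors_on_rt(error_rates, mean_rts):
    """Regress per-participant commission-error rates on mean go RTs and
    return the residuals: the error variance that remains once response
    speed has been partialled out."""
    slope, intercept, *_ = stats.linregress(mean_rts, error_rates)
    predicted = intercept + slope * np.asarray(mean_rts, dtype=float)
    return np.asarray(error_rates, dtype=float) - predicted
```

The residuals could then be substituted for raw commission errors in group comparisons, so that any remaining condition differences are not attributable to simple response slowing.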