People are often faced with situations requiring quick decisions or judgments based on little or incomplete information. Such situations may include any time an individual must detect, recognize, identify, or classify a stimulus in order to initiate an appropriate behavioral response. Even relatively quick decisions require an individual to draw on a combination of attentional, perceptual, decisional, and motor processes. A large body of research has shown that these processes, while distinct, can interact to influence behavior. For instance, individual performance on speeded decision tasks is generally faster and more accurate when spatial/perceptual features and response features of a task are related. Such effects, known as stimulus–response (S–R) congruency effects, have been extensively studied in cognitive psychology (Pashler, Johnston, & Ruthruff, 2001; Proctor & Vu, 2006). However, a challenge facing theoretical accounts of S–R congruency effects is the difficulty of separating effects due to perceptual and motor mechanisms, respectively, in behavioral data. Sequential sampling models can address the separable, additive, and interactive effects of perceptual, decisional, and motor processes. To that end, the goals of the current study are: (1) to examine how the quality of evidence extracted from visual stimuli influences S–R congruency effects; and (2) to utilize a sequential sampling model of perceptual decision making to assess the separate and interactive effects of perceptual/decisional and motor processes in the S–R congruency effect.

It is well known that the speed and accuracy of visual object recognition can be influenced by various features of an image, including but not limited to: size (Craddock & Lawson, 2009; Graf, 2006; Te Pas, Kappers, & Koenderink, 1996; Tomkins, 2021), location (Golomb, Kupitz, & Thiemann, 2014; Graf, 2006), luminance (Tomkins, 2021), clarity (Pegna, Khateb, Michel, & Landis, 2004; Shahangian & Oruc, 2014; Wyatte, Curran, & O’Reilly, 2012), and presentation duration (Potter, Wyble, Hagmann, & McCourt, 2014; Thorpe, Fize, & Marlot, 1996). Objects are generally more difficult to recognize when either viewing conditions or the quality of the image itself is degraded. While this phenomenon is well-established, the relationship among perceptual quality, decision processes, and motor (i.e., executive) processes is less well understood. The relationship among perceptual, decisional, and motor processes is no trivial matter. The vast majority of studies examining perceptual and cognitive processes require some form of motor response from the participant. This may include a keypress, verbal response, finger point, or similar behavior. Exceptions may include EEG or eye-tracking techniques. This situation presents a challenge for interpreting behavioral data, since the data reflect the cumulative effects of attentional, perceptual, decisional, and motor processes. However, there is evidence that these distinct processes do not always operate in a serial fashion, but often in parallel. That is, (pre)motor processes may initiate before perceptual processes have completed (Proctor & Vu, 2006). This is problematic since it is generally assumed that statistical differences across experimental conditions genuinely reflect the intended process(es) in question given that all other testing procedures are kept the same. However, observed differences in mean RT across different experimental conditions intended to measure some aspect of cognition may not actually reflect cognitive processes at all.
There is a large body of evidence that response modality and response mapping can have significant influences on data intended to examine attention, perception, or cognitive mechanisms (Gomez, Ratcliff, & Childers, 2015; Hazeltine & Ruthruff, 2006; Kramer, Cox, Yu, Kravitz, & Mitroff, 2021; Philipp & Koch, 2005; Phillips & Ward, 2002; Roggeveen, Prime, & Ward, 2005; Taylor & Klein, 2000). For instance, spatial mapping of attentional or perceptual codes and motor processes can combine into a single response code over repeated trials (i.e., the S–R congruency effect). Such findings have prompted some authors to suggest the keypress method should be discontinued in favor of alternative methods (Kramer et al., 2021). While this proposal may be extreme, there are valid criticisms of the keypress modality that require careful consideration in response time research. Furthermore, the common practice of averaging response time data has the potential to introduce artifacts (Estes, 1956; Gómez, Breithaupt, Perea, & Rouder, 2021). To address these issues, the present study systematically manipulated perceptual and response features in a speeded choice decision task. Additionally, the present study implements a mathematical model that considers the full response time distribution and assesses participant response time and accuracy in a single analysis to examine the selective effects of perceptual, decisional, and motor processes in a simple decision task.

Models of stimulus–response congruency

Several theoretical models have been proposed to explain S–R congruency effects. Although they differ in specific features, current models show consensus that S–R congruency is composed of two separate effects: (1) general effects, and (2) task-specific effects. General effects are fundamental patterns that appear irrespective of specific stimuli or task demands. For instance, an image presented to the right visual field will automatically attract spatial attention resources and prime a response to the right regardless of what the image is or whether it is relevant to the task demands. In contrast, task-specific effects are congruency effects that vary based on stimulus features and/or the specific demands of the task. The dual-process model (De Jong, Lang, & Lauber, 1994) explicitly distinguishes between these two kinds of processes as unconditionally automatic and conditionally automatic effects, respectively, and argues the two processes operate independently. An alternative account is the dimensional overlap processing model (Kornblum & Lee, 1995), which conceptualizes stimulus and response information as vectors of features. In this view, S–R congruency effects are the result of an overlap or match between the respective attributes (i.e., dimensions) of the stimulus vector and the response vector. The more dimensions that match between the two, the larger the S–R congruency effect. It is important to note that experimental tasks used to examine S–R congruency effects do not involve only perceptual and motor processes, but also decision processes. While current models have provided a great deal of insight into the dynamic relationship between perceptual and motor mechanisms, they offer little explanation concerning how the decision process unfolds.
It is one thing to argue, for instance, that an individual must make a decision concerning whether a feature of a stimulus is task-relevant, but it is another to show how this decision is made and make specific predictions for data based on modeling assumptions. Purely verbal accounts are limited in this respect. In contrast, sequential sampling models can help reveal the decision-making process and make predictions for data based on modeling parameters which reflect specific psychological processes.

Diffusion model

Sequential sampling models have proven highly successful at describing and predicting human behavior in perceptual decision tasks. These are mathematical models that characterize decision making as a process in which noisy evidence from a stimulus accumulates over time until the threshold for a particular decision is met and a response is initiated. In particular, the diffusion model (Ratcliff, 1978; Ratcliff, Voskuilen, & McKoon, 2018) has been reliably shown to account for participant behavior in “one shot” decisions across a variety of different contexts (see Ratcliff, Smith, Brown, and McKoon (2016) for a review). A strength of the diffusion model is its ability to separate decision processes from non-decisional processes (e.g., memory processes, executive processes). Figure 1 shows a graphical representation of a hypothetical trial within the diffusion model. The response time in a speeded dual-choice task can be separated into three components: the time required to extract the physical and psychological features of a stimulus relevant to the task (encoding time), the time required for the diffusion process to reach one of two decision boundaries (accumulation of evidence), and the time required for the motor response. The sum of the encoding time and the time required for the motor response is represented by the parameter Ter, and the st parameter represents its range.
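The decomposition described above can be written compactly. What follows is a sketch in conventional diffusion-model notation, not a formula taken from the present study: v (drift rate), a (boundary separation), and z (starting point) are the parameters defined in the sections that follow, while T_d (decision time), t_er (the non-decision time on a given trial), s (within-trial noise), and W (a Wiener process) are symbols introduced here for illustration. The uniform distribution for non-decision time is the usual convention in this literature and is an assumption; the text only states that st is a range.

```latex
\[
\begin{aligned}
dX(t) &= v\,dt + s\,dW(t), \qquad X(0) = z, \\
T_d &= \inf\{\, t \ge 0 : X(t) \le 0 \ \text{or}\ X(t) \ge a \,\}, \\
RT &= T_d + t_{er}, \qquad
t_{er} \sim \mathrm{Uniform}\!\left(T_{er} - \tfrac{s_t}{2},\; T_{er} + \tfrac{s_t}{2}\right).
\end{aligned}
\]
```

Under this reading, encoding and motor time shift the whole response time distribution, whereas the evidence accumulation process determines both its shape and which boundary is reached.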

Fig. 1

Diffusion model representation of a single trial. Here the participant must decide whether a flashed image is a circle or square. On this trial, the image is a square. Parameter z represents the starting point of the decision process, and time elapses from left to right.

Drift rate

In the diffusion model, noisy evidence accumulates towards a decision boundary at a variable rate (see Fig. 1). This process is referred to as drift. The average rate at which the evidence drifts towards a decision boundary is referred to as the “drift rate”. Conceptually, the drift rate can be thought of as the quality of evidence extracted from a stimulus. For instance, suppose a participant must quickly decide whether a briefly presented object is a circle or a square. Further suppose that on a particular trial the shape presented is a square. If a relatively large and clear image is displayed, there should be a large positive drift rate, reflecting that the object is relatively easy to recognize. However, if the image were more difficult to perceive, whether by being made very small or fuzzy or by being presented for only a few milliseconds, the value of the drift rate should be smaller, reflecting a more difficult decision due to lower quality evidence from the stimulus. In this case, the participant will likely take longer to make a decision and the accumulated evidence is more likely to cross the “circle” threshold (represented by 0 in Fig. 1), resulting in an error.
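The circle/square example above can be illustrated with a small simulation. This is a minimal sketch, not the fitting procedure used in the study: the function name `simulate_diffusion`, the Euler time step, the noise scale s = 1, and all parameter values are assumptions chosen for illustration, and across-trial variability (η, sz, st) is omitted.

```python
import numpy as np


def simulate_diffusion(v, a, z, ter, n_trials=2000, s=1.0, dt=0.001,
                       max_t=5.0, seed=0):
    """Simulate two-boundary diffusion trials with an Euler approximation.

    Returns (choice, rt): choice is 1 for the upper boundary ("square"),
    0 for the lower boundary ("circle"), -1 if no boundary was reached
    within max_t seconds; rt is decision time plus ter, in seconds.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, z, dtype=float)        # evidence starts at z
    decision_time = np.full(n_trials, np.nan)
    choice = np.full(n_trials, -1)
    active = np.ones(n_trials, dtype=bool)       # trials still accumulating

    for step in range(1, int(max_t / dt) + 1):
        if not active.any():
            break
        n_active = int(active.sum())
        # Drift plus Gaussian within-trial noise for unfinished trials only.
        x[active] += v * dt + s * np.sqrt(dt) * rng.standard_normal(n_active)
        hit_upper = active & (x >= a)
        hit_lower = active & (x <= 0.0)
        choice[hit_upper] = 1
        choice[hit_lower] = 0
        finished = hit_upper | hit_lower
        decision_time[finished] = step * dt
        active &= ~finished

    return choice, ter + decision_time


# Hypothetical drift rates: a clear stimulus versus a degraded one.
results = {}
for label, v in [("clear", 2.0), ("degraded", 0.5)]:
    choice, rt = simulate_diffusion(v=v, a=1.5, z=0.75, ter=0.3)
    finished = choice >= 0
    results[label] = {
        "accuracy": float(np.mean(choice[finished] == 1)),
        "mean_correct_rt": float(np.mean(rt[choice == 1])),
    }
    print(label, results[label])
```

With these illustrative values, the smaller drift rate should produce both lower accuracy and slower correct responses, which is the qualitative pattern described in the text.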

When a participant is required to make a series of binary decisions over many trials, there are two sources of variability: within-trial variability and across-trial variability. The diffusion model accounts for both sources of variability. Within-trial variability is represented in Fig. 1 by the jagged line. As mentioned above, the accumulation of evidence within a trial is noisy (i.e., variable). Within-trial variability is a scaling parameter, which means that the same predictions could be generated by changing this parameter while rescaling the other model parameters. Across-trial variability in drift rate is captured by the η parameter. In most tasks, stimuli are grouped based on shared features or experimental conditions and it cannot be assumed that all stimuli have equal discriminability.
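The scaling property can be made concrete with two standard closed-form results for a diffusion process with absorbing boundaries at 0 and a. This is a sketch under simplifying assumptions: across-trial variability is ignored (η = 0, sz = 0), and the symbols s (within-trial noise), c (an arbitrary positive constant), P_upper (probability of reaching the upper boundary), and E[T_d] (mean decision time) are notation introduced here.

```latex
\[
P_{\text{upper}} = \frac{1 - e^{-2vz/s^{2}}}{1 - e^{-2va/s^{2}}},
\qquad
E[T_d] = \frac{a\,P_{\text{upper}} - z}{v}
\qquad (v \neq 0).
\]
\[
(v, a, z, s) \mapsto (cv, ca, cz, cs):
\quad
\frac{(cv)(cz)}{(cs)^{2}} = \frac{vz}{s^{2}},
\quad
\frac{(cv)(ca)}{(cs)^{2}} = \frac{va}{s^{2}},
\quad
\frac{(ca)\,P_{\text{upper}} - cz}{cv} = \frac{a\,P_{\text{upper}} - z}{v}.
\]
```

Both quantities are unchanged by the rescaling, which is why the noise parameter is fixed by convention and the remaining parameters are interpreted relative to it. Only these two summary quantities are shown here.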

Decisions can be affected not only by features of the stimuli, which influence the drift rate, but also by bias. Participants show faster response times and better accuracy when the location of a target stimulus is location-congruent with the response key (S–R congruency effect). For instance, if the response key for “square” is positioned to the participant’s right, they will be faster and more accurate when a square is presented on the right side of a display relative to when the square is presented to the left. The diffusion model can capture this bias by considering the boundary position. The two parameters of the model that describe the boundary position are z, the location of the starting point (with a range of sz), and a, the distance between the boundaries (with the location of the negative boundary assumed to be set at 0). When no bias is present, the starting point lies midway between the boundaries (z/a = 0.5). When biases are present, z moves closer to the boundary the participant is biased towards, so that z/a is either greater than or less than 0.5.
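The effect of the starting point on choices can be illustrated with the standard closed-form probability that a diffusion process reaches the upper boundary before the lower one. This is a sketch under stated assumptions: the function name `p_upper`, the noise scale s = 1, and the parameter values are hypothetical, and across-trial variability in the starting point (sz) is ignored.

```python
import math


def p_upper(v, a, z, s=1.0):
    """Probability of absorption at the upper boundary a (rather than 0)
    for a diffusion with drift v and noise s, starting at z."""
    if abs(v) < 1e-12:
        # With no drift, the probability equals the relative starting point.
        return z / a
    return (1.0 - math.exp(-2.0 * v * z / s**2)) / (
        1.0 - math.exp(-2.0 * v * a / s**2)
    )


a = 1.5  # hypothetical boundary separation
bias_table = {}
for relative_start in (0.4, 0.5, 0.6):
    z = relative_start * a
    bias_table[relative_start] = {
        "no_evidence": p_upper(v=0.0, a=a, z=z),
        "weak_evidence": p_upper(v=0.5, a=a, z=z),
    }
    print(relative_start, bias_table[relative_start])
```

Moving z towards the upper boundary raises the probability of the corresponding response for any fixed drift rate, and with no evidence at all the choice probability simply equals z/a.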

The present task

The present study includes a simple shape recognition task. On each trial, participants were very briefly presented with either a square or a circle to either the left or right of a central fixation and instructed to press one key to classify the stimulus as a square and a second key to classify the stimulus as a circle. To test the selective and interactive influence of perceptual and motor processes, both a perceptual and a motor manipulation were included. The perceptual quality of the visual code was manipulated by varying presentation duration (Experiment 1) and stimulus size (Experiment 2). The motor response was manipulated by varying stimulus–response mapping. Specifically, the key to identify a stimulus as a “square” was positioned to the right of the participant, whereas the key to identify a stimulus as a “circle” was positioned to the left. In effect, half the trials are location-congruent (i.e., square presented to the right, circle presented to the left) and half the trials are location-incongruent (i.e., square presented to the left, circle presented to the right). In Experiment 3, response keys were positioned centrally and vertically to the display. This design enables us to tease apart the perceptual, cognitive, and motor processes involved in participant behavior and also systematically examine the selective, joint, and interactive effects among these factors.

Previous research has shown that S–R congruency effects may be decreased when stimuli are positioned vertically, rather than horizontally, on the display (Wiegand & Wascher, 2007). However, the extent to which S–R congruency effects are influenced by keeping the position of stimuli fixed and manipulating the relative position of the response keys has received less attention. This question is important since many experiments require lateral presentation of stimuli, thus positioning stimuli centrally and vertically is not an option. Furthermore, the present study goes beyond assessment of S–R effects by including a perceptual manipulation. This design provides a broader examination of choice behavior, enabling us to examine the effects of spatial-perceptual-motor processes. If S–R congruency effects are attenuated or eliminated by this vertical mapping assignment, we will have additional evidence that decisional biases are primarily motor-based. However, if S–R congruency effects are reduced depending on the perceptual quality of the stimulus, this suggests that S–R congruency effects are the result of a combination of early perceptual processes (i.e., since the manipulation is stimulus presentation duration), spatial attention, and motor mechanisms.

Hypotheses

With the design established, four competing accounts are proposed with each making specific predictions for the data:

  • Null account: No effect of perceptual evidence or stimulus–response mapping

  • Perceptual account: Decisions based solely on perceptual evidence; no effect of stimulus–response mapping

  • Stimulus–response account: Decisions based solely on stimulus–response mapping; no effect of perceptual evidence

  • Interaction account: Perceptual and stimulus–response processes interact to influence decisions

The first hypothesis can be considered the null model, where no significant differences are expected based on perceptual quality or response mapping. The second hypothesis assumes that the present task is primarily a perceptual task. Specifically, participants will show worse performance (slower RT, lower accuracy) when the perceptual features of the stimuli are degraded (i.e., shorter presentation, smaller size); however, response mapping will have no influence on performance. The third hypothesis emphasizes the motor aspect of the task. In this account, participants are expected to show S–R congruency effects, but no differences based on the perceptual quality of the stimuli. This outcome would suggest that S–R congruency effects are the result of stimulus–response mapping only. In other words, S–R congruency effects are the result of spatial attention and motor processes, while perceptual processes are not involved. The fourth hypothesis assumes that perceptual and response-mapping manipulations will interact to influence participant performance. Figure 2 shows the modeling predictions for each of the four hypotheses (see Angele, Baciero, Gómez, and Perea (2022) and Gomez and Perea (2020) for examples of this approach). There is a principled way for the model to account for the data: presentation time (Experiments 1 and 3) or stimulus size (Experiment 2) should only affect the drift rate parameter, while the response thresholds and the starting point of the diffusion process should reflect the response biases.

Fig. 2

Drift-diffusion model representations for each of the four hypotheses. The bottom portion of each graph reflects the encoding process followed by the evidence accumulation process. The top portion reflects the response time distributions for correct trials across experimental conditions. The shaded color beams reflect the average rate of evidence accumulation. The null account predicts no differences in either encoding processes (represented by the dashed lines) or average rate of evidence accumulation. The stimulus–response account predicts faster encoding when stimuli and response key are location-congruent (blue) than when the stimuli and response are location-incongruent (red), but no differences in evidence accumulation rate. The perceptual account predicts no differences in encoding, but more rapid extraction of perceptual evidence when the shape is presented for longer (blue) than when the shape is presented for a shorter duration (red). Lastly, the interaction account predicts that both encoding and evidence accumulation processes will be influenced by experimental manipulations.

There are two potential patterns that may emerge in the interaction account, each with different implications. First, participants will display a larger S–R congruency effect when the perceptual quality of the stimulus is degraded. When the perceptual code affords insufficient evidence for a decision either due to the stimulus being presented too briefly (Experiment 1) or the stimulus being smaller (Experiment 2), participants will rely more on spatial-response biases and less on perceptual evidence. This pattern would indicate that better perceptual information helps reduce S–R effects. When quality evidence is afforded by perceptual codes, this quality evidence aids decisional and motor processes. The second pattern that may emerge from the interaction account is that participants will display larger S–R congruency effects with better perceptual evidence (i.e., longer presentation time, larger stimulus size). This pattern would suggest that perceptual information enhances S–R effects. Specifically, high-quality perceptual information interferes with decisional and motor processes, resulting in greater responding bias.

Experiment 1

Methods

Participants

Thirty subjects from the College of Saint Benedict and Saint John’s University participated in Experiment 1. Prescreening was conducted to ensure that all participants had normal or corrected-to-normal vision, and no history of visual impairments. Participants received course credit for participation.

Materials and procedure

Stimuli consisted of an image of a square and an image of a circle. Each image occupied a 1.5 cm x 1.5 cm region of the screen. Participants were positioned 48 cm from the center of the display. Each trial began with a centrally presented fixation “+” that remained on the screen for 1000 ms (see Fig. 3). Following the fixation, a stimulus (square or circle) was briefly presented either to the left or the right of the fixation with the inner edge of the stimulus 2 cm from the center of the screen, creating a visual angle of 2.38°. The stimulus subtended approximately 1.8° vertically and horizontally. On half the trials, the stimulus appeared for 80 ms (high quality). On the other half of trials, the stimulus appeared for 20 ms (low quality). As shown in Fig. 3, a backwards mask was presented immediately following the target to prevent additional evidence accumulation from sensory memory processes. Participants were instructed to indicate by keypress whether the flashed image was a circle or a square as quickly as possible. When the stimulus was a square, the participant was instructed to press the “M” key (positioned to the right) on the keyboard with the right index finger. When the stimulus was a circle, the participant was instructed to press the “Z” key (positioned to the left) with the left index finger. Each participant completed a total of 480 trials broken down into six blocks of 80 trials each. Trials were fully counterbalanced such that an equal count (n = 60) of each of the 8 experimental conditions was presented to each participant: 2 stimulus (circle, square) x 2 presentation location (left, right) x 2 duration (long, short). All testing was conducted using E-Prime 2.0 testing software.
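The reported visual angles and trial counts follow from the stated geometry and design, as the short check below illustrates. It is a sketch for verification only: the function names are hypothetical, and the viewing distance is treated as measured to the center of the screen, as stated above.

```python
import itertools
import math


def subtended_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given size centered on the line of sight."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def eccentricity_deg(offset_cm, distance_cm):
    """Visual angle between fixation and a point offset from the screen center."""
    return math.degrees(math.atan(offset_cm / distance_cm))


stimulus_deg = subtended_angle_deg(1.5, 48.0)   # approximately 1.79 degrees
inner_edge_deg = eccentricity_deg(2.0, 48.0)    # approximately 2.39 degrees

# Design: 2 stimuli x 2 locations x 2 durations, 60 trials per cell.
conditions = list(itertools.product(
    ("circle", "square"), ("left", "right"), ("long", "short")))
total_trials = len(conditions) * 60

print(round(stimulus_deg, 2), round(inner_edge_deg, 2), total_trials)
```

The eight cells of 60 trials give the 480 trials reported, which also equals six blocks of 80.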

Fig. 3

Trial procedure

Results

Behavioral results

Prior to analysis, raw RTs shorter than 100 ms or longer than 2000 ms were removed (1.7% of trials). With outliers removed, separate linear mixed-effects regressions were conducted for correct RTs and error RTs including stimulus duration (short, long) and response mapping (location-congruent, location-incongruent) as fixed factors. To avoid the issue of nonlinear transformations for response time data, a gamma distribution was specified for all response time analyses. Accuracy was analyzed with a logistic mixed-effects regression including perceptual quality and response mapping as fixed factors.
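The trimming rule described above amounts to a simple filter on raw response times. The following sketch shows one way to apply it; the function name `trim_rts` and the example values are hypothetical and are not data from the study.

```python
def trim_rts(rts_ms, lower_ms=100.0, upper_ms=2000.0):
    """Drop RTs shorter than lower_ms or longer than upper_ms.

    Returns the retained RTs and the proportion of trials removed.
    """
    kept = [rt for rt in rts_ms if lower_ms <= rt <= upper_ms]
    removed = len(rts_ms) - len(kept)
    proportion_removed = removed / len(rts_ms) if rts_ms else 0.0
    return kept, proportion_removed


# Hypothetical raw RTs in milliseconds, for illustration only.
example_rts = [85.0, 412.0, 530.0, 2150.0, 648.0]
kept_rts, proportion_removed = trim_rts(example_rts)
print(kept_rts, proportion_removed)
```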

Accuracy

Results (see Fig. 5) showed a significant main effect of duration, OR = 0.07, CI [0.06, 0.09], p < .001. Participants were more likely to make an error on short-duration trials (P = 0.33) than long-duration trials (P = 0.04). A significant main effect of response mapping was also found, OR = 0.67, CI [0.52, 0.85], p < .001. The probability of an error was larger on incongruent trials (P = 0.20) than congruent trials (P = 0.16).

Correct RT

For correct trials, a significant main effect of duration was found, b = -164.55, p < .001 (see Fig. 4). Response times were shorter on long-duration trials (M = 436 ms) than short-duration trials (M = 624 ms). The main effect of response mapping was also significant, b = -7.31, p < .001. Participants responded faster when targets were location-congruent with the response key (M = 521 ms) than location-incongruent (M = 538 ms). Lastly, a significant duration x mapping interaction was observed, b = 17.53, p < .001. To further examine this interaction effect, a post hoc (Tukey) test was conducted (see Table 1). On incongruent trials, response times were faster in the long-duration condition than the short-duration condition, b = -182.09, p < .001. On congruent trials, response times were also faster in the long-duration condition than the short-duration condition, b = -164.55, p < .001. On short-duration trials, response times were shorter in the congruent mapping condition than the incongruent condition, b = -24.85, p < .001. However, no significant difference based on mapping was found on long-duration trials, b = -7.31, p = .07.
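The interaction coefficient can be read as a difference between simple effects, and the reported values appear consistent with this reading up to rounding. The check below uses only the coefficients quoted above; interpreting the interaction as a difference of differences is an assumption about the model coding, which the text does not specify.

```python
# Simple effects of duration (long minus short) within each mapping condition, in ms.
duration_effect = {"congruent": -164.55, "incongruent": -182.09}

# Simple effects of mapping (congruent minus incongruent) within each duration, in ms.
mapping_effect = {"long": -7.31, "short": -24.85}

# Difference of differences, computed from either set of simple effects.
interaction_from_duration = duration_effect["congruent"] - duration_effect["incongruent"]
interaction_from_mapping = mapping_effect["long"] - mapping_effect["short"]

reported_interaction = 17.53

print(round(interaction_from_duration, 2), round(interaction_from_mapping, 2))
```

Both routes give about 17.54 ms, within rounding of the reported b = 17.53: the congruency advantage is roughly 17.5 ms smaller on long-duration trials than on short-duration trials.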

Fig. 4

Swarmplot of participant response times on error trials (left) and correct trials (right)

Table 1 Mean response times (correct trials)

Error RT

On error trials, a significant main effect of duration was found, b = -91.23, p < .001 (see Fig. 4). Incorrect responses were made faster in the long-duration condition (M = 518 ms) than the short-duration condition (M = 659 ms). There was also a significant main effect for mapping, b = -64.52, p = .01. Errors were made faster on incongruent trials (M = 620 ms) than congruent trials (M = 672 ms). The duration x congruency interaction effect was also significant for error trials, b = -33.13, p < .001. Post hoc tests showed that on congruent trials, errors were made faster in the long-duration condition than short-duration condition, b = -91.2, p < .001. Second, on incongruent trials errors were also made faster in the long-duration condition than the short-duration condition, b = -124.73, p < .001. Third, in the long-duration condition error RTs were faster when the stimulus was location-incongruent than when the stimulus was location-congruent, b = -64.5, p < .001. Lastly, in the short-duration condition error RTs were faster in the incongruent condition than the congruent condition, b = -31.1, p = .007.

Model fit

The model was fit to the data using the D*M method (Van den Bergh, Tuerlinckx, & Verdonck, 2020). The model was used to fit correct and error RTs across the eight experimental conditions: two stimulus conditions (square, circle); two durations (short, long); and two response mapping conditions (location-congruent, location-incongruent). All parameters were allowed to change freely across the different duration x mapping conditions; however, within each duration x mapping condition only the drift rate was allowed to vary. Table 2 shows the parameter estimates derived from the model.

Table 2 Diffusion model parameters

Discussion

In Experiment 1, participants completed a speeded decision task requiring them to identify a stimulus as a square or circle. Stimuli were presented either on the same side as the response key or on the opposite side. Additionally, stimuli were presented either very briefly (short condition) or for a longer duration (long condition). Participant data were then entered into a diffusion analysis and parameter estimates were derived. The discussion will be separated into two parts. First, the behavioral results will be discussed, followed by examination of the model parameters.

Participants made more errors when stimuli were presented for only 20 ms compared to 80 ms (Fig. 5). Consistent with the perceptual hypothesis, this finding confirms shape recognition was more difficult with shorter exposure duration. Participants also made more errors when the stimulus appeared on the opposite side of the display (S–R incongruent) from the response key. This result supports the stimulus–response hypothesis. Interestingly, the data did not reveal a duration x mapping interaction for accuracy. Thus, the accuracy results suggest that both perceptual and motor processes are involved in the present task, but these respective processes may operate independently.

Fig. 5

Swarmplot of participant accuracy

Unlike the accuracy results, the RT data did reveal a duration x mapping interaction effect. As shown in Table 1, participant RTs on correct trials showed little difference based on mapping position when stimuli were presented for longer. On short-duration trials, however, mean RT was significantly shorter when the stimulus was location-congruent with the response key. This pattern is consistent with the interaction hypothesis, suggesting participants relied more on spatial and motor processes when the quality of perceptual evidence extracted is reduced (i.e., by very brief visual exposure). The same stimulus–response bias was not present when the quality of perceptual input was improved (i.e., longer visual exposure). This important finding not only replicates the typical effect of stimulus–response biases, but clearly illustrates that such biases are influenced by perceptual features of stimuli. In sum, perceptual, spatial, and motor processes all interact to influence speeded decision behavior. When the quality of perceptual evidence is reduced, individuals rely more on stimulus–response biases when making quick decisions.

Response time patterns on error trials tell us more about how these respective processes affect decision-making. First, participants made errors faster when the stimuli were presented for longer. This pattern of behavior is consistent with previous implementations of the diffusion model (Ratcliff, Gomez, & McKoon, 2004; Voss, Rothermund, & Voss, 2004). The logic for this pattern is as follows: when a stimulus is relatively easy to identify, this results in a larger drift rate and a smaller range between the decision boundaries. This has two consequences. First, participants are less likely to make an error for such a stimulus. Second, since the decision boundaries come closer together, there is a greater probability of the drift crossing the incorrect decision boundary very early in the evidence accumulation process. In the present study, it is assumed that shapes presented for longer (80 ms) are easier to recognize. The participant does not require as much time to make a decision about the stimulus. Consequently, the participant is less likely to make an error on these easier trials, and if the participant does make an error, it is likely to occur relatively quickly.

Second, errors were generally faster on location-incongruent trials than location-congruent trials. This effect has important theoretical and practical implications for decision-making research. From a theoretical perspective, this pattern lends additional support to the finding that visuospatial mechanisms can interact with motor mechanisms to influence reaction time. However, behavior on correct trials and error trials shows opposing effects. When a stimulus is location-congruent with a response key, reaction time speeds up on correct trials but is slowed on error trials. The explanation for faster RTs on correct, location-congruent trials is straightforward. On these trials, visuospatial codes and motor codes do not conflict. Thus, there is no interference present to slow down response time. The pattern on error trials is less clear. To explain these results, we need to carefully deconstruct the present task. There are two sources of information that influence the motor response: what the stimulus is and where the stimulus is. If we choose a particular response key (“M”) as a reference, there are four possible combinations a target may appear in: (1) correct stimulus, congruent location, (2) correct stimulus, incongruent location, (3) incorrect stimulus, congruent location, and (4) incorrect stimulus, incongruent location. In combination 1, both sources of evidence are consistent with the response key. In combinations 2 and 3, one source of evidence is consistent with the response key (stimulus type, stimulus location, respectively) and one source of evidence is inconsistent. In combination 4, neither source of evidence is consistent with the particular response key. Upon closer examination, error RTs showed a tiered pattern across these combinations. Specifically, mean RT for errors was slowest when both sources of evidence were consistent (M = 707 ms).
Error RTs were slightly faster when the correct stimulus appeared but in the incongruent location on the screen (M = 697 ms). RTs decreased further when the incorrect stimulus was presented, but in the congruent location with the response key (M = 663 ms). Lastly, error RTs were quickest when the incorrect stimulus was presented in the incongruent location (M = 595 ms). This pattern may suggest decision times on error trials vary depending on the amount of evidence consistent with a particular response. Specifically, as more sources of evidence consistent with a particular response are present, the amount of time for the drift to reach the “error” boundary (Fig. 1) increases—reflecting that the participant needs more time to make a decision about the stimulus.

An alternative explanation for the pattern in error RT is that perceptual evidence and response mapping interact on error trials as well, but the manipulations used in the experiment were not strong enough to detect the effect. In the short-duration (20-ms) condition, error RTs were 44 ms faster on incongruent trials than congruent trials. However, in the long-duration (80-ms) condition, error RTs were 86 ms faster on incongruent trials. This may suggest that the additional evidence afforded by longer exposure to a stimulus adds time to the evidence accumulation process when a participant makes an error. This conclusion should be taken with caution, however. As shown in Fig. 5, participants displayed ceiling effects in the 80-ms-duration condition. Thus, the mean RTs on error trials in the 80-ms condition are derived from a relatively small amount of data. Future studies could test this possibility further by including shorter exposure durations, and perhaps including more than two levels of presentation time. A practical implication of the present results is that experiments using the keypress modality to make inferences about cognitive functions need to consider patterns on error trials as well as correct trials. It is common practice to omit error trials from RT analyses, but patterns in error trials can reveal more about the processing at play in the respective task.

Table 2 shows the output of the diffusion analysis for each model parameter. The parameters of interest for the present study are the drift rate (v); the boundary separation and starting point (a and z, respectively); and the non-decisional component of the RT (Ter). The drift rate is considerably larger in the long-duration condition (bottom two rows of Table 2), which reflects easier discriminability when the stimulus is presented for longer. This also explains why both correct responses and errors tended to be faster in the long-duration condition. However, the pattern in the drift rates is more complex when we consider the combined effects of stimulus duration and location. On short-duration trials, the drift rate is larger (2.04) when the stimulus is location congruent with the response key than when it is location incongruent (1.28). In contrast, the drift rates are similar between the mapping conditions on long trials. An advantage of the diffusion model is that parameters can be described in terms of specific psychological processes. Recall that the drift rate specifically reflects the decision component (i.e., extraction of perceptual evidence from the stimulus) of the present task, not encoding or motor processes. Thus, we can conclude that the larger S–R congruency effects shown on short-duration trials are partially the result of perceptual-decisional processes (reflected in the drift rate). However, we also need to consider the other parameters.

We can measure bias towards one of the decision boundaries by considering the starting point (z) relative to the distance between decision boundaries (a). Focusing on the long-duration condition, z/a = 0.31 on congruent trials and z/a = 0.28 on incongruent trials. This suggests participants display slightly greater responding bias when the shape appeared on the same side as the response key. However, the opposite pattern is shown in the short-duration condition, where z/a = 0.37 on congruent trials and z/a = 0.42 on incongruent trials. This unexpected result suggests that when a stimulus is presented for only 20 ms, participants showed greater response bias when the stimulus was displayed on the opposite side from the response key. Additionally, Table 2 shows that non-decision processes (Ter) had a larger effect in the incongruent, short-duration condition relative to the other experimental conditions. This finding is important since it suggests differences in participant performance across these duration x visibility manipulations may not be solely due to perceptual/cognitive processes, but also partially to non-decisional processes (i.e., encoding and/or motor processes).
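To make the roles of the four parameters concrete, the following is a minimal simulation sketch of a two-boundary diffusion process. It is not the fitting procedure used in the study. The drift rates (2.04 and 1.28) are the short-duration values reported above; the boundary separation `A`, the non-decision time `TER`, the neutral starting point (`z_frac = 0.5`), the within-trial noise `s = 1.0`, and all function names are hypothetical choices made only for illustration.

```python
import random


def simulate_trial(v, a, z_frac, ter, rng, dt=0.001, s=1.0):
    """Simulate one diffusion trial; return (hit_upper_boundary, rt_in_seconds)."""
    x = z_frac * a            # starting point z, expressed as a fraction of a
    t = 0.0
    step_sd = s * dt ** 0.5   # standard deviation of the noise added per time step
    while 0.0 < x < a:
        x += v * dt + rng.gauss(0.0, step_sd)
        t += dt
    # RT = decision time plus the non-decisional component (Ter)
    return x >= a, t + ter


def simulate_condition(v, a, z_frac, ter, n_trials=2000, seed=1):
    """Return (proportion of upper-boundary responses, mean RT) for one condition."""
    rng = random.Random(seed)
    trials = [simulate_trial(v, a, z_frac, ter, rng) for _ in range(n_trials)]
    p_upper = sum(hit for hit, _ in trials) / n_trials
    mean_rt = sum(rt for _, rt in trials) / n_trials
    return p_upper, mean_rt


A, TER = 1.0, 0.25  # hypothetical values, not estimates from Table 2

congruent = simulate_condition(v=2.04, a=A, z_frac=0.5, ter=TER)
incongruent = simulate_condition(v=1.28, a=A, z_frac=0.5, ter=TER)
print("congruent   (v = 2.04):", congruent)
print("incongruent (v = 1.28):", incongruent)
```

Holding `a`, `z_frac`, and `ter` fixed isolates the contribution of the drift rate. Passing a different `z_frac` (for example, one of the z/a ratios reported above) or a different `ter` per condition would show how starting-point bias and non-decisional time shift the predicted response proportions and RTs independently of drift.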

Figure 6 illustrates the observed response quantiles plotted against the model-predicted quantiles for both correct and error trials across the 8 experimental conditions (2 duration x 2 mapping x 2 shape). As can be seen, the model predictions came very close to the observed data. Quantile plots provide a qualitative picture of the model fit; however, the D*M approach obtains parameter estimates by minimizing the chi-square difference between the observed data distribution and the model distribution for each condition-response pair and then summing the chi-square values across all condition-response pairs. Results showed a very close fit between the model predictions and the observed data, χ² = 2.89. Overall, the results of the diffusion analysis suggest (1) the diffusion model can adequately account for decisional biases elicited by spatial-response mapping manipulations, and (2) decisional biases are not reflected in a single process. Instead, we see evidence of differences reflected in decisional processes (i.e., drift rate), the starting point of the decision process (z), and non-decisional processes (Ter), consistent with the interaction hypothesis.
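The summed chi-square objective described above can be sketched as follows. This is a generic Pearson-style version computed over binned RT counts, offered under the assumption that each condition-response pair is summarized as observed bin counts and model-predicted bin proportions; the exact distance used by the D*M procedure may be defined differently. The function names and the example numbers are hypothetical.

```python
def chi_square(observed_counts, predicted_props):
    """Pearson chi-square between observed bin counts and predicted bin proportions."""
    n = sum(observed_counts)
    total = 0.0
    for obs, prop in zip(observed_counts, predicted_props):
        expected = n * prop
        total += (obs - expected) ** 2 / expected
    return total


def summed_chi_square(pairs):
    """Sum chi-square values across all condition-response pairs."""
    return sum(chi_square(obs, props) for obs, props in pairs)


# Hypothetical bins defined by the .1, .3, .5, .7, .9 RT quantiles of the model
PROPS = [0.1, 0.2, 0.2, 0.2, 0.2, 0.1]

pairs = [
    ([10, 20, 20, 20, 20, 10], PROPS),  # observed counts match the prediction
    ([12, 18, 20, 20, 20, 10], PROPS),  # small mismatch in the two fastest bins
]
total = summed_chi_square(pairs)
print(total)
```

A fitting routine would adjust the model parameters, recompute the predicted proportions for every pair, and keep the parameter set that yields the smallest summed value.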

Fig. 6

Model fits for each duration x mapping x shape combination on both correct and error trials; “con” = congruent; “inc” = incongruent

Experiment 2

The findings of Experiment 1 show that S–R congruency effects are modulated by the perceptual quality of evidence afforded by a stimulus. However, presentation duration is not the only feature of a stimulus that may influence the perceptual quality of evidence needed to make an appropriate response, and it should not be assumed that all ways of degrading perceptual evidence elicit the same pattern of decisional biases. A large body of research has shown that stimulus size can produce adverse effects on visual object recognition (Craddock & Lawson, 2009; Graf, 2006; Te Pas et al., 1996; Tomkins, 2021), and stimulus size may also influence S–R congruency biases. For instance, a recent report (Wühr & Richter, 2022) showed S–R congruency effects are modulated by the relative size of shape stimuli, with faster responses to smaller targets on the left and larger targets on the right. At present, however, it is unclear whether stimulus size manipulations specifically affect perceptual, motor, or a combination of both processes in speeded decisions. To examine the generalizability of the findings of Experiment 1, Experiment 2 examined the relationship among perceptual, decisional, and response processes further by manipulating stimulus size rather than duration.

Methods

Participants

Thirty-three students from the College of Saint Benedict and Saint John’s University participated in Experiment 2. Prescreening was conducted to ensure that all participants had normal or corrected-to-normal vision, and no history of visual impairments. Participants received course credit for participation.

Materials and procedure

The procedure in Experiment 2 was kept identical to Experiment 1 with the following exception: instead of manipulating the presentation time of stimuli, the size of stimuli was manipulated. Two viewing conditions were created. On half the trials, the image was 1.5 cm x 1.5 cm in size (same as Experiment 1). On the other half of trials, the image was decreased to 0.75 cm x 0.75 cm in size, subtending \(\sim 0.6^{\circ }\) vertically and horizontally. The inner edges of both large and small stimuli were fixed at 2 cm from the center of the screen. In Experiment 2, presentation duration was fixed at 80 ms for all stimuli.
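For readers who want to relate the physical stimulus sizes to visual angle, the standard conversion is sketched below. The viewing distance is not stated in this section; the 72 cm used here is an assumption, chosen only because it approximately reproduces the reported ~0.6° for the 0.75 cm stimulus. The function name is hypothetical.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


VIEWING_DISTANCE_CM = 72.0  # assumed value, not reported in this section

small = visual_angle_deg(0.75, VIEWING_DISTANCE_CM)
large = visual_angle_deg(1.5, VIEWING_DISTANCE_CM)
print(f"small: {small:.2f} deg, large: {large:.2f} deg")
```

At these small angles the relationship is nearly linear, so halving the stimulus width approximately halves the visual angle.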

Results

Behavioral results

Prior to analysis, outliers were removed from the raw data (1.4% of trials). Subsequently, linear mixed-effects regressions were conducted separately for participant RT on correct trials and error trials including stimulus size (large, small) and response mapping (location-congruent, location-incongruent) as fixed factors. A logistic mixed-effects regression was conducted to examine proportion of errors.

Accuracy

Results showed a significant main effect of stimulus size, OR = 0.59, CI = 0.50, 0.70, p < .001 (see Fig. 8). Participants were more likely to make an error for small targets (P = .10) than large targets (P = .06). The main effect of response mapping was not significant, OR = 0.90, p = 0.23, and no interaction effect was observed, OR = 1.06, p = 0.61.
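As a rough check on how the reported odds ratio relates to the error proportions, the raw odds ratio can be computed directly from the two probabilities. This sketch assumes the reported OR compares large targets with small targets; the model-based estimate (0.59) comes from a mixed-effects regression, so the raw value is expected to be close but not identical. The function names are hypothetical.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def odds_ratio(p_a, p_b):
    """Odds of an error in condition A relative to condition B."""
    return odds(p_a) / odds(p_b)


# Error probabilities reported above: large targets .06, small targets .10
raw_or = odds_ratio(0.06, 0.10)
print(round(raw_or, 3))
```

A value below 1 indicates that the odds of an error were lower for large targets than for small targets, which matches the direction of the reported effect.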

Correct RT

For correct trials, a significant main effect of stimulus size was shown, b = -22.01, p < .001 (see Fig. 7). Response times were shorter for large stimuli (M = 482 ms) than small stimuli (M = 511 ms). No effect of response mapping was found, b = -2.36, p = 0.25, and the interaction effect was not significant, b = 5.24.

Fig. 7

Swarmplot of participant response times on error trials (left) and correct trials (right)

Error RT

On error trials, a significant main effect of stimulus size was observed, b = -50.01, p < .001 (see Fig. 7). Participants made errors faster for large targets (M = 575 ms) than small targets (M = 596 ms). The main effect of response mapping was also significant, b = -29.43, p = .002, with participants making errors faster on incongruent trials (M = 557 ms) than congruent trials (M = 621 ms). The size x mapping interaction was not significant, b = -11.79, p = 0.24.

Model fit

The diffusion model was fit to the data using the same procedure as Experiment 1. Parameter estimates from the model are shown in Table 4.

Discussion

In contrast to Experiment 1, results from Experiment 2 support the perceptual hypothesis. Participants made faster and more accurate responses for large targets than small targets (Table 3). However, no effects of response mapping were observed in either participant accuracy or correct response times. This pattern of results suggests participants relied primarily on perceptual evidence in the task. The larger stimulus provides greater perceptual quality of evidence, resulting in faster and more accurate judgments. This finding is intriguing since it suggests not all factors that decrease perceptual quality of evidence influence speeded decisions in the same way. When perceptual evidence is poor due to brief presentation (as in Experiment 1), individuals seem to rely on spatial-motor codes to compensate. In contrast, when perceptual evidence is poor due to decreasing the size of a stimulus, participants did not display S–R biases. The absence of a stimulus size x location interaction contrasts with recent findings (Wühr & Richter, 2022) where such an interaction was observed.

Table 3 Mean response times (correct trials)

It is important to note that participant accuracy was very high in Experiment 2 for both large and small stimuli (see Fig. 8). Thus, one must be cautious about concluding that smaller stimuli do not produce stimulus–response biases. It is possible, for instance, that stimuli in the small condition were not small enough to produce stimulus–response biases. Alternatively, biases may be observed if smaller stimuli are presented for a shorter amount of time. That participants were slower and slightly less accurate for smaller stimuli indicates the perceptual quality of evidence was degraded, but not so much that responding biases were elicited. Additional studies decreasing stimulus size and/or duration further are needed before it can be determined (with confidence) whether smaller stimuli elicit location-congruency effects or not.

Fig. 8

Swarmplot of participant accuracy

Once again, results showed a close fit between the model predictions and the observed data, χ² = 4.30 (Fig. 9). The drift rates show a different pattern than those shown in Experiment 1. We see larger drift rates for small stimuli (top two rows of Table 4) than the briefly presented stimuli (top two rows of Table 2). This reflects easier discriminability for small stimuli presented for 80 ms than larger stimuli presented for 20 ms (Experiment 1). Looking at the Ter parameter, we see a smaller value on incongruent trials for small stimuli (0.19) than the value shown on incongruent trials for briefly presented stimuli (0.25) in Experiment 1. This suggests executive processes had a smaller influence on performance when stimulus size was manipulated relative to stimulus presentation duration. To express this pattern in psychological terms: nondecisional processes (i.e., encoding, motor processing) had a greater influence on individual performance when stimulus duration was manipulated than when stimulus size was manipulated. This helps explain why we observe S–R congruency effects in Experiment 1, but not Experiment 2.

Fig. 9

Model fits for each size x mapping x shape combination on both correct and error trials; “con” = congruent; “inc” = incongruent

Table 4 Diffusion model parameters

Experiment 3

Experiment 1 showed that participants displayed S–R congruency effects only when stimuli were presented for a brief amount of time. These S–R congruency effects would be expected given the design of the experiment. Participants used a key positioned to their right to make responses to one stimulus and a key positioned to their left to respond to a second stimulus. That is, the experiment was designed to elicit S–R congruency effects. However, in most cases researchers using lateral presentation of stimuli wish to avoid S–R congruency biases, rather than elicit them. To achieve this, it is common to orient response keys vertically rather than horizontally, or to counterbalance response hand or finger. The assumption is that by removing (or counterbalancing) the horizontal mapping of response keys, participants will not form associations between a response key and a particular location on the experimental display. The goal of Experiment 3 was to test this assumption directly. Participants made a choice decision for shapes presented for either 20 or 80 ms. However, response keys were positioned vertically and along the center of the display. If the above assumption is correct, the S–R congruency effects observed in Experiment 1 should be reduced or eliminated with this response mapping.

Methods

Participants

Forty students from the College of Saint Benedict and Saint John’s University participated in Experiment 3. Prescreening was conducted to ensure that all participants had normal or corrected-to-normal vision, and no history of visual impairments. Participants received course credit for participation.

Materials and procedure

The procedure in Experiment 3 was kept identical to Experiment 1 with the following exception: response keys were positioned vertically with respect to the display screen rather than horizontally. Furthermore, response keys were positioned in alignment with the center of the screen. Each participant was randomly assigned to make responses with either the left (n = 20) or right (n = 20) hand (see Fig. 10). On trials when a “circle” was presented, participants pressed the “2” key on the keyboard with their index finger. On “square” trials, participants pressed the “5” key with the middle finger. It is important to note that the two response keys still maintain a relative position. For those participants using the right hand to respond, the “square” key is positioned to the canonical right of the “circle” key. In contrast, the “square” key is positioned to the canonical left for those participants responding with the left hand.

Fig. 10

Response mapping assignments used in Experiment 3

Results

Behavioral results

Prior to analysis, outliers were removed from the raw data (2.7% of trials). Next, linear mixed-effects regressions were conducted separately for participant RT on correct trials and error trials including stimulus duration (short, long), response mapping (location-congruent, location-incongruent) and response hand (left, right) as fixed factors. A logistic mixed-effects regression was conducted to examine proportion of errors.

Accuracy

Results showed a significant main effect of duration, OR = 0.05, CI = 0.04, 0.06, p < .001 (see Fig. 12). Participants were more likely to make an error on short (P = 0.36) than long-duration trials (P = .03). A significant main effect of response mapping was also found, OR = 0.59, CI = 0.42, 0.83, p = .003. The probability of an error was larger on incongruent trials (P = 0.21) than congruent trials (P = 0.17). The main effect of response hand was not significant, OR = 0.64, p = .13. A significant hand x duration interaction was found, OR = 1.69, CI = 1.17, 2.45, p = .005. On short-duration trials, participants were more likely to make an error when using the left hand (P = 0.36) than the right hand (P = 0.34), but no difference was observed on long-duration trials. A significant hand x response mapping interaction was also found, OR = 1.60, CI = 1.01, 2.53, p = .04. On incongruent trials, participants were more likely to make an error with the left hand (P = 0.22) than the right hand (P = 0.20). Neither the duration x congruency interaction, OR = 1.22, p = .28, nor the hand x duration x congruency interaction, OR = 0.69, p = .13, was significant.

Correct RT

For correct trials, a significant main effect of stimulus duration was observed, b = -157.87, p < .001 (see Fig. 11), with shorter response times on long-duration trials (M = 493 ms) than short-duration trials (M = 660 ms). The main effect of response hand, b = 5.41, p = .11, was not significant. A significant main effect of response mapping was shown, b = 6.2, p < .001, with faster correct responses on congruent trials (M = 555 ms) than incongruent trials (M = 563 ms). A significant duration x congruency interaction was found, b = 11.25, p < .001. To further examine this interaction effect, a post hoc (Tukey) test was conducted. On incongruent trials, response times were shorter in the long-duration condition than the short-duration condition, b = -170.24, p < .001. On congruent trials, response times were also shorter in the long-duration condition than the short-duration condition, b = -157.34, p < .001. On short-duration trials, response times were shorter in the congruent mapping condition than the incongruent condition, b = -17.5, p < .001. However, no significant difference based on mapping was found for long-duration trials, b = -3.05, p = .93.

Fig. 11

Swarmplot of participant response times on error trials (left) and correct trials (right)

Error RT

For error trials, neither the main effect of duration, b = 7.84, p = .39, nor the main effect of response hand, b = 11, p = .27, was significant (see Fig. 11). However, the main effect of response mapping was significant, b = -92.35, with participants making errors faster on incongruent trials (M = 671 ms) than congruent trials (M = 702 ms). Additionally, there was a significant duration x congruency interaction effect, b = 79.61, p < .001. Post hoc tests showed that on long-duration trials, errors were faster on incongruent trials than congruent trials, b = -92.35, p < .001. Interestingly, error RTs were not significantly different on short-duration trials, b = 12.73, p = .26. Lastly, in the incongruent condition errors were made faster on long-duration trials than short-duration trials, b = -87.41, p < .001.

Discussion

As in Experiment 1, the results of Experiment 3 support the interaction account. Despite arranging response keys centrally and vertically with respect to the display, participants displayed a response mapping effect on correct RTs only on short-duration trials (i.e., a duration x mapping interaction). As shown in Table 5, participant response times on correct trials showed a similar pattern to those observed in Experiment 1 (see Table 6 for estimates of model parameters).

Table 5 Mean response times (correct trials)
Table 6 Diffusion model parameters

Unexpectedly, the results for participant accuracy showed that response hand interacted with both stimulus duration and location-congruency. Although statistically significant, these findings should be interpreted with caution. The purpose of randomizing participant response hand was to help control for general hand-related differences in the accuracy or response time data. As shown in Fig. 12, the effects are modest and there is a large amount of variability in accuracy on short-duration trials. As such, it is questionable whether these interactions reflect real effects. Future studies including a larger sample size could help determine whether response hand truly interacts with stimulus duration and location. Nevertheless, the main effects of duration and congruency on participant accuracy are more clearly illustrated. This suggests both seeing the stimulus for longer and having the stimulus appear in the same location as the response key improved accuracy. Once again, the model estimates showed a close fit with the participant data, χ² = 4.28 (Fig. 13).

Fig. 12

Swarmplot of participant accuracy

Fig. 13

Model fits for each duration x mapping x shape combination on correct and error trials; “con” = congruent; “inc” = incongruent

General discussion

The present findings have both theoretical implications for decision behavior and practical implications for research analyzing response time data. Stimulus–response congruency effects have a long history in cognitive psychology. It has been well-established that, when presented with a stimulus, speeded decisions tend to be faster and more accurate when the stimulus and response share common features, such as being positioned in close proximity to each other. Location x speed interactions suggest that information provided via spatial attention mechanisms and behavior mechanisms can be shared or integrated in the mind. The present findings expand on this phenomenon by showing that these overlapping S–R codes are modulated by the quality of the perceptual representation afforded by a stimulus. When perceptual evidence is poor, individuals rely more heavily on spatial-motor codes; that is, the relative location of the stimulus and response key had a stronger influence on participants’ decisions. In contrast, when the perceptual evidence was improved, spatial-motor congruency biases were eliminated. In effect, it seems the “stimulus” component of the stimulus–response congruency effect can be separated further into two subcomponents: spatial and perceptual. Individuals use both perceptual and spatial information about a stimulus when making speeded decisions; however, these features of a stimulus have distinct effects on decision behavior.

Results from these three experiments raise problems for some models of S–R congruency effects. First, the observation that S–R congruency effects were found only for brief stimulus presentations contradicts the dimensional overlap processing model (Kornblum & Lee, 1995). This model proposes that S–R congruency effects are the direct result of overlapping features between stimuli and responses. “(W)e propose that SRC is the direct consequence of the degree to which the stimulus and response sets of a stimulus–response (S–R) ensemble are perceptually, conceptually, or structurally similar” (Kornblum & Lee, 1995, p. 855). However, in the present study, dimensional overlap (i.e., number of shared features) was constant between duration conditions. Identical stimuli were presented in the same locations, the position of response keys was kept identical, the stimulus identity was always task-relevant and the stimulus position was always task-irrelevant. If S–R congruency effects are solely due to the degree of overlapping features, S–R congruency biases should be observed across both duration conditions. Second, the results of Experiment 3 clearly show the location of response keys does not need to coincide (in external space) with the position of the stimuli to produce S–R congruency effects. In other words, the stimulus–response arrangement need not be left-left or right-right. If this were so, S–R congruency effects should not have been observed in Experiment 3. On the contrary, the findings of Experiment 3 suggest all that is required to produce S–R congruency effects is that response keys maintain a fixed position relative to each other. It is important to note that in Experiment 3, the stimulus–response mappings are arbitrary. There is no inherent difference whether the response key for “square”, for instance, is positioned above or below the response key for “circle” or vice versa. 
Participants should not be disposed to make responses for a particular stimulus on either side of the display. Nonetheless, in this situation participants seem to mentally map each respective response key to a particular location on the display screen. Third, the present findings show that S–R congruency effects are not inevitable—even when stimuli and responses are location-congruent. Participants did not display S–R congruency biases for shapes presented for longer in any of the three experiments. Only when presentation duration was shortened were S–R congruency effects observed, suggesting that S–R congruency effects cannot be explained exclusively with respect to the degree of overlapping features. Rather, the present findings suggest S–R congruency effects are contingent on task difficulty. As task difficulty increases, S–R congruency effects seem to be enhanced. Any model of S–R congruency effects must be able to account for these patterns.

Findings of Experiment 3 also provide important implications concerning experimental design and procedures; primarily for those studies requiring participants to make manual responses to laterally presented stimuli. It has been assumed that orienting response keys vertically and/or counterbalancing response hand across participants effectively eliminates S–R congruency effects for laterally presented stimuli. The present study systematically tested both of these countermeasures, yet S–R congruency effects were still observed both in participant accuracy and response times. The clear implication is that vertical orientation of response keys and/or counterbalancing response hand is insufficient to eliminate S–R congruency biases. Even if response keys are positioned orthogonal to a display, they maintain a relative, canonical position that a participant may mentally map to particular sides of a display. For those participants in Experiment 3 using the right hand to respond, the “square” key was located to the canonical right of the “circle” key. Indeed, these participants made faster responses when a square appeared on the right side of the screen (M = 549 ms) than the left (M = 561 ms), whereas the participants using the left hand showed faster responses when the square appeared on the left side of the screen (M = 558 ms) than the right (M = 570 ms). This poses a problem for experiments utilizing manual responses to laterally presented stimuli. In such situations, it may be preferable to use a single response key and implement a go/no-go procedure. In any experimental paradigm where a manual response is required of a participant, response times and accuracy data inevitably reflect the summation of all processes involved in the task. It can never be certain whether differences in data reflect only the effect(s) of interest to the researcher. 
Furthermore, counterbalancing response hand across participants and comparing grouped averages may phase out differences statistically, but this does not address the underlying problem. For cognitive experiments using a manual response (i.e., keypress) to record participant accuracy and reaction times, it is commonly assumed that the motor aspect of the task is constant across conditions. Thus, any differences observed across experimental conditions must be due to cognitive processes. The present modeling results bring this assumption into question, suggesting that executive processes may vary depending on stimulus location, stimulus presentation duration, response key position, and possibly response hand. The size, location, and presentation duration of visual stimuli may influence a participant to rely more, or less, on response biases. In sum, it should not be assumed that executive processes, unrelated to the cognitive effect of interest, are constant across experimental conditions.

Although the present findings clearly show that S–R congruency effects are influenced by the presentation duration of a stimulus, this pattern was not observed when stimulus size was manipulated (Experiment 2). However, this null result does not imply that S–R congruency effects are unaffected by stimulus size. As shown in Fig. 8, participants showed accuracy ceiling effects for both large and small stimuli. It is possible the stimuli in the “small” condition were not small enough to elicit S–R congruency effects. In other words, the perceptual quality of evidence afforded by the smaller figures was not weak enough to pressure participants to compensate with S–R congruency biases.

The modeling results from the present study suggest that (1) the diffusion model can adequately account for participant performance across different presentation and response mapping manipulations, and (2) the S–R congruency effects observed in the present study are not solely due to decisional aspects of the task (reflected by the drift rate); rather, such biases reflect both decisional and executive (i.e., encoding, motor) processes. The effects of different experimental manipulations on the various parameters of the diffusion model have been studied for many years. In addition to assessing whether the model can account for patterns in participant behavior, there is strong interest in examining which parameters vary, and how they vary, in response to specific manipulations (Ratcliff, 2013). Changes in different parameters have different implications for the processes that produced the differences in behavioral data. For instance, Wagenmakers, Ratcliff, Gomez, and McKoon (2008) observed criterion shifts in the starting point parameter in response to the relative probability of a specific target. This suggests participants display a pre-decisional bias towards a more probable stimulus compared to a less probable stimulus. Other researchers have shown similar shifts in the starting point boundary based on stimulus probability (Kang, De Boeck, & Ratcliff, 2022; Mulder, Wagenmakers, Ratcliff, Boekel, & Forstmann, 2012). Additional studies have tested the diffusion model in the opposite direction, namely, assessing how well the model can account for participant data when varying model assumptions. For instance, it has been observed that collapsing decision boundaries produces no better fit to data than a model where decision boundaries are assumed to be fixed (Voskuilen, Ratcliff, & Smith, 2016) and that assuming within-trial variability is necessary to account for perceptual decision-making data (Kang & Ratcliff, 2020).
Other manipulations may affect the drift rate, but not the starting point parameter (Ratcliff et al., 2004), or both the drift rate and boundary positions (Diederich & Busemeyer, 2006). Of particular relevance to the present study is the effect of consistent vs. conflicting information on model parameters in decision tasks. Using a diffusion model approach with the flanker task, White, Ratcliff, and Starns (2011) observed that response biases were captured by the starting point, z, speed/accuracy trade-offs were reflected in the boundary separation and nondecisional time parameters, and congruency effects were described by changes in attention parameters. The present results are consistent with these previous observations shown with the flanker task. The starting point of the evidence accumulation process was shown to vary between congruent and incongruent conditions. Additionally, the distance between the decision boundaries was generally smaller in the short-duration condition than the long-duration condition, suggesting the larger error rate on short-duration trials is not exclusively due to worse extraction of perceptual evidence (i.e., reflected by smaller drift rate) but also different decision thresholds between the two duration conditions. In other words, biases are reflected in both the drift rate and decisional boundaries (Diederich & Busemeyer, 2006; Leite & Ratcliff, 2011). The present findings also bring new information to the modeling literature by showing how stimulus presentation time influences decisional and nondecisional parameters of the drift-diffusion model. In sum, the presentation duration of a stimulus influences both decisional and nondecisional parameters in decisional tasks, and does so in an interactive manner with respect to stimulus–response mapping assignments.