Attention and reward are both fundamental cognitive processes capable of influencing behavioral task performance. These processes were long studied in isolation (Liu et al., 2011; Näätänen, 1990; Petersen & Posner, 2012; Schultz, 2010), but there is increased interest in uncovering how these critically important cognitive processes interact with each other in the brain to influence behavior (Anderson & Kim, 2019; Awh et al., 2012). Most of this work has been in the visual domain, focusing on the effects of reward on visual attention processes (Hickey et al., 2010; Hickey & van Zoest, 2013; Kim & Anderson, 2019; Walsh et al., 2019). In contrast, the effects of reward on auditory processing and auditory attention have been relatively understudied. Expanding our knowledge of reward-attention interactions beyond the visual domain is critical for understanding how these fundamental cognitive processes work together to shape behavior in our multisensory world. We investigated the effects of reward associations on the neural cascade of auditory stimulus processing in a classic auditory attention paradigm, namely the auditory oddball task.

It is well established that attention enhances the neural processing of auditory stimuli. Classic event-related potential (ERP) studies of spatial auditory attention demonstrate that this enhancement begins in the early stages of sensory processing, with target-associated tones eliciting a larger-amplitude frontocentral negativity than nontarget tones (the N1 component, starting ~85-ms post-stimulus onset; Hillyard et al., 1973; Knight et al., 1981). Selective attention to auditory stimuli, such as attending to stimuli of a given pitch or duration among a stream of tones, produces a prolonged negativity effect (the Nd) starting around the peak of the N1 and moving frontally through its later phase 300- to 400-ms post-stimulus onset (Hansen & Hillyard, 1980, 1983). Selective auditory attention to relatively rare auditory targets in a stream of tones also results in enhanced neural processing of the target-associated tones (see review by Herrmann & Knight, 2001). For instance, in auditory oddball paradigms participants attend to lower-probability target tones of a given frequency randomly embedded in a stream of higher-probability nontarget, standard tones of another frequency. “Oddball” target tones elicit a second frontocentral negative-going activity in quick succession after the N1, known as the mismatch negativity (MMN). MMN activity reflects the detection of stimuli that deviate from a standard template (Giard et al., 1995; Näätänen et al., 1993; Woldorff et al., 1998). The N1 and MMN are generated in auditory cortex regions on the bottom bank of the Sylvian fissure/sulcus, producing dipolar sources oriented perpendicular to that sulcus and pointing toward frontocentral scalp sites (for a broader overview of auditory ERPs and their topography see Winkler et al., 2015).

Attention also enhances later stages of stimulus processing, with target-associated tones eliciting greater activity than nontarget standards in a frontocentral negativity (N2, ~200 ms) followed by a large-amplitude, long-latency, longer-duration central-parietal positivity (P3b, 300-600 ms; Bonala & Jansen, 2012; Nasman & Rosenfeld, 1990; Sutton et al., 1965). This later-stage, positive-polarity P3 activity is thought to reflect the final stages of evaluating and classifying a behaviorally relevant target stimulus, with forms of it evoked by target-associated stimuli regardless of modality (Grillon, 1990; Katayama & Polich, 1999; Squires et al., 1975; see also Fritz et al., 2007; Näätänen, 1990; Shinn-Cunningham, 2008 for additional ways attention can modulate auditory stimulus processing).

A few studies to date have characterized the effects of reward on auditory stimulus processing. Prior behavioral studies have shown associating reward with auditory stimuli can bias attentional selection, with previously reward-associated auditory stimuli involuntarily capturing attention and/or interfering with behavior on top-down goal-directed auditory tasks (Asutay & Västfjäll, 2016; Kim et al., 2021) and on cross-modal visual tasks (Anderson, 2016). In a series of ERP experiments by Folyi et al. (Folyi et al., 2016; Folyi & Wentura, 2019), participants learned to associate certain auditory tones with monetary gain (positively valenced tones) or monetary loss (negatively valenced tones). These tones were later presented in a task-irrelevant channel concurrently while participants performed an auditory perceptual task presented via the task-relevant channel. The now task-irrelevant tones previously associated with positive or negative valence captured participants’ attention, as reflected by their elicitation of higher-amplitude N1 waves compared with those evoked by neutral tones. Interestingly, N1 amplitudes were not influenced by the direction of the valence associations (positive or negative). This finding shows valence associations can enhance early sensory processing of those auditory stimuli, in line with similar findings observed in the visual domain (Kiss et al., 2009).

In the present study, we investigated if the magnitude of monetary rewards affected the early sensory processing of task-relevant auditory target stimuli. We hypothesized auditory targets affiliated with higher amounts of monetary gain would show enhanced amplitude N1 waves compared with targets associated with lower amounts of monetary gain. Furthermore, we hypothesized this enhancement for highly rewarded targets would ramify forward through the neural cascade related to processing these targets, resulting in enhanced processing of high-reward targets compared with low-reward targets at later stages of processing as well.

Methods

Participants

Participants were 24 healthy, young adults (15 females, 23 right-handed), aged 18-25 years (M: 19.67 ± 0.35 years). Participants were recruited from Duke University and the surrounding community. Participants reported having normal hearing and reported no history of neurological disorders or diseases known to affect cognition. Prior to initiating any experimental procedures, participants gave informed consent in accordance with Duke University’s Institutional Review Board policies. Participants recruited through the Duke University psychology subject pool received course credit for their participation; participants recruited outside that pool were compensated at a rate of $15 per hour. All participants received a reward-related monetary bonus based on their task performance (details below, ~$17 per participant).

Auditory oddball task

Participants performed an auditory oddball task consisting of two infrequently presented target tones that deviated in pitch from a frequently presented nontarget “standard” tone. Participants were instructed to press a button each time either of the two target tones was presented (1 button used; right index finger button). The nontarget standard tones were of a middle pitch (1000 Hz; 70% of tones), and the two target tones, a higher-pitched tone (1080 Hz, 1090 Hz, 1100 Hz, 1110 Hz, or 1120 Hz, see below; 15% of tones) and a lower-pitched tone (900 Hz; 15% of tones), occurred infrequently. Because lower-pitched tones are sometimes easier for people to hear than higher-pitched tones, we used participants’ detection reaction times during the pre-experiment task practice to determine which of the candidate higher-pitch frequencies listed above yielded detection response times equivalent to those for the lower-pitch target tone, decreasing the pitch of the higher-pitch tone as needed. Based on the practice results, a single frequency was selected for each participant and subsequently used as the high-pitched target tone. Tones were presented via two external computer speakers. The tones were 100-ms pure sine waves with 10-ms rise and fall times. The interstimulus interval was randomly jittered between 800 and 1000 ms. Participants performed the task in 2.5-min blocks with a short break between blocks. Participants were presented with a series of 150 tones per block. The tone stimulus types were presented during the task blocks in random order.
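
As a rough illustration of the block structure described above, the following sketch draws one 150-tone block. It assumes that each tone type is sampled independently with the stated probabilities and that the jitter is uniform; the actual randomization scheme is not specified in the text, and all function and variable names are hypothetical.

```python
import random

# Stimulus probabilities as stated in the task description.
TONE_PROBS = {"standard": 0.70, "high_target": 0.15, "low_target": 0.15}

def make_block(n_tones=150, isi_range_ms=(800, 1000), seed=None):
    """Return a list of (tone_type, isi_ms) pairs for one task block.

    Assumption: tone types are drawn independently and the interstimulus
    interval is uniformly jittered; the source does not say how the
    randomization was actually implemented.
    """
    rng = random.Random(seed)
    types = rng.choices(list(TONE_PROBS), weights=list(TONE_PROBS.values()), k=n_tones)
    return [(tone, rng.uniform(*isi_range_ms)) for tone in types]

block = make_block(seed=1)
```

With 100-ms tones and a mean interval near 900 ms, 150 tones last about 150 s, which roughly matches the 2.5-min block length if the interval is measured from tone offset to the next onset.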

Experimental session phases and stimulus-reward assignments

Participants performed the auditory oddball task under a fixed sequence of task condition phases, including phases without reward and phases with different reward conditions. Participants were fully instructed about all reward/no-reward conditions prior to testing. Participants first completed two blocks of the auditory oddball task without reward (baseline phase). Baseline phase performance was used to determine each participant’s mean accuracy and mean reaction time for behaviorally detecting the low-pitch and high-pitch target tones in the absence of any stimulus-reward associations. To determine a threshold reaction time for gaining a reward on the subsequent rewarded phases, each individual’s mean baseline reaction time (calculated separately for each of the two target tones), plus 100 ms, was used. For the first reward phase (9 blocks), participants were instructed they would gain a monetary reward each time they accurately detected a target tone and were faster than their threshold reaction time. One target tone type (i.e., high-pitch or low-pitch) was assigned to have a low-reward potential and the other to a high-reward potential. These reward assignments to the target tones were counterbalanced across participants, with half of participants assigned to counterbalance order A with high reward first associated with high-pitched tones and low reward with low-pitched tones, and the other half of participants assigned to counterbalance order B with high reward first associated with low-pitched tones and low reward with high-pitched tones. For low-reward targets, participants were instructed they would receive 1 point for every millisecond faster their reaction time was for detecting that tone compared to their threshold response time. For the high-reward targets, participants were instructed they would receive 10 points for every millisecond faster they were in their responses compared with their threshold reaction time. 
Every 400 points translated to $0.01 monetary gain. To discourage false alarms, participants were instructed they would lose 50 cents for each block where they had more than 6 false-alarm button-presses to the standard tones. After each block, participants were informed if a penalty had been assessed, how many points and dollars they had accrued for that block, and how many points and dollars they had accrued so far in the session. After performing the 9 blocks of the first reward phase, participants completed 4 blocks of the task with no reward (extinction phase 1). This no-reward phase was used to help behaviorally extinguish the stimulus-reward associations. Participants then completed a second reward phase (9 blocks). During the second reward phase, the target stimulus-reward assignments were flipped for each subject (counterbalance order A participants now assigned as high reward associated with low-pitched tone, order B now assigned as high reward associated with higher-pitched tone). Flipping the high- and low-reward assignments allowed us to maintain a within-subject design for examining the behavioral and neural effects of stimulus-reward assignments, as both the high- and low-pitched target tones were thus each assigned to each reward condition for each participant. Finally, participants completed 2 additional blocks of the task with no reward (extinction phase 2).
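
The reward rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' task code: the function names are hypothetical, and the text does not say whether fractional milliseconds were rounded or exactly how the false-alarm penalty was applied to the running total.

```python
def reward_threshold_ms(mean_baseline_rt_ms):
    """Threshold RT: the participant's mean baseline RT for that tone plus 100 ms."""
    return mean_baseline_rt_ms + 100.0

def trial_points(rt_ms, threshold_ms, high_reward, hit=True):
    """Points for one target trial: 10 (high) or 1 (low) per ms under threshold."""
    if not hit or rt_ms >= threshold_ms:
        return 0.0  # misses and responses slower than threshold earn nothing
    per_ms = 10 if high_reward else 1
    return per_ms * (threshold_ms - rt_ms)

def block_earnings_usd(points, false_alarms):
    """Convert points to dollars (400 points = $0.01) and apply the block penalty."""
    dollars = points / 400 * 0.01
    if false_alarms > 6:  # more than 6 false alarms costs 50 cents (assumed to be deducted here)
        dollars -= 0.50
    return dollars
```

For a hypothetical participant with a 450-ms baseline RT, the threshold is 550 ms; a 400-ms hit would then earn 1500 points on a high-reward target and 150 points on a low-reward target.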

EEG acquisition and analysis

The electroencephalogram (EEG) was recorded from a 63-channel active electrode system (Brain Vision actiCHamp, Brain Products, Gilching Süd, Germany), mounted on a customized plastic electrode cap (Woldorff et al., 2002; EASYCAP, Herrsching, Germany), while participants performed the auditory oddball task. An electrode was placed below the left eye to monitor vertical eye movements, and two electrodes slightly lateral to the left and right outer canthi were used for monitoring lateral eye movements. The scalp sites of our equidistant-electrode custom cap are reported here in terms of the closest location in the standard 10–10 system if within a couple of millimeters; in those cases where our electrode varied more than a few millimeters from the related 10-20 or 10-10 location, the electrodes are denoted with “a,” “p,” “i,” or “s” to indicate they are slightly anterior, posterior, inferior, or superior to the 10-20 or 10-10 location, respectively. Relevant electrodes also are specifically identified on schematic head figures in the Results section. During the experiment, the EEG recording was referenced to the right mastoid, filtered between DC and 200 Hz, and digitized with a 500 Hz sampling rate per channel. Electrode impedances were maintained below 5 kΩ for the ground and reference channels and below 15 kΩ for all other channels.

Offline, the EEG data were bandpass filtered from 0.01 to 70 Hz (finite impulse response (FIR) filter), downsampled to 250 Hz, and re-referenced to the algebraic average of the left and right mastoid electrodes. As our a priori questions of interest centered on the effects of reward on auditory stimulus processing in the brain, we optimized the number of trials during the rewarded phases for neural analyses. EEG data analysis thus focused on the rewarded phases of the auditory oddball task. Epochs were created by time-locking to the onset of the standard tones, the high-reward-potential target tones, and the low-reward-potential target tones. Artifact rejection was performed to remove trials with eye blinks, eye movement, muscle tension, or electrode drift artifacts. Approximately 24% of epochs per participant were rejected on average due to artifacts, and the numbers of epochs accepted after artifact rejection were comparable for the low- and high-reward trial types. Epochs were baseline corrected by subtracting the mean amplitude of the baseline period (−200 ms to 0 ms before stimulus onset) from every timepoint in the epoch. ERPs were created by selectively time-lock averaging the epoched EEG data to the onset of the standard tones, the low-reward target tones, and the high-reward target tones.
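
The baseline correction and time-locked averaging steps described above can be sketched for a single channel as follows. This is an illustrative sketch only: the function names are hypothetical, and treating the baseline as the half-open interval from −200 ms up to (but not including) 0 ms is an assumption.

```python
def baseline_correct(epoch_uv, times_ms, baseline=(-200, 0)):
    """Subtract the mean pre-stimulus amplitude from every sample in the epoch."""
    base = [v for v, t in zip(epoch_uv, times_ms) if baseline[0] <= t < baseline[1]]
    mean_base = sum(base) / len(base)
    return [v - mean_base for v in epoch_uv]

def average_erp(epochs):
    """Point-by-point average across equally long, baseline-corrected epochs."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

# Toy data sampled every 4 ms (250 Hz), from -200 ms to +8 ms.
times_ms = list(range(-200, 12, 4))
epoch_a = [2.0] * 50 + [2.0, 5.0, 8.0]    # constant offset of 2 uV before onset
epoch_b = [-1.0] * 50 + [-1.0, 0.0, 1.0]  # constant offset of -1 uV before onset
erp = average_erp([baseline_correct(e, times_ms) for e in (epoch_a, epoch_b)])
```

After correction the pre-stimulus samples of both toy epochs are zero, so the averaged waveform reflects only the post-onset deflections.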

Statistical analyses

For the behavioral data, we calculated participants’ accuracy and reaction time (RT) for detecting each of the two target tone types. We examined how these measures changed as a function of the experimental condition and phase (baseline, reward phase 1, extinction phase 1, reward phase 2, and extinction phase 2), reward level (low, high), and counterbalance order (order A: reward phase 1 high-pitch tone associated with high reward and reward phase 2 high-pitch tone associated with low reward; order B: reward phase 1 high-pitch tone associated with low reward and reward phase 2 high-pitch tone associated with high reward).

Our auditory oddball paradigm elicited a classic set of ERP components reflecting the cascade of neural processing of standard and target deviant tones, namely the frontocentral N1, starting ~85-ms post-stimulus and immediately followed by an MMN (target deviant tones only), and then the frontocentral N2 (~200-ms post-stimulus) and centroparietal P3b components (target deviant tones only, ~240-ms post-stimulus). For our ERP analyses, our primary question of interest was how reward level (low, high) modulated the mean amplitude of the processing of the target deviant tones during the rewarded phases of the auditory oddball task. We also were interested in whether reward level would modulate the onset latency of components, particularly of the later P3b component. Channels of interest were chosen a priori based on the extensive literature surrounding these components; channel FCz was used for the frontocentral negativities, and channel CPz was used for the P3b component analyses. To determine which time windows to use for our analyses, we created grand average ERP waveforms by averaging together our three stimulus types (no-reward standards, low-reward targets, high-reward targets) for the N1/MMN and N2 frontocentral negativities and our two target stimulus types (low-reward targets, high-reward targets) for the P3b. This allowed us to determine time windows of interest for our analyses in a manner that was unbiased towards any particular experimental condition. Specific time windows of interest are discussed in detail within the relevant results section. Onset latencies were obtained by calculating the fractional peak latency, meaning the time at which the ERP waveform reached 50% of its peak amplitude (Luck, 2014). 
For mean amplitude and latency analyses, the reward phase ERP data were subjected to repeated-measures ANOVAs with within-subjects factors of Phase (reward phase 1, reward phase 2) and Reward Stimulus Type (no-reward standard, low-reward targets, high-reward targets), with a between-subjects factor of Counterbalance Order (A, B). To ensure our design did indeed appropriately equate the effects of the physical tone stimuli used for the targets across our low- and high-reward conditions, we conducted follow-up ANOVAs with the within-subjects factors of Reward (high reward, low reward) by Pitch (high-pitch tone, low-pitch tone), with a between-subjects factor of Counterbalance Order. Significant effects also were further queried with paired t-tests between conditions where relevant.
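
The 50% fractional peak latency measure described above can be sketched as follows. The sketch searches backward from the peak within a measurement window and linearly interpolates between samples; the interpolation, the search window, and the function name are assumptions made for illustration and are not taken from the text.

```python
def fractional_peak_latency(wave_uv, times_ms, window_ms, fraction=0.5, polarity=1):
    """Time at which the waveform first reaches `fraction` of its peak amplitude.

    polarity=+1 for positive components (e.g., P3b), -1 for negative ones.
    """
    idx = [i for i, t in enumerate(times_ms) if window_ms[0] <= t <= window_ms[1]]
    peak_i = max(idx, key=lambda i: polarity * wave_uv[i])
    target = fraction * polarity * wave_uv[peak_i]
    i = peak_i
    # Walk backward from the peak while the earlier sample still meets the criterion.
    while i > idx[0] and polarity * wave_uv[i - 1] >= target:
        i -= 1
    if i == idx[0]:
        return times_ms[i]  # criterion already met at the start of the window
    # Linear interpolation between the last sub-criterion sample and the next one.
    y0, y1 = polarity * wave_uv[i - 1], polarity * wave_uv[i]
    t0, t1 = times_ms[i - 1], times_ms[i]
    return t0 + (target - y0) / (y1 - y0) * (t1 - t0)

# Toy positive-going waveform with a 10-uV peak at 400 ms.
latency = fractional_peak_latency([0, 0, 2, 6, 10, 4], [0, 100, 200, 300, 400, 500], (0, 500))
```

In the toy waveform the 50% criterion (5 µV) is crossed between the 200-ms and 300-ms samples, so the interpolated onset latency falls at 275 ms.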

For all analyses, the Huynh–Feldt sphericity correction was applied as needed. Corrected F and p values are reported, with degrees of freedom rounded to integers for easier reading. Effect sizes are reported using Cohen’s d or partial eta squared (ηp2) as appropriate for each statistical test.
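
For readers who wish to relate the reported F statistics to the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity; the function name is hypothetical, and the text does not state how the authors' statistical software computed the reported values.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Reward main effect on reaction times reported in the Results: F(1,22) = 83.46
print(round(partial_eta_squared(83.46, 1, 22), 2))  # 0.79
```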

Results

Behavioral results

Participants’ reaction times and accuracy to the low- and high-pitched target tones across the experimental phases are presented in Tables 1 and 2. Participants’ reaction times were used as the primary dependent variable for behavioral analyses, with accuracy analyses presented for completeness. Participants did not differ in their reaction times to low- and high-pitched target tones in the initial no-reward baseline phase (t(23) = 1.18, p = 0.25, d = 0.24). Participants’ reaction times to target tones sped up when rewards were introduced for fast, accurate detection of the target tones (paired t-test between overall baseline and overall reward phase 1, t(23) = 7.4, p < 0.0001, d = 1.52).

Table 1 Reaction times (RTs) to target tones
Table 2 Accuracy for detecting target tones

The two reward phases’ reaction time data were entered into a Reward (high reward, low reward) by Phase (reward phase 1, reward phase 2) by Counterbalance Order (A, B) repeated measures ANOVA. Participants were ~23-ms faster to respond to the high-reward targets than the low-reward targets (F(1,22) = 83.46, p < 0.001, ηp2 = 0.79; Fig. 1), and ~13-ms faster to respond in the second reward phase than the first (F(1,22) = 17.40, p < 0.001, ηp2 = 0.44). There were no significant interactions among any of these factors or any effects of Counterbalance Order (all p > 0.54). To confirm there were no interactions between the physical target tone stimuli and the reward association, we also examined a Reward by Pitch by Counterbalance Order ANOVA on the target reaction times and found no interactions with or main effect of Pitch (all p > 0.44). Exploration of reaction times during the first rewarded task phase revealed participants were significantly faster at responding to high-reward targets than low-reward targets after the first task block, suggesting reward associations quickly shaped participants’ behavior (approximately 23 high-reward target tones and 23 low-reward target tones were presented per block; paired t-test between participants’ mean reaction times in the first task block of the first rewarded phase of experiment for high-reward targets (383 ± 8 ms) versus low-reward targets (394 ± 8 ms), t(23) = 2.5, p = 0.02, Cohen’s d = 0.52).

Fig. 1

Participants were faster at responding to high-reward targets than to low-reward targets. Bars are the mean reaction times in milliseconds for correctly detected target tones (hits), collapsed across the two reward phases. Error bars are the across-subjects standard error of the mean.

The results showed that participants slowed down during the extinction phases. Inspection of the data suggests the behavioral effects of reward associations were extinguished in the first extinction task block. We explored whether there was any lingering impact of reward history on reaction times by entering the extinction phase 1 task-block data into a 2 × 4 × 2 repeated-measures ANOVA, with factors of Reward History (past association with high reward, past association with low reward) by Block (4 task blocks) by Reward Phase 1 Counterbalance Order (A, B). There were no significant interactions or main effects from this analysis (all p > 0.06). Participants were ~9-ms faster to respond to the targets previously associated with high reward than to those previously associated with low reward, but this difference did not reach statistical significance (mean (standard error) for previously high-reward targets: 412 (8) ms; for previously low-reward targets: 421 (9) ms; main effect of Reward History not significant, F(1,22) = 4.10, p = 0.06, ηp2 = 0.16). Analogous analyses of the second extinction phase, which lasted only 2 blocks, again confirmed no interactions or main effects of Reward History on reaction times (all p > 0.33).

Participants had good accuracy at correctly detecting the target oddball tones (hit rate 87% or higher across experimental phases). Participants had a higher hit rate for the low-pitched target tone than the high-pitched (t(23) = 2.51, p = 0.02, Cohen’s d = 0.39). Participants’ hit rate to target tones stayed the same when rewards were introduced for fast, accurate detection of the target tones (paired t-test between overall baseline and overall reward phase 1, t(23) = 1.7, p = 0.10, d = 0.40).

The two reward phases’ hit proportion data were entered into a Reward (high reward, low reward) by Phase (reward phase 1, reward phase 2) by Counterbalance Order (A, B) repeated-measures ANOVA. Participants were 3% more accurate at detecting high-reward targets than low-reward targets (F(1,22) = 16.95, p < 0.001, ηp2 = 0.44). There were no other significant interactions or effects (all p > 0.25). To confirm there were no interactions between the physical target tone stimuli and the reward association, we also examined a Reward by Pitch by Counterbalance Order ANOVA on the target hit rates and found no interactions with Pitch nor a main effect of Pitch (all p > 0.23). Participants’ overall accuracy at detecting targets stayed the same during the two extinction phases as during the rewarded phases (both p > 0.07). During the extinction phases, there was a significant interaction between Reward History and Counterbalance Order, with participants in order B showing more of a difference in accuracy based on prior reward history than participants in order A (F(1,22) = 14.95, p = 0.001, ηp2 = 0.41). Participants in order B tended to show higher accuracy for extinction phase targets that were previously associated with high reward; this pattern was not evident in the other half of our participants. There was no main effect of Reward History (F(1,22) = 3.83, p = 0.06, ηp2 = 0.15) and no other significant effects (all p > 0.11).

ERP Results

Reward associations enhance the amplitude of early-latency sensory and deviance-related processing

Standard tones and deviant target tones all elicited an early sensory negativity (the N1 component), starting around 85 ms and peaking around 112 ms post-stimulus onset (Fig. 2). This negativity was strongest over frontocentral channels, consistent with the hallmark distribution of the N1. In addition to eliciting this N1 activity, the two deviant target tones also elicited an additional subsequent negativity, the MMN, whose timing and distribution overlapped with the later part of the N1 activity. As the spatial and temporal activations of the N1 and MMN overlapped, the effect of reward on these negativities was examined together in one analysis. Channel FCz was selected for analyses a priori based on the extensive literature on these components. Based on the grand average ERP wave across all stimulus conditions, the time window from 85 to 195 ms post-stimulus onset was selected for mean amplitude analyses of the frontocentral negative-polarity activity.

Fig. 2

High-reward targets show increased early frontocentral negativity compared with low-reward targets in the latency of the N1 and MMN components, but not in the latency of the N2 component. (A) Stimulus onset-locked grand average ERP waveforms at electrode FCz for no-reward standard tones, low-reward target tones, and high-reward target tones. All tones elicited an early negativity over frontocentral electrodes starting around 85-ms post-stimulus onset (gray box, in the latency of the N1/MMN). This negativity was larger for target tones than for standards and larger for the high-reward targets than for low-reward targets. Target tones then elicited a second negativity (~200-ms post-stimulus onset, in the N2 latency) that was much stronger than for standard stimuli. This N2-latency negativity was not modulated by reward magnitude. (B) Scalp topographies of the N1/MMN frontocentral negativity mean amplitude for each tone type. (C) Stimulus onset-locked grand average ERP waveforms at FCz for the subtractions low-reward targets minus standards and high-reward targets minus standards, highlighting the effect of reward magnitude on the N1/MMN frontocentral negativity (gray box). (D) Topographies of the N1/MMN mean amplitude for low-reward targets minus standards and high-reward targets minus standards, as well as the difference of these differences.

The mean amplitude of this frontocentral negativity differed by Phase (F(1,22) = 13.85, p = 0.001, ηp2 = 0.39), with participants showing smaller ERPs in the second reward phase (second half of experiment) than in the first reward phase (Table 3). As predicted, the mean amplitudes also differed by Reward Stimulus Type (F(2,44) = 20.78, p < 0.001, ηp2 = 0.49). There was no effect of Counterbalance Order and no significant interactions between any factors (all p > 0.25). Our follow-up Reward by Pitch by Counterbalance Order ANOVA on the targets confirmed a significant effect of Reward (F(1,22) = 4.41, p = 0.04, ηp2 = 0.17), with no main effect of target Pitch (F(1,22) = 0.001, p = 0.94, ηp2 < 0.01), no effect of Counterbalance Order (F(1,22) = 1.35, p = 0.259, ηp2 = 0.06), and no interactions between factors (all p > 0.49; Table 3).

Table 3 Mean amplitudes for N1/MMN, N2, and P3b ERP components

The significant Reward Stimulus Type main effect from the Phase by Reward Stimulus Type by Counterbalance Order ANOVA was further queried with paired t-tests comparing the mean amplitudes of the no-reward standards, high-reward targets, and low-reward targets. Consistent with the existing auditory oddball literature, the mean amplitude of the activity elicited by each of the target deviant tone types was larger than the mean amplitude of the activity elicited by the standard tones (both t > 4.63, p < 0.0001, d > 0.95). Critically, the mean amplitude of the activity elicited by the high-reward targets was greater than that elicited by the low-reward targets (t(23) = 2.34, p = 0.03, d = 0.48), indicating that reward can strengthen the early sensory processing of auditory stimuli. No latency effects were observed for this frontocentral negativity (all p > 0.05).

Target detection, but not reward, affects middle stages of target processing

Following the N1 and MMN components, which reflect the respective stages of early sensory processing and deviance detection, both target tones, but not the standard tones, elicited a frontocentral negativity in the latency of the hallmark N2 component (starting ~200-ms post-stimulus onset, Fig. 2). Mean amplitudes were extracted from 200- to 300-ms post-stimulus onset from channel FCz. The repeated-measures Phase by Reward Stimulus Type by Counterbalance Order ANOVA found significant main effects of Reward Stimulus Type (F(2,44) = 21.23, p < 0.0001, ηp2 = 0.49) and Counterbalance Order (F(1,22) = 5.33, p = 0.03, ηp2 = 0.19) but no other main effects or interactions (all p > 0.44). Paired t-tests revealed that the Reward Stimulus Type effect was driven by differences between each of the targets and the standard (paired t-tests between targets and standard both t > 4.30, p < 0.0001, d > 0.88). Participants whose counterbalance order assigned them to the condition where the higher-pitched target tone was high reward and the lower-pitched tone was low reward in the first reward phase showed smaller N2 amplitudes overall than participants assigned to the opposite counterbalance order. Inspection of the data suggests this effect may have been driven by a few participants who did not show strong N2 responses in the latency window queried. The Reward by Pitch by Counterbalance Order ANOVA confirmed there were no significant effects or interactions of Reward and Pitch on target N2 amplitudes (all p > 0.26), with the Counterbalance Order effect again significant (F(1,22) = 4.39, p < 0.05, ηp2 = 0.17). This suggests the N2 amplitude was influenced by target attributes but not by reward. There were no effects on latency observed (all p > 0.05).

Reward associations enhance the amplitude of longer-latency target processing

Both target tones, but not the standard tones, then elicited a large, centroparietal positivity at longer latencies, reflecting a form of the hallmark P3b component (starting ~240-ms post-stimulus onset; Fig. 3). Mean amplitudes were extracted from 420- to 468-ms post-stimulus onset from channel CPz. This narrow, ~50-ms window was centered on the cross-conditional P3b peak and was selected so that the mean amplitude analyses were less likely to be heavily influenced by the faster onset latency observed for high-reward targets. For the mean amplitudes, the Phase by Reward Stimulus Type by Counterbalance Order interaction was significant (F(2,44) = 4.84, p = 0.02, ηp2 = 0.18), as was the main effect of Reward Stimulus Type (F(2,44) = 99.82, p < 0.001, ηp2 = 0.82). There was no effect of Phase or Counterbalance Order or other significant interactions (all p > 0.5). The target stimuli were further queried with a Reward by Pitch by Counterbalance Order repeated-measures ANOVA to examine the effects of reward level and the physical target tone stimuli. There were no interactions among factors (all p > 0.5) and no effect of Counterbalance Order (F(1,22) = 0.11, p = 0.75, ηp2 = 0.01). There was a main effect of Reward (F(1,22) = 23.08, p < 0.001, ηp2 = 0.51), with more highly rewarded stimuli showing larger amplitude responses. There also was a main effect of Pitch (F(1,22) = 23.49, p = 0.04, ηp2 = 0.19), with the low-pitch tones eliciting larger amplitude responses than high-pitch tones (Table 3). 
Pairwise comparisons between the high-reward target, low-reward target and no-reward standard, collapsed across other factors, showed both the high-reward and low-reward targets elicited strong, positive-polarity activity compared to the nontarget standards (both paired t-tests, t > 11.12, p < 0.0001, d > 2.27), consistent with the large literature implicating the P3b in late-stage categorization and evaluation of target stimuli (Näätänen, 1988; Pornpattananangkul et al., 2017; Soltani & Knight, 2000; Zhu et al., 2019). The level of reward also influenced P3b activity, with the high-reward targets eliciting larger P3b activity compared with the low-reward targets (t(23) = 4.88, p < 0.001, d = 1.02).

Fig. 3

Reward associations enhance the amplitude of a late-stage positivity in the latency range of the P3b component. (A) Stimulus onset-locked grand-average ERP waveforms from electrode CPz for standard, low-reward target, and high-reward target tones. Target tones, but not the standards, elicited a strong, late-latency centroparietal positivity, starting approximately 240-ms post-stimulus onset and peaking approximately 400- to 450-ms post-stimulus onset. Reward magnitude also impacted this positivity, with high-reward targets eliciting an earlier and larger-amplitude positivity compared to low-reward targets. (B) Scalp topographies of the mean amplitude of this positivity for standards, low-reward targets, and high-reward targets.

Reward also influenced the latency of the P3b, with high-reward targets eliciting earlier P3b activity compared to the low-reward targets (Fig. 3A). A Phase by Reward Stimulus Type by Counterbalance Order repeated-measures ANOVA on the latency data revealed a significant main effect of Reward Stimulus Type (F(2,44) = 22.74, p < 0.001, ηp2 = 0.51). There were no effects of Phase (F(1,22) = 2.14, p = 0.16, ηp2 = 0.09) or Counterbalance Order (F(1,22) = 0.20, p = 0.66, ηp2 = 0.01), and no interactions (all p > 0.17). Paired t-tests between high- and low-reward targets showed earlier latencies for high-reward targets in both the first and second rewarded task phases (first reward phase: t(23) = 3.8, p = 0.001, Cohen’s d = 0.77; second reward phase: t(23) = 2.8, p = 0.01, Cohen’s d = 0.62).
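
For readers who want to check how the reported effect sizes relate to the F statistics, partial eta squared can be recovered from an F value and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is an illustration only: the function name is hypothetical, and the source does not state which software produced the reported values.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of
    freedom: eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Latency ANOVA values reported in the text:
# Reward Stimulus Type, F(2,44) = 22.74
reward_effect = partial_eta_squared(22.74, 2, 44)
# Counterbalance Order, F(1,22) = 0.20
counterbalance_effect = partial_eta_squared(0.20, 1, 22)

print(round(reward_effect, 2), round(counterbalance_effect, 2))  # prints: 0.51 0.01
```

Both results round to the effect sizes reported for these two terms.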

Discussion

We used a within-subjects design and a well-characterized auditory attention paradigm (the oddball task) to investigate how reward magnitude influenced the detection of reward-associated, attentionally relevant target stimuli. Behaviorally, we found that reward association sped up participants’ detection of the oddball target tones, with the fastest reaction times seen for the high-reward target tones. Neurally, we found that reward associations amplified the attentional enhancements seen for sensory processing of target stimuli. Both the early N1 sensory wave and the deviance-related MMN component showed larger amplitudes for high-reward targets compared to low-reward targets. This suggests that reward-magnitude associations are represented in auditory sensory cortices and can influence processing early in the processing stream. To our knowledge, this is the first demonstration of how reward magnitude modulates early attention-related neural processing of task-relevant auditory stimuli. These findings are in line with work showing that reward magnitude influences the learning of stimulus-response mappings, an effect strengthened by selective attention (Vartak et al., 2017), and with the broader literature showing that the brain’s reward-processing mechanisms encode the valence and magnitude of monetary gains and losses (Delgado et al., 2003). We also found that attended targets elicited a later N2 component that was not amplitude-modulated by reward. Reward did, however, impact the still-later stages of target processing, with high-reward targets eliciting earlier and larger P3b components than low-reward targets.

Our design did not try to tease apart the influence of stimulus-reward associations per se versus the effects of top-down attentional priorities for certain features or salience types. However, our design does inform and constrain the possible theoretical implications of our findings. First, our tones were randomly presented within a single stream. During the reward phases, participants could not predict when the low- or high-reward oddball targets would occur in the stream. Accordingly, participants were not able to adjust their overall motivation levels in the way they might have if, for example, the stimuli had been presented in a blocked fashion (e.g., if there had been blocks with only high-reward targets and blocks with only low-reward targets, participants might have had more motivation during the former). We can thus rule out modulations of overall motivation as playing a role in our findings.

We note further that our tone stream was presented binaurally, so participants could not set up in advance an attentional bias towards a particular spatial input channel. This differs from classic rapid-presentation dichotic listening experiments, in which participants were instructed to attend to all the tones presented to one ear (nontargets and oddball targets) and detect the oddball target tones in that stream while ignoring all tones presented to the other ear. That manipulation resulted in N1 enhancements for all the stimuli in the attended channel relative to those in the unattended channel (Hillyard et al., 1973; Woldorff et al., 1987; Woldorff & Hillyard, 1991). The N1-latency effects from those studies (and analogous ones) have been interpreted as reflecting an attentional set that biases attention toward a particular spatial stream/channel. It is generally thought that stimuli need to be presented fairly often and rapidly for the brain to be able to continually refresh the attentional set for the designated channel. Attentional sets can also be deployed for streams of stimuli of different pitches, but the stimuli of those streams similarly need to be presented relatively often to obtain analogous effects (Schwent et al., 1976; Giard et al., 1995; see Näätänen & Michie, 1979 and Giard et al., 2000 for additional discussion).

In our study, each of our oddball deviant tone types was relatively rare, occurring only about once every 5 or 6 seconds, not nearly as often as in the above dual-stream studies. Attending to oddball deviant tones that are rare within the tone stream elicits an enhanced cascade of processing reflected in the MMN, N2, and P3b waves, relative to the nontarget standards (see review by Sussman, 2007). In such paradigms, participants generally must attend to the whole auditory stream to detect the rare targets. In these cases, then, participants may proactively set up what might be considered a salience set or salience map for the deviant tone type. That is, they attend to the whole auditory stream to detect the rare feature-deviant target, identify it as a target, and orient additional attentional processing to it. Importantly, however, in our study both oddball deviant tone types were behaviorally relevant, rare targets, albeit with different reward associations. Our effects would thus seem to reflect the proactive establishment of a salience association for the oddball target stimuli, with a stronger salience association set for the high-reward-associated targets relative to the low-reward ones. This salience set resulted in stronger attentional orienting towards the high-reward targets when they occurred, which was then reflected by the enhanced early activity in the N1/MMN latency range.

In other work, selective attention has been found to rapidly bias auditory stimulus processing in auditory sensory cortex in order to facilitate detection of salient stimuli (Kayser, 2005), as also seen for visual stimulus processing in the visual system (Parkhurst et al., 2002). In the visual domain, Hickey et al. (2010) found that visual stimuli associated with reward elicited larger P1 visual sensory processing waves from visual cortex (perhaps analogous to our N1/MMN auditory sensory effects). Of particular relevance here, a recent visuospatial search study (Bachman et al., 2020) found that higher-reward (versus lower-reward) associations for popout (i.e., visual-oddball) target stimuli triggered strengthened attentional orienting towards those stimuli, reflected by increased amplitude of the attentional-orienting-sensitive N2pc component. (In contrast, in that study manipulating the physical salience of the visual targets sped up the latency of this component but did not increase its amplitude.) In these visual search tasks, the popout target stimuli could occur anywhere in a stimulus array, so participants had to attend to the whole visual scene in order to detect the popout stimulus and then orient to it. When the popout had a high-reward association, that detection and orienting process was stronger. The reward-related effects in the current auditory oddball study thus seem particularly analogous, in that participants needed to attend to the whole auditory scene (the single stream of stimuli) to detect and then orient to the rare target tones in the stream. Participants would therefore presumably have proactively set up a salience set for the target tones, enabling attentional orienting towards the targets when they occurred, and, most critically here, more strongly so for the high-reward targets. The enhanced detection and orienting process was then reflected in the larger-amplitude N1/MMN for high-reward targets.

Others have observed that reward associations and prior reward history can influence behavior in tasks even when their effects run contrary to top-down attentional goals (Kim et al., 2021). Stimuli associated with reward value can rapidly capture attention, even when such stimuli are not physically salient or task-relevant (Anderson & Kim, 2019; Donohue et al., 2016; Hickey & van Zoest, 2013; Kim et al., 2021). Auditory stimuli previously associated with reward can elicit enhanced sensory processing, even when these stimuli are task-irrelevant (Folyi et al., 2016; Folyi & Wentura, 2019). Hickey et al. (2010) found that the reward-induced enhancements seen in the P1 waves for reward-associated stimuli persisted even when the associated stimulus was used as a distractor rather than a target (see also MacLean & Giesbrecht, 2015, for a replication of these findings).

Lastly, we note that Liao and Anderson (2020) have also found some evidence of a residual bias towards originally high-value stimuli after a reversal of reward contingencies in a visual color task. Accordingly, we explored the possibility that reward history could have interacted with our effects, but we did not find compelling evidence of an influence of prior reward associations on our behavioral or ERP effects. For instance, we flipped the stimulus-reward associations for the target stimuli in the second half of the experiment (with an extinction phase in between), but we did not find significant interactions between experimental phase (first vs. second) and reward magnitude. We also explored the behavioral data for the reward extinction phases of our task. In the first extinction task block, average reaction times for detecting previously high-reward targets were numerically slightly faster than, but not statistically different from, the reaction times for previously low-reward targets. This suggests that the extinction phases in our study successfully extinguished the attentional biases towards highly rewarded stimuli. These data are consistent with the idea that reward instilled a salience set that biased attentional orienting towards the reward-associated targets, especially the high-reward targets, and that this salience set was extinguished when reward was removed and then updated when the reward associations were flipped.

We found that reward magnitude and auditory attention influenced a common neural processing cascade, with reward further augmenting attentional enhancements of the neural processing of task-relevant stimuli both early (~100-ms post-stimulus onset) and later on (~400-ms post-stimulus onset) in the processing stream. These findings extend the mechanistic understanding of how reward and attention interact in the auditory domain. Future investigations will be needed to elucidate more precisely how reward magnitude and the positive or negative valence of rewards interact with attentional mechanisms in the auditory domain, which mechanisms govern the learning and extinction of reward associations in the auditory domain, and how reward and attention interact in circumstances involving more complex multisensory stimuli and environments.