Introduction

In Western music, composers often use major modes to indicate happy moods (Happy Birthday) and minor modes to indicate sadder or more contemplative moods (Greensleeves). Although major and minor scales differ in only one or two notes, and those only by a semitone, musician and nonmusician listeners alike reliably map major to happy and minor to sad (Crowder, 1985). Despite the ability to classify tunes as happy and sad, nonmusicians are generally unable to discriminate same-except-for-mode pairs when the pairs start on different notes (Halpern, 1984; Halpern, Bartlett, & Dowling, 1998) or to classify major and minor tunes using those labels, even after a short training period (Leaver & Halpern, 2004). This growing body of evidence suggests that musical training may be required for individuals to recognize mode using the formal terms “major” and “minor” and that general music experience and/or explicit knowledge about scales may support this ability. In Western music, major modes are far more frequent than minor modes (Bowling, Gill, Choi, Prinz, & Purves, 2009), and the distinction between these two modes is a critical component of musical training. The learned knowledge of the critical note’s significance likely aids in the highly accurate identification of major versus minor in musicians. The importance of the critical note is reflected in measurable changes in the neural response to minor melodies and especially the critical note in musicians (Halpern, Martin, & Reed, 2008). In the current study, we used high-density EEG (128 channels) to further study these neural responses, evaluate the importance of first versus second occurrence of the critical note in behavioral and brain responses, relate these responses to their neural loci, and determine the extent to which the observed brain activity could reliably classify a listener as a musician or nonmusician.

A previous ERP study using a 32-channel system (Halpern et al., 2008) examined how groups of musicians and nonmusicians process mode. The melodies began as neither major nor minor but then presented a “critical note” that unambiguously signaled the mode of the melody (the melodies were otherwise identical in each major/minor pair). Half of each group classified each melody by using affective labels of “happy” and “sad”; half used “major” and “minor.” Musicians were highly successful independent of the instruction. The nonmusicians classified adequately using affective labels (73% correct) but were less successful using mode labels, even after some training (65% correct). Musicians exhibited a late positive component (LPC) at a subset of temporal electrodes in response to the first occurrence of a critical note, independent of the classification task, but only for minor melodies. This suggests that minor melodies are a marked category in Western music, as the LPC indicates attention to a contextually important stimulus (Nieuwenhuis, Aston-Jones, & Cohen, 2005). Nonmusicians did not exhibit an LPC response to the critical note, even with the affective categorization instructions, suggesting that nonmusicians do not have an a priori expectation for mode and so the critical note does not serve as a specific marker for the melody’s mode, at least in a time-linked way.

Because major and minor modes are used to convey emotion in Western music, the neural processing of this comparison likely involves a wide range of brain regions. Functional magnetic resonance imaging studies report activation in response to minor melodies (compared with major ones) in varied regions, including left medial frontal gyrus, cingulate cortex, and left parahippocampal gyrus (Khalfa, Schon, Anton, & Liegeois-Chauvel, 2005; Green et al., 2008). The processing of minor versus major in emotional processing areas, such as the cingulate cortex, is not surprising. However, activation in left frontal gyrus suggests a higher cognitive component, perhaps associated with the context of the music. Researchers have long reported frontal asymmetry in response to the emotional content of music (Davidson & Hugdahl, 1996; Mikutta, Altorfer, Strik, & Koenig, 2012; Schmidt & Trainor, 2001; Yuvaraj et al., 2014). The involvement of frontal cortex is also especially interesting in relation to the finding that increased prefrontal cortex asymmetry in the alpha and beta bands is associated with activation in limbic system structures, including the amygdala (Daly et al., 2019).

The processing of the critical note relies on the context in which it occurs and the listener’s knowledge of that context. In language, unexpected words that violate the expectations of the sentence require additional processing (e.g., the word “axe” in “He chopped up the carrots with an axe.”; Rayner, Warren, Juhasz, & Liversedge, 2004). Linguistic errors in language result in different ERP patterns, including increased negativity in posterior electrodes to semantic errors (Angrilli et al., 2002; Friederici, Pfeifer, & Hahne, 1993; Münte, Heinze, & Mangun, 1993), increased parietal positivity to syntactic errors (Angrilli et al., 2002), and early positive followed by late negative responses to morphological errors (Friederici et al., 1993).

Although minor notes are neither errors nor anomalies in the linguistic sense, the LPC reported in association with the first critical note in musicians may reflect, at least in part, a violation of rarity-dependent expectation (major mode, in this case). Semantics and syntax are both critical aspects of music (for review, see Koelsch, 2009). As in language, neural responses to violations of musical syntax and semantics are present in overlapping areas. Young adults with at least 4 years of musical training exhibited increased activation in left frontal, temporal, and parietal regions (as measured by fMRI) on a music target task in which the last note of a sequence was either expected or unexpected. Of particular interest is the increased activation in the left inferior frontal gyrus in response to final chords that were unexpected according to Western harmony (Tillmann, Janata, & Bharucha, 2003). Children with long-term musical training (at least 2.5 years) exhibited an early right anterior negativity (ERAN) during musical sequences with an unexpected chord, constituting a syntactic error, whereas children without musical training had no anterior negativity component (Jentschke & Koelsch, 2009). Therefore, we hypothesized that if a Western-trained musician listens to a melody while expecting a major mode, the onset of a minor critical note may trigger a similar neural signature in frontal regions. In the current study, we evaluated whether, in addition to the LPC, this early anterior negativity to the minor critical note was present in musicians but not in nonmusicians.

An additional open question is how nonmusicians make their categorization decisions if the critical note is not processed in the same way as it is in musicians. In our prior study (Halpern et al., 2008), only the neural response to the first critical note was analyzed, although both critical notes were presented. Given the adequate but lower accuracy of nonmusicians, it is possible that more than one critical note is needed to distinguish "happy" (major) from "sad" (minor). A single melody contains many critical notes, each following the mode of the song and maintaining its emotional content. If nonmusicians analyze the melody over time rather than at the occurrence of a single critical note, then a second critical note may be more informative to a nonmusician, adding to the overall percept of the melody, than to a trained musician, for whom a single note is sufficient. Because nonmusicians know neither the significance of the critical note itself nor the use of major/minor labels, they may rely primarily on global emotional content, processed over the course of the entire melody, to make this distinction. In support of this hypothesis, our previous study revealed higher accuracy by nonmusicians when happy/sad labels were used compared with major/minor labels (Halpern et al., 2008).

The current study was designed to address three research questions. First, we asked whether there is a relationship between musical training and a response to the second critical note. Second, we asked whether the LPC to the critical note is a reliable marker for musical training. To address this question, we used a well-documented two-alternative forced-choice classifier (Centanni, Engineer, & Kilgard, 2013; Engineer et al., 2008) to assign group membership based on the LPC response. Third, we examined two frontal cortex markers of music processing: 1) the early right anterior negativity signal in response to the critical note in musicians; and 2) group differences in frontal asymmetry as a marker of emotional processing (Daly et al., 2019).

Methods

Participants

A total of 25 nonmusicians (0-2 years of musical experience and no current training) and 28 musicians (minimum of 10 years of musical training) participated. We included individuals with up to 2 years of experience in the nonmusician group to account for time spent in compulsory music classes in American elementary schools, in which basic skills in singing, rhythm, and recorder playing are taught. Individuals with more than 10 years of experience must have pursued music beyond these required classes. Data from nine nonmusicians and nine musicians were discarded due to a variety of equipment and environmental problems (e.g., electrical noise due to nearby construction). The final sample consisted of data from 16 nonmusicians and 19 musicians. Ages of the participants in this final sample ranged from 18 to 31 years, with a median of 20 years. Before completing the experimental task, participants completed a short pure-tone audiometric screening and a brief questionnaire probing their musical background and handedness. All listeners were right-handed and had normal hearing. Approximately half of each group (8 nonmusicians and 11 musicians) were randomly assigned to classify each tune as "major or minor" (MM); the remaining participants judged each tune as "happy or sad" (HS). There was no significant difference in accuracy across the two instruction conditions, so we combined participants for all analyses.

Musical Stimuli

The 42 tunes (21 pairs) were a subset of tune pairs originally composed for the study by Halpern et al. (2008). One member of each pair was a tune newly composed or adapted from an obscure extant source. The other member of the pair was modified to be identical except for being in the opposite mode. All tunes were rated as being highly musical and representative of their respective major or minor modes (see Halpern et al. (2008) for more details of tune construction). The 24 tune pairs in the prior study all had an initial "critical note" (critical note 1, CN1), at which the tune became unambiguously major or minor. This was usually the third degree of the scale but was sometimes the sixth degree. The 21 tune pairs used in the current study also had a second critical note (CN2; Fig. 1). CN2 was never immediately adjacent to CN1. CN1 fell between the second and seventh notes, and CN2 always appeared after CN1, between notes 3 and 11. CN1 occurred on average at note position 3.4, which was approximately 1.03 s from the beginning of the tune. CN2 occurred on average at note position 6.8 and 2.43 s from the beginning of the tune. Each participant heard each tune twice, resulting in 42 examples of the major critical notes and 42 examples of the minor critical notes (as in Halpern et al., 2008). The tunes were synthesized in a piano timbre and saved as MIDI files (contact author AH for more information on these stimuli). On average, tunes were 4.7 s long. Six additional tune pairs served as practice materials. This study was approved by the institutional review board at the Pennsylvania State University.

Fig. 1.
figure 1

Example tune with critical notes marked by red circles.

All tunes were presented and responses were recorded using E-Prime version 2.0 (Psychology Software Tools, Pittsburgh, PA) interfaced with Net Station software version 4.3 (EGI Philips, Eugene, OR) for the collection of continuous EEG recordings. Auditory stimuli were played via a single speaker situated approximately 60 cm in front of the participant. Data were collected using a 128-channel Geodesic Sensor Net (EGI Philips, Eugene, OR) with the reference at the vertex. Data were acquired continuously throughout the session and sampled at a rate of 1 kHz. Channel impedances were maintained at 50 kΩ or less before and throughout each testing session.

Mode-labeling task

Participants completed the mode-labeling task during an electroencephalography (EEG) session, which took place in a sound-attenuated, shielded chamber; sessions lasted approximately 1 hour. Following the completion of the audiometric screening (Carhart & Jerger, 1959) and questionnaires, the EEG electrode net was applied and adjusted until impedances were at or below criterion. Participants were then given instructions for the experimental task. The instructions were based on those used in Halpern et al. (2008). Participants were randomly assigned to categorize melodies as either "major or minor" or "happy or sad." As in our prior study, this variable was included to ensure that the labels themselves did not exert any additional influence on the behavioral decision (Halpern et al., 2008). Participants were told that tunes could come in one of two "flavors" (happy or sad, or major or minor), depending on the participant's assigned labels. No additional information was provided about the reason for these labels. Examples were played until the participant indicated understanding of the mode differences, with the researcher providing the correct label in each case to give the participant experience with the stimuli and their labels. For each classification, participants were asked to decide the category of the tune as quickly as possible (i.e., they did not need to wait until the tune stopped playing) by pressing one of two buttons on a button box. Responses reflected the participants' label decisions rather than the detection of the critical notes themselves. All melodies played from beginning to end regardless of when the response was provided, to ensure consistency in the amount of EEG data collected and allow for accurate comparison across participants.

Next, participants completed six practice classifications with feedback. Practice melodies were designed to mimic the experimental melodies, and participants identified these as major/minor or happy/sad based on their task instruction group assignment. Following practice, each of the 42 items was presented twice, for a total of 84 trials. These were presented in 8 blocks of 11 or 12 items, with brief breaks in between. All items were played once before being repeated, and no tune occurred immediately after its other-mode twin. Otherwise, presentation order was randomized for each participant. Reaction times were recorded with respect to the most recent critical note and were analyzed only for correct responses. Reaction times were quantified as the difference between the onset of the most recent CN and the button press. For example, when the response occurred after CN2, reaction time was calculated as the latency between the onset of CN2 and the button press.
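To make the timing convention concrete, the following minimal Python sketch implements the RT rule described above; the function and variable names are illustrative and not taken from the actual analysis code.

def reaction_time(press_s, cn1_onset_s, cn2_onset_s):
    """RT (s) relative to the onset of the most recent critical note."""
    if press_s >= cn2_onset_s:        # response followed CN2
        return press_s - cn2_onset_s
    return press_s - cn1_onset_s      # response followed CN1, before CN2

# Example with the average onsets reported above (CN1 at 1.03 s, CN2 at
# 2.43 s): a button press at 3.00 s gives RT = 0.57 s relative to CN2.
print(reaction_time(3.00, 1.03, 2.43))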

EEG data analysis

All EEG recordings were pre-processed and analyzed offline using the Brainstorm package in Matlab (Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011). Due to the relatively low number of trials (21 melodies per mode, each presented twice), we included all trials in the EEG analysis, regardless of response accuracy. No trials needed to be excluded due to movement noise. Data were bandpass filtered between 0.5 and 80 Hz with a notch filter at 60 Hz to remove electrical interference from the recording environment and then normalized. Blink and cardiac artifacts were identified by an experienced observer and removed from the signal. Finally, data were low-pass filtered at 20 Hz and normalized to 100 ms of baseline activity. Baseline normalization was conducted prior to the extraction of time windows related to the critical notes; thus, figures displaying neural activity just before a critical note represent activity related to the processing of the preceding note, not normalized baseline activity. Three components of interest with respect to each critical note were then extracted: the N1, the P2, and the P300, referred to here as the late positive component (LPC), based on specific time windows described below. To extract these ERP components, we used the time course of the global field power (GFP; Skrandies, 1989, 1990), calculated across the entire set of electrodes separately for each individual observer. Peaks in the GFP as a function of time were then used to guide selection of peak amplitudes for each of the components.
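For illustration, the GFP step can be sketched in a few lines of Python/NumPy (the original analyses were run in Brainstorm under Matlab); the array names here are hypothetical.

import numpy as np

def global_field_power(epochs):
    """GFP time course: spatial standard deviation across electrodes.

    `epochs` is a hypothetical array of shape (trials, channels, samples),
    already filtered and baseline-normalized as described above.
    """
    erp = epochs.mean(axis=0)   # per-participant average (channels x samples)
    return erp.std(axis=0)      # std across channels at each sample

def gfp_peak_latency(gfp, times_s, tmin_s, tmax_s):
    """Latency (s) of the GFP maximum within a search window."""
    mask = (times_s >= tmin_s) & (times_s <= tmax_s)
    return times_s[mask][np.argmax(gfp[mask])]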

We visually inspected the data for each of the electrodes to select those electrodes that reflected the components of interest (Fig. 2). Peak values from these electrodes (left and right temporal/occipital for the LPC, frontal/parietal for N1/P2), along with time-to-peak values, were then used in each of the statistical analyses. Amplitude calculations were done individually using the average waveform for each participant in the following time windows. For the LPC component, we used a window of 475-575 ms to encompass the peak of the GFP in the set of temporal and parietal electrodes, which was 526 ms. This peak latency is similar to that found in our previous study (537 ms; Halpern et al., 2008). For the N1 and P2 components, we used a 50-ms window of ±25 ms around the peak of the GFP in the set of frontal electrodes. For the N1, this window was 100-150 ms, and for the P2, 175-225 ms.
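A minimal sketch of this per-participant windowed peak extraction, assuming a hypothetical erp array restricted to the selected electrode subset:

import numpy as np

def peak_in_window(erp, times_s, tmin_s, tmax_s, polarity=+1):
    """Signed peak amplitude and latency within [tmin_s, tmax_s].

    `erp` is a hypothetical (electrodes x samples) average waveform for the
    selected electrode subset; polarity=-1 finds a negative peak (N1).
    """
    trace = erp.mean(axis=0)                  # collapse the electrode subset
    mask = (times_s >= tmin_s) & (times_s <= tmax_s)
    idx = np.argmax(polarity * trace[mask])
    return trace[mask][idx], times_s[mask][idx]

# Windows used above: LPC 0.475-0.575 s (polarity=+1), N1 0.100-0.150 s
# (polarity=-1), P2 0.175-0.225 s (polarity=+1).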

Fig. 2.
figure 2

Electrodes used for N1/P2 and LPC analyses were identified by an experienced observer using the grand average waveform of both critical notes to identify the components of interest. Frontal/parietal electrodes used for N1/P2 analysis are highlighted in blue. Temporal/occipital electrodes used for LPC analysis are highlighted in red.

We employed source estimation for two analyses. First, to evaluate the early anterior negativity, we extracted activation from the bilateral pars opercularis regions of interest (Maess, Koelsch, Gunter, & Friederici, 2001). EEG data were first mapped onto the cortical mantle derived from Freesurfer automatic segmentation (Fischl, 2012). This was accomplished by first calculating a standard head model using an overlapping spheres model (Huang, Mosher, & Leahy, 1999). Next, an inverse model was computed using sLORETA (Pascual-Marqui, 2002). Finally, we extracted the time series from bilateral pars opercularis, a region of interest derived from the Desikan-Killiany atlas corresponding to the inferior frontal gyrus (Desikan, Ségonne, & Fischl, 2006). We chose this region of interest based on prior studies that identified this region as the source of early anterior negativity effects in music processing (Maess et al., 2001). Second, to estimate limbic system activation, we extracted the average power in the alpha (8-12 Hz) and beta (13-20 Hz) ranges from the bilateral prefrontal cortex using the pars orbitalis region of interest. We then calculated asymmetry by subtracting the average right-hemisphere power from the average left-hemisphere power during a 1-s time window beginning at the onset of the first critical note, as in Daly et al. (2019). Higher values indicate greater asymmetry, suggesting higher limbic activation, whereas values closer to zero indicate lower asymmetry and reduced limbic activation (Daly et al., 2019).
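The asymmetry index itself reduces to a band-power subtraction; the following NumPy sketch (a periodogram-based estimate, assumed here rather than taken from the Brainstorm implementation) shows the computation for hypothetical left and right ROI source time series.

import numpy as np

def band_power(x, fs_hz, f_lo, f_hi):
    """Mean power of a 1-D source time series within a frequency band."""
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs_hz)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs_hz * x.size)
    return psd[(freqs >= f_lo) & (freqs <= f_hi)].mean()

def frontal_asymmetry(left, right, fs_hz, f_lo, f_hi):
    """Left-minus-right band power over the 1-s window after CN1 onset."""
    return band_power(left, fs_hz, f_lo, f_hi) - band_power(right, fs_hz, f_lo, f_hi)

# alpha asymmetry: frontal_asymmetry(left, right, 1000, 8, 12)
# beta asymmetry:  frontal_asymmetry(left, right, 1000, 13, 20)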

Two-alternative forced-choice classifier

To determine the strength of the LPC difference across groups, we modified an established two-alternative forced-choice classifier (Centanni et al., 2013; Engineer et al., 2008) to evaluate whether the LPC was sufficient to determine group membership (musician vs. nonmusician). For each participant and each mode (major vs. minor), the classifier created a template for the nonmusician LPC response and the musician LPC response while leaving the current participant out of the templates. The classifier then calculated the city block distance between the single participant and each of the templates. The classifier assigned the participant to the group with the smaller city block distance (i.e., the template most similar to the participant being evaluated). Chance performance was 50%.
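A minimal Python sketch of this leave-one-out template procedure, with hypothetical (participants x samples) arrays of LPC waveforms:

import numpy as np

def assign_group(waveform, mus_template, non_template):
    """Assign the group whose template has the smaller city block (L1) distance."""
    d_mus = np.abs(waveform - mus_template).sum()
    d_non = np.abs(waveform - non_template).sum()
    return "musician" if d_mus < d_non else "nonmusician"

def loo_accuracy(musicians, nonmusicians):
    """Leave-one-out accuracy; the held-out participant never enters a template."""
    hits = 0
    for i in range(len(musicians)):
        template = np.delete(musicians, i, axis=0).mean(axis=0)
        hits += assign_group(musicians[i], template,
                             nonmusicians.mean(axis=0)) == "musician"
    for i in range(len(nonmusicians)):
        template = np.delete(nonmusicians, i, axis=0).mean(axis=0)
        hits += assign_group(nonmusicians[i], musicians.mean(axis=0),
                             template) == "nonmusician"
    return hits / (len(musicians) + len(nonmusicians))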

Statistical Analyses and Power Estimate

Using the effect size of Cohen's d = 1.27, calculated from data reported in our previous study using the same stimuli (Halpern et al., 2008), an a priori power analysis with α = 0.05 and 1 − β = 0.80 yielded a target sample size of 11 per group. Our sample sizes exceed this minimum estimate and therefore provide adequate power for the analyses described here. Unless noted otherwise, data for each of the dependent variables were analyzed using 2 (group: nonmusician, musician) x 2 (mode: major, minor) mixed analyses of variance (ANOVAs), with group being a between-subjects factor, and an α-level of 0.05. For ERP analyses, critical note (CN1, CN2) was added as a factor. All t-tests were unpaired unless noted otherwise. Comparisons in the N1, P2, and LPC components of the ERP signal were done using one-tailed t-tests, given that our hypotheses for these metrics were based on our previous ERP findings using these stimuli in musicians versus nonmusicians. Furthermore, given the reported associations between musicianship and the ERAN in other musical contexts (Jentschke & Koelsch, 2009; Koelsch, Schmidt, & Kansok, 2002), we also used one-tailed t-tests for these comparisons. The Bonferroni correction was used to account for multiple comparisons within each analysis set (e.g., within the set of LPC amplitude and latency analyses), acknowledging the overly conservative nature of this correction.
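For reference, the reported target sample size can be reproduced in Python with statsmodels (the software actually used for the power analysis is not specified in the text):

from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=1.27, alpha=0.05,
                                          power=0.80, alternative='two-sided')
print(round(n_per_group))   # ~11 participants per group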

Results

Behavior: accuracy and reaction time

Regarding accuracy, there was a significant main effect of group (p = 0.014; Table 1), such that musicians achieved higher accuracy (91 ± 3% correct) than nonmusicians (79 ± 3% correct; Table 2), but no main effect of mode (p = 0.085). There was a significant interaction between group and mode (p = 0.008) such that nonmusicians were more accurate classifying the minor mode (80.6%) than the major mode (77.4%; paired, two-tailed t-test, t(14) = 2.3, p = 0.04), whereas musicians' accuracy did not differ between the major (92.6%) and minor (88.8%) modes (paired, two-tailed t-test, t(18) = 0.73, p = 0.48; Table 2). There was no effect of label (major/minor or happy/sad) on accuracy in either group (musicians: t(17) = 0.80, p = 0.43; nonmusicians: t(13) = 0.6, p = 0.56). These findings support our earlier result (Halpern et al., 2008) that label does not impact behavior.

Table 1 Repeated measures ANOVAs for behavioral performance on the mode identification task
Table 2 Behavioral performance on the mode identification task

Reaction time (RT) was calculated with respect to each critical note, such that the RT represented the difference between the onset of the most recent critical note and the button press. For musicians, most responses were made after the first critical note but before the second (only 35.4% of responses followed CN2 in major melodies and 31.3% in minor melodies). For nonmusicians, approximately half of the responses were not made until after the second critical note (57.6% of responses followed CN2 in major melodies and 50.1% in minor melodies).

With respect to RT to the first critical note, there were significant main effects of group and mode, but no interaction. With respect to RT relative to the onset of the second critical note, there also were significant main effects of group such that musicians were faster than nonmusicians and mode such that RTs were shorter for minor modes than major modes. There was no interaction between group and mode.

Compared with the nonmusicians in our prior study (approximately 69% correct; Halpern et al., 2008), the nonmusicians here were considerably more accurate (approximately 78.5% correct overall), but the superiority of musicians in speed and accuracy was replicated (i.e., the musicians were not trading speed for accuracy). We also replicated the finding that musicians were numerically more accurate on major than minor tunes. All respondents were faster to judge minor tunes with respect to both critical notes, an effect that was obtained in the prior study only with musicians (only CN1 was assessed in Halpern et al., 2008). Thus, compared with the prior study, nonmusicians were more adept and appeared behaviorally more similar to the musicians in some respects. Consequently, any divergence in the ERP results between the groups would indicate qualitative differences in approach to the task.

ERPs: N1/P2 and late positive component

Although the N1 and P2 components were visibly more distinct in response to CN1 (Fig. 3A-B) than to CN2 (Fig. 4A-B), there were few statistically significant differences. There were no significant main effects for the peak amplitudes of the N1 or P2 components (Table 3). To ensure that label condition (happy/sad vs. major/minor) did not influence the LPC amplitude, we ran repeated-measures ANOVAs in each group separately (Table 3). Regarding CN1, in musicians there was no significant main effect of label condition, but there was a significant main effect of mode; there was no interaction between label condition and mode. In nonmusicians, there was no main effect of label condition or mode on LPC amplitude. Regarding CN2, in musicians there was no significant main effect of label condition and no main effect of mode. In nonmusicians, there was similarly no main effect of label condition or mode on LPC amplitude.

Fig. 3
figure 3

Grand average wave-forms for the first critical note (CN1) as a function of group and mode. Major mode responses are shown in dark lines and Minor mode responses are shown in light lines. N1/P2 components in musicians (A) and nonmusicians (B). LPC component in musicians (C) and nonmusicians (D).

Fig. 4
figure 4

Grand average wave-forms for the second critical note (CN2) as a function of group and mode. Major mode responses are shown in dark lines and Minor mode responses are shown in light lines. N1/P2 components in musicians (A) and nonmusicians (B). LPC component in musicians (C) and nonmusicians (D).

Table 3 Repeated measures ANOVA results for ERP components

*p< 0.05; **p < 0.01; ***p < 0.001

There was a significant main effect of critical note on the amplitude of the LPC but no main effect of group. There was a significant interaction between critical note and group: the major-minor LPC difference was present in response to CN1 in musicians (paired t-test, t(18) = 3.89, p = 0.001) but not in nonmusicians (t(15) = 0.91, p = 0.38), and was absent in response to CN2 in both groups (musicians: t(18) = 0.54, p = 0.59; nonmusicians: t(15) = 0.23, p = 0.82; Figs. 4C-D). Given the lack of an LPC to CN2 in either group, we focused the remainder of our analyses on CN1 to evaluate the relationship of musical training to the LPC. There was a significant main effect of mode and a trend toward an interaction between group and mode. Planned post hoc analyses revealed that musicians exhibited a significantly larger peak amplitude to the first minor critical note (1.38 ± 0.02 μV) than did the nonmusicians (0.09 ± 0.01 μV; unpaired, one-tailed t-test, t(33) = 2.29, p = 0.014). In the nonmusicians, the amplitude to the minor CN1 did not differ significantly from the amplitude to the major CN1 (paired, two-tailed t-test, t(15) = 1.30, p = 0.21), but among musicians, the amplitude to the minor CN1 was larger than the amplitude to the major CN1 (mean major CN1 amplitude: 0.09 ± 0.02 μV; paired, two-tailed t-test, t(18) = 3.27, p = 0.004). Thus, the musician brain appears sensitive to the contrast between major and minor at CN1, with no apparent sensitivity in nonmusicians.

With respect to the latency of the N1 component, there were no reliable main effects of group or critical note and there was no interaction between group and critical note. There was no main effect of group on P2 latency, but there was a significant main effect of critical note such that the latency for CN2 was longer than for CN1. This effect was mainly driven by the musicians (CN1 latency: 201.5 ± 1.2 ms vs. CN2 latency: 204.1 ± 0.5 ms, paired, one-tailed t-test: t (18) = 3.08, p = 0.003) rather than the nonmusicians (CN1 latency: 199.7 ± 0.9 ms vs. CN2 latency: 201.7 ± 0.7 ms, paired, one-tailed t-test: t (15) = 1.60, p = 0.065). There was no interaction between group and critical note. Given the lack of reliable LPCs for the major tunes and for the second critical note, the data for time to peak amplitude of the LPC were analyzed only for the first critical notes in the minor tunes. There were no main effects of group or mode on latency of the LPC.

Classification of musical training using the LPC

We evaluated whether a nearest-neighbor classifier could distinguish participants with and without musical training on the basis of the LPC response to the first critical note. The classifier compared a single participant's response to either a major or minor CN1 and used the city block distance to the average musician and nonmusician templates to determine that participant's group membership. We used a leave-one-out approach in which the participant undergoing classification was not part of the group's template. As expected, the classifier performed poorly (chance level: 50%) when given CN1 responses to major notes (45.7% accuracy; comparison to chance, t(34) = 1.17, p = 0.25) but performed above chance in classifying musicians and nonmusicians when given CN1 responses to minor notes (57.1% accuracy; comparison to chance, t(34) = 3.69, p = 0.0007). Classification using minor notes was significantly more accurate than classification using major notes (t(34) = 4.78, p = 0.00003).

We also used correlations to determine whether the size of the LPC was indicative of accuracy on the task. When considering the entire sample, there were no significant relationships between the LPC amplitude to CN1 and accuracy or RT in either mode (ps > 0.06). Interestingly, there were significant relationships across the entire sample between the LPC amplitude to CN2 and both major (r = −0.44, p = 0.01) and minor (r = −0.42, p = 0.01) accuracy, but this finding was driven by the musicians (major: r = −0.64, p < 0.01; minor: r = −0.51, p = 0.02).

Early anterior negativity response to minor notes in musicians

The early right anterior negativity response has been observed in both musicians and nonmusicians listening to chords and is usually present in the 100-200 ms range following the unexpected musical component (Koelsch et al., 2000; Maess et al., 2001). Previous work localized the source of the early right anterior negativity signal in music to the pars opercularis region (Maess et al., 2001). Therefore, we extracted the signal from bilateral pars opercularis (a region of the inferior frontal gyrus [IFG]) using an anatomical region-of-interest mask in the Brainstorm software (Tadel et al., 2011). We then evaluated the signal for evidence of this negativity to the first critical note. To capture the early right anterior negativity component, we focused on a time window between 100-250 ms after onset of the critical note (shaded in gray in both panels of Fig. 5). This time window was chosen to encompass the range of early right anterior negativity responses to music reported in the literature (Koelsch et al., 2000; Koelsch et al., 2002; Loui, Grent, Torpey, & Woldorff, 2005; Tillmann et al., 2003). The average early negativity response (calculated as the major CN1 response minus the minor CN1 response) across the time window differed significantly between groups in the right hemisphere but not in the left hemisphere. In the right hemisphere, musicians exhibited a larger early negativity (−2.85 ± 1.65) than nonmusicians (2.51 ± 2.06; unpaired, one-tailed t-test, t(33) = 2.12, p = 0.02; Fig. 5B). In the left hemisphere, there were no differences between musicians (1.55 ± 1.37) and nonmusicians (0.005 ± 1.92; unpaired, one-tailed t-test, t(33) = 0.67, p = 0.25; Fig. 5A).
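A short sketch of this difference measure, assuming hypothetical (participants x samples) arrays of source waveforms for one hemisphere:

import numpy as np

def early_negativity(major_cn1, minor_cn1, times_s, tmin_s=0.100, tmax_s=0.250):
    """Per-participant mean of the (major - minor) CN1 source difference
    within the 100-250 ms window after critical-note onset."""
    mask = (times_s >= tmin_s) & (times_s <= tmax_s)
    return (major_cn1 - minor_cn1)[:, mask].mean(axis=1)

# Group contrast: a one-tailed unpaired t-test of musician vs. nonmusician
# values, e.g., scipy.stats.ttest_ind(mus_vals, non_vals, alternative='less'
# or 'greater', depending on the hypothesized direction).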

Fig. 5.
figure 5

Early anterior negativity in inferior frontal gyrus. Responses were evaluated in the 100-250 ms time window after the onset of the first critical note and this window is shaded in gray. A There were no group differences in the response to major versus minor tones in the first critical note position in left hemisphere based on musical training. B There was a significant difference between major versus minor responses in right hemisphere as a function of musical training such that musicians exhibited a stronger early right anterior negativity signal compared to non-musicians (unpaired, one-sided t-test, p = 0.02).

Prefrontal asymmetry as a measure of limbic system activation

To evaluate whether the neural response to emotional content in the melodies differed between the training groups, we calculated asymmetry in the α and β frequency bands in the prefrontal ROI (Daly et al., 2019). Prefrontal asymmetry in these frequency bands, as measured by EEG, corresponds strongly to participant-reported emotional valence and arousal detected in the piece (Davidson & Hugdahl, 1996; Mikutta et al., 2012; Schmidt & Trainor, 2001; Yuvaraj et al., 2014) as well as to fMRI activation in a variety of brain regions, including the amygdala, during music listening (Daly et al., 2019). There was observable asymmetry in both groups, in both frequency bands, and in response to both modes, as indicated by asymmetry scores (left − right) greater than 0 (Fig. 6), suggesting that musical training/predisposition does not influence this neural metric. In the α band, there was no significant main effect of group (F(1, 33) = 0.09, p = 0.77), but there was a trend toward an effect of mode (F(1, 33) = 3.96, p = 0.055; Fig. 6A); there was no interaction (F(1, 33) = 0.09, p = 0.76). In the β band, there was no significant main effect of group (F(1, 33) = 1.00, p = 0.33) or mode (F(1, 33) = 0.62, p = 0.44) and no interaction (F(1, 33) = 0.21, p = 0.65; Fig. 6B).

Fig. 6.
figure 6

Inter-hemispheric alpha and beta power in prefrontal cortex. Group mean ± standard error of the mean log-transformed hemispheric asymmetry values, measured in μV, in the α (A) and β (B) frequency bands. Larger values indicate greater asymmetry.

Discussion

The present study replicates the main findings of Halpern et al. (2008), using a high-density electrode array and, critically, expands the findings to further characterize the neural correlates of mode perception in trained musicians. In both studies, musicians more so than nonmusicians showed a late-onset EEG response to the first note that signified a tune as being minor mode. Neither group exhibited a response to the critical note in the major mode, even though both groups demonstrated a strong onset response (N1/P2) to that note. With regard to our first question, we demonstrated that there was no relationship between musical training and the LPC at the second critical note. Furthermore, the neural response to the CN2 does not appear to provide additional information for mode classification. Regarding our second question, we found that the LPC signal is a reliable marker for musical training, as this neural metric was sufficient to identify a participant’s musician status. Regarding the first part of our third research question, we found that musicians exhibited evidence of an early anterior negativity response in the right hemisphere to minor versus major critical notes, whereas no such signal was present in the group of nonmusicians. Finally, we report no differences in prefrontal asymmetry in the alpha or beta bands, suggesting no differences in emotional response to these melodies in the two groups.

Relevance of a second-critical note for nonmusicians

A novel aspect of this study was that we were able to contrast an initial versus a second onset of the mode-defining critical note. We had thought that giving nonmusicians a second chance to use a classification point might elicit a time-locked response similar to that obtained for the musicians, but in fact the nonmusicians showed no LPC to the second critical note in the minor tunes. Although the nonmusicians were able to perform the task, neither critical note in the minor tunes captured attention in the same way that it did for the musicians. Thus, although we do not yet know how the nonmusicians were making their classification decision, we do know that the decision was not accompanied by the kind of attentional effects observed for the musicians. In other words, it is unlikely that the summed exposure to multiple critical notes provided any advantage to nonmusicians. Furthermore, once trained musicians attended to and used the first critical note, the second was of little additional value.

The LPC as a marker for musical training

Because major and minor tunes were presented equally often, and indeed were musically equivalent except for the critical notes, participants were unable to use the experimental design as a cue as to the melody’s mode. In the canon of Western music, the minor mode is less common than the major (Bowling et al., 2009; Halpern et al., 2008). We note that both musicians and nonmusicians reacted more quickly to minor tunes in this study. This suggests that even untrained listeners are sensitive to the statistical regularities of Western music, including popular and culturally traditional music (Everett, 2004). In fact, a number of prior studies have demonstrated that nonmusicians are sensitive to a variety of musical parameters (including tonality, rhythm, and musical style) and are capable of behavioral performance that mimics trained musicians (Bigand, 2003). However, despite the comparable decreased reaction time to minor melodies across both of our groups, the specificity of the LPC to minor critical notes in musicians and its ability to predict musical training group suggest that the time-locked LPC response itself is mediated by the experience and practice present in the musician group. Thus, just as acoustic deviants capture attention in the sensory domain (Sussman, Winkler, & Schroger, 2003), minor critical notes also may capture attention in the cognitive domain.

These results raise a set of questions about the informational processes supporting the behavioral decision, along with the neural bases for those processes. Most prominent are the questions associated with the manner in which knowledge and expertise modulate attention. Trained musicians attending to a specific instrument in the context of natural music exhibit increased recruitment of auditory-general attention network nodes, including temporal, parietal, and frontal regions, compared with nonmusicians (Janata, Tillmann, & Bharucha, 2002). These domain-general regions are also more activated in musicians than in nonmusicians during a distorted-tone task, especially in temporal and frontal regions (Seung, Sug-Kyong, Woo, Lee, & Lee, 2005). Thus, it is possible that increased musical training facilitates the recruitment of domain-general attentional networks in tasks that require the identification of a specific musical element. However, because musical experience is a correlational variable, we cannot completely dismiss the possibility that musicians are born predisposed to this ability (Trehub, 2003). The fact that the onset response was identical in both groups, and that major and minor modes, as well as their relative frequency of occurrence, are culturally defined, points to a large component of specific learning. Selecting musicians with a wider range of musical experience could speak to whether the number of years of training modulates the minor LPC effect.

We also observed a relationship in musicians between LPC amplitude and accuracy in both modes, and with both CNs. Interestingly, these relationships were both negative such that reduced amplitude corresponded with higher accuracy. This finding suggests that even though there were no main effects of CN2, this note does hold value, especially for musicians. One possible interpretation of these findings relates to our proposal that musicians are trained to recognize and use the critical note in mode perception, whereas the second critical note confirms or denies the listener’s expectation. If the first critical note is processed accurately, the second critical note should verify the expectation and thus, not elicit a second LPC signal. Under this theory, a reduction in the LPC to the second critical note would serve as a confirmation of the listener’s expectation and support the behavioral choice. For example, if a musician is expecting a major mode, the first CN in a minor mode will elicit an LPC. The musician listener then adjusts his or her expectation from major to minor. The second CN, also in minor, now fits the expectation and thus does not elicit an LPC. The degree to which the CN2 LPC is reduced may correspond to the degree of certainty or confidence of the listener in making the behavioral choice. It is unclear, however, why we did not observe a significant relationship between the minor CN1 LPC and accuracy in musicians, given the presence of this signal and its success in identifying those with musical training from those without. Future research in this area should probe this open question.

Neural correlates of unexpected events in music

The early right anterior negativity (ERAN) response is hypothesized to signify a syntactic error in a musical sequence—for example, in an unexpected final chord (Jentschke & Koelsch, 2009). The right inferior frontal gyrus is reported as the source of this signal (Maess et al., 2001), and the amplitude of this signal increases as a function of musical training (Koelsch et al., 2002). Previous work on the ERAN component has evaluated unexpected musical elements, including chords (Koelsch et al., 2002; Jentschke & Koelsch, 2009) in a sequence, but our study is the first to report the ERAN in response to the critical note in a minor mode. Given that major modes are more frequent in Western music, it could be that the appearance of a major CN is expected in the context of a novel melody. Therefore, the appearance of a minor critical note may represent a departure from the expected, driving the ERAN response. The observation that this effect is present only in trained musicians suggests training/exposure reinforces statistical learning of which mode is more common. Interestingly, this neural difference contrasts with the behavioral finding that both groups were able to distinguish between the modes and suggests different approaches to solving the problem. For trained musicians, the role of the critical note in defining the mode is well understood. However, nonmusicians likely rely more on the perception of the melody as a whole to make their judgment, rather than relying on a single note.

The appearance of an ERAN to the minor critical note in musicians is reminiscent of the early left anterior negativity (ELAN) component that often is reported in relation to syntactic errors in speech and language processing in adults (Friederici et al., 1993; Friederici, Wang, Herrmann, Maess, & Oertel, 2000) and in children (Oberecker, Friedrich, & Friederici, 2005). The link between neural correlates of language and music processing is of increasing interest, especially in groups using these similarities for clinical purposes (for review, see Norton, Zipse, Marchina, & Schlaug, 2009). Interestingly, children with early musical training exhibit an earlier and stronger ELAN response during syntax processing than their nonmusically trained peers (Jentschke & Koelsch, 2009). Mounting evidence from a variety of subfields suggests that musical training is related to benefits in speech and language, including reading ability (Hallam, 2018), speech perception in noise (Parbery-Clark, Skoe, Lam, & Kraus, 2009), and language rehabilitation after stroke (Bonakdarpour, Eftekharzadeh, & Ashayeri, 2000; Schlaug, Marchina, & Norton, 2008; Wilson, Parsons, & Reutens, 2006). The link between syntax processing in music and in language provides additional support for the overlap between these two cortical networks and supports ongoing research into the potential benefits of musical training for children at risk for language and/or reading impairments.

Emotional processing in musical melody

The emotional content of music is arguably one of its most salient features, with the use of major and minor modes playing a vital role in establishing the emotional content of a piece. Spatially precise methods, such as fMRI, demonstrate that listening to a piece of emotional music, as rated by the participant, activates deep brain structures in the limbic system, including the amygdala, the hippocampal formation, the right ventral striatum, and the left caudate nucleus (Brattico et al., 2011; Koelsch, Fritz, von Cramon, Müller, & Friederici, 2006; Mueller et al., 2011). Although researchers have long reported frontal asymmetry in response to the emotional content of music (Davidson & Hugdahl, 1996; Mikutta et al., 2012; Schmidt & Trainor, 2001; Yuvaraj et al., 2014), until recently the neural source of this marker was unknown. A simultaneous EEG and fMRI study reported strong correlations between the degree of alpha and beta asymmetry in prefrontal cortex (as measured by EEG) and activation in auditory cortex, cerebellum, and amygdala (as measured by fMRI) during an emotionally valent and arousing task (Daly et al., 2019). In the current study, we also observed prefrontal asymmetry in both of these frequency bands in response to both minor and major modes, suggesting activation in one or more of these fMRI-identified regions during the listening portion of our study. There was no association between musical training and the degree of asymmetry observed. As described above, nonmusicians are sensitive to a variety of musical parameters (including tonality, rhythm, and musical style) and are capable of behavioral performance that mimics trained musicians (Bigand, 2003). The lack of an effect in frontal asymmetry suggests that emotional processing in this respect is a general human skill. It is important to note, however, that the link between frontal asymmetry and emotional processing is a tentative one, and so these data should be interpreted with caution. Interestingly, there was a trend toward an effect of mode in the alpha band, suggesting that the minor mode may carry greater emotional information than the major mode. Additional studies with larger samples and real-time measurements of participants' emotional state are needed to test this hypothesis.

Limitations of the current study

There are three main limitations to the current study. First, although ERP data provide a reasonably good window on the temporal course of cortical processing, they speak at best only coarsely to the brain regions involved in that processing. Although we used an averaged template MNI brain to estimate sources, future studies will need to combine ERP with individualized structural MRI data to better estimate the cortical sources of the effects we have documented and replicated, or to replicate the current study using a technique with better spatial precision, such as magnetoencephalography (MEG). This would shed light on the nature of the attentional processes engaged by musicians and the strategies used by nonmusicians to classify tunes. Second, our original study using these stimuli utilized a low-density electrode net placed according to the 10-20 convention and thus referenced to the mastoid electrodes, whereas the current study used a high-density 128-electrode net referenced to the Cz electrode at the vertex, in line with other high-density systems. The use of different references may affect the comparison of results across studies (Joyce & Rossion, 2005). Finally, we did not include a battery of neuropsychological tests to determine whether the two groups were matched on baseline factors in other domains, such as nonverbal IQ or working memory. Future studies should include these measures to account for potential third variables.