Introduction

Research has shown that time spent on distraction or mind wandering during the day amounts to 46% (Killingsworth and Gilbert 2010), which highlights that limited attentional resources have become a growing concern in society. Thus, there is a need to investigate whether there are “on the spot”, low-cost, and non-invasive interventions that may enhance attention. Focus music may be an intervention that meets these criteria and indeed may extend to the cognitive domain and improve cognitive performance.

Previous research has shown that use of different types of music enhances sustained attention compared to a control group (Kirk et al., 2019), a finding supported elsewhere in the literature (Carter & Russell, 1993; Chaieb et al., 2015; Colzato et al., 2017; Guo et al., 2018; James et al., 2020; Lane et al., 1998; Peretz & Zatorre, 2005). Furthermore, different types of music such as high-frequency beats increase alertness and attentional focus (Turow & Lane, 2011). Music has also been shown to exert positive effects on physiological responses (Khalfa et al., 2003; Nater et al., 2006; Nyklicek et al., 1997). Specifically, research has shown that listening to music may decrease sympathetic nervous system (SNS) activity (Burns et al., 2002; Hodges, 2010; Knight & Rickard, 2001; Standley, 1992). However, studies have also demonstrated that background music can disrupt cognitive performance (Kämpfe et al., 2010). Specifically, a meta-analysis on the effect of background music found that background music in some cases disrupts text comprehension while reading as well as verbal memory performance (Kämpfe et al., 2010). In addition, effects of music on cognitive performance differ depending on the type of music.

The term “music” refers to many musical genres and styles, and therefore, music evokes a range of emotional and cognitive effects. Thus, research on music yields a variety of different physiological effects due to the differences in musical genres and styles. Specifically, self-selected relaxing music has been shown to attenuate the heart rate (HR) (Allen & Blascovich, 1994). HR changes have been reported consistently in other studies whereby some studies found a decrease in HR during listening to relaxing music (Burns et al., 2002; Knight & Rickard, 2001) while exciting music elevates the HR (Blood & Zatorre, 2001). Relaxing music has been shown to decrease HR and respiration rate, and to effectively switch from sympathetic nervous system (SNS) to parasympathetic nervous system (PNS) activity (Sakamoto et al., 2013), while another study found decreased SNS activity when listening to peaceful music (Sandstrom & Russo, 2010).

Based on the findings reviewed above, it seems that relaxing background music due to its dampening effect on the SNS might be a candidate style of music that has the potential to impact cognitive performance. Thus, the present study sought to investigate if listening to relaxing background music could increase attentional resources. Specifically, the study employed music from three different genres of music, namely piano, jazz, and lo-fi, to explore potential differential effects of musical genres on cognitive performance.

As the physiological effects through which music influences cognitive performance are not entirely known, the present study furthermore aimed to examine the physiological effects of listening to music. The autonomic nervous system (ANS) may serve as one path by which music exerts its effect (Ellis & Thayer, 2010), which can be explored by the assessment of heart rate variability (HRV) (Bernston et al., 1997; Shaffer et al., 2014; Task Force of the European Society of Cardiology & the North American Society of Pacing Electrophysiology, 1996; Thayer et al., 2012).

HRV refers to the fluctuation in time intervals between consecutive heartbeats (Bernston et al., 1997; Shaffer et al., 2014; Task Force of the European Society of Cardiology & the North American Society of Pacing Electrophysiology, 1996; Thayer et al., 2012). HRV reflects the balance of the cardiovascular system controlled by the SNS and PNS parts of the ANS. The two branches of the ANS influence cardiac activity; SNS activation accelerates HR while PNS activation is primarily responsible for its deceleration. HRV is regulated by the integration of afferent projections to the brain via the vagal nerve, and efferent projections connecting the prefrontal cortex with the amygdala and the brain stem where parasympathetic output to the sinoatrial node of the heart is gated (Bernston et al., 1997). HRV quantifies the change in the time intervals between consecutive heartbeats and serves as an index of SNS and PNS activity at any given time (Bernston et al., 1997). Quantification of HRV parameters can broadly be classified into time and frequency domain measures. The primary time domain measure is the root mean square of successive differences (RMSSD), which reflects the beat-to-beat variance in HR. RMSSD is typically used to estimate vagally mediated changes reflected in HRV (Shaffer et al., 2014). RMSSD is reported in milliseconds, with higher RMSSD indicating increased parasympathetic modulation (Bernston et al., 1997; Malik et al., 1996).
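To make the RMSSD measure concrete, the following minimal Python sketch computes it from a series of R-R intervals given in milliseconds. The function name `rmssd` and the example intervals are illustrative assumptions and are not taken from the study's data or software.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between R-R intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("RMSSD needs at least two R-R intervals")
    # Differences between each pair of consecutive intervals
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    # Square the differences, average them, and take the square root
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical R-R series (ms), for illustration only
example_rr = [800, 810, 790, 805, 795]
print(round(rmssd(example_rr), 2))
```

Larger beat-to-beat differences yield a larger RMSSD, which is why higher values are read as stronger parasympathetic modulation.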

In the context of music interventions, it seems that music can modulate HRV (Allen & Blascovich, 1994; Blood & Zatorre, 2001; Burns et al., 2002; Ellis & Thayer, 2010; Hodges, 2010; Khalfa et al., 2003; Knight & Rickard, 2001; Koelsch & Jancke, 2015; Nater et al., 2006; Nyklicek et al., 1997; Sakamoto et al., 2013; Sandstrom & Russo, 2010; Standley, 1992). Generally, music interventions such as relaxing music that aim to reduce arousal would ideally reduce SNS and enhance PNS activation, respectively (Koelsch & Jancke, 2015).

In summary, the objective of this study was to assess the effects on HRV of exposure to different genres of relaxing focus music in a sample of healthy subjects both while listening to music and immediately after having listened to music. Specifically based on previous literature, relaxing focus music was employed that was expected to reduce SNS and enhance PNS activation.

The Present Study

This study compared the efficacy of different types of music in modulating cognitive processing and physiological activity among healthy adults. In a between-group design, the study recruited four groups of participants: three separate groups were exposed to jazz music, piano music, and lo-fi music, respectively, and the fourth group was a no-music control group.

The study employed a 3-day experimental procedure in which participants were exposed to a cognitive mind wandering task. Mind wandering has been defined as drifting away from an activity toward unrelated inner thoughts and feelings (Ottaviani et al., 2013). Specifically, to capture mind wandering, the study used the Sustained Attention to Response Task (SART) (Robertson et al., 1997) to assess baseline levels of sustained attention. Mind wandering has in previous literature been captured by the SART (Jha et al., 2017; Morrison et al., 2014; Mrazek et al., 2012, 2013). The SART was chosen in the present study because this task has been used in previous research that investigated cognitive effects of music exposure (Axelsen et al., 2019; Kirk et al., 2019). In addition, HRV was recorded during an extensive baseline period (4-h continuous HRV recording). To assess the impact of the different genres of music on the HRV response and sustained attention, the study administered a music intervention period in which music compositions belonging to the three experimental genres were presented with varying lengths (15-min vs 45-min condition).

Research has shown that familiar music is more likely than unfamiliar music to lead to emotional arousal (van den Bosch et al., 2013). Predictions and expectations of auditory events in familiar music compositions are higher resulting in elevated dopamine release in the reward system of the brain (Salimpoor et al., 2011), as well as activation of emotion-related processing (Pereira et al., 2011). This phenomenon suggests that familiarity might play an important role in the emotional engagement of listeners with the music. Hence, in the present study, the effect of unfamiliarity was experimentally controlled using music belonging to the three genres in the initial music intervention period, and during a follow-up visit (3 weeks after the initial data collection period) in which participants in the interim had become familiar with the same piece of music. This enabled us to address the question of the effects of music familiarity on cognitive processing and the underlying HRV response.

Previous research has shown that music exerts an acute effect on increased HRV activity, i.e., during music exposure, while there were no chronic effects during music exposure after a 10-day intervention involving daily music listening (Kirk & Axelsen, 2020). It remains unclear if such acute effects also extend to immediately after music exposure. Hence, the present study addressed the question whether listening to music extends to increased cognitive performance during music listening compared to a separate condition immediately after listening to music. To address this question, the study employed two cognitive processing probes: one during the music intervention period, specifically the Attention Network Task (ANT) (Fan et al., 2002), and the other immediately after the music intervention period, where the study employed the SART. According to Posner and Petersen (1990), the attentional system can be divided into three networks, namely orienting, alerting, and executive control. The present study included the orienting component in that this component has been shown to correlate with performance on the SART (Hu et al., 2012). Thus, a relationship was expected between sustained attention captured by the SART and the orienting component of the ANT. In addition, the advantage of including the ANT during music exposure and the SART after music exposure was that it reduced potential learning effects from employing the same task in both conditions (i.e., during and after music exposure) in close temporal proximity.

The study addressed the following questions:

RQ1) Does focus music exert an effect on sustained attention and the underlying HRV response?

RQ2) Does the length of the music composition modulate the effect?

  • H1: It was hypothesized that the three experimental music genres would increase cognitive performance after the music intervention period compared to the no-music control group.

  • H2: It was also hypothesized that the three experimental music genres would increase parasympathetic activity compared to the no-music control group.

  • H3: It was predicted that the 45-min music intervention period would elevate the parasympathetic HRV response significantly more compared to the 15-min music intervention period.

  • H4: It was hypothesized that music familiarity would increase cognitive performance after the music intervention period compared to the unfamiliar music intervention period and increase parasympathetic activity in the familiar condition relative to the unfamiliar condition.

  • H5: It was hypothesized that sustained attention would be enhanced both during the music intervention period and after the music intervention period in the three music groups compared to the no-music control group.

Methods

Participants

One hundred and twenty healthy participants were recruited in a fully randomized procedure involving four study groups. Twelve participants met one of the study’s exclusion criteria (see below) and were excluded from the final analysis (3 participants in the jazz music group; 4 participants in the lo-fi music group; 3 participants in the piano music group; 2 participants in the no-music group). Thus, the total number of participants from which data was included in the analysis was 108 participants. In the follow-up (i.e., phase 5; see Fig. 1), all of the 108 participants included in the initial analysis showed up and were included in the subsequent analysis. The first group (n = 27; average age 24.5 (SD = 8.3); 14 females) constituted the jazz music group. The second group (n = 27; average age 24.8 (SD = 8.4); 14 females) constituted the piano music group. The third group (n = 26; average age 24.9 (SD = 7.6); 15 females) constituted the lo-fi music group. Finally, a no-music control group was recruited (n = 28; average age 24.1 (SD = 7.4); 15 females). Exclusion criteria for these four groups included current psychiatric illness or psychiatric medication intake. In addition, an exclusion criterion involved a score greater than “26” on the Perceived Stress Scale (PSS) reflecting high stress (Cohen et al., 1983) and a score greater than “5” on the Pittsburgh Sleep Quality Index (PSQI) reflecting poor sleep quality (Buysse et al., 1989).
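The exclusion rule described above can be sketched as a simple screening check. This is a minimal illustration under the stated cutoffs (PSS greater than 26, PSQI greater than 5); the function name `is_excluded` and its parameters are hypothetical and do not describe the study's actual screening tooling.

```python
PSS_CUTOFF = 26   # scores above this were treated as high stress
PSQI_CUTOFF = 5   # scores above this were treated as poor sleep quality


def is_excluded(pss_score, psqi_score,
                psychiatric_illness=False, psychiatric_medication=False):
    """Return True if a participant meets any of the exclusion criteria."""
    return (
        psychiatric_illness
        or psychiatric_medication
        or pss_score > PSS_CUTOFF
        or psqi_score > PSQI_CUTOFF
    )


# Hypothetical participants, for illustration only
print(is_excluded(pss_score=14, psqi_score=4))
print(is_excluded(pss_score=29, psqi_score=4))
```

Because both cutoffs are strict inequalities, a participant scoring exactly 26 on the PSS and exactly 5 on the PSQI would be retained.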

Fig. 1 Experimental procedure. Outline of the 5 experimental phases and the 3-day procedure and the follow-up visit included in the study

Recruitment for the current study involved online-based advertisement campaigns. Recruitment information included that the study involved music exposure groups or a non-intervention group in a 3-day procedure and a follow-up session. In addition, recruitment information included that participants would be assigned to one of the four groups in a random manner, which eliminated any self-selection bias across the groups.

All four study groups were recruited from the same initial demographic range. All participants were given written instructions outlining the experimental procedures/phases of the study but were not informed about the study aims. Participants were told that they would be randomly assigned to one of the four study groups, which eliminated any self-selection bias across the groups. Participants did not receive monetary compensation for their participation in the study; however, the participants received two cinema tickets as compensation for their participation. All procedures were conducted in accordance with the local ethical committee (Videnskabsetisk Komité for Region Syddanmark).

Experimental Procedures

In a randomized between-subjects design, research participants across four groups completed 5 phases (Fig. 1) of testing on 3 separate days.

Participants were informed in the lab about the procedures related to the 3-day experimental procedure. That is, participants were given instructions about the HRV measurement, cognitive testing, music group allocation procedure, and the 5 experimental phases (which are described in detail below) that each participant took part in.

Subsequently, participants were allocated to one of the four experimental groups in a random manner. Sequence generation and randomization were performed by the research team, who were not formally blinded to group allocation.

Participants who were randomly assigned to the no-music control group did not listen to any music during the experimental procedures. After completing the study, the no-music control group was given access to the music compositions through an app-based platform. That is, the no-music control group mimicked a waitlist control group.

Phase 1 (Baseline Cognitive Testing and Questionnaires)

During phase 1 on day 1 participants were asked to complete the PSS (Cohen et al., 1983). They also completed the PSQI (Buysse et al., 1989) to assess sleep quality. Both the PSS and PSQI were employed to screen for self-perceived stress and sleep quality to control for these factors in that autonomic activity has been shown to be affected by elevated long-term stress levels and poor sleep quality (Het & Wolf, 2007; Leproult et al., 1997; Wright et al., 2015). Finally, participants completed the SART (Robertson et al., 1997) on a laptop computer (15-inch MacBook Pro), which constituted the cognitive testing period.

Phase 2 (Active Baseline Period)

Having completed the cognitive test (SART) and questionnaires, participants were subsequently taken to a separate room and equipped with an HRV monitor. Participants were informed about the HRV measurement and were told to refrain from alcohol and nicotine for the duration of the study to avoid known influences of these factors on autonomic activity (Malik et al., 1996; Ralevski et al., 2019; Sjoberg & Saint, 2011). Participants were instructed not to engage in intense physical activity for the experimental period but were otherwise asked to maintain their daily and nightly routines. The experimenters applied electrodes and a Firstbeat Bodyguard II HRV monitor on the chest of each participant. Participants were asked to wear the HRV monitor continuously for 4 h outside the lab and return the device the next day. The time series extracted from the HRV monitor was time-locked to the period from 1 to 5 PM to calculate a baseline average for each participant during a day in which they maintained their daily routines (active baseline period). The time interval (1 PM–5 PM) was chosen for all participants because previous studies have shown that cortisol level and heart rate responses to stress change at different times during the day (Kudielka et al., 2004); thus, the study kept the time window of 4 h constant across all participants.

Phase 3 (Music Intervention Period)

Upon arrival to the lab on day 2, participants were asked to rest for 15 min in a private waiting room. During the 15-min rest, they were asked to sit on a chair in an upright position. Following the 15-min resting period, participants were taken to a separate room and placed in front of a laptop computer (15-inch MacBook Pro) where they listened to one music composition using headphones (Bose QuietComfort 35) belonging to one of the 4 groups (jazz, lo-fi, piano or no-music control). There were two music conditions in which participants listened to one music composition on day 2 and a longer/shorter version of the same music composition on day 3. These two music conditions consisted of the same music composition but played with a duration of either 15 or 45 min. HRV measurements were recorded while participants listened to music (music intervention period). All 15-min versions of the three compositions used in the study can be accessed here:

https://drive.google.com/drive/folders/15HkqrUIAkzy-ZyKfCVeg4Yc7dNunO93K?usp=sharing

Attentional capacity was assessed only once, during the final 5 min of the first music intervention period on day 2, independent of condition (15- or 45-min music condition); that is, the ANT was administered during music listening. The study used a modified version of the ANT (Fan et al., 2002). Prior to initiation of the music intervention period on day 2, participants were provided with instructions on the ANT and were told that the task would commence in the final 5 min of the music intervention period. HRV was collected during this period but is not reported.

Phase 4 (Cognitive Testing Period Post Intervention)

Following the music intervention period, the HRV monitor was removed, and participants were asked to complete a version of the SART (cognitive testing period) on the laptop computer, similar to the SART completed in phase 1.

On day 3, participants returned to the lab to complete phases 3 and 4. On this visit, participants were asked to listen to the same music track as on day 2, with the only difference being the ordering of the two music conditions (i.e., the 15- and the 45-min music compositions). The ordering of the two music conditions was counterbalanced within groups.

Data collection during phases 3 and 4 was completed between 1 and 5 PM such that it overlapped with the active baseline period (i.e., HRV recording) in phase 2 (i.e., on day 1), and thus, there were no differences within or between participants for the HRV data collection across the three experimental days.

Phase 5 (Music Intervention Period at Follow-up)

The follow-up took place approximately 3 weeks after the initial data collection (i.e., phases 1–4). Data collection during the follow-up visit was completed between 1 and 5 PM to mimic the time period for data collection during phases 1–4.

Prior to the follow-up, participants were given access through an app-based platform to the 15-min version of the music composition belonging to their group (i.e., jazz music, piano music, lo-fi music, or no-music). To induce music familiarity, participants in the 3 music groups were instructed to listen to the music composition a minimum of 10 times over 3 weeks prior to the follow-up visit. Compliance was verified by a function in the app that tracked timestamps to count the number of times each participant listened to the music composition. The study included participants who completed > 80% of the required 10 repetitions of the music composition, a criterion that all participants met (n = 108).

Measures

Perceived Stress Scale (PSS)

The PSS is a 10-item scale designed to measure self-perceived stress. The perceived stress scale measures perceived stress over a period of time, in this case the last month. There is no objective point of reference in the scale, which means that the self-report is purely subjective (Cohen et al., 1983). The PSS has shown good reliability with Cronbach’s alpha between 0.6 and 0.85 (Lee, 2012) and can be used in samples both with and without stress-related disturbances (Lee, 2012).

Pittsburgh Sleep Quality Index (PSQI)

Participants also completed the PSQI to assess sleep quality. The PSQI is a self-rated questionnaire which assesses sleep quality and disturbances over a 1-month time interval (Buysse et al., 1989). The questionnaire consists of nineteen items that generate seven component scores and the sum of these scores yields one global score (Buysse et al., 1989). The PSQI has good psychometric consistency with Cronbach alpha between 0.7 and 0.8 (Carpenter & Andrykowski, 1998) and can be used in samples both with sleep problems and individuals without sleep problems (Carpenter & Andrykowski, 1998).

Sustained Attention to Response Task (SART)

The SART is a cognitive Go/NoGo test which aims to measure sustained attention (Cheyne et al., 2006; Robertson et al., 1997). The SART requires the participant to respond by pressing the space bar, as quickly as possible, to all non-target stimuli, i.e., Go trials (Robertson et al., 1997). These are all numbers from 0 to 9 and the participant is asked to refrain from responding to rare targets (digit 3). This means that a NoGo trial is defined when the target stimulus (3) appears, and the pressing response must be inhibited. The numbers 0 to 9 were presented with a duration of 250 ms, followed by a backward mask which lasted 900 ms. Lastly, there is an inter-stimulus interval of 500 ms. Two hundred forty stimuli were presented in the task with 216 of those being non-target stimuli and 24 being target stimuli. Sustained attention is captured in the SART by using the percentage of successes when the NoGo stimulus is present (the percentage of successfully withheld pressing of the spacebar when the number 3 was presented), hereby referred to as %NoGo success, which has been reported in previous related research (Axelsen et al., 2019; Kirk et al., 2019).
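The %NoGo success score described above can be illustrated with a short Python sketch. It assumes each trial is recorded as a `(digit, responded)` pair; this record format, the function name `nogo_success_percent`, and the example session are hypothetical and are not taken from the study's task software.

```python
TARGET_DIGIT = 3  # the rare NoGo target described in the task


def nogo_success_percent(trials):
    """Percentage of NoGo (target) trials on which the response was withheld.

    trials: iterable of (digit, responded) pairs, where responded is True
    if the space bar was pressed on that trial.
    """
    nogo_responses = [responded for digit, responded in trials
                      if digit == TARGET_DIGIT]
    if not nogo_responses:
        raise ValueError("no NoGo trials found")
    withheld = sum(1 for responded in nogo_responses if not responded)
    return 100.0 * withheld / len(nogo_responses)


# Hypothetical session matching the task composition: 216 Go + 24 NoGo trials
go_digits = [d for d in range(10) if d != TARGET_DIGIT]
example_trials = [(go_digits[i % len(go_digits)], True) for i in range(216)]
# On the 24 target trials, the response is withheld 18 times (6 errors)
example_trials += [(TARGET_DIGIT, i >= 18) for i in range(24)]

print(len(example_trials), nogo_success_percent(example_trials))
```

Only target trials enter the score, so responses on the 216 Go trials do not affect %NoGo success.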

Attentional Network Task (ANT)

The ANT is an attentional monitoring task that aims to measure processing efficiency of various aspects of attention. Specifically, the canonical version of the ANT measures different attentional effects (orienting, alerting, and executive) (Fan et al., 2002). The present study used a modified version of the ANT that isolated the orienting effect. The orienting effect of the ANT was used in that previous research has shown a correlation between SART performance and the ANT orienting component (Hu et al., 2012).

The ANT required participants to determine whether a central arrow points left or right. Participants viewed the screen from a distance of 65 cm, and responses were collected via two input keys on the keyboard (15-inch MacBook Pro). Stimuli consisted of a row of five visually presented horizontal black lines, with arrowheads pointing leftward or rightward, against a gray background. The target was a leftward or rightward arrowhead at the center. This target was flanked on either side by lines (neutral condition). The study modified the ANT and used only neutral conditions and did not include congruent and incongruent conditions. The participants’ task was to identify the direction of the centrally presented arrow by pressing one key for the left direction and a different key for the right direction. Each trial consisted of four events. First, there was a centrally placed fixation cross which was presented for a random variable duration (300–1400 ms). Then, a warning cue was presented for 100 ms. Subsequently, the target and flankers appeared simultaneously. The target and flankers were presented until the participants responded, but for no longer than 1700 ms. After participants made a response, the target and flankers disappeared immediately and there was a post-target interval (400 ms). After this interval, the next trial began. Each trial lasted approximately 3200 ms. Only the orienting component of the task was employed, which was introduced during the warning cue event whereby the target was presented in one of three locations, namely center cue, or cues presented either above or below the central fixation cross. The orienting effect was calculated by subtracting the mean RT of the spatial cue conditions from the mean RT of the center cue conditions. Participants were presented with a maximum of 90 trials. The timing of the ANT was such that it was completed approximately at the same time as the music composition finished.
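The orienting effect calculation can be sketched as follows. The sketch assumes that each trial is stored as a `(cue_type, rt_ms)` pair labelled either `"center"` or `"spatial"`, and that any trial filtering has already been done; the labels, the function name `orienting_effect`, and the example reaction times are hypothetical.

```python
from statistics import mean


def orienting_effect(trials):
    """Mean RT on center-cue trials minus mean RT on spatial-cue trials (ms).

    trials: iterable of (cue_type, rt_ms) pairs with cue_type equal to
    "center" or "spatial".
    """
    center_rts = [rt for cue, rt in trials if cue == "center"]
    spatial_rts = [rt for cue, rt in trials if cue == "spatial"]
    if not center_rts or not spatial_rts:
        raise ValueError("need at least one trial of each cue type")
    return mean(center_rts) - mean(spatial_rts)


# Hypothetical reaction times (ms), for illustration only
example_trials = [
    ("center", 520), ("center", 540), ("center", 500),
    ("spatial", 480), ("spatial", 470), ("spatial", 490),
]
print(orienting_effect(example_trials))
```

A positive value indicates faster responses after spatial cues than after center cues, i.e., a benefit from knowing where the target will appear.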

Physiological Acquisition

HR was recorded as beat-to-beat intervals using the Firstbeat Bodyguard II HRV monitor (Firstbeat Technologies Ltd., Jyväskylä, Finland) that has been previously applied in research and validated with standard physiological monitoring systems used in clinical and laboratory settings (Parak & Korhonen, 2013, 2015; Ottaviani et al., 2015). The Bodyguard II is a wearable lightweight monitor attached on the chest using two ECG electrodes (Ambu Ltd., Ballerup, Denmark) for measuring heart rate variability (HRV).

Physiological Signal Processing

The HRV measurements conducted in this study were performed according to the guidelines of the Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology (Task Force of the European Society of Cardiology & the North American Society of Pacing Electrophysiology, 1996). In following these standardized procedures, the study reports a time domain measure, specifically RMSSD.

All raw physiological data was processed for time and frequency domain parameters using the Kubios analysis software (version 3.4). The recorded data was imported to Kubios to calculate R-R intervals and associated variability (Tarvainen et al., 2014). Examination of the electrocardiogram (ECG) data ensured that the automatic R-wave detection algorithm had performed satisfactorily. Artifact removal for the HRV was performed manually using the artifact correction tool to detect R-R intervals provided by the Kubios software. When correction was applied, detected artifact beats were replaced using cubic spline interpolation. Spectrum analysis was computed using the fast Fourier transformation procedure provided by the Kubios software. Because of their skewed distribution, the HRV variables were log transformed prior to statistical analysis.
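The log transformation step can be illustrated with a minimal Python sketch. The text does not state which logarithm base was used, so the natural logarithm here is an assumption, as are the function name `log_transform` and the example values.

```python
import math


def log_transform(values):
    """Apply a natural-log transform to strictly positive HRV values.

    The natural logarithm is an assumption; the base is not stated in the text.
    """
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(v) for v in values]


# Hypothetical right-skewed RMSSD values (ms), for illustration only
example_rmssd = [20.0, 35.0, 50.0, 120.0]
print([round(v, 3) for v in log_transform(example_rmssd)])
```

The transform compresses large values more than small ones, which pulls in the long right tail typical of HRV measures.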

The HRV data was recorded continuously during the active baseline period in phase 2 and the time series from each participant was extracted covering a time period of 4 h (between 1 and 5 PM). The HRV data collected during the music intervention period were broken up into segments. Specifically, the 3 music intervention periods belonging respectively to day 2, day 3, and the follow-up were segmented and calculated as means on a participant-by-participant basis. The final 5 min of the music intervention period during which the ANT took place on day 2 was not taken forward for analysis.

Music Intervention

All music included in the study was instrumental and included a jazz music composition, a piano music composition, and a lo-fi music composition. The specific music compositions that were used during the initial music intervention periods and the follow-up periods were identical except that one track was produced to have a duration of 15 min while the second version lasted 45 min. The musical compositions were provided by Headspace (https://www.headspace.com/). Participants remained seated in an upright position while listening to the music.

Statistical Analysis

All data are presented as mean ± SD unless otherwise stated. Assumptions of normal distribution and sphericity of data were checked accordingly. Greenhouse–Geisser correction to the degrees of freedom was applied when violations of sphericity were present. Separate one-way ANOVAs were used to assess differences in PSQI and PSS separately during phase 1 among the 4 groups. Specifically, the one-way ANOVAs included time (phase 1) and group (jazz, piano, lo-fi, no-music) as factors. Mixed 2 × 4 ANOVAs were used to assess if there were differences on the %NoGo success rate during SART at phase 1 and phase 4 separately for the 15-min and the 45-min music conditions. Specifically, the 2 × 4 mixed ANOVA included time (phase 1 and phase 4) and group (jazz, piano, lo-fi, no-music) as factors. Similarly, mixed 2 × 4 ANOVAs were used to assess differences on the groups’ mean RMSSD during phase 2 and phase 3 separately for the 15-min and the 45-min conditions. Specifically, the 2 × 4 mixed ANOVA included time (phase 2, phase 3) and group (jazz, piano, lo-fi, no-music) as factors. Finally, a mixed 2 × 4 ANOVA was used to assess differences on the groups’ mean RT derived from the ANT during phase 3 and phase 5. Specifically, the 2 × 4 mixed ANOVA included musical exposure (unfamiliarity/phase 3, familiarity/phase 5) and group (jazz, piano, lo-fi, no-music) as factors. Significant interaction effects from the one-way and mixed ANOVAs were followed up with simple main effects analysis and t tests. The effect sizes for the mixed measures ANOVAs were calculated as partial eta squared (η2p), using small = 0.02, medium = 0.13, and large = 0.26 interpretation for effect size (Bakeman, 2005). The effect sizes for the t tests were calculated as Cohen’s d using small = 0.2, moderate = 0.5, and large effect = 0.8 (Cohen, 1988).
A Pearson correlation analysis was conducted to investigate correlations between cognitive performance on the SART (%NoGo success) and the ANT (orienting component). The analysis included all participants (n = 108) independent of group and used the cognitive performance data from the two tasks during the 15-min music intervention in phases 3 and 4. Pearson correlations (r) were considered small = 0.1, medium = 0.24, and large = 0.37, as suggested by Cohen (1988). All data analysis was conducted using the Statistical Package for the Social Sciences (SPSS version 28).
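For readers who want to see how the effect size and correlation statistics named above are commonly computed, the following Python sketch shows standard textbook formulas. It is an illustration only: the analyses in the study were run in SPSS, the text does not state which variant of Cohen's d was used for the paired t tests (the sketch assumes the mean difference divided by the SD of the differences), and all function names and example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_paired(first, second):
    """Paired-samples d: mean of (first - second) over the SD of the differences.

    Assumed variant; with this sign convention an increase from first to
    second gives a negative d.
    """
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / stdev(diffs)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical numbers, for illustration only
print(partial_eta_squared(10.0, 2, 80))
print(round(cohens_d_paired([1, 2, 3, 4], [2, 4, 3, 5]), 3))
print(round(pearson_r([1, 2, 3, 4], [2, 1, 4, 3]), 3))
```

The resulting values can then be compared against the interpretation thresholds listed above.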

Results

Questionnaire Results

Initially, the participants’ self-reported sleep quality data collected in phase 1 was analyzed. The PSQI average score was computed for each of the four groups (Table 1) to inspect if there were group differences. There were no significant differences in PSQI scores between the groups in a one-way ANOVA (F(3,104) = 0.137, p = 0.94).

Table 1 Questionnaire data collected at baseline (i.e., phase 1) across the groups

A similar analysis was performed for the PSS score across the four groups (Table 1); again, there were no significant differences in PSS scores between the groups in a one-way ANOVA (F(3,104) = 0.579, p = 0.63).

SART Results

Performance differences on the SART during phase 1 relative to SART performance immediately after administration of the music intervention period (i.e., phase 4) were analyzed next. This analysis aimed to assess if music across the four groups had effects on cognitive performance.

The average response for %NoGo success during phase 1 and phase 4 was computed (Fig. 2 and Table 2). A mixed ANOVA was used to inspect time (phase 1, phase 4) and group (jazz, piano, lo-fi, no-music) for the groups’ %NoGo success score with age and gender as covariates. Significant interaction effects were followed up post hoc by paired t tests. The analyses were performed separately for the 15-min and 45-min conditions.

Fig. 2

Box plot of group averages of SART %NoGo success rate from phase 1 (i.e., baseline) and phase 4 (i.e., cognitive test period) collected immediately after the music intervention period for the 15-min and 45-min conditions respectively

Table 2 Data from the SART collected at baseline (i.e., phase 1) and at the cognitive testing period (i.e., phase 4) across the groups

For the 15-min condition, there was a significant interaction of time and group (F(3,102) = 14.0, p < 0.001, η2p = 0.292). Simple main effects analysis showed that the three music groups had significantly higher performance after the 15-min phase than the no-music group (p < 0.001). There was no significant difference in performance at phase 1 between the four groups. Follow-up paired t tests revealed that in the jazz music group, there was a significant increase in performance from phase 1 to phase 4 (paired t = −13.28, df = 26, p < 0.001, d = −2.33). Similarly, in the piano music group (paired t = −13.43, df = 26, p < 0.001, d = −2.59) and in the lo-fi music group (paired t = −10.18, df = 25, p < 0.001, d = −2.0), there were significant changes in performance from phase 1 to phase 4. There was also a significant change in %NoGo success for the no-music group (paired t = −3.92, df = 27, p < 0.001, d = −0.74).

For the 45-min condition, there was a significant interaction of time and group (F(3,102) = 23.49, p < 0.001, η2p = 0.409). Simple main effects analysis showed that the three music groups had significantly higher performance after 45 min than the no-music group (p < 0.001). There was no significant difference in performance at phase 1 between the four groups. Follow-up paired t tests revealed that in the jazz music group, there was a significant increase in performance from phase 1 to phase 4 (paired t = −11.2, df = 26, p < 0.001, d = −2.16). Similarly, in the piano music group (paired t = −13.74, df = 26, p < 0.001, d = −2.65) and in the lo-fi music group (paired t = −10.81, df = 25, p < 0.001, d = −2.12), there were significant increases in cognitive performance from phase 1 to phase 4. There was also a significant change in %NoGo success for the no-music group (paired t = −2.43, df = 27, p = 0.02, d = −0.46).

A final set of paired t tests was performed to examine whether there were significant effects between the different music durations (15- vs 45-min conditions) within each of the three music categories. There were no significant differences between the music durations for the jazz music group (paired t = 2.02, df = 26, p = 0.054), the piano music group (paired t = −1.4, df = 26, p = 0.17), or the lo-fi music group (paired t = −1.36, df = 25, p = 0.19).
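The paired t tests reported in this section operate on within-participant difference scores. The sketch below shows the calculation on hypothetical scores (not data from the study); the function name is illustrative, and the difference is taken as first minus second measurement, which yields a negative t when scores increase, as in the values reported above.

```python
import math


def paired_t(first, second):
    """Return (t, df) for a paired t test on first-minus-second differences."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample standard deviation of the difference scores (n - 1 denominator).
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    return t_value, n - 1


# Hypothetical %NoGo success scores for four participants at two time points.
phase_1 = [50, 55, 60, 65]
phase_4 = [60, 70, 65, 80]
t_value, df = paired_t(phase_1, phase_4)
print(round(t_value, 2), df)
```

Because each participant serves as their own control, the test depends only on the mean and spread of the difference scores, not on the spread of the raw scores at either time point.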

HRV Results

Physiological data collected during phase 2 (i.e., the active baseline period) and phase 3 (i.e., the music intervention period) were analyzed to specifically address if music had an effect on the HRV response (Fig. 3 and Table 3). A mixed ANOVA was used to inspect time (phase 2, phase 3) and group (jazz, piano, lo-fi, no-music) for the groups’ RMSSD response controlling for age and gender. Significant interaction effects were followed up post hoc by paired t tests. The analyses were performed separately for the 15-min and 45-min conditions.

Fig. 3

Box plot of group averages of RMSSD (ms) during phase 2 (i.e., active baseline period) and phase 3 (i.e., music intervention period) for the 15-min and 45-min conditions, respectively

Table 3 HRV data collected at baseline (i.e., phase 2), at the music intervention period (i.e., phase 3), and at follow-up (i.e., phase 5) across the groups

For the 15-min condition, there was a significant interaction of time and group (F(3,102) = 9.4, p < 0.001, η2p = 0.217). Simple main effects analysis showed that the three music groups had a significantly higher RMSSD after the 15-min music condition than the no-music group (p < 0.001). There was no significant difference in HRV at phase 1 between the four groups. Follow-up paired t tests revealed that in the jazz music group, there was a significant increase in the HRV response from phase 2 to phase 3 (paired t = −10.96, df = 26, p < 0.001, d = −2.11). Similarly, in the piano music group (paired t = −7.07, df = 26, p < 0.001, d = −1.36) and in the lo-fi music group (paired t = −8.76, df = 25, p < 0.001, d = −1.72), there were significant increases in the HRV response from phase 2 to phase 3. There was also a significant increase in RMSSD for the no-music group (paired t = −4.14, df = 27, p < 0.001, d = −0.78).

For the 45-min condition, there was a significant interaction of time and group (F(3,102) = 11.04, p < 0.001, η2p = 0.245). Simple main effects analysis showed that the three music groups had a significantly higher RMSSD after the 45-min music condition than the no-music group (p < 0.001). There was no significant difference in RMSSD at phase 1 between the four groups. Follow-up paired t tests revealed that in the jazz music group, there was a significant increase in the HRV (RMSSD) response from phase 2 to phase 3 (paired t = −11.81, df = 26, p < 0.001, d = −2.27). Similarly, in the piano music group (paired t = −11.34, df = 26, p < 0.001, d = −2.18) and in the lo-fi music group (paired t = −10.54, df = 25, p < 0.001, d = −2.07), there were significant increases in the HRV response from phase 2 to phase 3. There was also a significant increase in RMSSD for the no-music group (paired t = −5.73, df = 27, p < 0.001, d = −1.08).

A final series of paired t tests examined whether there were significant HRV effects between the different music durations (15- vs 45-min conditions) within each of the 3 music categories. There were no significant HRV differences between the music durations in the jazz music group (paired t = −0.46, df = 26, p = 0.65). However, there was a significant HRV difference between the 15-min and the 45-min conditions in the piano music group (paired t = −5.49, df = 26, p < 0.001, d = −1.06) and the lo-fi music group (paired t = −2.18, df = 25, p = 0.04, d = −0.43).

Follow-up Attentional Network Task (ANT)

In the follow-up (i.e., phase 5), which took place approximately 3 weeks following the initial data collection, participants belonging to the four groups were called back to the lab to complete a 15-min music intervention period.

The follow-up (i.e., phase 5) was designed to answer the question of whether music familiarity (vs music unfamiliarity) modulated attentional deployment, using reaction times (RTs) on a modified version of the ANT as an index of performance (Fig. 4 and Table 4). Specifically, attention was monitored using the ANT during the final 5 min of the music intervention period for the unfamiliar music condition (i.e., phase 3) and during the familiar music condition (i.e., phase 5).

Fig. 4

Box plot displaying the average reaction times (RTs) in the attentional network task (ANT) while listening to familiar and unfamiliar music across the 3 music groups and no-music group

Table 4 Data from a modified version of the ANT collected during the 15-min music intervention period (i.e., phase 3) and during the music intervention period at follow-up (i.e., phase 5) across the groups

A mixed ANOVA was used to inspect musical exposure (unfamiliarity/phase 3, familiarity/phase 5) and group (jazz, piano, lo-fi, no-music) on the groups’ average RTs derived from ANT performance controlling for age and gender.

There was a significant interaction of musical exposure and group (F(3,102) = 20.85, p < 0.001, η2p = 0.38). Simple main effects analysis showed that the three music groups had significantly faster RTs after the 15-min phase than the no-music group (p < 0.001). There was no significant difference in performance at phase 1 between the four groups. Follow-up paired t tests revealed that in the jazz music group, there were significantly faster RTs in the familiar condition compared to the unfamiliar condition (paired t = 6.87, df = 26, p < 0.001, d = 1.32). Similarly, in the piano music group (paired t = 10.97, df = 26, p < 0.001, d = 2.11) and in the lo-fi music group (paired t = 8.39, df = 25, p < 0.001, d = 1.65), there were significantly faster RTs in the familiar compared to the unfamiliar condition. The no-music group did not reveal significant differences in RT across the two repetitions of the ANT (paired t = −0.96, df = 26, p = 0.35). Pairwise comparisons in the ANOVA also showed no significant difference for the no-music group on phase 1 and phase 3 (p = 0.22).

The present study employed both the ANT and the SART, and a previous study found a correlation between the ANT and the SART (Hu et al., 2012). Thus, a relationship between these two tasks was expected. The purported relationship was investigated using the Pearson product-moment correlation coefficient. A small, but significant, correlation between ANT and SART performance was found (r = −0.19, n = 108, p = 0.04) (figure not shown).

Follow-up HRV

HRV data in the familiarity vs unfamiliarity conditions during the music intervention period, i.e., phase 5 vs phase 3, were analyzed (Fig. 5 and Table 3) to assess whether music familiarity exerted an effect on the HRV response.

Fig. 5

Box plot of group averages of RMSSD (ms) during phase 3 (i.e., unfamiliar music intervention period) for the 15-min condition and phase 5 (i.e., familiar music intervention period). Note that phase 3 data is also displayed in Fig. 3

A one-way between-groups ANOVA was conducted to explore differences in RMSSD at phase 5 across the groups. There was a significant difference in the HRV response for the four groups (F(3,104) = 47.4, p < 0.001). Hochberg tests indicated that the no-music group had a significantly lower RMSSD than the rest of the groups.

Paired t tests were employed to look for differences within each group. There were statistically significant differences in HRV between familiar and unfamiliar music in each of the three music groups (jazz: paired t = −2.5, p = 0.02; piano: paired t = −5.7, p < 0.001; lo-fi: paired t = −2.2, p = 0.04).

Discussion

This study investigated the cognitive and physiological effects of three genres of focus music (piano, jazz, and lo-fi) in a laboratory setting. The study addressed the question of whether the three music genres would enhance cognitive performance and increase parasympathetic activity compared to the no-music control group. The study’s five hypotheses are discussed in the context of the findings below.

Effect of Music on Cognitive Performance

The results in the present study demonstrated support for the three musical genres’ impact on cognitive processes (H1). Specifically, in a sustained attention task (SART), it was found that attentional capacity was increased from baseline to the time point following the music intervention period across the three music groups, but not in the no-music control group (Fig. 2). There was no differential effect across the three music genres on cognitive performance. These results indicate that the music employed in the present study indeed had the capacity to enhance cognitive performance. Several studies have shown an association between music training and improvement in cognitive abilities (Schellenberg, 2011; James et al., 2020; Guo et al., 2017), but only a few studies have shown an effect of “on the spot” passive music listening on cognitive performance (Axelsen et al., 2019; Colzato et al., 2017; Kirk et al., 2019; Lane et al., 1998).

In the present study, the SART was employed to capture sustained attention. Previous research has shown this task to be sensitive to guided brief mindfulness exercises (Axelsen et al., 2019) as well as to music exposure (Axelsen et al., 2019; Kirk et al., 2019), which suggests that mindfulness and music both show immediate effects during “on the spot” interventions. Related research supports this finding by demonstrating that 8 min of mindful breathing reduces behavioral indicators of mind wandering during performance on the SART compared with both passive relaxation and reading (Mrazek et al., 2012). Together these studies highlight mindfulness and music as two separate types of interventions that nevertheless yield comparable cognitive outcomes in the acute phase. It was beyond the scope of the present study to address the impact that music might have in the longer term (i.e., chronic effects or entrainment effects). However, several studies have shown an association between music training and cognitive functioning (Chobert et al., 2011; Helmbold et al., 2005; Schellenberg, 2011).

An additional hypothesis was tested in relation to the effect of music in the cognitive domain. Specifically, it was hypothesized that sustained attention would be enhanced both during the music intervention period and after the music intervention period in the 3 music groups compared to the no-music control group (H5). To explore this hypothesis, two independent attentional tasks were employed: one that participants completed during music exposure (i.e., a modified version of the Attentional Network Task, ANT) (Fan et al., 2002), and one that was administered after exposure to music (i.e., the sustained attention to response task, SART) (Robertson et al., 1997). After the music intervention period, there was a pronounced effect in the direction of increased cognitive performance on the SART (Fig. 2). During music listening, however, cognitive performance on the ANT was significantly improved in the three music groups compared to the no-music group only in the familiar condition (Fig. 4). These findings may inform the use of the type of music employed in the present study, which seems to promote increased attention both during a demanding cognitive task and afterward, but seems more effective immediately after music listening unless the music is familiar.

The study employed a modified version of the ANT because mind wandering as measured on the SART has been shown in a previous study to correlate with the orienting component of the ANT (Hu et al., 2011). It was found that performance on the modified version of the ANT correlated with performance on the SART, suggesting that cognitive performance on the two attentional tasks was consistent both when measured after and during music exposure.

Effect of Music on Physiological Response

The study also showed a physiological effect in the direction of increased parasympathetic activity (Fig. 3)—indexed as an increased HRV RMSSD response—in response to listening to music (H2). This finding is consistent with convergent lines of evidence, suggesting that music can modulate HR and HRV (Allen & Blascovich, 1994; Blood & Zatorre, 2001; Burns et al., 2002; Ellis & Thayer, 2010; Hodges, 2010; Khalfa et al., 2003; Kirk & Axelsen, 2020; Knight & Rickard, 2001; Koelsch & Jancke, 2015; Nater et al., 2006; Nyklicek et al., 1997; Sakamoto et al., 2013; Sandstrom & Russo, 2010; Standley, 1992).

Previous research has shown that relaxation music, relative to a no-music control condition, produced a decreased response in the hypothalamic–pituitary–adrenal (HPA) axis, which indicates that relaxing music is more effective than no music in decreasing cortisol levels (Khalfa et al., 2003). Similar research on the physiological effects of music has primarily focused on music styles that induce positive emotions and physiological relaxation (i.e., elevated parasympathetic activity). Specifically, self-selected relaxation music attenuated HR (Allen & Blascovich, 1994). Changes in HR have been reported consistently in other studies, whereby some studies found a decrease in HR during relaxing music (Burns et al., 2002; Knight & Rickard, 2001) while exciting music elevated HR (Blood & Zatorre, 2001).

The process by which music in the current study impacts the PNS is thought to be a decrease in HR through vagal nerve release of acetylcholine. As opposed to stressful states, relaxing states are typically associated with increased HRV, which is caused by PNS control through vagus nerve activity (Grossman & Taylor, 2007). In the present study, variation in HRV, specifically vagally mediated HRV, was indexed through RMSSD, which is considered to represent the beat-to-beat variance in HR and is the primary time domain measure used to compute the vagally mediated changes reflected in HRV (Shaffer et al., 2014).
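RMSSD is the root mean square of successive differences between adjacent inter-beat (RR) intervals. The following minimal sketch shows the standard calculation. The function name and the RR values are hypothetical and are not data from the study; artifact correction and segment selection, which HRV software typically performs before this step, are omitted.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals in milliseconds.
rr = [800, 810, 790, 805]
print(round(rmssd(rr), 2))  # 15.55
```

Larger beat-to-beat changes produce a larger RMSSD, which is why an increase in this measure is read as increased vagally mediated HRV.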

Previous studies have shown that different types of music show a differential HRV response. For example, Sutoo and Akiyama (2004) showed that heavy metal music resulted in increased sympathetic activity indicating a stress response, whereas baroque music increased parasympathetic activity. Similar to the present study, Iwanaga et al. (2005) demonstrated that the HRV response increased in response to relaxation music. These results are in line with the findings in the present study and indicate that PNS activity is related to the music’s relaxing effect.

Another line of evidence on the mechanisms involved in physiological responses induced by music suggests that the feelings experienced when listening to music are related to the mesolimbic dopamine reward system (Salimpoor et al., 2011). Specifically, listening to music has been shown to be associated with dopamine release in the ventral striatum. The ventral striatum is involved in the euphoric element of primary rewards such as food or psychostimulants such as cocaine. The striatum is further connected with limbic regions that mediate emotional processes (Delgado, 2007).

Effect of Music Duration

The present study also hypothesized (H3) that the 45-min music intervention period would elevate the parasympathetic HRV response significantly more than a 15-min music intervention period (Fig. 3). It was found that piano music and lo-fi music elevated the HRV response significantly more in the 45-min condition compared to the 15-min condition. However, the jazz music group did not show a significant difference between the 45-min condition and the 15-min condition. This differential effect of music genres might be explained by a preference for the longer versions of the piano and lo-fi music, which might indicate an elevated relaxation (parasympathetic) response to the extended version of the music (i.e., 45 min). Because the study did not collect subjective preference ratings of the music compositions, this interpretation cannot be confirmed. In contrast to the HRV findings, there was no differential effect on cognitive performance between the 15-min and the 45-min conditions, suggesting that listening to music for 15 min was sufficient to elevate cognitive performance. This finding is supported by previous work in the mindfulness domain showing an acute increase in cognitive performance (specifically, sustained attention) from an 8-min guided mindfulness session (Mrazek et al., 2012), while a separate study demonstrated a similar effect based on a 12-min guided mindfulness session (Axelsen et al., 2019). Previous research in the domain of music listening using binaural beats found that 15 min of listening to binaural beats increased sustained attention (Kirk et al., 2019). A similar study also reported improved performance on a psychomotor vigilance task (Lane et al., 1998).

Effect of Familiarity

Finally, the study tested the hypothesis that music familiarity would increase cognitive performance after the music intervention period compared to the unfamiliar music intervention period, and that the familiar condition would also yield increased parasympathetic activity relative to the unfamiliar condition (H4). It was expected that familiarity was an important contributing factor for engagement while listening to the music. Thus, the present study investigated whether becoming more familiar with the music belonging to each group in the 3-week period in which participants were instructed to listen to the music compositions (i.e., jazz, piano or lo-fi) had a role in determining if the listener increased cognitive performance from the repeated exposure. Participants were asked to listen to the music ten times during the 3-week period between the initial data collection and the follow-up. It was expected that becoming more familiar with a particular piece of music would decrease the level of distraction while listening to the music during a demanding cognitive task. This hypothesis was grounded in previous behavioral research showing that there exists a positive effect of prior exposure to music on subjective liking, which has been called the mere exposure effect (Peretz et al., 1998).

The results from the study partly supported the hypothesis; the music groups exhibited significantly faster reaction times (RTs) in the familiar condition than in the unfamiliar condition during the music intervention period in which they performed the ANT (Fig. 4). There were no differences across the three music groups in that all three groups showed faster RTs compared to the no-music condition. These results may suggest that the attentional demand was higher for unfamiliar music, presumably because of the uncertainty associated with unfamiliar stimuli. However, this interpretation does not align with the finding that there were no attentional performance differences between the unfamiliar condition and the no-music condition. One explanation may be that the no-music condition increased mind wandering in participants and thus slowed RTs during the ANT. This interpretation finds support in a study that used a passive relaxation condition, which may be seen as comparable to the no-music condition employed in the present study. That study found performance lapses during a mind wandering task (Mrazek et al., 2012). Indeed, the results from the SART during the no-music condition suggest a similar pattern of increased lapses in attention.

Regarding the effects of familiarity on the physiological response, the current study found significant differences across the three music groups between familiarity vs unfamiliarity (Fig. 5) in support of H4. Across all three music groups, the HRV response was significantly elevated during the familiar condition. The HRV responses across the three music groups were significantly lower in the unfamiliar condition. The increase in the familiar condition seems to suggest that habituation to the musical stimulus caused the elevated HRV response in the familiar condition across the three genres of music.

Limitations

The study involved three active music groups and a no-music control group. The study did not employ an active control group, which might have involved exposure to vocal rock music and which would have served as a control for the three active music groups in terms of demand characteristics. Cross-sectional studies such as the present study cannot demonstrate causality, in that the no-music group (waitlist group) was confounded by unmatched exposure effects. Future studies should aim to employ longitudinal designs involving music exposure entailing similar demand characteristics.

The study addressed the question of the effects of music familiarity on cognitive processing and the underlying HRV response in that previous research has shown that familiar music may result in increased emotional arousal (van den Bosch et al., 2013). In addition, there is research showing that becoming familiar with a particular piece of music increases the individual’s subjective preference for it (Peretz et al., 1998; Schellenberg et al., 2008). The present study did not incorporate subjective preference ratings for the music compositions which may have modulated the emotional engagement with the music in terms of both cognitive performance and the underlying physiological response. Future studies should consider including such subjective preference ratings.

Conclusion

The study found a pronounced effect of three types of focus music on both cognitive performance and the underlying physiological response. In addition, the study found that attentional performance and physiological activity increased with increased exposure to the musical stimuli (i.e., a familiarity effect of the music compared to unfamiliar music). Furthermore, the three genres of jazz, piano, and lo-fi music enhanced cognitive performance both after music exposure and during music exposure, albeit with a more pronounced effect in the former condition. In summary, these three musical genres may be recommended as a non-invasive intervention to promote immediate, i.e., “on the spot” positive physiological effects as well as enhancement of cognitive performance.