Abstract
Recently, exploring brain activity based on functional networks during naturalistic stimuli, especially music and video, has become an attractive challenge because of the low signal-to-noise ratio in the collected brain data. Although most efforts to explore the listening brain have relied on functional magnetic resonance imaging (fMRI) or sensor-level electro- or magnetoencephalography (EEG/MEG), little is known about how neural rhythms are involved in brain network activity under naturalistic stimuli. This study examined cortical oscillations through analysis of ongoing EEG and musical features during free listening to music. We used a data-driven method that combined music information retrieval with spatial Fourier Independent Component Analysis (spatial Fourier–ICA) to probe the interplay between the spatial profiles and the spectral patterns of the brain networks emerging from music listening. Correlation analysis was performed between the time courses of brain networks extracted from EEG data and musical feature time series extracted from the music stimulus to derive the musical-feature-related oscillatory patterns in the listening brain. We found that the brain networks engaged in musical feature processing were frequency-dependent. Musical feature time series, especially fluctuation centroid and the key feature, were associated with increased beta activation in the bilateral superior temporal gyrus. Increased alpha oscillation in the bilateral occipital cortex emerged during music listening, consistent with the hypothesis of alpha-mediated functional suppression in task-irrelevant regions. We also observed increased delta–beta oscillatory activity in the prefrontal cortex associated with musical feature processing. Beyond these findings, the proposed method seems valuable for characterizing the large-scale frequency-dependent brain activity engaged in musical feature processing.
Introduction
Understanding how our brain perceives complex and continuous inputs from the real world has been an attractive problem in cognitive neuroscience over the past few decades. Brain imaging technology provides an opportunity to address this issue. However, revealing brain states during real-world experiences is generally more difficult than during resting state or simplified, controlled, and rapidly repeated abstract stimuli (Hasson et al. 2010; Malcolm et al. 2016; Spiers and Maguire 2007). The question of how to disentangle stimulus-induced brain activity from spontaneous activity remains open due to the complexity of natural situations. In the present study, we formulate an approach combining several analysis techniques, including spatial ICA, source localization, acoustic feature extraction, and temporal correlation, to examine the elicited oscillatory brain networks using ongoing electroencephalography (EEG) recorded during music listening.
Recently, brain states under naturalistic stimuli, including music and movies, have been investigated through functional magnetic resonance imaging (fMRI) (Alluri et al. 2012a, b; Alluri et al. 2013; Burunat et al. 2014, 2016a, b; Liu et al. 2017; Toiviainen et al. 2014), MEG (Koskinen et al. 2013; Lankinen et al. 2014) and EEG (Cong et al. 2013a, b; Daly et al. 2014, 2015; Schaefer et al. 2013; Sturm et al. 2015; Zhu et al. 2019, 2020). Alluri et al. explored the neural correlates of musical feature processing as it occurs in a realistic or naturalistic environment, where eleven participants attentively listened to a whole piece of music (Alluri et al. 2012a, b; Burunat et al. 2016b, a). Using fMRI, they identified brain regions involved in the processing of musical features in a naturalistic paradigm and found large-scale brain responses in cognitive, motor and limbic brain networks during continuous processing of low-level (timbral) and high-level (tonal and rhythmic) acoustic features. Burunat et al. studied the replicability of Alluri's findings using a similar methodological approach with a similar group of participants and found that the processing mechanisms for low-level musical features were more reliable than those for high-level features (Burunat et al. 2016b, a). Unfortunately, all BOLD measurements by fMRI are to some degree confounded since they are indirect assessments of brain activity; they relate to blood flow rather than to electrical processes and are therefore limited by poor temporal resolution due to the protracted hemodynamic response (Brookes et al. 2014; Li et al. 2019). Subsequently, Cong et al. used an analogous correlation-analysis technique to investigate neural rhythms in ongoing EEG data collected while participants listened to the same music stimulus (Cong et al. 2013a, b; Wang et al. 2016).
They found that theta and alpha oscillations over central and occipital areas of the scalp were significantly associated with the processing of high-level (tonal and rhythmic) acoustic features. Many other studies have also examined the neural underpinnings of music listening based on sensor-level EEG data (Jäncke et al. 2015, 2018; Markovic et al. 2017), in which different frequency bands were extracted using time–frequency analysis methods and further analyzed separately (e.g., event-related synchronizations and oscillatory power changes). Those studies showed the influence of different music listening styles on neurophysiological and psychological states as reflected in brain activation. Some sensor-level EEG studies examined the physiological correlates of continuous changes in subjective emotional states while listening to a complete music piece (Mikutta et al. 2012, 2014). Compared with sensor-level EEG analysis, recent studies adopted a mathematical approach (called sLORETA–ICA) combining source localization techniques with ICA to detect independent functional networks during music listening (Jäncke and Alahmadi 2016; Rogenmoser et al. 2016). Although the aforementioned studies investigated oscillatory activation or functional networks during music listening, the specific networks emerging from dynamic processing of musical features are not yet fully understood (Meyer et al. 2006). For example, there is evidence that timbral feature processing was associated with increased activations in cognitive areas of the cerebellum and in sensory and default mode network cerebrocortical areas, whereas musical pulse and tonality processing recruited cortical and subcortical cognitive, motor and emotion-related circuits (Alluri et al. 2012a, b; Meyer et al. 2006). Thus, we aimed to examine the electrophysiological underpinnings of these networks emerging from dynamic processing of musical features.
Independent component analysis (ICA) is a well-established data-driven approach increasingly used to factor resting-state fMRI data into temporally covarying, spatially independent sources or networks. By contrast, in the analysis of EEG/MEG data, ICA has mainly been applied for artifact rejection. Recently, however, spatial Fourier–ICA was proposed for data-driven characterization of oscillatory brain activity in EEG/MEG data. Compared with other ICA methods applied in the context of music listening, the spatial Fourier–ICA used in the current study can automatically extract narrowband oscillations from broadband data without a manually specified frequency band of interest. Spatial Fourier–ICA has already proved fruitful in gaining insights into the electrophysiological underpinnings of brain networks (Kauppi et al. 2013; Li et al. 2018; Ramkumar et al. 2014).
By applying spatial Fourier–ICA in combination with acoustic feature extraction, this study probes the spatial–spectral patterns emerging during music listening. In particular, the current study attempts to provide an analysis framework for identifying the spatial, temporal, and spectral signatures of brain activation recruited during dynamic processing of musical features. Similar to our previous music listening studies (Alluri et al. 2012a, b; Cong et al. 2013a, b), we extracted five musical features from the musical stimulus and obtained spatial, temporal, and spectral factors by applying spatial Fourier–ICA to the EEG data. We then analyzed the correlation between the temporal courses and the musical feature time series to identify frequency-specific brain networks emerging from dynamic processing of musical features. We expected spatial Fourier–ICA to reveal functional oscillatory EEG sources contributing to musical feature processing.
Material and Methods
Data Acquisition
Participants
Fourteen right-handed, healthy adults aged 20 to 46 years were recruited for the current experiment after giving written informed consent. None of them reported hearing loss or a history of neurological illness, and none had professional musical education. However, many participants reported backgrounds in music-related interests such as learning to play an instrument, producing music with a computer, or singing. Table 1 lists the age and the non-professional musical background of each participant. This study was approved by the local ethics committee.
EEG Data Acquisition
During the experiment, participants were instructed to listen to the music with eyes open. A 512 s long piece of modern tango by Astor Piazzolla was used as the stimulus. The music was presented through audio headphones with about 30 dB of gradient noise attenuation. This piece was appropriate for the experimental setting because of its wide range of variation in several musical features such as dynamics, timbre, tonality and rhythm (Alluri et al. 2012a, b). The EEG data were recorded according to the international 10–20 system with BioSemi electrode caps (64 electrodes in the cap and 5 external electrodes at the tip of the nose, left and right mastoids, and around the right eye both vertically and horizontally). The EEG was sampled at 2048 Hz and stored for off-line processing. The external electrode at the tip of the nose was used as the reference, and the EEG channels were later re-referenced to a common average. Data preprocessing was carried out using EEGLAB (Delorme and Makeig 2004). The EEG data were visually inspected for artefacts, and bad channels were interpolated using a spherical spline model. A notch filter at 50 Hz was applied to remove line noise. High-pass and low-pass filters with 1 Hz and 30 Hz cutoff frequencies were then applied, as our previous investigation of the frequency domain revealed no useful information at higher frequencies (Cong et al. 2013a, b). Finally, the data were down-sampled to 256 Hz. To remove EOG artifacts (i.e., eye blinks), ICA was performed on the EEG data of each participant. To additionally remove any DC jumps occasionally present in the data, we differentiated each time series, applied a median filter to reject large discontinuities, and reintegrated the signals (Ramkumar et al. 2012).
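The DC-jump removal step (differentiate, median-filter, reintegrate) can be sketched as follows. This is an illustrative Python sketch, not the authors' original code (the study's preprocessing was done in EEGLAB/MATLAB); the function name and kernel size are our own choices.

```python
import numpy as np
from scipy.signal import medfilt

def remove_dc_jumps(x, kernel=5):
    """Suppress DC jumps: differentiate, median-filter out large
    single-sample discontinuities, then reintegrate (cumulative sum)."""
    dx = np.diff(x, prepend=x[0])         # differentiate
    dx = medfilt(dx, kernel_size=kernel)  # reject large discontinuities
    return np.cumsum(dx)                  # reintegrate

# Example: a 10 Hz oscillation with an artificial DC jump at sample 500
t = np.arange(1000) / 256.0
x = np.sin(2 * np.pi * 10 * t)
x[500:] += 50.0                           # simulated DC jump
clean = remove_dc_jumps(x)                # jump removed, oscillation kept
```

The jump appears as a single outlier in the differentiated signal, which the median filter removes before reintegration.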
Musical Features
Based on the length of the window used in the computational analyses, musical features can generally be classified into two categories: long-term features and short-term features (Alluri et al. 2012a, b; Cong et al. 2013a, b). Five long-term musical features, namely Mode, Key Clarity, Fluctuation Centroid, Fluctuation Entropy and Pulse Clarity, were examined here. They were extracted using a frame-by-frame analysis approach commonly used in the field of Music Information Retrieval (MIR). The duration of the frames was 3 s, and the overlap between two adjacent frames was 67% of the frame length. The chosen frame length is approximately consistent with the length of auditory sensory memory (Alluri et al. 2012a, b). This analysis yielded musical feature time series at a sampling rate of 1 Hz, in accordance with the short-time Fourier transform (STFT) analysis of the EEG data. Thus, both the musical features and the temporal courses of the EEG had 512 time points. All features were extracted using the MIRtoolbox (Lartillot et al. 2008) in the MATLAB environment.
For completeness, we briefly introduce the five features below. We extracted two tonal and three rhythmic features. For the tonal features, Mode represents the strength of major or minor mode, and Key Clarity is a measure of tonal clarity. The rhythmic features comprised Fluctuation Centroid, Fluctuation Entropy, and Pulse Clarity. Fluctuation Centroid is the geometric mean of the fluctuation spectrum, representing the global repartition of rhythm periodicities within the range of 0–10 Hz (Alluri et al. 2012a, b); it indicates the average frequency of these periodicities. Fluctuation Entropy is the Shannon entropy of the fluctuation spectrum, also representing the global repartition of rhythm periodicities, and serves as a measure of the noisiness of the fluctuation spectrum (Alluri et al. 2012a, b; Cong et al. 2013a, b). Pulse Clarity, naturally, is an estimate of the clarity of the pulse (Alluri et al. 2012a, b; Cong et al. 2013a, b).
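As a rough illustration of the frame-by-frame analysis (3 s frames with 2 s overlap, yielding a 1 Hz feature series), the following Python sketch computes a generic per-frame statistic. The actual musical features in the study were extracted with the MATLAB MIRtoolbox; `frame_feature` and the stand-in statistic are hypothetical.

```python
import numpy as np

def frame_feature(x, sr, frame_s=3.0, hop_s=1.0, feature=np.std):
    """Frame-by-frame analysis: 3 s frames with a 1 s hop (67% overlap)
    yield one feature value per second, i.e., a 1 Hz feature series.
    `feature` is a generic stand-in statistic; the study's musical
    features were computed with MIRtoolbox."""
    frame, hop = int(frame_s * sr), int(hop_s * sr)
    n = 1 + (len(x) - frame) // hop          # number of complete frames
    return np.array([feature(x[i * hop:i * hop + frame]) for i in range(n)])

sr = 1000                                    # hypothetical audio sample rate
x = np.random.randn(20 * sr)                 # 20 s stand-in audio signal
ts = frame_feature(x, sr)                    # 1 Hz feature time series
```

With a 20 s signal this gives 18 complete frames; in the study, a 512 s stimulus yielded a 512-point feature series.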
Source Localization
For each subject, the brain’s cortical surface was reconstructed from an anatomical MRI template in Brainstorm (Tadel et al. 2011). Dipolar current sources were estimated at cortically constrained discrete locations (source points) separated by 15 mm. Each hemisphere was modelled by a surface of approximately 2000 vertices; thus, a mesh of approximately 4000 vertices modelled the cortical surface for each subject.
The measured EEG signals are generated by the postsynaptic activity of ensembles of pyramidal neurons in the cerebral cortex (Lei and Yao 2011). These pyramidal neurons can be modelled as current dipoles located on the cortical surface (Lin et al. 2006). The scalp potentials generated by each dipole depend on the characteristics of the various tissues of the head and are measured by the EEG scalp electrodes (Tian et al. 2011). Given the anatomical geometry and the conductivity of the subject’s head, the time course of each dipole’s activity can be assessed by solving two consecutive problems: the forward problem and the inverse problem.
The forward problem is to model the contribution of each dipole to the signals at the EEG electrodes by solving Maxwell’s equations, taking the geometry and conductivity of the head tissues into account. In this study, a forward solution was calculated for each source point using the symmetric boundary element method (BEM), with a relative conductivity coefficient assigned to each tissue (using the default MNI MRI template).
To solve the inverse problem, the minimum-norm estimate (Lin et al. 2006) was adopted with a loose orientation constraint favoring source currents perpendicular to the local cortical surface (no noise modelling). When computing the inverse operator, (1) the source orientations were constrained to be normal to the cortical surface; (2) a depth weighting algorithm was used to compensate for the bias toward superficial sources; and (3) a regularization parameter, \(\lambda^{2} = 0.1\), was used to minimize numerical instability and to obtain a spatially smoothed solution. Finally, an inverse operator G of dimensions \(N_{s} \times N_{c}\) (where \(N_{s}\) is the number of source points and \(N_{c}\) is the number of channels: \(N_{s} \gg N_{c}\)) was obtained to map the data from sensor space to source space. Here, we had \(N_{s} = 4000\) and \(N_{c} = 64\).
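A minimal sketch of a minimum-norm inverse operator with depth weighting and regularization, under simplifying assumptions (identity noise covariance, fixed source orientations, a random lead field) that differ from the Brainstorm implementation used in the study:

```python
import numpy as np

def minimum_norm_operator(L, lam2=0.1, depth=0.8):
    """Minimum-norm inverse operator G (N_s x N_c) for a lead field
    L (N_c x N_s). Depth weighting down-weights superficial sources
    with strong lead fields; lam2 scales the regularization. This is
    a simplified sketch, not the Brainstorm algorithm."""
    Nc, Ns = L.shape
    w = np.sum(L ** 2, axis=0) ** (-depth)   # depth-weighting per source
    R = np.diag(w)                           # source covariance (diagonal)
    gram = L @ R @ L.T                       # (N_c x N_c)
    # Regularization scaled to the mean sensor power
    reg = lam2 * np.trace(gram) / Nc * np.eye(Nc)
    return R @ L.T @ np.linalg.inv(gram + reg)

# Toy example: 64 channels, 200 source points (the study used 4000)
rng = np.random.default_rng(0)
L = rng.standard_normal((64, 200))
G = minimum_norm_operator(L)                 # shape (200, 64)
```

Left-multiplying sensor data by `G` maps it to source space, as described in the next section.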
Spatial Fourier Independent Component Analysis
Spatial Fourier-ICA was recently proposed to characterize oscillatory EEG/MEG activity in cortical source space (Ramkumar et al. 2012, 2014). The main idea is to apply complex-valued ICA to short-time Fourier transforms of source-level EEG/MEG signals to reveal physiologically meaningful components. We briefly introduce the main steps of spatial Fourier-ICA here for completeness. Figure 1 demonstrates the analysis pipeline based on spatial Fourier-ICA and acoustic feature extraction.
Time–Frequency Data in Cortical Source Space
Preprocessed EEG data \({{\varvec{Y}}}_{0}\) (\({N}_{c}\) channels × \({N}_{p}\) sampling points) were transformed by STFT to obtain complex-valued time–frequency representation (TFR) data \({{\varvec{Y}}}_{1}\) (\({N}_{c}\), \({N}_{f}\), \({N}_{t}\)). To obtain TFR data in source space, the three-way sensor-space TFR data \({{\varvec{Y}}}_{1}\) was reorganized as a two-way matrix \({\widehat{{\varvec{Y}}}}_{1}\) (\({N}_{c}\), \({N}_{t}\times {N}_{f}\)). The source-space TFR data \({\widehat{{\varvec{Y}}}}_{2}\) was then obtained by left-multiplying the sensor-space data \({\widehat{{\varvec{Y}}}}_{1}\) by the linear inverse operator G (\({N}_{s}, {N}_{c}\)), which was computed using the minimum-norm estimate inverse solution:
\({\widehat{{\varvec{Y}}}}_{2} = {\varvec{G}}{\widehat{{\varvec{Y}}}}_{1}.\)
The two-way data \({\widehat{{\varvec{Y}}}}_{2}\) (\({N}_{s}\), \({N}_{t}\times {N}_{f}\)) can be rearranged into a three-way tensor \({{\varvec{Y}}}_{2}\) (\({N}_{s}\), \({N}_{t}\), \({N}_{f}\)). For the application of spatial Fourier-ICA, we then rearranged the three-way tensor \({{\varvec{Y}}}_{2}\) as a two-way matrix \({{\varvec{X}}}_{0}\) (\({N}_{t}\),\({N}_{f}\times {N}_{s}\)). Thus, each row of \({{\varvec{X}}}_{0}\) comprised the complex-valued short-time Fourier coefficients from every source point for a specific time point, and each column represented a time point corresponding to a short-time window. In this study, a Hamming window with a 3 s length and 2 s overlap between adjacent windows was selected, resulting in a sampling rate of 1 Hz in the time dimension. This sampling rate was consistent with the musical feature time series (see Musical Features). The duration of the EEG was 512 s, so we had \({N}_{t}=512\) time points. We adopted a 512-point FFT to calculate the STFT, resulting in 256 frequency bins (frequency range: 1–128 Hz) for each window. We selected the frequency bins covering 1–30 Hz (\({N}_{f}=60\)) for further analysis.
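The time–frequency reshaping described above can be illustrated as follows, using toy dimensions rather than the study's actual ones (\(N_c=64\), \(N_s=4000\), 512 s of data); the random matrix `G` is a stand-in for the minimum-norm inverse operator:

```python
import numpy as np
from scipy.signal import stft

# Toy dimensions (the study used Nc = 64, Ns = 4000, 512 s at 256 Hz)
fs, Nc, Ns = 256, 8, 50
x = np.random.randn(Nc, fs * 20)                 # 20 s of sensor data
G = np.random.randn(Ns, Nc)                      # stand-in inverse operator

# STFT: 3 s Hamming windows with 2 s overlap -> ~1 Hz time resolution
f, t, Y1 = stft(x, fs=fs, window='hamming',
                nperseg=3 * fs, noverlap=2 * fs)  # (Nc, Nf_all, Nt)
keep = (f >= 1) & (f <= 30)                      # keep the 1-30 Hz band
Y1 = Y1[:, keep, :]
Nf, Nt = Y1.shape[1], Y1.shape[2]

# Reshape to (Nc, Nf*Nt), map to source space, rearrange to (Nt, Nf*Ns)
Y1_2d = Y1.reshape(Nc, Nf * Nt)
Y2 = (G @ Y1_2d).reshape(Ns, Nf, Nt)             # source-space TFR tensor
X0 = np.transpose(Y2, (2, 1, 0)).reshape(Nt, Nf * Ns)
```

Each row of `X0` then holds the complex Fourier coefficients of all source points for one time window, matching the arrangement described above.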
Application of Complex-Valued ICA on Reshaped Data
To the data \({{\varvec{X}}}_{0}\), we applied complex-valued ICA (Hyvärinen et al. 2010), treating each row as an observed signal assumed to be a linear mixture of unknown spatial-spectral patterns. Since the dimension of the original data \({{\varvec{X}}}_{0}\) was relatively high for the complex ICA calculation, dimension reduction was required as a preprocessing step. A common approach to dimension reduction is principal component analysis (PCA), which is linear. Here we extended PCA to the complex domain by considering complex-valued eigenvalue decomposition (Li et al. 2011). The choice of model order was based on previous studies (Abou-Elseoud et al. 2010; Smith et al. 2009), which suggest a model order slightly larger than the expected number of underlying sources. In this study, we tried different model orders and found 20 to be reasonable, preserving much of the information in the data while reducing the dimensionality of the results. We then extracted 20 independent components using the complex-valued FastICA algorithm, which applies ICA to the STFT of the EEG data in order to find more interesting sources than time-domain ICA (Hyvärinen et al. 2010). This method is especially useful for finding sources of rhythmic activity. After complex-valued ICA, a mixing matrix \(\widehat{{\varvec{A}}}\) (\({N}_{t}\times {N}_{ic}\), where \({N}_{ic}=20\) is the number of components) and an estimated source matrix \(\widehat{{\varvec{S}}}\) were obtained. Each column of \(\widehat{{\varvec{A}}}\) represented the temporal course of one independent component (IC). The ICs in the rows of \(\widehat{{\varvec{S}}}\) (\({N}_{ic}\), \({N}_{f}\times {N}_{s}\)) represented spatial-spectral patterns, which can be decomposed into a spatial power map and a power spectrum.
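The complex-valued PCA dimension-reduction step can be sketched as a whitening based on the eigendecomposition of the Hermitian covariance matrix (the complex FastICA step itself is omitted here; function and variable names are our own):

```python
import numpy as np

def complex_pca_whiten(X, n_comp=20):
    """Complex-valued PCA for dimension reduction before complex ICA.
    X: (n_mixtures, n_samples) complex array; rows are observed signals.
    Returns n_comp whitened mixtures. A simplified sketch."""
    Xc = X - X.mean(axis=1, keepdims=True)
    C = (Xc @ Xc.conj().T) / Xc.shape[1]      # Hermitian covariance
    vals, vecs = np.linalg.eigh(C)            # real eigenvalues, ascending
    order = np.argsort(vals)[::-1][:n_comp]   # keep the largest n_comp
    V = np.diag(vals[order] ** -0.5) @ vecs[:, order].conj().T
    return V @ Xc                             # (n_comp, n_samples), white

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 1000)) + 1j * rng.standard_normal((40, 1000))
Z = complex_pca_whiten(X, n_comp=20)          # reduced, whitened data
```

After whitening, the covariance of `Z` is the identity, which is the standard starting point for FastICA.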
Spatial Map, Spectrum, and Temporal Course of ICs
By reshaping each row of \(\widehat{{\varvec{S}}}\) for each IC, we obtained a matrix (\({N}_{f},{N}_{s}\)), i.e., a Fourier coefficient spectrum for each cortical source point. To obtain and visualize the spatial map of an IC, we computed the average of the squared magnitude of the complex Fourier coefficients across the frequency bins. Since the distribution of mean squared Fourier amplitude over the whole brain is highly non-Gaussian, we did not apply conventional z-score-based thresholding; instead, for each component map we displayed only the source points with the top 5% squared Fourier amplitude (Ramkumar et al. 2012). We also computed the correlations between the spatial maps obtained in individual frequency bins and found them to be similar. To obtain and visualize the spectrum of each IC, we calculated the mean of the Fourier power spectrum across the source points exceeding the 95th percentile (Ramkumar et al. 2012). Finally, we extracted the absolute values of the column of the mixing matrix \(\widehat{{\varvec{A}}}\) corresponding to each estimated IC as its time course, which reflected fluctuations of the Fourier amplitude envelope for the specific frequency and spatial profile.
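The construction of the spatial map (top 5% of source points by mean squared Fourier amplitude) and the component spectrum (mean power over the supra-threshold points) might be sketched as follows; the function name and toy dimensions are our own:

```python
import numpy as np

def map_and_spectrum(s_row, Nf, Ns, top=0.05):
    """From one IC row (length Nf*Ns of complex Fourier coefficients):
    spatial map = mean squared magnitude across frequency bins,
    thresholded at the top 5% of source points; spectrum = mean power
    over the supra-threshold source points."""
    S = np.abs(s_row.reshape(Nf, Ns)) ** 2     # power, (Nf x Ns)
    spatial = S.mean(axis=0)                   # mean power per source point
    thr = np.quantile(spatial, 1 - top)        # 95th percentile threshold
    mask = spatial >= thr
    spectrum = S[:, mask].mean(axis=1)         # mean over top source points
    return np.where(mask, spatial, 0.0), spectrum

rng = np.random.default_rng(2)
row = rng.standard_normal(60 * 400) + 1j * rng.standard_normal(60 * 400)
smap, spec = map_and_spectrum(row, Nf=60, Ns=400)
```

Here roughly 5% of the 400 toy source points survive the threshold, mirroring the display rule used in the study.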
Stability of ICA Decomposition
To examine the stability of the ICA, we ran the ICA decomposition 100 times for each subject with different initial conditions. For the real-valued case, the ICASSO toolbox (Himberg et al. 2004) has been used to evaluate stability among multiple estimates of the FastICA algorithm (Hyvarinen 1999). All components estimated from all runs were collected and clustered based on the absolute value of the correlation coefficients among the squared source estimates. Finally, the stability index \(\mathrm{I}\mathrm{q}\) was computed for each component. \(\mathrm{I}\mathrm{q}\) reflects the isolation and compactness of a cluster (Himberg et al. 2004) and is calculated as follows:
\(\mathrm{I}\mathrm{q}\left(i\right)={\stackrel{-}{S}\left(i\right)}_{int}-{\stackrel{-}{S}\left(i\right)}_{ext}, \quad i=1,\ldots,J,\)
where \({\stackrel{-}{S}\left(i\right)}_{int}\) denotes the average intra-cluster similarity, \({\stackrel{-}{S}\left(i\right)}_{ext}\) the average inter-cluster similarity, and J the number of clusters. \(\mathrm{I}\mathrm{q}\) ranges from 0 to 1. When \(\mathrm{I}\mathrm{q}\) approaches 1, the corresponding component is extracted in almost every ICA run, indicating high stability of the decomposition for that component; otherwise, the decomposition is not stable. Correspondingly, if all clusters are well isolated from each other, the ICA decomposition should be stable. In general, there is no established criterion upon which to base a threshold for cluster quality. Given the preliminary nature of this investigation, we considered the decomposition stable if \(\mathrm{I}\mathrm{q}\) was greater than 0.7.
In this study, the ICASSO toolbox was modified to handle the complex-valued case as well. In real-valued ICASSO, the correlation matrix is used as the similarity measure for clustering. For the complex case, since the ICs were complex-valued, we used the correlation matrix among the magnitudes of the ICs to perform the clustering (Li et al. 2011). We then took \(\mathrm{I}\mathrm{q}\) as the criterion for examining the stability of the ICA estimates.
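A simplified sketch of the stability index: given a similarity matrix over all component estimates (absolute correlations of their magnitudes, in the complex case) and a cluster assignment, Iq is the mean intra-cluster similarity minus the mean inter-cluster similarity. This is an illustrative stand-in for the ICASSO computation, not its exact implementation.

```python
import numpy as np

def stability_index(Sim, labels):
    """Iq(i) = mean intra-cluster similarity - mean inter-cluster
    similarity, for each cluster i. Sim is a symmetric similarity
    matrix over all component estimates; labels assigns each
    estimate to a cluster. Simplified sketch of the ICASSO index."""
    iq = []
    for i in np.unique(labels):
        inside = labels == i
        intra = Sim[np.ix_(inside, inside)].mean()   # within-cluster
        extra = Sim[np.ix_(inside, ~inside)].mean()  # to other clusters
        iq.append(intra - extra)
    return np.array(iq)

# Toy similarity matrix: two well-separated clusters of estimates
Sim = np.array([[1.0, 0.9, 0.1, 0.1],
                [0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.8],
                [0.1, 0.1, 0.8, 1.0]])
labels = np.array([0, 0, 1, 1])
iq = stability_index(Sim, labels)    # both clusters well above 0.7
```

Compact, isolated clusters yield Iq close to 1, matching the interpretation given above.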
Testing for Stimulus-Related Networks
After ICA decomposition, we obtained 20 × 14 = 280 ICs (14 subjects, 20 components per subject). The challenge is then to determine which of these represent genuine brain responses. In all ICA-based methods, a general question is which independent components should be retained and which merely reflect noise. Here, we examined which components were significantly modulated by the musical features. We computed the correlation (Pearson’s correlation coefficient) between the time courses of the musical features and the time courses of the ICs (both of length 512 points) in order to select stimulus-related activations. We used the Monte Carlo and permutation testing approach presented in our previous research (Alluri et al. 2012a, b; Cong et al. 2013a, b), in which a Monte Carlo simulation determines the significance threshold for the correlation coefficient corrected for multiple comparisons. ICs whose time courses were significantly correlated (p < 0.05) with the musical feature time series were kept for further analysis.
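The Monte Carlo thresholding idea can be illustrated as follows. This simplified surrogate scheme (correlating the feature with randomly generated time courses) is a stand-in for the exact permutation procedure of Alluri et al. (2012a, b); the function name and parameters are our own.

```python
import numpy as np

def permutation_threshold(feature, n_len, n_perm=2000, alpha=0.05, seed=0):
    """Monte Carlo estimate of the significance threshold for the
    absolute correlation between a feature time series and a surrogate
    (random) IC time course of the same length. Simplified stand-in
    for the study's permutation scheme."""
    rng = np.random.default_rng(seed)
    f = (feature - feature.mean()) / feature.std()
    null = np.empty(n_perm)
    for k in range(n_perm):
        surrogate = rng.standard_normal(n_len)       # random time course
        null[k] = np.abs(np.corrcoef(f, surrogate)[0, 1])
    return np.quantile(null, 1 - alpha)              # null 95th percentile

feature = np.sin(np.linspace(0, 40, 512))            # stand-in feature series
thr = permutation_threshold(feature, 512)
```

Observed correlations exceeding `thr` would then be deemed significant at the chosen alpha level (before any multiple-comparison correction).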
Cluster Analysis
The selected ICs were represented by their spatial map, spectrum, and temporal course. Since spatial ICA was carried out on individual-level EEG data, we needed to examine the inter-subject consistency among participants. In this study, we focused on the spatial patterns emerging during free listening to music, so a group-level analysis was performed by clustering the spatial maps of the selected ICs to evaluate consistency among participants. For reliable clustering, a conventional z-score normalization was applied to each spatial map. All spatial maps of the components significantly correlated with musical features were clustered into M clusters to find spatial patterns common to most participants. For simplicity, a conventional k-means clustering algorithm was used, initialized with the Kaufman Approach (KA). We used the minimum description length (MDL) criterion to determine the number of clusters M. Afterwards, we counted the number of subjects contributing ICs to each cluster. If the number of subjects in a cluster was less than half of all subjects, the cluster was discarded, as it did not reveal information shared among enough participants. For the retained clusters, the spatial-spectral-temporal information was obtained, represented by the centroid of the cluster, the spectra of its ICs, and the number of subjects whose temporal courses were involved in the cluster.
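The group-level clustering step might be sketched as follows. The KA initialization and the MDL-based choice of M are not reproduced here (standard k-means++ and a fixed number of clusters are used instead), and the data are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic z-scored spatial maps (rows) from several subjects:
# 6 subjects x 3 latent spatial patterns plus noise
rng = np.random.default_rng(3)
base = rng.standard_normal((3, 100))                 # 3 latent patterns
maps, subjects = [], []
for subj in range(6):
    for pattern in range(3):
        maps.append(base[pattern] + 0.1 * rng.standard_normal(100))
        subjects.append(subj)
maps = np.asarray(maps)
# z-score each map before clustering, as in the study
maps = (maps - maps.mean(axis=1, keepdims=True)) / maps.std(axis=1, keepdims=True)

# k-means clustering of the spatial maps (k-means++ initialization)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(maps)

# Keep only clusters containing more than half of the subjects
n_subj = len(set(subjects))
kept = [c for c in range(3)
        if len({s for s, l in zip(subjects, labels) if l == c}) > n_subj / 2]
```

In this toy example every cluster contains all subjects, so all three clusters survive the majority criterion.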
Results
Musical Features
Five musical features were extracted with the MIRtoolbox (Lartillot and Toiviainen 2007) using a 3 s time window with 2 s overlap, resulting in a 1 Hz sampling rate for the temporal courses. They were Fluctuation Centroid, Fluctuation Entropy, Key Clarity, Mode and Pulse Clarity. The time series of these features had a length of 512 samples, matching the length of the time courses of the EEG components. Figure 2 shows their temporal courses.
Stability of ICA Decomposition
We extracted 20 ICs using the modified ICASSO with 100 runs for each subject’s data and then obtained the stability index \(\mathrm{I}\mathrm{q}\). Figure 3 shows the magnitude of the \(\mathrm{I}\mathrm{q}\) values for each participant, which were greater than 0.7 for most ICs. From the clustering perspective, the 20 ICs were well separated from each other for every participant. Thus, the ICA estimates were stable, and the decomposition results for each participant’s data were satisfactory for further analysis.
Interesting Clusters: Frequency-Specific Networks
After the 85 ICs whose temporal courses were significantly correlated with musical features were selected, we set the number of clusters to five, using MDL to estimate the optimal model order. The spatial maps of the ICs were then clustered into five clusters. Three clusters representing frequency-specific networks were retained, since the number of subjects in each was more than half of all subjects. Figure 4 shows one of these clusters, including the centroid of all spatial maps (Fig. 4a), the distribution of the number of subjects across musical features (Fig. 4b), and the spectra of the ICs in this cluster (Fig. 4c). We then computed the correlation coefficients among the spatial maps in each cluster to evaluate the performance of the clustering. Figure 5 shows the intra-cluster similarity. We computed the mean correlation coefficient in each cluster and the corresponding standard deviation (SD): for cluster #1, the mean is 0.642 (SD 0.1238); for cluster #2, 0.7125 (SD 0.0572); and for cluster #3, 0.8084 (SD 0.0747). This indicates that the spatial patterns are similar across participants. In Table 2, we list the participants whose EEG data were correlated with each musical feature in each cluster.
Beta-Specific Network
Figure 4 shows the results for the beta-specific brain networks engaged in processing musical features. The spatial map shows that musical features were associated with increased activation in the bilateral superior temporal gyrus (STG). The spectra of the ICs in this cluster indicate that the beta rhythm (centered on 20 Hz) was involved in generating this network. Thus, a relatively large-scale brain region driven by the beta rhythm was activated in the bilateral STG, with slightly stronger activation in the right hemisphere than in the left. This beta-specific network was found in seven subjects during free listening to music (see the first row of Table 2). Fluctuation Centroid was associated with this network in subjects 2, 4, 5, and 12, while the networks of subjects 1, 2, 3 and 13 were correlated with the key feature. For Fluctuation Entropy, Pulse Clarity and Mode, one subject each was involved in this cluster. In addition, the number of ICs correlated with the musical features could exceed the number of participants, since there were 20 ICs per subject.
Alpha-Specific Network
Figure 6 displays relatively large brain activity in the bilateral occipital lobe according to the spatial map. As can be seen, the oscillations of this pattern were dominated by the alpha rhythm (centered on 10 Hz), with a few ICs in the delta band. Eight participants showed alpha-specific occipital networks during free listening to music. The second row of Table 2 lists the subjects involved in the networks linked with each musical feature.
Delta-Beta-Specific Network
Figure 7a illustrates increased activity linked with musical features in the bilateral prefrontal gyrus (PFG). The spectrum (Fig. 7c) shows that both beta and delta oscillations recruited these areas across participants. The delta-beta-specific networks were found in eight subjects. Mode was associated with these networks in subjects 2, 3, 4, 6, 7 and 9, while the networks of subjects 4, 5, 7, 9 and 11 were correlated with Fluctuation Centroid (see the third row of Table 2).
Discussion
In this study, we investigated the spatial-spectral profiles of brain networks during free listening to music. To this end, we proposed a novel method combining spatial ICA, source localization and music information retrieval. EEG data were recorded while participants listened freely to a piece of music. First, we applied the STFT to the preprocessed EEG data. Next, an inverse operator was obtained using source localization, and the sensor-space data were mapped to source space. Complex-valued ICA was then performed to extract spatial-spectral patterns, and the stability of the ICA estimates was evaluated using a complex-valued ICASSO. Meanwhile, the temporal evolutions of five long-term musical features were extracted with the commonly used MIRtoolbox. Following this, the spatial-spectral ICs related to the music stimulus were selected by correlating their temporal courses with the musical feature time series. To examine inter-subject consistency, a cluster analysis was applied to the spatial patterns of the retained ICs. Overall, our results highlight the frequency-dependent brain networks engaged during free listening to music. The results are consistent with previous findings published in other studies (Alluri et al. 2012a, b; Cong et al. 2013a, b; Janata et al. 2002).
It was found that beta-specific brain networks in the bilateral STG emerged from dynamic processing of musical features (see Fig. 4). The bilateral STG was the region most activated during music listening and was involved in the processing of long-term musical features. Interestingly, beta oscillations were enhanced in this bilateral spatial profile (see Fig. 4c). This spatial-spectral pattern appeared more related to Fluctuation Centroid and key processing than to Fluctuation Entropy, Mode and Pulse Clarity (see Fig. 4b). The same areas were found in previous fMRI studies, where timbre-related features were correlated with activations in large areas of the temporal lobe (Alluri et al. 2012a, b). In addition, early MEG studies demonstrated that cortical beta-band activity (15–30 Hz) was tightly coupled to behavioral performance in music listening and was associated with predicting upcoming note events (Doelling and Poeppel 2015). Since the beta band has been associated with motor and rhythmic processes, listeners may voluntarily engage in motor-related mental activity while listening to segments that invite dancing (Meyer et al. 2006; Poikonen et al. 2018b). For participants who like dancing, music is comprehensive and collaborative: it forms a setting in which dancers produce movements that are coherent with (or intentionally in contrast to) the prevailing sound in terms of rhythm, sentiment, and movement style (Poikonen et al. 2018a). When listening freely, a participant might focus more on the gist of the music than on the sequence of an individual instrument, the melody contour, or a rhythmic pattern. Importantly, in the current study, no participant was familiar with the presented music stimulus. Thus, the beta-specific brain networks emerging in the bilateral STG could reflect the activation of higher-level brain processes (Pearce et al. 2010; Poikonen et al. 2018b).
We also observed alpha oscillatory visual networks (see Fig. 6), in line with our previous study (Cong et al. 2013a, b). Alpha oscillations play an important role in basic cognitive processes and are linked to the suppression and selection of attention (Klimesch 2012). Event-related brain activity in the alpha band has been reported in studies of sensory and motor tasks as well as attention and working memory tasks. For example, when participants performed hand-movement tasks, alpha event-related synchronization was observed over the leg area of the motor cortex, whereas alpha event-related desynchronization was observed over the hand area. This compensatory distribution of alpha activity indicates that alpha oscillations in task-irrelevant regions are associated with cortical disengagement (Pfurtscheller 2003). This may explain why alpha-specific power over the visual cortices increased when attention was focused on the auditory stimuli.
A delta–beta oscillatory network in the prefrontal cortex was also observed during music listening (see Fig. 7). Helfrich and Knight argued that the prefrontal cortex provides the structural basis for numerous higher cognitive functions, and that its oscillatory dynamics provide a functional basis for flexible cognitive control of goal-directed behavior (Helfrich and Knight 2016). The prefrontal cortex also exerts top-down control through entrainment (Helfrich and Knight 2016). Our findings provide evidence that higher cognitive functions with specific rhythms are engaged during continuous, naturalistic music listening. Janata et al. identified an area in the rostromedial prefrontal cortex as a possible substrate for tonal processing (Janata et al. 2002). In addition, several studies demonstrated that oscillations in the delta and beta bands are instrumental in predicting the occurrence of auditory targets (Arnal et al. 2015; Doelling and Poeppel 2015). Music is a powerful stimulus for modulating emotional arousal; increases in posterior alpha, central delta, and beta rhythms have been observed during high arousal (Mikutta et al. 2012; Poikonen et al. 2016a, b). This may explain why the delta–beta oscillations in this study appeared in the prefrontal cortex (Fig. 7).
From a methodological perspective, most previous studies investigated a single pattern of the spatial-spectral profile and did not examine the interplay between brain networks and spectral modes. In contrast, we studied the interactions between brain regions and cortical oscillations and found that brain networks during music listening are frequency-dependent. With the proposed approach for analyzing frequency-specific networks during naturalistic music listening, we can reliably identify the spatial-spectral patterns elicited by the musical stimulus. Related approaches have applied spatial ICA in various forms to investigate resting-state networks (RSNs) in MEG data. Nugent et al. proposed a method, multiband ICA, to derive frequency-specific spatial profiles of RSNs. However, six frequency bands (delta, theta, alpha, beta, gamma, high gamma) must first be extracted from the MEG data and concatenated along a chosen dimension before ICA is applied to the concatenated data (Nugent et al. 2017). Similar methods were proposed in (Sockeel et al. 2016). In contrast, the approach proposed here is completely data-driven and does not require pre-defining frequency bands. Another important asset of our study is that clustering was applied to the spatial maps to examine inter-subject consistency, and correlation coefficients were then computed within each cluster. We observed that the individual spatial-spectral profiles within every retained cluster were similar, whereas the corresponding time courses differed. This differs from event-related potential (ERP) analysis, where temporal ICA components sharing identical spatial profiles tend to be similar. The differences might result from participants responding differently to real-world experiences. In the future, we will attempt to develop group spatial ICA to analyze group-level data in which the individual data are concatenated along the time dimension.
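The inter-subject clustering of spatial maps described above can be sketched with standard hierarchical clustering on a correlation distance. This is an illustrative reimplementation under assumed data shapes, not the exact procedure used in the paper; the map matrix, the linkage method, and the cluster count are placeholders.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical data: one source-space spatial map per retained IC,
# pooled across subjects (rows = ICs, columns = source points).
rng = np.random.default_rng(1)
n_ics, n_sources = 30, 1000
spatial_maps = rng.standard_normal((n_ics, n_sources))

# Correlation distance groups maps with similar spatial profiles
# regardless of amplitude scaling, matching how spatial similarity
# is typically assessed across subjects.
dist = pdist(spatial_maps, metric="correlation")
Z = linkage(dist, method="average")
labels = fcluster(Z, t=5, criterion="maxclust")  # e.g., at most 5 clusters
```

Clusters whose members come from many different subjects would then indicate spatial profiles that are consistent across the group, while their distinct time courses capture individual responses.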
Conclusion
In this study, we introduced a novel framework combining Fourier ICA, source estimation, acoustic feature extraction, and clustering to exploit the spectral-spatial structure of the brain under naturalistic stimulation. A complex-valued ICA was applied to the source-space time–frequency representation of the EEG data. A modified ICASSO was then performed to evaluate the stability of the ICA estimates, and a cluster analysis was applied to examine inter-subject consistency. The identified networks involved in music perception were in line with previous studies. Furthermore, we found that brain networks during music listening were frequency-specific, and we observed three frequency-dependent networks associated with the processing of musical features.
References
Abou-Elseoud A, Starck T, Remes J, Nikkinen J, Tervonen O, Kiviniemi V (2010) The effect of model order selection in group PICA. Hum Brain Mapp 31(8):1207–1216
Alluri V, Toiviainen P (2010) Exploring perceptual and acoustical correlates of polyphonic timbre. Music Perception: An Interdisciplinary Journal 27(3):223–242
Alluri V, Toiviainen P, Jaaskelainen IP, Glerean E, Sams M, Brattico E (2012a) Large-scale brain networks emerge from dynamic processing of musical timbre, key and rhythm. Neuroimage 59(4):3677–3689. https://doi.org/10.1016/j.neuroimage.2011.11.019
Alluri V, Toiviainen P, Jääskeläinen IP, Glerean E, Sams M, Brattico E (2012b) Large-scale brain networks emerge from dynamic processing of musical timbre, key and rhythm. Neuroimage 59(4):3677–3689
Alluri V, Toiviainen P, Lund TE, Wallentin M, Vuust P, Nandi AK, … Brattico E (2013) From Vivaldi to Beatles and back: predicting lateralized brain responses to music. Neuroimage 83:627–636. https://doi.org/10.1016/j.neuroimage.2013.06.064
Arnal LH, Doelling KB, Poeppel D (2015) Delta-beta coupled oscillations underlie temporal prediction accuracy. Cereb Cortex 25(9):3077–3085. https://doi.org/10.1093/cercor/bhu103
Brookes MJ, O'Neill GC, Hall EL, Woolrich MW, Baker A, Corne SP, … Barnes GR (2014) Measuring temporal, spectral and spatial changes in electrophysiological brain network connectivity. Neuroimage 91:282–299
Burunat I, Alluri V, Toiviainen P, Numminen J, Brattico E (2014) Dynamics of brain activity underlying working memory for music in a naturalistic condition. Cortex 57:254–269. https://doi.org/10.1016/j.cortex.2014.04.012
Burunat I, Toiviainen P, Alluri V, Bogert B, Ristaniemi T, Sams M, Brattico E (2016a) The reliability of continuous brain responses during naturalistic listening to music. Neuroimage 124(Pt A):224–231. https://doi.org/10.1016/j.neuroimage.2015.09.005
Burunat I, Toiviainen P, Alluri V, Bogert B, Ristaniemi T, Sams M, Brattico E (2016b) The reliability of continuous brain responses during naturalistic listening to music. NeuroImage 124:224–231
Cong F, Alluri V, Nandi AK, Toiviainen P, Fa R, Abu-Jamous B, … Huotilainen M (2013) Linking brain responses to naturalistic music through analysis of ongoing EEG and stimulus features. IEEE Trans Multimed 15(5):1060–1069
Daly I, Malik A, Hwang F, Roesch E, Weaver J, Kirke A, … Nasuto SJ (2014) Neural correlates of emotional responses to music: an EEG study. Neurosci Lett 573:52–57. https://doi.org/10.1016/j.neulet.2014.05.003
Daly I, Williams D, Hallowell J, Hwang F, Kirke A, Malik A, … Nasuto SJ (2015) Music-induced emotions can be predicted from a combination of brain activity and acoustic features. Brain Cogn 101:1–11. https://doi.org/10.1016/j.bandc.2015.08.003
Delorme A, Makeig S (2004) EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 134(1):9–21. https://doi.org/10.1016/j.jneumeth.2003.10.009
Doelling KB, Poeppel D (2015) Cortical entrainment to music and its modulation by expertise. Proc Natl Acad Sci USA 112(45):E6233–6242. https://doi.org/10.1073/pnas.1508431112
Hasson U, Malach R, Heeger DJ (2010) Reliability of cortical activity during natural stimulation. Trends Cogn Sci 14(1):40–48. https://doi.org/10.1016/j.tics.2009.10.011
Helfrich RF, Knight RT (2016) Oscillatory dynamics of prefrontal cognitive control. Trends Cogn Sci 20(12):916–930. https://doi.org/10.1016/j.tics.2016.09.007
Himberg J, Hyvarinen A, Esposito F (2004) Validating the independent components of neuroimaging time series via clustering and visualization. Neuroimage 22(3):1214–1222. https://doi.org/10.1016/j.neuroimage.2004.03.027
Hyvarinen A (1999) Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans Neural Netw 10(3):626–634. https://doi.org/10.1109/72.761722
Hyvarinen A, Ramkumar P, Parkkonen L, Hari R (2010) Independent component analysis of short-time Fourier transforms for spontaneous EEG/MEG analysis. Neuroimage 49(1):257–271. https://doi.org/10.1016/j.neuroimage.2009.08.028
Janata P, Birk JL, Van Horn JD, Leman M, Tillmann B, Bharucha JJ (2002) The cortical topography of tonal structures underlying Western music. Science 298(5601):2167–2170
Jäncke L, Alahmadi N (2016) Detection of independent functional networks during music listening using electroencephalogram and sLORETA-ICA. Neuroreport 27(6):455–461
Jäncke L, Kühnis J, Rogenmoser L, Elmer S (2015) Time course of EEG oscillations during repeated listening of a well-known aria. Front Hum Neurosci 9:401
Jäncke L, Leipold S, Burkhard A (2018) The neural underpinnings of music listening under different attention conditions. Neuroreport 29(7):594–604
Kauppi J-P, Parkkonen L, Hari R, Hyvärinen A (2013) Decoding magnetoencephalographic rhythmic activity using spectrospatial information. Neuroimage 83:921–936
Klimesch W (2012) Alpha-band oscillations, attention, and controlled access to stored information. Trends Cogn Sci 16(12):606–617. https://doi.org/10.1016/j.tics.2012.10.007
Koskinen M, Viinikanoja J, Kurimo M, Klami A, Kaski S, Hari R (2013) Identifying fragments of natural speech from the listener's MEG signals. Hum Brain Mapp 34(6):1477–1489. https://doi.org/10.1002/hbm.22004
Lankinen K, Saari J, Hari R, Koskinen M (2014) Intersubject consistency of cortical MEG signals during movie viewing. Neuroimage 92:217–224. https://doi.org/10.1016/j.neuroimage.2014.02.004
Lartillot O, Toiviainen P (2007) A Matlab toolbox for musical feature extraction from audio. In: Proceedings of the International Conference on Digital Audio Effects
Lartillot O, Toiviainen P, Eerola T (2008) A Matlab toolbox for music information retrieval. In: Data analysis, machine learning and applications. Springer, pp 261–268
Lei X, Yao D (2011) EEG source localization based on multiple fMRI spatial patterns. In: Advances in cognitive neurodynamics (II). Springer, pp 381–385
Li C, Yuan H, Shou G, Cha Y-H, Sunderam S, Besio W, Ding L (2018) Cortical statistical correlation tomography of EEG resting state networks. Front Neurosci 12.
Li F, Yi C, Song L, Jiang Y, Peng W, Si Y, … Zhang Y (2019) Brain network reconfiguration during motor imagery revealed by a large-scale network analysis of scalp EEG. Brain Topogr 32(2):304–314
Li H, Correa NM, Rodriguez PA, Calhoun VD, Adali T (2011) Application of independent component analysis with adaptive density model to complex-valued fMRI data. IEEE Trans Biomed Eng 58(10):2794–2803. https://doi.org/10.1109/TBME.2011.2159841
Lin FH, Belliveau JW, Dale AM, Hamalainen MS (2006) Distributed current estimates using cortical orientation constraints. Hum Brain Mapp 27(1):1–13. https://doi.org/10.1002/hbm.20155
Liu C, Abu-Jamous B, Brattico E, Nandi AK (2017) Towards tunable consensus clustering for studying functional brain connectivity during affective processing. Int J Neural Sys 27(02):1650042
Malcolm GL, Groen II, Baker CI (2016) Making sense of real-world scenes. Trends Cogn Sci 20(11):843–856. https://doi.org/10.1016/j.tics.2016.09.003
Markovic A, Kühnis J, Jäncke L (2017) Task context influences brain activation during music listening. Front Hum Neurosci 11:342
Meyer M, Baumann S, Jancke L (2006) Electrical brain imaging reveals spatio-temporal dynamics of timbre perception in humans. Neuroimage 32(4):1510–1523
Mikutta C, Altorfer A, Strik W, Koenig T (2012) Emotions, arousal, and frontal alpha rhythm asymmetry during Beethoven’s 5th symphony. Brain Topogr 25(4):423–430
Mikutta C, Maissen G, Altorfer A, Strik W, König T (2014) Professional musicians listen differently to music. Neuroscience 268:102–111
Nugent AC, Luber B, Carver FW, Robinson SE, Coppola R, Zarate CA Jr (2017) Deriving frequency-dependent spatial patterns in MEG-derived resting state sensorimotor network: A novel multiband ICA technique. Hum Brain Mapp 38(2):779–791. https://doi.org/10.1002/hbm.23417
Pampalk E, Rauber A, Merkl D (2002) Content-based organization and visualization of music archives. In: Proceedings of the tenth ACM international conference on Multimedia. pp 570–579
Pearce MT, Ruiz MH, Kapasi S, Wiggins GA, Bhattacharya J (2010) Unsupervised statistical learning underpins computational, behavioural, and neural manifestations of musical expectation. Neuroimage 50(1):302–313
Pfurtscheller G (2003) Induced oscillations in the alpha band: functional meaning. Epilepsia 44:2–8
Poikonen H, Alluri V, Brattico E, Lartillot O, Tervaniemi M, Huotilainen M (2016a) Event-related brain responses while listening to entire pieces of music. Neuroscience 312:58–73
Poikonen H, Toiviainen P, Tervaniemi M (2016b) Early auditory processing in musicians and dancers during a contemporary dance piece. Sci Rep 6:33056
Poikonen H, Toiviainen P, Tervaniemi M (2018a) Dance on cortex: Enhanced theta synchrony in experts when watching a dance piece. Eur J Neurosci 47(5):433–445
Poikonen H, Toiviainen P, Tervaniemi M (2018b) Naturalistic music and dance: cortical phase synchrony in musicians and dancers. PloS One 13(4):e0196065
Ramkumar P, Parkkonen L, Hari R, Hyvärinen A (2012) Characterization of neuromagnetic brain rhythms over time scales of minutes using spatial independent component analysis. Hum Brain Mapp 33(7):1648–1662
Ramkumar P, Parkkonen L, Hyvärinen A (2014) Group-level spatial independent component analysis of Fourier envelopes of resting-state MEG data. Neuroimage 86:480–491
Rogenmoser L, Zollinger N, Elmer S, Jäncke L (2016) Independent component processes underlying emotions during natural music listening. Soc Cogn Affect Neurosci 11(9):1428–1439
Schaefer RS, Desain P, Farquhar J (2013) Shared processing of perception and imagery of music in decomposed EEG. Neuroimage 70:317–326. https://doi.org/10.1016/j.neuroimage.2012.12.064
Smith SM, Fox PT, Miller KL, Glahn DC, Fox PM, Mackay CE, … Laird AR (2009) Correspondence of the brain's functional architecture during activation and rest. Proc Nat Acad Sci 106(31):13040–13045
Sockeel S, Schwartz D, Pelegrini-Issac M, Benali H (2016) Large-scale functional networks identified from resting-state EEG using spatial ICA. PLoS One 11(1):e0146845. https://doi.org/10.1371/journal.pone.0146845
Spiers HJ, Maguire EA (2007) Decoding human brain activity during real-world experiences. Trends Cogn Sci 11(8):356–365. https://doi.org/10.1016/j.tics.2007.06.002
Sturm I, Dahne S, Blankertz B, Curio G (2015) Multi-variate EEG analysis as a novel tool to examine brain responses to naturalistic music stimuli. PLoS One 10(10):e0141281. https://doi.org/10.1371/journal.pone.0141281
Tadel F, Baillet S, Mosher JC, Pantazis D, Leahy RM (2011) Brainstorm: a user-friendly application for MEG/EEG analysis. Comput Intell Neurosci 2011:879716. https://doi.org/10.1155/2011/879716
Tian Y, Klein RM, Satel J, Xu P, Yao D (2011) Electrophysiological explorations of the cause and effect of inhibition of return in a cue–target paradigm. Brain Topogr 24(2):164–182
Toiviainen P, Alluri V, Brattico E, Wallentin M, Vuust P (2014) Capturing the musical brain with Lasso: dynamic decoding of musical features from fMRI data. Neuroimage 88:170–180. https://doi.org/10.1016/j.neuroimage.2013.11.017
Wang D, Cong F, Zhao Q, Toiviainen P, Nandi AK, Huotilainen M, Ristaniemi T, Cichocki A (2016) Exploiting ongoing EEG with multilinear partial least squares during free-listening to music. In: 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)
Zhu Y, Liu J, Mathiak K, Ristaniemi T, Cong F (2020) Deriving electrophysiological brain network connectivity via tensor component analysis during freely listening to music. IEEE Trans Neural Syst Rehabil Eng 28(2):409–418
Zhu Y, Liu J, Ristaniemi T, Cong F (2019) Distinct patterns of functional connectivity during the comprehension of natural, narrative speech. Int J Neural Syst. https://doi.org/10.1142/S0129065720500070
Acknowledgement
Open access funding provided by University of Jyväskylä (JYU). This work was supported by the National Natural Science Foundation of China (Grant No. 91748105), the Fundamental Research Funds for the Central Universities [DUT2019] in Dalian University of Technology in China, and the scholarship from China Scholarship Council (No. 201600090042).
Author information
Authors and Affiliations
Corresponding author
Additional information
Handling editor: Christoph M. Michel.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
The features were extracted from the stimulus on a frame-by-frame basis (see Alluri and Toiviainen 2010 for more details). A brief description of each acoustic feature is given below; a detailed explanation can be found in the user manual of the MIRtoolbox (Lartillot and Toiviainen 2007).
Mode strength of major or minor mode.
Key Clarity the strength of the estimated key, computed as the maximum of cross-correlations between the chromagram extracted from the music and tonality profiles representing all the possible key candidates.
Fluctuation Centroid geometric mean of the fluctuation spectrum representing the global repartition of rhythm periodicities within the range of 0–10 Hz, indicating the average frequency of these periodicities.
Fluctuation Entropy Shannon entropy of the fluctuation spectrum (Pampalk et al. 2002) representing the global repartition of rhythm periodicities. Fluctuation entropy is a measure of the noisiness of the fluctuation spectrum. For example, a noisy fluctuation spectrum can be indicative of several co-existing rhythms of different periodicities, thereby indicating a high level of rhythmic complexity.
Pulse Clarity the strength of rhythmic periodicities in the sound, representing how easily the underlying pulsation in music can be perceived.
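The two fluctuation descriptors above can be illustrated with a simplified sketch. This is not the MIRtoolbox implementation; the helper function, frequency grid, and test spectra below are assumptions chosen only to show how a centroid and a Shannon entropy behave on a fluctuation spectrum.

```python
import numpy as np

def fluctuation_descriptors(spectrum, freqs):
    """Centroid and Shannon entropy of a fluctuation spectrum (sketch).

    The centroid is computed here as the amplitude-weighted mean
    frequency; the entropy measures how evenly energy spreads across
    rhythm periodicities (flatter spectrum = higher entropy).
    """
    p = spectrum / spectrum.sum()               # normalize to a distribution
    centroid = float(np.sum(freqs * p))         # weighted mean frequency (Hz)
    entropy = float(-np.sum(p * np.log2(p + 1e-12)))  # Shannon entropy (bits)
    return centroid, entropy

# 0-10 Hz range of rhythm periodicities, as described in the text.
freqs = np.linspace(0, 10, 101)

# A spectrum concentrated at a single 2 Hz periodicity vs. a flat one.
peaked = np.exp(-0.5 * ((freqs - 2.0) / 0.2) ** 2)
flat = np.ones_like(freqs)

c_peaked, e_peaked = fluctuation_descriptors(peaked, freqs)
c_flat, e_flat = fluctuation_descriptors(flat, freqs)
# e_peaked < e_flat: a noisier (flatter) fluctuation spectrum indicates
# higher rhythmic complexity, matching the Fluctuation Entropy definition.
```

A spectrum dominated by one periodicity yields a centroid near that periodicity and low entropy, while several co-existing rhythms spread the energy and raise the entropy, which is exactly the contrast the Fluctuation Centroid and Fluctuation Entropy features are meant to capture.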
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhu, Y., Zhang, C., Poikonen, H. et al. Exploring Frequency-Dependent Brain Networks from Ongoing EEG Using Spatial ICA During Music Listening. Brain Topogr 33, 289–302 (2020). https://doi.org/10.1007/s10548-020-00758-5