There are many complex dynamical systems in nature whose function defies analysis with traditional approaches applied to recorded signals. When the observed dynamics are relatively simple, e.g. stationary periodicities, long-established analytical tools, such as the Fourier transform, can sufficiently characterize the signal patterns. More complex dynamics, such as bifurcations and chaotic oscillations, require more sophisticated approaches. The challenges posed by brain activity are more severe due to the inherent nonlinearities and the metastability of the underlying system.

Symbolization, also known as symbolic time-series analysis, is among the most important techniques in temporal data mining. It involves the transformation of raw signal measurements into a time series of discretized symbols. The discrete representation of the original time series makes available a variety of powerful techniques, founded on information and communication theory, for a more mathematically concrete manipulation of dynamics. For example, the properties of symbolic encodings are part of the core theory in communication (Shannon and Weaver 1998), Markov chains for discrete systems (Seneta 1981) and bioinformatics (Baldi and Brunak 1998).

The earliest developments in symbolic dynamics began with the study of the complex behavior of dynamical systems. In 1898, Hadamard developed a symbolic description of sequences in geodesic flows on surfaces of negative curvature (Hadamard 1898). Specifically, he identified a finite set of forbidden symbol pairs (those which cannot occur) and noted that the possible sequences were those which did not contain the forbidden pairs. The notion of forbidden patterns was recently applied to financial time series (Zanin 2008). The developments in symbolic dynamics followed several diverse directions, including the analysis of nonlinear oscillators, the treatment of nonlinear maps, and the establishment of a connection with information theory and the notion of metric entropy (for a detailed review see Daw et al. 2001).

A generic method for analyzing a dynamical system, which exploits the properties of its trajectory in a phase space (obtained via time-delay embedding of the original 1D time series), is based on recurrence plots (Marwan et al. 2007) and aims at revealing the dynamical invariants of the system. Recently, a technique called Fuzzy Symbolic Dynamics (FSD) has been proposed to enhance this information mapping (Duch and Dobosz 2011). FSD attempts to assign each multidimensional point to a neighborhood and then to map the encapsulated information to a simplified and more understandable two- or three-dimensional diagram.

Various complexity measures stemming from information theory (for a review see Gao et al. 2011) have recently been applied to symbolic sequences derived from multichannel EEG/MEG data in order to understand the nature of brain dynamics, to develop novel diagnostic methods for brain pathologies and also to discriminate between different cognitive tasks. In a recent study, EEG data from Alzheimer patients were transformed into symbols of 0, 1, 2 employing the median, maximum and minimum value of the amplitude as thresholds in a channel-wise fashion (Abasolo et al. 2006). Subsequently, the central tendency and the Lempel-Ziv complexity measures were applied (for definitions refer to Gao et al. 2011). In an MEG study (Amigo et al. 2010) employing data from ADHD children and controls, the authors adopted a spatio-temporal approach, in which the symbolization was first performed independently for each channel, by using the mean amplitude of the signal as a threshold to produce binary sequences. They then applied a new permutation complexity measure based on forbidden and ordinal patterns. Finally, the technique of ordinal patterns itself can be considered as a symbolization approach that has already gained popularity and found important applications, e.g. in the detection of epileptic seizures (Ouyang et al. 2010).

The available symbolization techniques work either with the original 1D time series or with its representation in a reconstructed phase space. In the case of multichannel EEG/MEG recordings where the complex system dynamics are densely sampled (and correspond to activations from distributed brain networks), a more delicate encoding is desirable. The component-wise treatment (in which symbols are assigned separately for each channel’s signal) is unsatisfactory, as it misses possible patterns of coordinated brain activity. Considering that such patterns are dynamic signatures of segregation and integration around which cognition is known to revolve, the pursuit of a symbolization scheme that would respect them was a plausible research objective.

This paper introduces the use of the neural-gas algorithm (Martinez et al. 1993) for symbolic encoding of neural spatiotemporal dynamics. The multichannel signal is considered as a time-dependent sequence of instantaneous patterns (vectors) that, when fed to a suitable vector-quantization scheme, can be transformed to a symbolic time series. Neural-gas is a self-organizing network that efficiently derives the codebook (i.e. a set of prototypical patterns) that can describe the given multichannel data in a faithful manner. The codebook design corresponds to the symbol-selection step and precedes the vector-quantization step, which corresponds to the symbol-assignment step. Neural-gas has already been successfully applied to clustering temporal patterns of MEG responses (Laskaris et al. 2004), organizing odour responses in optical recording data (Laskaris et al. 2008) and classifying saccadic eye-movements (Bozas et al. 2010). Its ability to mine key patterns from voluminous datasets is exploited here, for the first time, in the setting of multichannel encephalographic recordings.

This paper is organized as follows. In “Methods”, the neural-gas based symbolization scheme is described, followed by an algorithm that represents brain dynamics as a network topology that emphasizes state transitions. Section “Results” is devoted to the results obtained by applying the proposed technique to experimental EEG data. Section “Discussion” discusses the benefits and the potential of the proposed methodology.


Neural-gas based symbolization

Assume that some unknown sources contribute to a multidimensional signal that is changing in time, for example an EEG/MEG signal measured by N electrodes:

$$ {\mathbf{X}}^{{{\mathbf{data}}}} = \left\{ {{\text{x}}_{\text{i}} \left( {\text{t}} \right)} \right\}\;{\text{i}} = 1, \ldots ,{\text{N}}\;{\text{and}}\;{\text{t}} = 1,\;2, \ldots {\text{T}} $$

The vector X(t) = [x1(t), x2(t) …, xN(t)] represents the state of the “dynamical system” at time t.

The partition of all T vectors X(t) into groups of homogeneous patterns is the most direct way to summarize the temporal variations in the EEG/MEG recordings. In our approach, a codebook of k code vectors is designed by applying the neural-gas algorithm to the data matrix X^data. This algorithm is an artificial neural network model, which converges efficiently to a small number k ≪ T of codebook vectors {Mi}, i = 1, …, k, using a stochastic gradient descent procedure with a soft-max adaptation rule that minimizes the average distortion error (Martinez et al. 1993).
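The codebook-design step can be sketched in a few lines of Python. The following is only a minimal illustration of the rank-based (soft-max) neural-gas update, not the implementation used in this study: the function name `neural_gas`, the number of iterations, the exponential annealing schedules and the initial/final values of the step size and neighborhood range are all assumptions chosen for illustration.

```python
import math
import random


def neural_gas(data, k, n_iter=2000, eps=(0.5, 0.005), lam=None, seed=0):
    """Minimal neural-gas sketch returning k code vectors.

    data : list of T input vectors X(t), each a list of N floats
    eps  : (initial, final) step size        -- illustrative values
    lam  : (initial, final) neighborhood range -- illustrative values
    """
    rng = random.Random(seed)
    if lam is None:
        lam = (k / 2.0, 0.01)
    # Initialize the codebook with k distinct, randomly chosen data vectors.
    codebook = [list(v) for v in rng.sample(data, k)]
    for step in range(n_iter):
        # Exponentially anneal step size and neighborhood range.
        frac = step / max(n_iter - 1, 1)
        eps_t = eps[0] * (eps[1] / eps[0]) ** frac
        lam_t = lam[0] * (lam[1] / lam[0]) ** frac
        x = rng.choice(data)
        # Rank all code vectors by their squared distance to the sample.
        order = sorted(
            range(k),
            key=lambda i: sum((a - m) ** 2 for a, m in zip(x, codebook[i])),
        )
        # Soft-max adaptation: every code vector moves towards the sample,
        # with a strength that decays exponentially with its rank.
        for rank, i in enumerate(order):
            h = eps_t * math.exp(-rank / lam_t)
            codebook[i] = [m + h * (a - m) for a, m in zip(x, codebook[i])]
    return codebook
```

Because all code vectors are adapted at every step (with rank-dependent strength), the result depends only weakly on the initialization, which is one reason the algorithm is attractive for codebook design.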

Each of the T vectors X(t) is then assigned to its nearest code vector. In this way, the bulk of information contained in the data matrix is represented, in a parsimonious way, by a (T × k) partition matrix U, with elements uij indicating the assignment of input vectors to code vectors. Following the inverse procedure, we can rebuild X^data from the k code vectors with a small reconstruction error. The reconstructed version of X^data is denoted as X_R^data. To quantify the fidelity of the overall encoding procedure, we adopt an index defined as the total distortion error divided by the total dispersion of the data:

$$ {\text{n}}_{\text{Distortion}} = \frac{\sum\limits_{{\text{t}} = 1}^{\text{T}} \left\| {\text{X}}({\text{t}}) - {\text{X}}_{\text{R}}({\text{t}}) \right\|^{2}}{\sum\limits_{{\text{t}} = 1}^{\text{T}} \left\| {\text{X}}({\text{t}}) - \overline{\text{X}} \right\|^{2}},\quad \overline{\text{X}} = \frac{1}{\text{T}}\sum\limits_{{\text{t}} = 1}^{\text{T}} {\text{X}}({\text{t}}) $$
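As a hedged illustration, the normalized distortion index can be computed as follows. The function name `n_distortion` and the list-of-lists data layout (one row per time point) are assumptions made for this sketch.

```python
def n_distortion(data, reconstructed):
    """Total distortion error divided by the total dispersion of the data.

    data, reconstructed : lists of T vectors (lists of N floats), where
    reconstructed[t] is the code vector that replaces data[t].
    """
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    n_samples = len(data)
    n_channels = len(data[0])
    # Mean pattern over time (the X-bar term of the index).
    mean = [sum(x[i] for x in data) / n_samples for i in range(n_channels)]
    distortion = sum(sqdist(x, xr) for x, xr in zip(data, reconstructed))
    dispersion = sum(sqdist(x, mean) for x in data)
    return distortion / dispersion
```

A perfect reconstruction gives 0, whereas replacing every vector by the overall mean pattern (the trivial one-symbol codebook) gives 1.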

The smaller the nDistortion, the better the encoding. This index decreases as k increases and reaches a plateau for a relatively small value of k. In the present study, we considered as acceptable the encoding produced with the smallest k that satisfied the condition that nDistortion be less than 8%. Hence, we repeatedly applied the neural-gas algorithm with increasing k and measured the reconstruction quality. In this way we defined the optimal ko, which in turn defined the codebook to use in the subsequent symbolization scheme. At the vector-quantization stage, each vector X(t) is assigned (according to the nearest-prototype rule) to the most similar among the derived prototypical patterns Mi, i = 1, 2, …, ko. This step completes the mapping from multichannel data to a symbolic time series s(t), t = 1, 2, …, T, which in mathematical notation reads as follows

$$ \begin{aligned} {\text{X}}({\text{t}}) = [{\text{x}}_{1} ({\text{t}}),{\text{x}}_{2} ({\text{t}}), \ldots ,{\text{x}}_{\text{N}} ({\text{t}})]\; \in \;{\text{ R}}^{\text{N}} \to & {\text{M}}_{\text{j}} = \left[ {{\text{m}}_{ 1}^{\text{j}} ,{\text{m}}_{ 2}^{\text{j}} , \ldots ,{\text{m}}_{\text{N}}^{\text{j}} } \right]\; \in \;\{ {\text{M}}_{\text{i}} \}_{\text{i = 1}}^{{{\text{k}}_{\text{o}} }}, {\text{M}}_{\text{i}} \; \in \;{\text{R}}^{\text{N}} \\ {\text{X}}({\text{t}}) \to & {\text{s}}({\text{t}}) = {\text{j}}\; \in \; \, \left\{ {1,2, \ldots {\text{k}}_{\text{o}} } \right\} \\ \end{aligned} $$
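The nearest-prototype mapping above, together with the inverse (reconstruction) step, can be sketched as follows. The function names `symbolize` and `reconstruct` are hypothetical, and the Euclidean distance is assumed as the similarity measure.

```python
def symbolize(data, codebook):
    """Map each vector X(t) to the 1-based index s(t) of its nearest code vector."""
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [
        min(range(len(codebook)), key=lambda j: sqdist(x, codebook[j])) + 1
        for x in data
    ]


def reconstruct(symbols, codebook):
    """Inverse step: replace each symbol by its code vector."""
    return [codebook[s - 1] for s in symbols]
```

For instance, with the two code vectors `[[0, 0], [10, 10]]`, the input sequence `[[1, 0], [9, 9], [0, 1]]` is encoded as the symbol sequence `[1, 2, 1]`.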

This single time series encodes the spatiotemporal dynamics of brain activity as a succession of adaptively defined (i.e. data-dependent) symbols. The temporal aspects of the dynamics can be read directly from the time course of the symbols, while the spatial patterning of brain activity can be deciphered only implicitly. Bearing in mind that in many situations the ultimate goal of data analysis is to provide a systematic quantitative comparison of the dynamics observed under different recording conditions (e.g. rest vs. active state), we next describe a novel scheme that can fully exploit the derived symbolic time series towards this end.

A network representation of neural-gas based symbolic time series

Recently, a method to convert a sequence of symbols into a weighted directed network has been introduced (Sinatra et al. 2010). This opens the possibility to characterize a given symbolic sequence in terms of well-established network metrics and further facilitates the comparison between symbolic sequences on the basis of network topologies. For consistency, we first summarize the main ideas in the original general setting and then describe how they were adopted for the purposes of the particular study. The method starts by first defining k-motifs which are particular strings of length k. After enumerating all the motifs in the given symbolic sequence, a network is built with nodes representing the different motifs and links denoting pairs of motifs with statistically significant co-occurrence in the sequence.

In our implementation, we restricted ourselves to single-symbol motifs. Hence the obtained network consists of ko nodes (i.e. each node corresponds to a particular code vector). The links of the network indicate those symbol pairs that were found in consecutive positions in the symbolic time series. The direction of each link indicates the order of appearance, and the weights were estimated with the following procedure. We first estimated the observed probability pobs(α, β) that symbol α is followed by symbol β within the symbolic time series s(t). To detect the significantly correlated appearance of symbols, we need to estimate the probability of random co-occurrence of these two symbols. We denote as p(α) and p(β) the probabilities of finding the two symbols in s(t). The symbol α can occupy positions ranging from the first to the (T − 1)th position, where T is the length of s(t). For each fixed position i of α, with i = 1, …, (T − 1), there are (T − 1 − i) possible positions for β to appear in the sequence. Hence, weighting the number of possible placements of the pair by p(α)p(β), the expected value of the transition α → β under random co-occurrence is given by the equation:

$$ {\text{p}}^{\exp } (\alpha_{ \to } \beta ) = {\text{p}}(\alpha ){\text{p}}(\beta )\sum\limits_{{{\text{i}} = 1}}^{{{\text{T}} - 2}} {({\text{T}} - 1 - {\text{i}})} $$
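As a side note, the sum in the expression above can be evaluated in closed form (substituting m = T − 1 − i), which makes the dependence on the sequence length T explicit:

$$ \sum\limits_{{\text{i}} = 1}^{{\text{T}} - 2} ({\text{T}} - 1 - {\text{i}}) = \sum\limits_{{\text{m}} = 1}^{{\text{T}} - 2} {\text{m}} = \frac{({\text{T}} - 1)({\text{T}} - 2)}{2} $$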

A weight wα,β can be associated with the link from α to β, based on the extent to which the number of observed transitions deviates from the expected value:

$$ {\text{w}}_{\alpha ,\beta } = \begin{cases} {\text{p}}^{\text{obs}} (\alpha_{ \to } \beta ), & {\text{if }}\; \dfrac{{\text{p}}^{\text{obs}} (\alpha_{ \to } \beta )}{{\text{p}}^{\exp } (\alpha_{ \to } \beta )} > 3 \\ 0, & {\text{otherwise}} \end{cases} $$
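The thresholding rule can be illustrated with the following sketch. Note the simplifying assumption, which is ours and not necessarily the normalization used in this study: the observed probability is taken as the relative frequency of the consecutive pair, and the expected value is taken as the plain product p(α)p(β), i.e. without the combinatorial factor of the expression given above. The function name `transition_weights` is hypothetical.

```python
from collections import Counter


def transition_weights(symbols, ratio_threshold=3.0):
    """Weights of the directed links between consecutively occurring symbols.

    ASSUMPTION (illustrative only): p_obs is the relative frequency of the
    consecutive pair (a, b) and p_exp = p(a) * p(b); the combinatorial factor
    of the expected-value expression in the text is omitted here.
    """
    n = len(symbols)
    p = {a: c / n for a, c in Counter(symbols).items()}
    pair_counts = Counter(zip(symbols, symbols[1:]))
    weights = {}
    for (a, b), count in pair_counts.items():
        p_obs = count / (n - 1)
        p_exp = p[a] * p[b]
        # Keep only transitions observed much more often than expected.
        if p_obs / p_exp > ratio_threshold:
            weights[(a, b)] = p_obs
    return weights
```

Under this simplified normalization, a strictly cyclic sequence over five symbols retains its five transitions (observed-to-expected ratio close to 5), whereas a sequence that merely alternates between two symbols yields no link at all (ratio close to 2).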

The obtained network topology emphasizes the state transitions that appear systematically and, therefore, are expected to have a crucial role in brain function.

Among the whole battery of metrics that can be applied to the derived codebook-network, we describe here the use of (weighted) directed global efficiency (GE), which in the case of a ko-nodes network takes the form

$$ {\text{GE}}_{\text{W}}^{ \to } = \frac{1}{{\text{k}}_{\text{o}}}\sum\limits_{{\text{i}} \in {\text{k}}_{\text{o}}} \frac{\sum\limits_{{\text{j}} \in {\text{k}}_{\text{o}} ,{\text{j}} \ne {\text{i}}} \left( {\text{d}}_{\text{ij}}^{\text{w}} \right)^{ - 1}}{{\text{k}}_{\text{o}} - 1} $$

GE is the inverse of the harmonic mean of the shortest path lengths between every pair of nodes. The dij values in Eq. 6 result from a simple transform that is applied to the wα,β values so as to ‘translate’ them into pairwise ‘distances’ between nodes (Latora and Marchiori 2001). GE values range between 0 and 1, with high values indicating an increased (with respect to randomness) number of state transitions, and hence a highly non-stable system.
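A minimal sketch of the GE computation for a small codebook-network is given below. The weight-to-distance transform is not specified in detail above; the sketch assumes the common choice of link length 1/w followed by shortest-path computation (Floyd-Warshall), and the function name `global_efficiency` is hypothetical.

```python
def global_efficiency(weights, nodes):
    """Weighted directed global efficiency of a small network.

    weights : dict {(i, j): w_ij} holding the positive link weights
    nodes   : list of node labels
    ASSUMPTION: the length of a link is 1 / w_ij (one common transform);
    d_ij is then the shortest directed path length from i to j.
    """
    inf = float("inf")
    d = {(i, j): (0.0 if i == j else inf) for i in nodes for j in nodes}
    for (i, j), w in weights.items():
        if i != j and w > 0:
            d[(i, j)] = min(d[(i, j)], 1.0 / w)
    # Floyd-Warshall: all-pairs shortest directed paths.
    for m in nodes:
        for i in nodes:
            for j in nodes:
                if d[(i, m)] + d[(m, j)] < d[(i, j)]:
                    d[(i, j)] = d[(i, m)] + d[(m, j)]
    k = len(nodes)
    # Unreachable pairs have infinite distance and contribute zero.
    total = sum(1.0 / d[(i, j)] for i in nodes for j in nodes if i != j)
    return total / (k * (k - 1))
```

With weights not exceeding 1, every link length is at least 1, so the measure stays within [0, 1]: a fully connected network with unit weights gives 1, and a network without links gives 0.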

Contrasting brain dynamics

The issue of comparing spatiotemporal dynamics from different brain states or associated with different tasks remains open. This is mainly due to the high inter-subject variability and the nature of the encountered signals, which do not comply with the conditions of normality and stationarity that are necessary for the applicability of traditional methods. The usually preferred path is to extract dynamic invariants, i.e. characteristics which robustly describe the underlying system regardless of the particular realization of the available recordings and the presence of noise. The application of standard metrics to the codebook-network(s) falls within this category and could be considered as an appealing option. In this study, however, we found (experimentally) that a slightly modified strategy can better facilitate the direct comparison between spatiotemporal dynamics. The difference of the adopted approach lies in the design of the codebook, which is common for the two (or more) conditions to be compared. Provided the datasets X^data = {xi(t)} and Y^data = {yi(t)}, i = 1, …, N and t = 1, 2, …, T, the neural-gas algorithm is applied to the augmented dataset Z^data = [X^data | Y^data]. Using the derived code vectors, both the symbolization step and the formation of the codebook-network proceed as before. In this way, the GE measure derived for each of the conditions to be compared is expressed in common terms (i.e. refers to the same codebook) and, hence, reflects differences in a more concrete manner. Figure 1 provides a schematic illustration of the proposed strategy.
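The common-codebook strategy can be sketched as follows. The function `contrast_symbolize` and its argument `fit_codebook` (any routine that returns k code vectors for a given dataset, e.g. a neural-gas implementation) are hypothetical names introduced only for this illustration.

```python
def contrast_symbolize(x_data, y_data, fit_codebook, k):
    """Symbolize two conditions with one codebook designed on their union.

    x_data, y_data : lists of vectors (one per time point) for each condition
    fit_codebook   : hypothetical callable (data, k) -> list of k code vectors
    """
    # Augmented dataset Z = [X | Y]: concatenation along the time axis.
    z_data = x_data + y_data
    codebook = fit_codebook(z_data, k)

    def nearest(v):
        # 1-based index of the closest code vector (Euclidean distance).
        return 1 + min(
            range(len(codebook)),
            key=lambda j: sum((a - m) ** 2 for a, m in zip(v, codebook[j])),
        )

    # Each condition is encoded separately, but with the shared symbols.
    return codebook, [nearest(v) for v in x_data], [nearest(v) for v in y_data]
```

Because both symbolic sequences refer to the same set of symbols, the two resulting codebook-networks share their nodes and differ only in their links, which is what makes the derived GE values directly comparable.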

Fig. 1 Schematic illustration of the proposed methodology (t refers to the length of the signal segment, S denotes the number of the subjects and cond 1, 2 refers to the conditions)


Experimental data

To demonstrate the suggested methodology, we utilized EEG data from healthy subjects recorded while performing two different mental arithmetic tasks (comparison and multiplication) and during a resting state (control). The particular datasets have been used in previous studies (Micheloyannis et al. 2005; Dimitriadis et al. 2010a, b) and their description is provided in this article as supplementary material. Here, the purpose was to test whether our symbolization scheme can serve the comparison of brain activity dynamics. For comparison purposes, we also applied the related technique of ordinal patterns (Ouyang et al. 2010), which has gained high popularity and is based on symbols (associated with the ordinal patterns) assigned in a channel-wise fashion.

Different signal representations

Various comparisons were performed after filtering the signal appropriately and using the frequency band as a parameter (e.g. θ, α1; see Table 1). Apart from the frequency range, we tested extensively whether the (filtered) signal in its original form, or in a form that emphasizes either amplitude or phase dynamics, better facilitates the differentiation between recording conditions. To isolate the amplitude and phase dynamics from the filtered signal x(t), we applied the Hilbert transform (Cohen 1995), which returns the instantaneous amplitude A(t) and instantaneous phase φ(t) and is defined as follows

$$ \begin{gathered} \chi ({\text{t}}) = {\text{x}}({\text{t}}) + {\text{j}}\,\widetilde{\text{x}}({\text{t}}) = {\text{A}}({\text{t}})\;{\text{e}}^{{\text{j}}\varphi ({\text{t}})} \\ \widetilde{\text{x}}({\text{t}}) = \frac{\lambda }{\pi }\int\limits_{ - \infty }^{\infty } \frac{{\text{x}}(\tau )}{{\text{t}} - \tau }\,{\text{d}}\tau \end{gathered} $$

where λ denotes the Cauchy principal value of the integral. An illustration of the three different signal representations, derived from the time series recorded at a particular channel, is provided in Fig. 2.
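For a sampled signal, the analytic signal is usually obtained in the frequency domain rather than by evaluating the integral directly. The sketch below, which uses only the Python standard library and a plain O(n²) discrete Fourier transform for the sake of self-containment, illustrates the idea on a short segment; the function names are hypothetical, and in practice an FFT-based routine (e.g. `scipy.signal.hilbert`, which returns the analytic signal) would be the natural choice.

```python
import cmath
import math


def analytic_signal(x):
    """Discrete analytic signal chi[t] = x[t] + j * x_tilde[t] via the DFT."""
    n = len(x)
    # Forward DFT (O(n^2); adequate for a short illustrative segment).
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and suppress the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for f in range(1, (n + 1) // 2):
        gain[f] = 2.0
    # Inverse DFT of the one-sided spectrum.
    return [
        sum(gain[f] * spectrum[f] * cmath.exp(2j * math.pi * f * t / n)
            for f in range(n)) / n
        for t in range(n)
    ]


def amplitude_and_phase(x):
    """Instantaneous amplitude A(t) and phase phi(t), in radians within (-pi, pi]."""
    chi = analytic_signal(x)
    return [abs(c) for c in chi], [cmath.phase(c) for c in chi]
```

For a pure cosine with an integer number of cycles in the segment, this yields a constant amplitude equal to the cosine's amplitude and a phase that advances linearly (wrapped to the interval from −π to π).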

Table 1 Global efficiency (GE) averaged values corresponding to the three possible comparisons

Fig. 2 A 2 s (1,024 samples) segment of brain activity recorded from a single channel and filtered in the θ band (a). The signals of instantaneous amplitude A(t) and phase φ(t) are shown in (b) and (c) respectively

Differentiation of task-related brain dynamics

The new symbolization scheme, followed by the codebook-network analysis, was applied, in a contrastive fashion, to all possible pairs of recording conditions (control vs. comparison, control vs. multiplication and comparison vs. multiplication). For every frequency band and each of the three different signal representations (i.e. x(t), A(t), φ(t)), the pair of GE measures was derived independently for each subject. To summarize across subjects, the computed set of GE pairs was analyzed via the Wilcoxon test. Figure 1 depicts schematically the overall procedure. Setting the significance level at P = 0.001, we filtered out the non-significant results. Figure 3 illustrates the comparison of brain activity dynamics, performed on the basis of a single subject’s data and expressed in the codebook-network representation. The contrasted conditions (control and multiplication) are associated with different connectivity patterns (a noticeable difference lies in the identity of forbidden transitions) and these differences have a direct implication for the measured GE metric.

Fig. 3 (subject 1) The codebook-network representation of a symbolic time series derived by means of neural-gas based quantization applied to combined data from the control and multiplication tasks. The multichannel signal had been filtered within the θ band and transformed into its instantaneous phase representation. The size of the codebook was 5, hence the depicted network includes 5 nodes. The arrow width encodes the proportion of transitions between particular pairs

The statistical analysis of GE-values showed that phase representation was the most suitable one for detecting task-related changes in brain dynamics. Table 1 includes the corresponding, averaged across-subjects, scores.

The most important trend is that GE, which in this setting describes the spatiotemporal dynamics of instantaneous phase, succeeded in discriminating between the tasks across the whole frequency spectrum. Interestingly, the control condition is characterized by significantly lower GE values for all frequency bands.

The observation about the enhanced importance of phase information in the multichannel symbolization seems to be in accordance with recent findings. Phase synchronization reflects the exact timing of communication between distant but functionally related neural populations (for a review see Sauseng and Klimesch 2008). According to a well-known hypothesis (Varela et al. 2001), the most plausible mechanism for neural integration is the formation of particular dynamical links that are reflected as phase synchronization in the EEG signals. The derived codebook manifests such patterns of phase coupling and the transition network indicates that the dynamic motifs are task-dependent.

Finally, it is important to mention here that the approach of ordinal patterns (considered as an alternative symbolization scheme which was applied in channel-wise fashion) failed to reveal significant differences between the mathematical tasks.


A symbolization scheme capable of handling multichannel recordings of brain activity and useful for contrasting dynamics from different conditions was introduced and applied to EEG data from mental calculations. Considering the emerging patterns of coordinated activity as an important aspect of the underlying mechanisms, we developed a symbolic dynamics methodology that respects the brain’s multistable character.

Among the outcomes of this study was that GE values are higher during multiplication than during comparison (for all frequency bands). This observation is important since the two mathematical tasks depend upon neurophysiological processes that are known to differ regarding their nature and the distribution of the activated brain regions. According to several neuroimaging studies, the comparison task is a more localized procedure, while difficult (two-digit) multiplication demands the activation of a widely distributed network (Micheloyannis et al. 2005; see Supp. Material).

At a methodological level, our approach bears some common characteristics with the mode level cognitive subtraction (MLCS) (Banerjee et al. 2008). However, our scheme relies on a non-linear summarisation of the multichannel data and the network representation of the mode sequences. Moreover, our approach shares the ‘prototyping’ step with the pioneering work on segmenting brain activity into functional microstates (Pascual-Marqui et al. 1995). However, it incorporates advanced algorithms both for the codebook design (which in our case is subject-dependent) and for treating the dynamics (symbolic time series are mapped to network topologies).

Our scheme can be readily adapted to various recording modalities (MEG, fMRI etc.) and used for comparing dynamics between healthy and diseased brains on the basis of a variety of different representations (e.g. network metrics time series; Dimitriadis et al. 2010a).