Introduction

EEG microstate analysis has become a widely used method to characterize spontaneous and evoked brain activity patterns (Michel and Koenig 2018). Complexity is a hallmark of integrated, large-scale brain activity, and researchers have an intuition of how complexity manifests in their data, be it neuronal spiking (Amigó et al. 2004; Szczepański et al. 2004), local field potentials (Abásolo et al. 2014), functional neuroimaging (Xin et al. 2021; Nezafati et al. 2020; Hancock et al. 2022), electroencephalography (EEG) (Casali et al. 2013), or magnetoencephalography (MEG) (Fernández et al. 2011).

Yet, there is no unique theoretical notion of what complexity is or how it should be measured, and the multifaceted nature of the concept is reflected in the large number of complexity definitions found in the literature (Shalizi 2006). Definitions of complexity have emerged in different scientific disciplines, and some concepts have re-emerged under different names, which makes it challenging to maintain an overview of this research area (Prokopenko et al. 2008; Ay et al. 2011; Crutchfield and Feldman 2003). The specific aims of this article will be stated after a brief review of complexity concepts in general, and of those used to characterize EEG microstate sequences so far. We will highlight links between these measures as well as connections to areas outside microstate research.

Concepts of Complexity

There are at least two basic flavours of complexity, each founded on a different intuition about what complexity represents.

One concept, known as algorithmic or Kolmogorov complexity (Alekseev and Yakobson 1981), defines complexity as the length of the shortest program that is able to reproduce the input data. This measure increases monotonically with the amount of ‘randomness’ in the data. Intuitively, the shortest algorithm reproducing a random sequence is a command that prints exactly that sequence, and the length of that algorithm is approximately the same as the length of the dataset. In the microstate context, this would correspond to a sequence of independent samples from the set of microstate labels, e.g., from \(\{A, B, C, D\}\), possibly weighted by their relative occurrence. At the non-random end of the spectrum, a sequence that consists of a single repeated label, e.g., \(AAA\ldots\), is reproduced by a very short instruction such as ‘print A, n times’. As will be explained further below, there are more practical approaches to measuring Kolmogorov complexity than trying to find the actual shortest program, namely entropy rate and Lempel–Ziv complexity.
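
The intuition behind Kolmogorov complexity can be illustrated with a general-purpose compressor: the compressed size of a sequence is a practical upper bound on the length of the shortest program that reproduces it. The following minimal sketch is our own illustration (it uses Python's zlib, an LZ-77 style compressor, and is not part of the analyses reported here); the constant sequence shrinks to a few bytes whereas the random sequence barely compresses:

```python
import random
import zlib

n = 10_000

# Maximally ordered: a single repeated microstate label ('print A, n times').
ordered = "A" * n

# Maximally random: independent, uniform samples from {A, B, C, D}.
rng = random.Random(42)
random_seq = "".join(rng.choice("ABCD") for _ in range(n))

for name, seq in [("ordered", ordered), ("random", random_seq)]:
    n_bytes = len(zlib.compress(seq.encode()))
    print(f"{name}: {n_bytes} compressed bytes "
          f"({8 * n_bytes / n:.3f} bits/symbol)")
```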

A separate family of complexity measures was developed because many researchers are uncomfortable with the concept of complexity being essentially the same as randomness. Following the Kolmogorov complexity concept, randomly connected neurons are more complex than a real brain, and electrical white noise is more complex than actual brain electrical activity. The ‘statistical complexity’ concept, on the other hand, asks how difficult it is to obtain a statistical model of the data and aims to construct complexity measures that are bell-shaped when plotted against randomness (Huberman and Hogg 1986; Lindgren and Nordahl 1988). Both complexity concepts agree in that extremely ordered systems should be assigned a low complexity, as they are non-random and a compact model can often be formulated. An example of a highly ordered spatial system is a crystal structure, as is a sine wave in the world of time series. Where the two complexity concepts differ is in their assessment of highly random systems. These have high Kolmogorov complexity but their statistical complexity is low because a simple assumption (statistical independence) can already provide a good statistical model of the data (Grassberger 1986; Crutchfield 1994). A similar line of reasoning led to the concept of Tononi–Sporns–Edelman complexity, which has attracted attention as a theory of consciousness and brain function in general (Tononi et al. 1994).

In Fig. 1 these concepts are illustrated with snapshots of the Potts model that will be formalized further below. The model shown attains one of four discrete states on each node of a regular square lattice and the dynamics are controlled by a parameter that corresponds to thermodynamic temperature. From left to right, with increasing temperature, the spatial patterns become increasingly random and disordered. This reflects Kolmogorov complexity, which increases monotonically with temperature. Statistical complexity, however, has its peak around the critical temperature of the model (\(T=T_c\)) where a phase transition occurs. Around \(T_c\), the most extensive spatial and temporal correlations occur and statistical forecasting of the model states is most challenging.

Fig. 1
figure 1

Two complexity concepts illustrated with snapshots of a 2D Potts lattice model (\(Q=4\), \(128 \times 128\) nodes) at different temperatures T. Left: the Potts model at low temperature (\(0.8 \times T_c\), \(T_c\) is the critical temperature) has low Kolmogorov complexity and low statistical complexity. Center: The most extensive spatial and temporal correlations occur close to the phase transition (\(T=T_c\)), related to intermediate Kolmogorov complexity and maximum statistical complexity. Right: Above the critical temperature (\(3.0 \times T_c\)), spatial features are apparently random (low order) and result in large Kolmogorov complexity (randomness) and low statistical complexity

The two complexity concepts discussed can be quantified by a range of metrics that require some disambiguation of the historical terminology. Kolmogorov complexity can be estimated as the randomness of a signal after all correlations have been taken into account (irreducible randomness, Prokopenko et al. 2008). This is captured by the entropy rate, which is derived from joint entropy (or block entropy) estimates of signal subsequences of different lengths, or via Lempel–Ziv complexity, which measures this type of complexity by compressing the signal on the basis of repeated patterns (Lempel and Ziv 1976). The more a signal can be compressed, the lower its algorithmic (Kolmogorov) complexity. Entropy rate is also known as Kolmogorov–Sinai entropy in the field of dynamical systems and chaos theory (Shalizi 2006).

Statistical complexity can be assessed by different measures. One of these measures was called statistical complexity in Crutchfield and Young (1989), but had already appeared as true measure complexity in Grassberger (1986). This quantity is based on a graph model of the underlying process that needs to be reconstructed from empirical data. As model reconstruction is not trivial, approximations can be studied instead (Grassberger 1986; Prokopenko et al. 2008). A lower bound to the true measure complexity was named effective measure entropy in Grassberger (1986) and excess entropy in Crutchfield and Feldman (1997). We will use the latter term as it is used in the recent literature and expresses Grassberger’s initial observation that finite estimates of the entropy rate converge slowly towards the asymptotic value. Excess entropy can be expressed as the surplus of entropy across all finite entropy rate estimates (Grassberger 1986; Crutchfield and Feldman 2003). The same quantity appeared as predictive information in Bialek et al. (2001) and the name refers to the interpretation that statistical complexity measures the predictability of a time series. This idea is also expressed by its information-theoretic definition as the shared information between the past and the future of a signal, relative to an arbitrary observation time point. The concept of statistical complexity has therefore also been named forecasting complexity (Zambella and Grassberger 1988).

Complexity and Microstate Research

In the area of EEG microstate research, entropy rate, Lempel–Ziv complexity (LZC), and Hurst exponents have been explored as complexity measures in recent years. Hurst exponents were first used for microstate analysis by Van de Ville et al. (2010). Although the authors' aim was to address self-similarity and fractality rather than to measure complexity explicitly, the relationship between Hurst exponents and complex system properties is present throughout the article. We evaluated this approach in relation to Markov models of microstate sequences in von Wegner et al. (2016) and applied it to cognitive load assessment in Jia et al. (2021).

Entropy rate estimation for microstate sequence analysis was introduced in von Wegner et al. (2018a), and we evaluated its changes during different types of cognitive effort in Jia et al. (2021), and for NREM sleep stages in Wiemers et al. (2023). We interpreted entropy rate in terms of sequence predictability in von Wegner et al. (2018a) and Jia et al. (2021), and as a complexity measure in Wiemers et al. (2023).

Next, Lempel–Ziv complexity analysis of microstate sequences was introduced in Tait et al. (2020), where a loss of complexity in the EEG of Alzheimer disease patients was demonstrated by applying a quantity called Omega complexity to the raw EEG signal (Wackermann 1996) and LZC to microstate sequences. A more recent variant of the Lempel–Ziv algorithm was used subsequently in Artoni et al. (2022), where concentration-dependent effects of propofol on microstate LZC were investigated. An intermediate approach can be found in Irisawa et al. (2006), where EEG topographies were classified with Omega complexity and the results were compared to microstate duration; however, complexity analysis was not applied to microstate sequences.

Aims and Outline

The specific aims of this article are (1) to evaluate an explicit measure of statistical complexity (excess entropy) on microstate data, as the existing microstate complexity studies have focused on Kolmogorov complexity, (2) to test the theoretical equivalence between entropy rate and LZC that is valid for stationary stochastic processes (Ziv 1978), (3) to compare the aforementioned measures to Hurst exponents, and (4) to test the influence of first-order Markovian correlations on these complexity measures.

As the ground truth about brain states is unknown in empirical data, we first evaluate the selected metrics (entropy rate, excess entropy, Lempel–Ziv Complexity, and Hurst exponents) on a well-understood numerical model from statistical physics, the discrete Potts model (Wu 1982).

The Potts model offers some advantages in this context. First, the model produces time series over a discrete state space with an arbitrary number of states, similar to EEG microstate sequences. The number of states is often denoted Q for the Potts model, and K in microstate research, related to the initial K-means clustering of EEG data. The second advantage of the Potts model is that a single control parameter (temperature) controls the appearance of a phase transition. The common elements between the Potts model and EEG microstates are (a) entropy, which is closely linked to temperature in statistical physics, but also has an interpretation in terms of time series predictability, and (b) phase transitions, which have been discussed as an important feature of resting-state brain activity and are often quantified by complexity metrics such as the Hurst exponent, for example in EEG research (Linkenkaer-Hansen et al. 2001; Kantelhardt et al. 2015; von Wegner et al. 2018b), EEG microstate research (Van de Ville et al. 2010; Jia et al. 2021), and fMRI studies (Bullmore et al. 2009; Tagliazucchi et al. 2013).

In the second part of the results section, we evaluate the four metrics on EEG microstate sequences in wakefulness and non-REM (NREM) sleep, a dataset we have previously analyzed with other microstate analysis tools (Brodbeck et al. 2012; Wiemers et al. 2023). We analyze full microstate sequences as well as reduced sequences from which all duplicate labels have been removed (jump sequences). In an attempt to identify which time series features the different complexity metrics actually ‘see’, we use first-order Markov surrogate data to represent exactly the amount of information captured by the transition probability matrix, an approach that is often used to report microstate data (Lehmann et al. 2005).

Methods

Computational Model

In microstate research, K-means clustering is commonly performed for \(K=4\) or \(K=5\) clusters. Hence, we implemented the Potts model with Q states for \(Q=4,\,5\), as reviewed in Wu (1982), on a two-dimensional (2D) discrete lattice geometry. Other topologies could be employed, but the 2D model is well studied and the critical temperatures are known analytically (Wu 1982; Brown et al. 2022). Two different energy (Hamiltonian) functions have been presented for the Potts model, the standard model and the vector (clock) model. We chose the standard model, for which the type of phase transition is known (Wu 1982).

The Potts model uses discrete variables that can be visualized as 2D unit vectors (spins), uniformly distributed around the complex unit circle. Their energy difference is defined by the phase difference. Formally, the spin values are given by \(S = \{\exp \left( 2 \pi \textrm{i} q/Q \right) ,\, q=0,\ldots ,Q-1 \}\) with phase \(2\pi q/Q\). For our purpose, it is sufficient to store the integer values \(q=0,\ldots ,Q-1\) as the discrete model states. In the standard Potts model, a lattice site \((k,l)\) with phase \(\phi _{kl}\) has energy:

$$\begin{aligned} E_{kl} = - \sum _{m,n \in N_{kl}} J \delta (\phi _{kl},\phi _{mn}). \end{aligned}$$
(1)

using the Kronecker delta function (\(\delta\)) and nearest-neighbour coupling, i.e. the neighbours of spin \(\phi _{kl}\) at lattice site \((k,l)\) are \(N_{kl} = \{(k-1,l), (k+1,l), (k,l-1), (k,l+1)\}\). We exclusively considered ferromagnetic coupling (\(J = +1\)), which favours alignment of neighbouring spins, as the lowest energies are produced when their phase values are identical. In a neuronal context, this can be interpreted as neuronal ensembles that tend to align the phase of their voltage oscillations (Breakspear et al. 2010).

We simulated the Potts model on a square lattice of \(25 \times 25\) nodes. Model data were generated by Monte Carlo simulation with Metropolis sampling, i.e., a randomly chosen \(q \rightarrow q'\) transition was accepted with probability \(p = \min (1, \exp \left( -\frac{\Delta E}{T}\right) )\), which depended on the energy difference \(\Delta E\) before and after the proposed transition. In words, transitions that reduced the lattice energy were always accepted, whereas transitions increasing the total system energy were only accepted if they could jump across the energy barrier \(\Delta E\), which was tested stochastically by comparison with a uniformly distributed pseudo-random number. The critical temperature of the two-dimensional Potts model is \(T_c = \left( \log (1+\sqrt{Q}) \right) ^{-1}\) (Brown et al. 2022).
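
For readers who wish to experiment with the model, the following minimal sketch implements the Metropolis scheme just described. It is not the simulation code used for this article; periodic boundary conditions and the interpretation of one iteration as a full lattice sweep are our assumptions, and a pure Python loop of this kind is far slower than a compiled implementation:

```python
import numpy as np

def potts_metropolis(L=25, Q=4, T_rel=1.0, n_steps=1_000, warmup=100, seed=0):
    """Metropolis sampling of the standard 2D Potts model (J = +1).

    Returns the state of every lattice node over time, shape (n_steps, L, L).
    T_rel is the temperature relative to T_c = 1 / log(1 + sqrt(Q)).
    The article used n_steps=30_000 and warmup=2_500 on a 25 x 25 lattice;
    smaller defaults are chosen here because this sketch is uncompiled.
    """
    rng = np.random.default_rng(seed)
    T = T_rel / np.log(1.0 + np.sqrt(Q))
    spins = rng.integers(Q, size=(L, L))          # random initial configuration
    out = np.empty((n_steps, L, L), dtype=np.int8)

    def aligned(k, l, q):
        # Number of nearest neighbours aligned with state q; the local
        # energy is the negative of this count (periodic boundaries assumed).
        return ((spins[(k - 1) % L, l] == q) + (spins[(k + 1) % L, l] == q)
                + (spins[k, (l - 1) % L] == q) + (spins[k, (l + 1) % L] == q))

    for t in range(-warmup, n_steps):
        for _ in range(L * L):                    # one Metropolis sweep
            k, l = rng.integers(L), rng.integers(L)
            q_new = rng.integers(Q)
            dE = aligned(k, l, spins[k, l]) - aligned(k, l, q_new)
            if dE <= 0 or rng.random() < np.exp(-dE / T):
                spins[k, l] = q_new
        if t >= 0:
            out[t] = spins
    return out

# Example: the time series of a single random lattice node at criticality.
# data = potts_metropolis(T_rel=1.0); node_series = data[:, 12, 7]
```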

Individual simulations were run for \(t=30000\) iterations, preceded by a warm-up of 2500 iterations to allow relaxation from the initial random state. We simulated the system across a range of temperatures which will be written relative to the critical temperature \(T_c\). We used relative temperatures \(T/T_c\) of 0.2, 0.4, 0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0. For each temperature, we ran the model 50 times and selected a subset of \(n=25\) random lattice nodes for complexity analysis. We chose a length of 30,000 samples to match the length of our EEG segments (2 min of EEG acquired at 250 Hz).

The results for \(Q=4\) are presented in the main text, those for \(Q=5\) in the supplementary data.

Experimental Data and EEG Pre-Processing

We analyzed EEG recordings from n = 19 healthy subjects. The EEG dataset is a subset of the dataset analyzed in Brodbeck et al. (2012) and Wiemers et al. (2023) and only includes those subjects for whom sleep stage N3 data were available. Briefly, EEG data from simultaneous EEG-fMRI recordings were corrected for scanner and cardioballistic artefacts and downsampled to 250 Hz as described in Brodbeck et al. (2012). Sleep stages were scored manually, according to international standards. The data were band-pass filtered (1–30 Hz) with a digital Butterworth filter of order six.

Microstate Algorithm

Subject-wise microstates were computed with the modified K-means algorithm (Pascual-Marqui et al. 1995) implemented in Python 3 (von Wegner and Laufs 2018). Group-wise microstate maps were computed for each sleep stage with a full permutation procedure over the subject-wise maps (Koenig et al. 1999), repeated 20 times with random initial conditions. Group maps were defined as the result that maximized the global explained variance across the subjects. Microstate sequences were obtained by competitive back-fitting at each time step and ignoring map polarity. No further smoothing methods were applied.

Following two different approaches found in the literature, we evaluated two types of microstate sequences, (a) full sequences as obtained from back-fitting, and (b) sequences without duplicate labels, transforming a sequence like ACCCAADBBA into ACADBA, for example. This approach ignores microstate duration and is rooted in the idea that the transition between non-identical brain states, as measured by EEG microstates, conveys essential information about brain activity (Michel and Koenig 2018). The same construction is commonly used in Markov chain analysis, where these sequences are called jump sequences, a name that will also be used here (Gillespie 1992). Microstate maps and further properties of the EEG dataset used here are detailed in Wiemers et al. (2023).
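
The reduction to a jump sequence amounts to removing consecutive duplicate labels; the following minimal sketch (our own illustration) reproduces the example above:

```python
import numpy as np

def jump_sequence(labels):
    """Remove consecutive duplicate labels, e.g. ACCCAADBBA -> ACADBA."""
    x = np.asarray(list(labels))
    keep = np.r_[True, x[1:] != x[:-1]]   # keep the first sample and every label change
    return x[keep]

print("".join(jump_sequence("ACCCAADBBA")))   # prints: ACADBA
```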

Surrogate Data

Surrogate data for the Potts model and EEG microstate sequences were synthesized as first-order Markov processes, based on the empirical transition probability matrix of each time series as explained in von Wegner et al. (2016) and von Wegner and Laufs (2018). The Markov structure of microstate sequences was quantified with partial autoinformation coefficients as described in von Wegner (2018).
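
A minimal sketch of this surrogate construction is given below, assuming an integer-coded state sequence (0 to K-1, with every state occurring at least once); the packages cited above contain the reference implementations:

```python
import numpy as np

def markov_surrogate(seq, n_states, seed=None):
    """First-order Markov surrogate matched to the empirical
    transition probability matrix of the input sequence."""
    rng = np.random.default_rng(seed)
    seq = np.asarray(seq)
    # Empirical transition counts, row-normalized to probabilities.
    P = np.zeros((n_states, n_states))
    np.add.at(P, (seq[:-1], seq[1:]), 1.0)
    P /= P.sum(axis=1, keepdims=True)     # assumes every state occurs
    # Synthesize a new sequence with the same first-order statistics.
    out = np.empty(len(seq), dtype=int)
    out[0] = seq[0]
    for t in range(1, len(seq)):
        out[t] = rng.choice(n_states, p=P[out[t - 1]])
    return out
```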

Complexity Metrics

Entropy Rate and Excess Entropy

Entropy rate and excess entropy were computed as described in von Wegner (2018) and published in our 2017 Python microstate package (github repository) (von Wegner and Laufs 2018). Briefly, the frequency distribution of microstate sequence blocks (‘microstate words’) \({\textbf{X}}_{n}^{(k)}=\left( X_{n}, \ldots , X_{n+k-1} \right)\) for each block length \(k=1,\ldots ,6\) was estimated from the data, and the joint entropy \(H\left( {\textbf{X}}^{(k)} \right)\) of each distribution \(P\left( {\textbf{X}}^{(k)} \right)\) was computed. The parameters entropy rate (\(h_X\)) and excess entropy (\({\textbf{E}}\)) were obtained as the slope and y-axis intercept of a linear fit of \(H\left( {\textbf{X}}^{(k)} \right)\) vs. k, respectively. This approach is visualized in Fig. 2 for three different situations of the Potts model. This estimate of the entropy rate \(h_X\) is based on the following definition in terms of infinitely long observations:

$$\begin{aligned} h_X&= \lim _{k \rightarrow \infty } \frac{1}{k} H({\textbf{X}}_{n}^{(k)}), \end{aligned}$$
(2)

which, for stationary stochastic processes, is equivalent to the conditional entropy form:

$$\begin{aligned} h'_X = \lim _{k \rightarrow \infty } H(X_{n+k} \vert {\textbf{X}}_{n}^{(k)}). \end{aligned}$$
(3)

The second form (3) can be read in terms of time series predictability; \(h_X\) expresses the uncertainty (entropy) in predicting the next state of the sequence (\(X_{n+k}\)) when the previous k states (\({\textbf{X}}_{n}^{(k)}\)) are known.

In a similar approach, excess entropy can be expressed as the mutual information between the past \({\textbf{X}}_{\textrm{past}}=\left( \ldots ,X_{n-1},X_{n} \right)\) and the future \({\textbf{X}}_{\textrm{future}}=\left( X_{n+1},X_{n+2},\ldots \right)\) of the process:

$$\begin{aligned} {\textbf{E}} = I({\textbf{X}}_{\textrm{future}}; {\textbf{X}}_{\textrm{past}}). \end{aligned}$$
(4)

The concept can be made more intuitive by re-writing the mutual information between two random variables \(X, Y\) as \(I(X;Y)=H(X)-H(X \vert Y)\), i.e., the reduction in uncertainty about X by knowing Y. In this sense, excess entropy encodes to what extent predictions about the future improve, or entropy decreases, by including knowledge about the past of the process:

$$\begin{aligned} {\textbf{E}}&= H({\textbf{X}}_{\textrm{future}}) - H({\textbf{X}}_{\textrm{future}} \vert {\textbf{X}}_{\textrm{past}}). \end{aligned}$$
(5)
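
The finite-length estimation procedure described above (joint entropies for block lengths \(k=1,\ldots ,6\), followed by a linear fit) can be summarized in a few lines. The following is a minimal sketch, assuming an integer-coded microstate sequence; the published package (von Wegner and Laufs 2018) remains the reference implementation:

```python
from collections import Counter

import numpy as np

def entropy_rate_excess_entropy(seq, k_max=6):
    """Estimate entropy rate h_X (slope) and excess entropy E (intercept)
    from a linear fit of the joint (block) entropies H(X^(k)) vs. k."""
    seq = list(seq)
    H = np.empty(k_max)
    for k in range(1, k_max + 1):
        # Frequency distribution of all length-k blocks ('microstate words').
        counts = Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))
        p = np.array(list(counts.values()), dtype=float)
        p /= p.sum()
        H[k - 1] = -np.sum(p * np.log2(p))     # joint entropy in bits
    # Linear fit H(k) = h_X * k + E.
    h_X, E = np.polyfit(np.arange(1, k_max + 1), H, 1)
    return h_X, E
```
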
Fig. 2
figure 2

Entropy rate and excess entropy estimation. On the left (A–C), time series at random lattice nodes of the Potts model (\(Q=4\)) are illustrated as raster images; grey values correspond to the four Potts model states. Time runs from left to right, starting at the top row. Time series are shown at a sub-critical temperature (A 0.8 \(T_c\)), close to criticality (B \(T \sim T_c\)), and at a high temperature (C 3.0 \(T_c\)). D Kolmogorov complexity and statistical complexity of the time series shown in A–C as measured by the slope (entropy rate, \(h_X\)) and y-axis intercept (excess entropy, \({\textbf{E}}\)) of the linear fit to the joint entropies \(H\left( {\textbf{X}}^{(k)}\right)\), respectively

Lempel–Ziv Complexity (LZC)

The 1976 variant of the Lempel–Ziv algorithm (LZ-76, Lempel and Ziv 1976) was implemented and compiled in Cython 0.29.3 to produce fast code for Python 3.6.9. Our implementation is identical to the publicly available algorithm used in Tait et al. (2020).
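
For illustration, a pure Python sketch of the LZ-76 phrase counting scheme is given below; it follows the formulation popularized by Kaspar and Schuster and is equivalent in output to, but much slower than, a compiled implementation. The normalization to bits/sample shown here, \(c(n)\log _2 (n)/n\), is one common choice and is our assumption rather than necessarily the exact scaling used in the cited toolboxes:

```python
import numpy as np

def lz76(s):
    """Phrase count c(n) of the LZ-76 parsing of a symbol sequence s."""
    n = len(s)
    if n < 2:
        return n
    c, l, i, k, k_max = 1, 1, 0, 1, 1
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1                      # current candidate match continues
            if l + k > n:
                c += 1                  # count the final phrase
                break
        else:
            k_max = max(k, k_max)
            i += 1
            if i == l:                  # no earlier match: new phrase found
                c += 1
                l += k_max
                if l + 1 > n:
                    break
                i, k, k_max = 0, 1, 1
            else:
                k = 1
    return c

def lzc_bits_per_sample(s):
    """Normalize the phrase count to an entropy-rate-like value (bits/sample)."""
    n = len(s)
    return lz76(s) * np.log2(n) / n
```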

Hurst Exponents

Hurst exponents were calculated by detrended fluctuation analysis (DFA) as described in Van de Ville et al. (2010) and von Wegner et al. (2016), using 50 logarithmically spaced time scales over the range of 50–2500 samples (200 ms–10 s for EEG data). Hurst exponents were determined as the slope parameter of the linear fit to the detrended fluctuation function. DFA was applied to random walks that were constructed by partitioning the discrete variables (Potts model states, EEG microstate classes) into two subsets, and substituting these categorical discrete variables with the values \(\pm 1\), respectively (Van de Ville et al. 2010). There are three different (2, 2)-partitions for datasets with four states (Potts model for \(Q=4\), EEG microstates) and ten (2, 3)-partitions for five states (Potts model for \(Q=5\)). For each time series, Hurst exponents were computed for each partition and the average across all partitions was used for statistical analysis.
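
The following is a minimal order-1 DFA sketch under simplifying assumptions (non-overlapping windows, linear detrending within windows); it is meant to convey the procedure rather than to replicate the exact implementation used here:

```python
import numpy as np

def dfa_hurst(walk, scales=None):
    """Hurst exponent of a random-walk embedded sequence by order-1 DFA."""
    walk = np.asarray(walk, dtype=float)
    if scales is None:
        # 50 logarithmically spaced window sizes, 50-2500 samples
        # (duplicates after integer rounding are merged).
        scales = np.unique(np.logspace(np.log10(50), np.log10(2500), 50).astype(int))
    F = np.empty(len(scales))
    for j, s in enumerate(scales):
        n_win = len(walk) // s
        segs = walk[:n_win * s].reshape(n_win, s)
        t = np.arange(s)
        # RMS of residuals after linear detrending within each window.
        res = [seg - np.polyval(np.polyfit(t, seg, 1), t) for seg in segs]
        F[j] = np.mean([np.sqrt(np.mean(r ** 2)) for r in res])
    # Hurst exponent = slope of log F(s) vs. log s.
    H, _ = np.polyfit(np.log(scales), np.log(F), 1)
    return H

# Random-walk embedding of a 4-state sequence via one (2,2)-partition,
# e.g. states {0, 1} -> +1 and {2, 3} -> -1:
# walk = np.cumsum(np.where(np.isin(seq, [0, 1]), 1.0, -1.0))
# H = dfa_hurst(walk)
```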

Convergence

To analyze numerical differences between the entropy rate estimator \(h_X\) and Lempel–Ziv complexity values for different sequence lengths, we quantified their convergence rate towards theoretically expected entropy rates using Markovian test sequences. Details of this procedure are explained in the supplementary material.

Code Availability

Sample Potts model data and analysis scripts to reproduce a simplified version of Fig. 3 are available online (github repository).

Results

Potts Model

Fig. 3
figure 3

Complexity metrics for the Potts model (\(Q=4\), black or blue lines/circles) and first-order Markov surrogates (grey triangles) plotted against relative temperature \(T/T_c\) (critical temperature \(T_c\)). Error bars in all panels represent standard deviations. Error bars are omitted for surrogate statistics to simplify the visual presentation. Top: Entropy rate (\(h_X\), black) and excess entropy (\({\textbf{E}}\), blue) from joint entropy estimates. Center: Lempel–Ziv complexity (LZC, black) computed with the LZ-76 algorithm. Bottom: Hurst exponents (\(H_{\textrm{DFA}}\)) measured by detrended fluctuation analysis (DFA). Due to large variability at low temperatures, only the upper half of each error bar is shown. The legend applies to all three panels

Entropy Rate and Excess Entropy

Results for the Potts model (\(Q=4\)) are shown in Fig. 3. The top panel shows entropy rate (black) and excess entropy (blue) plotted against relative temperature (\(T/T_c\)). The critical temperature is defined by \(T/T_c=1\) on the x-axis. With increasing temperature, the entropy rate rises sigmoidally, and the steepest slope occurs at the critical point. At low temperatures (\(T/T_c=0.2\)), the time courses at most lattice sites are constant, or have very few state changes. As there is no randomness in the data, there is no prediction uncertainty and the entropy rate is close to zero. The maximum entropy rate for a Q-state process is \(\log _2(Q)\) bits/sample, i.e., 2 bits/sample for \(Q=4\). The Potts model tends towards the maximum entropy rate value, indicating that high temperature dynamics are largely random. Excess entropy peaks at the critical point and decays to lower, but non-zero, values away from the critical temperature (\(T<T_c\) and \(T>T_c\)). Although asymmetric around the critical temperature, the shape of the excess entropy curve reflects the concept of statistical complexity, whereas entropy rate reflects the Kolmogorov complexity concept. Above the critical temperature, the variability of both metrics is so small that the error bars (standard deviation) are not visible. For both metrics, the Markov surrogate datasets (grey triangles) show only minor deviations from the actual model data results. Error bars for surrogate data are equally small but are omitted from the graph to improve visibility. Analogous results were obtained for the five-state Potts model (\(Q=5\)) and are presented in the supplementary data, Fig. S1.

Lempel–Ziv Complexity (LZC)

LZC analysis of Potts model data is shown in the center panel of Fig. 3. LZC values, when scaled to bits/sample, are numerically very close to entropy rate estimates. Again, the steepest increase is found at the critical temperature and first-order Markov surrogates (grey triangles) are almost indistinguishable from Potts model data. Minor deviations are observed around the critical temperature (\(T/T_c=1.0,1.1,1.2\)) where surrogate data LZC is slightly larger than model data complexity. Error bars (standard deviation) are hardly visible due to extremely low LZC variability. Variability across the 1250 surrogate time series was also very low, and error bars are omitted. Results for \(Q=5\) are shown in Fig. S1.

Hurst Exponents

Hurst exponents computed by DFA are shown in Fig. 3 (bottom panel). For the Potts models, Hurst exponents peak at the critical temperature and decay towards \(H=0.5\), the theoretical value for random processes, at high temperatures. Hurst exponent variability in the low temperature range was large and only the upper half of the error bars is shown. At low temperatures, many lattice sites experienced very few state changes, which appear as large jumps in the random walk embedding. These jumps manifest as large fluctuations at long time scales, leading to steep fluctuation functions and large Hurst exponents. Results for \(Q=5\) are shown in Fig. S1.

EEG Microstate Sequences in Wake and Sleep

Fig. 4
figure 4

Complexity metrics for EEG microstate sequences (\(K=4\)) in wakefulness and NREM sleep. Results for EEG microstate sequences are shown in dark grey (label ‘EEG’), Markov process surrogates in light grey (label ‘surr’). A Entropy rate (\(h_X\)), B excess entropy (\({\textbf{E}}\)), C Lempel–Ziv complexity (LZC), D Hurst exponents from DFA (\(H_{\textrm{DFA}}\)). Differences between microstate sequence results were tested with one-way ANOVA (significance level \(\alpha =0.05\)), and post-hoc Tukey tests. Significant pairwise differences (\(p_{\textrm{FWE}}<0.05\)) are indicated by brackets

Entropy Rate and Excess Entropy

Figure 4 summarizes the results of entropy rate (A) and excess entropy (B) statistics for EEG microstate sequences in wake and sleep. While wakefulness and N1 sleep have a similar distribution of entropy rate (A) and excess entropy (B) values, entropy rate values decrease with deepening sleep (N2, N3), whereas excess entropy increases. Results for EEG data (dark grey) and first-order Markov surrogates (light grey) are similar. Two-way ANOVA did not reveal any significant differences between EEG and surrogate data for entropy rate (p = 0.828) or excess entropy (p = 0.696). One-way ANOVA and post-hoc analysis over microstate sequence statistics revealed significant differences (\(p_{FWE}<0.05\)) between all vigilance states except W vs. N1. This significance pattern was found for both metrics, entropy rate and excess entropy.

Lempel–Ziv Complexity (LZC)

Lempel–Ziv complexity analysis of EEG microstate sequences in wake and sleep is shown in Fig. 4C. The numerical LZC values were re-scaled to bits/sample and are numerically very close to the entropy rate estimates in A. LZC values for Markov surrogates (light grey) were statistically not different from original data LZC values (two-way ANOVA, p = 0.881). When microstate sequence LZC values were analyzed separately, post-hoc comparisons showed significant differences (\(p_{FWE}<0.05\)) between all vigilance states except for the W-N1 comparison.

Hurst Exponents

Hurst exponent analysis of EEG microstate sequences in wake and sleep is illustrated in Fig. 4D. The difference between Hurst exponents from microstate sequences and surrogate sequences was significant (two-way ANOVA, \(p=0.001\)). One-way ANOVA of Hurst exponents for microstate sequences was significant (\(p<0.05\)), and pairwise post-hoc analysis revealed significant differences between all vigilance states except for the W-N2 contrast. The mean Hurst exponent in wakefulness was slightly larger than in N1 (difference: \(-\)0.034), while N2 and N3 showed larger Hurst exponents than wakefulness although the differences between the means were small (N2-W: 0.014, N3-W: 0.062).

Fig. 5
figure 5

Partial autoinformation coefficients of EEG microstate sequences (‘EEG’, dark grey) and corresponding Markov surrogates (‘surr’, light grey) in wakefulness (W) and NREM sleep stages N1–N3. First-order Markov process properties are encoded in the coefficients for time lags 0 and 1. Insets focus on the higher-order coefficients (time lag 3–5 samples)

(Non-)Markov Structure of Microstate Sequences

To explain why complexity values for EEG data and first-order Markov surrogates were almost identical in Fig. 4, we analyzed the (non-)Markovian structure of EEG microstate sequences quantitatively. Figure 5 shows the partial autoinformation (PAI) coefficients of EEG microstate sequences in wakefulness and NREM sleep. By construction of the surrogate data, PAI coefficients at time lags 0 and 1 were identical between data and surrogates. Higher-order coefficients (lags \(\ge\) 2) had very small magnitudes and are hardly visible in the main panels. Insets in Fig. 5 are re-scaled to make these non-Markovian components visible. Although differences between microstate and surrogate sequences exist in all states (W-N3), the relative contribution of higher-order coefficients to the overall sequence structure is very small.

Microstate Jump Sequences

Microstate jump sequences, i.e. sequences from which all duplicate labels have been removed, are shown in Fig. 6. Compared to full microstate sequences, jump sequences displayed more variable entropy rates and excess entropies in wakefulness (Fig. 6A, B). The entropy rate distribution was skewed towards lower values and excess entropy towards higher values. NREM sleep stage data had lower variability and we found statistically significant differences between W-N1 and W-N2. In contrast to the full sequences, mean entropy rates were slightly higher in sleep but the absolute differences were very small (N1-W: 0.0493 bits/sample, N2-W: 0.0499 bits/sample). Mean excess entropy values in light sleep were slightly lower than in wakefulness (W-N1: 0.0653 bits, W-N2: 0.0613 bits), but higher in N3 compared to light sleep (N3–N1: 0.0601 bits, N3–N2: 0.0561 bits). These differences should be interpreted with caution due to the highly skewed distributions in wakefulness, and due to biases of the estimation algorithms for short sequences, as will be explained below.

Fig. 6
figure 6

Complexity measures for EEG microstate jump sequences (\(K=4\)) in wakefulness and NREM sleep. Results for EEG microstate jump sequences are shown in dark grey (label ‘EEG’), Markov process surrogates in light grey (label ‘surr’). A Entropy rate (\(h_X\)), B excess entropy (\({\textbf{E}}\)), C Lempel–Ziv complexity (LZC), D Hurst exponents from DFA (\(H_{\textrm{DFA}}\)). Differences between microstate sequence results were tested with one-way ANOVA (significance level \(\alpha =0.05\)), and post-hoc Tukey tests. Significant pairwise differences (\(p_{\textrm{FWE}}<0.05\)) are indicated by brackets

LZC values for microstate jump sequences also showed a high variability in wakefulness, with a distribution skewed towards lower values. Significant differences were found between wakefulness and all three NREM sleep stages, with slightly larger mean values in sleep (mean differences N1-W: 0.053 bits/sample, N2-W: 0.0623 bits/sample, N3-W: 0.0568 bits/sample). In contrast to full microstate sequences, entropy rate and LZC values differed numerically, with systematically larger LZC values.

Convergence of Entropy Rate and LZC

To investigate the difference between entropy rates and LZC values further, we analyzed the convergence rate of joint entropy (\(h_X\)) and LZC entropy rate estimators for Markov processes with a known entropy rate. To this end, we computed the cumulative transition probability matrix from all wakefulness microstate sequences and, based on this matrix, synthesized 50 first-order Markov surrogate sequences of length \(n=10^5\) samples. We computed \(h_X\) and LZC for sequences of different lengths, and Fig. S2 shows both metrics as a function of sequence length (mean and standard deviation). LZC converged towards the theoretical entropy rate (\(h_{\textrm{theo}}=1.287\) bits/sample) from above, while \(h_X\) approached \(h_{\textrm{theo}}\) from below. The difference between \(h_X\) and LZC for jump sequences in wakefulness was 0.073 bits/sample (\(h_X\): 1.501 bits/sample, LZC: 1.574 bits/sample), and the lengths of jump sequences were in the range of 6923–10,287 samples. The observed difference between these two complexity metrics for microstate jump sequences (\(h_X\) in Fig. 6A and LZC in Fig. 6C) matched the difference found for Markov processes of the same length and with known entropy rate (Fig. S2, grey area). The mean of \(h_X\) for full microstate sequences in wakefulness was 1.245 bits/sample (Fig. 4A), and the mean LZC was 1.278 bits/sample (Fig. 4C). The convergence plot (Fig. S2) at n = 30,000 (the length of the full sequences) shows a difference of LZC \(-\,h_X=0.03\) bits/sample, again in close agreement with the difference observed for EEG microstate sequences.

The effects of deleting repeated microstate labels on Hurst exponents are quantified in Fig. 6D. Hurst exponents were estimated for microstate jump sequences and their first-order Markov surrogates. Both microstate sequences and Markov surrogates had Hurst exponents distributed around \(H=0.5\), the expected value for processes without long-range correlations. Pairwise differences were not statistically significant. Thus, removal of duplicate symbols erased the long-range correlations observed in full microstate sequences (Fig. 4D).

Discussion

The main results of this report can be summarized as follows:

  • Kolmogorov complexity and statistical complexity are two fundamentally different concepts of complexity and the existing EEG microstate studies have focused on the former concept (randomness).

  • We evaluated excess entropy for microstate sequence analysis and found that it (i) correctly identifies the critical point in the Potts model and (ii) increases with deepening NREM sleep stage in EEG microstate sequences.

  • Re-scaled Lempel–Ziv complexity and entropy rate are equivalent metrics of microstate complexity if the LZ-76 implementation is used and if sequences have sufficient length. Both metrics represent the Kolmogorov complexity concept.

  • Entropy-related measures including LZC are determined by first-order dependencies of EEG microstate sequences, whereas Hurst exponents integrate information from longer time scales. Limitations of Hurst exponent analysis were observed for the low-temperature Potts model.

  • Microstate jump sequences display more randomness than full sequences and do not preserve long-range correlations.

The Potts Model and EEG Microstate Sequences

We decided to evaluate the selected complexity metrics on Potts model data before studying EEG data. The Potts model is a widely studied system in statistical physics with a discrete state space and a well-defined phase transition (Wu 1982). The rationale for this approach was that it allowed us to gauge the behaviour of complexity metrics at defined system states, whereas the ground truth complexity of a brain state measured in an experiment is unknown. We used the Potts model variants with \(Q=4\) and \(Q=5\) discrete states for two reasons. First, most EEG microstate analyses use either four or five microstate classes (Michel and Koenig 2018). The second reason was that the 2D Potts model undergoes a second-order phase transition for \(Q=4\) and a first-order phase transition for \(Q=5\). Although there is evidence that resting-state brain activity shares phase transition-like features with physical models (Chialvo 2010; Tagliazucchi et al. 2012), the type of phase transition occurring in the brain is unknown in general. Overall, all tested metrics showed the same characteristics for \(Q=4,5\), except for the anticipated differences in absolute entropy values, which depend on Q. We therefore conclude that the tested metrics do not depend on the type of phase transition.

Entropy Rate and Excess Entropy

We have found entropy rate and excess entropy to reflect the concepts of Kolmogorov complexity and statistical complexity, respectively. Their difference can be expressed in terms of forecasting (Grassberger 1986). Kolmogorov complexity describes how hard it is to predict a signal correctly, and statistical complexity measures how hard it is to find the best predictor. The difference between these two approaches becomes evident when applied to purely random patterns. They are impossible to predict precisely (high entropy rate and LZC), but the best predictor can be found easily. A microstate sequence without any temporal order, for example, is hard to predict but the best estimator is a simple bet on the relative probabilities of occurrence of each microstate, e.g. 1/4 each if uniformly distributed. Statistically complex patterns, however, can be easier to predict, but the best predictor can be impossible to find as it might need to capture the full non-linear dynamics of the underlying system.

One of the aims of this article was to explore excess entropy as a potentially useful addition to EEG microstate analysis. This raises the question whether excess entropy can contribute anything to the information already gained from other metrics. The answer is that excess entropy reflects the concept of statistical complexity which is not captured by either entropy rate or Lempel–Ziv complexity, and that its value cannot be predicted from these other metrics. The latter point is illustrated by the Potts model results which show that excess entropy and entropy rate can covary either positively or negatively. For sub-critical temperatures (\(T/T_c<1\) in Fig. 3), both metrics increase with temperature. At low temperatures, future states are not difficult to predict as there are few state switches and conditioning on the past hardly improves the prediction. Approaching the critical temperature, future states become increasingly difficult to predict (higher entropy rate) but at the same time knowledge of the past significantly improves the prediction (higher excess entropy). Improved predictability is explained by expanding autocorrelations near the phase transition which can be exploited to make predictions less uncertain. In information-theoretic terms, we find more shared information between the past and the future states of the sequence near the critical point. Above the critical temperature, randomness (entropy rate) increases further but dependencies on past states fade as sequences approach the uncorrelated noise level.

Another metric, sample entropy, which is related to entropy rate, was evaluated in Murphy et al. (2020), where an almost random pattern of microstates in psychosis patients was observed. It must be noted, however, that Murphy et al. (2020) analyzed microstate jump sequences. Our analyses show that these sequences, due to the removal of duplicate labels and microstate duration statistics, display higher levels of randomness and cannot capture long-range dependent sequence features, as further discussed below.

Lempel–Ziv Complexity

A collection of data compression algorithms developed in the 1970s is associated with the names of Lempel and Ziv, notably the LZ-76, LZ-77, and LZ-78 algorithms (Lempel and Ziv 1976, 1977; Ziv and Lempel 1978). Despite their similar names, only the LZ-76 algorithm has a close relationship to theoretical quantities like the entropy rate, whereas later versions were optimized for practical data compression tasks.

The main results were derived in Ziv (1978), where it was shown that LZC (LZ-76 algorithm) provides an exact estimate of the entropy rate if the underlying stochastic process is stationary. Stationarity cannot be tacitly assumed for EEG signals in general (von Wegner et al. 2017), but the theoretical link provided a starting point for the analyses in this article. A similar hypothesis was put to the test for neural spike train data in Amigó et al. (2004), where the earlier LZ-76 algorithm proved to be numerically equivalent to the entropy rate of binarized spike trains, whereas LZ-78 showed a marked bias relative to the true entropy rate.

In EEG microstate research, Lempel–Ziv complexity (LZC) was first used by Tait et al. (2020), and subsequently by Artoni et al. (2022). The former article uses the authors' LZ-76 implementation (plus-microstate toolbox), whereas the latter used the dictionary-based LZMA2 algorithm from the popular 7zip software (7zip). In analogy to Amigó et al. (2004), we found that the normalized LZC output of the LZ-76 algorithm was in excellent numerical agreement with the entropy rates computed from joint probability distributions, as is evident from a comparison of the top and center panels of Fig. 3.

On the other hand, we also observed that this numerical agreement depends on sample size: discrepancies are more pronounced for shorter sequences, as explained by the convergence behaviour of the two metrics.

Entropy rate and LZC have so far been discussed separately in the microstate literature (von Wegner et al. 2018a; Jia et al. 2021; Tait et al. 2020; Artoni et al. 2022; Wiemers et al. 2023). The equivalence of both metrics means that LZC values such as those given in Tait et al. (2020) and entropy values we published in different contexts (von Wegner and Laufs 2018; Jia et al. 2021; Wiemers et al. 2023) can be re-scaled and compared directly. Unfortunately, this is not possible for LZC values computed by more recent versions of the LZ algorithm, for example the complexity values given in Artoni et al. (2022). Complexity metrics computed from the LZ-77 and LZ-78 algorithms are likely to yield curves similar in shape to the entropy rate and LZ-76 curves in Fig. 3, but the LZ-76 implementation has the advantage of being a direct numerical estimator of the theoretically meaningful entropy rate.

Hurst Exponents

Fractal dimension and Hurst exponent analysis are based on a different theoretical framework than entropy-based methods, and they can be found as complexity measures in functional MRI studies (Bullmore et al. 2009; Tagliazucchi et al. 2013; Dong et al. 2018), EEG studies (Linkenkaer-Hansen et al. 2001; Raghavendra et al. 2009; Sabeti et al. 2009; Holloway et al. 2014), but also outside brain research (Raubitzek and Neubauer 2021).

Applied to Potts model data, Hurst exponents represent statistical complexity in a similar way to excess entropy. In contrast to the entropy-related metrics including LZC, Markov surrogates had markedly lower Hurst exponents than the original data. This could be observed for Potts model data (Fig. 3) as well as for EEG microstate sequences (Fig. 4). This demonstrates that the inclusion of larger time scales, up to 10 s in the DFA parameters, identified time series properties that were not visible to the entropy-based methods. This observation may appear obvious because entropy rate/excess entropy calculations were based on a finite history length of only \(k=6\) samples. Lempel–Ziv analysis, however, does not contain an explicit time window; it progressively scans the whole time series and keeps track of previously seen patterns. Yet, LZC values were identical to entropy rate values, which shows that there was no additional information on larger time scales through the Lempel–Ziv lens. In contrast to Lempel–Ziv analysis, DFA does not search for actual repetitions of exact patterns but accepts anything that contributes variance at a given time scale. Although DFA quantifies self-similarity in signals that exhibit this property, it is not a generic detector of self-similarity in the strict sense. From the Hurst exponents observed in Figs. 3 and 4, we conclude that Potts model data and EEG microstate sequences in deeper sleep stages display slow fluctuations that are not explained by exact repetitions of signal segments.

Potts model data analysis also demonstrated an important pitfall of DFA and Hurst exponent analysis. The large variability of Hurst exponents observed for the cold Potts model in Fig. 3 is caused by time series with a very small number of state changes. These cause isolated step-like jumps when the signal is transformed into a random walk according to the technique proposed for microstate sequences (Van de Ville et al. 2010). Jumps in turn lead to large fluctuations at long time scales and empirically result in large Hurst exponents \(H>1.5\), although the signal is constant most of the time. When zero state changes occur, however, the expected value of \(H=0.5\) was obtained. This leads to the undesired result that near-constant time series differing only in a small fraction of values yield massively different Hurst exponents. Again, this is related to the DFA algorithm being a variance analyzer. Non-stationarities such as steps in the signal add variance over a range of time scales, which could be misinterpreted as a mono-fractal signature from the value of the Hurst exponent alone. It should also be noted that we have not encountered near-constant time courses over hundreds or thousands of samples for EEG microstate sequences, and we believe that the problem discussed for the cold Potts model is not an issue for EEG microstate analysis if the sequence length is at least a few hundred samples.

We conclude that the Hurst exponent of a microstate sequence can (i) be interpreted as a marker of statistical rather than algorithmic complexity, and (ii) provide a perspective on both short and long time scales, whereas the entropy-related measures we evaluated reflect short-range dependencies.

Microstate Sequence Complexity in Wake and Sleep

In this study, we have applied excess entropy, LZC, and DFA as novel methods to characterize the complexity of microstate sequences from wakefulness and NREM sleep. Entropy rate analysis was presented in Wiemers et al. (2023) but is shown again here with additional Markov surrogate comparisons. Our main findings are that deepening sleep stages were accompanied by decreasing entropy rate and LZC, increasing excess entropy, and increasing Hurst exponents.

Microstate jump sequences did not show long-range correlations, and differences between sleep stages were overall less pronounced.

The reduced entropy rate and LZC values are probably related to the progressive slowing of EEG rhythms in sleep. Longer microstate durations (Wiemers et al. 2023) improve the predictability of microstate sequences in sleep stages N2 and N3 and reduce their apparent randomness. This result is similar to the findings of Tait et al., who found EEG slowing and reduced LZC values in Alzheimer disease patients, although they corrected their microstate analysis for background EEG slowing effects (Tait et al. 2020; Dauwels et al. 2011).

From the discussion of the Potts model it is clear that the observation of entropy rate changes cannot predict changes in excess entropy. We found increased excess entropy values in N2 and N3, which corresponds to the Potts model situation above the critical temperature. Within this analogy, the transition to deeper sleep stages corresponds to a shift from higher temperatures towards the critical point. This is intuitively correct as deep sleep is accompanied by EEG signals with longer, delta rhythm-related autocorrelations which translate into increased shared information between past and future microstates. This explanation is further substantiated by our recent finding that microstate oscillations in the delta frequency range occur in N3 (Wiemers et al. 2023), and by the larger Hurst exponents in N2 and N3 (Fig. 4D). The straightforward conclusion from this analogy would be to classify deeper sleep stages as being closer to a critical system state. This, however, at least partially contradicts earlier studies using EEG and other imaging modalities.

Weiss et al. found larger Hurst exponents in deep sleep stages using the DFA-related R/S statistic on raw EEG data from humans (Weiss et al. 2009). In rats, local field potential measurements revealed a similar trend, i.e. a lower complexity (LZC) in NREM sleep vs. wakefulness (Abásolo et al. 2014). Another measure called Omega complexity, based on the eigenvalue spectrum of the spatial principal component analysis of an EEG data set, is known to decrease in deeper NREM sleep stages as well (Wackermann 1996). Not all analyses point in the same direction, however. Kantelhardt et al. performed an extensive analysis of Hurst exponents across sleep stages and frequency bands and found different trends for different frequency bands. In the alpha band, the main contributor to microstate topographies (Milz et al. 2017), the authors found lower Hurst exponents in deeper sleep, contradicting our results for microstate sequences. The opposite trend, however, was reported for other frequency bands (Kantelhardt et al. 2015). Different outcomes might also be due to the pre-processing of the input signals. Frequency band analyses often use the power (envelope) of narrow frequency band oscillations and not the oscillatory signal itself (Linkenkaer-Hansen et al. 2001; Kantelhardt et al. 2015), whereas microstate analysis starts with broad-band EEG oscillations and does not use the envelope signal. When a different recording modality is used, the results can change once again. In the past, we observed smaller Hurst exponents during sleep in regional BOLD (blood oxygen level dependent) signals from functional MRI data and interpreted those as a departure from criticality, concluding that wakefulness was the state closest to criticality (Tagliazucchi et al. 2013). The methodological differences and the nature of the physiological signals used might explain why electrophysiological and imaging data are unable to give a unique answer to the question of how close the brain is to a critical state in a defined condition.

Another difference between the Potts model and EEG data is that the Potts model is controlled by a single parameter (temperature) that, when varied continuously, controls the extent of autocorrelations in the data. Sleep stage transitions, on the other hand, involve switching between different system states in which different frequency-generating circuits are active, namely alpha generators in wakefulness and delta/slow wave generators in N3. Furthermore, sleep contains isolated events never observed in wakefulness (vertex sharp waves, sleep spindles, K-complexes) (Adamantidis et al. 2019). From electrophysiological studies we know that different neuron populations partake in these patterns and that their voltage responses are fundamentally different, e.g., spiking vs. bursting (Llinás and Steriade 2006). Such heterogeneity is not modelled by the Potts system and demonstrates limitations of statistical physics models of brain activity. A shift from alpha to delta frequency band activity, with corresponding repercussions on microstate patterns as observed in Wiemers et al. (2023), can alone probably explain a reduced entropy rate, increased excess entropy and larger Hurst exponents due to larger variance contributions at long time scales. The interpretations with regard to Kolmogorov and statistical complexity are unequivocal; yet, in the absence of an experimentally controllable parameter, we would hesitate to strictly interpret this as the same system moving closer to a phase transition. This might be possible in modeling studies, though, which indicate that phase transition-like mechanisms might indeed play a role in wake-sleep and anesthesia-induced transitions (Steyn-Ross et al. 2004, 2005). Further discussions of the complexity of brain activity during sleep can be found in Olbrich et al. (2011) and references therein.

Another pre-processing strategy for microstate sequences is the deletion of duplicate symbols, a process that only retains the jumps between non-identical states. This approach is rooted in Markov chain theory, where the result is called the jump process or jump chain (Gillespie 1992). In the microstate context, this has been termed the transitioning sequence (Tait et al. 2020), compressed sequence (Murphy et al. 2020), and no-permanence sequence (Artoni et al. 2022, 2023). We prefer the well-established term jump process to emphasize the link to Markov chain theory and the large body of literature that can be found under that name. The results for jump processes derived from EEG microstate sequences in wakefulness and sleep were markedly different from those for the full sequences. Entropy rate, excess entropy and LZC had a large variability in wakefulness, and the skewed distributions observed in Fig. 6 are likely responsible for the significant differences between wakefulness and sleep stages. The absolute differences were minute compared to full microstate sequences and we would interpret them with caution, given the skewed wakefulness distributions. Another reason to treat these results with reserve is that jump sequences become shorter with deepening sleep stage (W: 6923–10,286, N1: 6129–10,264, N2: 4903–7474, N3: 2248–6513 samples). This effect is due to longer microstate durations in sleep (Wiemers et al. 2023). As shown in Fig. S2, entropy rate estimators have a significant bias for shorter sequences, and comparisons between sleep stages are likely to yield false positive results due to estimator biases. This was not a problem for full sequences, which all had 30,000 samples. Nevertheless, the overall differences between full and jump sequences included higher entropy rates/LZC values and lower excess entropies for jump sequences. This effect can be readily explained. Removal of duplicate microstate labels from a sequence increases its apparent randomness. The transition probability matrices of full sequences have their largest entries along the diagonal, i.e., A-to-A transitions are far more likely than any other A-to-X transition, and similarly for the other states. Concrete values were given in von Wegner et al. (2017), for instance. Thus, any occurrence of label A in a full sequence increases the predictability over the next few time steps, and this reduces the entropy rate/LZC values compared to the jump sequences in which repetitions never occur.

Another feature of jump sequences is that the information about microstate lifetimes is lost. Yet, non-exponential lifetime distributions are a well-understood mechanism underlying long-range correlations (Clegg and Dodson 2005). It is therefore not surprising to find Hurst exponents indistinguishable from 0.5 for all jump sequences and their Markov surrogates (Fig. 6D). Analogous results were already reported in the original study investigating long-range dependence (LRD) in random-walk embedded microstate sequences (Van de Ville et al. 2010). These results demonstrate that researchers interested in long-range phenomena must focus on full microstate sequences, as these features are removed when passing to the jump sequence representation.

Practical Considerations and Limitations

The results presented above can be rephrased as a list of practical suggestions and considerations for complexity analyses of EEG microstate sequences.

  (i) The LZ-76 algorithm is fast and computationally less expensive than joint entropy estimation and is therefore an appealing practical alternative to computing entropy rates. The approach has potential applications for statistical comparisons of EEG microstate sequences with non-trivial surrogate data. By trivial, we refer to surrogates devoid of temporal correlations, such as those obtained by simple shuffling of microstate sequences. The null hypothesis of zero autocorrelations is tested with Lehmann's syntax analysis (Lehmann et al. 2005). A minimum layer of complexity is added by modeling first-order autocorrelations between states at time t and \(t+1\). These are captured by first-order Markov surrogates as used in von Wegner et al. (2016, 2017). The theoretically plausible and now empirically confirmed (near-)equivalence between entropy rates and LZC values for microstate sequences allows the prediction of LZ complexity under a Markov null hypothesis. Practically, the analytical entropy rate under a Markov null hypothesis can be calculated from the empirical microstate transition matrix, thereby avoiding the cost of synthesizing large numbers of surrogate sequences (see the sketch after this list).

  (ii) A practical argument in favor of the LZ-76 algorithm is that there are no free parameters that need to be set by the researcher and that would affect the numerical complexity values returned by the algorithm. The free parameters of the LZMA2 algorithm (compression level, dictionary size, number of fast bytes, filter and match-finder options, Artoni et al. 2022) are useful to achieve a maximum compression level for different data types, but they have no evident theoretical meaning, and their quantitative relationship with the actual entropy rate remains to be established. Nevertheless, LZMA is a practical method of measuring microstate sequence compressibility.

  (iii) The joint entropy approach to entropy rate estimation has the advantages that (a) it returns excess entropy ‘for free’ from the same linear fit that is used to obtain the entropy rate, and (b) the joint entropy values for microstate words of a given length might be of interest for certain research questions. For instance, a recent article on microstate sequence syntax analysis is based on word entropies (Artoni et al. 2023).

  (iv) All approaches are limited by finite sample size effects. In particular, the equivalence between entropy rate and LZC becomes numerically imprecise for short microstate sequences.

  (v) Hurst exponent analysis can become unreliable when very few state changes occur. We have encountered this case for model data only.
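
The analytical Markov entropy rate mentioned in point (i) follows from the standard formula \(h = -\sum _i \pi _i \sum _j p_{ij} \log _2 p_{ij}\), where \(\pi\) is the stationary distribution of the transition matrix. A minimal sketch, assuming a row-stochastic empirical transition matrix P (e.g., estimated as in the surrogate data example above):

```python
import numpy as np

def markov_entropy_rate(P):
    """Analytical entropy rate (bits/sample) of a stationary first-order
    Markov chain with transition probability matrix P."""
    P = np.asarray(P, dtype=float)
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi /= pi.sum()
    # h = -sum_i pi_i sum_j P_ij log2 P_ij, with the convention 0 log 0 = 0.
    logP = np.zeros_like(P)
    mask = P > 0
    logP[mask] = np.log2(P[mask])
    return -np.sum(pi[:, None] * P * logP)
```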

Conclusion

The concepts of Kolmogorov and statistical complexity are useful to structure future discussions about EEG microstate sequence complexity. The results of this study, namely the evaluation of excess entropy, the demonstrated equivalence between entropy rate and LZC, and the systematic comparison with Hurst exponent analysis are meant to add to the theoretical and practical framework of this research area.