A continuous mapping of sleep states through association of EEG with a mesoscale cortical model
Abstract
Here we show that a mathematical model of the human sleep cycle can be used to obtain a detailed description of electroencephalogram (EEG) sleep stages, and we discuss how this analysis may aid in the prediction and prevention of seizures during sleep. The association between EEG data and the cortical model is found via locally linear embedding (LLE), a method of dimensionality reduction. We first show that LLE can distinguish between traditional sleep stages when applied to EEG data. It reliably separates REM and non-REM sleep and maps the EEG data to a low-dimensional output space where the sleep state changes smoothly over time. We also incorporate the concept of strongly connected components and use this as a method of automatic outlier rejection for EEG data. Then, by using LLE on a hybrid data set containing both sleep EEG and signals generated from the mesoscale cortical model, we quantify the relationship between the data and the mathematical model. This enables us to take any sample of sleep EEG data and associate it with a position among the continuous range of sleep states provided by the model; we can thus infer a trajectory of states as the subject sleeps. Lastly, we show that this method gives consistent results for various subjects over a full night of sleep and can be done in real time.
Keywords
Sleep, Sleep staging, Sleep scoring, Seizure, Locally linear embedding, Electroencephalogram, Cortical model, Mesoscale, Mean-field
1 Introduction
The standard method of sleep scoring involves categorization of electroencephalogram (EEG) data into five separate stages (Niedermeyer and da Silva 2005). However, the discrete nature of these stages limits their utility as analytical and predictive tools. For example, in a study of human epilepsy, it may be observed that a seizure occurred during stage 2 sleep. This prompts further questions: Was the subject descending to deeper stages of sleep or arising from them? How quickly was the subject moving through each stage? Was a transition imminent when the seizure occurred?
The use of a mathematical model of the human sleep cycle may allow us to answer such questions by providing a continuous spectrum of sleep states, ranging from REM to the deepest slow-wave sleep. If the model can be directly associated with human sleep EEG data, it will be possible to track the subject’s state to identify the stage as well as changes in sleep depth and proximity to transitions. Ideally, this would be done in real time, where the state is continuously determined as the subject sleeps. The process must be consistent over various subjects and robust to nonstandard sleep cycles and periods of waking.
Here we utilize a technique called locally linear embedding (LLE) to make this connection between a model of the human sleep cycle and EEG data. First, we present a model of the human cortex with subcortical inputs represented by added driven noise, and we describe the associated mathematical representation of the sleep cycle (Section 2). We then introduce the technique of locally linear embedding (Section 3) and show that it provides the ability to distinguish between sleep stages when applied to EEG data (Section 4). These results demonstrate reliable separation between REM and NREM sleep data and provide a smooth temporal progression through the various stages of sleep. We also present the concept of strongly connected components as a method of outlier rejection for EEG data (Section 3.2) and introduce a method for automatic selection of LLE parameters (Section 4.3). Then, by performing LLE on a hybrid data set containing both sleep EEG and signals generated from the mathematical model, we are able to integrate the EEG and the model (Section 5). This allows us to take any sample of sleep EEG data and determine its position within the continuous range of sleep states provided by the model. We show that this method provides consistent results for various subjects over a full night of sleep, and it could be done online as the subject sleeps.
2 Mean-field cortical model
2.1 Background and mathematics
Mean-field models of the cortex are well suited to the study of brain states described by EEG signals, including sleep. The variables in these models, representing quantities that are averaged over the millimeter scale, are comparable to the mesoscale measurements of EEG electrodes. More specifically, we choose a cortical model developed most recently in Liley et al. (2002) and Steyn-Ross et al. (1999, 2003). In addition to sleep, it has been used to model epileptic seizures (Kramer et al. 2005), anesthesia (Steyn-Ross et al. 2004; Bojak and Liley 2005), and the transition to seizure due to application of anesthetic agents (Liley and Bojak 2005).
Dimensionless variables and parameters of the SPDE cortical model
Symbol  Description  Typical value 

\(\tilde{h}_{e,i}\)  Spatially averaged soma potential for neuron populations  – 
\(\tilde{I}_{ee,ei}\)  Postsynaptic activation due to excitatory inputs  – 
\(\tilde{I}_{ie,ii}\)  Postsynaptic activation due to inhibitory inputs  – 
\(\tilde{\phi}_{e,i}\)  Long-range (corticocortical) input to e and i populations  – 
\(\tilde{t}\)  Time (dimensionless)  – 
\(\tilde{x}\)  Space (dimensionless)  – 
Γ_{e,i}  Influence of synaptic input on mean soma potential  4.6875 × 10^{ − 4}, 0.0105 
\(h^0_{e,i}\)  Reversal potential  0, 1.0938 
T _{e,i}  Neurotransmitter rate constant  12.0, 3.6 
λ _{e,i}  Inverse length scale for corticocortical connections  11.2, 11.2 
P _{ee,ei}  Subcortical input from excitatory population  25.0, 25.0 
P _{ie,ii}  Subcortical input from inhibitory population  25.0, 25.0 
\(N^{\alpha}_{e,i}\)  Number of distant (corticocortical) connections from excitatory populations to e and i populations  3710, 3710 
\(N^{\beta}_{e,i}\)  Number of local synaptic connections from e and i populations  410, 800 
\(\tilde{g}_{e,i}\)  Slope at inflection point of sigmoid function \(\tilde{S}_e\)  − 29.021, − 19.347 
\(\tilde{\theta}_{e,i}\)  Inflection point for sigmoid function \(\tilde{S}_e\)  0.91406, 0.91406 
For completeness, we will include the full model in our simulations; however, it should be noted that a reduced version would suffice in this case. For example, we will utilize only the temporal evolution of variables, so it would be possible to convert the model to a system of ODEs by removing the spatial derivatives from Eqs. (7) and (8). In addition, the subdivision of local excitatory inputs, represented by \(N^{\beta}_e \tilde{S}_e[\tilde{h}_e]\) in Eqs. (3) and (4), is unnecessary. Making these changes would perhaps reduce the computation time for numerical solutions to the model, but we would not expect them to affect the results.
For the purpose of modeling sleep, we will focus on the parameters L and \(\Delta h_e^{rest}\) and the variable \(\tilde{h}_e\). The parameters represent the actions of neuromodulators adenosine and acetylcholine (ACh) that aid in the regulation of the human sleep cycle. Adenosine reflects the activity of the homeostatic drive to sleep, which is modulated by various somnogens. The ACh input into the cortex is a measure of the activity of the various brain stem controllers of sleep. Note that we have not specifically modeled the complex intrinsic interactions between the various brain stem nuclei. In this paper, we are primarily concerned with the interaction of their neuromodulator output with the cerebral cortex and thus model their effects only as extrinsic alterations in ACh.
In general, adenosine acts to reduce the resting potential of excitatory cells, thus making them less likely to fire; ACh does the opposite by raising the resting potential. These changes are represented in the model by \(\Delta h_e^{rest}\), which adds directly to the resting potential of the excitatory population (disguised as a “1” in the dimensionless equations). In addition, ACh decreases the amplitude of the excitatory postsynaptic potential, effectively reducing the synaptic gain. In the model, this corresponds to a reduction in the effect of synaptic currents \(\tilde{I}_{ee}\) and \(\tilde{I}_{ei}\); therefore, the parameter L is multiplied by these quantities to simulate a change in synaptic gain. Lastly, as was done in Steyn-Ross et al. (2005), we take the mean excitatory soma potential \(\tilde{h}_e\) to be representative of cortical activity; we will compare this variable to EEG measurements using locally linear embedding.
2.2 Model of the human sleep cycle
The mechanisms underlying human sleep and waking are complex; for recent, detailed reviews of the brain stem and hypothalamic control of sleep in thalamocortical systems, see Fuller et al. (2006, 2007), McCarley (2007), Rosenwasser (2009), and Saper et al. (2005a, b). In summary, the wakeful state may be characterized by high levels of activity in aminergic, cholinergic, orexinergic, and glutamatergic neuronal populations in the brain stem and hypothalamus. The overall effect is to maintain the thalamocortical neurons in a depolarized, active, and continually firing state. These excitatory neurons also inhibit activity in various gamma-aminobutyric acid (GABA)-ergic cell populations, particularly in the ventrolateral preoptic area (VLPO), basal forebrain, and reticular nucleus of the thalamus. With the buildup of homeostatic and circadian pressure to sleep (possibly mediated by various activity-dependent somnogens such as adenosine), the wake-active neurons are inhibited, which then allows the sleep-promoting neurons of the VLPO to start firing and trigger the transition from wakefulness to NREM sleep. This results in quiescence of the aminergic, orexinergic, and cholinergic brain stem neuromodulator centers, which in turn allows hyperpolarization of the corticothalamic systems and hence the burst firing patterns characteristic of slow-wave sleep. If these neurons are only moderately hyperpolarized, the EEG is dominated by the sleep spindles and K-complexes characteristic of stage 2 sleep. With more profound hyperpolarization, the EEG is dominated by the delta waves of stages 3 and 4 (Steriade and Amzica 1998; Steriade and Timofeev 2001). This progressive slowing of the dominant frequency is captured by measures such as the permutation entropy index (Olofsen et al. 2008).
The transition from NREM to REM sleep is associated with corticothalamic depolarization caused by activation of cholinergic and glutamatergic brain stem systems (mainly in and near the pedunculopontine tegmentum). The neuromodulatory environment of REM sleep differs from that of the wakeful state in that the aminergic and orexinergic systems are inactive in REM sleep but active in wakefulness; however, this distinction is not explicit in the present model.
This model has been previously studied. Steyn-Ross et al. (2005) calculated the EEG total power, fractions of high and low power, and correlation time exhibited by the model at the transition from slow-wave sleep to REM; these were found to qualitatively match both human clinical sleep recordings and cortical measurements from a cat. The model was also studied in two spatial dimensions to investigate stable oscillatory states similar to slow-wave sleep, and it was shown that a transition from one state to another can occur due to stochastic fluctuations (Wilson et al. 2005). Lastly, Wilson et al. (2006) interpreted the K-complex as a transient shift from a stable low-firing state to an unstable high-firing state and used this model to demonstrate the mechanism by which the transition may occur.
Because we are interested in comparing this model directly to human EEG recordings, we will use the sleep manifold as a way to generate model “EEG-like” signals. We will choose values of L and \(\Delta h_e^{rest}\), find the numerical solution of the model for a given length of time, convert the dimensionless \(\tilde{h}_e\) to mV, and downsample it to match the EEG recordings. By doing this for many different values of L and \(\Delta h_e^{rest}\), we can obtain representative signals of every sleep stage.
It has previously been argued that \(\tilde{h}_e\) cannot be directly compared to measurements from cortical surface or scalp electrodes because those measurements are based on extracellular current flow, as opposed to the soma potential. This is important for the modeling of certain cortical phenomena; for example, in performing simulations of feedback control for the suppression of epileptic seizures, the value of the electrode measurement is fed directly back to the cortex to affect \(\tilde{h}_e\), with little or no time delay (Lopour and Szeri 2010). In that case, the relationship between \(\tilde{h}_e\) and the electrode measurement at any given time is very important. However, in the present analysis of EEG data using LLE, we are only interested in matching scaled features of the data that are calculated over 30-second intervals. We will not attempt to compare the temporal progression of \(\tilde{h}_e\) directly to the EEG data. The previous work mentioned above has demonstrated a correspondence between \(\tilde{h}_e\) and sleep EEG data with regard to these general features, so we feel confident in using it for our analysis without the addition of a scalp electrode model.
3 Locally linear embedding (LLE)
Locally linear embedding is a method of nonlinear dimensionality reduction that was originally introduced in Roweis and Saul (2000). It is useful for visualizing high-dimensional data sets as they would be embedded in a low-dimensional space, and it can often uncover relationships and patterns that are masked by the complexity of the original data set. It has been used to obtain maps of facial expressions and classify handwritten digits (Saul et al. 2003), as well as discriminate between normal and pre-seizure EEG measurements (Ataee et al. 2007). Here we will use LLE to characterize sleep EEG data and the numerical solutions of the cortical model. By embedding both in a two-dimensional space, we will be able to associate traditional EEG sleep stages with the continuous spectrum of states provided by the model.
3.1 The algorithm
1. Calculate the nearest neighbors of each data point X_{i} in the D-dimensional space. This can be done in several ways; for example, we might choose the k closest points based on Euclidean distance, or we may choose only the points within a sphere of a given radius.
2. Determine the best reconstruction of each point using only its nearest neighbors. Mathematically, this takes the form of a least-squares minimization problem:
$$ \min_W \sum\limits^{N}_{i=1} \left| \mathbf{X}_i - \sum\limits^{k}_{j=1} W_{ij} \mathbf{X}_j \right|^2 \;, $$ (11)
where k represents the number of nearest neighbors. Our goal is to choose the weights W that best reconstruct the original data points in the D-dimensional space, based on the criterion of least-squared error. Because we use only the nearest neighbors, we must have W_{ij} = 0 if X_{j} is not a neighbor of X_{i}. In addition, we guarantee invariance to translations by enforcing ∑_{j} W_{ij} = 1. Note that the minimization can be calculated individually for every i.
3. Compute the low-dimensional output vectors Y_{i}. These are chosen to provide the best global reconstruction using the weights W from the previous step. Again, this can be formulated as a least-squares minimization:
$$ \min_{\mathbf{Y}} \sum\limits^{N}_{i=1} \left| \mathbf{Y}_i - \sum\limits^{k}_{j=1} W_{ij} \mathbf{Y}_j \right|^2 \;. $$ (12)
Here we are making the assumption that the weights that give the best reconstruction in D dimensions will also be the optimal weights in the lower-dimensional space. In this case, the N minimization problems are coupled by the elements of Y, so they must be solved simultaneously.
A detailed description of the algorithm and several examples are provided in Saul et al. (2003). In addition, a Matlab implementation of LLE is available on the authors’ website (Roweis and Saul 2009); it was used to generate all results presented here.
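For reference, both minimizations have standard closed-form solutions, as described in Roweis and Saul (2000) and Saul et al. (2003). The symbols C^{(i)}, M, and d below are introduced here only for illustration and are not used elsewhere in this paper. For each point X_{i}, the weights follow from the local Gram matrix of its k neighbors:

$$ C^{(i)}_{jl} = (\mathbf{X}_i - \mathbf{X}_j) \cdot (\mathbf{X}_i - \mathbf{X}_l) \;, \qquad W_{ij} = \frac{\sum_{l} \left[ C^{(i)} \right]^{-1}_{jl}}{\sum_{m,l} \left[ C^{(i)} \right]^{-1}_{ml}} \;, $$

where j, l, and m range over the neighbors of X_{i}; the normalization enforces ∑_{j} W_{ij} = 1. When k exceeds D, the matrix C^{(i)} is singular and is usually regularized before inversion. For the output vectors, the cost can be written as a quadratic form,

$$ \sum\limits^{N}_{i=1} \left| \mathbf{Y}_i - \sum\limits_{j} W_{ij} \mathbf{Y}_j \right|^2 = \mathrm{tr}\left( \mathbf{Y}^T M \mathbf{Y} \right) \;, \qquad M = (I - W)^T (I - W) \;, $$

which, under the usual constraints that the outputs are centered and have unit covariance, is minimized by the eigenvectors of M with the d smallest nonzero eigenvalues. The constant eigenvector with eigenvalue zero is discarded.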
A possible source of confusion with locally linear embedding is the interpretation of output dimensions such as Y _{1} and Y _{2}. Unlike linear methods such as principal component analysis, LLE does not provide a description of the output vectors in terms of the original D dimensions. The elements of Y are chosen to give the best local reconstructions based on a global minimization problem; this means that the interpretation of Y is different for every data point, and it cannot be described by a simple combination of the original dimensions.
3.2 Strongly connected components
The use of the LLE algorithm is based on the assumption that the entire data set lies on the same manifold in high-dimensional space. If more than one manifold is present, the locally linear reconstructions will no longer be accurate (imagine, for example, a point with nearest neighbors located on two separate manifolds). Therefore, before using LLE on a data set, we must verify this assumption.
The example in Fig. 3 has two strongly connected components: points 1, 2, 5, and points 3, 4, 6. However, the two groups are not strongly connected together; one can move from the first group to the second through the connection between 4 and 5, but there is no way to get from the second group to the first. Point 7 is not strongly connected to any other point. Therefore, to use LLE on this sample data set, we would remove point 7 and use the algorithm separately on each strongly connected component.
There are several ways to identify the strongly connected components of a data set. The most traditional method involves an algorithm based on depth-first search of the directed graph (Tarjan 1972). An alternative method relies on analysis of the eigenspace that results from the LLE calculations (Polito and Perona 2001). It is also true that choosing the nearest neighbors in a different manner or increasing the value of k can change the structure of the strongly connected components. However, for the purposes of this study, we used the MATLAB function dmperm on a matrix containing the nearest neighbor associations for the data set. This function, based on the Dulmage-Mendelsohn decomposition, permutes the rows and columns of a matrix to put it into block upper triangular form; by including the fact that every point is a neighbor with itself, we can guarantee that this permutation will be symmetric. As output, dmperm provides the new order of rows and columns and identifies the blocks of the permuted matrix, where each diagonal block represents one strongly connected component within the data.
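As an illustration of the depth-first approach, the following is a minimal Python sketch of Tarjan's algorithm applied to a directed nearest-neighbor graph. It is not the dmperm-based procedure used in this study. The edge list is a hypothetical graph chosen to be consistent with the description of the seven-point example above; the actual neighbor assignments in that figure may differ.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps each point to the list of its nearest neighbors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []

    def visit(v):
        index[v] = lowlink[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:            # unvisited neighbor: recurse
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:           # neighbor is in the current component
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:        # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(frozenset(component))

    for v in graph:
        if v not in index:
            visit(v)
    return components


# Hypothetical neighbor graph: an edge v -> w means "w is a nearest neighbor of v".
graph = {1: [2], 2: [5], 5: [1, 4], 3: [4], 4: [6], 6: [3], 7: [6]}
components = strongly_connected_components(graph)

# Keep only components with more than one point; isolated points are treated as outliers.
usable = [c for c in components if len(c) > 1]
```

In this sketch, point 7 has a neighbor but is no other point's neighbor, so it forms a component of size one and is dropped, which mirrors the outlier rejection described in the text.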
A remark about principal component analysis (PCA) is in order. This is perhaps the most common mode of dimensionality reduction, and it has also been used in the analysis of sleep EEG data (Gervasoni et al. 2004; Jobert et al. 1994; Corsi-Cabrera et al. 2000). However, PCA places the greatest importance on the directions of largest variance and relies on the assumption that the data is best reconstructed by a linear combination of the original measurements. While we tried PCA and achieved reasonable results, the nonlinear nature of the sleep manifold suggests that a more sophisticated solution is necessary. In addition, the concept of nearest neighbors on which the LLE algorithm is based enabled improvement in the separation of different sleep stages (see Section 4.4), and it played a crucial role in defining the quantitative relationship between the EEG data and mathematical model, as is discussed in Section 5.2.
4 LLE applied to sleep EEG data
Before examining the connection between EEG data and the mathematical model of the sleep cycle, we will first discuss the results of applying LLE to sleep EEG only. After introducing the data sets and our methods, we show that LLE can separate EEG data by sleep stage and provide a continuous representation of sleep depth.
4.1 Sleep EEG data
The EEG data used for this analysis were obtained from the Sleep-EDF database (Kemp 2009), which is part of the PhysioBank online resource of physiologic signals for biomedical research (Goldberger et al. 2000). We used four data sets (sc4002e0, sc4012e0, sc4102e0, and sc4112e0), each one consisting of a European data format (EDF) file and a file containing the hypnogram data. They were converted to ASCII format and then imported into Matlab.
The data were gathered in 1989 from healthy males and females between the ages of 21 and 35. Recordings were obtained over the course of one full day and include horizontal electrooculogram (EOG), two channels of EEG (Fpz-Cz and Pz-Oz, sampled at 100 Hz), submental electromyogram (EMG) envelope, oronasal airflow, and rectal body temperature. However, we used only the data from the Fpz-Cz EEG electrode pair in our analysis. The hypnogram data were generated via manual scoring according to Rechtschaffen & Kales using the two channels of EEG. For more details on the subjects, recording methods, and sleep staging, please see the full description in Mourtazaev et al. (1995).
4.2 LLE input based on EEG features
In order to use the EEG as an input to the LLE function, we need to define our high-dimensional data set. We do this by dividing the signal into non-overlapping windows and calculating both statistical and frequency-based features for each one. Therefore each window becomes one high-dimensional data point, where the dimension equals the number of features. Because the data were scored using 30-second epochs, this was a natural choice for the window length. Thus, if we have 100 minutes of EEG data and we calculate six features, we will input 200 six-dimensional points into LLE and seek the embedding in two dimensions.
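The windowing arithmetic can be checked with a short Python sketch. The function name `split_into_epochs` and the all-zero placeholder signal are assumptions made for illustration; only the 100 Hz sampling rate, the 30-second window, and the 100-minute example come from the text.

```python
def split_into_epochs(signal, fs_hz, window_s):
    """Split a 1-D signal into non-overlapping windows, dropping any incomplete tail."""
    samples_per_window = int(round(fs_hz * window_s))
    n_windows = len(signal) // samples_per_window
    return [
        signal[i * samples_per_window:(i + 1) * samples_per_window]
        for i in range(n_windows)
    ]


fs_hz = 100                                  # sampling rate of the EEG channel
signal = [0.0] * (100 * 60 * fs_hz)          # placeholder for 100 minutes of EEG
epochs = split_into_epochs(signal, fs_hz, window_s=30)

# Each epoch would be reduced to a feature vector (e.g. six features),
# giving len(epochs) points as input to LLE.
n_points, samples_per_epoch = len(epochs), len(epochs[0])
```

With these values, `n_points` is 200 and each epoch holds 3000 samples.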

Power in different frequency bands: This group of five features consists of total power in the delta (up to 4 Hz), theta (4–7.5 Hz), alpha (7.5–12 Hz), beta (12–26 Hz), and gamma ranges (above 26 Hz).

Total power: This is the total power in all five frequency bands.

Statistical measures: These include variance, skewness, and kurtosis. Whereas the variance captures the spread of the data and is always positive, skewness is a measure of the asymmetry around the sample mean, i.e. negative skewness indicates that the data spread out further below the mean than above it. Kurtosis is a measure of how prone the distribution is to outliers; a signal with high kurtosis has infrequent large deviations from the mean.

Spindle score: The spindle detector identifies segments of the EEG signal where the difference between consecutive points changes from positive to negative five times in a row, thereby creating two peaks and two troughs. The lag parameter τ_{L} defines the number of sample points spanned by each rise or fall within the sought-for spindle, so it can be adjusted to search for these motifs at lower frequencies. We set τ_{L} = 5, which allows for detection of 8–12 Hz spindles in data sampled at 100 Hz (with the maximum response occurring for spindles at 10 Hz), and we used a minimum threshold of zero. The overall spindle score indicates the percentage of the signal that was classified as spindle activity. Matlab code for this function is provided in McKay et al. (2010).

Permutation entropy: Similar to the spindle score, the permutation entropy (PE) identifies motifs in the EEG data, such as peaks, troughs, and slopes. The PE has its maximum value when there is an equal distribution of all motifs and its minimum value when only a single motif is present. In this way, it is a measure of the “flatness” or “uncertainty” of the signal. Here we use the composite permutation entropy index (CPEI), which combines the PE with τ_{L} = 1 and τ_{L} = 2 with a minimum threshold level. In our study, we set the threshold at 1% of the interquartile range of the EEG data. Further descriptions of this measure and an associated MATLAB function can be found in Olofsen et al. (2008).
The CPEI has been found to be a good measure of anesthetic depth, and the motif-based methods used for permutation entropy and spindle detection are generally robust to noise. This is demonstrated in Olofsen et al. (2008), where the CPEI is calculated for both a time-varying signal and the same signal with added white noise of various magnitudes. As mentioned above, the noise threshold for PE is built into the calculation. These reasons (and the availability of published MATLAB code) led us to choose motif-based methods over more common parametric measures.

Properties of log power: These four features are based on the log of the power spectral density (PSD), as obtained by Welch’s method. First, we omit the delta and alpha peaks and calculate the slope and offset of a linear fit. We then determine the maximum value of the PSD above the linear estimate in the alpha range (8–17 Hz) and the maximum value of the PSD in the delta range (0.5–4 Hz). These values will generally be large when a prominent peak is present. The code for generating these features was based on a Matlab function found in Leslie et al. (2009).

Power fractions: The low power fraction is obtained by summing the power in the delta and theta ranges and dividing by total power. Similarly, the high power fraction is calculated by summing the power in the beta and gamma ranges and dividing by total power.
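Two of the feature groups listed above can be sketched in a few lines of Python. The code below is an illustrative approximation and not the published MATLAB code used in this study. The statistical measures are computed from population (biased) central moments, which is an assumption about normalization. The entropy function is the plain ordinal-pattern permutation entropy with a single lag; it does not include the threshold or the combination of lags that define the CPEI.

```python
import math
from collections import Counter


def moments(x):
    """Return (variance, skewness, kurtosis) from population central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m2, m3 / m2 ** 1.5, m4 / m2 ** 2


def permutation_entropy(x, order=3, lag=1):
    """Normalized permutation entropy: 0 for a single motif, 1 for equally likely motifs."""
    counts = Counter()
    for i in range(len(x) - (order - 1) * lag):
        window = [x[i + j * lag] for j in range(order)]
        # The motif is the ordering of indices that sorts the window.
        motif = tuple(sorted(range(order), key=window.__getitem__))
        counts[motif] += 1
    total = sum(counts.values())
    entropy = -sum(c / total * math.log(c / total) for c in counts.values())
    return entropy / math.log(math.factorial(order))


variance, skewness, kurtosis = moments([-2, -1, 0, 1, 2])   # symmetric sample
left_tailed_skewness = moments([0, 9, 10, 10, 10, 11])[1]   # long lower tail, negative skewness
ramp_entropy = permutation_entropy(list(range(50)))         # only one motif present
```

A monotonic ramp contains a single motif and therefore has zero permutation entropy, which matches the minimum-value case described above.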
The selection of a subset of features from this list may seem like a difficult task. It is certainly an important one: the use of all 17 features or a “nonsensical” subset will give poor results. However, it is worth noting that there are many combinations that result in a satisfactory separation between sleep stages in the LLE embedding. While each one may be slightly different, a large number of them will be of high quality with respect to discrimination.
4.3 Automated ranking of feature sets
1. How many nearest neighbors should we include? In other words, what is the value of k? The LLE embedding will be stable over a range of values; we generally expect that k will be greater than the number of output dimensions and smaller than the original number of dimensions D (Saul et al. 2003).
2. What should be the dimensionality of the LLE output space? A nice property of the LLE algorithm is that each dimension is preserved as additional dimensions are added. Therefore, if we look at the results in two dimensions and do not achieve the desired mapping, we can add a third dimension without affecting the first two.
3. Which combination of features should we use? Employing all 17 features in our LLE analysis does not guarantee good separation between sleep stages because some of the features may not show consistent variation as the sleep depth changes. In addition, some features, such as the variance and the power in the delta band, show similar trends; we may achieve better results by eliminating these redundancies.
While we were able to identify many effective feature combinations through educated guesswork, we wanted to evaluate the utility of LLE as a method of sleep staging by identifying the best possible results. In this case, the “best” results are those that provide a large separation between sleep stages, especially between REM and deep slow-wave sleep. Because testing each combination of the features is an onerous task, e.g. choosing six features from a pool of 17 results in 12,376 combinations, we developed an algorithm to evaluate the results automatically. It first identifies two groups of points: those marked as REM in the hypnogram and those determined to be stage 4. It then tracks two parameters based on the separation between those two groups of data points as they are embedded in the LLE output space.
The second parameter B uses the concept of nearest neighbors to evaluate separation; for example, if the stage 4 data points have only other stage 4 points as nearest neighbors, then we can infer that they are completely separated from the other sleep stages. More specifically, it measures the number of stage 4 points with REM points as nearest neighbors and divides that by the total number of stage 4 points. If the stage 4 group is isolated, we will have B = 0.
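A minimal Python sketch of the parameter B follows. It assumes that a stage 4 point is counted when at least one of its k nearest neighbors in the output space is a REM point; the text does not state this rule explicitly. The labels `"S4"` and `"REM"`, the function names, and the toy coordinates are hypothetical.

```python
import math


def nearest_neighbors(points, i, k):
    """Indices of the k points closest to points[i] by Euclidean distance."""
    ranked = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return [j for _, j in ranked[:k]]


def separation_b(points, labels, k, target="S4", other="REM"):
    """Fraction of `target` points that have at least one `other` point among their neighbors."""
    target_idx = [i for i, lab in enumerate(labels) if lab == target]
    mixed = sum(
        any(labels[j] == other for j in nearest_neighbors(points, i, k))
        for i in target_idx
    )
    return mixed / len(target_idx)


# Toy embedding: two well-separated clusters in the Y1-Y2 plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["S4", "S4", "S4", "REM", "REM", "REM"]
b_isolated = separation_b(points, labels, k=2)

# Adding one stage 4 point next to the REM cluster raises B to 1/4.
b_mixed = separation_b(points + [(9, 10)], labels + ["S4"], k=2)
```

In the first case the stage 4 group is isolated and B = 0, as described above.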
We determined the values of A and B for all possible combinations of six features. There were 267 feature sets where A exceeded a threshold of 90% separation in each direction: \(A > \sqrt{90^2 + 90^2}\). We then identified the 267 feature sets with the lowest values of B. By finding the combinations that were common to both groups, we identified the 11 best feature sets. Visual inspection of the LLE results for these combinations confirmed the desired separation between REM and stage 4 sleep. Note that all 11 of these feature combinations provided results with B = 0.
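The size of the search and the selection rule can be illustrated in Python. The sketch below confirms the number of six-feature subsets and shows the intersection step on a toy score table. The scores are invented placeholders and not results from this study, and `best_feature_sets` is a hypothetical helper name.

```python
import math
from itertools import combinations

n_features, subset_size = 17, 6
n_sets = math.comb(n_features, subset_size)            # number of six-feature subsets
n_enumerated = sum(1 for _ in combinations(range(n_features), subset_size))

a_threshold = math.hypot(90, 90)                       # sqrt(90^2 + 90^2), about 127.28


def best_feature_sets(scores, a_threshold, n_lowest_b):
    """Feature sets that exceed the A threshold and are among the n lowest in B.

    `scores` maps a feature-set tuple to its (A, B) pair.
    """
    high_a = {s for s, (a, _) in scores.items() if a > a_threshold}
    low_b = set(sorted(scores, key=lambda s: scores[s][1])[:n_lowest_b])
    return high_a & low_b


# Placeholder scores for four hypothetical feature sets (indices into the feature list).
toy_scores = {
    (0, 1): (130.0, 0.0),
    (0, 2): (128.0, 0.4),
    (1, 2): (100.0, 0.0),
    (2, 3): (140.0, 0.1),
}
selected = best_feature_sets(toy_scores, a_threshold, n_lowest_b=2)
```

Here only the set `(0, 1)` passes both criteria; in the study, the same intersection was taken between two groups of 267 feature sets.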
4.4 Separation of sleep stages via LLE

3 points from waking (epochs 1,620, 1,717, and 1,718)

2 points from stage 2 (epochs 1,619 and 1,624)

2 points from stage 3 (epochs 1,712 and 1,713)

1 point from stage 4 (epoch 1,665)
5 Integration of EEG data and the model sleep cycle
Thus far, we have shown that LLE is capable of distinguishing between sleep stages using only one channel of EEG and that the embedding exhibits a smooth progression over time. However, remember that our original goal was to find the relationship between EEG data and the mathematical model of the sleep cycle. Here we accomplish this by applying LLE simultaneously to EEG data and simulated data from the model.
5.1 Model data set
To generate the model data set, we place a grid of points on the sleep manifold (Fig. 1) and obtain the numerical solution of the cortical model at each one. We vary L over the interval [0.5, 2] in increments of 0.1 and \(\Delta h_e^{rest}\) over [−5, 5] in increments of 0.5. This gives us a total of 336 model signals for analysis; we then remove the initial transients and characterize each signal based on a subset of the features described in Section 4.2. In this way, the nonlinear sleep manifold is turned into “EEG-like” signals which are converted to high-dimensional data points for use with LLE.
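The grid size follows directly from the two ranges, as the short Python sketch below shows. The helper `grid` and the variable names are assumptions; solving the cortical model at each grid point is not shown.

```python
def grid(start, stop, step):
    """Evenly spaced values from start to stop inclusive, rounded to suppress float drift."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]


L_values = grid(0.5, 2.0, 0.1)        # synaptic gain parameter L
dh_values = grid(-5.0, 5.0, 0.5)      # resting potential offset, Delta h_e^rest

# One model simulation would be run for every (L, Delta h_e^rest) pair.
model_points = [(L, dh) for L in L_values for dh in dh_values]
```

There are 16 values of L and 21 values of \(\Delta h_e^{rest}\), giving 16 × 21 = 336 grid points.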
For the model data set, the length of each signal is 10 s (as opposed to the 30second windows used for the EEG data). We are able to use this shorter time because we can choose parameters in the model to simulate a stationary brain state, i.e. we can use constant values of L and \(\Delta h_e^{rest}\). A test of the feature calculations for various window lengths indicated that, in many cases, the signal properties were stationary for windows greater than five seconds. Certain parts of the sleep manifold had transients lasting roughly 10 seconds.
5.2 Application of LLE to a hybrid data set
In Fig. 9(a), the model data is represented by dots, where the color denotes the steady-state value of h_{e} associated with that point; in general, the red points represent the REM portion of the manifold, while the blue points represent NREM. On the other hand, the sleep EEG data points are rings, where the color is chosen based on sleep stage. Note that, for clarity, only the first 500 EEG data points were included in the figure.
The most important aspect of this result is that the EEG data points and model points overlap each other in the Y _{1}–Y _{2} output space. This implies that model points have EEG data points as nearest neighbors (and vice versa) and verifies that LLE has associated the two data types with one another. Without fidelity of the model and careful choice of EEG features, we would have likely obtained a result with one cluster of EEG points and a completely separate cluster of model points. Further, LLE appears to have matched the sleep stages between the two data types—the deepest sleep (blue for both EEG and model) appears in the lower left corner, and REM (red) is embedded in a vertical band where Y _{1} is in the range [ − 1, 0]. The separation between sleep stages can be seen more clearly in Fig. 9(b), which displays only the EEG data points from Fig. 9(a). Here we see that the stages are grouped; even the REM points and the awake points are separated, despite the fact that their EEG traces are characterized by very similar features. If we were to plot the Y _{1} and Y _{2} values of the EEG data points as they evolve in time, we would see a very similar result to the one in Fig. 5(b). Here, the Y _{1} direction appears to be an approximate indicator of sleep depth.
5.3 Connection to the theoretical sleep manifold
So far, we have seen that LLE provides a qualitatively similar embedding for REM and NREM points in both EEG measurements and simulated model data. However, we would like to quantify this relationship. In other words, we would like to associate each EEG data point with a position on the sleep manifold in the L–\(\Delta h_e^{rest}\) space. This will allow us to infer the model trajectory of a subject’s actual brain state as it moves along the manifold.
To do this, we use the results in Fig. 9(a) and again turn to the concept of nearest neighbors. Using k = 14, we calculate the nearest neighbors of every point in the \(Y_1\)–\(Y_2\) space. We then identify model points that are nearest neighbors of EEG data points. Each one of those model points has an associated position on the sleep manifold; we assume that the L–\(\Delta h_e^{rest}\) positions of the model nearest neighbors will be the most closely associated positions for the EEG data point.
For example, (i)–(vi) in Fig. 10(a) correspond to awake, REM, and stages 1–4, respectively. The thick vertical line at L = 1.2 marks the approximate location of the fold. As we move from REM to the deeper stages of sleep, we can see a continuous progression along the sleep manifold. In this example, REM and stage 1 sleep generally associate themselves with locations on the right half of the manifold (and a small piece of the lower left corner). Then in stage 2 sleep, we move to the left half of the manifold; here, we see two distinct groups of points, with a majority landing in the group that borders the area associated with REM and stage 1. Stage 3 is associated with a cluster of points starting in the upper left-hand corner and approaching the fold. Stage 4 continues this progression and is located in a band of points leading up to the fold.
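The neighbor-based association described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `associate_eeg_with_model`, the array layout (one row per point, with a boolean mask marking model points), and the brute-force distance computation are all hypothetical choices and not taken from the original analysis.

```python
import numpy as np

def associate_eeg_with_model(Y, is_model, manifold_pos, k=14):
    """Map each EEG point to the manifold positions of its model neighbors.

    Y            : (n, 2) array of embedding coordinates (Y1, Y2) for all points
    is_model     : (n,) boolean mask, True for model points, False for EEG points
    manifold_pos : (n, 2) array; rows of model points hold (L, delta_h_e_rest),
                   rows of EEG points are ignored
    Returns a dict {EEG row index: array of manifold positions}.
    """
    Y = np.asarray(Y, dtype=float)
    is_model = np.asarray(is_model, dtype=bool)
    manifold_pos = np.asarray(manifold_pos, dtype=float)

    # Brute-force pairwise squared distances in the Y1-Y2 output space.
    diff = Y[:, None, :] - Y[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor

    # Indices of the k nearest neighbors of every point, closest first.
    neighbors = np.argsort(d2, axis=1)[:, :k]

    result = {}
    for i in np.flatnonzero(~is_model):
        nbrs = neighbors[i]
        model_nbrs = nbrs[is_model[nbrs]]  # keep only neighbors that are model points
        result[int(i)] = manifold_pos[model_nbrs]
    return result
```

An EEG point with no model points among its k neighbors simply receives an empty array, which would leave it unassociated with any manifold position.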
We can then create a composite plot that combines all five sleep stages. We neglect the waking points for this task because the current model does not effectively distinguish between the waking and REM states, although this is certainly an issue that may be addressed in the future. For every location on the manifold, we determine which sleep stage it was most closely associated with and color it accordingly. To do this, we scale the number of nearest neighbors for each stage by the total number for that stage; then, for every position on the manifold, we choose the stage with the highest value. This accounts for the fact that the subjects do not spend an equal amount of time in each sleep stage (otherwise, more time spent in a certain stage would lead to more nearest neighbors and a greater likelihood of dominating this composite plot). As in previous figures, we use red for REM, yellow for stage 1, green for stage 2, cyan for stage 3, and blue for stage 4. The intensity of the color is assigned based on the percentage of times it was associated with that sleep stage. Suppose a certain point on the manifold was a neighbor of stage 2 twelve times, a neighbor of stage 1 five times, and a neighbor of REM three times. We would color that point green to indicate stage 2 sleep, and its saturation value would be 12/(12 + 5 + 3) = 0.6. In other words, the intensity of the color is a “confidence” measure; the more saturated the color, the more closely it is associated with that sleep stage. The composite figure for the data in Fig. 10(a) is shown in Fig. 10(b).
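The composite coloring rule can be made concrete with a small Python sketch. The function name `composite_map` and the count-matrix layout are assumptions made for illustration. Following the worked example in the text, the sketch picks the winning stage from counts scaled by each stage's total, but computes the saturation from the raw counts at that position; whether the original figures used raw or scaled counts for the saturation is not stated, so this is one reading of the description.

```python
import numpy as np

STAGES = ["REM", "stage 1", "stage 2", "stage 3", "stage 4"]

def composite_map(counts):
    """Assign a dominant stage and a saturation to every manifold position.

    counts : (n_positions, n_stages) array; counts[p, s] is the number of times
             position p was a nearest neighbor of an EEG point scored as stage s.
    Returns (winner, saturation): winner[p] indexes STAGES (-1 if the position
    was never a neighbor), saturation[p] lies in [0, 1].
    """
    counts = np.asarray(counts, dtype=float)

    # Scale each stage's counts by that stage's total, so that stages in which
    # the subject spent more time do not dominate the map.
    stage_totals = counts.sum(axis=0)
    safe_totals = np.where(stage_totals > 0, stage_totals, 1.0)
    scaled = counts / safe_totals
    winner = scaled.argmax(axis=1)

    # Saturation: fraction of this position's raw neighbor counts that belong
    # to the winning stage (the "confidence" measure, e.g. 12 / (12 + 5 + 3)).
    position_totals = counts.sum(axis=1)
    winner_counts = counts[np.arange(counts.shape[0]), winner]
    saturation = np.divide(
        winner_counts,
        position_totals,
        out=np.zeros_like(position_totals),
        where=position_totals > 0,
    )

    winner = np.where(position_totals > 0, winner, -1)
    return winner, saturation
```

For a position with counts of 3 (REM), 5 (stage 1), and 12 (stage 2), and equal totals across stages, the sketch selects stage 2 with a saturation of 0.6, matching the example in the text.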
5.4 Inclusion of additional data sets
 1. Find the k nearest neighbors of each new data point among the points in the existing embedding.
 2. Compute the best linear reconstruction \(w_j\) of each new point using only its nearest neighbors. Again, we enforce the constraint that the weights used in the reconstruction sum to one: \(\sum_j w_j = 1\).
 3. Calculate the output for the new data points: \(y = \sum_j w_j Y_j\), where Y contains the original embedding coordinates and j cycles through the neighbors of x.
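The three steps above can be sketched as follows. This is a minimal illustration rather than the implementation used for the paper: the function name `project_new_points`, the default k = 14, and the small regularization term `reg` added to the local Gram matrix (a common safeguard when the matrix is singular, for instance when k exceeds the feature dimension) are all assumptions.

```python
import numpy as np

def project_new_points(X_new, X_train, Y_train, k=14, reg=1e-3):
    """Project new feature vectors into an existing LLE embedding.

    X_new   : (m, D) feature vectors of the new data points
    X_train : (n, D) feature vectors used to build the existing embedding
    Y_train : (n, d) embedding coordinates of the training points
    reg     : assumed regularization strength for the local Gram matrix
    """
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)
    k = min(k, X_train.shape[0])

    Y_new = np.empty((X_new.shape[0], Y_train.shape[1]))
    for i, x in enumerate(X_new):
        # Step 1: k nearest neighbors of x among the existing points.
        d2 = np.sum((X_train - x) ** 2, axis=1)
        nbrs = np.argsort(d2)[:k]

        # Step 2: reconstruction weights minimizing |x - sum_j w_j X_j|^2
        # subject to sum_j w_j = 1, via the regularized local Gram matrix.
        Z = X_train[nbrs] - x
        G = Z @ Z.T
        trace = np.trace(G)
        G = G + np.eye(k) * reg * (trace if trace > 0 else 1.0)
        w = np.linalg.solve(G, np.ones(k))
        w /= w.sum()  # enforce the sum-to-one constraint

        # Step 3: apply the same weights to the neighbors' embedding coordinates.
        Y_new[i] = w @ Y_train[nbrs]
    return Y_new
```

Because the existing embedding is left untouched, each new window of data can be projected as soon as its features are available, which is what makes this kind of extension suitable for use while the subject sleeps.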
This picture may be very useful in the analysis of seizures during sleep. Imagine taking another new sleep EEG data set, this time from an epileptic subject, and projecting it onto these results. By following the location in L–\(\Delta h_e^{rest}\) as the subject sleeps, we can get an idea of the sleep stage as it is traditionally defined, and we can also identify that stage in more detail and detect nearness to transitions between stages. The grid of points on the sleep manifold essentially gives us descriptions of 336 different brain states associated with sleep. We expect that future research will identify the locations on the sleep manifold where seizures are most likely to occur. With that knowledge, if the sleep state characterization is done continuously while the subject is sleeping, this may allow for the prediction (and possibly prevention) of seizures.
We emphasize the fact that the coloring in the composite pictures (Figs. 10(b), 11, and 12(b)) is based on the subjective scoring of sleep data. The reliability of categorizing individual epochs of data has been reported at 73% for scorers from different labs (Norman et al. 2000) and as high as 90% for scorers from the same lab (Whitney et al. 1998). It has also been shown that reliability varies by sleep stage, with stage 2 having the highest level of agreement between scorers (78.3%) and stage 1 having the lowest (41.8%) (Norman et al. 2000). This certainly affects our results. For example, imagine if some of the points scored as REM that landed in the range 1.6 < L < 2 on the sleep manifold were instead scored as stage 1. Then the right side of the composite picture would be completely yellow and the region associated with stage 1 would be more clear. Therefore, the composite pictures should be seen as “guides” to tie the analysis back to the traditional definitions of the sleep stages, not as the ultimate truth. As mentioned in the previous paragraph, we are most interested in the position on the sleep manifold, the trajectory that results as the subject sleeps, and the relationship of this trajectory to the regions where seizures may be most likely to occur.
6 Summary
Mathematical models represent an opportunity for exploration and prediction. In this case, a model of the human sleep cycle creates the possibility for a more detailed description of sleep states, with application to the prediction and analysis of seizures during sleep. The first step in such an endeavor is always to connect the model to the real world through experimental data.
Here we have used locally linear embedding to directly associate human sleep EEG data with the mathematical model. We first showed that LLE has the ability to distinguish between sleep stages when applied to EEG data alone. This analysis can reliably separate REM and NREM sleep data and provide a smooth temporal progression through the various stages of sleep. We also presented the concept of strongly connected components as a method of automatic outlier rejection for EEG data and discussed a method for the selection of EEG features used in the analysis. Then, by using LLE on a hybrid data set containing both sleep EEG and signals generated from the mathematical sleep cycle, we were able to quantify the relationship between the model and the data. This enabled us to take any sample of sleep EEG data and associate it with a position among the continuous range of sleep states provided by the model. In addition, this approach yields consistent results for various subjects over a full night of sleep and can be done online as the subject sleeps. This suggests a wide range of possibilities for future investigation.
Acknowledgements
This work was supported through a National Science Foundation Graduate Research Fellowship to B. A. Lopour. It was also supported, in part, by a Mary Elisabeth Rennie Epilepsy and Epilepsy-related Research Grant. We extend a special thanks to Kelly Clancy and Albert Kao for work which served as the starting point for this project, done as part of a course in computational neuroscience at UC Berkeley taught by Professor B. A. Olshausen.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
 Ataee, P., Yazdani, A., Setarehdan, S. K., & Noubari, H. A. (2007). Manifold learning applied on EEG signal of the epileptic patients for detection of normal and preseizure states. In Proceedings of the 29th Annual International Conference of the IEEE EMBS (pp. 5489–5492).
 Bojak, I., & Liley, D. T. (2005). Modeling the effects of anesthesia on the electroencephalogram. Physical Review E, 71(041902).
 Corsi-Cabrera, M., Guevara, M. A., Río-Portilla, Y. D., Arce, C., & Villanueva-Hernández, Y. (2000). EEG bands during wakefulness, slow-wave and paradoxical sleep as a result of principal component analysis in man. SLEEP, 23(6), 1–7.
 Fuller, P., Gooley, J., & Saper, C. (2006). Neurobiology of the sleep-wake cycle: Sleep architecture, circadian regulation, and regulatory feedback. Journal of Biological Rhythms, 21(6), 482–493.
 Fuller, P., Saper, C., & Lu, J. (2007). The pontine rem switch: Past and present. Journal of Physiology, 584(3), 735–741.
 Gervasoni, D., Lin, S.-C., Ribeiro, S., Soares, E. S., Pantoja, J., & Nicolelis, M. A. (2004). Global forebrain dynamics predict rat behavioral states and their transitions. The Journal of Neuroscience, 24(49), 11137–11147.
 Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. C., Mark, R. G., et al. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215–e220. Circulation Electronic Pages: http://circ.ahajournals.org/cgi/content/full/101/23/e215.
 Jobert, M., Escola, H., Poiseau, E., & Gaillard, P. (1994). Automatic analysis of sleep using two parameters based on principal component analysis of electroencephalography spectral data. Biological Cybernetics, 71, 197–207.
 Kemp, B. (2009). The Sleep-EDF Database. http://www.physionet.org/physiobank/database/sleepedf/. Accessed August 2009.
 Kramer, M. A., Kirsch, H. E., & Szeri, A. J. (2005). Pathological pattern formation and cortical propagation of epileptic seizures. Journal of the Royal Society Interface, 2, 113–127.
 Kramer, M. A., Szeri, A. J., Sleigh, J. W., & Kirsch, H. E. (2007). Mechanisms of seizure propagation in a cortical model. Journal of Computational Neuroscience, 22, 63–80.
 Leslie, K., Sleigh, J., Paech, M., Voss, L., Lim, C., & Sleigh, C. (2009). Dreaming and electroencephalographic changes during anesthesia maintained with propofol or desflurane. Anesthesiology, 111(3), 547–555.
 Liley, D. T., & Bojak, I. (2005). Understanding the transition to seizure by modeling the epileptiform activity of general anesthetic agents. Journal of Clinical Neurophysiology, 22(5), 300–313.
 Liley, D. T., Cadusch, P. J., & Dafilis, M. P. (2002). A spatially continuous mean field theory of electrocortical activity. Network: Computation in Neural Systems, 13, 67–113.
 Lopour, B. A., & Szeri, A. J. (2010). A model of feedback control for the charge-balanced suppression of epileptic seizures. Journal of Computational Neuroscience, 28(3), 375–387.
 McCarley, R. (2007). Neurobiology of REM and NREM sleep. Sleep Medicine, 8(4), 302–330.
 McKay, E., Sleigh, J., Voss, L., & Barnard, J. (2010). Episodic waveforms in the electroencephalogram during general anaesthesia: A study of patterns of response to noxious stimuli. AIC, 38(1), 102–112.
 Mourtazaev, M., Kemp, B., Zwinderman, A., & Kamphuisen, H. (1995). Age and gender affect different characteristics of slow waves in the sleep EEG. Sleep, 18(7), 557–564.
 Niedermeyer, E., & da Silva, F. L. (2005). Electroencephalography: Basic principles, clinical applications, and related fields. Lippincott Williams & Wilkins.
 Norman, R. G., Pal, I., Stewart, C., Walsleben, J. A., & Rapoport, D. M. (2000). Interobserver agreement among sleep scorers from different centers in a large dataset. SLEEP, 23(7), 901–908.
 Olofsen, E., Sleigh, J. W., & Dahan, A. (2008). Permutation entropy of the electroencephalogram: A measure of anaesthetic drug effect. British Journal of Anaesthesia, 101(6), 810–821.
 Polito, M., & Perona, P. (2001). Grouping and dimensionality reduction by locally linear embedding. In Advances in neural information processing systems 14 (pp. 1255–1262). MIT Press.
 Rosenwasser, A. (2009). Functional neuroanatomy of sleep and circadian rhythms. Brain Research Reviews, 61, 281–306.
 Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326.
 Roweis, S. T., & Saul, L. K. (2009). Locally linear embedding. http://www.cs.toronto.edu/ roweis/lle/. Accessed June 2009.
 Saper, C., Lu, J., Chou, T., & Gooley, J. (2005a). The hypothalamic integrator for circadian rhythms. Trends in Neuroscience, 28(3), 152–157.
 Saper, C., Scammell, T., & Lu, J. (2005b). Hypothalamic regulation of sleep and circadian rhythms. Nature, 437(7063), 1257–1263.
 Saul, L. K., Roweis, S. T., & Singer, Y. (2003). Think globally, fit locally: Unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research, 4, 119–155.
 Steriade, M., & Amzica, F. (1998). Coalescence of sleep rhythms and their chronology in corticothalamic networks. Sleep Research Online, 1(1), 1–10.
 Steriade, M., & Timofeev, I. (2001). Natural waking and sleep states: A view from inside neocortical neurons. Journal of Neurophysiology, 85(5), 1969–1985.
 Steyn-Ross, D. A., Steyn-Ross, M. L., Sleigh, J. W., Wilson, M. T., Gillies, I. P., & Wright, J. J. (2005). The sleep cycle modelled as a cortical phase transition. Journal of Biological Physics, 31, 547–569.
 Steyn-Ross, M. L., Steyn-Ross, D. A., Sleigh, J. W., & Liley, D. T. J. (1999). Theoretical electroencephalogram stationary spectrum for a white-noise-driven cortex: Evidence for a general anesthetic-induced phase transition. Physical Review E, 60(6), 7299–7311.
 Steyn-Ross, M. L., Steyn-Ross, D. A., Sleigh, J. W., & Whiting, D. R. (2003). Theoretical predictions for spatial covariance of the electroencephalographic signal during the anesthetic-induced phase transition: Increased correlation length and emergence of spatial self-organization. Physical Review E, 68, 021902.
 Steyn-Ross, M. L., Steyn-Ross, D. A., & Sleigh, J. W. (2004). Modelling general anaesthesia as a first-order phase transition in the cortex. Progress in Biophysics & Molecular Biology, 85, 369–385.
 Tarjan, R. (1972). Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2), 146–160.
 Whitney, C. W., Gottlieb, D. J., Redline, S., Norman, R. G., Dodge, R. R., Shahar, E., et al. (1998). Reliability of scoring respiratory disturbance indices and sleep staging. SLEEP, 21(7), 749–757.
 Wilson, M. T., Steyn-Ross, M. L., Steyn-Ross, D. A., & Sleigh, J. W. (2005). Predictions and simulations of cortical dynamics during natural sleep using a continuum approach. Physical Review E, 72(051910).
 Wilson, M. T., Steyn-Ross, A., Sleigh, J. W., Steyn-Ross, M. L., Wilcocks, L. C., & Gillies, I. P. (2006). The K-complex and slow oscillation in terms of a mean-field cortical model. Journal of Computational Neuroscience, 21, 243–257.