A novel method for automated classification of epileptiform activity in the human electroencephalogram based on independent component analysis
- Cite this article as:
- De Lucia, M., Fritschy, J., Dayan, P. et al. Med Biol Eng Comput (2008) 46: 263. doi:10.1007/s11517-007-0289-4
Diagnosis of several neurological disorders is based on the detection of typical pathological patterns in the electroencephalogram (EEG). This is a time-consuming task requiring significant training and experience. Automatic detection of these EEG patterns would greatly assist in quantitative analysis and interpretation. We present a method that allows automatic detection of epileptiform events and their discrimination from eye blinks, based on features derived using a novel application of independent component analysis. The algorithm was trained and cross-validated using seven EEGs with epileptiform activity. For epileptiform events with compensation for eye blinks, the sensitivity was 65 ± 22% at a specificity of 86 ± 7% (mean ± SD). With feature extraction by PCA or classification of raw data, specificity fell to 76% and 74%, respectively, at the same sensitivity. On exactly the same data, the commercially available software Reveal had a maximum sensitivity of 30% and a concurrent specificity of 77%. Our algorithm performed well at detecting epileptiform events in this preliminary test and offers a flexible tool that is intended to be generalized to the simultaneous classification of many waveforms in the EEG.
Keywords: Electroencephalogram, Independent component analysis, Automatic classification, Epileptiform events, Eye-blink artefacts
Despite computational advances, current clinical practice is still for expert observers to interpret EEGs by eye. Over the past three decades, there has been considerable interest in finding automated ways to analyse the EEG in order to save time-consuming human analysis, which is a particular problem for longer recordings. Until now, these efforts have mostly concentrated on detection of seizures and of the spikes and sharp waves that indicate epileptic activity; sensitivities of up to about 79% for seizures and 89% for spikes and sharp waves have been reported for commercial software [42, 43].
Existing methods for automatic detection of epileptiform events are conventionally based on one or both of two principles: (1) spike morphology, in which a set of typical features that characterize spikes is utilized, and (2) localization of transient events, in which spikes are represented as transients that can be distinguished from ongoing background activity. Algorithms in the first class attempt to represent spikes by a set of features that make it possible to discriminate spikes from other events. These are constructed to capture and represent explicitly all the relevant information in the input, and eliminate redundancies that would corrupt the subsequent discrimination. Some features are readily related to spike morphology, such as duration, amplitude and kurtosis [10, 15, 16, 43]; others use more general signal processing methods such as waveform decomposition into Fourier or wavelet components [14, 25, 29]. Algorithms in the second class aim to detect epileptiform events by measuring their deviation from stationarity. Autoregressive analysis has been used in both single-channel and multi-channel EEG to extract coefficients that characterize the non-stationarity of epileptic events [1, 11]. The same principle has been implemented using nonlinear measures based on information theory that are able to capture the transitions between complex and less complex dynamics.
In a typical detection algorithm design, these principles are implemented in a tree-like structure algorithm or a multistage classifier. One common idea is to use different features of the EEG as inputs to one or more artificial neural networks, whose outputs are then passed through a set of rules based on fuzzy logic [35, 43]. Another is to use a highly structured network with several input and hidden layers [33, 40]. Such architectures can behave in complex ways, which often makes it difficult to diagnose and rectify flaws in classification.
The long-term purpose of this work is to develop an algorithm that can automatically detect all the clinically relevant waveforms in the EEG. Here, we report a novel, generalizable method, implemented in this case for the major classes of epileptiform activity and for the discrimination of these from eyeblinks. These two features were selected because of the clinical significance of the former, and the false positives caused by the latter. We present preliminary results on a sample database of seven EEGs and compare these results to the commercial software Reveal (http://www.eeg-persyst.com/index.html), as well as to other possible approaches for feature extraction prior to classification.
Our algorithm falls into the class of spike morphology algorithms described above. First, the raw EEG was projected onto selected informative feature dimensions; second, these multidimensional features were used as the basis of a classifier. Note that, for the moment, we only considered single epileptic waveforms in the scalp EEG, thus excluding seizures and also polyspike sequences of these single waveforms.
The goal of feature extraction is to discover dimensions in the input that collectively enable simple, readily generalizable, separation of the various classes. This task is subject to various trade-offs, most notably between the power of the classifier and the power of the features, with the latter being subject to severe constraints based on the limited amount of information available in the training set. We therefore started by restricting our consideration to simple linear and quadratic classifiers, and sought small numbers of feature dimensions. We considered spectral, principal component analysis (PCA) and independent component analysis (ICA) approaches to determine appropriate features. The algorithm that performed most competently was based on ICA and a quadratic classifier. We, therefore, report the results achieved by this method.
In previous work, the most popular choice of features involves spectral or wavelet-based [14, 25, 29] decompositions, and methods driven more directly by the data, such as PCA and ICA, have attracted much less study. However, the empirical bases are well suited to our goal, since they are automatically sensitive to the characteristics of the patterns involved. PCA and ICA are ways of extracting empirical bases that differ in subtle, but important, respects. Indeed, ICA has previously been shown to be useful for analyzing epileptiform discharges, but only given a degree of hand-tuning [26–28, 32]. We employed a novel variant of ICA in a fully automatic procedure, and showed that it outperforms PCA and is at least competitive with Reveal on the same input dataset.
2.2 Independent component analysis (ICA) and its novel use in this study
Not only are there different principles and algorithms, but there are also various ways to apply ICA to physiological data. One application of ICA, commonly used for fMRI data, is called spatial ICA (sICA). In this, the signals at each point in space are decomposed into V spatially independent sources whose projections are fixed across time. sICA is conventionally applied to functional MRI [6, 30, 31], for which there are often many more spatial than temporal dimensions. Here, the number of sources, V, is bounded by the number of time steps of the data that were acquired.
For our purposes, temporal ICA (tICA) was problematic, since we expected that more than one independent component would account for the epileptic activity, and that different patients would have foci in different locations, and therefore quite different underlying spatial maps across electrodes.
The novel aspect of our ICA variant arises from the fact that during testing, any potential spike was treated as an extra input into the model (Fig. 3b). The features for each spike (red links in Fig. 3b) were evaluated by projecting the independent components onto the spike itself (red waveform in Fig. 3b). This projection is in fact the set of mixing coefficients (weights) that, in the underlying ICA generative model, would allow the spike to be created from the ensemble of independent sources learned from the training set.
It is worth emphasizing that the independent sources were extracted just from the training database of spikes. This was done because we expected a bigger difference between the projections of spikes and non-spikes onto a representation that is precisely tailored to the vagaries of spikes themselves. The same argument potentially applies to other signals of interest such as eyeblinks, to which we also applied our method.
2.3 EEG training and validating data
Table: diagnosis and number of epileptiform events (# Epileptiform events) for each recording. Diagnoses listed: idiopathic generalized epilepsy; left temporal partial lobe epilepsy; focal spike wave activity; temporal lobe epilepsy; generalized spike and wave and partial temporal epilepsy; generalized spikes and spikes and wave, burst of polyspikes and wave.
Training and validation proceeded according to a leave-one-out cross-validation procedure on the seven recordings. In each case, testing was performed on one full recording and training was undertaken on a subset of the remaining six. This was repeated seven times. The training set comprised all the sharp and spike waves collected from the six recordings together with a separate, hand-picked, 23 min subset of the six recordings. This subset was chosen to capture the range of variability in the data, whilst avoiding artefacts. The total number of sharp waves and spikes in each training set was between 499 and 611.
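The leave-one-out scheme over the seven recordings can be sketched as follows. This is a minimal illustration only: the recording identifiers are hypothetical, and the sketch does not reproduce the selection of the hand-picked 23 min training subset described above.

```python
# Leave-one-recording-out cross-validation over seven recordings.
# The identifiers below are hypothetical placeholders, not the study's file names.
recordings = [f"rec{i}" for i in range(1, 8)]


def leave_one_out_splits(items):
    """Yield (train, test) pairs, holding out each item exactly once."""
    for i, held_out in enumerate(items):
        train = items[:i] + items[i + 1:]
        yield train, held_out


splits = list(leave_one_out_splits(recordings))
for train, test in splits:
    print(f"test on {test}; train on {len(train)} recordings")
```

Each recording is used for testing exactly once, and the classifier never sees the held-out recording during training.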
2.4 Description of the method of classification
For each waveform, we computed the maximum amplitude occurring within a sub-interval of the original temporal window. This subinterval consisted of 40 time frames starting at the 40th time point of each waveform. Without loss of generality we disregarded a priori all the waveforms that did not have this maximum value in the middle of the 109 time-frame window. This choice was made in order to reduce the computational time required by the algorithm, and because those criteria did not exclude any epileptic waveforms. The same principle applies to both training and testing of the algorithm.
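A minimal sketch of this screening step is given below. It assumes that "maximum amplitude" means the largest signed sample in the window (using the absolute value would be an equally plausible reading), and the function and constant names are illustrative rather than taken from the study's code.

```python
import numpy as np

WINDOW = 109     # window length in time frames (from the text)
SUB_START = 39   # zero-based index of the 40th time point
SUB_LEN = 40     # length of the central sub-interval in time frames


def is_candidate(window):
    """Return True if the window's largest sample lies in the central sub-interval.

    Assumption: the peak is the largest signed sample, not the largest magnitude.
    """
    window = np.asarray(window, dtype=float)
    if window.shape != (WINDOW,):
        raise ValueError(f"expected a window of {WINDOW} samples")
    peak = int(np.argmax(window))
    return SUB_START <= peak < SUB_START + SUB_LEN


centred = np.zeros(WINDOW)
centred[54] = 1.0        # peak in the middle of the window
off_centre = np.zeros(WINDOW)
off_centre[5] = 1.0      # peak near the start of the window
print(is_candidate(centred), is_candidate(off_centre))
```

Only windows that pass this test would go on to feature extraction, which is what reduces the computational load.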
In the first step of the algorithm, independent components S were extracted from X_s according to the model expressed by (3). The input data for ICA were preprocessed by centering and whitening. The ICA decomposition was performed using FastICA with the activation function g(u) = u^3. The independent components were fixed for both classes, and their projections onto X_k were treated as lists of features for each class. In the second step of the algorithm, we trained a quadratic classifier based on these projections.
In the test stage, a sliding window of length L was shifted sample by sample along each channel, and projections onto the independent components were computed for each EEG segment that met the above criteria.
For any of the cross-validation splits in the dataset, the maximum number of independent components was 105. Since the basis is undercomplete, computation of the pseudoinverse of the matrix S was required.
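The projection step can be illustrated with a short sketch. Here the matrix S is a random stand-in for the learned independent components (the ICA itself is not run), and the function name and array shapes are assumptions made for illustration. Because S has fewer rows than the window has samples, the mixing coefficients are obtained with the pseudoinverse.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def project_onto_components(windows, S):
    """Least-squares mixing coefficients A such that windows ≈ A @ S.

    windows: (n_windows, L) EEG segments.
    S: (n_components, L) component waveforms, with n_components < L.
    """
    windows = np.atleast_2d(np.asarray(windows, dtype=float))
    return windows @ np.linalg.pinv(S)


rng = np.random.default_rng(0)
L, n_components = 109, 3
S = rng.standard_normal((n_components, L))   # stand-in for learned components

# A waveform generated from the components is mapped back to its weights.
true_weights = np.array([[2.0, -1.0, 0.5]])
features = project_onto_components(true_weights @ S, S)

# Sliding a length-L window sample by sample along one (synthetic) channel.
channel = rng.standard_normal(500)
channel_features = project_onto_components(sliding_window_view(channel, L), S)
print(features.round(3), channel_features.shape)
```

In the procedure described above, only windows that meet the screening criteria would be projected; the sketch projects every window for brevity.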
The algorithm was then generalized to detect a further class comprising eye movements and eye blinks (eye artefacts; Fig. 2), as these artefacts appeared to be the main source of classification errors in the two-class case. In the feature extraction stage, the three classes (epileptiform activity, eye artefacts and the rest of the EEG) were all projected onto the same single basis, obtained from ICA using only the epileptiform activity. We generalized the quadratic discriminator to allow multiclass classification, fitting a mixture of Gaussians model to the features of the different classes, and computing the posterior probabilities of each class.
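One common way to realize such a multiclass quadratic discriminator is to fit one full-covariance Gaussian per class and apply Bayes' rule. The sketch below follows that reading; it is an assumption about the general technique, not the study's implementation, and the integer class labels (0 = background, 1 = epileptiform, 2 = eye artefact), the synthetic features and the prior values are all illustrative.

```python
import numpy as np


def fit_gaussians(features, labels):
    """Fit one full-covariance Gaussian per class; return {label: (mean, cov)}."""
    model = {}
    for label in np.unique(labels):
        x = features[labels == label]
        model[int(label)] = (x.mean(axis=0), np.atleast_2d(np.cov(x, rowvar=False)))
    return model


def posteriors(features, model, priors):
    """Posterior class probabilities from Gaussian likelihoods and class priors."""
    classes = sorted(model)
    log_post = []
    for c in classes:
        mean, cov = model[c]
        diff = features - mean
        _, logdet = np.linalg.slogdet(cov)
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        log_post.append(np.log(priors[c]) - 0.5 * (logdet + maha))
    log_post = np.array(log_post).T
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    return classes, post / post.sum(axis=1, keepdims=True)


# Synthetic three-dimensional features for three well-separated classes.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0, 0.0], [6.0, 6.0, 6.0], [-6.0, 6.0, 0.0]])
X = np.vstack([m + rng.standard_normal((200, 3)) for m in means])
y = np.repeat([0, 1, 2], 200)

model = fit_gaussians(X, y)
classes, post = posteriors(X, model, priors={0: 0.8, 1: 0.1, 2: 0.1})
predicted = np.array(classes)[post.argmax(axis=1)]
print("training accuracy:", (predicted == y).mean())
```

Changing the entries of `priors` shifts the decision boundaries between classes without refitting the Gaussians.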
The model was optimized by extracting the number of independent components that showed the best performance of the quadratic classifier across the seven splits. To this end, we reduced the input dimension n_s of X to between 1 and 105 dimensions using principal component analysis before extracting the same number of independent components. In the following, we always refer to this algorithm as the ICA-algorithm, whether it is based on PCA followed by ICA, or on ICA alone when no dimension reduction is needed.
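The dimension reduction before ICA can be sketched with a singular value decomposition. The sketch assumes X holds one training waveform per row (n_s rows of L samples) and returns k whitened signals of length L that an ICA routine could then rotate into independent components. The layout, the function name and the random data are assumptions for illustration.

```python
import numpy as np


def pca_whiten(X, k):
    """Reduce X (n_s, L) to k whitened signals of length L.

    Each row is centred over time; the k leading right singular vectors,
    scaled by sqrt(L), have unit variance and are mutually uncorrelated.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)          # centre each waveform over time
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return np.sqrt(Xc.shape[1]) * Vt[:k]


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 109))    # synthetic stand-in for training waveforms
Z = pca_whiten(X, 3)
print(Z.shape, np.round(Z @ Z.T / Z.shape[1], 6))
```

Repeating this for k from 1 up to the maximum number of components, and scoring the classifier each time, would give the model-selection loop described above.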
We evaluated the performance based on the average ROC curve across the seven cross-validation splits. Given the multiple classes, we evaluated the ROC curve in the testing stage of each of the seven splits by fixing the prior probability for the eyeblink class in order to get the maximum sensitivity possible for detecting epileptiform activity, and tuning the prior probability between epileptiform and the background EEG classes. In the ROC curve evaluation, we collapsed all the marks across channels along one single channel and considered the marks within the L-window as one single event. A mark is labeled as true positive when at least one of the spikes at a certain time point is detected. This criterion makes it possible to make a fair comparison with the Reveal software (see below).
Two aspects of the ROC curve are of particular interest: the area, which measures the overall performance of the algorithm, and the sensitivity reached at the rate of six false positives (FPs) per minute. Since EEGs are usually displayed with 10 s per page, a higher rate of FPs per minute would require a revision of classifier marks for every page, and so would not save the time of the technician or electrophysiologist. In order to emphasize the difference between the three representations, we also report the accuracy obtained at a mid-range value of 65% sensitivity.
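The event-level scoring can be sketched as below. Marks pooled over channels are merged into one event when they fall within one window length of the event's first mark, and each event is then matched against expert annotations. The matching tolerance of one window length, the function names and the example numbers are assumptions for illustration, not details taken from the study.

```python
def merge_marks(mark_times, window):
    """Collapse marks (sample indices pooled over channels) into events.

    A mark closer than `window` samples to the first mark of the current
    event joins that event; otherwise it starts a new one.
    """
    events = []
    for t in sorted(mark_times):
        if not events or t - events[-1][0] >= window:
            events.append([t, t])
        else:
            events[-1][1] = t
    return [tuple(e) for e in events]


def score(events, true_times, window, duration_min):
    """Return (sensitivity, false positives per minute) at the event level."""
    def near(t, event):
        start, end = event
        return start - window <= t <= end + window

    hits = sum(any(near(t, e) for e in events) for t in true_times)
    false_pos = sum(not any(near(t, e) for t in true_times) for e in events)
    return hits / len(true_times), false_pos / duration_min


marks = [100, 105, 150, 1000, 1004, 5000]     # detections from all channels
events = merge_marks(marks, window=109)
sensitivity, fp_per_min = score(events, true_times=[120, 3000],
                                window=109, duration_min=2.0)
print(events, sensitivity, fp_per_min)
```

At six false positives per minute and 10 s per displayed page, a reviewer would meet on average one false mark on every page, which is why that rate is used as the reference point.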
3 Comparison with other feature-based classifiers and Reveal
The performance of the ICA-based representation was compared with other feature-based classifiers, and the commercially available software Reveal.
We considered two possible choices of features, using each time the same classifier as described above. The first important control was done using the raw, temporal, representation of the input data themselves. The second comparison was performed using PCA.
Reveal was also run on the same dataset. In this program, spikes, sharp waves, and spikes and waves are marked when they occur as single waveforms or in a run (focal, generalized or partially generalized). All these events are defined as spikes in the 'mark detector'. Each mark is accompanied by a probability index, termed a perception level p (0.1 < p < 1), and the channel of the spike focus.
The perception level quantifies the degree of uncertainty about the marked waveform; the lower the perception value, the lower the confidence in the detection.
The corresponding ROC curve (Fig. 8) gave a sensitivity of 65% (±22%) at a specificity of 86% (±7%). With a PCA algorithm based on three components, the specificity was 76% (±1%) at the same sensitivity (Fig. 8). For every possible number of extracted components, the PCA-based classifier gave an ROC curve area that was lower than that of the ICA-algorithm (Fig. 7; p < 0.05). The three ROC curves obtained with 105 independent components, 105 principal components and raw data gave specificities of 80% ± 12%, 64% ± 26% and 74% ± 7%, respectively, all at a sensitivity of 65% (Fig. 6).
With three extracted components, the ICA-algorithm achieved a sensitivity of 17% ± 17% and a specificity of 96% at a rate of six false positives per minute (Fig. 8).
For Reveal, the highest average sensitivity was 30%, at a specificity of 77%. In one recording, Reveal did not detect any spikes. Our algorithm showed a higher accuracy than Reveal only for sensitivities higher than 17% (Fig. 8). The ROC curve areas for Reveal and for our algorithm, restricted to the highest sensitivity reached by Reveal, were not significantly different (Student's t test).
We have presented a two-stage algorithm for automatic classification of EEG signals. The first stage consisted of a novel application of an independent components analysis algorithm. This resulted in a set of new representational coordinates that were appropriate for discriminating spikes from other features of the EEG. The second stage consisted of a mixture of Gaussians classifier, based on these coordinates. The classifier was optimized to detect epileptiform activity, eye blinks and eye movement artefacts. A ROC curve analysis for epileptiform events showed that we can expect an average sensitivity of 65% at a specificity of 86%.
Model optimization led to the selection of only three independent components after dimensionality reduction by PCA. The three selected components are clearly related to the salient features that human experts report using to detect a spike or a sharp wave (Fig. 9). One component is clearly responsible for the peakedness of the waveform (red line), the green component reflects the duration of the epileptiform event, and the blue component represents the rebound, which especially characterizes the spike and wave. This straightforward interpretability of the selected features can be seen as indirect support for the validity of our data-driven approach to feature selection based on independent component analysis. More independent components are expected to be selected when more patterns must be detected.
The algorithm has not been fully tested on a naive dataset, but our preliminary results, which are based on a cross validation procedure, appear promising for routine clinical use. The algorithm did not impose high computational demands. A recording of 20 min, which comprises 17 Mb of data, can be processed in about 10 min on a 2.8 GHz dual processor PC. In clinical practice, we expect this time to be practicable for routine daily use. Should the computational time become unfeasible when more classes need to be detected, more powerful computational resources could be considered. One avenue is use of parallel processing by the GRID. Using the Condor GRID resource available at our university with 920 PCs, processing time was reduced by a factor of 25 [12, 13].
It is very hard to compare our algorithm directly against most other suggested methods, since, unlike ours, the datasets on which they have been tested and validated are not in the public domain. We certainly observed, even in our own datasets, that there can be quite large differences depending on patient and human expert. That said, some other reported performances are better than ours, especially in the range of 6 false positives per minute. For instance, the 'Gotman spike detector' achieved a sensitivity of 76% with 5.2 FPs/min and another study reported a sensitivity of 73% at 6.1 FPs/min, whereas our algorithm showed only 17% sensitivity in this range. More reliable are comparisons based on the same dataset. In this respect, we were able to run the commercial algorithm Reveal on our own data. Although this is reported as having a sensitivity and specificity of 89.9% and 99.6%, respectively (based on a large dataset including 40 subjects and 10 controls, which was independently labeled by five experts), when applied to our data, Reveal only achieved a sensitivity of 30% even at the lowest perception level. This prevented us from comparing our results at the level of sensitivity (>50%) usually reported in the literature. Nevertheless, the specificity of Reveal at sensitivities lower than 17% was slightly better than that of our algorithm. We are confident that our data have been appropriately labeled, and have duly put them into the public domain. That our method allows multiple classes suggests that its performance could be further improved if other explicit sources of misclassification are included as extra classes in the classifier. This is a key direction for future work.
Other planned work includes validation with a larger naïve training and cross-validation set, particularly because of the apparent variability in the number of spikes in each dataset. This variability led to problems in setting the threshold consistently between the separate cross-validation splits of the data, an issue with which extra data should greatly help. The next task is to improve the classifier. Since rather few representational components turn out to be optimal, it is possible to visualize the representation of the spikes and non-spikes in the reduced-dimensionality ICA space. This visualization suggests that we might beneficially replace the quadratic classifier, which was chosen mainly for its simplicity, with a nearest neighbor classifier.
In the longer run, our intent is to generalize automatic classification to the other salient aspects of EEG signals. If this can be accomplished, then this could provide a novel tool, which could be used routinely for a quantitative fast analysis suitable for everyday EEG recording.
In conclusion, our novel ICA-based approach is quite accurate in detecting epileptiform events, and is a promising approach for generalizing automatic methods to the detection of other salient aspects of EEG signals.