Dyslexia is a disability that causes difficulties in reading and writing despite average intelligence. This hidden disability often goes undetected since dyslexics are normal and healthy in every other way. Electroencephalography (EEG) is one of the emerging methods being researched for identifying unique brain activation patterns in dyslexics. The aims of this paper are to examine the pros and cons of existing EEG-based pattern classification frameworks for dyslexia and to recommend optimisations based on the findings to assist future research. A critical analysis of the literature is conducted focusing on each framework’s (1) data collection, (2) pre-processing, (3) analysis and (4) classification methods. A wide range of inputs and classification approaches has been tried in order to improve EEG-based pattern classification frameworks. The review found that incorporating reading- and writing-related tasks into the experiments used in data collection, instead of using only simple tasks, may help improve these frameworks, and that unwanted artefacts caused by body movements in the EEG signals during reading and writing activities could be minimised using artefact subspace reconstruction. Further, support vector machine is identified as a promising classifier to be used in EEG-based pattern classification frameworks for dyslexia.
Dyslexia is a disability that involves deficiencies in reading and writing capabilities, but does not affect intellect. Although this condition was commonly known as ‘word blindness’ in the 1800s, it has now been identified as a condition with a neurological origin and not as a condition to do with lack of vision [1, 2].
Past research has proposed many techniques to detect indicators of dyslexia. We can broadly group these techniques into three categories. The first category is detection using ‘behavioural’ symptoms and aspects. This is the conventional and most popular method currently used by psychologists to diagnose dyslexia. This method assesses whether a person has dyslexia using highly recognised standardised tests. The second category is the use of brain imaging techniques to portray distinctive brain behaviours. Functional magnetic resonance imaging (fMRI), magneto-encephalography (MEG), electroencephalography (EEG) and positron emission tomography (PET) are a few of the methods that could be used to depict these behaviours. Studies [4,5,6] show that individuals with dyslexia have unique brain structures and behaviours. The third category includes eye-movement patterns [7,8,9]. Categories two and three are still in experimental stages. Those techniques only help to identify symptoms of dyslexia and are not currently used to diagnose dyslexia.
EEG is one of the popular techniques used to assess brain behaviours. In this study, we look at how EEG has been used to identify signs of dyslexia. The EEG results may help psychologists to complement the current dyslexia assessment techniques, as they could add a neurological point of view.
This paper focuses on EEG-based dyslexia studies that have attempted to identify unique brain activations using pattern recognition. Although dyslexia can further be divided into sub-types, this study does not cover studies that use pattern recognition to identify the sub-types. The intention of this study is to review EEG-based pattern classification frameworks specific to dyslexia; it does not intend to speculate on the neuroscience behind the findings. Each framework is assessed using a pre-defined format to arrange the data in a meaningful manner and to recognise its strengths and weaknesses. These discoveries are then used to propose an improved EEG-based pattern classification framework for dyslexia (one with higher validation accuracies for the classifier).
What is dyslexia?
Dyslexia is a disability with a neurological origin that causes difficulties in reading, writing or spelling despite average or above average intelligence and sensory abilities. Common symptoms of dyslexia include poor reading skills, unreadable handwriting, slow writing or copying, poor spelling, and letter migration or reversals [2, 10,11,12,13].
This condition is heritable, which means that a child might inherit it from a parent. It has been reported that 23–65% of children who have a parent with dyslexia are at risk of having dyslexia. Dyslexia in some cases can have partly or wholly distinct genetic causes. Studies suggest looking into the genetic aspect to effectively detect dyslexia instead of merely considering individual disabilities. Studies have shown that overall reading capabilities, including dyslexia, have noteworthy genetic components, with heritability estimated at 54–84%. Left-handedness is sometimes considered to be prevalent among people with dyslexia. However, this is controversial: certain findings have discovered a connection between dyslexia and left-handedness, whereas some studies claim it to be a myth [17,18,19].
Why is dyslexia detection important?
A noteworthy proportion of the world population is affected by dyslexia. Statistics show that approximately 20% of the child population in the USA, approximately 4% of the students in Australia and overall approximately 15–20% of the world population experience dyslexia.
An individual with dyslexia can become depressed or unmotivated, or can suffer from low self-esteem, if the condition goes undetected. Difficulty in learning to interpret letters, words or sometimes even symbols causes the child to have a hard time keeping up with peers [4, 23].
Diagnosing dyslexia at an early stage is important to prevent the child having to go through a stressful, rough childhood and face frustrating experiences at school. Early detection helps to direct a child with dyslexia to the necessary treatments. Targeted assistance is essential for people with dyslexia to cope with their struggles and difficulties. Recent studies state that ‘dyslexia is not a disease or defect that can be cured’, but rather a ‘condition that can be helped’ with proper targeted support. Promising results have been shown for children who go through such intervention programs in the early stages, with improvement in reading performance as well as a reduction in anxiety. Although these techniques help, dyslexia still persists into adulthood.
Although persons with dyslexia face difficulties in reading and writing, they have normal or sometimes even higher levels of intelligence. Albert Einstein, Leonardo da Vinci, Alexander Graham Bell, Hans Christian Andersen, Walt Disney, Henry Ford, Steve Jobs and Richard Branson are a few of the famous and talented dyslexic great minds. According to Davis in the book ‘The gift of dyslexia: why some of the brightest people can’t read and how they can learn’, people with dyslexia are believed to be highly intuitive and insightful, with the ability to alter and create perceptions. They are known to be highly aware of the environment, with more curiosity than average, thinking mainly in pictures instead of words and experiencing thought as reality with vivid imagination.
Conventional dyslexia detection techniques
The conventional dyslexia detection practices are often based on ‘behavioural’ symptoms and aspects. Standardised tests such as the Wechsler Individual Achievement Test (WIAT), Comprehensive Test of Phonological Processing (CTOPP), Oral and Written Language Scales (OWLS) and Woodcock Johnson (WJ) are used to assess reading, writing, intelligence quotient and phonological processing abilities. The results of the standardised tests, along with factors such as biographical information and family history, help determine whether a person has dyslexia. The severity of dyslexia may vary from mild to severe, and the symptoms of dyslexia vary from person to person.
What is an EEG?
Electroencephalogram, commonly known as EEG, is a ‘record of the oscillations of brain electric potential recorded from electrodes on the human scalp’ [30, p. 3]. EEG is a technique that can be used to monitor and detect brain functions. The electrical activity of the brain for various stimuli can be identified via the electrodes placed on the scalp (Fig. 1).
EEG-based pattern classification for dyslexia
Recent studies show that researchers are now looking into ‘neurological’ aspects to identify patterns that are unique to dyslexia. It has been found that there are structural differences, as well as differences in how the brain processes information, between normal and dyslexic individuals. The dyslexic brain is normal and healthy, but it takes longer to make connections compared to that of normal people [4, 37].
Since dyslexia has a neurological origin, EEG can be used to identify its unique brain activation patterns, as has been done for other conditions. The next section discusses the efforts made by researchers in using EEG for pattern classification between dyslexics and non-dyslexics.
What are the existing frameworks and their shortcomings?
This section covers research carried out to improve pattern classification frameworks for dyslexia using EEG. The existing frameworks will be identified, and each framework will be explored in depth to identify its strengths and weaknesses.
Given below is an overview of the review process, which consists of 5 main steps. Each framework will be analysed taking into account the following criteria (Fig. 2).
A study carried out by Arns et al. was able to uncover unique brain activation patterns in dyslexic children. A total of 38 participants: 19 dyslexics (11 males and 8 females) and 19 controls (11 males and 8 females) between the ages of 8–16 years took part in this study. The exclusion criteria included mental illness or genetic disorders in personal or family history, neurological disorder, brain injury, addiction to drugs or alcohol and serious medical conditions. The EEG data were acquired at a sampling rate of 500 Hz using the internationally recognised 10–20-electrode positioning system having 28 channels, namely Fp1, Fp2, F7, F3, Fz, F4, F8, FC3, FCz, FC4, T3, C3, Cz, C4, T4, CP3, CPz, CP4, T5, P3, Pz, P4, T6, O1, Oz and O2. The experiment was performed in a sound- and light-attenuated room, which was kept at a temperature of 22 °C. The EEG data were recorded for 2 min while the participant was seated with eyes open, focusing attention on a red dot displayed on a computer screen. The group of participants with dyslexia was also given a few language tests. These tests consisted of articulation, rapid naming of letters, phoneme deletion and spelling. The results of these reading-related tasks were collected to find the correlation between EEG and the neurological findings of dyslexia. However, EEGs were not recorded while these tasks were performed; instead, the eyes-open resting recording described above was used, since the resting-state EEG correlated highly with the tests.
The data are EOG-corrected prior to the analysis. These data are then examined using the power spectral analysis. The approach followed is that the data are first partitioned into adjacent 4-s sections, next the data are transformed to the frequency domain from the time domain using fast Fourier transform (FFT), and finally the average power spectra are calculated for specified frequency bands ranging within the delta, theta, alpha and beta bands. The EEG data are then analysed statistically using one-way ANOVA to find the significant differences between the dyslexic and control group. Further, a correlation matrix is acquired for correlations between the variables within the dyslexic group. The significant measures of the EEG power and coherence data obtained from the two groups are submitted for the correlation analysis with the four language tests explained above. The study revealed that the dyslexic group had increased slow theta and delta activity in the frontal and right temporal areas of the brain. Beta was clearly increased at F7, and significant correlations were found between the EEG coherence and the dyslexia tests  (Table 1).
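The power spectral analysis described above (adjacent 4-s sections, transformation to the frequency domain, then average power per band) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the band limits in `BANDS`, the function names and the synthetic test signal are all assumptions, and a direct DFT is used for readability where a real pipeline would use an FFT routine.

```python
import cmath
import math

# Band edges in Hz. These limits are an assumption; studies define them differently.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def power_spectrum(segment, fs):
    """Return (frequencies, power) of one segment using a direct DFT."""
    n = len(segment)
    mean = sum(segment) / n
    centred = [x - mean for x in segment]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n ** 2)
    return freqs, power


def band_powers(signal, fs, seg_seconds=4.0, bands=BANDS):
    """Average the power in each band over adjacent, non-overlapping segments."""
    seg_len = int(seg_seconds * fs)
    n_segs = len(signal) // seg_len
    if n_segs == 0:
        raise ValueError("signal is shorter than one segment")
    totals = {name: 0.0 for name in bands}
    for s in range(n_segs):
        segment = signal[s * seg_len:(s + 1) * seg_len]
        freqs, power = power_spectrum(segment, fs)
        for name, (lo, hi) in bands.items():
            totals[name] += sum(p for f, p in zip(freqs, power) if lo <= f < hi)
    return {name: total / n_segs for name, total in totals.items()}


# Synthetic single-channel example: a 6 Hz (theta) and a weaker 20 Hz (beta) component.
fs = 64
signal = [math.sin(2 * math.pi * 6 * i / fs) + 0.5 * math.sin(2 * math.pi * 20 * i / fs)
          for i in range(8 * fs)]
powers = band_powers(signal, fs)
```

In this synthetic case the theta band carries the most power and the beta band a smaller share, which is the kind of per-band, per-channel value that would then be passed to the statistical comparison between groups.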
This study only performs statistical analysis using the EEG data and does not present any classification mechanism to differentiate between dyslexics and non-dyslexics. The data collection has been carried out carefully: it uses an equal number of participants in each group and a sufficient number of EEG channels, applies exclusion criteria for conditions that could affect the brainwave recordings, and collects the data in a consistent and suitable environment. However, since the EEG data are collected only in the resting state and not while the tests are actually being undertaken, important activity specific to each task is likely to be missed. Because the EEGs were recorded only in the resting state, the main unwanted artefact is eye blinks, which have been removed in the pre-processing step of the analysis. The input features derived from the EEG recordings include the power spectra for specified frequency bands such as alpha, beta and theta at each EEG channel. One of the significant findings, the increase in beta activity, suggests that brain activity is significantly heightened in dyslexics while performing tasks, in this case specifically reading-related tasks.
A framework for detecting abnormalities in dyslexia using approximate entropy of EEG signals was proposed by Andreadis et al. . Approximate entropy (ApEn) is a ‘statistical parameter used to quantify the regularity of a time series data of physiological signals’ . This study consisted of a total of 57 participants: 38 dyslexics (26 males and 12 females) and 19 controls (7 males and 12 females) between the ages of 2–13 years. The exclusion criterion comprises difficulties in hearing, history of head injury, neurological diseases or attention deficit disorders.
The EEG for this study was recorded using the international 10–20 system, containing 15 channels, namely Fp1, F3, C5, C3, Fp2, F4, C6, C4, O1, O2, P4, P3, Pz, Cz and Fz. The experiment for this study is that a single sound tone was presented to the participant via earphones, which was of a high frequency of 3000 Hz or low frequency of 500 Hz, followed by numbers that had to be memorised. The brainwave data were collected as EEG signal for 500 ms before the stimulus and as event-related potential (ERP) after the stimulus for 1000 ms.
The pre-processing used in this study includes two main steps. The first step was recording the electrooculography (EOG) and rejecting values higher than 75 μV, and the second step was normalising the waveforms by subtracting the mean value and dividing by the standard deviation of each signal. These data are then analysed using ApEn and Cross-ApEn (comparing EEG signals from two electrodes). A support vector machine (SVM) classifier was then implemented using the statistically significant electrodes for all subjects, obtained using ApEn, as input features. This classifier offered promising results, achieving a sensitivity of 89.47% and specificity of 57.89%. The study was then taken a step further to enhance the classifier using the input features from Cross-ApEn. This method looks at significant pairs of electrodes instead of evaluating electrodes individually. Although this technique delivered better discrimination abilities, no clear pattern has yet been found because there was a very high number of statistically significant pairs of electrodes.
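To make the normalisation and ApEn steps concrete, the following is a minimal sketch of z-score normalisation followed by the standard approximate entropy calculation. The embedding dimension `m = 2` and tolerance `r = 0.2` (in units of standard deviation after normalisation) are common defaults and are assumptions here, since the review does not state the values used in the study; the function names and test signals are also illustrative.

```python
import math
import random


def zscore(x):
    """Subtract the mean and divide by the standard deviation."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / sd for v in x]


def _phi(u, m, r):
    """Average log-fraction of length-m templates that lie within r of each template."""
    n = len(u) - m + 1
    total = 0.0
    for i in range(n):
        matches = sum(
            1 for j in range(n)
            if max(abs(u[i + k] - u[j + k]) for k in range(m)) <= r
        )
        total += math.log(matches / n)  # matches >= 1 because of the self-match
    return total / n


def approximate_entropy(u, m=2, r=0.2):
    """ApEn(m, r): lower values indicate a more regular time series."""
    return _phi(u, m, r) - _phi(u, m + 1, r)


# A regular signal (sine) versus an irregular one (Gaussian noise), both normalised.
rng = random.Random(0)
sine = zscore([math.sin(2 * math.pi * i / 20) for i in range(200)])
noise = zscore([rng.gauss(0.0, 1.0) for _ in range(200)])
apen_sine = approximate_entropy(sine)
apen_noise = approximate_entropy(noise)
```

The sine wave yields a lower ApEn than the noise, which is the sense in which ApEn quantifies regularity. In a framework of the kind described, one such value per electrode would form the feature vector passed to the classifier.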
Looking at the study as a whole, it can be stated that the researchers have been able to successfully develop a classifier that can differentiate between the dyslexic and the non-dyslexic. However, the experiment used looks into only the working memory abilities and does not involve any reading- or writing-related elements. Since dyslexia is a condition that causes deficiencies in reading and writing abilities, important factors required for the differentiation process could be missed out. The same research team performed another analysis using the same experiment and data by using wavelet entropy . The findings revealed that wavelet entropy could be used as a quantified measure to observe and analyse EEG and ERP signals to detect brain patterns specific to dyslexia.
A Malaysian research team conducted a frequency analysis of EEG signals generated between dyslexic and normal children during writing [41, 42]. The EEGs were recorded from a total of 6 right-handed children: 3 dyslexic and 3 control subjects between the ages of 8–12 years using the standard international 10–20 system. This study uses only 4 EEG channels, namely C3, C4, P3 and P4. The experiment involved collecting EEGs in the relaxed state and while performing writing-related activities, which were designed based on the conventional method of diagnosing dyslexia.
During the pre-processing phase, unwanted artefacts, namely electrocardiogram (ECG) and electrooculogram (EOG) signals, were filtered out. Next, the signals containing the writing-related data were extracted using a band-pass FIR filter ranging from 8 to 30 Hz. For the frequency analysis, the signals are transformed from the time domain to the frequency domain using fast Fourier transform (FFT). The study revealed that dyslexic children expend more energy, which results in higher-frequency beta activity during writing-related activities compared to normal children. The frequency range identified for dyslexic children is between 22 and 28 Hz, whereas for non-dyslexic children it is between 14 and 22 Hz (Tables 2, 3).
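An 8–30 Hz band-pass FIR filter of the kind mentioned above can be sketched with the windowed-sinc method. The study does not report the filter design, so the Hamming window, the 257 taps and the 256 Hz sampling rate below are assumptions chosen only for illustration, as are the function names.

```python
import cmath
import math


def lowpass_taps(cutoff_hz, fs, numtaps):
    """Windowed-sinc low-pass FIR coefficients (Hamming window, odd numtaps)."""
    mid = (numtaps - 1) // 2
    fc = cutoff_hz / fs  # cutoff in cycles per sample
    taps = []
    for k in range(numtaps):
        t = k - mid
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (numtaps - 1))
        taps.append(ideal * window)
    return taps


def bandpass_taps(low_hz, high_hz, fs, numtaps=257):
    """Band-pass = low-pass at the upper edge minus low-pass at the lower edge."""
    upper = lowpass_taps(high_hz, fs, numtaps)
    lower = lowpass_taps(low_hz, fs, numtaps)
    return [u - l for u, l in zip(upper, lower)]


def fir_filter(signal, taps):
    """Apply the FIR filter by direct convolution (zero initial conditions)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k < 0:
                break
            acc += h * signal[n - k]
        out.append(acc)
    return out


def gain(taps, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    return abs(sum(h * cmath.exp(-2j * math.pi * freq_hz * k / fs)
                   for k, h in enumerate(taps)))


fs = 256  # assumed sampling rate in Hz
taps = bandpass_taps(8.0, 30.0, fs)
passband_gain = gain(taps, 20.0, fs)   # inside the 8-30 Hz band
low_stop_gain = gain(taps, 1.0, fs)    # slow drift, below the band
high_stop_gain = gain(taps, 50.0, fs)  # above the band
```

With these settings the gain is close to 1 inside the band and close to 0 at 1 Hz and 50 Hz, so `fir_filter(channel, taps)` would retain the alpha and beta range while suppressing slower and faster activity.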
Overall, this study does not provide any classification mechanism. It only analyses the frequencies obtained from the dyslexic and non-dyslexic groups. Looking at the number of channels and the number of participants used for the study, it can be inferred that the numbers are too small to justify using these results as the basis of a framework to discriminate between the dyslexic and the non-dyslexic. The study has explicitly used subjects that are right-handed, which is an important factor since handedness affects EEG activity [43, 44]. However, exclusion criteria for factors that could affect the EEG recordings have not been taken into consideration. Additionally, it is not indicated whether a silent and temperature-controlled room was used to carry out the experiment. The pre-processing techniques used in this study are similar to those of previous studies; however, since this study involves hand movements, it is not specified how the artefacts generated by the hand movements were filtered out. Further, the experiment focuses only on writing-related tasks.
Frid and Breznitz proposed a support vector machine (SVM)-based algorithm for differentiating between dyslexic readers and regular readers using ERPs. The study was carried out with a total of 50 participants: 20 dyslexics and 30 controls between the ages of 24 and 40 years. The signals were recorded at a sampling rate of 2048 Hz using the standard 10–20 system with 64 channels. In the experiment, the subject is required to press a button in response to a target stimulus, which is a tone. The conditions consist of 50 target tones at a frequency of 1000 Hz and 50 non-target tones at 2000 Hz.
The data collected are first pre-processed using a band-pass filter at 0.1–100 Hz, then a notch filter at 50 Hz is used to remove noise caused by electric power lines, and finally unwanted artefacts such as eye and muscle movements are filtered out. The next step is feature selection, where the features with the most relevance and the greatest ability to discriminate are chosen. The five features selected are positive area (Ap), maximal peak amplitude/time ratio (Mp), spectral flatness measure (SFM), standard deviation and skewness, and power spectral density (PSD). Classification was first attempted using a single classifier for all features, but this was not successful. Therefore, the approach followed was to use ensemble SVMs. The classification results were compared for the following combinations: the best single feature, an ensemble of three SVMs and only the left or right hemisphere.
To recapitulate, the study uses a simple experimental task, which relates to working memory and reasoning abilities, but does not engage any stimulus related to reading or writing, which are important factors in detecting patterns unique to dyslexia. This may have failed to activate vital areas of the brain specific to dyslexia. The study does not indicate whether any inclusion and exclusion criteria were taken into account when recruiting the participants, which could increase the likelihood of having outliers within the groups selected.
Another study suggested a classification model to distinguish dyslexic children from normal children during the resting state. A total of 6 participants: 3 dyslexics and 3 controls within the ages of 4–7 years took part in this study. The EEG data are collected using the international 10–20-electrode placement system using 8 channels with a sampling rate of 250 Hz. The experiment is carried out in a room with controlled temperature and lighting while the participants are in the resting state, with both eyes closed and eyes open.
During the pre-processing phase, noise and irrelevant artefacts have been removed. Since the data were collected in the resting state, the relevant frequency band is alpha, which has been extracted using band-pass filtering. Feature extraction, the next phase, is performed using kernel density estimation (KDE), a non-parametric technique for estimating the probability density function of the data. Finally, the classifier is trained using a multilayer perceptron (MLP). This mechanism was able to obtain an accuracy rate of 90% in classifying the dyslexic and non-dyslexic during both eyes-open and eyes-closed conditions.
In summary, the study uses EEG data from only the resting state, disregarding the essential reading- and writing-related brainwave data. The number of participants and the number of channels used are quite low compared to previous similar research [39, 45]. No inclusion or exclusion criteria for participants are indicated. Further, although the reported 90% accuracy rate is encouraging, the data set used is very small, so the result should be interpreted with caution.
A further study proposed a wavelet packet analysis of EEG signals between dyslexic and non-dyslexic children during writing. A total of 8 subjects: 4 dyslexics and 4 controls between the ages of 7–12 years took part in this study. The EEG data were recorded in a temperature-controlled room at 24 °C using the international 10–20 system with 4 channels, namely C3, C4, P3 and P4, at a sampling rate of 256 Hz. The signals were captured in the relaxed state, in the writing state and during letter recognition, and each task was repeated 6 times. These data are then examined using wavelet packet analysis for the alpha and beta frequency bands. The study found that there was no significant difference in the alpha band frequencies between the relaxed state and the writing state in dyslexics; however, for non-dyslexics the alpha band frequency was higher during the relaxed state compared to the writing state. During writing, beta frequency was higher in dyslexics compared to non-dyslexics.
This study looks into brain behaviour during the resting and writing states, but does not look into the reading state. No information is provided about pre-processing the signal to remove unwanted artefacts such as eye blinks. The number of subjects and the number of channels used in the study are low compared to previous similar research [39, 45]. Finally, the study performs only an analysis and does not perform any classification.
Is there a need for an improved framework?
This section will examine all the frameworks as a whole and ultimately propose an improved framework.
Number of participants
There are many important decisions to be made prior to the data collection to make the experiment successful. One of the most important is to determine the number of participants required for the study. The review disclosed that some studies had too few subjects, which makes the outcome less reliable (Table 4).
In medical research, the number of subjects used for the study is mostly limited because of uniqueness, ethical considerations, time and cost. Therefore, it is important to identify the optimal sample size to avoid the sample being too small resulting in not being able to recognise important effects and the sample being too large resulting in a waste of resources. Using the sample size of a similar study is one of the approaches that can be used to determine the sample size . In this case, instead of relying on one previous similar study, the sample size can be determined by getting the mean sample size of multiple similar studies. According to the calculation using past similar research, we can suggest having approximately 15 subjects for each group.
Another technique to determine the number of subjects is Altman’s nomogram for sample size calculation. According to this calculation, for a power of 0.80 (significance level of 0.05) and a standardised difference between 0.8 and 1.0 (Cohen’s d effect size), the total number of subjects would vary between 50 and 30 participants. Therefore, the number of subjects per group would vary between 25 and 15 (Fig. 3).
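As a rough cross-check of the nomogram reading, the per-group sample size for comparing two means can be approximated with the usual normal-approximation formula n = 2(z₁₋α/₂ + z_power)² / d². This sketch is not Altman's nomogram itself, and the function name is illustrative; it only shows that the formula gives values close to those read from the nomogram.

```python
import math
from statistics import NormalDist


def subjects_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided, two-group comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for power = 0.80
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


n_for_d_08 = subjects_per_group(0.8)
n_for_d_10 = subjects_per_group(1.0)
```

For a standardised difference of 0.8 this gives 25 subjects per group, and for 1.0 it gives 16 per group, which is in line with the range of 25 to 15 per group read from the nomogram.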
According to previous similar studies, EEG-based pattern classification for dyslexia has been carried out on children as well as adults, which means that such a study can be conducted with either group. However, it is important to make sure that the subjects in the selected age range have comparable reading and writing abilities (Table 5).
The studies reviewed have not compared brainwave patterns by gender. Therefore, the comparison between female and male dyslexic brainwave patterns is a gap to be filled in future work.
The data collection location and its environment are very important factors to consider when recording EEGs. Given below is a summary of the typical environment extracted from the review, along with further suggestions. These factors are important to make sure that no interference is caused to the signals and that the subjects are comfortable and not distracted.
Sound- and light-attenuated room.
Temperature-controlled room—if subjects are perspiring, it could cause problems to the recordings.
Any extra equipment in the room should be electrically quiet—this can be checked via a probe test for electromagnetic signals .
EEG recording system and channels
The recommended electrode placement system is the international 10–20 system. This method describes the location of electrodes on the scalp. The ‘“10” and “20” refer to the fact that the actual distances between adjacent electrodes are either 10 or 20% of the total front–back or right–left distance of the skull’ (Fig. 4; Table 6).
The popular choice of EEG channels was determined using channels specifically mentioned in a study as prominent for classification and channels that appear in at least 2 studies.
Inclusion and exclusion criteria of the subjects
The inclusion and exclusion criteria summarised from the reviews are given below.
Genetic disorders in person or family history.
Drug or alcohol addiction.
Serious medical condition.
Difficulties in hearing/vision—this would not apply if the subject has corrected vision/hearing.
Attention deficit disorders.
Handedness–the participants recruited need to be either all left-handed or all right-handed, and not a mix of both. This is because there is a difference in EEG activity between right-handed and left-handed subjects [43, 44].
As explained before, it is now understood that dyslexia is a disability that causes difficulties in reading and writing despite normal (or above) intelligence and sensory capabilities. Therefore, it can be presumed that dyslexia-specific brainwave activation patterns are more prominent during reading and writing activities than during tasks that are related only to working memory and reasoning. Reading-related tasks can be broken down further to find brain signal patterns while reading regular words as opposed to nonsense words. Phonological awareness, ‘the ability to hear and manipulate the sounds’ in words, is one of the areas in which dyslexics commonly have difficulty. Research shows that dyslexics perform worse in reading irregular and nonsense words compared to regular words. Therefore, including a task to read nonsense words may show noticeable results. Today, writing is often replaced by typing in day-to-day activities; therefore, typing too could be included in the tasks. Further, a task with a combination of reading and writing can be incorporated.
Pre-processing is one of the most important steps in the analysis process of the signals. This step makes sure unwanted artefacts are removed from the signal. When recording EEG signals, some of the most commonly seen irrelevant artefacts are the eye movements and eye blinks, and the common practices used for removing these from EEG signals are independent component analysis (ICA) and principal component analysis (PCA) [53, 54]. Comparison studies between these two techniques show that ICA produces better results compared to PCA [54, 55].
In addition to electrooculogram (EOG) signals, which are produced by eye movements, EEG recordings can contain contaminating signals such as electromyogram (EMG) and electrocardiogram (ECG). Typically, body movements are kept to a minimum during EEG-based experiments. This is because movements cause unwanted artefacts in the EEG signal, making analysis and classification difficult. In fact, trials with unwanted artefacts are sometimes manually rejected from studies. However, new methods have now been introduced that make it possible to collect data during real-life activities instead of only during the resting state or simple activities such as button clicks. Artefact subspace reconstruction (ASR) is one such method, which can be used to filter out body movement and muscle burst artefacts from the EEG signals [57, 58]. ASR ‘relies on a sliding-window principal component analysis, which statistically interpolates any high-variance signal components exceeding a threshold relative to the covariance of the calibration data set. Each affected time point of EEG is then linearly reconstructed from the retained signal subspace based on the correlation structure observed in the calibration data’.
ASR requires a 1-min EEG recording in the relaxed state, which is known as the calibration data set. This technique performs PCA on a sliding window, removes high-variance components that exceed a threshold (here, three standard deviations above the mean of the calibration data) and finally reconstructs the affected portions from the remaining signal. This automated artefact removal technique is quite easy to use as it is available as a plug-in in EEGLAB (Fig. 5).
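% Calibrate ASR on the 1-min relaxed-state recording (channels x samples)
state = asr_calibrate(calibrationData, samplingFrequency);
% Clean the task recording using the calibrated state
cleanData = asr_process(experimentData, samplingFrequency, state);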
Another important aspect to be filtered prior to the analysis is the noise caused by electric power lines. This is often seen at 60 or 50 Hz, and this can be filtered out using a notch filter.
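A notch filter of this kind can be sketched as a second-order IIR section. The coefficient formulas below follow the widely used audio-EQ cookbook notch design; the quality factor `q = 30`, the 500 Hz sampling rate and the synthetic signals are assumptions for illustration, and the function names are not taken from any reviewed study.

```python
import math


def notch_coefficients(notch_hz, fs, q=30.0):
    """Second-order notch filter coefficients (b, a), normalised so that a[0] = 1."""
    w0 = 2 * math.pi * notch_hz / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a


def apply_biquad(signal, b, a):
    """Run the difference equation y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


fs = 500  # assumed sampling rate in Hz
n = 4 * fs
mains = [math.sin(2 * math.pi * 50 * i / fs) for i in range(n)]  # power-line noise
alpha_wave = [math.sin(2 * math.pi * 10 * i / fs) for i in range(n)]  # activity to keep

b, a = notch_coefficients(50.0, fs)
# Compare the last second only, after the filter's start-up transient has decayed.
mains_residual = rms(apply_biquad(mains, b, a)[-fs:])
alpha_ratio = rms(apply_biquad(alpha_wave, b, a)[-fs:]) / rms(alpha_wave[-fs:])
```

In this example the 50 Hz component is almost completely removed while the 10 Hz component passes essentially unchanged. For recordings made where the mains frequency is 60 Hz, `notch_hz` would be set to 60 instead.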
Two main types of analysis could be used, namely frequency (Fourier) analysis and wavelet analysis (Table 7).
Frequency analysis: One of the common analyses used in EEG-based pattern classification frameworks for dyslexia is frequency analysis. The raw EEG signal recorded is in the time domain. This waveform is a combination of a number of sinusoidal waves, although this is not directly visible. Fast Fourier transform, commonly known as FFT, can be used to decompose the waveform into a sum of sinusoids of different frequencies. Performing the FFT therefore helps detect peaks in the frequency domain that are not visible in the time domain.
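The idea that a waveform is a sum of sinusoids whose frequencies become visible after the FFT can be illustrated with a small radix-2 implementation. This is a teaching sketch under assumed values (128 Hz sampling rate, components at 10 Hz and 24 Hz); the function names are illustrative, and in practice a library FFT routine would be used.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequencies(signal, fs, count=2):
    """Return the `count` strongest (frequency in Hz, amplitude) pairs, strongest first."""
    n = len(signal)
    spectrum = fft(signal)
    peaks = [(k * fs / n, 2 * abs(spectrum[k]) / n) for k in range(1, n // 2)]
    return sorted(peaks, key=lambda fa: fa[1], reverse=True)[:count]


# A waveform that mixes a 10 Hz and a weaker 24 Hz sinusoid.
fs = 128
signal = [math.sin(2 * math.pi * 10 * i / fs) + 0.6 * math.sin(2 * math.pi * 24 * i / fs)
          for i in range(256)]
peaks = dominant_frequencies(signal, fs)
```

The two strongest peaks fall at 10 Hz and 24 Hz with amplitudes of about 1.0 and 0.6, recovering the components that were mixed together in the time domain.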
Wavelet analysis: This method decomposes a signal onto a set of basis functions called wavelets and allows analysis in both the frequency domain and the time domain.
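The joint time and frequency information provided by wavelets can be illustrated with the Haar wavelet, the simplest wavelet basis. The Haar wavelet is chosen here only for brevity and is an assumption, since the reviewed studies do not state which mother wavelet was used; the function names and the synthetic signal are likewise illustrative.

```python
import math


def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    scale = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition; details[0] holds the finest (highest-frequency) scale."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2 ** levels")
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# A flat signal with a short, fast oscillation at samples 40 to 47.
signal = [0.0] * 64
for i in range(40, 48):
    signal[i] = 1.0 if i % 2 == 0 else -1.0

approx, details = haar_decompose(signal, levels=3)
# Each level-1 coefficient covers two samples, so index j maps to samples 2j and 2j + 1.
active = [j for j, d in enumerate(details[0]) if abs(d) > 1e-12]
```

The non-zero level-1 detail coefficients sit at indices 20 to 23, that is, samples 40 to 47. The decomposition therefore reports both that the burst is high-frequency (finest scale) and when it occurred, which a Fourier spectrum alone would not show.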
The analysis should be selected based on the expected outcome. Although wavelet analysis gives additional information, this may not be important if the intention is only to identify which frequencies are present in the signal and not the time at which each occurred. The choice of analysis method therefore depends entirely on the experiment and the expected outcome.
EEG sub-band decomposition
Once all the channels have been transformed to the frequency domain, the resulting spectrum can be decomposed into sub-bands. Table 8 contains a summary of each frequency sub-band.
This method allows analysing the frequencies at specific frequency bands instead of analysing each frequency in isolation.
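A sub-band decomposition of this kind can be sketched as below. The band edges in `BANDS` are commonly quoted approximate ranges and are an assumption here; they may differ from the values summarised in Table 8. The function name `band_powers`, the 128 Hz sampling rate and the test signal are also illustrative.

```python
import cmath
import math

# Assumed band edges in Hz (approximate, conventions vary between studies)
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}

def band_powers(samples, fs, bands=BANDS):
    """Sum |X[k] / N|^2 over the positive-frequency DFT bins falling in each band."""
    n = len(samples)
    powers = {name: 0.0 for name in bands}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        coeff = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        power = abs(coeff) ** 2 / n ** 2
        for name, (low, high) in bands.items():
            if low <= freq < high:
                powers[name] += power
    return powers

fs = 128   # assumed sampling rate (Hz)
n = 256    # two seconds of data, giving 0.5 Hz resolution
signal = [math.sin(2 * math.pi * 10 * i / fs) + 0.3 * math.sin(2 * math.pi * 20 * i / fs)
          for i in range(n)]

powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
print(dominant, powers)
```

Each channel then yields one value per band, which is far more compact than the full spectrum and can be used directly as an input feature.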
The most important step in the analysis phase is the extraction of features. Feature extraction transforms the input data into a set of features. This makes it possible to analyse the data in terms of a reduced set of features instead of the large original input data set. The input features identified through the review are power spectral density, entropy, positive area, maximal peak amplitude/time ratio, spectral flatness measure, standard deviation and skewness. Energy, average valley amplitude, peak variation, root mean square and power are a few of the features used in recent EEG-related studies [31, 60, 61] that could also be incorporated into EEG-based pattern classification frameworks for dyslexia. Adding all of these features will not necessarily improve the validation accuracy; they are suggested so that different combinations can be tested, since they have helped to improve other EEG frameworks and may do the same for dyslexia-based frameworks.
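Several of the time-domain features listed above can be computed in a few lines. The sketch below is illustrative only: the function name `extract_features`, the use of population (rather than sample) statistics and the toy input are assumptions, and exact feature definitions vary between studies.

```python
import math

def extract_features(samples):
    """Reduce a window of samples to a small dictionary of scalar features."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n   # population variance
    std = math.sqrt(variance)
    energy = sum(x * x for x in samples)                   # sum of squared amplitudes
    power = energy / n                                     # mean squared amplitude
    rms = math.sqrt(power)
    third_moment = sum((x - mean) ** 3 for x in samples) / n
    skewness = third_moment / std ** 3 if std > 0 else 0.0
    return {
        "standard_deviation": std,
        "skewness": skewness,
        "energy": energy,
        "power": power,
        "root_mean_square": rms,
    }

features = extract_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

In practice such a function would be applied to each channel (and possibly each sub-band) of every trial, and the resulting values concatenated into one feature vector per trial.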
The classification phase can be regarded as the most important step in the dyslexia pattern identification process. Once all the data are ready, it is important to select the most suitable classification algorithm. The popular choices of classification algorithm in similar past research are support vector machine and multilayer perceptron. EEG classification has also been performed for other conditions using classifiers such as fuzzy support vector machine, optimum-path forest classifier, linear discriminant analysis and neural networks. Three popular choices are described below, along with the pros and cons of each.
Linear discriminant analysis
Linear discriminant analysis classifies data by first creating ‘models of the probability density functions for data generated from each class. Then, a new data point is classified by determining the probability density function whose value is larger than the others’. The algorithm ‘assumes that each of the class probability density functions can be modelled as a normal density and that the normal density functions for all classes have the same covariance’.
Linear discriminant analysis is known to be a simple classifier that requires very little computation. However, it is not suitable for complex nonlinear EEG classification, since it does not produce good results in such scenarios.
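The shared-covariance assumption leads to a linear decision rule, which can be sketched for two classes and two features as follows. The function names `fit_lda` and `predict_lda`, the equal class priors and the hypothetical feature values are assumptions made for the example.

```python
def fit_lda(class0, class1):
    """Two-class LDA for 2-D points with a pooled covariance and equal priors."""
    def mean(points):
        count = len(points)
        return (sum(p[0] for p in points) / count, sum(p[1] for p in points) / count)

    m0, m1 = mean(class0), mean(class1)

    # Pooled within-class covariance (the same covariance is assumed for both classes)
    sxx = sxy = syy = 0.0
    for points, m in ((class0, m0), (class1, m1)):
        for x, y in points:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    dof = len(class0) + len(class1) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof

    # w = inverse(covariance) * (m1 - m0), using the closed-form 2x2 inverse
    det = sxx * syy - sxy * sxy
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    w = ((syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det)

    # With equal priors the boundary passes through the midpoint of the class means
    threshold = w[0] * (m0[0] + m1[0]) / 2 + w[1] * (m0[1] + m1[1]) / 2
    return w, threshold

def predict_lda(model, point):
    w, threshold = model
    return 1 if w[0] * point[0] + w[1] * point[1] > threshold else 0

# Hypothetical two-feature vectors for two groups
class0 = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)]
class1 = [(5.0, 6.0), (6.0, 5.0), (5.5, 6.5), (6.5, 5.5)]

model = fit_lda(class0, class1)
print(predict_lda(model, (2.0, 2.0)), predict_lda(model, (6.0, 6.0)))
```

Training reduces to computing means and one covariance matrix, which is why the method is computationally cheap; the resulting boundary is always a straight line (or hyperplane), which is the limitation noted above.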
Neural networks
Neural networks are ‘an assembly of several artificial neurons which enables to produce nonlinear decision boundaries’.
Neural networks perform better than linear discriminant analysis for EEG classification, since they can implement nonlinear decision boundaries. Nevertheless, to reach the desired level of accuracy it is important to choose a suitable number of hidden units, which can be problematic. Having more hidden units than required leads the network to memorise the training set, which causes poor generalisation.
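The role of hidden units in producing a nonlinear boundary can be illustrated with the XOR problem, which no single linear boundary can solve. The sketch below uses two hidden units with hand-set weights and a step activation; nothing is learned from data here, and the function names are assumptions chosen for the example.

```python
def step(value):
    """Threshold activation: 1 if the weighted input is positive, else 0."""
    return 1 if value > 0 else 0

def tiny_network(x1, x2):
    """Two hidden units and one output unit with fixed, hand-chosen weights."""
    hidden_or = step(x1 + x2 - 0.5)     # fires if at least one input is 1
    hidden_and = step(x1 + x2 - 1.5)    # fires only if both inputs are 1
    return step(hidden_or - hidden_and - 0.5)

outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
```

Two hidden units are sufficient for this problem; adding many more would not improve it and, when weights are learned from limited data, is the situation in which memorisation of the training set becomes a risk.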
Support vector machines
Support vector machine is a supervised learning method that can handle both linear and nonlinear classification. It produces a hyper-plane with the maximal margin to the support vectors. Support vector machine can classify even overlapping and non-separable data sets by mapping them onto higher-dimensional spaces using kernel functions [34, 63].
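The margin-maximising idea can be sketched with a linear soft-margin SVM trained by sub-gradient descent on the regularised hinge loss. This is a simplified illustration under stated assumptions: it covers only the linear case (no kernel mapping), the function names, hyperparameters and toy data are hypothetical, and a real study would normally rely on an established SVM library.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by full-batch sub-gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            # Only points inside the margin (or misclassified) contribute to the hinge term
            if y * (w[0] * x1 + w[1] * x2 + b) < 1:
                grad_w[0] -= y * x1 / n
                grad_w[1] -= y * x2 / n
                grad_b -= y / n
        w[0] -= lr * grad_w[0]
        w[1] -= lr * grad_w[1]
        b -= lr * grad_b
    return w, b

def predict_svm(model, point):
    w, b = model
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Hypothetical two-feature vectors with labels +1 and -1
points = [(2.0, 2.0), (3.0, 1.0), (2.0, 3.0), (-2.0, -2.0), (-1.0, -3.0), (-3.0, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

svm_model = train_linear_svm(points, labels)
print([predict_svm(svm_model, p) for p in points])
```

The points that end up on or inside the margin are the ones that keep contributing to the updates, which mirrors the role of support vectors in determining the final hyper-plane.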
Popular classification technique
From the comparison of the popular classification algorithms for EEG signals, it can be concluded that support vector machine (SVM) is the preferable choice.
SVM has been used in past research for many EEG signal classification tasks. Successful results have been obtained in classifying mental tasks, seizure detection [34, 35], discrimination between dyslexics and non-dyslexics [39, 45], epilepsy diagnosis, vigilance analysis, etc.
Further research [63, 64] has recommended support vector machines as a more appropriate choice for EEG signal classifications. Recent EEG-related studies [68,69,70] have been able to obtain good validation accuracies using SVM classifiers.
Dyslexia is a disability with a neurological origin, affecting a significant proportion of the population, which causes difficulties in reading and writing despite average intelligence. It is a heritable condition, but not a disease or defect that can be cured; rather, it is a state that can be helped with properly targeted assistance. Research has shown distinctions in the brainwave patterns and brain structures of dyslexics compared to non-dyslexics. Though dyslexia has a neurological origin, the conventional dyslexia detection techniques are often based on behavioural aspects such as reading, writing, intelligence quotient (IQ) and memory abilities.
Many researchers have attempted to introduce and improve EEG-based pattern classification frameworks for dyslexia. This review paper has identified the pros and cons of existing frameworks. The frameworks were reviewed against four criteria: data collection, pre-processing, analysis and classification. The review revealed that frameworks require a minimum of 15 subjects per group, that studies can be conducted on children or adults, and that comparisons between female and male dyslexic brainwave patterns still need to be conducted. It is also important to identify the inclusion and exclusion criteria prior to data collection to minimise the number of outliers.
It was found that the experiments often used simple tasks, which measure working memory and reasoning abilities rather than reading and writing abilities. This may be in order to reduce the unwanted artefacts caused by body movements in the EEG signals during reading and writing activities. We have proposed using ASR, a method that has been used successfully in recent studies, to filter out body movement and muscle burst artefacts from the EEG signals [57, 58]. Finally, we have proposed additional input features and recommended SVM as the classifier to be used in EEG-based pattern classification frameworks for dyslexia.
Zerbin-Rüdin E (1967) Kongenitale wortblindheit oder spezifische dyslexie (congenital word-blindness). Ann Dyslexia 17(1):47–54
Fletcher JM, Lyon GR, Fuchs LS, Barnes MA (2006) Learning disabilities: from identification to intervention. Guilford Press, New York
dyslexiaSpeld (2014) Consultations/assessment—tests for dyslexia
Mohamad S, Mansor W, Lee KY (2013) Review of neurological techniques of diagnosing dyslexia in children. In: 2013 IEEE 3rd international conference on system engineering and technology (ICSET), 19–20 Aug 2013, pp 389–393. https://doi.org/10.1109/icsengt.2013.6650206
Elnakib A, El-Baz A, Casanova MF, Switala AE (2010) Dyslexia diagnostics by centerline-based shape analysis of the corpus callosum. In: 2010 20th international conference on pattern recognition (ICPR), 23–26 Aug 2010, pp 261–264. https://doi.org/10.1109/icpr.2010.73
El-Baz A, Casanova M, Gimel’farb G, Mott M, Switala A, Vanbogaert E, McCracken R (2008) Dyslexia diagnostics by 3D texture analysis of cerebral white matter gyrifications. In: 19th international conference on pattern recognition, 2008. ICPR 2008, 8–11 Dec 2008, pp 1–4. https://doi.org/10.1109/icpr.2008.4760971
Prado C, Dubois M, Valdois S (2007) The eye movements of dyslexic children during reading and visual search: impact of the visual attention span. Vis Res 47(19):2521–2530. https://doi.org/10.1016/j.visres.2007.06.001
Macas M, Lhotska L, Novak D (2013) Hidden Markov models for analysis of eye movements of dyslexic children. In: 2013 18th international conference on digital signal processing (DSP), 1–3 July 2013, pp 1–5. https://doi.org/10.1109/icdsp.2013.6622783
Bellocchi S, Muneaux M, Bastien-Toniazzo M, Ducrot S (2013) I can read it in your eyes: what eye movements tell us about visuo-attentional processes in developmental dyslexia. Res Dev Disabil 34(1):452–460. https://doi.org/10.1016/j.ridd.2012.09.002
Sahari SH, Johari A (2012) Improvising reading classes and classroom environment for children with reading difficulties and dyslexia symptoms. Procedia 38:100–107. https://doi.org/10.1016/j.sbspro.2012.03.329
Shalev L, Mevorach C, Humphreys GW (2008) Letter position coding in attentional dyslexia. Neuropsychologia 46(8):2145–2151. https://doi.org/10.1016/j.neuropsychologia.2008.02.022
Démonet J-F, Taylor MJ, Chaix Y (2004) Developmental dyslexia. Lancet 363(9419):1451–1460. https://doi.org/10.1016/S0140-6736(04)16106-0
Gvion A, Friedmann N (2010) Letter position dysgraphia. Cortex 46(9):1100–1113. https://doi.org/10.1016/j.cortex.2009.09.008
Shaywitz SE, Shaywitz BA (2005) Dyslexia (specific reading disability). Biol Psychiat 57(11):1301–1309. https://doi.org/10.1016/j.biopsych.2005.01.043
Francks C, MacPhie IL, Monaco AP (2002) The genetic basis of dyslexia. Lancet Neurol 1(8):483–490. https://doi.org/10.1016/S1474-4422(02)00221-1
Eicher JD, Gruen JR (2013) Imaging-genetics in dyslexia: connecting risk genetic variants to brain neuroimaging and ultimately to reading impairments. Mol Genet Metab 110(3):201–212. https://doi.org/10.1016/j.ymgme.2013.07.001
Goez H, Zelnik N (2008) Handedness in patients with developmental coordination disorder. J Child Neurol 23(2):151–154. https://doi.org/10.1177/0883073807307978
Satz P (1987) Left-handedness and dyslexia: an old myth revisited. J Pediatr Psychol 12(2):291–298
Tonnessen FE, Lokken A, Hoien T, Lundberg I (1993) Dyslexia, left-handedness, and immune disorders. Arch Neurol 50(4):411–416
Shaywitz SE (2003) Overcoming dyslexia: a new and complete science-based program for reading problems at any level, 1st edn. Knopf, New York
dyslexiaSpeld (2014) Learning difficulties and learning disabilities—dsf.net.au
de Santana VF, de Oliveira R, Almeida LDA, Baranauskas MCC (2012) Web accessibility and people with dyslexia: a survey on techniques and guidelines. In: Proceedings of the international cross-disciplinary conference on web accessibility, 2012. ACM, p 35
Sklar B, Hanley J, Simmons WW (1973) A computer analysis of EEG spectral signatures from normal and dyslexic children. IEEE Trans Biomed Eng BME-20(1):20–26. https://doi.org/10.1109/TBME.1973.324247
Zakopoulou V, Anagnostopoulou A, Christodoulides P, Stavrou L, Sarri I, Mavreas V, Tzoufi M (2011) An interpretative model of early indicators of specific developmental dyslexia in preschool age: a comparative presentation of three studies in Greece. Res Dev Disabil 32(6):3003–3016. https://doi.org/10.1016/j.ridd.2011.03.021
Haddadian F, Alipourb V, Majidi A, Maleki H (2012) The effectiveness of self-instruction technique on improvement of reading performance and reduction of anxiety in primary school students with dyslexia. Procedia 46:5366–5370. https://doi.org/10.1016/j.sbspro.2012.06.440
Suárez-Coalla P, Cuetos F (2015) Reading difficulties in Spanish adults with dyslexia. Ann Dyslexia 65(1):33–51. https://doi.org/10.1007/s11881-015-0101-3
Davis R (2010) The gift of dyslexia: why some of the brightest people can’t read and how they can learn. Souvenir Press Ltd, London
Laasonen M, Väre J, Oksanen-Hennah H, Leppämäki S, Tani P, Harno H, Hokkanen L, Pothos E, Cleeremans A (2014) Project DyAdd: implicit learning in adult dyslexia and ADHD. Ann Dyslexia 64(1):1–33. https://doi.org/10.1007/s11881-013-0083-y
Ekhsan HM, Ahmad SZ, Halim SA, Hamid JN, Mansor NH (2012) The implementation of interactive multimedia in early screening of dyslexia. In: 2012 international conference on innovation management and technology research (ICIMTR), 21–22 May 2012, pp 566–569. https://doi.org/10.1109/icimtr.2012.6236459
Nunez PL, Srinivasan R (2006) Electric fields of the brain: the neurophysics of EEG. Oxford University Press, Oxford
Nunes TM, Coelho ALV, Lima CAM, Papa JP, de Albuquerque VHC (2014) EEG signal classification for epilepsy diagnosis via optimum path forest—a systematic assessment. Neurocomputing 136:103–123. https://doi.org/10.1016/j.neucom.2014.01.020
Plante DT, Goldstein MR, Landsness EC, Riedner BA, Guokas JJ, Wanger T, Tononi G, Benca RM (2013) Altered overnight modulation of spontaneous waking EEG reflects altered sleep homeostasis in major depressive disorder: a high-density EEG investigation. J Affect Disord 150(3):1167–1173. https://doi.org/10.1016/j.jad.2013.05.084
Silipo R, Deco G, Bartsch H (1999) Brain tumor classification based on EEG hidden dynamics. Intell Data Anal 3(4):287–306. https://doi.org/10.1016/S1088-467X(99)00024-4
Shantha Selva Kumari R, Prabin Jose J (2011) Seizure detection in EEG using time frequency analysis and SVM. In: 2011 international conference on emerging trends in electrical and computer technology (ICETECT), 23–24 March 2011, pp 626–630. https://doi.org/10.1109/icetect.2011.5760193
Chisci L, Mavino A, Perferi G, Sciandrone M, Anile C, Colicchio G, Fuggetta F (2010) Real-time epileptic seizure prediction using AR models and support vector machines. IEEE Trans Biomed Eng 57(5):1124–1132. https://doi.org/10.1109/TBME.2009.2038990
Ghayab HRA, Li Y, Abdulla S, Diykh M, Wan X (2016) Classification of epileptic EEG signals based on simple random sampling and sequential feature selection. Brain Informatics 3:1–7. https://doi.org/10.1007/s40708-016-0039-1
I have Dyslexia. What does it mean? (2014) The Yale center for dyslexia & creativity. http://dyslexia.yale.edu/Stu_whatisdyslexia.html
Arns M, Peters S, Breteler R, Verhoeven L (2007) Different brain activation patterns in dyslexic children: evidence from EEG power and coherence patterns for the double-deficit theory of dyslexia. J Integr Neurosci 6(1):175–190. https://doi.org/10.1142/S0219635207001404
Andreadis II, Giannakakis GA, Papageorgiou C, Nikita KS (2009) Detecting complexity abnormalities in dyslexia measuring approximate entropy of electroencephalographic signals. In: Annual international conference of the IEEE engineering in medicine and biology society, 2009. EMBC 2009, 3–6 Sept 2009, pp 6292–6295. https://doi.org/10.1109/iembs.2009.5332798
Giannakakis GA, Tsiaparas NN, Xenikou MFS, Papageorgiou C, Nikita KS (2008) Wavelet entropy differentiations of event related potentials in dyslexia. In: 8th IEEE international conference on bioinformatics and bioengineering, 2008. BIBE 2008, 8–10 Oct 2008, pp 1–6. https://doi.org/10.1109/bibe.2008.4696836
Che Wan Fadzal CWNF, Mansor W, Lee KY, Mohamad S, Amirin S (2012) Frequency analysis of EEG signal generated from dyslexic children. In: 2012 IEEE symposium on computer applications and industrial electronics (ISCAIE), 3–4 Dec 2012, pp 202–204. https://doi.org/10.1109/iscaie.2012.6482096
Che Wan Fadzal CWNF, Mansor W, Lee KY, Mohamad S, Mohamad N, Amirin S (2012) Comparison between characteristics of EEG signal generated from dyslexic and normal children. In: 2012 IEEE EMBS conference on biomedical engineering and sciences (IECBES), 17–19 Dec 2012, pp 943–946. https://doi.org/10.1109/iecbes.2012.6498210
Andrew Ng CR, Leong WY (2014) An EEG-based approach for left-handedness detection. Biomed Signal Process Control 10:92–101. https://doi.org/10.1016/j.bspc.2014.01.005
Provins KA, Cunliffe P (1972) The relationship between E.E.G. activity and handedness. Cortex 8(2):136–146. https://doi.org/10.1016/S0010-9452(72)80014-5
Frid A, Breznitz Z (2012) An SVM based algorithm for analysis and discrimination of dyslexic readers from regular readers using ERPs. In: 2012 IEEE 27th convention of electrical & electronics engineers in Israel (IEEEI), 14–17 Nov 2012, pp 1–4. https://doi.org/10.1109/eeei.2012.6377068
Karim I, Abdul W, Kamaruddin N (2013) Classification of dyslexic and normal children during resting condition using KDE and MLP. In: 2013 5th international conference on information and communication technology for the Muslim World (ICT4M), 26–27 March 2013, pp 1–5. https://doi.org/10.1109/ict4m.2013.6518886
Fuad N, Mansor W, Lee KY (2013) Wavelet packet analysis of EEG signals from children during writing. In: 2013 IEEE symposium on computers & informatics (ISCI), 7–9 April 2013, pp 228–230. https://doi.org/10.1109/isci.2013.6612408
Israel GD (1992) Determining sample size. University of Florida Cooperative Extension Service, Institute of Food and Agriculture Sciences, EDIS, Florida
Preparing the Experiment Room (2015)
Khazi M, Kumar A, Vidya M (2012) Analysis of EEG using 10: 20 electrode system. Int J Innovative Res Sci Eng Technol 1(2):185–191
Johnston SS, McDonnell AP, Hawken LS (2008) Enhancing outcomes in early literacy for young children with disabilities: strategies for success. Interv Sch Clin 43(4):210–217. https://doi.org/10.1177/1053451207310342
Ziegler JC, Castel C, Pech-Georgel C, George F, Alario FX, Perry C (2008) Developmental dyslexia and the dual route model of reading: simulating individual differences and subtypes. Cognition 107(1):151–178. https://doi.org/10.1016/j.cognition.2007.09.004
Shi-Yun S, Kai-Quan S, Chong Jin O, Wilder-Smith E, Xiao-Ping L (2009) Automatic EEG artifact removal: a weighted support vector machine approach with error correction. IEEE Trans Biomed Eng 56(2):336–344. https://doi.org/10.1109/TBME.2008.2005969
Turnip A, Junaidi E (2014) Removal artifacts from EEG signal using independent component analysis and principal component analysis. In: 2014 2nd international conference on technology, informatics, management, engineering, and environment (TIME-E), 19–21 Aug 2014, pp 296–302. https://doi.org/10.1109/time-e.2014.7011635
Bugli C, Lambert P (2007) Comparison between principal component analysis and independent component analysis in electroencephalograms modelling. Biometr J 49(2):312–327. https://doi.org/10.1002/bimj.200510285
Sabisch B, Hahne A, Glass E, von Suchodoletz W, Friederici A (2006) Auditory language comprehension in children with developmental dyslexia: evidence from event-related brain potentials. J Cognit Neurosci 18(10):1676–1695. https://doi.org/10.1162/jocn.2006.18.10.1676
Bulea TC, Prasad S, Kilicarslan A, Contreras-Vidal JL (2014) Sitting and standing intention can be decoded from scalp EEG recorded prior to movement execution. Front Neurosci 8:376. https://doi.org/10.3389/fnins.2014.00376
Mullen T, Kothe C, Chi YM, Ojeda A, Kerth T, Makeig S, Cauwenberghs G, Jung T-P (2013) Real-time modeling and 3D visualization of source dynamics and connectivity using wearable EEG. In: 2013 35th annual international conference of the IEEE engineering in medicine and biology society (EMBC), 2013. IEEE, pp 2184–2187. https://doi.org/10.1109/embc.2013.6609968
Akin M (2002) Comparison of wavelet transform and FFT methods in the analysis of EEG signals. J Med Syst 26(3):241–247. https://doi.org/10.1023/A:1015075101937
Page A, Turner J, Mohsenin T, Oates T (2014) Comparing raw data and feature extraction for seizure detection with deep learning methods. In: The 27th international flairs conference
Boashash B, Azemi G, Ali Khan N (2015) Principles of time–frequency feature extraction for change detection in non-stationary signals: applications to newborn EEG abnormality detection. Pattern Recogn 48(3):616–627. https://doi.org/10.1016/j.patcog.2014.08.016
Eslahi SV, Dabanloo NJ (2013) Fuzzy support vector machine analysis in EEG classification. Int Res J Appl Basic Sci 5:161–165
Garrett D, Peterson DA, Anderson CW, Thaut MH (2003) Comparison of linear, nonlinear, and feature selection methods for EEG signal classification. IEEE Trans Neural Syst Rehabil Eng 11(2):141–144. https://doi.org/10.1109/TNSRE.2003.814441
Lotte F, Congedo M, Lécuyer A, Lamarche F, Arnaldi B (2007) A review of classification algorithms for EEG-based brain–computer interfaces. J Neural Eng 4:R1–R13
Parvar H, Sculthorpe-Petley L, Satel J, Boshra R, D’Arcy RCN, Trappenberg TP (2014) Detection of event-related potentials in individual subjects using support vector machines. Brain Inf 2(1):1–12. https://doi.org/10.1007/s40708-014-0006-7
Hosni SM, Gadallah ME, Bahgat SF, AbdelWahab MS (2007) Classification of EEG signals using different feature extraction techniques for mental-task BCI. In: International conference on computer engineering & systems, 2007. ICCES’07, 27–29 Nov 2007. pp 220–226. https://doi.org/10.1109/icces.2007.4447052
Lei C, Jie L, Yaoru S, Huaping Z, Chungang Y (2010) EEG-based vigilance analysis by using fisher score and PCA algorithm. In: 2010 IEEE international conference on progress in informatics and computing (PIC), 10–12 Dec. 2010, pp 175–179. https://doi.org/10.1109/pic.2010.5687413
Jatupaiboon N, Pan-ngum S, Israsena P (2013) Emotion classification using minimal EEG channels and frequency bands. In: 2013 10th international joint conference on computer science and software engineering (JCSSE), 29–31 May 2013, pp 21–24. https://doi.org/10.1109/jcsse.2013.6567313
Zhang T, Chen W (2016) LMD based features for the automatic seizure detection of EEG signals using SVM. IEEE Trans Neural Syst Rehabil Eng PP(99):1–1. https://doi.org/10.1109/TNSRE.2016.2611601
Liu YH, Huang CW, Hsiao YT (2013) Comparsion of methods for a motor imagery-based two-state self-paced brain-computer interface. In: 2013 international conference on advanced robotics and intelligent systems (ARIS), May 31 2013–June 2 2013, pp 174–178. https://doi.org/10.1109/aris.2013.6573555
Bland M (2011) Sample size for clinical trials. University of York, York
HP carried out the literature review and drafting of the manuscript. MFS and KWW reviewed the manuscript. All authors read and approved the final manuscript.
Harshani Perera received her BSc (Hons) in Software Engineering from the University of Wales, United Kingdom, and proceeded with her career in software engineering. With her passion for research, she then chose to pursue a PhD in developing a detection mechanism of learning disabilities using intelligent data analysis and classification of brainwave signals and received her PhD in Information Technology from Murdoch University, Australia. Dr Perera’s entrepreneurial passion also motivated her to co-found Inqbaytor Pty Ltd and she currently continues to serve as the CEO.
Mohd Fairuz Shiratuddin is a Senior Lecturer in the School of Engineering and Information Technology. Early in his career, he trained as an engineer, mainly dealing with computers and their applications, in the United Kingdom and Malaysia. He then decided to pursue a career in academia. He graduated with a Bachelor of Engineering degree in Electrical & Electronics from Northumbria University at Newcastle-upon-Tyne in the UK, a Master of Science degree in Information Technology (by research specializing in Virtual Reality) from Universiti Utara, Malaysia, a Master of Science degree in Architecture from Virginia Tech, USA, and a Doctor of Philosophy degree in Environmental Design & Planning, also from Virginia Tech. Prior to working at Murdoch University, he was a Lecturer at Universiti Utara, Malaysia, and an Assistant Professor at the University of Southern Mississippi, USA. He considers himself a multi-disciplinary person with a strong appreciation for arts and design, and technical development. His research interests include computer-based systems for children’s teaching and learning, and stroke survivors’ rehabilitation, virtual/mixed/augmented reality, natural user interfaces, games design, development and technologies, and artificial intelligence, for practical, real-world uses. He is currently leading Project Neuromender: a low-cost home-based stroke rehabilitation system. He is also working on a project to utilize augmented reality, image processing and facial recognition technologies for security purposes. He has numerous publications in national and international conference proceedings, journals, books, book chapters and reports.
Kok Wai Wong is currently working as an Associate Professor with the School of Engineering and Information Technology at Murdoch University in Western Australia. He is a Senior Member of IEEE, a member of ACS, and Certified Professional of ACS. He is the current Vice-president (membership) for The Asia Pacific Neural Network Society (APNNS). He has also held executive positions such as the section chair for IEEE Western Australia Section and has also served as a member for the Emergent Technologies Technical Committee (ETTC) and Games Technical Committee (GTC) of the IEEE Computational Intelligence Society (CIS) in the past. His current research interests include intelligent data mining and data science, artificial intelligence and machine learning, and game and virtual reality technology.
The authors declare that they have no competing interests.
Perera, H., Shiratuddin, M.F. & Wong, K.W. Review of EEG-based pattern classification frameworks for dyslexia. Brain Inf. 5, 4 (2018). https://doi.org/10.1186/s40708-018-0079-9
- Feature extraction
- Artefact removal
- Artefact subspace reconstruction
- Support vector machine