
Data selection in EEG signals classification


Alcoholism can be detected by analyzing electroencephalogram (EEG) signals. However, analyzing multi-channel EEG signals is challenging and often requires complicated calculations and long execution times. This paper proposes three data selection methods to extract representative data from the EEG signals of alcoholics: principal component analysis based on graph entropy (PCA-GE), channel selection based on graph entropy (GE) difference, and mathematical combination channel selection. For comparison, the data selected by each method are classified separately by three classifiers: the J48 decision tree, the K-nearest neighbor and the Kstar. The experimental results show that the proposed methods select data without compromising the accuracy of discriminating the EEG signals of alcoholics from those of non-alcoholics. Among them, the PCA-GE method uses only 29.69% of the whole data and 29.5% of the computation time but achieves a classification accuracy of 94.5%. The channel selection method based on the GE difference achieves a classification accuracy of 91.67% using only 29.69% of the original data. Using as little data as possible without sacrificing the final classification accuracy is useful for the design of online EEG analysis and classification applications.
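To make the idea of channel selection by GE difference concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not define graph entropy, so the sketch assumes it is the Shannon entropy of the degree distribution of each channel's horizontal visibility graph (HVG, listed in the keywords). The function names `hvg_degrees`, `graph_entropy` and `select_channels`, and the ranking by absolute difference of group means, are illustrative assumptions.

```python
import math
from collections import Counter


def hvg_degrees(series):
    """Node degrees of the horizontal visibility graph of a 1-D series.

    Samples i < j are linked when every sample strictly between them
    is lower than both endpoints.
    """
    n = len(series)
    degrees = [0] * n
    for i in range(n - 1):
        highest_between = -math.inf
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                degrees[i] += 1
                degrees[j] += 1
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # sample i cannot see past a sample at least as high
    return degrees


def graph_entropy(series):
    """ASSUMPTION: GE is the Shannon entropy (bits) of the HVG degree distribution."""
    degrees = hvg_degrees(series)
    counts = Counter(degrees)
    n = len(degrees)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_channels(group_a, group_b, keep):
    """Return indices of the `keep` channels whose mean GE differs most.

    group_a, group_b: lists of trials; each trial is a list of channels,
    each channel a list of samples. The ranking rule is an assumption.
    """
    def mean_ge(trials, ch):
        return sum(graph_entropy(trial[ch]) for trial in trials) / len(trials)

    n_channels = len(group_a[0])
    diffs = [abs(mean_ge(group_a, ch) - mean_ge(group_b, ch))
             for ch in range(n_channels)]
    ranked = sorted(range(n_channels), key=lambda ch: diffs[ch], reverse=True)
    return sorted(ranked[:keep])
```

With real recordings, `group_a` and `group_b` would hold the trials of the two subject groups, and `keep` would set how many channels survive selection. Only the retained channels would then be passed to a classifier.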



Author information



Corresponding author

Correspondence to Shuaifang Wang.


About this article


Cite this article

Wang, S., Li, Y., Wen, P. et al. Data selection in EEG signals classification. Australas Phys Eng Sci Med 39, 157–165 (2016).



  • EEG
  • Data selection
  • Horizontal visibility graph (HVG)
  • Principal component analysis (PCA)