Journal of Medical Systems, Volume 36, Issue 4, pp 2141–2147

SVM Feature Selection Based Rotation Forest Ensemble Classifiers to Improve Computer-Aided Diagnosis of Parkinson Disease

ORIGINAL PAPER

Abstract

Parkinson disease (PD) is an age-related deterioration of certain nerve systems that affects patients' movement, balance, and muscle control. PD is a common disease, affecting 1% of people older than 60 years. To improve PD diagnosis, we present a new classification scheme in which features selected by a support vector machine (SVM) are used to train rotation forest (RF) ensemble classifiers. The dataset contains voice measurement records from 31 people, 23 of whom have PD, and each record is described by 22 features. The diagnosis model first uses a linear SVM to select the ten most relevant of the 22 features. In the second step, six different classifiers are trained on this feature subset. In the third step, the accuracies of the classifiers are improved by applying the RF ensemble classification strategy. The experimental results are evaluated using three metrics: classification accuracy (ACC), Kappa Error (KE), and area under the receiver operating characteristic (ROC) curve (AUC). The performance of two base classifiers, KStar and IBk, showed a clear increase in PD diagnosis accuracy compared with similar studies in the literature. Overall, applying the RF ensemble classification scheme significantly improved PD diagnosis in five of the six classifiers. The RF ensemble of the IBk algorithm (a K-nearest neighbor variant) achieved about 97% accuracy, which is high performance for Parkinson disease diagnosis.
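The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes that the kappa measure refers to Cohen's kappa statistic and that AUC is computed as the pairwise ranking probability; the function names, labels, and scores below are hypothetical toy values, not data or code from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_chance == 1.0:
        return 1.0  # degenerate case: only one label appears in both lists
    return (p_observed - p_chance) / (1.0 - p_chance)


def auc(y_true, scores, positive=1):
    """Probability that a random positive record outscores a random negative one.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = PD, 0 = healthy (not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]

print("ACC:", accuracy(y_true, y_pred))
print("Kappa:", cohen_kappa(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Reporting kappa and AUC alongside accuracy is useful for a dataset like this one, where 23 of 31 subjects have PD: a classifier that always predicts the majority class can reach high accuracy while its kappa stays at zero.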

Keywords

Rotation forest; Ensemble classification; Parkinson; Breast cancer; Diabetes; Feature selection; Support vector machine; Computer aided diagnosis


Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Gaziantep Vocational School of Higher Education, Computer Programming Division, University of Gaziantep, Gaziantep, Turkey
