Evolving Systems, Volume 10, Issue 1, pp 29–39

ILIOU machine learning preprocessing method for depression type prediction

  • Theodoros Iliou
  • Georgia Konstantopoulou
  • Mandani Ntekouli
  • Christina Lymperopoulou
  • Konstantinos Assimakopoulos
  • Dimitrios Galiatsatos
  • George Anastassopoulos (corresponding author)
Original Paper


Abstract

The main objective of this study was to find a data preprocessing method that boosts the prediction performance of machine learning algorithms on datasets of patients with mental illness. Specifically, machine learning methods must achieve near-perfect classification results for patients with depression so that appropriate treatment can begin as early as possible. In this paper, we establish the ILIOU data preprocessing method for depression type detection. The performance of the ILIOU data preprocessing method and of the principal component analysis preprocessing method was evaluated using tenfold cross-validation with seven machine learning classification algorithms: the nearest-neighbour classifier (IB1), the C4.5 algorithm implementation (J48), random forest, multilayer perceptron (MLP), support vector machine (SMO), JRIP and fuzzy logic (FURIA). The classification results are presented and compared analytically. The experimental results reveal that the transformed dataset, with the new features obtained by applying the ILIOU preprocessing method to the original dataset, achieved 100% classification–prediction performance with the classification algorithms. The ILIOU data preprocessing method can therefore be used to significantly boost the performance of classification algorithms on similar datasets, and can be used for depression type prediction.
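The evaluation protocol named in the abstract, tenfold cross-validation of a classifier, can be illustrated with a minimal sketch. The code below is not the authors' pipeline and does not implement the ILIOU method. It assumes a plain (non-stratified) tenfold split, a one-nearest-neighbour classifier standing in for IB1, and a hypothetical toy dataset; the names `one_nn_predict`, `tenfold_accuracy` and `toy_data` are illustrative only.

```python
import random


def one_nn_predict(train, query):
    """Return the label of the training point closest to `query` (1-NN, squared Euclidean distance)."""
    nearest = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    return nearest[1]


def tenfold_accuracy(data, k=10, seed=0):
    """Estimate accuracy by k-fold cross-validation over (features, label) pairs.

    Assumption: a simple shuffled split, not the stratified split some toolkits use.
    """
    rows = list(data)
    random.Random(seed).shuffle(rows)
    # Fold i takes every k-th row starting at offset i, so fold sizes differ by at most one.
    folds = [rows[i::k] for i in range(k)]
    correct = 0
    for i, test_fold in enumerate(folds):
        # Train on all folds except the held-out one.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct += sum(one_nn_predict(train, x) == y for x, y in test_fold)
    return correct / len(rows)


# Hypothetical toy data: two well-separated 5x5 grids of points, one per class.
# This is NOT the study's dataset; it only exercises the procedure.
toy_data = [((i * 0.1, j * 0.1), "A") for i in range(5) for j in range(5)]
toy_data += [((3 + i * 0.1, 3 + j * 0.1), "B") for i in range(5) for j in range(5)]

accuracy = tenfold_accuracy(toy_data)
print(f"Tenfold cross-validated accuracy: {accuracy:.2f}")
```

Each sample is used for testing exactly once and for training in the other nine rounds, so the reported figure is an average over held-out predictions. The toy classes are separable by construction, so the score here says nothing about performance on clinical data.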


Keywords

Data preprocessing; ILIOU data preprocessing method; Principal component analysis; Machine learning; Classification; Feature selection; Depression; Mental illness



Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  • Theodoros Iliou (1)
  • Georgia Konstantopoulou (2)
  • Mandani Ntekouli (3)
  • Christina Lymperopoulou (1)
  • Konstantinos Assimakopoulos (4)
  • Dimitrios Galiatsatos (1)
  • George Anastassopoulos (1, corresponding author)

  1. Medical Informatics Laboratory, Medical School, Democritus University of Thrace, Thrace, Greece
  2. Special Office for Health Consulting Services, University of Patras, Patras, Greece
  3. Wire Communications Laboratory, Department of Electrical Engineering, University of Patras, Patras, Greece
  4. Department of Psychiatry, University of Patras, Patras, Greece
