Abstract
Early diagnosis of disease can save lives. Several diagnostic techniques exist, most of which combine a classification component with an optimization component. Although these techniques have specific advantages, they also have significant disadvantages: sensitivity to the number of features (symptoms) and the resulting need for feature selection, difficulty detecting non-integrated (disconnected) regions of a single class, and high procedural complexity. To address these disadvantages, this paper proposes a novel classification approach to disease diagnosis that uses different numbers of hyper-plane classifiers (HPCs) to divide medical data into regions, with a binary code assigned to each region. The HPC can find useful relationships between the symptoms of a disease by tagging each region with a suitable class label. To optimize the HPC coefficients and improve diagnosis, chemical reaction optimization (CRO) is adapted: the coefficients are encoded as molecular structures, and four reactions operate on them. Experiments are run with different numbers of HPCs and their results are compared. Notably, five hyper-planes achieve a diagnosis error of 0.000% on the test data of all investigated medical data sets. The best results obtained by CRO-HPC are also compared with the best outputs of more than 50 disease-diagnosis methods from the state-of-the-art literature. This comparison shows that CRO-HPC's diagnosis errors are competitive with those of the majority of other diagnostic methods.
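The region-coding idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact encoding or labelling rule, so the following are assumptions made for illustration only: bit k of a region's code is 1 when the sample lies on the non-negative side of hyper-plane k, and each occupied region is tagged with the majority class of the training samples that fall in it. The function names `region_code`, `label_regions`, and `error_rate` are hypothetical, and the CRO search over coefficients is not shown.

```python
from collections import Counter, defaultdict

def region_code(x, planes):
    """Binary code of the region containing x (assumed rule, not from the paper).

    Each plane is a pair (w, b); bit k is 1 when w.x + b >= 0, else 0.
    """
    return tuple(
        1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
        for w, b in planes
    )

def label_regions(X, y, planes):
    """Tag each occupied region with the majority class of its samples."""
    votes = defaultdict(Counter)
    for x, label in zip(X, y):
        votes[region_code(x, planes)][label] += 1
    return {code: c.most_common(1)[0][0] for code, c in votes.items()}

def error_rate(X, y, planes, region_labels, default=None):
    """Fraction of samples whose region tag differs from the true label."""
    wrong = sum(
        region_labels.get(region_code(x, planes), default) != label
        for x, label in zip(X, y)
    )
    return wrong / len(X)

# Toy data: class 1 occupies two disconnected quadrants (XOR-like layout).
X = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
y = [1, 1, 0, 0]
# Two hand-picked hyper-planes: the lines x1 = 0 and x2 = 0.
planes = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]

region_labels = label_regions(X, y, planes)
toy_error = error_rate(X, y, planes, region_labels)
```

With two hyper-planes the toy data fall into four coded regions, and the two disconnected regions of class 1 receive the same label even though no single hyper-plane separates the classes. An optimizer such as CRO would, under these assumptions, adjust the `(w, b)` coefficients to reduce a quantity like `error_rate`.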
Ethics declarations
Conflict of interest
The authors declare that Somayeh Jalayeri is a graduate student of Islamic Azad University, Birjand Branch, and is affiliated with that university, and that Majid Abdolrazzagh-Nezhad is an Assistant Professor at Bozorgmehr University of Qaenat and the supervisor of the current research. Apart from the above, the authors declare that there is no conflict of interest and that the research was not funded by any grant.
Human participants and/or animals rights
This article does not contain any studies with human participants and/or animals performed by any of the authors. The medical data sets investigated for disease diagnosis were obtained from the UCI (UC Irvine) Machine Learning Repository; their details are presented in Appendix I.
Informed consent
The authors declare that informed consent was obtained from all individual participants included in the research.
Additional information
Communicated by V. Loia.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Jalayeri, S., Abdolrazzagh-Nezhad, M. Chemical reaction optimization to disease diagnosis by optimizing hyper-planes classifiers. Soft Comput 23, 13263–13282 (2019). https://doi.org/10.1007/s00500-019-03869-9
DOI: https://doi.org/10.1007/s00500-019-03869-9