
HColonies: a new hybrid metaheuristic for medical data classification

Applied Intelligence

Abstract

Medical data have a number of characteristics that make their classification a complex task. Yet the societal significance of the subject and the computational challenge it presents have made the classification of medical datasets a popular research area. A new hybrid metaheuristic is presented for classifying medical datasets. The hybrid ant–bee colonies algorithm (HColonies) consists of two phases: an ant colony optimization (ACO) phase and an artificial bee colony (ABC) phase. The food sources of the ABC are initialized as decision lists constructed during the ACO phase from different subsets of the training data. The task of the ABC is then to optimize these decision lists. New variants of the ABC operators are proposed to suit the classification task. Results on a number of real-world benchmark medical datasets show the usefulness of the proposed approach: the classification models obtained have good predictive accuracy and relatively small model size.
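The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the ACO construction procedure or the new ABC operators, so the sketch swaps in a trivial stand-in for the ACO phase (one fully specified rule per sampled training example) and a hypothetical neighbourhood move with greedy selection for the ABC phase, and it omits the onlooker and scout steps of a full ABC. All names (`predict`, `initial_list`, `neighbour`, `optimise`) and parameters are assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], str]   # (attribute values, class label)
Rule = Tuple[Dict[str, str], str]      # (conditions, predicted class)
DecisionList = List[Rule]              # ordered: the first matching rule fires


def predict(dlist: DecisionList, x: Dict[str, str], default: str) -> str:
    """Return the class of the first rule whose conditions all hold."""
    for conditions, label in dlist:
        if all(x.get(a) == v for a, v in conditions.items()):
            return label
    return default


def accuracy(dlist: DecisionList, data: List[Example], default: str) -> float:
    """Fraction of examples in `data` that the decision list classifies correctly."""
    return sum(predict(dlist, x, default) == y for x, y in data) / len(data)


def initial_list(subset: List[Example]) -> DecisionList:
    """Stand-in for the ACO phase (assumption): one specific rule per example."""
    return [(dict(x), y) for x, y in subset]


def neighbour(dlist: DecisionList, rng: random.Random) -> DecisionList:
    """Hypothetical ABC-style move: drop one condition or one whole rule."""
    new = [(dict(c), y) for c, y in dlist]
    i = rng.randrange(len(new))
    conditions, _ = new[i]
    if conditions and rng.random() < 0.5:
        del conditions[rng.choice(sorted(conditions))]
    elif len(new) > 1:          # never delete the last remaining rule
        del new[i]
    return new


def optimise(train: List[Example], n_sources: int = 3, cycles: int = 200,
             seed: int = 0) -> Tuple[DecisionList, str]:
    rng = random.Random(seed)
    labels = [y for _, y in train]
    default = max(sorted(set(labels)), key=labels.count)  # majority class
    # Each food source starts as a decision list built from a different subset.
    sources = [initial_list(rng.sample(train, k=max(1, len(train) // 2)))
               for _ in range(n_sources)]
    for _ in range(cycles):
        for i, current in enumerate(sources):
            candidate = neighbour(current, rng)
            # Greedy selection: keep the candidate if it is at least as accurate.
            if accuracy(candidate, train, default) >= accuracy(current, train, default):
                sources[i] = candidate
    # Prefer the most accurate list, breaking ties in favour of fewer rules.
    best = max(sources, key=lambda d: (accuracy(d, train, default), -len(d)))
    return best, default
```

Calling `optimise(train)` on a list of `(attributes, label)` pairs returns a decision list and a default class. Because candidates are accepted when accuracy does not drop, the search tends toward shorter lists, which loosely mirrors the small model size the abstract reports.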

Notes

  1. http://www.antminerplus.com/.

  2. http://www.cs.waikato.ac.nz/~ml/weka/.

  3. http://www.mathworks.com/products/matlab/.

  4. IBM Corp. Released 2011. IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp.

  5. http://weka.wikispaces.com/ARFF.

Acknowledgements

This research project has been supported by a grant from the “Research Center of the Center for Female Scientific and Medical Colleges”, Deanship of Scientific Research, King Saud University. The authors would like to thank the anonymous reviewers for their valuable and constructive comments. Special thanks to Dr. Joaquin Derrac Rus from Cardiff University, UK, for his assistance in using KEEL. Also, an honorable mention goes to Dr. Pedro J. García Laencina from Universidad Politécnica de Cartagena, Spain; Dr. Iñaki Inza from University of the Basque Country, Spain; Dr. Bahriye Basturk Akay from Erciyes University, Turkey; Dr. Pat Langley from Stanford University, California; and Dr. Sotos Kotsiantis from University of Patras, Greece, for providing useful information for the project.

Author information

Corresponding author

Correspondence to Sarab AlMuhaideb.

About this article

Cite this article

AlMuhaideb, S., Menai, M.E.B. HColonies: a new hybrid metaheuristic for medical data classification. Appl Intell 41, 282–298 (2014). https://doi.org/10.1007/s10489-014-0519-z

  • DOI: https://doi.org/10.1007/s10489-014-0519-z
