EvoloPy-FS: An Open-Source Nature-Inspired Optimization Framework in Python for Feature Selection

  • Ruba Abu Khurma
  • Ibrahim Aljarah
  • Ahmad Sharieh
  • Seyedali Mirjalili (email author)
Chapter
Part of the Algorithms for Intelligent Systems book series (AIS)

Abstract

Feature selection is a critical stage in the data mining process. There is a continuing arms race to build frameworks and libraries that ease and automate this process. This chapter proposes EvoloPy-FS, an open-source Python optimization framework that includes several well-regarded swarm intelligence (SI) algorithms and is geared toward feature selection optimization problems. The framework is easy to use, reusable, and adaptable. The objective of developing EvoloPy-FS is to provide a feature selection engine that helps researchers, even those with little knowledge of SI, solve their problems and visualize results quickly with less programming effort. For this reason, the work was oriented toward building an open-source, white-box framework in which algorithms and data structures are explicit, transparent, and publicly available. EvoloPy-FS continues our path toward an integrated optimization environment, which started with the original EvoloPy for global optimization problems, continued with EvoloPy-NN for training multilayer perceptron neural networks, and now adds EvoloPy-FS for feature selection optimization. EvoloPy-FS is freely hosted at www.evo-ml.com with helpful documentation.
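
To make the idea of SI-driven feature selection concrete, the following is a minimal sketch of a binary particle swarm search that uses an S-shaped (sigmoid) transfer function to turn continuous velocities into a 0/1 feature mask. It is not the EvoloPy-FS API: every name here (`sigmoid_transfer`, `toy_error_rate`, `fitness`, `binary_pso`), the weight `alpha`, and the swarm parameters are assumptions chosen for illustration, and the error function is a hypothetical stand-in for a real classifier's validation error.

```python
import math
import random


def sigmoid_transfer(v):
    """S-shaped transfer function: maps a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def toy_error_rate(mask):
    """Hypothetical stand-in for a classifier's validation error.

    Assumption: features 0 and 2 are the only relevant ones, and each
    missing relevant feature adds 0.25 to the error.
    """
    relevant = (0, 2)
    return 0.25 * sum(1 for i in relevant if mask[i] == 0)


def fitness(mask, alpha=0.99):
    """Weighted sum of error and fraction of selected features (lower is better)."""
    if sum(mask) == 0:
        return 1.0  # assumption: an empty subset gets the worst score
    return alpha * toy_error_rate(mask) + (1 - alpha) * sum(mask) / len(mask)


def binary_pso(n_features, n_particles=10, n_iters=30, seed=0):
    """Minimal binary PSO: returns the best mask found and its fitness."""
    rng = random.Random(seed)
    w, c1, c2, v_max = 0.7, 1.5, 1.5, 4.0  # illustrative parameter values
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                # Velocity update pulls toward personal and global bests.
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Transfer function turns the velocity into a bit.
                pos[i][d] = 1 if rng.random() < sigmoid_transfer(vel[i][d]) else 0
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

Calling `binary_pso(6)` returns a list of six 0/1 flags together with its fitness value. In a real wrapper setting, `toy_error_rate` would be replaced by the validation error of a classifier trained on the selected columns only.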

Keywords

Nature-inspired algorithm (NIA); Swarm intelligence (SI); Evolutionary algorithm (EA); Feature selection (FS); Transfer function (TF); Framework; Library; Optimization

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Ruba Abu Khurma (1)
  • Ibrahim Aljarah (1)
  • Ahmad Sharieh (1)
  • Seyedali Mirjalili (2, 3), email author
  1. King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan
  2. Torrens University Australia, Brisbane, Australia
  3. Griffith University, Brisbane, Australia
