Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines

Chapter in: Frontier Applications of Nature Inspired Computation

Part of the book series: Springer Tracts in Nature-Inspired Computing (STNIC)

Abstract

Nowadays, a large pool of machine learning components (i.e., algorithms and tools) exists that can predict decisions successfully in various problem domains. Unfortunately, it is difficult to estimate reliably which component will perform well on a particular dataset without extensive experimentation. Consequently, designers and developers must evaluate as many methods as possible during experimental work to establish which one is most appropriate for the specific problem. To address this challenge, researchers have proposed customized classification pipelines built on a framework of search algorithms, machine learning tools, and appropriate parameter settings that can operate independently of user knowledge. Until recently, the majority of these pipelines were constructed using genetic programming. In this chapter, a new method is proposed for evolving classification pipelines automatically, founded on stochastic nature-inspired population-based optimization algorithms. These algorithms act as a tool for modeling customized classification pipelines that consist of the following tasks: choosing the proper preprocessing method, selecting the appropriate classification tool, and optimizing the model's hyperparameters. The evaluation of the resulting pipelines also showed the potential of the proposed method for real-world use.
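To make the encoding concrete, the following is a minimal sketch of the idea, not the authors' implementation: a real-valued solution vector is decoded into a pipeline (one gene selects the preprocessing method, one selects the classifier, and the remaining genes set that classifier's hyperparameters), and a continuous optimizer searches this space using cross-validated accuracy as fitness. The component pools, the value ranges, and the use of SciPy's differential evolution are all illustrative assumptions.

```python
# Illustrative sketch (assumed components, not the chapter's exact method):
# decode a continuous vector into a classification pipeline and optimize
# it with a continuous population-based algorithm.
from scipy.optimize import differential_evolution
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hypothetical pool of preprocessing methods to choose from.
PREPROCESSORS = [StandardScaler, MinMaxScaler]

def decode(vector):
    """Map a vector in [0, 1]^4 to a concrete scikit-learn pipeline."""
    # Gene 0 selects the preprocessing method.
    idx = min(int(vector[0] * len(PREPROCESSORS)), len(PREPROCESSORS) - 1)
    if vector[1] < 0.5:  # gene 1 selects the classification tool
        clf = RandomForestClassifier(
            n_estimators=int(10 + vector[2] * 190),  # 10 .. 200 trees
            max_depth=int(2 + vector[3] * 18),       # depth 2 .. 20
            random_state=0)
    else:
        clf = SVC(C=10 ** (vector[2] * 4 - 2),       # C in 1e-2 .. 1e2
                  gamma=10 ** (vector[3] * 4 - 3))   # gamma in 1e-3 .. 1e1
    return Pipeline([("pre", PREPROCESSORS[idx]()), ("clf", clf)])

def fitness(vector):
    """Negative 5-fold cross-validated accuracy (lower is better)."""
    return -cross_val_score(decode(vector), X, y, cv=5).mean()

result = differential_evolution(fitness, bounds=[(0, 1)] * 4,
                                maxiter=10, popsize=8, seed=1)
print("best accuracy:", -result.fun)
print("best pipeline:", decode(result.x))
```

Because every design decision is mapped onto the unit interval, the same fitness function works unchanged with any continuous population-based optimizer (e.g., particle swarm optimization), which is the flexibility the proposed method relies on.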

Corresponding author

Correspondence to Iztok Fister Jr.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

Cite this chapter

Fister, I., Zorman, M., Fister, D., Fister, I. (2020). Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines. In: Khosravy, M., Gupta, N., Patel, N., Senjyu, T. (eds) Frontier Applications of Nature Inspired Computation. Springer Tracts in Nature-Inspired Computing. Springer, Singapore. https://doi.org/10.1007/978-981-15-2133-1_13
