Optimization Techniques for Machine Learning

  • Souad Taleb Zouggar
  • Abdelkader Adla
Part of the Algorithms for Intelligent Systems book series (AIS)


This chapter outlines the fundamentals of machine learning and reviews the literature on the variety of optimization techniques used for machine learning and prediction models. These techniques concern optimization either for generating a single tree or for selection in homogeneous/heterogeneous ensembles. For ensemble selection, various evaluation functions are studied and combined with different search strategies. Comparisons with state-of-the-art methods are performed on benchmark datasets and medical applications designed to validate the different techniques. The critical review of currently available optimization techniques is followed by descriptions of machine learning applications. This study will help researchers avoid overlapping efforts and provide a basis for novice researchers.
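As a generic illustration of the ensemble selection problem the chapter surveys, the sketch below implements greedy forward selection (ensemble pruning) with majority-vote accuracy as the evaluation function. This is a minimal, self-contained example of the general technique only; it is not the authors' specific evaluation functions, and all names (`greedy_select`, `vote`, the toy data) are illustrative.

```python
# Sketch of greedy forward ensemble selection (ensemble pruning).
# Evaluation function: majority-vote accuracy on a validation set.
from collections import Counter

def vote(preds_list, i):
    """Majority vote of the selected members on validation example i."""
    return Counter(p[i] for p in preds_list).most_common(1)[0][0]

def accuracy(preds_list, y):
    """Fraction of validation examples the voted ensemble gets right."""
    return sum(vote(preds_list, i) == y[i] for i in range(len(y))) / len(y)

def greedy_select(member_preds, y, k):
    """Forward selection: repeatedly add the member whose inclusion
    yields the highest validation accuracy, until k members are chosen."""
    selected, remaining = [], list(range(len(member_preds)))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda j: accuracy(
                [member_preds[i] for i in selected] + [member_preds[j]], y
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: four members' predictions on a 5-example validation set.
y = [1, 0, 1, 1, 0]
preds = [
    [1, 0, 1, 0, 0],  # member 0
    [0, 0, 1, 1, 0],  # member 1
    [1, 1, 0, 1, 1],  # member 2 (weak)
    [1, 0, 1, 1, 1],  # member 3
]
chosen = greedy_select(preds, y, k=2)
```

Other path methods surveyed in the chapter (backward elimination, hill climbing) differ only in how candidates are enumerated; other evaluation functions (diversity or combined diversity-accuracy measures) would replace `accuracy` in the `key` argument.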


Machine learning · Decision trees · Ensemble methods · Heterogeneous ensembles · Bagging · Diversity measure · Performance measure · Ensemble selection


  1. Bzdok D, Altman N, Krzywinski M (2018) Statistics versus machine learning. Nat Methods 15(4)
  2. Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth International Group
  3. Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann
  4. Kodratoff Y (1998) Technique et outils de l'extraction de connaissances à partir de données. Université Paris-Sud, Revue SIGNAUX (92)
  5. Boser BE, Guyon IM, Vapnik VN (1992) A training algorithm for optimal margin classifiers. In: 5th annual workshop on computational learning theory. ACM, Pittsburgh, pp 144–152
  6. Kim J, Pearl J (1987) CONVINCE: a conversational inference consolidation engine. IEEE Trans Syst Man Cybern 17:120–132
  7. Sebag M (2001) Apprentissage automatique, quelques acquis, tendances et défis. L.M.S, École Polytechnique
  8. Denis F, Gilleron R (1996) Notes de cours sur l'apprentissage automatique. Université de Lille
  9. Kodratoff Y (1997) L'extraction de connaissance à partir de données: un nouveau sujet pour la recherche scientifique. Revue électronique READ
  10. Simon H (1983) Why should machines learn? In: Machine learning: an artificial intelligence approach, vol 1
  11. Carbonell JG (1983) Learning by analogy: formulating and generalizing plans from past experience. In: Michalski RS, Carbonell JG, Mitchell TM (eds) Machine learning: an artificial intelligence approach. Tioga Press, Palo Alto, CA
  12. Langley P, Simon HA (1995) Applications of machine learning and rule induction. Technical Report 95-1, Institute for the Study of Learning and Expertise
  13. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
  14. Denis F, Gilleron R (1997) Apprentissage à partir d'exemples. Université Charles de Gaulle, Lille 3
  15. Dayan P, Sahani M, Deback G (1999) Unsupervised learning. In: Wilson RA, Keil F (eds) The MIT encyclopedia of the cognitive sciences
  16. Mitchell T (1997) Machine learning. McGraw-Hill, McGraw-Hill Series in Computer Science (Artificial Intelligence)
  17. Taleb Zouggar S, Adla A (2013) On generating and simplifying decision trees using tree automata models. INFOCOMP J 12(2):32–43
  18. Morgan JN, Sonquist JA (1963) Problems in the analysis of survey data, and a proposal. J Am Stat Assoc 58:415–434
  19. Kass G (1980) An exploratory technique for investigating large quantities of categorical data. Appl Stat 29(2):119–127
  20. Friedman JH (1977) A recursive partitioning decision rule for nonparametric classification. IEEE Trans Comput 26(4):404–408
  21. Partalas I, Tsoumakas G, Vlahavas I (2012) A study on greedy algorithms for ensemble pruning. Technical Report TR-LPIS-360-12, Dept. of Informatics, Aristotle University of Thessaloniki, Greece
  22. Taleb Zouggar S, Adla A (2017) Proposal for measuring quality of decision trees partition. Int J Decis Support Syst Technol 9(4):16–36
  23. Breiman L (1996) Heuristics of instability and stabilization in model selection. Ann Stat 24(6):2350–2383
  24. Freund Y, Schapire RE (1995) A decision-theoretic generalization of on-line learning and an application to boosting. In: 2nd European conference on computational learning theory, EuroCOLT '95. Springer-Verlag, pp 23–37
  25. Breiman L (2000) Randomizing outputs to increase prediction accuracy. Mach Learn 40:229–242
  26. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844
  27. Wolpert D (1992) Stacked generalization. Neural Netw 5:241–259
  28. Lewis-Beck MS, Bryman A, Liao TF (2004) Multi-strategy research. In: The SAGE encyclopedia of social science research methods
  29. Partalas I, Tsoumakas G, Vlahavas I (2010) An ensemble uncertainty aware measure for directed hill climbing ensemble pruning. Mach Learn 81:257–282
  30. Tsoumakas G, Partalas I, Vlahavas I (2009) An ensemble pruning primer. In: Okun O, Valentini G (eds) Applications of supervised and unsupervised ensemble methods. Springer-Verlag, pp 1–13
  31. Margineantu DD, Dietterich TG (1997) Pruning adaptive boosting. In: 14th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 211–218
  32. Yang Y, Korb K, Ting K, Webb G (2005) Ensemble selection for superparent-one-dependence estimators. In: AI 2005: advances in artificial intelligence, pp 102–112
  33. Martínez-Muñoz G, Suárez A (2006) Pruning in ordered bagging ensembles. In: 23rd international conference on machine learning (ICML 2006). ACM Press, New York, pp 609–616
  34. Bakker B, Heskes T (2003) Clustering ensembles of neural network models. Neural Netw 16(2):261–269
  35. Fu Q, Hu SX, Zhao SY (2005) Clustering-based selective neural network ensemble. J Zhejiang Univ Sci 6(5):387–392
  36. Zhou ZH, Tang W (2003) Selective ensemble of decision trees. In: 9th international conference on rough sets, fuzzy sets, data mining, and granular computing. Chongqing, China, pp 476–483
  37. Zhang Y, Burer S, Street WN (2006) Ensemble pruning via semi-definite programming. J Mach Learn Res 7:1315–1338
  38. Partalas I, Tsoumakas G, Vlahavas I (2012) A study on greedy algorithms for ensemble pruning. Technical Report TR-LPIS-360-12, LPIS, Aristotle University of Thessaloniki, Greece
  39. Taleb Zouggar S, Adla A (2018) A diversity-accuracy measure for homogenous ensemble selection. Int J Interact Multimedia Artif Intell (IJIMAI)
  40. Taleb Zouggar S, Adla A (2018) A new function for ensemble pruning. In: Dargam F, Delias P, Linden I, Mareschal B (eds) Decision support systems VIII: sustainable data-driven and evidence-based decision support. 4th international conference, ICDSST 2018, Heraklion, Greece, May 22–25, 2018, proceedings. LNBIP, Springer International Publishing AG
  41. Kuncheva LI, Whitaker CJ (2003) Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach Learn 51:181–207
  42. Taleb Zouggar S, Adla A (2018) EMnGA: entropy measure and genetic algorithms based method for heterogeneous ensembles selection. IDEAL 2:271–279
  43. Lallich S, Lenca P, Vaillant B (2007) Construction of an off-centered entropy for supervised learning. In: ASMDA, 8
  44. Breiman L (2001) Random forests. Mach Learn 45:5–32

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of Economics, Oran 2 University, Oran, Algeria
  2. Department of Computer Science, Oran 1 University, Oran, Algeria
