Hybrid Methodology Based on Bayesian Optimization and GA-PARSIMONY for Searching Parsimony Models by Combining Hyperparameter Optimization and Feature Selection

  • Francisco Javier Martinez-de-Pison
  • Ruben Gonzalez-Sendino
  • Alvaro Aldama
  • Javier Ferreiro
  • Esteban Fraile
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10334)


This paper presents a hybrid methodology that combines Bayesian Optimization (BO) with a constrained version of the GA-PARSIMONY method to obtain parsimonious models. The proposal is designed to reduce the computational effort associated with using GA-PARSIMONY alone. The method is initialized with BO to obtain favorable initial model parameters. With these parameters, a constrained GA-PARSIMONY is run to generate accurate parsimonious models through feature reduction, data transformation, and parsimonious model selection. Finally, a second BO pass is run with the selected features. Experiments with Extreme Gradient Boosting Machines (XGBoost) and six UCI databases demonstrate that the hybrid methodology obtains models analogous to those of GA-PARSIMONY alone, with a significant reduction in execution time on five of the six datasets.
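The three-stage pipeline described in the abstract (BO to seed hyperparameters, a constrained GA-PARSIMONY for feature selection, then a second BO pass on the selected features) can be sketched as follows. This is an illustrative toy, not the authors' implementation: the quadratic `model_error` function stands in for cross-validated XGBoost error, random sampling stands in for a real Gaussian-process BO loop, and a minimal genetic algorithm with a per-feature penalty stands in for the constrained GA-PARSIMONY. All names and constants are assumptions.

```python
# Hypothetical sketch of the hybrid BO + GA-PARSIMONY pipeline.
# A cheap quadratic "model error" replaces real CV'd XGBoost fitting,
# and random search replaces a real Bayesian-optimization loop.
import random

random.seed(0)

N_FEATURES = 8
RELEVANT = {0, 2, 5}  # features that actually matter (toy ground truth)

def model_error(hyper, features):
    """Toy stand-in for CV error: minimized near hyper = 0.3 with exactly
    the relevant features selected; extra features add a small cost."""
    missing = len(RELEVANT - set(features))
    extra = len(set(features) - RELEVANT)
    return (hyper - 0.3) ** 2 + 0.5 * missing + 0.05 * extra

def bo_stage(features, n_iter=30):
    """Stages 1 and 3: hyperparameter search with the feature set fixed.
    (Random search stands in for BO here.)"""
    return min((random.random() for _ in range(n_iter)),
               key=lambda h: model_error(h, features))

def ga_parsimony_stage(hyper, pop=20, gens=15):
    """Stage 2: evolve feature subsets around the BO hyperparameters,
    with a parsimony penalty that favors smaller subsets on near ties."""
    def fitness(mask):
        feats = [i for i, b in enumerate(mask) if b]
        return model_error(hyper, feats) + 0.01 * len(feats)

    population = [[random.random() < 0.5 for _ in range(N_FEATURES)]
                  for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness)
        survivors = population[:pop // 2]          # elitist selection
        children = []
        for _ in range(pop - len(survivors)):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_FEATURES)  # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(N_FEATURES)       # one-bit mutation
            child[i] = not child[i]
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return [i for i, b in enumerate(best) if b]

# Stage 1: BO over hyperparameters with all features, to seed the GA.
h0 = bo_stage(list(range(N_FEATURES)))
# Stage 2: constrained GA-PARSIMONY selects a parsimonious feature set.
selected = ga_parsimony_stage(h0)
# Stage 3: a second BO pass restricted to the selected features.
h_final = bo_stage(selected)
print(selected, round(model_error(h_final, selected), 4))
```

The point of the structure is the cost split: the expensive combinatorial search (feature subsets) only starts once BO has found hyperparameters that make each GA fitness evaluation informative, which is what lets the hybrid undercut the runtime of GA-PARSIMONY alone.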


Keywords: GA-PARSIMONY · Bayesian optimization · Hyperparameter optimization · Parsimony models · Genetic algorithms



We are greatly indebted to Banco Santander for the APPI16/05 fellowship and to the University of La Rioja for the EGI16/19 fellowship. This work used the Beronia cluster (Universidad de La Rioja), which is supported by FEDER-MINECO grant number UNLR-094E-2C-225.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Francisco Javier Martinez-de-Pison (1, corresponding author)
  • Ruben Gonzalez-Sendino (1)
  • Alvaro Aldama (1)
  • Javier Ferreiro (1)
  • Esteban Fraile (1)

  1. EDMANS Group, University of La Rioja, Logroño, Spain