Learning Variables Structure Using Evolutionary Algorithms to Improve Predictive Performance
Several previous works have shown that incorporating prior knowledge into machine learning models helps to overcome the curse of dimensionality in high-dimensional settings. However, most of these works are based on simple linear models (or variations of them) or assume that a pre-defined variable grouping structure is known in advance, which will not always be possible. This paper presents a hybrid genetic algorithm and machine learning approach that aims to learn the variable grouping structure during the model estimation process, thus taking advantage of the benefits of models based on problem-specific information without requiring any a priori information about the variable structure. This approach has been tested on four synthetic datasets, and its performance has been compared against two well-known reference models (LASSO and Group-LASSO). The results of the analysis showed that the proposed approach, called GAGL, considerably outperformed LASSO and performed as well as Group-LASSO in high-dimensional settings, with the added benefit of learning the variable grouping structure from data instead of requiring this information a priori before estimating the model.
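The general idea of searching over variable groupings with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' GAGL implementation: the integer encoding of groupings, the validation-error fitness, the crossover and mutation operators, the proximal-gradient Group-LASSO solver, and all names (`group_lasso_fit`, `fitness`, `evolve_groups`) and default parameters are assumptions chosen for illustration only.

```python
import numpy as np

def group_lasso_fit(X, y, groups, lam, n_iter=100):
    """Least squares with a group-lasso penalty, solved by proximal gradient.

    `groups` is an integer array assigning each column of X to a group.
    """
    n, p = X.shape
    w = np.zeros(p)
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of (1 / 2n) * ||y - Xw||^2, i.e. sigma_max(X)^2 / n.
    step = n / (np.linalg.norm(X, 2) ** 2)
    masks = [groups == g for g in np.unique(groups)]
    for _ in range(n_iter):
        z = w - step * (X.T @ (X @ w - y) / n)
        # Block soft-thresholding: shrink each group as a whole.
        for idx in masks:
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            z[idx] = 0.0 if norm <= thresh else z[idx] * (1 - thresh / norm)
        w = z
    return w

def fitness(groups, X_tr, y_tr, X_va, y_va, lam):
    """Validation mean squared error of a grouping (lower is better)."""
    w = group_lasso_fit(X_tr, y_tr, groups, lam)
    return float(np.mean((X_va @ w - y_va) ** 2))

def evolve_groups(X_tr, y_tr, X_va, y_va, n_groups=4, lam=0.1,
                  pop_size=12, n_gen=8, mut_rate=0.05, seed=0):
    """Toy genetic algorithm over group assignments (assumed encoding)."""
    rng = np.random.default_rng(seed)
    p = X_tr.shape[1]
    # Each individual is a vector of group labels, one label per variable.
    pop = rng.integers(0, n_groups, size=(pop_size, p))

    def score_all(population):
        return np.array([fitness(ind, X_tr, y_tr, X_va, y_va, lam)
                         for ind in population])

    for _ in range(n_gen):
        scores = score_all(pop)
        # Truncation selection: the better half survives unchanged.
        parents = pop[np.argsort(scores)[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            # Uniform crossover followed by random-reset mutation.
            child = np.where(rng.random(p) < 0.5, a, b)
            mutate = rng.random(p) < mut_rate
            child[mutate] = rng.integers(0, n_groups, size=mutate.sum())
            children.append(child)
        pop = np.vstack([parents, np.array(children)])

    scores = score_all(pop)
    best = int(np.argmin(scores))
    return pop[best], float(scores[best])
```

In this sketch, calling `evolve_groups` on a training/validation split returns the grouping with the lowest validation error in the final population together with that error. A fixed penalty `lam` and a fixed maximum number of groups are simplifications; in practice both would likely need tuning, for example by cross-validation.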
Keywords: Genetic Algorithms, Machine Learning, Prior knowledge, Optimization
Authors acknowledge support through grants RTI2018-098160-B-I00 and RTI2018-100754-B-I00 from the Spanish Ministerio de Ciencia, Innovación y Universidades, which include ERDF funds, and from project 202C1800003 (UIC Airbus).