Abstract
Hybrid ensembles, an essential branch of ensemble learning, have flourished in the regression field, and studies confirm the importance of diversity. However, previous ensembles introduce diversity only in the sub-model training stage, yielding limited improvement over single models. In contrast, this study automatically selects and weights sub-models from a heterogeneous model pool by solving an optimization problem with an interior-point filter line-search algorithm. The objective function innovatively incorporates negative correlation learning (NCL) as a penalty term, with which a diverse model subset can be selected. The best sub-models from each model class are selected to build the NCL ensemble, whose performance is better than the simple average and other state-of-the-art weighting methods. The NCL ensemble can be further improved with a regularization term in the objective function. In practice, model uncertainty makes it difficult to identify the optimal sub-model for a dataset a priori; regardless, our method achieves accuracy comparable to that of the potentially optimal sub-models. In conclusion, the value of this study lies in its ease of use and effectiveness, allowing the hybrid ensemble to embrace both diversity and accuracy.
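For intuition, here is a minimal sketch of the weighting step described above: it minimizes the ensemble's validation error plus an NCL-style diversity penalty over simplex-constrained weights. The function names, the penalty strength `lam`, and the use of SciPy's SLSQP solver (standing in for the interior-point filter line-search solver used in the paper) are illustrative assumptions, not the released implementation.

```python
import numpy as np
from scipy.optimize import minimize

def ncl_objective(w, preds, y, lam=0.5):
    """Ensemble MSE minus an NCL-style diversity reward (illustrative sketch).

    preds: (n_models, n_samples) sub-model predictions on a validation set
    w:     (n_models,) weights, constrained to the probability simplex
    lam:   assumed penalty strength; the accuracy/diversity trade-off
    """
    f_bar = w @ preds                          # weighted ensemble prediction
    mse = np.mean((f_bar - y) ** 2)            # accuracy term
    # NCL penalty: weighted spread of sub-models around the ensemble;
    # subtracting it rewards negatively correlated (diverse) sub-models
    diversity = np.mean(w @ (preds - f_bar) ** 2)
    return mse - lam * diversity

def fit_ncl_weights(preds, y, lam=0.5):
    """Solve for the weights; SLSQP stands in for an interior-point solver."""
    n = preds.shape[0]
    res = minimize(
        ncl_objective,
        x0=np.full(n, 1.0 / n),                # start from the simple average
        args=(preds, y, lam),
        bounds=[(0.0, 1.0)] * n,               # non-negative weights
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        method="SLSQP",
    )
    return res.x
```

Under this formulation, sub-models whose optimized weight shrinks toward zero are effectively deselected, so selection and weighting happen in one optimization step.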
Data availability
The datasets used in this paper are all from the Kaggle open platform.
Code availability
Our code is publicly available on GitHub (https://github.com/BaiyunBuaa/Hybrid-ensemble-based-on-Negative-Correlation-Learning); please feel free to try it.
Acknowledgements
We thank the reviewers for their time and effort in reviewing this manuscript. We are also grateful for all the valuable comments that helped improve the paper.
Funding
Yanfei Kang is supported by the National Natural Science Foundation of China (No. 72171011).
Author information
Authors and Affiliations
Contributions
YB: algorithm design, data experiments, and paper writing. GT: data experiments. YK: main theory, paper revision, and submission. SJ: paper revision and suggestions.
Corresponding author
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "A hybrid ensemble method with negative correlation learning for regression".
Ethical approval
This paper does not contain any studies with human participants or animals performed by any of the authors.
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent for publication
The authors confirm that this publication has been approved by all co-authors.
Additional information
Editor: Zhi-Hua Zhou.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bai, Y., Tian, G., Kang, Y. et al. A hybrid ensemble method with negative correlation learning for regression. Mach Learn 112, 3881–3916 (2023). https://doi.org/10.1007/s10994-023-06364-3