Abstract
Hybrid ensembles, an essential branch of ensemble learning, have flourished in the regression field, and studies confirm the importance of diversity. However, previous ensembles introduce diversity only in the sub-model training stage, yielding limited improvement over single models. In contrast, this study automatically selects and weights sub-models from a heterogeneous model pool by solving an optimization problem with an interior-point filter line-search algorithm. The objective function innovatively incorporates negative correlation learning (NCL) as a penalty term, with which a diverse model subset can be selected. The best sub-models from each model class are selected to build the NCL ensemble, whose performance is better than the simple average and other state-of-the-art weighting methods. The NCL ensemble can be further improved with a regularization term in the objective function. In practice, model uncertainty makes it difficult to identify the optimal sub-model for a dataset a priori; regardless, our method achieves accuracy comparable to that of the potentially optimal sub-models. In conclusion, the value of this study lies in its ease of use and effectiveness, allowing the hybrid ensemble to embrace both diversity and accuracy.
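For intuition, here is a minimal sketch of the weighting step described above: it minimizes the ensemble's validation error plus an NCL-style diversity penalty over simplex-constrained weights. The function names, the penalty strength `lam`, and the use of SciPy's SLSQP solver (standing in for the interior-point filter line-search solver used in the paper) are illustrative assumptions, not the released implementation.

```python
import numpy as np
from scipy.optimize import minimize

def ncl_objective(w, preds, y, lam=0.5):
    """Ensemble MSE minus an NCL-style diversity reward (illustrative sketch).

    preds: (n_models, n_samples) sub-model predictions on a validation set
    w:     (n_models,) weights, constrained to the probability simplex
    lam:   assumed penalty strength; the accuracy/diversity trade-off
    """
    f_bar = w @ preds                          # weighted ensemble prediction
    mse = np.mean((f_bar - y) ** 2)            # accuracy term
    # NCL penalty: weighted spread of sub-models around the ensemble;
    # subtracting it rewards negatively correlated (diverse) sub-models
    diversity = np.mean(w @ (preds - f_bar) ** 2)
    return mse - lam * diversity

def fit_ncl_weights(preds, y, lam=0.5):
    """Solve for the weights; SLSQP stands in for an interior-point solver."""
    n = preds.shape[0]
    res = minimize(
        ncl_objective,
        x0=np.full(n, 1.0 / n),                # start from the simple average
        args=(preds, y, lam),
        bounds=[(0.0, 1.0)] * n,               # non-negative weights
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        method="SLSQP",
    )
    return res.x
```

Under this formulation, sub-models whose optimized weight shrinks toward zero are effectively deselected, so selection and weighting happen in one optimization step.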
Data availability
The datasets used in this paper are all from the Kaggle open platform.
Code availability
Our code is publicly available on GitHub (https://github.com/BaiyunBuaa/Hybrid-ensemble-based-on-Negative-Correlation-Learning); please feel free to try it.
Acknowledgements
We thank the reviewers for their time and effort in reviewing this manuscript. We are also grateful for all the valuable comments that helped improve the paper.
Funding
Yanfei Kang is supported by the National Natural Science Foundation of China (No. 72171011).
Author information
Authors and Affiliations
Contributions
YB: algorithm design, data experiments, and paper writing. GT: data experiments. YK: main theory, paper revision, and submission. SJ: paper revision and suggestions.
Corresponding author
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "A hybrid ensemble method with negative correlation learning for regression".
Ethical approval
This paper does not contain any studies with human participants or animals performed by any of the authors.
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent for publication
The authors confirm that this publication has been approved by all co-authors.
Additional information
Editor: Zhi-Hua Zhou.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bai, Y., Tian, G., Kang, Y. et al. A hybrid ensemble method with negative correlation learning for regression. Mach Learn 112, 3881–3916 (2023). https://doi.org/10.1007/s10994-023-06364-3