Snapshot boosting: a fast ensemble framework for deep neural networks

Zhang, Wentao; Jiang, Jiawei; Shao, Yingxia; Cui, Bin

doi:10.1007/s11432-018-9944-x

Research Paper
Published: 24 December 2019

Volume 63, article number 112102, (2020)

Science China Information Sciences

Wentao Zhang^1,2, Jiawei Jiang^3, Yingxia Shao^4 & Bin Cui^1,2,5

Abstract

Boosting has proven effective at improving the generalization of machine learning models in many fields. It produces highly diverse base learners and builds an accurate ensemble by combining a sufficient number of weak learners. However, it is rarely used in deep learning because training neural networks is expensive. Another method, snapshot ensemble, can significantly reduce the training budget, but it struggles to balance training cost against diversity. Inspired by snapshot ensemble and boosting, we propose a method named snapshot boosting. It applies a series of techniques to obtain many base models with high diversity and accuracy, including the use of a validation set, a boosting-based training framework, and an effective ensemble strategy. Finally, we evaluate our method on computer vision (CV) and natural language processing (NLP) tasks, and the results show that snapshot boosting achieves a more balanced trade-off between training expense and ensemble accuracy than other well-known ensemble methods.
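
The abstract names two ingredients, snapshot ensembling and boosting, without giving their update rules. As a rough illustration only, the sketch below shows a generic version of each: a cyclic cosine learning-rate schedule with warm restarts, of the kind commonly used to collect one snapshot per cycle, and the classic binary AdaBoost sample reweighting. The function names `cosine_cycle_lr` and `adaboost_reweight` and all parameters are hypothetical, and nothing here should be read as the authors' actual algorithm.

```python
import math


def cosine_cycle_lr(step, steps_per_cycle, lr_max, lr_min=0.0):
    """Learning rate under cyclic cosine annealing with warm restarts.

    The rate starts at lr_max, decays toward lr_min within a cycle,
    and jumps back to lr_max when the next cycle begins. A snapshot
    of the model would typically be saved at the end of each cycle.
    """
    t = step % steps_per_cycle  # position inside the current cycle
    cosine = math.cos(math.pi * t / steps_per_cycle)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def adaboost_reweight(weights, correct, eps=1e-10):
    """Classic binary AdaBoost sample reweighting (generic, assumed here).

    weights: current per-sample weights.
    correct: booleans, True where the latest base model was right.
    Returns the normalized new weights and the model's vote weight alpha.
    """
    total = sum(weights)
    err = sum(w for w, ok in zip(weights, correct) if not ok) / total
    err = min(max(err, eps), 1.0 - eps)  # avoid log(0) and division by zero
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Shrink weights of correctly classified samples, grow the others.
    raw = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    norm = sum(raw)
    return [w / norm for w in raw], alpha


if __name__ == "__main__":
    # One cycle of 10 steps: the rate decays, then restarts at step 10.
    print([round(cosine_cycle_lr(s, 10, 0.1), 4) for s in (0, 5, 9, 10)])
    # Four equally weighted samples, the last one misclassified.
    new_weights, alpha = adaboost_reweight([0.25] * 4, [True, True, True, False])
    print([round(w, 4) for w in new_weights], round(alpha, 4))
```

In a combined scheme, each learning-rate cycle would end with a saved snapshot whose errors drive the reweighting used in the next cycle. How the paper actually couples these steps, uses the validation set, and combines the snapshots is not stated in the abstract.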

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61832001, 61702015, 61702016, 61572039), the National Key Research and Development Program of China (Grant No. 2018YFB1004403), and the PKU-Tencent Joint Research Lab.

Author information

Authors and Affiliations

Center for Data Science, Peking University, Beijing, 100871, China
Wentao Zhang & Bin Cui
National Engineering Laboratory for Big Data Analysis and Applications, Beijing, 100871, China
Wentao Zhang & Bin Cui
Department of Computer Science, ETH Zurich, Zurich, 8092, Switzerland
Jiawei Jiang
Beijing Key Lab of Intelligent Telecommunications Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Yingxia Shao
Key Lab of High Confidence Software Technologies (MOE), Department of Computer Science, Peking University, Beijing, 100871, China
Bin Cui

Corresponding author

Correspondence to Jiawei Jiang.

About this article

Cite this article

Zhang, W., Jiang, J., Shao, Y. et al. Snapshot boosting: a fast ensemble framework for deep neural networks. Sci. China Inf. Sci. 63, 112102 (2020). https://doi.org/10.1007/s11432-018-9944-x

Received: 18 December 2018
Revised: 27 February 2019
Accepted: 15 April 2019
Published: 24 December 2019
DOI: https://doi.org/10.1007/s11432-018-9944-x
