
Convolutional neural networks ensembles through single-iteration optimization

  • Optimization
Soft Computing

Abstract

Convolutional Neural Networks have been widely employed in a diverse range of computer vision applications, including image classification, object recognition, and object segmentation. Nevertheless, one weakness of such models is the setting of their hyperparameters, which is highly specific to each problem. One common approach is to employ meta-heuristic optimization algorithms to find suitable sets of hyperparameters, at the expense of a heavier computational burden that makes them infeasible in real-time scenarios. In this paper, we address this problem by creating ensembles of Convolutional Neural Networks through Single-Iteration Optimization, a fast optimization composed of only one iteration that is no more effective than a random search. Essentially, the idea is to provide the same capability offered by long-term optimizations without their computational load. Results on four well-known datasets revealed that one-iteration optimized ensembles provide promising results while reducing the time needed to achieve them.
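The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the search intervals, the `toy_score` stand-in for training a network, the selection of the K best candidates, and the majority-vote combination are all assumptions made here for illustration. The abstract only states that the optimization runs for a single iteration and behaves like a random search.

```python
import random
from collections import Counter

# Hypothetical search intervals, chosen only for this example.
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-1),
    "dropout": (0.0, 0.5),
}


def sample_candidate(rng):
    """Draw one hyperparameter set uniformly from the search intervals."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def single_iteration_search(evaluate, n_candidates, k, rng):
    """Score each random candidate exactly once and keep the K best.

    `evaluate` maps a hyperparameter dict to a score (higher is better).
    No candidate is ever updated, so this is a single optimization iteration.
    """
    candidates = [sample_candidate(rng) for _ in range(n_candidates)]
    ranked = sorted(candidates, key=evaluate, reverse=True)
    return ranked[:k]


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def toy_score(params):
    """Stand-in for 'train a CNN and return its validation accuracy'."""
    return -abs(params["learning_rate"] - 0.01) - abs(params["dropout"] - 0.25)


rng = random.Random(0)
ensemble = single_iteration_search(toy_score, n_candidates=10, k=3, rng=rng)

# Three hypothetical models voting on three samples.
combined = majority_vote([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
print(len(ensemble), combined)
```

In a real setting, `toy_score` would be replaced by a routine that trains a network with the sampled hyperparameters and returns its validation accuracy, which is where almost all of the computational cost lies.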

Notes

  1. Traditional optimization methods rely on gradients and Hessians, which are computationally costly and susceptible to local optima.

  2. Notice that this procedure is also required to perform meta-heuristic optimizations, where larger intervals may require more time for the algorithm to find suitable solutions.

  3. Our source code is available at https://github.com/lzfelix/random_ensembles.

  4. One can find the proposed architectures at https://github.com/lzfelix/random_ensembles/tree/master/experiments/models.

  5. One can find the distribution’s ranges at https://github.com/lzfelix/random_ensembles/blob/master/experiments/models/model_specs.py.

  6. Note that K stands for the number of networks considered in the ensemble.

Author information

Corresponding author

Correspondence to Gustavo H. de Rosa.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors are grateful to CNPq grants #430274/2018-1, #304315/2017-6, #307066/2017-7, and #427968/2018-6, as well as to the São Paulo Research Foundation (FAPESP) grants #2013/07375-0, #2014/12236-1, #2017/25908-6, #2018/21934-5, #2019/07665-4, and #2019/02205-5.

About this article

Cite this article

Ribeiro, L.C.F., Rosa, G.H.d., Rodrigues, D. et al. Convolutional neural networks ensembles through single-iteration optimization. Soft Comput 26, 3871–3882 (2022). https://doi.org/10.1007/s00500-022-06791-9

  • DOI: https://doi.org/10.1007/s00500-022-06791-9
