Ensemble classification from deep predictions with test data augmentation


Data augmentation has become a standard step to improve the predictive power and robustness of convolutional neural networks by means of the synthetic generation of new samples depicting different deformations. This step has been traditionally considered to improve the network at the training stage. In this work, however, we study the use of data augmentation at classification time. That is, the test sample is augmented, following the same procedure considered for training, and the decision is taken with an ensemble prediction over all these samples. We present comprehensive experimentation with several datasets and ensemble decisions, considering a rather generic data augmentation procedure. Our results show that performing this step is able to boost the original classification, even when the room for improvement is limited.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5


  1. Agrawal R (2008) Karmeshu: perturbation scheme for online learning of features: Incremental principal component analysis. Pattern Recognit 41(5):1452–1460

    Article  Google Scholar 

  2. Ahmed E, Jones M, Marks TK (2015) An improved deep learning architecture for person re-identification. In: The IEEE conference on computer vision and pattern recognition (CVPR)

  3. Aksakalli V, Malekipirbazari M (2016) Feature selection via binary simultaneous perturbation stochastic approximation. Pattern Recognit Lett 75(Supplement C):41–47

    Article  Google Scholar 

  4. Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010, Springer, pp. 177–186

  5. Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2:121–167

    Article  Google Scholar 

  6. Calvo-Zaragoza J, Oncina J (2014) Recognition of pen-based music notation: the HOMUS dataset. In: 22nd international conference on pattern recognition, ICPR 2014, Stockholm, Sweden, August 24–28, pp 3038–3043

  7. Cui X, Goel V, Kingsbury B (2015) Data augmentation for deep neural network acoustic modeling. IEEE/ACM Trans Audio Speech Lang Proc 23(9):1469–1477

    Article  Google Scholar 

  8. Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

    MathSciNet  MATH  Google Scholar 

  9. Duchi JC, Hazan E, Singer Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 12:2121–2159

    MathSciNet  MATH  Google Scholar 

  10. Duda RO, Hart PE (1973) Pattern recognition and scene analysis. Wiley, New York

    Google Scholar 

  11. Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics, AISTATS 2011, Fort Lauderdale, USA, April 11–13, pp 315–323

  12. Goodfellow I, Bengio Y, Courville A (2016) Regularization for deep learning. In: Deep learning, chap 10, MIT Press, pp 228–273

  13. Ha TM, Bunke H (1997) Off-line, handwritten numeral recognition by perturbation method. IEEE Trans Pattern Anal Mach Intell 19(5):535–539

    Article  Google Scholar 

  14. He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification. CoRR. arXiv:1502.01852

  15. Hull JJ (1994) A database for handwritten text recognition research. IEEE Trans Pattern Anal Mach Intell 16:550–554

    Article  Google Scholar 

  16. Jain AK, Mao J, Mohiuddin KM (1996) Artificial neural networks: a tutorial. Computer 29(3):31–44

    Article  Google Scholar 

  17. Kittler J, Hatef M, Duin RPW, Matas J (1998) On combining classifiers. IEEE Trans Pattern Anal Mach Intell 20:226–239

    Article  Google Scholar 

  18. Ko T, Peddinti V, Povey D, Khudanpur S (2015) Audio augmentation for speech recognition. In: 16th Annual conference of the international speech communication association, INTERSPEECH 2015, Dresden, Germany, September 6–10, pp 3586–3589

  19. Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto

  20. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: 26th Annual conference on neural information processing systems, pp 1106–1114

  21. Kuncheva LI (2004) Combining pattern classifiers: methods and algorithms. Wiley, New York

    Google Scholar 

  22. Latecki LJ, Lakamper R, Eckhardt T (2000) Shape descriptors for non-rigid shapes with a single closed contour. In: Proceedings. IEEE conference on computer vision and pattern recognition, vol 1, IEEE, pp 424–429

  23. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

    Article  Google Scholar 

  24. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444

    Article  Google Scholar 

  25. Lemley J, Bazrafkan S, Corcoran P (2017) Smart augmentation learning an optimal data augmentation strategy. IEEE Access 5:5858–5869

    Article  Google Scholar 

  26. Lv JJ, Cheng C, Tian GD, Zhou XD, Zhou X (2016) Landmark perturbation-based data augmentation for unconstrained face recognition. Signal Process Image Commun 47:465–475

    Article  Google Scholar 

  27. Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y (2013) Overfeat: integrated recognition, localization and detection using convolutional networks. CoRR. arXiv:1312.6229

  28. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. CoRR. arXiv:1409.1556

  29. Smith LN, Topin N (2016) Deep convolutional neural network design patterns. arXiv preprint arXiv:1611.00847

  30. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

    MathSciNet  MATH  Google Scholar 

  31. Sun Y, Wang X, Tang X (2013) Hybrid deep learning for face verification. In: The IEEE international conference on computer vision (ICCV)

  32. Sutskever I, Martens J, Dahl GE, Hinton GE (2013) On the importance of initialization and momentum in deep learning. In: Proceedings of the 30th international conference on machine learning, ICML 2013, Atlanta, GA, USA, 16–21 June 2013, pp 1139–1147

  33. Torralba A, Fergus R, Freeman WT (2008) 80 million tiny images: a large data set for nonparametric object and scene recognition. IEEE Trans Pattern Anal Mach Intell 30(11):1958–1970

    Article  Google Scholar 

  34. Wan L, Zeiler M, Zhang S, Cun YL, Fergus R (2013) Regularization of neural networks using dropconnect. In: Dasgupta S, Mcallester D (eds) JMLR workshop and conference proceedings, proceedings of the 30th international conference on machine learning (ICML-13), vol 28, pp 1058–1066

  35. Wilkinson RA, Geist J, Janet S, Grother PJ et al (1992) The first census optical character recognition system conference. Technical report, US Department of Commerce. https://doi.org/10.18434/T4H01C

  36. Zeiler MD (2012) ADADELTA: an adaptive learning rate method. CoRR. arXiv:1212.5701

  37. Zheng WS, Lai J, Yuen PC, Li SZ (2009) Perturbation LDA: learning the difference between the class empirical mean and its expectation. Pattern Recognit 42(5):764–779

    Article  Google Scholar 

Download references


First author thanks the support from the Spanish Ministerio de Ciencia, Innovación y Universidades through Juan de la Cierva-Formación Grant (Ref. FJCI-2016-27873).

Author information



Corresponding author

Correspondence to Jorge Calvo-Zaragoza.

Ethics declarations

Conflict of interest

Authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Calvo-Zaragoza, J., Rico-Juan, J.R. & Gallego, AJ. Ensemble classification from deep predictions with test data augmentation. Soft Comput 24, 1423–1433 (2020). https://doi.org/10.1007/s00500-019-03976-7

Download citation


  • Convolutional neural networks
  • Data augmentation
  • Ensemble classification