Ensemble classification from deep predictions with test data augmentation
Data augmentation, the synthetic generation of new samples depicting different deformations, has become a standard way to improve the predictive power and robustness of convolutional neural networks. It has traditionally been applied at the training stage. In this work, however, we study data augmentation at classification time: the test sample is augmented following the same procedure used for training, and the decision is made by an ensemble prediction over all the resulting samples. We present comprehensive experiments on several datasets and ensemble decision rules, using a rather generic data augmentation procedure. Our results show that this step improves on the original classification, even when the room for improvement is limited.
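The idea of augmenting the test sample and deciding by ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the augmentation (random shifts and horizontal flips), the two decision rules (mean of class probabilities and majority vote), and the names `augment`, `ensemble_predict`, and `toy_predict_proba` are all assumptions chosen for illustration. In practice `predict_proba` would wrap a trained convolutional network.

```python
import numpy as np


def augment(image, rng, n_copies=8, max_shift=2):
    """Return the original image plus n_copies randomly deformed variants.

    Assumption: deformations are small circular shifts and horizontal flips;
    the real procedure should match whatever was used at training time.
    """
    samples = [image]
    for _ in range(n_copies):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        variant = np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))
        if rng.random() < 0.5:
            variant = variant[:, ::-1]  # horizontal flip
        samples.append(variant)
    return np.stack(samples)


def ensemble_predict(predict_proba, image, rng, rule="mean", **aug_kwargs):
    """Classify one test image from predictions over its augmented copies."""
    batch = augment(image, rng, **aug_kwargs)
    probs = predict_proba(batch)  # expected shape: (n_samples, n_classes)
    if rule == "mean":
        # Average the class probabilities over all augmented samples.
        scores = probs.mean(axis=0)
    elif rule == "vote":
        # Each augmented sample votes for its most probable class.
        votes = probs.argmax(axis=1)
        scores = np.bincount(votes, minlength=probs.shape[1])
    else:
        raise ValueError(f"unknown ensemble rule: {rule!r}")
    return int(scores.argmax())


def toy_predict_proba(batch):
    """Hypothetical stand-in for a trained network: two classes by brightness."""
    p_bright = np.clip(batch.reshape(len(batch), -1).mean(axis=1), 0.0, 1.0)
    return np.stack([1.0 - p_bright, p_bright], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    print(ensemble_predict(toy_predict_proba, image, rng, rule="mean"))
    print(ensemble_predict(toy_predict_proba, image, rng, rule="vote"))
```

Keeping the unmodified test sample in the batch means the original prediction always takes part in the ensemble; whether that matches the paper's setup is not stated in the abstract.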
Keywords: Convolutional neural networks, Data augmentation, Ensemble classification
The first author thanks the Spanish Ministerio de Ciencia, Innovación y Universidades for its support through a Juan de la Cierva-Formación Grant (Ref. FJCI-2016-27873).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.