Abstract
The training time of traditional multilayer perceptrons (MLPs) trained with the back-propagation (BP) algorithm grows rapidly with problem scale, and for multi-class problems the convergence rate of MLP training is very low. The heavy time cost and low convergence rate greatly restrict the application of MLPs to problems with tens of thousands of samples. To address these drawbacks, this paper proposes a fast BP network with dynamic sample selection (BPNDSS), which, after each training epoch, dynamically selects for training the samples that contribute more to the variation of the decision boundary. Because it trains on only a small subset of the samples, BPNDSS can significantly increase training speed. Moreover, two kinds of modular single-hidden-layer approaches are adopted to decompose a multi-class problem into multiple binary-class sub-problems, which results in a high rate of convergence. Experiments on the Letter and MNIST handwriting recognition datasets show the effectiveness and efficiency of BPNDSS. BPNDSS also achieves classification performance comparable to that of convolutional neural networks (CNNs), support vector machines, AdaBoost, C4.5, and nearest-neighbour algorithms. To further demonstrate the training speed-up of the dynamic sample selection approach on large-scale datasets, we modify a CNN to obtain a dynamic sample selection CNN (DynCNN). Experiments on the ImageNet dataset show that DynCNN achieves performance similar to the CNN while consuming less training time.
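The idea of re-selecting the training subset after every epoch can be illustrated with a minimal sketch. This is not the BPNDSS algorithm itself: the abstract does not state the selection criterion, so the sketch assumes that a sample "contributes to the decision boundary" when it is misclassified or when its output lies within a hypothetical `margin` of 0.5. It also uses a single logistic unit rather than a single-hidden-layer network, and the function name, the parameters, and the toy data are all illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_with_dynamic_selection(X, y, epochs=50, lr=0.5, margin=0.3, seed=0):
    """Train one logistic unit, re-selecting the training subset each epoch.

    Assumed criterion (not taken from the paper): keep a sample if it is
    misclassified or if its output lies within `margin` of 0.5.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=0.01, size=d)
    b = 0.0
    active = np.arange(n)            # the first epoch uses every sample
    subset_sizes = []
    for _ in range(epochs):
        subset_sizes.append(active.size)
        # Gradient step on the currently selected samples only.
        Xa, ya = X[active], y[active]
        grad = sigmoid(Xa @ w + b) - ya
        w -= lr * (Xa.T @ grad) / active.size
        b -= lr * grad.mean()
        # Re-score all samples and pick the subset for the next epoch.
        p_all = sigmoid(X @ w + b)
        wrong = (p_all >= 0.5) != (y == 1)
        near = np.abs(p_all - 0.5) < margin
        selected = np.flatnonzero(wrong | near)
        if selected.size == 0:       # nothing left near the boundary
            break
        active = selected
    return w, b, subset_sizes

# Toy binary problem: two Gaussian blobs in 2-D (illustrative data only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (200, 2)), rng.normal(2.0, 1.0, (200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

w, b, subset_sizes = train_with_dynamic_selection(X, y)
acc = np.mean((sigmoid(X @ w + b) >= 0.5) == (y == 1))
print("samples used per epoch (first 5):", subset_sizes[:5])
print("training accuracy:", acc)
```

In this sketch the per-epoch cost scales with the size of the selected subset rather than with the full dataset, which is the source of the speed-up the abstract describes. For a multi-class problem, one such binary learner could be trained per sub-problem, in the spirit of the modular decomposition mentioned above.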
Notes
The MNIST dataset is publicly available at http://yann.lecun.com/exdb/mnist/.
The ImageNet database is publicly available at http://www.image-net.org/.
Acknowledgments
The authors would like to thank the Natural Science Foundations of China under Grant Nos. 61272198 and 21176077, the Fundamental Research Funds for the Central Universities, and the Shanghai Key Laboratory of Intelligent Information Processing of China under Grant No. IIPL-2012-003 for partial support.
Cite this article
Fan, Q., Gao, D. A fast BP networks with dynamic sample selection for handwritten recognition. Pattern Anal Applic 21, 67–80 (2018). https://doi.org/10.1007/s10044-016-0566-7
DOI: https://doi.org/10.1007/s10044-016-0566-7