Pattern Analysis and Applications, Volume 21, Issue 1, pp 67–80

A fast BP networks with dynamic sample selection for handwritten recognition

Theoretical Advances


The training time of traditional multilayer perceptrons (MLPs) trained with the back-propagation (BP) algorithm rises sharply with problem scale. For multi-class problems, the convergence ratio of MLP training is also very low. The long training time and low convergence ratio greatly restrict the application of MLPs to problems with tens of thousands of samples. To address these disadvantages, this paper proposes a fast BP network with dynamic sample selection (BPNDSS), which after each training epoch dynamically selects the samples that contribute most to the variation of the decision boundary and trains on them. Because it selects only a small subset of the samples, BPNDSS can significantly increase training speed. Moreover, two kinds of modular single-hidden-layer approaches are adopted to decompose a multi-class problem into multiple binary-class sub-problems, which results in a high rate of convergence. Experiments on the Letter and MNIST handwritten recognition databases show the effectiveness and efficiency of BPNDSS. BPNDSS also achieves classification performance comparable to that of convolutional neural networks (CNNs), support vector machines, AdaBoost, C4.5, and nearest neighbour algorithms. To further demonstrate the training speed improvement of dynamic sample selection on large-scale datasets, we modify a CNN to obtain a dynamic sample selection CNN (DynCNN). Experiments on the ImageNet dataset illustrate that DynCNN achieves performance similar to that of CNN while consuming less training time.
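The two ideas in the abstract, decomposing a multi-class problem into binary sub-problems and re-selecting training samples after each epoch, can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the rule used here (keep a sample if it is misclassified or its output lies within a margin of the 0.5 threshold) is an assumption, as are the function names `one_vs_all_targets` and `select_boundary_samples`, the `margin` parameter, and the one-vs-all form of the decomposition. It is not the authors' implementation.

```python
from typing import List, Sequence


def one_vs_all_targets(labels: Sequence[int], n_classes: int) -> List[List[int]]:
    """Decompose a multi-class labelling into n_classes binary target vectors.

    Row k holds 1 where the sample belongs to class k and 0 elsewhere.
    One-vs-all is an assumed decomposition, chosen for illustration.
    """
    return [[1 if y == k else 0 for y in labels] for k in range(n_classes)]


def select_boundary_samples(outputs: Sequence[float],
                            targets: Sequence[int],
                            margin: float = 0.2) -> List[int]:
    """Return the indices of samples to train on in the next epoch.

    ASSUMED criterion (not taken from the paper): keep a sample if the
    network output in [0, 1] is on the wrong side of 0.5, or lies within
    `margin` of 0.5, i.e. close to the current decision boundary.
    """
    selected = []
    for i, (o, t) in enumerate(zip(outputs, targets)):
        misclassified = (o >= 0.5) != (t == 1)
        near_boundary = abs(o - 0.5) < margin
        if misclassified or near_boundary:
            selected.append(i)
    return selected


# Toy example: outputs of the class-0 binary sub-network after one epoch.
labels = [0, 2, 1, 0, 2]
binary_targets = one_vs_all_targets(labels, n_classes=3)
outputs_class0 = [0.95, 0.10, 0.45, 0.25, 0.02]
kept = select_boundary_samples(outputs_class0, binary_targets[0])
print(kept)  # prints [2, 3]
```

In this toy run, sample 2 is kept because its output is close to the threshold and sample 3 because it is misclassified; the three confidently correct samples are skipped in the next epoch, which is where a speed-up of this kind would come from.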


Handwritten recognition, Classifier design, Artificial neural networks, Sample selection



The authors would like to thank the Natural Science Foundations of China under Grant Nos. 61272198 and 21176077, the Fundamental Research Funds for the Central Universities, and the Shanghai Key Laboratory of Intelligent Information Processing of China under Grant No. IIPL-2012-003 for partial support.



Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  1. Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai, People’s Republic of China
