Image classification with a MSF dropout

  • Ruiqi Luo
  • Xian Zhong (corresponding author)
  • Enxiao Chen


In recent years, deep neural networks, the main vehicle of deep learning, have attracted wide attention in the computing field, and they can effectively solve complex real-world problems. During training, however, complex relationships caused by noisy data lead to overfitting, which harms the robustness of the network model. Dropout, a stochastic regularization technique, is highly effective at restraining overfitting in deep neural networks. Standard dropout restrains overfitting simply and quickly, but its accuracy suffers because it cannot accurately locate the appropriate scale. This paper proposes a multi-scale fusion (MSF) dropout method that builds on standard dropout. First, several groups of network models are trained with different combinations of dropout rates; then an improved genetic algorithm is used to calculate the optimal scale of each network model; next, the corresponding network parameters are reduced by the optimal scale to obtain the prediction sub-models; finally, these sub-models are fused into a final prediction model with certain weights. MSF dropout is evaluated on the standard MNIST and CIFAR-10 datasets. The results show that its prediction accuracy is significantly higher than that of the two other kinds of dropout it is compared with, which verifies the effectiveness of the multi-scale fusion method.
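The last two steps of the pipeline, scaling the parameters and fusing the sub-models, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "scale" means the single factor by which trained weights are multiplied at prediction time (in standard dropout this factor is the retention probability), and that fusion is a weighted average of each sub-model's class probabilities. The function names `scale_weights` and `fuse_predictions` and all numeric values are hypothetical, and the genetic-algorithm search for the scales is omitted.

```python
def scale_weights(weights, scale):
    """Multiply every weight of a layer by one scale factor.

    Assumption: this mirrors test-time weight scaling in standard dropout,
    where the factor would be the retention probability.
    """
    return [[w * scale for w in row] for row in weights]


def fuse_predictions(predictions, fusion_weights):
    """Weighted average of per-model class-probability vectors."""
    if len(predictions) != len(fusion_weights):
        raise ValueError("need exactly one fusion weight per sub-model")
    total = sum(fusion_weights)
    n_classes = len(predictions[0])
    return [
        sum(w * p[c] for w, p in zip(fusion_weights, predictions)) / total
        for c in range(n_classes)
    ]


# Hypothetical layer weights, reduced by an assumed scale of 0.5.
scaled_layer = scale_weights([[1.0, 2.0], [4.0, 8.0]], 0.5)

# Hypothetical class probabilities from three sub-models for one image.
sub_model_outputs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
]
# Hypothetical fusion weights; the abstract does not say how they are chosen.
fusion_weights = [0.5, 0.3, 0.2]

fused = fuse_predictions(sub_model_outputs, fusion_weights)
predicted_class = max(range(len(fused)), key=lambda c: fused[c])
```

Dividing by the sum of the fusion weights keeps the fused vector a probability distribution even when the weights are not normalized beforehand.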


Neural network, Regularization, Genetic algorithm, Multi-scale, Dropout, Deep learning




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
  2. University of Florida, Gainesville, USA
