A novel consensus PSO-assisted trajectory unified and trust-tech methodology for DNN training and its applications

  • Original Article
  • Neural Computing and Applications

Abstract

Deep neural network (DNN) training relies heavily on local solvers such as stochastic gradient descent (SGD). Because these methods are local, they are sensitive to initial points and hyperparameters, which affects the stability of DNN optimization. This paper presents a novel three-stage methodology, termed consensus particle swarm optimization-based trajectory unified optimization-assisted Trust-Tech (CPTT), for DNN training with high accuracy and robustness. CPTT is composed of Stage I: exploration and consensus, Stage II: robust convergence, and Stage III: search for optima. We examine the effect of each stage of the proposed CPTT methodology in achieving two advantages: (1) high-quality local optimal solutions (LOSs) and (2) stable convergence under random initialization. The performance of CPTT has been evaluated stage by stage on popular classification model structures and benchmark datasets, and its performance improvement has also been evaluated on object detection models. Its optimization performance is further illustrated in a real-world application: drone-based visual inspection of electric power transmission line corridors.
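
The general idea of pairing a global swarm search with a local solver can be illustrated with a minimal sketch. It is not the CPTT algorithm: it omits the consensus, trajectory unified, and Trust-Tech components, and it assumes a toy two-dimensional Rastrigin function in place of a DNN loss. The function names (`pso_explore`, `local_refine`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def loss(x):
    # Toy multimodal objective (Rastrigin); an assumed stand-in for a DNN loss.
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def grad(x):
    # Analytic gradient of the Rastrigin function.
    return [2 * xi + 20 * math.pi * math.sin(2 * math.pi * xi) for xi in x]


def pso_explore(n_particles=20, dim=2, iters=50, seed=0):
    # Global exploration with a standard inertia-weight PSO (illustrative values).
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.12, 5.12) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=loss)[:]
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Keep each particle's best position seen so far.
            if loss(pos[i]) < loss(pbest[i]):
                pbest[i] = pos[i][:]
        gbest = min(pbest, key=loss)[:]
    return gbest


def local_refine(x, lr=0.001, steps=500):
    # Local solver: plain gradient descent started from the swarm's best point.
    x = x[:]
    for _ in range(steps):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


start = pso_explore()
refined = local_refine(start)
print("loss after PSO exploration:", loss(start))
print("loss after local refinement:", loss(refined))
```

The step size is kept small relative to the curvature of the toy function so that the gradient-descent phase does not increase the loss; on a real DNN the local solver would typically be SGD or a similar optimizer with its own tuning.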

Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Funding

National Key R&D Program of China, Grant 2017YFB0902900 (Hsiao-Dong Chiang); Special Funds for the Basic Research and Development Program in the Central Non-profit Research Institutes of China, Grant 2017YFB0902902 (Hsiao-Dong Chiang).

Author information

Corresponding author

Correspondence to Yong-Feng Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Lv, XL., Chiang, HD. & Zhang, YF. A novel consensus PSO-assisted trajectory unified and trust-tech methodology for DNN training and its applications. Neural Comput & Applic 35, 22375–22385 (2023). https://doi.org/10.1007/s00521-023-08893-3

  • DOI: https://doi.org/10.1007/s00521-023-08893-3
