Abstract
Training deep neural networks (DNNs) relies heavily on local solvers such as stochastic gradient descent (SGD). Because these methods are local in nature, they are sensitive to initial points and hyperparameters, which undermines the stability of DNN optimization. This paper presents a novel three-stage methodology, termed consensus particle swarm optimization-based trajectory unified optimization-assisted Trust-Tech (CPTT), for training DNNs with high accuracy and robustness. CPTT is composed of Stage I: exploration and consensus, Stage II: robust convergence, and Stage III: search for optima. We examine the effect of each stage of the proposed CPTT methodology and show that it offers the following advantages: (1) high-quality local optimal solutions (LOSs) and (2) stable convergence under random initialization. The performance of CPTT is evaluated stage by stage on popular classification model structures and benchmark datasets, and the resulting performance improvement is also evaluated on object detection models. Its optimization performance is further illustrated in a real-world application: drone-based visual inspection of electric power transmission line corridors.
Data availability
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Funding
This work was supported by the National Key R&D Program of China under Grant 2017YFB0902900 (Hsiao-Dong Chiang) and by the Special Funds for the Basic Research and Development Program in the Central Non-profit Research Institutes of China under Grant 2017YFB0902902 (Hsiao-Dong Chiang).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Lv, XL., Chiang, HD. & Zhang, YF. A novel consensus PSO-assisted trajectory unified and trust-tech methodology for DNN training and its applications. Neural Comput & Applic 35, 22375–22385 (2023). https://doi.org/10.1007/s00521-023-08893-3
DOI: https://doi.org/10.1007/s00521-023-08893-3