
Learnability and robustness of shallow neural networks learned by a performance-driven BP and a variant of PSO for edge decision-making

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

Complex AI models can be difficult to implement on edge devices without strong computing capacity (e.g., a GPU). The Universal Approximation Theorem states that a shallow neural network (SNN) can approximate any nonlinear function. In this paper, we focus on the learnability and robustness of SNNs obtained by a greedy, tight-force heuristic algorithm (Performance-Driven Back-Propagation, PDBP) and a loose-force meta-heuristic algorithm (a variant of particle swarm optimization, VPSO). From an engineering perspective, every sensor is well justified for a specific task; hence, all sensor readings should be strongly correlated to the target, and the structure of an SNN should depend on the dimensionality of the problem space. The key findings of the research are summarized as follows: (1) the number of hidden neurons an SNN needs depends on the nonlinearity of the training data, and a hidden-layer width up to the dimensionality of the problem space can be sufficient; (2) the learnability of SNNs produced by error-driven PDBP is consistently better than that of SNNs optimized by error-driven VPSO; (3) the performance of SNNs obtained by PDBP and VPSO changes little across different training rates; and (4) compared with classic machine learning algorithms from the literature, such as C4.5, NB and NN, the SNNs obtained by accuracy-driven PDBP win on all tested data sets, with improvements of up to 32.86%. Hence, the research could provide valuable guidance for the implementation of edge intelligence.
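To make the abstract's setup concrete, the sketch below shows a one-hidden-layer sigmoid network whose hidden-layer width equals the input dimension, per finding (1), with weights fitted either by plain back-propagation or by a textbook global-best PSO over the flattened weight vector. This is a minimal, hypothetical illustration, not the authors' implementation: PDBP adds a performance-driven criterion and VPSO modifies the velocity update, neither of which is reproduced here, and the names (`ShallowNet`, `pso_fit`) and all constants are assumptions for illustration only.

```python
# Minimal sketch (not the authors' code): shallow net with hidden width == input dim.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ShallowNet:
    """One-hidden-layer sigmoid network; hidden width equals the input dimension."""

    def __init__(self, dim, rng):
        self.W1 = rng.normal(scale=0.5, size=(dim, dim))
        self.b1 = np.zeros(dim)
        self.W2 = rng.normal(scale=0.5, size=dim)
        self.b2 = 0.0

    def forward(self, X):
        self.H = sigmoid(X @ self.W1 + self.b1)   # hidden activations, cached for BP
        return sigmoid(self.H @ self.W2 + self.b2)

    def backprop_step(self, X, y, lr=0.5):
        # one gradient-descent step on mean squared error (plain BP, not PDBP itself)
        out = self.forward(X)
        d_out = (out - y) * out * (1.0 - out)
        d_hid = np.outer(d_out, self.W2) * self.H * (1.0 - self.H)
        n = len(y)
        self.W2 -= lr * self.H.T @ d_out / n
        self.b2 -= lr * d_out.mean()
        self.W1 -= lr * X.T @ d_hid / n
        self.b1 -= lr * d_hid.mean(axis=0)

def unpack(net, v):
    # write a flat parameter vector back into the network
    d = net.b1.size
    net.W1 = v[:d * d].reshape(d, d)
    net.b1 = v[d * d:d * d + d]
    net.W2 = v[d * d + d:d * d + 2 * d]
    net.b2 = v[-1]

def pso_fit(net, X, y, swarm=30, iters=200, seed=0):
    # standard global-best PSO over the flattened weights; the paper's VPSO
    # changes the velocity update, which is not reproduced in this sketch
    rng = np.random.default_rng(seed)
    dim = net.b1.size * (net.b1.size + 2) + 1   # d*d + d + d + 1 parameters

    def loss(v):
        unpack(net, v)
        return np.mean((net.forward(X) - y) ** 2)

    pos = rng.uniform(-1.0, 1.0, (swarm, dim))
    vel = np.zeros((swarm, dim))
    pbest = pos.copy()
    pcost = np.array([loss(p) for p in pos])
    gbest = pbest[pcost.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, swarm, dim))
        vel = 0.72 * vel + 1.49 * r1 * (pbest - pos) + 1.49 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([loss(p) for p in pos])
        improved = cost < pcost
        pbest[improved] = pos[improved]
        pcost[improved] = cost[improved]
        gbest = pbest[pcost.argmin()].copy()
    unpack(net, gbest)

# Usage on synthetic 2-D data (a hypothetical example, not one of the paper's data sets):
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = ((X[:, 0] > 0.5) ^ (X[:, 1] > 0.5)).astype(float)   # XOR-style labels
net = ShallowNet(dim=2, rng=rng)
for _ in range(5000):
    net.backprop_step(X, y)        # BP route; alternatively: pso_fit(net, X, y)
print("train accuracy:", ((net.forward(X) > 0.5) == y.astype(bool)).mean())
```

Note that the two hidden neurons here match the 2-D input, mirroring the paper's claim that a hidden-layer width up to the problem-space dimensionality can suffice; the inertia (0.72) and acceleration (1.49) constants are the commonly cited standard-PSO values, used here only as placeholders.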



Acknowledgements

This research is sponsored by the National Natural Science Foundation of China (61903002), the Key Project of Natural Science in Universities in Anhui Province, China (KJ2018A0111), and the Open Research Fund of the Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Anhui Polytechnic University, China (2017070503B026-A01).

Author information


Corresponding author

Correspondence to Hongmei He.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

He, H., Chen, M., Xu, G. et al. Learnability and robustness of shallow neural networks learned by a performance-driven BP and a variant of PSO for edge decision-making. Neural Comput & Applic 33, 13809–13830 (2021). https://doi.org/10.1007/s00521-021-06019-1

