Abstract
Complex AI models are difficult to deploy on edge devices that lack strong computing capacity (e.g., a GPU). The Universal Approximation Theorem states that a shallow neural network (SNN) can approximate any nonlinear function. In this paper, we focus on the learnability and robustness of SNNs obtained by a greedy tight-force heuristic algorithm, Performance-Driven Back-Propagation (PDBP), and by a loose-force meta-heuristic algorithm, a variant of particle swarm optimization (VPSO). From the engineering perspective, every sensor deployed for a specific task is well justified; hence, all sensor readings should be strongly correlated with the target, and the structure of an SNN should depend on the dimensions of the problem space. The key findings of the research are summarized as follows: (1) the number of hidden neurons of an SNN depends on the nonlinearity of the training data, and a hidden layer no wider than the dimension of the problem space can be sufficient; (2) the learnability of SNNs trained by error-driven PDBP is consistently better than that of SNNs optimized by error-driven VPSO; (3) the performance of SNNs obtained by PDBP and VPSO changes little across different training rates; and (4) compared with classic machine learning algorithms reported in the literature, such as C4.5, NB and NN, the SNNs obtained by accuracy-driven PDBP win on all tested data sets, with an improvement of up to 32.86%. Hence, the research could provide valuable guidance for the implementation of edge intelligence.
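Findings (1) and (2) can be illustrated with a minimal sketch: a one-hidden-layer network whose hidden width defaults to the input dimension, trained by plain error-driven back-propagation on XOR, a canonically nonlinear problem. This is a generic sketch under stated assumptions, not the paper's PDBP or VPSO; the restart-over-seeds loop below is only an illustrative stand-in for performance-driven weight selection, and all function names and hyperparameters are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_snn(X, y, n_hidden=None, lr=2.0, epochs=5000, seed=0):
    """Plain error-driven back-propagation for a one-hidden-layer
    (shallow) network; hidden width defaults to the input dimension."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    n_hidden = n_in if n_hidden is None else n_hidden
    W1 = rng.normal(0.0, 1.0, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 1.0, (n_hidden, 1))
    b2 = np.zeros(1)
    n = len(y)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)                # hidden activations
        out = sigmoid(h @ W2 + b2).ravel()      # network output
        # gradients of the mean squared error through the sigmoids
        d_out = (out - y) * out * (1.0 - out)
        d_h = np.outer(d_out, W2.ravel()) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out[:, None]) / n
        b2 -= lr * d_out.mean()
        W1 -= lr * (X.T @ d_h) / n
        b1 -= lr * d_h.mean(axis=0)
    return W1, b1, W2, b2

def predict(X, W1, b1, W2, b2):
    h = sigmoid(X @ W1 + b1)
    return (sigmoid(h @ W2 + b2).ravel() > 0.5).astype(int)

# XOR is nonlinear, yet two hidden neurons (= the input dimension) can
# model it. Back-propagation may stall in poor local minima, so restart
# over a few seeds and keep the most accurate network.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
best_acc, best_params = -1.0, None
for seed in range(10):
    params = train_snn(X, y, seed=seed)
    acc = float((predict(X, *params) == y).mean())
    if acc > best_acc:
        best_acc, best_params = acc, params
print("best training accuracy:", best_acc)
```

A VPSO-style alternative would replace the gradient loop with a swarm of candidate weight vectors updated by velocity rules; the paper's comparison suggests the gradient-driven route reaches better learnability on the tested data sets.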
Acknowledgements
This research is sponsored by National Natural Science Foundation of China (61903002), the key project of Natural Science in Universities in Anhui Province, China (KJ2018A0111), and the Open Research Fund of Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Anhui Polytechnic University, China (2017070503B026-A01).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
He, H., Chen, M., Xu, G. et al. Learnability and robustness of shallow neural networks learned by a performance-driven BP and a variant of PSO for edge decision-making. Neural Comput & Applic 33, 13809–13830 (2021). https://doi.org/10.1007/s00521-021-06019-1