
Backpropagation for Fully Connected Cascade Networks


Abstract

Fully connected cascade (FCC) networks are a recently proposed class of neural networks in which each layer has only one neuron and each neuron is connected to all the neurons in the preceding layers. In this paper we derive and describe in detail an efficient backpropagation algorithm (named BPFCC) for computing the gradient for FCC networks. At its core, the backpropagation in BPFCC is a carefully designed process for computing the derivative amplification coefficients, which are essential for gradient computation. The average time complexity of computing an entry of the gradient is O(1). BPFCC must be called by a training algorithm to do useful work, and we have written a program, FCCNET, for that purpose. Currently, FCCNET uses the Levenberg–Marquardt algorithm to train FCC networks, and its loss function for classification is designed based on a nonlinear extension of logistic regression. For two-class classification we derive a Gauss–Newton-like approximation for the Hessian of the loss function; when the number of classes is more than two, a numerical approximation of the Hessian is used. Experimental results confirm the efficiency of BPFCC and the validity of the companion techniques.
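
To make the cascade topology concrete, the following is a minimal NumPy sketch of an FCC network with n_in inputs and N cascaded neurons: neuron k receives the bias, all n_in inputs, and the outputs of neurons 0..k-1, so its weight vector has 1 + n_in + k entries. The tanh activation, the squared-error loss, and the names fcc_forward and fcc_gradient are illustrative assumptions; the gradient here is obtained by ordinary reverse-mode backpropagation through the cascade, not by the paper's BPFCC bookkeeping of derivative amplification coefficients.

    import numpy as np

    def fcc_forward(x, weights):
        """Forward pass through a fully connected cascade (FCC) network.
        x       : input vector with n_in entries.
        weights : list of N weight vectors; neuron k sees the bias, the
                  n_in inputs, and the outputs of neurons 0..k-1, so
                  weights[k] has 1 + n_in + k entries.
        Returns the list of neuron outputs (the last is the network output)."""
        signals = np.concatenate(([1.0], x))          # bias + inputs
        outputs = []
        for w in weights:
            y = np.tanh(np.dot(w, signals))           # weighted sum of all earlier signals
            outputs.append(y)
            signals = np.append(signals, y)           # cascade: feed this output forward
        return outputs

    def fcc_gradient(x, target, weights):
        """Gradient of 0.5*(output - target)**2 with respect to every weight,
        via plain reverse-mode backpropagation through the cascade."""
        n_in = len(x)
        outputs = fcc_forward(x, weights)
        N = len(weights)
        delta = np.zeros(N)                           # delta[k] = d(loss)/d(net_k)
        delta[N - 1] = (outputs[-1] - target) * (1.0 - outputs[-1] ** 2)
        for k in range(N - 2, -1, -1):
            # neuron k feeds every later neuron j through weight weights[j][1 + n_in + k]
            down = sum(delta[j] * weights[j][1 + n_in + k] for j in range(k + 1, N))
            delta[k] = down * (1.0 - outputs[k] ** 2)
        grads = []
        for k in range(N):
            inp = np.concatenate(([1.0], x, outputs[:k]))   # signals seen by neuron k
            grads.append(delta[k] * inp)
        return grads

For example, with two inputs and three cascaded neurons the weight vectors have lengths 3, 4 and 5, and fcc_gradient([0.2, -0.5], 1.0, weights) (hypothetical values) returns three gradient vectors of those same lengths.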

Notes

  1. To save space, the weights obtained in the experiments are not listed; they can be obtained by contacting the author of this paper.

  2. All three problems are regression problems, including Spirals, although Spirals is believed to have been designed by the Neuron by Neuron authors based on the two-spiral problem. The Two-Spiral problem treated by FCCNET in Sect. 5.3, however, is the original two-spiral classification problem.

Acknowledgements

The author would like to thank the anonymous reviewers whose suggestions greatly enhanced the technical quality of this paper.

Author information

Corresponding author

Correspondence to Yiping Cheng.

About this article

Cite this article

Cheng, Y. Backpropagation for Fully Connected Cascade Networks. Neural Process Lett 46, 293–311 (2017). https://doi.org/10.1007/s11063-017-9588-4
