Abstract
Artificial neural networks are well-known computational models that have successfully demonstrated various human cognitive capabilities. Nevertheless, unlike the human brain, neural networks usually must learn each new task from scratch. Furthermore, in contrast to human abilities, retraining a network on a new task does not necessarily preserve previously learned information and may lead to catastrophic forgetting. A well-established method for transferring knowledge between neural networks could alleviate these issues. In this paper, we propose a method to fuse the knowledge contained in separately trained networks. The method is non-iterative and requires neither the original training data nor additional training sessions. We present the theoretical basis of the model, grounded in a probabilistic approach, and test its performance with feedforward neural networks on classification tasks over several publicly available data sets.
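The specific fusion rule is developed in the body of the article and is not reproduced in this abstract. Purely as a loose illustration of what a non-iterative, probabilistic fusion of trained classifiers can look like, the sketch below combines the softmax outputs of two independently trained feedforward networks via a renormalized elementwise product (a product-of-experts style rule). Every name and design choice in it is an assumption for illustration, not the authors' algorithm.

# Minimal, hypothetical sketch: non-iterative fusion of two trained
# feedforward classifiers by multiplying and renormalizing their output
# class distributions. This is NOT the paper's method; it only illustrates
# fusion without retraining or access to the original training data.

import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, weights, biases):
    """Forward pass of a simple feedforward net with tanh hidden layers."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(h @ W + b)
    return softmax(h @ weights[-1] + biases[-1])   # class probabilities

def fuse_predictions(prob_a, prob_b):
    """Combine two class distributions without any further training:
    renormalized elementwise product (product-of-experts style)."""
    fused = prob_a * prob_b
    return fused / fused.sum(axis=-1, keepdims=True)

# Usage with randomly initialized stand-in networks (5 samples, 8 features,
# 3 classes); in practice the weights would come from two trained models.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
net_a = ([rng.normal(size=(8, 16)), rng.normal(size=(16, 3))],
         [np.zeros(16), np.zeros(3)])
net_b = ([rng.normal(size=(8, 16)), rng.normal(size=(16, 3))],
         [np.zeros(16), np.zeros(3)])
p = fuse_predictions(forward(x, *net_a), forward(x, *net_b))
print(p.sum(axis=-1))   # each row sums to 1

The multiplicative combination sharpens the fused distribution where the two networks agree and is one common probabilistically motivated alternative to simple output averaging.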