Abstract
Transfer learning takes an artificial neural network (ANN) trained on one dataset (the source) and adapts it to a new, second dataset (the target). While transfer learning has proven powerful and is commonly used in modern statistical learning setups, its use has generally been restricted by architecture: to reuse the internal learned synaptic weights, the underlying topology of the ANN must remain the same across tasks, and a new output layer must be attached in place of the old one (discarding the old output layer's weights). This work removes that restriction by proposing a neuro-evolutionary approach that enables what we call adaptive structure transfer learning: an ANN can be transferred across tasks with different input and output dimensions while its internal latent structure is continuously optimized. We test the proposed optimizer on two challenging real-world time series prediction problems, adapting recurrent neural networks (RNNs) (1) to predict coal-fired power plant data before and after the addition of new sensors, and (2) to predict engine parameters when RNN estimators are transferred across airframes with different engines. Experiments show that the RNNs produced by the proposed neuro-evolutionary transfer learning process not only evolve and train faster on the target dataset than RNNs trained from scratch, but in many cases also generalize better, even after a long training and evolution process. To our knowledge, this work represents the first use of neuro-evolution for transfer learning, especially for RNNs, and the first methodological framework capable of adapting entire structures to arbitrary input/output spaces.
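To make the mechanism concrete, the following minimal Python sketch illustrates one way adaptive structure transfer could look at the genome level. It is an assumption-laden illustration, not the authors' actual implementation: the Node, Edge, and Genome classes, the transfer_genome function, and the Gaussian weight initialization are all hypothetical. It demonstrates only the idea stated above: the evolved hidden structure and its trained weights are kept, while input and output nodes are rebuilt to the target task's dimensions with freshly initialized connections, after which evolution and training continue on the target data.

import copy
import random

# Hypothetical genome representation for an evolvable RNN; the memory cell
# types (LSTM, GRU, etc.) and recurrent edges are omitted for brevity.
class Node:
    def __init__(self, node_id, kind):
        self.id = node_id
        self.kind = kind  # "input", "hidden", or "output"

class Edge:
    def __init__(self, src, dst, weight):
        self.src, self.dst, self.weight = src, dst, weight

class Genome:
    def __init__(self, nodes, edges):
        self.nodes = nodes
        self.edges = edges

def transfer_genome(source, n_target_inputs, n_target_outputs):
    """Adapt a genome evolved on a source task to a target task with
    different input/output dimensions, keeping the hidden structure."""
    g = copy.deepcopy(source)

    # 1. Drop the old input/output nodes and every edge touching them;
    #    the internal topology and its trained weights survive intact.
    io_ids = {n.id for n in g.nodes if n.kind in ("input", "output")}
    g.nodes = [n for n in g.nodes if n.kind == "hidden"]
    g.edges = [e for e in g.edges
               if e.src not in io_ids and e.dst not in io_ids]

    hidden_ids = [n.id for n in g.nodes]
    next_id = max(hidden_ids, default=0) + 1

    # 2. Attach new input/output nodes sized for the target task, fully
    #    connected to the retained structure with small random weights.
    for kind, count in (("input", n_target_inputs),
                        ("output", n_target_outputs)):
        for _ in range(count):
            g.nodes.append(Node(next_id, kind))
            for h in hidden_ids:
                src, dst = (next_id, h) if kind == "input" else (h, next_id)
                g.edges.append(Edge(src, dst, random.gauss(0.0, 0.1)))
            next_id += 1
    return g

Initializing the new edges with small weights lets the retained internal structure dominate early target-task training; continued neuro-evolution is then free to add, remove, or rewire nodes and edges as the target task demands.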
Acknowledgements
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Combustion Systems under Award Number FE0031547.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
ElSaid, A., Karnas, J., Lyu, Z., Krutz, D., Ororbia, A.G., Desell, T. (2020). Neuro-Evolutionary Transfer Learning Through Structural Adaptation. In: Castillo, P.A., Jiménez Laredo, J.L., Fernández de Vega, F. (eds.) Applications of Evolutionary Computation. EvoApplications 2020. Lecture Notes in Computer Science, vol. 12104. Springer, Cham. https://doi.org/10.1007/978-3-030-43722-0_39
DOI: https://doi.org/10.1007/978-3-030-43722-0_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43721-3
Online ISBN: 978-3-030-43722-0
eBook Packages: Computer Science, Computer Science (R0)