Neuro-Evolutionary Transfer Learning Through Structural Adaptation

  • Conference paper
  • In: Applications of Evolutionary Computation (EvoApplications 2020)

Abstract

Transfer learning involves taking an artificial neural network (ANN) trained on one dataset (the source) and adapting it to a new, second dataset (the target). While transfer learning has been shown to be quite powerful and is commonly used in most modern-day statistical learning setups, its use has generally been restricted by architecture: in order to facilitate the reuse of internal learned synaptic weights, the underlying topology of the ANN to be transferred across tasks must remain the same, and a new output layer must be attached (entailing the removal of the old output layer’s weights). This work removes this restriction by proposing a neuro-evolutionary approach that facilitates what we call adaptive structure transfer learning, meaning that an ANN can be transferred across tasks that have different input and output dimensions while its internal latent structure is continuously optimized. We test the proposed optimizer on two challenging real-world time series prediction problems: our process adapts recurrent neural networks (RNNs) to (1) predict coal-fired power plant data before and after the addition of new sensors, and (2) predict engine parameters where RNN estimators are trained on different airframes with different engines. Experiments show that the proposed neuro-evolutionary transfer learning process not only yields RNNs that evolve and train faster on the target dataset than those trained from scratch but also, in many cases, RNNs that generalize better even after a long training and evolution process. To our knowledge, this work represents the first use of neuro-evolution for transfer learning, especially for RNNs, and is the first methodological framework capable of adapting entire structures for arbitrary input/output spaces.
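
To make the structural-adaptation idea concrete, below is a minimal, hypothetical sketch of the weight-reuse step in Python/NumPy. It shows only how a source RNN's learned recurrent core can be carried into a target network whose input and output dimensions differ; the names (ElmanRNN, structural_transfer) and the plain Elman architecture are illustrative assumptions, not the authors' optimizer, which additionally evolves the network topology and continues training the transferred network (e.g., via backpropagation through time).

    # Hypothetical illustration of adaptive structure transfer: reuse the
    # recurrent core of a trained source RNN while re-sizing the input and
    # output projections for a target task with different dimensions.
    import numpy as np

    rng = np.random.default_rng(0)

    class ElmanRNN:
        """Plain Elman RNN: h_t = tanh(W_in x_t + W_hh h_{t-1}); y_t = W_out h_t."""
        def __init__(self, n_in, n_hidden, n_out):
            self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
            self.W_hh = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
            self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))

        def forward(self, xs):
            """Run the network over a sequence xs of shape (T, n_in)."""
            h = np.zeros(self.W_hh.shape[0])
            ys = []
            for x in xs:
                h = np.tanh(self.W_in @ x + self.W_hh @ h)
                ys.append(self.W_out @ h)
            return np.stack(ys)

    def structural_transfer(source, n_in_target, n_out_target):
        """Build a target network with new input/output dimensions, reusing
        the source's learned recurrent (hidden-to-hidden) weights and the
        input weights of any features the two tasks share."""
        target = ElmanRNN(n_in_target, source.W_hh.shape[0], n_out_target)
        target.W_hh = source.W_hh.copy()                # transferred latent structure
        shared = min(source.W_in.shape[1], n_in_target) # overlapping input features
        target.W_in[:, :shared] = source.W_in[:, :shared]
        return target                                   # W_out starts fresh

    # Source task: 4 sensors, 1 output. Target task adds 2 sensors and a
    # 2nd output (e.g., a plant after new sensors are installed).
    source = ElmanRNN(n_in=4, n_hidden=16, n_out=1)
    target = structural_transfer(source, n_in_target=6, n_out_target=2)
    print(target.forward(rng.normal(size=(10, 6))).shape)  # -> (10, 2)

In this sketch only the fixed-size hidden core survives the transfer; in the paper's setting the hidden structure itself is then further grown and pruned by the evolutionary search rather than held constant.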

Notes

  1. https://github.com/travisdesell/exact/tree/master/datasets.

  2. http://ngafid.org.

Acknowledgements

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Combustion Systems under Award Number #FE0031547.

Author information

Correspondence to Travis Desell.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

ElSaid, A., Karnas, J., Lyu, Z., Krutz, D., Ororbia, A.G., Desell, T. (2020). Neuro-Evolutionary Transfer Learning Through Structural Adaptation. In: Castillo, P.A., Jiménez Laredo, J.L., Fernández de Vega, F. (eds) Applications of Evolutionary Computation. EvoApplications 2020. Lecture Notes in Computer Science, vol 12104. Springer, Cham. https://doi.org/10.1007/978-3-030-43722-0_39

  • DOI: https://doi.org/10.1007/978-3-030-43722-0_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43721-3

  • Online ISBN: 978-3-030-43722-0

  • eBook Packages: Computer Science (R0)
