
Neural Model Extraction for Model-Based Control of a Neural Network Forward Model

Abstract

Neural networks have been widely used to model nonlinear systems that are difficult to formulate analytically. Thus far, because neural networks take a radically different approach to mathematical modeling, control theory has not been applied to them, even when they approximate the nonlinear state equation of a controlled system. In this research, we propose a new approach, neural model extraction, that enables model-based control for a feed-forward neural network trained to approximate a nonlinear state equation. Specifically, we propose a method for extracting linear state equations that are equivalent to the neural network for given input vectors. We conducted simple simulations of a two-degree-of-freedom planar manipulator to verify that the proposed method enables model-based control on neural network forward models. Through simulations that assume different settings for observing the manipulator’s state, we confirm the validity of the proposed method.
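The idea of extracting a linear state equation that matches the network at a given input can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the network is a discrete-time forward model that maps the stacked vector (state, control) to the next state, and its hidden layers use ReLU activations, so that the network is piecewise affine. Under those assumptions, fixing the activation pattern at one input gives an affine map that reproduces the network exactly at that input. All function names here (`extract_affine`, `split_state_input`, and so on) are hypothetical, and this is not necessarily the authors' procedure.

```python
import random

def matvec(W, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Matrix-matrix product for list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def forward(layers, x):
    """Evaluate a feed-forward net: ReLU hidden layers (assumed), linear output."""
    h = x
    for i, (W, b) in enumerate(layers):
        z = [s + bi for s, bi in zip(matvec(W, h), b)]
        h = z if i == len(layers) - 1 else [max(0.0, zi) for zi in z]
    return h

def extract_affine(layers, x):
    """Return (M, c) such that the network equals M x' + c for every x'
    that produces the same ReLU activation pattern as x."""
    n = len(x)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    c = [0.0] * n
    h = x
    for i, (W, b) in enumerate(layers):
        M = matmul(W, M)                                  # compose linear parts
        c = [s + bi for s, bi in zip(matvec(W, c), b)]    # compose offsets
        z = [s + bi for s, bi in zip(matvec(W, h), b)]    # pre-activation at x
        if i < len(layers) - 1:
            mask = [1.0 if zi > 0 else 0.0 for zi in z]   # active units at x
            M = [[m * v for v in row] for m, row in zip(mask, M)]
            c = [m * ci for m, ci in zip(mask, c)]
            h = [max(0.0, zi) for zi in z]
        else:
            h = z
    return M, c

def split_state_input(M, n_state):
    """Split M column-wise into A (state part) and B (control part)."""
    A = [row[:n_state] for row in M]
    B = [row[n_state:] for row in M]
    return A, B

def rand_layer(n_out, n_in):
    """Random weights and biases, standing in for a trained layer."""
    W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [random.uniform(-1, 1) for _ in range(n_out)]
    return W, b

random.seed(0)
# Toy sizes (assumed): 2 state variables, 1 control input, two hidden layers of 4.
layers = [rand_layer(4, 3), rand_layer(4, 4), rand_layer(2, 4)]
xu = [0.3, -0.2, 0.5]                  # stacked (state, control) vector
M, c = extract_affine(layers, xu)
A, B = split_state_input(M, 2)         # x_next = A x + B u + c near xu
pred = [s + ci for s, ci in zip(matvec(M, xu), c)]
```

Here `A`, `B`, and `c` define a local model of the form x_next = A x + B u + c, which is the kind of object that standard linear control design operates on. The model is only valid within the region of input space that shares the activation pattern of `xu`, so it would need to be re-extracted as the state moves.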



Funding

This work was supported by JSPS KAKENHI Grant Numbers 18H01410, 19K22875, and 19H01122.

Author information

Corresponding author

Correspondence to Shuhei Ikemoto.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ikemoto, S., Takahara, K., Kumi, T. et al. Neural Model Extraction for Model-Based Control of a Neural Network Forward Model. SN COMPUT. SCI. 2, 54 (2021). https://doi.org/10.1007/s42979-021-00456-4

  • DOI: https://doi.org/10.1007/s42979-021-00456-4

Keywords

  • Neural network
  • Model-based control