Human locomotion with reinforcement learning using bioinspired reward reshaping strategies

  • Original Article
  • Published in Medical & Biological Engineering & Computing

Abstract

Recent learning strategies such as reinforcement learning (RL) have favored the transition from applied artificial intelligence to general artificial intelligence. One current challenge of RL in healthcare is the development of a controller that teaches a musculoskeletal model to perform dynamic movements. Several solutions have been proposed, but investigations exploring the muscle control problem from a biomechanical point of view remain scarce. Moreover, no study has used reward reshaping together with biological knowledge to develop plausible motor control models for pathophysiological conditions. Consequently, the objective of the present work was to design and evaluate specific bioinspired reward function strategies for human locomotion learning within an RL framework. The deep deterministic policy gradient (DDPG) method for a single-agent RL problem was applied. A 3D musculoskeletal model (8 DoF and 22 muscles) of a healthy adult was used. A virtual interactive environment was developed and simulated with the opensim-rl library. Three reward functions were defined, one each for walking, forward falls, and side falls. Training was performed on Google Cloud Compute Engine. The outcomes were compared with the NIPS 2017 challenge outcomes, experimental observations, and literature data. Regarding learning to walk, the simulated musculoskeletal models walked 18 to 20.5 m for the best solutions, and a compensation strategy of muscle activations was revealed. The soleus, tibialis anterior, and vastii muscles are the main actors of the simple forward fall, and higher-intensity muscle activations were also noted after the fall. All kinematic and muscle patterns were consistent with experimental observations and literature data. Regarding the side fall, intense muscle activation on the expected fall side, used to unbalance the body, was noted. These outcomes suggest that computational and human resources, together with biomechanical knowledge, are needed to develop and evaluate an efficient and robust RL solution. As perspectives, the current solutions will be extended to a larger parameter space in 3D, and a stochastic reinforcement learning model will be investigated to cope with the uncertainties of the musculoskeletal model and its environment, toward a general artificial intelligence solution for human locomotion learning.
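
The abstract outlines the core mechanism: the native reward of the simulated environment is reshaped with bioinspired terms before being passed to the DDPG agent. Below is a minimal sketch of that idea, assuming a generic gym-style environment; the shaping terms (forward progress, an upright-pelvis penalty, an activation-effort penalty), their weights, the height threshold, and the pelvis accessor are illustrative assumptions, not the reward functions published in the article:

```python
# Minimal sketch of bioinspired reward reshaping around a gym-style
# musculoskeletal environment. All terms, weights, and thresholds here
# are illustrative assumptions, not the authors' published rewards.
import numpy as np


class BioinspiredRewardWrapper:
    """Reshapes an environment's native reward with bioinspired terms."""

    def __init__(self, env, pelvis_xy, w_progress=1.0, w_upright=0.5,
                 w_effort=0.01, min_pelvis_height=0.8):
        self.env = env
        self.pelvis_xy = pelvis_xy      # callable: observation -> (x, y) of the pelvis
        self.w_progress = w_progress    # rewards forward displacement (walking task)
        self.w_upright = w_upright      # penalizes the pelvis dropping too low
        self.w_effort = w_effort        # penalizes total squared muscle activation
        self.min_pelvis_height = min_pelvis_height  # hypothetical threshold, meters
        self._prev_x = None

    def reset(self):
        obs = self.env.reset()
        self._prev_x = self.pelvis_xy(obs)[0]
        return obs

    def step(self, action):
        obs, base_reward, done, info = self.env.step(action)
        x, y = self.pelvis_xy(obs)
        progress = x - self._prev_x                 # meters advanced this step
        self._prev_x = x
        upright_penalty = max(0.0, self.min_pelvis_height - y)
        effort = float(np.sum(np.square(action)))   # muscle excitations in [0, 1]
        shaped = (base_reward
                  + self.w_progress * progress
                  - self.w_upright * upright_penalty
                  - self.w_effort * effort)
        return obs, shaped, done, info
```

For the fall tasks, the progress term would be replaced by a task-specific objective (e.g., rewarding pelvis displacement toward the expected fall side), yielding one reward function per task as the abstract indicates.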

Notes

  1. https://www.crowdai.org/challenges/nips-2017-learning-to-run

  2. https://www.crowdai.org/challenges/nips-2018-ai-for-prosthetics-challenge

  3. https://www.aicrowd.com/challenges/neurips-2019-learning-to-move-walk-around

  4. http://osim-rl.stanford.edu/
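
Footnote 4 points to the osim-rl toolkit that provides the simulated musculoskeletal environment. A minimal interaction loop is sketched below, assuming the L2M2019Env entry point from recent osim-rl releases; the exact environment class and model configuration used by the authors are not given in this excerpt:

```python
# Minimal round trip with an osim-rl environment using random muscle
# excitations. L2M2019Env is the entry point of recent osim-rl releases;
# whether the authors used this exact class is an assumption.
from osim.env import L2M2019Env

env = L2M2019Env(visualize=False)
observation = env.reset()

total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()   # one excitation in [0, 1] per muscle
    observation, reward, done, info = env.step(action)
    total_reward += reward
    if done:                             # episode ends, e.g., when the model falls
        observation = env.reset()

print("accumulated reward over 200 steps:", total_reward)
```

A trained DDPG policy would replace the random action sampling, with a reshaped reward such as the wrapper sketched above supplying the learning signal during training.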

Acknowledgments

The authors would like to thank Nicolò Salvatico, Dao Zhou, Yunfei Zhao, and Jayson Galante for their support and contribution in different simulation tasks.

Author information

Corresponding author

Correspondence to Tien-Tuan Dao.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest related to the present work.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nowakowski, K., Carvalho, P., Six, JB. et al. Human locomotion with reinforcement learning using bioinspired reward reshaping strategies. Med Biol Eng Comput 59, 243–256 (2021). https://doi.org/10.1007/s11517-020-02309-3
