Abstract
Synthesizing physiologically accurate human movement in a variety of conditions can help practitioners plan surgeries, design experiments, or prototype assistive devices in simulated environments, reducing time and costs and improving treatment outcomes. Because of the large and complex solution spaces of biomechanical models, current methods are constrained to specific movements and models, requiring careful design of a controller and hindering many possible applications. We sought to determine whether modern optimization methods can efficiently explore these complex spaces. To do this, we posed the problem as a competition in which participants were tasked with developing a controller that enables a physiologically based human model to navigate a complex obstacle course as quickly as possible, without using any experimental data. They were provided with a human musculoskeletal model and a physics-based simulation environment. In this paper, we discuss the design of the competition, technical difficulties, results, and an analysis of the top controllers. The challenge demonstrated that deep reinforcement learning techniques, despite their high computational cost, can be successfully employed as an optimization method for synthesizing physiologically feasible motion in high-dimensional biomechanical systems.
Sharada P. Mohanty and Carmichael F. Ong contributed equally to this work.
Notes
- 3. see, e.g., http://www.rl-competition.org/
- 4. see, e.g., https://youtu.be/hx_bgoTF7bs
Appendix
6.1.1 Installation
We believe that the ease of installing and using the simulator, regardless of a user's background in computer science or biomechanics, contributed significantly to the success of the challenge. The whole installation process took around 1–5 minutes, depending on the internet connection. To illustrate this simplicity, we walk through the installation below. Users were asked to install Anaconda (https://www.continuum.io/downloads) and then to install our reinforcement learning environment by typing
conda create -n opensim-rl -c kidzik opensim git
source activate opensim-rl
pip install git+https://github.com/kidzik/osim-rl.git
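As a quick sanity check before moving on (our suggestion, not part of the original instructions), one can confirm that the package imports cleanly inside the newly activated environment:

python -c "from osim.env import GaitEnv"

If this command exits without an error, the environment and its OpenSim dependency were installed correctly.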
Next, they were asked to start a Python interpreter and run the following snippet, which loads the musculoskeletal model, applies random muscle excitations, and visualizes the skeleton (Fig. 6.8):
from osim.env import GaitEnv

# Create the musculoskeletal environment with visualization enabled
env = GaitEnv(visualize=True)
observation = env.reset()

# Step the simulation for 500 frames using random muscle excitations
for i in range(500):
    observation, reward, done, info = env.step(env.action_space.sample())
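This snippet is only a smoke test; the same Gym-style interface (reset, step, and the reward and done signals step returns) supports a standard episode loop. Below is a minimal sketch of such a loop that accumulates reward and resets the environment when an episode terminates. It uses only the calls shown above; disabling visualization and the particular episode and step counts are our own illustrative choices, not part of the original instructions.

from osim.env import GaitEnv

# Assumes visualize=False disables rendering, which speeds up rollouts
env = GaitEnv(visualize=False)

for episode in range(3):              # arbitrary number of episodes for illustration
    observation = env.reset()
    total_reward = 0.0
    for step in range(500):           # cap on steps per episode
        # A learned policy would map `observation` to an action here;
        # random sampling stands in for it in this sketch.
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    print("episode", episode, "total reward", total_reward)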