Learning to Run Challenge: Synthesizing Physiologically Accurate Motion Using Deep Reinforcement Learning

  • Conference paper
The NIPS '17 Competition: Building Intelligent Systems

Abstract

Synthesizing physiologically accurate human movement in a variety of conditions can help practitioners plan surgeries, design experiments, or prototype assistive devices in simulated environments, reducing time and costs and improving treatment outcomes. Because biomechanical models have large and complex solution spaces, current methods are constrained to specific movements and models; they require careful controller design, which hinders many possible applications. We sought to discover whether modern optimization methods can efficiently explore these complex spaces. To do this, we posed the problem as a competition in which participants were tasked with developing a controller that enables a physiologically based human model to navigate a complex obstacle course as quickly as possible, without using any experimental data. They were provided with a human musculoskeletal model and a physics-based simulation environment. In this paper, we discuss the design of the competition, technical difficulties, results, and analysis of the top controllers. The challenge showed that deep reinforcement learning techniques, despite their high computational cost, can be successfully employed as an optimization method for synthesizing physiologically feasible motion in high-dimensional biomechanical systems.

Sharada P. Mohanty and Carmichael F. Ong contributed equally to this work.

Notes

  1. http://simtk-confluence.stanford.edu:8080/pages/viewpage.action?pageId=5113821

  2. https://simtk.org/projects/kneeloads

  3. see, e.g., http://www.rl-competition.org/

  4. see, e.g., https://youtu.be/hx_bgoTF7bs

  5. https://github.com/stanfordnmbl/osim-rl

  6. https://anaconda.org/

  7. http://crowdai.org/

  8. https://github.com/stanfordnmbl/osim-rl/issues/78

  9. https://github.com/ctmakro/stanford-osrl#the-simulation-is-too-slow

  10. https://kaggle.com/

  11. https://github.com/kidzik/osim-rl-grader

Author information

Corresponding author

Correspondence to Łukasz Kidziński.

Appendix

6.1.1 Installation

We believe that the simulator's ease of use, regardless of a participant's background in computer science or biomechanics, contributed significantly to the success of the challenge. The whole installation process took roughly 1–5 minutes, depending on the internet connection. To illustrate this simplicity, we outline the installation process here. Users were asked to install Anaconda (https://www.continuum.io/downloads) and then to install our reinforcement learning environment by typing

    conda create -n opensim-rl -c kidzik opensim git
    source activate opensim-rl
    pip install git+https://github.com/kidzik/osim-rl.git

Next, they were asked to start a Python interpreter. Running the following commands allows interaction with the musculoskeletal model and opens a visualization of the skeleton (Fig. 6.8):

    from osim.env import GaitEnv
    env = GaitEnv(visualize=True)
    observation = env.reset()
    for i in range(500):
        observation, reward, done, info = env.step(env.action_space.sample())

Fig. 6.8 Visualization of the environment with random muscle activations. This simulation is immediately visible to the user after following the simple installation steps described in the Appendix
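
The loop in the snippet above steps the environment 500 times without checking the done flag. A minimal sketch of the same Gym-style interaction with episode resets is given below. So that it runs without OpenSim installed, it uses a hypothetical stand-in class, StubEnv, which is not part of osim-rl; its observation, reward, action dimension, and sample_action helper are all invented for illustration. With the real environment, one would presumably pass a GaitEnv instance and draw actions from env.action_space.sample() instead.

```python
import random


class StubEnv:
    """Hypothetical stand-in exposing a Gym-style reset/step interface.

    This is NOT the osim-rl GaitEnv; it only mimics the call pattern so
    that the loop below can run without OpenSim installed.
    """

    def __init__(self, n_actions=18, max_steps=50, seed=0):
        self.n_actions = n_actions  # assumed action dimension, illustrative only
        self.max_steps = max_steps  # episode length of the stub
        self._rng = random.Random(seed)
        self._t = 0

    def sample_action(self):
        # Random values in [0, 1), one per (hypothetical) actuator.
        return [self._rng.random() for _ in range(self.n_actions)]

    def reset(self):
        self._t = 0
        return [0.0] * 4  # placeholder observation

    def step(self, action):
        self._t += 1
        observation = [float(self._t)] * 4
        reward = sum(action) / len(action)  # made-up reward, for illustration
        done = self._t >= self.max_steps
        return observation, reward, done, {}


def run_random_agent(env, total_steps=500):
    """Step a random policy, resetting whenever an episode ends."""
    episode_returns = []
    current_return = 0.0
    observation = env.reset()
    for _ in range(total_steps):
        observation, reward, done, info = env.step(env.sample_action())
        current_return += reward
        if done:
            # Record the finished episode and start a new one.
            episode_returns.append(current_return)
            current_return = 0.0
            observation = env.reset()
    return episode_returns


returns = run_random_agent(StubEnv(max_steps=50), total_steps=500)
print(len(returns), "episodes completed")
```

Resetting on done keeps the step budget fixed while avoiding calls to step on an episode that has already ended, which is the usual convention for Gym-style environments.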

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Kidziński, Ł. et al. (2018). Learning to Run Challenge: Synthesizing Physiologically Accurate Motion Using Deep Reinforcement Learning. In: Escalera, S., Weimer, M. (eds) The NIPS '17 Competition: Building Intelligent Systems. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-94042-7_6

  • DOI: https://doi.org/10.1007/978-3-319-94042-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94041-0

  • Online ISBN: 978-3-319-94042-7

  • eBook Packages: Computer Science, Computer Science (R0)
