Abstract
Synthesizing physiologically accurate human movement in a variety of conditions can help practitioners plan surgeries, design experiments, or prototype assistive devices in simulated environments, reducing time and costs and improving treatment outcomes. Because of the large and complex solution spaces of biomechanical models, current methods are constrained to specific movements and models, requiring careful design of a controller and hindering many possible applications. We sought to determine whether modern optimization methods can efficiently explore these complex spaces. To do this, we posed the problem as a competition in which participants were tasked with developing a controller that enables a physiologically based human model to navigate a complex obstacle course as quickly as possible, without using any experimental data. They were provided with a human musculoskeletal model and a physics-based simulation environment. In this paper, we discuss the design of the competition, technical difficulties, results, and an analysis of the top controllers. The challenge demonstrated that deep reinforcement learning techniques, despite their high computational cost, can be successfully employed as an optimization method for synthesizing physiologically feasible motion in high-dimensional biomechanical systems.
Sharada P. Mohanty and Carmichael F. Ong contributed equally to this work.
Notes
- 3. see, e.g., http://www.rl-competition.org/
- 4. see, e.g., https://youtu.be/hx_bgoTF7bs
Appendix
6.1.1 Installation
We believe that the ease of installing and using the simulator, regardless of a user's background in computer science or biomechanics, contributed significantly to the success of the challenge. The whole installation process took around 1–5 minutes, depending on the internet connection. To illustrate this simplicity, we walk through the installation below. Users were asked to install Anaconda (https://www.continuum.io/downloads) and then to install our reinforcement learning environment by typing
conda create -n opensim-rl -c kidzik opensim git
source activate opensim-rl
pip install git+https://github.com/kidzik/osim-rl.git
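As a quick sanity check before moving on (our suggestion, not part of the original instructions), one can confirm that the package imports cleanly inside the newly activated environment:

python -c "from osim.env import GaitEnv"

If this command exits without an error, the environment and its OpenSim dependency were installed correctly.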
Next, they were asked to start a Python interpreter and run the following snippet, which loads the musculoskeletal model, applies random muscle excitations, and visualizes the skeleton (Fig. 6.8):
from osim.env import GaitEnv

# Create the musculoskeletal environment with visualization enabled
env = GaitEnv(visualize=True)
observation = env.reset()

# Step the simulation for 500 frames using random muscle excitations
for i in range(500):
    observation, reward, done, info = env.step(env.action_space.sample())
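This snippet is only a smoke test; the same Gym-style interface (reset, step, and the reward and done signals step returns) supports a standard episode loop. Below is a minimal sketch of such a loop that accumulates reward and resets the environment when an episode terminates. It uses only the calls shown above; disabling visualization and the particular episode and step counts are our own illustrative choices, not part of the original instructions.

from osim.env import GaitEnv

# Assumes visualize=False disables rendering, which speeds up rollouts
env = GaitEnv(visualize=False)

for episode in range(3):              # arbitrary number of episodes for illustration
    observation = env.reset()
    total_reward = 0.0
    for step in range(500):           # cap on steps per episode
        # A learned policy would map `observation` to an action here;
        # random sampling stands in for it in this sketch.
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    print("episode", episode, "total reward", total_reward)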