Abstract
In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model, with the goal of matching a given time-varying velocity vector. The top participants describe their algorithms in this paper. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each team implemented different modifications of the known algorithms, for example by dividing the task into subtasks, learning low-level control, or incorporating expert knowledge and using imitation learning.
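The frame-skipping heuristic mentioned above can be sketched as a simple environment wrapper that repeats each action for several simulator steps and accumulates the reward. This is a minimal illustration with a generic `step` interface, not the actual osim-rl API used in the challenge:

```python
class FrameSkip:
    """Repeat each agent action for `skip` simulator steps,
    summing the rewards, to shorten the effective horizon."""

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        obs, done = None, False
        for _ in range(self.skip):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:  # stop repeating once the episode ends
                break
        return obs, total_reward, done
```

Wrapping a slow physics simulator this way reduces the number of decisions the policy must make per episode, at the cost of coarser control.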
Notes
- 1. Find open-source code at: https://github.com/PaddlePaddle/PARL.
- 8. Each observation provided by the simulator was a Python dict, so it had to be flattened into an array of floats for the agent's consumption. This flattening was done using a function from the helper library [27]. Due to a bug in the use of this code, some of the coordinates were replicated several times, so the actual vector size used in training is 417.
- 11. joint_pos hip_l [1] in the observation dictionary.
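The flattening described in note 8 can be sketched as a short recursive function. This is a minimal illustration, not the helper library's actual implementation [27]; the function name and the sample observation keys are illustrative only:

```python
def flatten_observation(obs):
    """Recursively flatten a nested dict/list of numbers
    into a flat list of floats, in a fixed key order."""
    values = []
    if isinstance(obs, dict):
        for key in sorted(obs):  # fixed order keeps feature indices stable
            values.extend(flatten_observation(obs[key]))
    elif isinstance(obs, (list, tuple)):
        for item in obs:
            values.extend(flatten_observation(item))
    else:
        values.append(float(obs))
    return values

obs = {"joint_pos": {"hip_l": [0.1, -0.2], "knee_l": [0.3]},
       "pelvis": {"height": 0.94}}
flat = flatten_observation(obs)  # [0.1, -0.2, 0.3, 0.94]
```

A fixed traversal order matters: if keys are visited in an unstable order, the same physical quantity would land at different indices across steps, which breaks training.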
References
Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Abbeel, P., Zaremba, W.: Hindsight experience replay. In: NIPS (2017)
Anonymous: Recurrent experience replay in distributed reinforcement learning. OpenReview preprint, https://openreview.net/pdf?id=r1lyTjAqYX (2018)
Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
Barth-Maron, G., Hoffman, M.W., Budden, D., Dabney, W., Horgan, D., Muldal, A., Heess, N., Lillicrap, T.: Distributed distributional deterministic policy gradients. arXiv preprint arXiv:1804.08617 (2018)
Bellemare, M.G., Dabney, W., Munos, R.: A distributional perspective on reinforcement learning. arXiv preprint arXiv:1707.06887 (2017)
Bellman, R.E.: Adaptive control processes: a guided tour. Princeton University Press (1961)
Bhatt, A., Argus, M., Amiranashvili, A., Brox, T.: Crossnorm: Normalization for off-policy td reinforcement learning. arXiv preprint arXiv:1902.05605 (2019)
Crowninshield, R.D., Brand, R.A.: A physiologically based criterion of muscle force prediction in locomotion. Journal of Biomechanics 14(11), 793–801 (1981)
Dabney, W., Rowland, M., Bellemare, M.G., Munos, R.: Distributional reinforcement learning with quantile regression. arXiv preprint arXiv:1710.10044 (2017)
Delp, S.L., Anderson, F.C., Arnold, A.S., Loan, P., Habib, A., John, C.T., Guendelman, E., Thelen, D.G.: Opensim: open-source software to create and analyze dynamic simulations of movement. IEEE transactions on biomedical engineering 54(11), 1940–1950 (2007)
Dhariwal, P., Hesse, C., Plappert, M., Radford, A., Schulman, J., Sidor, S., Wu, Y.: OpenAI Baselines. https://github.com/openai/baselines (2017)
Dietterich, T.G., et al.: Ensemble methods in machine learning. Multiple classifier systems 1857, 1–15 (2000)
Farris, D.J., Hicks, J.L., Delp, S.L., Sawicki, G.S.: Musculoskeletal modelling deconstructs the paradoxical effects of elastic ankle exoskeletons on plantar-flexor mechanics and energetics during hopping. Journal of Experimental Biology 217(22), 4018–4028 (2014)
Fortunato, M., Azar, M.G., Piot, B., Menick, J., Osband, I., Graves, A., Mnih, V., Munos, R., Hassabis, D., Pietquin, O., et al.: Noisy networks for exploration. arXiv preprint arXiv:1706.10295 (2017)
Fujimoto, S., van Hoof, H., Meger, D.: Addressing function approximation error in actor-critic methods. arXiv preprint arXiv:1802.09477 (2018)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290 (2018)
Horgan, D., Quan, J., Budden, D., Barth-Maron, G., Hessel, M., Van Hasselt, H., Silver, D.: Distributed prioritized experience replay. arXiv preprint arXiv:1803.00933 (2018)
Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J.E., Weinberger, K.Q.: Snapshot ensembles: Train 1, get m for free. arXiv preprint arXiv:1704.00109 (2017)
Huang, Z., Zhou, S., Zhuang, B., Zhou, X.: Learning to run with actor-critic ensemble. arXiv preprint arXiv:1712.08987 (2017)
Osband, I., Blundell, C., Pritzel, A., Van Roy, B.: Deep exploration via bootstrapped DQN. In: NIPS (2016)
Jaśkowski, W., Lykkebø, O.R., Toklu, N.E., Trifterer, F., Buk, Z., Koutník, J., Gomez, F.: Reinforcement Learning to Run…Fast. In: S. Escalera, M. Weimer (eds.) NIPS 2017 Competition Book. Springer (2018)
John, C.T., Anderson, F.C., Higginson, J.S., Delp, S.L.: Stabilisation of walking by intrinsic muscle properties revealed in a three-dimensional muscle-driven simulation. Computer methods in biomechanics and biomedical engineering 16(4), 451–462 (2013)
Kidziński, Ł., Mohanty, S.P., Ong, C., Huang, Z., Zhou, S., Pechenko, A., Stelmaszczyk, A., Jarosik, P., Pavlov, M., Kolesnikov, S., et al.: Learning to run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments. arXiv preprint arXiv:1804.00361 (2018)
Kidziński, Ł., Mohanty, S.P., Ong, C., Hicks, J., Francis, S., Levine, S., Salathé, M., Delp, S.: Learning to run challenge: Synthesizing physiologically accurate motion using deep reinforcement learning. In: S. Escalera, M. Weimer (eds.) NIPS 2017 Competition Book. Springer (2018)
Klambauer, G., Unterthiner, T., Mayr, A., Hochreiter, S.: Self-normalizing neural networks. arXiv preprint arXiv:1706.02515 (2017)
Lee, G., Kim, J., Panizzolo, F., Zhou, Y., Baker, L., Galiana, I., Malcolm, P., Walsh, C.: Reducing the metabolic cost of running with a tethered soft exosuit. Science Robotics 2(6) (2017)
Lee, S.R.: Helper for NIPS 2018: AI for Prosthetics. https://github.com/seungjaeryanlee/osim-rl-helper (2018)
Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
Loshchilov, I., Hutter, F.: Sgdr: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (ICLR) 2017 Conference Track (2017)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Moritz, P., Nishihara, R., Wang, S., Tumanov, A., Liaw, R., Liang, E., Elibol, M., Yang, Z., Paul, W., Jordan, M.I., et al.: Ray: A distributed framework for emerging AI applications. In: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pp. 561–577 (2018)
Ong, C.F., Geijtenbeek, T., Hicks, J.L., Delp, S.L.: Predictive simulations of human walking produce realistic cost of transport at a range of speeds. In: Proceedings of the 16th International Symposium on Computer Simulation in Biomechanics, pp. 19–20 (2017)
Pardo, F., Tavakoli, A., Levdik, V., Kormushev, P.: Time limits in reinforcement learning. arXiv preprint arXiv:1712.00378 (2017)
Pavlov, M., Kolesnikov, S., Plis, S.M.: Run, skeleton, run: skeletal model in a physics-based simulation. arXiv e-prints (2017)
Peng, X.B., Abbeel, P., Levine, S., van de Panne, M.: Deepmimic: Example-guided deep reinforcement learning of physics-based character skills. arXiv preprint arXiv:1804.02717 (2018)
Plappert, M., Houthooft, R., Dhariwal, P., Sidor, S., Chen, R.Y., Chen, X., Asfour, T., Abbeel, P., Andrychowicz, M.: Parameter space noise for exploration. arXiv preprint arXiv:1706.01905 (2) (2017)
Ross, S., Gordon, G., Bagnell, D.: A reduction of imitation learning and structured prediction to no-regret online learning. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp. 627–635 (2011)
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay. arXiv preprint arXiv:1511.05952 (2015)
Schulman, J., Levine, S., Abbeel, P., Jordan, M.I., Moritz, P.: Trust region policy optimization. In: ICML, pp. 1889–1897 (2015)
Schulman, J., Moritz, P., Levine, S., Jordan, M., Abbeel, P.: High-dimensional continuous control using generalized advantage estimation. arXiv preprint arXiv:1506.02438 (2015)
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017). URL http://arxiv.org/abs/1707.06347
Seth, A., Hicks, J., Uchida, T., Habib, A., Dembia, C., Dunne, J., Ong, C., DeMers, M., Rajagopal, A., Millard, M., Hamner, S., Arnold, E., Yong, J., Lakshmikanth, S., Sherman, M., Delp, S.: Opensim: Simulating musculoskeletal dynamics and neuromuscular control to study human and animal movement. PLoS Computational Biology 14(7) (2018)
Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., Riedmiller, M.: Deterministic policy gradient algorithms. In: Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 387–395 (2014)
Song, S., Geyer, H.: A neural circuitry that emphasizes spinal feedback generates diverse behaviours of human locomotion. The Journal of physiology 593(16), 3493–3511 (2015)
Sosin, I., Svidchenko, O., Malysheva, A., Kudenko, D., Shpilman, A.: Framework for Deep Reinforcement Learning with GPU-CPU Multiprocessing (2018). URL https://doi.org/10.5281/zenodo.1938263
Sutton, R.S., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112 (1999)
Thelen, D.G., Anderson, F.C., Delp, S.L.: Generating dynamic simulations of movement using computed muscle control. Journal of Biomechanics 36(3), 321–328 (2003)
Uchida, T.K., Seth, A., Pouya, S., Dembia, C.L., Hicks, J.L., Delp, S.L.: Simulating ideal assistive devices to reduce the metabolic cost of running. PLOS ONE 11(9), 1–19 (2016). https://doi.org/10.1371/journal.pone.0163417
Wu, Y., Tian, Y.: Training agent for first-person shooter game with actor-critic curriculum learning (2017)
Bengio, Y., Louradour, J., Collobert, R., Weston, J.: Curriculum learning. In: ICML (2009)
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Kidziński, Ł. et al. (2020). Artificial Intelligence for Prosthetics: Challenge Solutions. In: Escalera, S., Herbrich, R. (eds) The NeurIPS '18 Competition. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-29135-8_4
Print ISBN: 978-3-030-29134-1
Online ISBN: 978-3-030-29135-8