Unsupervised early prediction of human reaching for human–robot collaboration in shared workspaces

Abstract

This paper focuses on human–robot collaboration in industrial manipulation tasks that take place in a shared workspace. In this setting we wish to predict, as quickly as possible, the human's reaching motion so that the robot can avoid interference while performing a complementary task. Given an observed part of a human's reaching motion, we thus wish to predict the remainder of the trajectory, and demonstrate that this prediction is effective as a real-time input to the robot for human–robot collaboration tasks. We propose a two-layer framework of Gaussian Mixture Models and an unsupervised online learning algorithm that updates these models with newly observed trajectories. Unlike previous work in this area, which relies on supervised learning methods to build models of human motion, our approach requires no offline training or manual labeling. The main advantage of this unsupervised approach is that it can build models on the fly and adapt to new people and new motion styles as they emerge. We test our method on motion capture data from a human–human collaboration experiment to show the early prediction performance. We also present two human–robot workspace-sharing experiments of varying difficulty in which the robot predicts the human's motion every 0.1 s. The experimental results suggest that our framework can use human motion predictions to decide on robot motions that avoid the human in real-time applications with high reliability.
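The core prediction step described above, completing a partially observed trajectory from a learned Gaussian Mixture Model, is typically realized via Gaussian Mixture Regression: the joint mixture over (time, position) is conditioned on the query time. The sketch below illustrates that conditioning step only, with a hand-specified two-component mixture over (t, x, y) standing in for the models that the paper learns online from observed reaching motions; all parameter values here are hypothetical.

```python
import numpy as np

def gaussian_pdf_1d(t, mean, var):
    """Density of a 1-D Gaussian evaluated at t."""
    return np.exp(-0.5 * (t - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def gmr_predict(t, weights, means, covs):
    """Gaussian Mixture Regression: condition a GMM over (time, position)
    on a query time t and return the expected position."""
    resp, cond_means = [], []
    for w, mu, S in zip(weights, means, covs):
        S_tt = S[0, 0]      # variance of the time dimension
        S_xt = S[1:, 0]     # cross-covariance between position and time
        # Mean of position given time, by standard Gaussian conditioning
        cond_means.append(mu[1:] + S_xt / S_tt * (t - mu[0]))
        # Responsibility of this component at the query time
        resp.append(w * gaussian_pdf_1d(t, mu[0], S_tt))
    resp = np.asarray(resp)
    resp /= resp.sum()
    # Responsibility-weighted blend of the per-component conditional means
    return (resp[:, None] * np.asarray(cond_means)).sum(axis=0)

# Hypothetical two-component mixture over (t, x, y); in the paper's framework
# these parameters would be estimated online from observed trajectories.
weights = [0.5, 0.5]
means = [np.array([0.2, 0.0, 0.0]), np.array([0.8, 1.0, 0.5])]
covs = [np.array([[0.01, 0.02, 0.01],
                  [0.02, 0.10, 0.00],
                  [0.01, 0.00, 0.10]])] * 2

print(gmr_predict(0.2, weights, means, covs))  # ≈ [0. 0.]  (first component dominates)
print(gmr_predict(0.8, weights, means, covs))  # ≈ [1. 0.5] (second component dominates)
```

Queried at successive future times, this yields the predicted remainder of the reaching motion; repeating the query as new observations arrive mirrors the paper's 0.1 s prediction cycle.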



Notes

  1.

    https://github.com/WPI-ARC/unsupervised_online_reaching_prediction.


Funding

Funding was provided by the National Science Foundation (Grant No. 1317462).

Author information

Corresponding author

Correspondence to Ruikun Luo.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 236938 KB)


About this article


Cite this article

Luo, R., Hayne, R. & Berenson, D. Unsupervised early prediction of human reaching for human–robot collaboration in shared workspaces. Auton Robot 42, 631–648 (2018). https://doi.org/10.1007/s10514-017-9655-8


Keywords

  • Human motion prediction
  • Human–robot collaboration
  • Human–robot manipulation
  • Learning