Unsupervised early prediction of human reaching for human–robot collaboration in shared workspaces

Article

Abstract

This paper focuses on human–robot collaboration in industrial manipulation tasks that take place in a shared workspace. In this setting we wish to predict, as early as possible, the human’s reaching motion so that the robot can avoid interference while performing a complementary task. Given an observed portion of a human’s reaching motion, we thus wish to predict the remainder of the trajectory, and we demonstrate that this prediction is effective as a real-time input to the robot for human–robot collaboration tasks. We propose a two-layer framework of Gaussian Mixture Models and an unsupervised online learning algorithm that updates these models with newly observed trajectories. Unlike previous work in this area, which relies on supervised learning methods to build models of human motion, our approach requires no offline training or manual labeling. The main advantage of this unsupervised approach is that it can build models on the fly and adapt to new people and new motion styles as they emerge. We test our method on motion-capture data from a human–human collaboration experiment to evaluate early prediction performance. We also present two human–robot workspace-sharing experiments of varying difficulty in which the robot predicts the human’s motion every 0.1 s. The experimental results suggest that our framework can use human motion predictions to select robot motions that reliably avoid the human in real-time applications.
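To make the prediction step concrete, the following is a minimal sketch (not the authors' implementation) of Gaussian mixture regression for completing a partially observed trajectory: each training motion is assumed to be resampled to a fixed number of waypoints and flattened into one vector, a GMM is fit over these vectors, and the unseen dimensions are predicted by conditioning each component on the observed prefix. The function name `predict_remainder` and the use of scikit-learn's `GaussianMixture` are illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative setup: each trajectory is resampled to T waypoints of
# dimension d and flattened to a length-(T*d) vector before fitting, e.g.
#   gmm = GaussianMixture(n_components=K, covariance_type='full').fit(X)

def predict_remainder(gmm, x_obs, obs_dims, pred_dims):
    """Estimate E[x_pred | x_obs] under a fitted GMM by conditioning
    each Gaussian component on the observed prefix dimensions
    (standard Gaussian mixture regression)."""
    K = gmm.means_.shape[0]
    cond_means = np.zeros((K, len(pred_dims)))
    log_resp = np.zeros(K)
    for k in range(K):
        mu_o = gmm.means_[k, obs_dims]
        mu_p = gmm.means_[k, pred_dims]
        cov = gmm.covariances_[k]            # 'full' covariance: (D, D)
        S_oo = cov[np.ix_(obs_dims, obs_dims)]
        S_po = cov[np.ix_(pred_dims, obs_dims)]
        diff = x_obs - mu_o
        # Conditional mean of the unseen dims given the observed prefix.
        cond_means[k] = mu_p + S_po @ np.linalg.solve(S_oo, diff)
        # Log-responsibility of component k for the observed prefix.
        _, logdet = np.linalg.slogdet(S_oo)
        mahal = diff @ np.linalg.solve(S_oo, diff)
        log_resp[k] = (np.log(gmm.weights_[k])
                       - 0.5 * (mahal + logdet
                                + len(obs_dims) * np.log(2 * np.pi)))
    # Normalize responsibilities in log space for numerical stability.
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    # Responsibility-weighted mixture of per-component predictions.
    return resp @ cond_means
```

Given a prefix covering the first m of T waypoints, `obs_dims` would be the first m*d flattened indices and `pred_dims` the remaining ones; re-running this conditioning every 0.1 s as new samples arrive mirrors the paper's prediction loop. The unsupervised online update of the models themselves is omitted from this sketch.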

Keywords

Human motion prediction · Human–robot collaboration · Human–robot manipulation · Learning

Supplementary material

Supplementary material 1 (MP4, 236,938 KB)


Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Robotics Program, University of Michigan, Ann Arbor, USA
  2. Computer Science Department, Worcester Polytechnic Institute, Worcester, USA
  3. Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, USA
