
Multi-users online recognition of technical gestures for natural human–robot collaboration in manufacturing


Abstract

Human–robot collaboration in an industrial context requires smooth, natural and efficient coordination between the robot and human operators. The approach we propose to achieve this goal is online recognition of technical gestures. In this paper, we bring together, analyze, parameterize and evaluate much more thoroughly three findings previously presented separately in conference papers: (1) we show on a real prototype that multi-user continuous real-time recognition of technical gestures on an assembly line is feasible (≈ 90% recall and precision in our case study), using only non-intrusive sensors (a depth camera with a top view, plus inertial sensors placed on tools); (2) we formulate an end-to-end methodology for designing and developing such a system; (3) we propose a method for adapting our gesture recognition to new users. Furthermore, we present two new findings: (1) by comparing recognition performance using several sets of features, we highlight the importance of choosing features that focus on the effective part of a gesture, i.e. usually the hand movements; (2) we obtain new results suggesting that enriching a multi-user training set can lead to higher precision than using a separate training dataset for each operator.
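To make the evaluation metric quoted above concrete, the following minimal Python sketch shows how per-gesture precision and recall (the ≈ 90% figures) can be computed from aligned ground-truth and predicted gesture labels. It is illustrative only, not the authors' evaluation code, and the gesture class names are hypothetical.

    # Illustrative only: per-class precision and recall for a gesture recognizer,
    # computed from ground-truth and predicted labels (e.g. one label per
    # recognized gesture segment). The class names below are hypothetical.
    from collections import Counter

    def precision_recall(y_true, y_pred, labels):
        """Return {gesture: (precision, recall)} from aligned label sequences."""
        tp, fp, fn = Counter(), Counter(), Counter()
        for t, p in zip(y_true, y_pred):
            if t == p:
                tp[t] += 1
            else:
                fp[p] += 1
                fn[t] += 1
        scores = {}
        for g in labels:
            precision = tp[g] / (tp[g] + fp[g]) if (tp[g] + fp[g]) else 0.0
            recall = tp[g] / (tp[g] + fn[g]) if (tp[g] + fn[g]) else 0.0
            scores[g] = (precision, recall)
        return scores

    # Hypothetical gesture classes for an assembly-line use case
    gestures = ["take_part", "place_part", "screw", "idle"]
    truth = ["take_part", "screw", "screw", "place_part", "idle"]
    pred  = ["take_part", "screw", "idle",  "place_part", "idle"]
    print(precision_recall(truth, pred, gestures))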


Notes

  1. http://nickgillian.com/grt/.

  2. Note that in modern factories, many tools such as screwing guns are actually connected to the assembly-line information system, so that binary information such as “moving” or “in use” can be readily available even without having to place inertial sensors on them.
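As an illustration of the preceding note, the short Python sketch below appends such a binary tool-state flag to per-frame features computed from the depth camera. It is a sketch under assumptions, not the authors' implementation; the function and parameter names (fuse_features, tool_in_use) and the array shapes are hypothetical.

    # Illustrative only: appending a binary "tool in use" flag (e.g. reported by
    # the assembly-line information system for a connected screwing gun) to
    # per-frame features extracted from the depth camera. Names are hypothetical.
    import numpy as np

    def fuse_features(depth_features, tool_in_use):
        """Concatenate per-frame depth features with a binary tool-state flag.

        depth_features: (n_frames, n_features) array, e.g. hand positions/velocities
        tool_in_use:    (n_frames,) sequence of 0/1 flags from the tool or its sensor
        """
        flags = np.asarray(tool_in_use, dtype=float).reshape(-1, 1)
        return np.hstack([np.asarray(depth_features, dtype=float), flags])

    # Tiny example: 3 frames, 4 depth-based features per frame
    frames = np.random.rand(3, 4)
    tool_flags = [0, 1, 1]
    print(fuse_features(frames, tool_flags).shape)  # (3, 5)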


Acknowledgements

This research benefited from the support of the Chair ‘PSA Peugeot Citroën Robotics and Virtual Reality’, led by MINES ParisTech and supported by PEUGEOT S.A. The partners of the Chair cannot be held accountable for the content of this paper, which engages the authors’ responsibility only.

Author information


Corresponding author

Correspondence to Fabien Moutarde.

Additional information

This is one of several papers published in Autonomous Robots comprising the Special Issue on Learning for Human–Robot Collaboration.


About this article


Cite this article

Coupeté, E., Moutarde, F. & Manitsaris, S. Multi-users online recognition of technical gestures for natural human–robot collaboration in manufacturing. Auton Robot 43, 1309–1325 (2019). https://doi.org/10.1007/s10514-018-9704-y

