Gaze and motion information fusion for human intention inference

  • Harish Chaandar Ravichandar
  • Avnish Kumar
  • Ashwin Dani
Regular Paper


An algorithm, named gaze-based multiple model intention estimator (G-MMIE), is presented for early prediction of the goal location (intention) of human reaching actions. The trajectories of the arm motion for reaching tasks are modeled by an autonomous dynamical system that contracts toward the goal location. A neural network (NN) is used to represent the dynamics of human arm reaching motion, and the NN parameters are learned under constraints derived from contraction analysis. These constraints ensure that the trajectories of the dynamical system converge to a single equilibrium point. In order to use the motion model learned from a few demonstrations in new scenarios with multiple candidate goal locations, an interacting multiple-model (IMM) framework is used. For a given reaching motion, multiple models are obtained by translating the equilibrium point of the contracting system to each of the known candidate locations, so that each model corresponds to the reaching motion ending at the respective candidate location. Further, since humans tend to look toward the location they are reaching for, prior probabilities of the goal locations are calculated from information about the human's gaze. The posterior probabilities of the models are calculated through model-matched filtering within the IMM framework, and the candidate location with the highest posterior probability is chosen as the estimate of the true goal location. Detailed quantitative evaluations of the G-MMIE algorithm on two different datasets involving 15 subjects, and comparisons with state-of-the-art intention inference algorithms, are presented.
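The fusion step described above can be sketched as a simple Bayesian update: a gaze-based prior over the candidate goal locations is combined with per-model filter likelihoods, and the candidate with the highest posterior is selected. The sketch below is illustrative only; the softmax-over-gaze-distance prior, the `beta` temperature, and the likelihood values are assumptions, not the paper's exact formulation.

```python
import numpy as np

def gaze_priors(gaze_point, candidates, beta=5.0):
    # Illustrative prior: candidates closer to the gaze point receive
    # higher probability via a softmax over negative distances.
    d = np.linalg.norm(candidates - gaze_point, axis=1)
    w = np.exp(-beta * d)
    return w / w.sum()

def posterior(priors, likelihoods):
    # Bayes' rule over the candidate goal models; in G-MMIE the
    # likelihoods would come from the model-matched IMM filters.
    p = priors * likelihoods
    return p / p.sum()

# Three hypothetical candidate goal locations in the plane.
candidates = np.array([[0.5, 0.0], [0.0, 0.5], [-0.5, 0.0]])
gaze = np.array([0.45, 0.05])        # gaze lands near candidate 0
lik = np.array([0.6, 0.3, 0.1])      # assumed per-model likelihoods

post = posterior(gaze_priors(gaze, candidates), lik)
goal_idx = int(np.argmax(post))      # estimated goal: index 0
```

The choice of a softmax prior is one common way to turn a continuous gaze estimate into a discrete distribution over goals; any prior that concentrates mass near the gazed-at location would serve the same role in the update.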


Human intention inference · Information fusion · Human-robot collaboration



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, University of Connecticut, Storrs, USA
