Abstract
This paper presents a robotic head that enables social robots to attend to scene saliency with bio-inspired saccadic behaviors. Scene saliency is determined by measuring low-level static scene information, motion, and object prior knowledge. With the proposed control scheme, the robotic head shifts its gaze toward the extracted saliency spots in a saccadic manner while obeying eye–head coordination laws. Results from a simulation study and real-world applications demonstrate the effectiveness of the proposed method in discovering scene saliency and producing human-like head motion. The proposed techniques could be applied to social robots to enhance social presence and user experience in human–robot interaction.
References
Asfour, T., Welke, K., Azad, P., Ude, A., & Dillmann, R. (2008). The Karlsruhe humanoid head. In Proceedings of IEEE-RAS international conference on humanoid robots (pp. 447–453).
Breazeal, C. (2000). Sociable machines: Expressive social exchange between humans and robots. Ph.D. thesis, Massachusetts Institute of Technology.
Butko, N., Zhang, L., Cottrell, G., & Movellan J. (2008). Visual saliency model for robot cameras. In IEEE international conference on robotics and automation (pp. 2398–2403).
Choi, S.-B., Ban, S.-W., & Lee, M. (2004). Biologically motivated visual attention system using bottom-up saliency map and top-down inhibition. Neural Information Processing—Letters and Review, 2(1), 19–25.
Corbetta, M., & Shulman, G. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215.
Crawford, J., Martinez-Trujillo, J., & Klier, E. (2003). Neural control of three-dimensional eye and head movements. Current Opinion in Neurobiology, 13(6), 655–662.
Crawford, J., & Vilis, T. (1991). Axes of eye rotation and Listing's law during rotations of the head. Journal of Neurophysiology, 65(3), 407–423.
Cui, R., Gao, B., & Guo, J. (2012). Pareto-optimal coordination of multiple robots with safety guarantees. Autonomous Robots, 1–17. doi:10.1007/s10514-012-9302-3.
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Computer Society conference on computer vision and pattern recognition.
Donders, F. (1848). Beitrag zur lehre von den bewegungen des menschlichen auges. Holland Beitr Anat Physiol Wiss, 1(104), 384.
Doretto, G., Chiuso, A., Wu, Y., & Soatto, S. (2003). Dynamic textures. International Journal of Computer Vision, 51(2), 91–109.
Gao, D., & Vasconcelos, N. (2007). Bottom-up saliency is a discriminant process. In IEEE 11th international conference on computer vision, ICCV 2007 (pp. 1–6).
Ge, S., He, H., & Zhang, Z. (2011). Bottom-up saliency detection for attention determination. Machine Vision and Applications, 24, 1–14.
Glenn, B., & Vilis, T. (1992). Violations of Listing's law after large eye and head gaze shifts. Journal of Neurophysiology, 68(1), 309–318.
Goossens, H., & Opstal, A. (1997). Human eye–head coordination in two dimensions under different sensorimotor conditions. Experimental Brain Research, 114(3), 542–560.
Guitton, D., & Volle, M. (1987). Gaze control in humans: Eye–head coordination during orienting movements to targets within and beyond the oculomotor range. Journal of Neurophysiology, 58(3), 427–459.
Hartley, R., & Zisserman, A. (2000). Multiple view geometry in computer vision (Vol. 2). New York: Cambridge University Press.
He, H., Ge, S., & Zhang, Z. (2011). Visual attention prediction using saliency determination of scene understanding for social robots. International Journal of Social Robotics (Special Issue: Towards an Effective Design of Social Robots), 3, 457–468.
He, H., Zhang, Z., & Ge, S. (2010). Attention determination for social robots using salient region detection. In International conference on social robotics (pp. 295–304). Heidelberg: Springer.
Heuring, J., & Murray, D. (1999). Modeling and copying human head movements. IEEE Transactions on Robotics and Automation, 15(6), 1095–1108.
Hwang, A. D., Higgins, E. C., & Pomplun, M. (2009). A model of top-down attentional control during visual search in complex scenes. Journal of Vision, 9(5), 25.1–25.18.
Itti, L. (2003). Realistic avatar eye and head animation using a neurobiological model of visual attention. Tech. Rep. Defense Technical Information Center Document.
Itti, L. (2005). Models of bottom-up attention and saliency. In L. Itti, G. Rees, & J. K. Tsotsos (Eds.), Neurobiology of attention (pp. 576–582). San Diego, CA: Elsevier.
Judd, T., Ehinger, K., Durand, F., & Torralba, A. (2010). Learning to predict where humans look. In International conference on computer vision.
Kanan, C., Tong, M. H., Zhang, L., & Cottrell, G. W. (2009). Sun: Top-down saliency using natural statistics. Visual Cognition, 17(6–7), 979–1003.
Laschi, C., Asuni, G., Guglielmelli, E., Teti, G., Johansson, R., Konosu, H., et al. (2008). A bio-inspired predictive sensory-motor coordination scheme for robot reaching and preshaping. Autonomous Robots, 25(1), 85–101.
Le Meur, O., Le Callet, P., Barba, D., & Thoreau, D. (2006). A coherent computational approach to model bottom-up visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(5), 802–817.
Lopes, M., Bernardino, A., Santos-Victor, J., Rosander, K., & von Hofsten, C. (2009). Biomimetic eye-neck coordination. In Proceedings of IEEE international conference on development and learning (pp. 1–8).
Maini, E., Manfredi, L., Laschi, C., & Dario, P. (2008). Bioinspired velocity control of fast gaze shifts on a robotic anthropomorphic head. Autonomous Robots, 25(1), 37–58.
Medendorp, W., Van Gisbergen, J., Horstink, M., & Gielen, C. (1999). Donders’ law in torticollis. Journal of Neurophysiology, 82(5), 2833.
Milanese, R., Wechsler, H., Gill, S., Bost, J.-M., & Pun, T. (1994). Integration of bottom-up and top-down cues for visual attention using non-linear relaxation. In IEEE Computer Society conference on computer vision and pattern recognition, Proceedings CVPR’94 (pp. 781–785).
Morel, J., & Yu, G. (2009). ASIFT: A new framework for fully affine invariant image comparison. SIAM Journal on Imaging Sciences, 2(2), 438–469.
Nagai, Y., Hosoda, K., Morita, A., & Asada, M. (2003). A constructive model for the development of joint attention. Connection Science, 15(4), 211–229.
Navalpakkam, V., & Itti, L. (2006). An integrated model of top-down and bottom-up attention for optimizing detection speed. In 2006 IEEE Computer Society conference on computer vision and pattern recognition (Vol. 2, pp. 2049–2056).
Oliva, A., Torralba, A., Castelhano, M. S., & Henderson, J. M. (2003). Top-down control of visual attention in object detection. In Proceedings of 2003 IEEE international conference on image processing, ICIP 2003 (Vol. 1, pp. 1–253).
Pagel, M., Maël, E., & Von Der Malsburg, C. (1998). Self calibration of the fixation movement of a stereo camera head. Autonomous Robots, 5(3), 355–367.
Raphan, T. (1998). Modeling control of eye orientation in three dimensions. I. Role of muscle pulleys in determining saccadic trajectory. Journal of Neurophysiology, 79(5), 2653.
Seo, H. J., & Milanfar, P. (2009). Nonparametric bottom-up saliency detection by self-resemblance. In IEEE computer society conference on computer vision and pattern recognition workshops. CVPR Workshops 2009 (pp. 45–52).
Smith, R. (2007). An overview of the Tesseract OCR engine. In Proceedings of the ninth international conference on document analysis and recognition.
Tsagarakis, N., Metta, G., Sandini, G., Vernon, D., Beira, R., Becchi, F., et al. (2007). iCub: The design and realization of an open humanoid platform for cognitive and neuroscience research. Advanced Robotics, 21(10), 1151–1175.
Tweed, D. (1997). Three-dimensional model of the human eye–head saccadic system. Journal of Neurophysiology, 77(2), 654.
Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137–154.
Westheimer, G. (1957). Kinematics of the eye. Journal of the Optical Society of America, 47, 967–974.
Acknowledgments
The research is partially funded by Singapore National Research Foundation, Interactive Digital Media R&D Program, under research grant R-705-000-017-279, and the National Basic Research Program of China (973 Program) under Grant 2011CB707005.
Appendix
1.1 Computation of linear projections using corresponding points
Given one pair of corresponding points, the projective transformation map is
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \sim H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}. \quad (28)$$
To solve for the optimal linear projection, the linear projection (28) can be rewritten in matrix form (Hartley and Zisserman 2000),
where
Assuming that corresponding points in both images can be identified, and that the correspondence can be approximated by a linear map for a small camera movement, the projection matrix can be computed by
where \(V_{ij}=[\mathbf{v}_{1}^{x},\mathbf{v}_{1}^{y},\mathbf{v}_{2}^{x},\mathbf{v}_{2}^{y},\ldots ,\mathbf{v}_{k}^{x},\mathbf{v}_{k}^{y}]\) with \(k\) corresponding points between the two images.
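To make this computation concrete, the standard way to estimate a \(3\times 3\) projective transformation from \(k \ge 4\) point correspondences is the Direct Linear Transform of Hartley and Zisserman (2000). The sketch below (in NumPy; the function name `homography_dlt` and the exact stacking of the linear system are illustrative, not necessarily the matrix form used in the paper) solves the homogeneous least-squares problem via SVD:

```python
import numpy as np

def homography_dlt(pts_src, pts_dst):
    """Estimate the 3x3 projective transformation H mapping pts_src to
    pts_dst with the Direct Linear Transform (Hartley & Zisserman, 2000).

    Each correspondence (x, y) -> (u, v) contributes two rows to the
    homogeneous system A h = 0; the least-squares solution is the right
    singular vector of A associated with the smallest singular value.
    """
    A = []
    for (x, y), (u, v) in zip(pts_src, pts_dst):
        # Rows derived from u = (h1 . p) / (h3 . p), v = (h2 . p) / (h3 . p)
        # with p = (x, y, 1) and h1, h2, h3 the rows of H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(A, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the projective scale so H[2, 2] = 1
```

With exact correspondences the recovered matrix matches the true transformation up to scale; with noisy correspondences the SVD step returns the algebraic least-squares fit.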
1.2 Quaternion element projection
Property 1
(Quaternion element projection) Let a quaternion be \(x=o\mathsf{1}+a\mathsf{i}+b\mathsf{j}+c\mathsf{k}\) with \(\mathsf{1}\), \(\mathsf{i}\), \(\mathsf{j}\), and \(\mathsf{k}\) as the element basis. If \(a=0\), then \(x*\mathsf{i}=\mathsf{i}*x^{+}\), where \(x^{+}\) is the conjugate of \(x\). Analogous identities hold for \(\mathsf{j}\) when \(b=0\) and for \(\mathsf{k}\) when \(c=0\).
Proof
The proof is straightforward by expanding the left and right sides of \(x*\mathsf i =\mathsf i *x^{+}\) with the quaternion definition. \(\square \)
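For the reader's convenience, the expansion can be spelled out. With \(a=0\), write \(x=o+b\mathsf{j}+c\mathsf{k}\) and \(x^{+}=o-b\mathsf{j}-c\mathsf{k}\). Using the Hamilton products \(\mathsf{j}\mathsf{i}=-\mathsf{k}\), \(\mathsf{k}\mathsf{i}=\mathsf{j}\), \(\mathsf{i}\mathsf{j}=\mathsf{k}\), and \(\mathsf{i}\mathsf{k}=-\mathsf{j}\):

$$x*\mathsf{i} = o\mathsf{i} + b(\mathsf{j}\mathsf{i}) + c(\mathsf{k}\mathsf{i}) = o\mathsf{i} - b\mathsf{k} + c\mathsf{j},$$
$$\mathsf{i}*x^{+} = o\mathsf{i} - b(\mathsf{i}\mathsf{j}) - c(\mathsf{i}\mathsf{k}) = o\mathsf{i} - b\mathsf{k} + c\mathsf{j}.$$

The two sides agree, which establishes the property.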
Cite this article
He, H., Ge, S.S. & Zhang, Z. A saliency-driven robotic head with bio-inspired saccadic behaviors for social robotics. Auton Robot 36, 225–240 (2014). https://doi.org/10.1007/s10514-013-9346-z