Estimating Gaze Direction from Low-Resolution Faces in Video
Abstract
In this paper we describe a new method for automatically estimating where a person is looking in images in which the head is typically 20 to 40 pixels high. We use a feature vector based on skin detection to estimate the orientation of the head, which is discretised into eight orientations relative to the camera. A fast sampling method returns a distribution over previously seen head poses. The overall body pose relative to the camera frame is approximated using the velocity of the body, obtained via automatically initiated colour-based tracking in the image sequence. We show that, by combining direction and head-pose information, gaze is determined more robustly than by using either feature alone. We demonstrate this technique on surveillance and sports footage.
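To make the combination step concrete, the following is a minimal sketch of one way a distribution over eight head orientations could be fused with a body direction derived from velocity. It is not the method of the paper: the function names, the 45-degree bin layout, and the uniform prior that limits the head to within two bins of the body direction are all assumptions made for illustration.

```python
import math

N_BINS = 8  # eight discretised orientations, assumed 45 degrees apart


def direction_bin(vx, vy):
    """Map an image-plane velocity (vx, vy) to the nearest orientation bin."""
    angle = math.atan2(vy, vx) % (2 * math.pi)
    width = 2 * math.pi / N_BINS
    # Shift by half a bin so that each bin is centred on its nominal angle.
    return int((angle + width / 2) // width) % N_BINS


def body_prior(body_bin, max_offset=2):
    """Hypothetical prior over head bins given the body direction.

    Assumes the head stays within max_offset bins of the body direction,
    with uniform weight inside that range (an illustrative choice only).
    """
    prior = []
    for b in range(N_BINS):
        diff = abs(b - body_bin)
        offset = min(diff, N_BINS - diff)  # circular distance between bins
        prior.append(1.0 if offset <= max_offset else 0.0)
    total = sum(prior)
    return [p / total for p in prior]


def fuse(head_dist, body_bin):
    """Combine a head-pose distribution with the body-direction prior."""
    prior = body_prior(body_bin)
    joint = [h * p for h, p in zip(head_dist, prior)]
    total = sum(joint)
    if total == 0.0:
        # The two cues disagree completely; fall back to the head-pose cue.
        return list(head_dist)
    return [j / total for j in joint]


# Example: the head-pose cue is ambiguous between bins 0 and 4,
# while the body is moving along the positive x axis (bin 0).
head_dist = [0.4, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.2]
fused = fuse(head_dist, direction_bin(1.0, 0.0))
```

In this example the body direction removes the probability mass on bin 4, which faces away from the direction of travel, so the ambiguity in the head-pose cue alone is resolved in favour of bin 0.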
Keywords
Skin Detection, British Machine Vision, Skin Pixel, Body Direction, Head Pose