Abstract
A novel framework for unsupervised face tracking and recognition is built on a Detection-Tracking-Refinement-Recognition (DTRR) approach. The framework uses a hybrid face detector for real-time face tracking that is robust to occlusions, facial expressions, and posture changes. After posture correction and face alignment, each tracked face is described by the Local Ternary Pattern (LTP) operator. The faces are then clustered into groups according to the distance between their feature vectors. In the next step, these groups, each containing a series of faces, can be merged further using the Scale-Invariant Feature Transform (SIFT) operator. Because SIFT is computationally expensive, the refinement process is multithreaded. After refinement, related faces are grouped together, which is important for face recognition in videos. The framework is validated on several videos collected under unconstrained conditions (8 min each) and on the Honda/UCSD database. The experiments demonstrate that the framework can robustly track faces and automatically group the series of faces belonging to a single person in an unlabeled video.
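The LTP description and distance-based grouping steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distance measure or the clustering rule, so the chi-square distance, the greedy first-match grouping, the threshold values, and all function names below are assumptions made for illustration only.

```python
from typing import List, Sequence

# 8-neighbour offsets (row, col), clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def ltp_histogram(img: Sequence[Sequence[int]], t: int = 5) -> List[float]:
    """Return a 512-bin LTP histogram for a grayscale image.

    Each neighbour p of a centre pixel c gets a ternary code:
    +1 if p >= c + t, -1 if p <= c - t, otherwise 0. The code is split
    into an "upper" (+1) and a "lower" (-1) binary pattern, and both
    patterns are histogrammed. Each 256-bin half sums to 1.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * 512
    count = 0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            centre = img[r][c]
            upper = lower = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                p = img[r + dr][c + dc]
                if p >= centre + t:
                    upper |= 1 << bit
                elif p <= centre - t:
                    lower |= 1 << bit
            hist[upper] += 1
            hist[256 + lower] += 1
            count += 1
    if count:
        hist = [v / count for v in hist]
    return hist


def chi_square(a: Sequence[float], b: Sequence[float]) -> float:
    """Chi-square distance between two histograms (assumed measure)."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


def cluster_faces(features: Sequence[Sequence[float]],
                  max_dist: float) -> List[List[int]]:
    """Greedy grouping (assumed rule): a face joins the first group whose
    representative (its first member) lies within max_dist, else it
    starts a new group. Returns groups of indices into `features`."""
    clusters: List[List[int]] = []
    for i, f in enumerate(features):
        for cl in clusters:
            if chi_square(f, features[cl[0]]) <= max_dist:
                cl.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

For example, calling `cluster_faces([ltp_histogram(face) for face in faces], max_dist=0.5)` on a list of aligned grayscale face crops (`faces` and the threshold are hypothetical) returns index groups that a later refinement stage could try to merge.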
Acknowledgment
This work is funded by the National Basic Research Program of China (No. 2010CB327902), the National Natural Science Foundation of China (No. 60873158, No. 61005016, No. 61061130560) and the Fundamental Research Funds for the Central Universities.
Copyright information
© 2013 Springer Science+Business Media, LLC
About this paper
Cite this paper
Wang, H., Wang, Y., Huang, J., Wang, F., Zhang, Z. (2013). An Unsupervised Real-Time Tracking and Recognition Framework in Videos. In: The Era of Interactive Media. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3501-3_37
DOI: https://doi.org/10.1007/978-1-4614-3501-3_37
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-3500-6
Online ISBN: 978-1-4614-3501-3
eBook Packages: Computer Science, Computer Science (R0)