Abstract
We develop a new video-based motion analysis algorithm to determine whether two persons interact when they meet. The interaction can be very general, such as shaking hands or exchanging objects. To make the motion analysis robust to image noise, we segment each video frame into a set of superpixels and then derive a motion feature and a motion pattern for each superpixel by averaging the optical flow within it. Specifically, we use the lattice cut to construct superpixels that are spatially and temporally consistent across frames. Based on the motion features and motion patterns of the superpixels, we develop an algorithm that divides an input video sequence into three consecutive periods: 1) the two persons walking toward each other, 2) the two persons meeting, and 3) the two persons walking away from each other. Experiments show that the proposed algorithm can accurately distinguish videos with human interactions from videos without them.
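The per-superpixel averaging step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `grid_labels` and `superpixel_mean_flow` are hypothetical, the flow field is a hand-made toy rather than a computed optical flow, and a plain regular grid stands in for the lattice-cut superpixels.

```python
from collections import defaultdict


def grid_labels(height, width, rows, cols):
    """Label each pixel with a cell index on a regular rows x cols grid.

    Assumption: this grid is only a stand-in for lattice-cut superpixels,
    which would follow image boundaries instead of fixed cell edges.
    """
    return [[(y * rows // height) * cols + (x * cols // width)
             for x in range(width)]
            for y in range(height)]


def superpixel_mean_flow(flow, labels):
    """Average per-pixel flow vectors (u, v) over each superpixel label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # label -> [sum_u, sum_v, count]
    for flow_row, label_row in zip(flow, labels):
        for (u, v), label in zip(flow_row, label_row):
            acc = sums[label]
            acc[0] += u
            acc[1] += v
            acc[2] += 1
    return {label: (su / n, sv / n) for label, (su, sv, n) in sums.items()}


# Toy frame: the left half moves right and the right half moves left,
# loosely mimicking two persons walking toward each other.
height, width = 4, 8
flow = [[(1.0, 0.0) if x < width // 2 else (-1.0, 0.0) for x in range(width)]
        for y in range(height)]
labels = grid_labels(height, width, rows=1, cols=2)
features = superpixel_mean_flow(flow, labels)
print(features)
```

Averaging over a region rather than using single-pixel flow is what gives the noise robustness the abstract refers to: isolated flow errors are diluted by the other pixels in the same superpixel.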
Additional information
Foundation item: Supported by the National Natural Science Foundation of China (61272453)
Biography: ZHENG Peng, male, Associate professor, Ph.D., research directions: computer vision, information hiding.
About this article
Cite this article
Zheng, P., Cao, Y. & Wang, S. Motion analysis for human interaction detection using optical flow on lattice superpixels. Wuhan Univ. J. Nat. Sci. 18, 109–116 (2013). https://doi.org/10.1007/s11859-013-0902-3
DOI: https://doi.org/10.1007/s11859-013-0902-3