Abstract
In video analysis, automatically detecting the actions performed in a recorded video is an important scientific and industrial challenge. This paper presents a new method for approximating the boundaries of actions performed by a person interacting with their environment (for example, moving objects). The method relies on Codebook quantization to track the coarse evolution of each pixel and then decides automatically whether that evolution corresponds to an action. Statistics are then computed over the whole frame to estimate the start and the end of each action. Under our proposed evaluation protocol, the method produces interesting results on both real and simulated videos. This statistics-based protocol is discussed at the end of the paper. The evaluation suggests that the method is a solid basis for localizing the exact boundaries of actions or, within the framework of this research activity, for associating prescriptive text with visual content.
Notes
A pixel’s RGB value (R, G, B) matches the codeword C if and only if the point (R, G, B) in RGB space lies inside the cylinder associated with C.
Synchronization of a video with the text that describes its content
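The matching rule in the first note can be illustrated with a short sketch. The note does not define the cylinder, so the geometry here is an assumption: the axis runs from the RGB origin through the codeword's color, a hypothetical `radius` bounds the distance to that axis, and hypothetical `low` and `high` values bound the position along it. The function name and parameters are illustrative only and are not taken from the paper.

```python
import math


def matches_codeword(pixel, codeword, radius, low, high):
    """Return True if the RGB point lies inside the codeword's cylinder.

    Assumed geometry (not specified in the note): the cylinder axis runs
    from the RGB origin through `codeword`, `radius` bounds the distance
    to that axis, and `low`/`high` bound the position along the axis.
    """
    norm_c = math.sqrt(sum(c * c for c in codeword))
    if norm_c == 0:
        return False  # a black codeword defines no axis direction

    # Signed length of the projection of the pixel onto the axis.
    along = sum(p * c for p, c in zip(pixel, codeword)) / norm_c

    # Squared distance from the pixel to the axis (Pythagoras);
    # clamped at zero to absorb floating-point rounding.
    sq_norm_p = sum(p * p for p in pixel)
    dist_sq = max(sq_norm_p - along * along, 0.0)

    return low <= along <= high and dist_sq <= radius * radius


if __name__ == "__main__":
    grey = (100, 100, 100)
    # A slightly brighter grey stays on the axis and within the bounds.
    print(matches_codeword((110, 110, 110), grey, 10.0, 150.0, 200.0))
    # A reddish pixel is far from the grey axis.
    print(matches_codeword((200, 50, 50), grey, 10.0, 150.0, 200.0))
```

Under these assumptions, a change in brightness moves a pixel along the axis, while a change in color moves it away from the axis, so the two tolerances can be set separately.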
Cite this article
Wehbe, H., Haidar, B. & Joly, P. Action boundaries detection in a video. Multimed Tools Appl 75, 8239–8266 (2016). https://doi.org/10.1007/s11042-015-2748-5
DOI: https://doi.org/10.1007/s11042-015-2748-5