Abstract
In this work, we present an approach for segmenting objects in videos of complex scenes. It propagates an initial object label, usually supplied by the user, through the entire video in a frame-sequential manner. The proposed method makes three contributions that make the propagation more robust and accurate than that of previous methods. First, a novel supervised motion estimation algorithm is applied between each pair of neighboring frames; the estimated motion is used to warp a predicted shape model, which helps separate regions of similar color around the object boundary. Second, unlike previous methods that use a fixed modeling range, we design a novel range-adaptive appearance model to handle occlusion. Last, we give a GraphCut-based framework that obtains the final object label by combining cues from both appearance and motion. In the experiments, the proposed approach is compared qualitatively and quantitatively with recent methods and achieves state-of-the-art results on multiple videos from benchmark data sets.
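The frame-sequential idea summarized above can be illustrated with a minimal toy sketch. It is not the authors' implementation: frames are 1-D scanlines of intensities, the motion is a given integer flow rather than the paper's supervised motion estimation, the appearance model is a pair of mean intensities rather than the range-adaptive model, and the binary labeling energy is minimized by dynamic programming on a chain, swapped in here for GraphCut to keep the example dependency-free. All function names and parameters (`warp_mask`, `unary_costs`, `minimize_chain`, `propagate`, `sigma`, `w_motion`, `smooth`) are hypothetical.

```python
def warp_mask(mask, flow):
    """Shift each foreground pixel by its integer flow to predict the next shape."""
    warped = [0] * len(mask)
    for x, (label, dx) in enumerate(zip(mask, flow)):
        tx = x + dx
        if label and 0 <= tx < len(mask):
            warped[tx] = 1
    return warped


def unary_costs(frame, shape_prior, fg_mean, bg_mean, sigma=0.2, w_motion=1.0):
    """Per-pixel (background cost, foreground cost) from appearance and motion."""
    costs = []
    for value, prior in zip(frame, shape_prior):
        # Appearance cue (assumed model): squared distance to each mean intensity.
        app_fg = (value - fg_mean) ** 2 / (2 * sigma ** 2)
        app_bg = (value - bg_mean) ** 2 / (2 * sigma ** 2)
        # Motion cue: disagreeing with the warped mask costs w_motion.
        costs.append((app_bg + w_motion * prior, app_fg + w_motion * (1 - prior)))
    return costs


def minimize_chain(costs, smooth=0.5):
    """Exact minimum of unary + Potts smoothness energy on a 1-D chain."""
    best = [list(costs[0])]
    back = []
    for i in range(1, len(costs)):
        row, ptr = [], []
        for label in (0, 1):
            cands = [best[-1][prev] + (smooth if prev != label else 0.0)
                     for prev in (0, 1)]
            p = 0 if cands[0] <= cands[1] else 1
            row.append(cands[p] + costs[i][label])
            ptr.append(p)
        best.append(row)
        back.append(ptr)
    # Backtrack from the cheaper final label.
    label = 0 if best[-1][0] <= best[-1][1] else 1
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    return labels[::-1]


def propagate(frames, init_mask, flows, smooth=0.5):
    """Propagate init_mask frame by frame; flows[t] maps frame t to frame t+1."""
    masks = [list(init_mask)]
    for t in range(1, len(frames)):
        prev = masks[-1]
        prior = warp_mask(prev, flows[t - 1])
        fg = [v for v, m in zip(frames[t - 1], prev) if m]
        bg = [v for v, m in zip(frames[t - 1], prev) if not m]
        if not fg or not bg:
            # No appearance model can be fitted; fall back to the warped mask.
            masks.append(prior)
            continue
        costs = unary_costs(frames[t], prior, sum(fg) / len(fg), sum(bg) / len(bg))
        masks.append(minimize_chain(costs, smooth))
    return masks


# Usage: a bright two-pixel object moves one pixel to the right.
frames = [[0.1, 0.1, 0.9, 0.9, 0.1, 0.1],
          [0.1, 0.1, 0.1, 0.9, 0.9, 0.1]]
flows = [[0, 0, 1, 1, 0, 0]]
print(propagate(frames, [0, 0, 1, 1, 0, 0], flows)[1])  # [0, 0, 0, 1, 1, 0]
```

In this toy setting the appearance term alone is strong enough to recover the object even if the flow is wrong (for example, all zeros). On real images, where colors near the boundary are ambiguous, the warped shape prior and the pairwise smoothness term carry more of the decision.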
Acknowledgments
We thank the anonymous reviewers for their valuable comments. This work is supported by the National Natural Science Foundation of China (No. 61602252), the Natural Science Foundation of Jiangsu Province of China (Nos. BK20160964, BK20160902, BK20160967), a Project through the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions, and the Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology (NUIST) (No. 2243141601013).
Cite this article
Chen, Y., Hao, C., Wu, W. et al. Efficient frame-sequential label propagation for video object segmentation. Multimed Tools Appl 77, 6117–6133 (2018). https://doi.org/10.1007/s11042-017-4520-5
DOI: https://doi.org/10.1007/s11042-017-4520-5