Multiple Hypothesis Video Segmentation from Superpixel Flows

  • Amelio Vazquez-Reina
  • Shai Avidan
  • Hanspeter Pfister
  • Eric Miller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6315)


Multiple Hypothesis Video Segmentation (MHVS) is a method for the unsupervised photometric segmentation of video sequences. MHVS segments arbitrarily long video streams by considering only a few frames at a time, and handles the automatic creation, continuation and termination of labels with no user initialization or supervision. The process begins by generating several pre-segmentations per frame and enumerating multiple possible trajectories of pixel regions within a short time window. After assigning each trajectory a score, we let the trajectories compete with each other to segment the sequence. We determine the solution of this segmentation problem as the MAP labeling of a higher-order random field. This framework allows MHVS to achieve spatial and temporal long-range label consistency while operating in an on-line manner. We test MHVS on several videos of natural scenes with arbitrary camera and object motion.
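The pipeline described above (several pre-segmentations per frame, enumeration of region trajectories in a short window, scoring, and competition between trajectories) can be illustrated with a small sketch. This is not the authors' implementation. Regions are assumed to be plain sets of pixel coordinates, trajectory scores are assumed to be products of Jaccard overlaps between consecutive regions, and a greedy "best trajectory wins each pixel" rule stands in for the MAP labeling of the higher-order random field. The function names and the `min_overlap` threshold are hypothetical, and trajectories that would terminate inside the window are simply dropped here.

```python
def jaccard(a, b):
    """Overlap between two regions given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def enumerate_trajectories(window, min_overlap=0.3):
    """Enumerate region trajectories across a short window of frames.

    window[t] is the list of candidate regions for frame t, pooled from
    several pre-segmentations. Returns a list of (score, regions) pairs,
    where regions holds one region per frame in the window.
    """
    trajectories = [(1.0, [region]) for region in window[0]]
    for candidates in window[1:]:
        extended = []
        for score, regions in trajectories:
            for region in candidates:
                overlap = jaccard(regions[-1], region)
                # Assumed rule: only sufficiently overlapping regions
                # may continue a trajectory.
                if overlap >= min_overlap:
                    extended.append((score * overlap, regions + [region]))
        trajectories = extended
    return trajectories


def greedy_labeling(trajectories):
    """Let trajectories compete: each (frame, pixel) takes the label of the
    highest-scoring trajectory that covers it. A greedy stand-in for MAP
    inference, used here only for illustration."""
    labels = {}
    ranked = sorted(enumerate(trajectories), key=lambda item: -item[1][0])
    for label, (_score, regions) in ranked:
        for t, region in enumerate(regions):
            for pixel in region:
                labels.setdefault((t, pixel), label)
    return labels


# Toy example: a 1x4 strip of pixels over two frames.
left = frozenset({(0, 0), (0, 1)})
right = frozenset({(0, 2), (0, 3)})
whole = left | right                      # from a coarser pre-segmentation
left_next = frozenset({(0, 0), (0, 1), (0, 2)})
right_next = frozenset({(0, 3)})

window = [[left, right, whole], [left_next, right_next]]
trajectories = enumerate_trajectories(window)
labels = greedy_labeling(trajectories)
```

In this toy window the coarse region and the growing left region form the strongest trajectory, so its label is carried from the first frame into the second, while the remaining pixel keeps a separate label. A real system would replace the greedy step with inference over the random field and would slide the window along the stream.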


Keywords: Video Sequence, Video Stream, Processing Window, Video Segmentation, Label Disagreement



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Amelio Vazquez-Reina (1, 2)
  • Shai Avidan (3)
  • Hanspeter Pfister (1)
  • Eric Miller (2)

  1. School of Engineering and Applied Sciences, Harvard University, USA
  2. Department of Computer Science, Tufts University, USA
  3. Adobe Systems Inc., USA
