Graph-Based Spatio-temporal Region Extraction

  • Eric Galmar
  • Benoit Huet
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4141)


Motion-based segmentation is traditionally used for video object extraction. Objects are detected as groups of significant moving regions and tracked through the sequence. However, this approach has difficulty with video shots that contain both static and dynamic moments, and detection is prone to fail in the absence of motion. In addition, retrieval of static content is needed for high-level descriptions.

In this paper, we present a new graph-based approach to extract spatio-temporal regions. The method operates iteratively on pairs of frames through a hierarchical merging process. Spatial merging is first performed to build spatial atomic regions, based on color similarities. Then, we propose a new matching procedure for the temporal grouping of both static and moving regions. A feature point tracking stage creates dynamic temporal edges between frames and groups strongly connected regions. Space-time constraints are then applied to merge the main static regions, and a region graph matching stage completes the procedure to reach high temporal coherence. Finally, we show the potential of our method for the segmentation of real moving video sequences.
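The first two stages described above (color-based spatial merging, then temporal edges derived from tracked feature points) can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed color threshold, the vote count `min_votes`, and the function names `spatial_merge` and `temporal_edges` are assumptions made for illustration, and the point tracks are assumed to come from a separate tracker.

```python
from collections import Counter
from itertools import product


class DisjointSet:
    """Union-find over integer node ids."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def color_distance(c1, c2):
    """Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5


def spatial_merge(frame, threshold):
    """Merge 4-connected pixels whose color distance is below `threshold`.

    `frame` is a list of rows of RGB tuples; `threshold` is a hypothetical
    fixed value. Returns a grid of region labels of the same shape.
    """
    h, w = len(frame), len(frame[0])
    ds = DisjointSet(h * w)
    for y, x in product(range(h), range(w)):
        here = y * w + x
        if x + 1 < w and color_distance(frame[y][x], frame[y][x + 1]) < threshold:
            ds.union(here, here + 1)  # merge with right neighbour
        if y + 1 < h and color_distance(frame[y][x], frame[y + 1][x]) < threshold:
            ds.union(here, here + w)  # merge with bottom neighbour
    return [[ds.find(y * w + x) for x in range(w)] for y in range(h)]


def temporal_edges(labels_a, labels_b, tracks, min_votes):
    """Link regions of two frames that share at least `min_votes` tracked points.

    `tracks` is a list of ((y_a, x_a), (y_b, x_b)) point correspondences,
    assumed to be produced by an external feature point tracker.
    """
    votes = Counter()
    for (ya, xa), (yb, xb) in tracks:
        votes[(labels_a[ya][xa], labels_b[yb][xb])] += 1
    return {pair: n for pair, n in votes.items() if n >= min_votes}
```

A region pair that collects many tracked points is a candidate for temporal grouping; pairs with few or no points, typically static or textureless regions, would need the later space-time constraints and graph matching stages that this sketch does not cover.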


Keywords: Feature Point, Temporal Grouping, Video Object, Video Shot, Frame Pair
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Eric Galmar (1)
  • Benoit Huet (1)
  1. Département multimédia, Institut Eurécom, Sophia-Antipolis, France
