Unsupervised Segmentation of Stereoscopic Video Objects: Constrained Segmentation Fusion Versus Greedy Active Contours

Journal of Signal Processing Systems

Abstract

In this paper, two efficient unsupervised video object segmentation approaches are proposed and thoroughly compared. Both methods exploit depth information estimated from stereoscopic pairs. Depth is a more efficient semantic descriptor of visual content than color, since an object is usually located on one depth plane. However, depth information fails to represent object contours accurately, mainly because of erroneous disparity estimation and occlusion. For this reason, the first approach projects color segments onto depth information in order to address the limitations of both depth and color segmentation: color segmentation usually over-partitions an object into several regions, while depth fails to represent object contours precisely. An occlusion-compensated disparity field is first estimated, and a depth map is then generated from it. Color segmentation is performed with a modified version of the Multiresolution Recursive Shortest Spanning Tree segmentation algorithm (M-RSST). In the first approach, “Constrained Fusion of Color Segments” (CFCS), a color segments map is created by applying the M-RSST to one of the stereoscopic channels, and video objects are extracted by fusing color segments according to depth similarity criteria. The second method also uses the depth segments map. In particular, an active contour is automatically initialized on the boundary of each depth segment, which usually differs from the boundary of the video object. Initialization is accomplished by a fitness function that considers different color areas and preserves the shapes of the depth segments’ boundaries. To accelerate convergence, each point of the active contour is associated with an “attractive edge” point, and a greedy approach drives the active contour to its final position. Several experiments on real-life stereoscopic sequences are performed, and extensive comparisons in terms of speed and accuracy indicate the promising performance of both methods.
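
The fusion idea behind the first approach can be illustrated with a minimal sketch. It assumes that both the color segments map and the depth segments map are already available as integer label maps of the same size, and it assumes a simple majority-overlap rule as the depth similarity criterion; the abstract does not specify the actual criterion, so this rule, the function name `fuse_color_segments`, and the toy label maps are purely illustrative.

```python
from collections import Counter, defaultdict


def fuse_color_segments(color_labels, depth_labels):
    """Relabel each color segment with the depth segment it overlaps most.

    Both arguments are 2D lists of integer labels with identical shape.
    The majority-overlap rule is an assumed stand-in for the paper's
    depth similarity criteria.
    """
    # overlap[c][d] = number of pixels shared by color segment c
    # and depth segment d
    overlap = defaultdict(Counter)
    for color_row, depth_row in zip(color_labels, depth_labels):
        for c, d in zip(color_row, depth_row):
            overlap[c][d] += 1

    # Project every color segment onto its dominant depth segment
    assignment = {c: counts.most_common(1)[0][0]
                  for c, counts in overlap.items()}

    # Pixels inherit the depth label chosen for their color segment, so
    # object boundaries follow color contours instead of depth contours
    return [[assignment[c] for c in row] for row in color_labels]


# Toy example: three color segments, two depth segments whose shared
# boundary is noisy (as after an imperfect disparity estimate)
color_labels = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [2, 2, 2, 1, 1, 1],
]
depth_labels = [
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
fused = fuse_color_segments(color_labels, depth_labels)
for row in fused:
    print(row)
```

In this toy case color segments 0 and 2 are merged into one object because both lie mostly on depth segment 0, while the noisy depth boundary is replaced by the straight boundary of the color segments.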

Acknowledgments

The authors wish to thank Dr. Chas Girdwood, the project manager of the ITC (Winchester), for providing the 3D video sequence “Eye to Eye”, which was produced within the framework of the ACTS MIRAGE project. The authors also thank Dr. Siegmund Pastoor of the HHI (Berlin) for providing the video sequences of the DISTIMA project.

Author information

Corresponding author

Correspondence to Klimis S. Ntalianis.

About this article

Cite this article

Ntalianis, K.S., Doulamis, A.D., Doulamis, N.D. et al. Unsupervised Segmentation of Stereoscopic Video Objects: Constrained Segmentation Fusion Versus Greedy Active Contours. J Sign Process Syst 81, 153–181 (2015). https://doi.org/10.1007/s11265-014-0921-0

  • DOI: https://doi.org/10.1007/s11265-014-0921-0
