
Salient object detection and classification for stereoscopic images

Multimedia Tools and Applications

Abstract

Stereoscopic images have become increasingly prevalent following rapid advances in 3D capture and display techniques. However, there has been little research on visual content analysis for stereoscopic images. In this paper, we address the challenging problem of object detection and classification for stereoscopic images. We propose an iterative method in which salient object detection and object classification mutually boost each other. The method consists of two steps. In the first step, a 3D saliency detection method, which combines the contrast and occlusion cues contained in each stereoscopic image pair with the discriminative features provided by the SVM classifier, localizes the object of interest in the stereoscopic images. In the second step, the bag-of-words features of the foreground and background are pooled using the localization information and then used to train the SVM classifier. Each step benefits from the gradual improvement of the other, in both the training and the testing process. To evaluate our approach, we set up a six-class dataset of stereoscopic images of real objects viewed under general lighting conditions, poses and viewpoints. Experimental results on this dataset, for both object localization and object classification, demonstrate the effectiveness of the method.
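The pooling in the second step can be illustrated with a minimal sketch. It assumes that each local descriptor has already been quantized to a visual-word index and that the first step has produced a binary foreground mask. The function name `pool_fg_bg_histograms`, its parameters, and the L1 normalization are assumptions made for illustration; the abstract does not specify the actual pooling scheme.

```python
from typing import List, Sequence, Tuple


def pool_fg_bg_histograms(
    words: Sequence[int],
    positions: Sequence[Tuple[int, int]],
    mask: Sequence[Sequence[bool]],
    vocab_size: int,
) -> List[float]:
    """Concatenate L1-normalized foreground and background word histograms.

    Hypothetical helper: `words[i]` is the visual-word index of the i-th
    descriptor, `positions[i]` its (row, col) pixel location, and `mask`
    the binary localization result (True = foreground).
    """
    fg = [0.0] * vocab_size
    bg = [0.0] * vocab_size
    for word, (row, col) in zip(words, positions):
        # Route each descriptor to the histogram of the region it falls in.
        target = fg if mask[row][col] else bg
        target[word] += 1.0

    def normalize(hist: List[float]) -> List[float]:
        # L1 normalization (an assumption); an empty region stays all zeros.
        total = sum(hist)
        return [v / total for v in hist] if total > 0 else hist

    return normalize(fg) + normalize(bg)


# Toy example: a 2x3 image whose left two columns are marked as foreground.
mask = [[True, True, False],
        [True, True, False]]
words = [0, 1, 1, 2]
positions = [(0, 0), (0, 1), (1, 1), (1, 2)]
feature = pool_fg_bg_histograms(words, positions, mask, vocab_size=3)
```

The resulting vector has twice the vocabulary size and could serve as the input to a classifier such as an SVM. In an iterative scheme like the one described above, the mask would be re-estimated after each classifier update and the features pooled again.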




Notes

  1. http://photos.3dvisionlive.com/

  2. http://www.vlfeat.org/


Acknowledgments

We would like to thank the Flickr users and the NVIDIA 3D Vision Live contributors for sharing their photos. We would also like to thank Yuzhen Niu, Yujie Geng, Xueqing Li and Feng Liu for providing the website links.

Author information

Corresponding author

Correspondence to Yang Cao.

About this article

Cite this article

Kang, K., Cao, Y., Zhang, J. et al. Salient object detection and classification for stereoscopic images. Multimed Tools Appl 75, 1443–1457 (2016). https://doi.org/10.1007/s11042-014-2142-8

  • DOI: https://doi.org/10.1007/s11042-014-2142-8
