Multimedia Tools and Applications, Volume 75, Issue 3, pp 1443–1457

Salient object detection and classification for stereoscopic images

  • Kai Kang
  • Yang Cao
  • Jing Zhang
  • Zengfu Wang
Article

Abstract

Stereoscopic images have become increasingly prevalent following rapid advances in 3D capture and display techniques. However, there has been little research on visual content analysis for stereoscopic images. In this paper, we address the challenging problem of object detection and classification for stereoscopic images. We propose an iterative method in which salient object detection and object classification mutually boost each other. The method consists of two steps. In the first step, a 3D saliency detection method localizes the object of interest in the stereoscopic images; it combines the contrastive and occlusion cues contained in each stereoscopic image pair with the discriminative features provided by the SVM classifier. In the second step, the bag-of-words features of the foreground and background are pooled using the localization information and then used to train the SVM classifier. Each step benefits from the gradual improvement of the other, in both the training and the testing process. To evaluate the performance of our approach, we set up a six-class dataset of stereoscopic images of real objects viewed under general lighting conditions, poses, and viewpoints. Our experimental results on this dataset, for both object localization and object classification, demonstrate the effectiveness of the method.
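
The second step described above, pooling bag-of-words features separately over foreground and background, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pool_fg_bg`, the fixed `threshold` used to split foreground from background, and the L1 normalization are assumptions made for illustration, and the saliency values are taken as given rather than computed from a stereo pair.

```python
import numpy as np

def pool_fg_bg(word_ids, saliency, n_words, threshold=0.5):
    """Concatenate foreground and background visual-word histograms.

    word_ids  : visual-word index of each local descriptor
    saliency  : saliency value in [0, 1] at each descriptor location
    n_words   : size of the visual vocabulary
    threshold : hypothetical cut-off between foreground and background
                (an assumption, not taken from the paper)
    """
    word_ids = np.asarray(word_ids, dtype=int)
    fg = np.asarray(saliency, dtype=float) >= threshold

    hists = []
    for mask in (fg, ~fg):
        # Count how often each visual word occurs in this region.
        h = np.bincount(word_ids[mask], minlength=n_words).astype(float)
        total = h.sum()
        # L1-normalize; an empty region stays an all-zero histogram.
        hists.append(h / total if total > 0 else h)

    # Feature layout: [foreground histogram | background histogram].
    return np.concatenate(hists)

# Toy example: six descriptors, a vocabulary of three visual words.
feature = pool_fg_bg(
    word_ids=[0, 1, 1, 2, 2, 2],
    saliency=[0.9, 0.8, 0.1, 0.2, 0.7, 0.3],
    n_words=3,
)
```

In an iterative scheme of the kind the abstract describes, a feature vector like this would be fed to the classifier, whose output would in turn refine the saliency estimate for the next round; that feedback loop is outside the scope of this sketch.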

Keywords

Stereoscopic saliency, Object detection, Object classification

Notes

Acknowledgments

We would like to thank the Flickr users and the NVIDIA 3D Vision Live sharers for sharing their photos. We would also like to thank Yuzhen Niu, Yujie Geng, Xueqing Li, and Feng Liu for providing the website links.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Automation, University of Science and Technology of China, Hefei, People’s Republic of China
