3D Research, 8:29

A Depth Map Generation Algorithm Based on Saliency Detection for 2D to 3D Conversion

  • Yizhong Yang
  • Xionglou Hu
  • Nengju Wu
  • Pengfei Wang
  • Dong Xu
  • Shen Rong
3DR Express


In recent years, 3D movies have attracted increasing attention because of their immersive stereoscopic experience. However, the supply of 3D movies is still insufficient, so estimating depth information from a video for 2D to 3D conversion is increasingly important. In this paper, we present a novel algorithm that estimates depth information from a video via scene classification. To obtain perceptually reliable depth information for viewers, the algorithm first classifies the images into three categories: landscape type, close-up type, and linear perspective type. For a landscape type image, a specific algorithm divides the image into many blocks and assigns depth values according to a relative height cue in the image. For a close-up type image, a saliency-based method enhances the foreground, and the result is combined with the global depth gradient to generate the final depth map. For a linear perspective type image, vanishing line detection is used to calculate the vanishing point, which is regarded as the farthest point from the viewer and is assigned the deepest depth value; the rest of the image is assigned depth values according to the distance between each point and the vanishing point. Finally, after bilateral filtering, depth image-based rendering is employed to generate stereoscopic virtual views. Experiments show that the proposed algorithm achieves realistic 3D effects and yields satisfactory results, with the perception scores of anaglyph images lying between 6.8 and 7.8.
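The linear perspective step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `depth_from_vanishing_point`, the linear mapping from distance to depth, and the 8-bit convention (0 = farthest, 255 = nearest) are all assumptions made for illustration, and the vanishing point is taken as a given input rather than detected from vanishing lines.

```python
import math


def depth_from_vanishing_point(width, height, vp_x, vp_y):
    """Return a height x width grid of 8-bit depth values.

    Assumed convention (not stated in the abstract): 0 = farthest,
    255 = nearest. The vanishing point gets the farthest value, and
    depth grows linearly with Euclidean distance from it.
    """
    # Within the image rectangle, the largest distance from any point
    # is always reached at one of the four corners.
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    max_dist = max(math.hypot(cx - vp_x, cy - vp_y) for cx, cy in corners)
    if max_dist == 0:  # degenerate case: a 1x1 image
        return [[0] * width for _ in range(height)]

    depth = []
    for y in range(height):
        row = []
        for x in range(width):
            dist = math.hypot(x - vp_x, y - vp_y)
            row.append(round(255 * dist / max_dist))
        depth.append(row)
    return depth


# Hypothetical usage: a 5x5 image whose vanishing point is the centre pixel.
depth = depth_from_vanishing_point(5, 5, 2, 2)
```

In this sketch the centre pixel receives the farthest value and the corners the nearest. A real pipeline would first estimate the vanishing point and would smooth the resulting map, for example with a bilateral filter, before rendering.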



Keywords: Depth map, 2D to 3D, DIBR, Saliency



This work was supported by the National Natural Science Foundation of China under Grants 61401137 and 61404043, and the Fundamental Research Funds for the Central Universities under Grant No. J2014HGXJ0083.

Compliance with ethical standards

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.



Copyright information

© 3D Research Center, Kwangwoon University and Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. School of Electronic Science and Applied Physics, Hefei University of Technology, Hefei, China
