
DIBR-Based Conversion from Monoscopic to Stereoscopic and Multi-View Video

Chapter in: 3D-TV System with Depth-Image-Based Rendering

Abstract

This chapter provides a tutorial on 2D-to-3D video conversion methods that exploit depth-image-based rendering (DIBR) techniques. It is intended not only for university students who are new to this area of research, but also for researchers and engineers who want to deepen their knowledge of video conversion techniques. The basic principles and the various methods for converting 2D video to stereoscopic 3D, including depth extraction strategies and DIBR-based view synthesis approaches, are reviewed. Conversion artifacts and the evaluation of conversion quality are discussed, and the advantages and disadvantages of the different methods are elaborated. Furthermore, practical implementations of the conversion from monoscopic to stereoscopic and multi-view video are outlined.
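
To make the DIBR-based view synthesis idea named above concrete, the sketch below shows one minimal way a stereo pair can be rendered from a single frame plus a per-pixel depth map: depth is mapped to a signed horizontal parallax around a chosen convergence plane, pixels are forward-warped into two virtual views (near pixels overwriting far ones), and the resulting disocclusion holes are filled by crude horizontal propagation. This is an illustrative sketch only, not the specific algorithm described in the chapter; the function name, the normalized depth convention (1.0 = nearest), and the pixel-shift model are assumptions made for the example.

```python
import numpy as np

def dibr_stereo_pair(image, depth, max_disparity=16, convergence=0.5):
    """Illustrative DIBR sketch: synthesize a left/right pair from one view.

    image         : H x W x 3 array, the original monoscopic frame
    depth         : H x W array in [0, 1], 1.0 = nearest (assumed convention)
    max_disparity : total parallax budget in pixels (hypothetical parameter)
    convergence   : depth value mapped to the screen plane (zero parallax)
    """
    h, w = depth.shape
    # Signed parallax: points nearer than the convergence plane pop out of the
    # screen (positive shift), farther points recede (negative shift).
    parallax = max_disparity * (depth - convergence)

    left = np.zeros_like(image)
    right = np.zeros_like(image)
    filled_l = np.zeros((h, w), dtype=bool)
    filled_r = np.zeros((h, w), dtype=bool)

    # Forward-warp each row far-to-near so that nearer pixels win visibility.
    order = np.argsort(depth, axis=1)           # per-row column order, far first
    for y in range(h):
        for x in order[y]:
            s = int(round(parallax[y, x] / 2))  # split the shift symmetrically
            xl, xr = x + s, x - s
            if 0 <= xl < w:
                left[y, xl] = image[y, x]
                filled_l[y, xl] = True
            if 0 <= xr < w:
                right[y, xr] = image[y, x]
                filled_r[y, xr] = True

    # Crude disocclusion filling: copy the nearest valid pixel from the left.
    # (Leaves a row start black if it was never written; real systems inpaint.)
    for view, mask in ((left, filled_l), (right, filled_r)):
        for y in range(h):
            for x in range(1, w):
                if not mask[y, x]:
                    view[y, x] = view[y, x - 1]
    return left, right
```

Real conversion systems refine each of these steps, for instance by pre-smoothing the depth map before warping and by filling disocclusions with inpainting rather than naive propagation, which is part of what the chapter surveys.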


Notes

  1. N.B. The black bars that constitute the floating window are shown much wider than they actually are, to make them more visible in the figure.


Acknowledgment

We would like to express our sincere thanks to Mr. Robert Klepko for constructive suggestions during the preparation of this manuscript. Thanks are also due to NHK for providing the “Balloons,” “Tulips,” and “Redleaf” sequences.

Author information


Correspondence to Liang Zhang.


Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Zhang, L., Vázquez, C., Huchet, G., Tam, W.J. (2013). DIBR-Based Conversion from Monoscopic to Stereoscopic and Multi-View Video. In: Zhu, C., Zhao, Y., Yu, L., Tanimoto, M. (eds) 3D-TV System with Depth-Image-Based Rendering. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9964-1_4


  • DOI: https://doi.org/10.1007/978-1-4419-9964-1_4


  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4419-9963-4

  • Online ISBN: 978-1-4419-9964-1

  • eBook Packages: Engineering, Engineering (R0)
