DIBR-Based Conversion from Monoscopic to Stereoscopic and Multi-View Video

Zhang, Liang; Vázquez, Carlos; Huchet, Grégory; Tam, Wa James

doi:10.1007/978-1-4419-9964-1_4

Abstract

This chapter provides a tutorial on 2D-to-3D video conversion methods that exploit depth-image-based rendering (DIBR) techniques. It is intended both for university students who are new to this area of research and for researchers and engineers who want to deepen their knowledge of video conversion techniques. It reviews the basic principles and the various methods for converting 2D video to stereoscopic 3D, including depth extraction strategies and DIBR-based view synthesis approaches. Conversion artifacts and the evaluation of conversion quality are discussed, and the advantages and disadvantages of the different methods are compared. Finally, practical implementations of the conversion from monoscopic to stereoscopic and multi-view video are outlined.

Notes

1. N.B. The black bars that constitute the floating window are shown much wider than they actually are to make them more visible in the figure.

Acknowledgment

We would like to express our sincere thanks to Mr. Robert Klepko for constructive suggestions during the preparation of this manuscript. Thanks are also due to NHK for providing the “Balloons,” “Tulips,” and “Redleaf” sequences.

Author information

Authors and Affiliations

Communications Research Centre Canada, 3701 Carling Ave, Ottawa, ON, K2H 8S2, Canada
Liang Zhang, Carlos Vázquez, Grégory Huchet & Wa James Tam

Corresponding author

Correspondence to Liang Zhang.

Editor information

Editors and Affiliations

School of Electrical & Electronic, Nanyang Technological University, Nanyang Avenue 50, Singapore, 639798, Singapore
Ce Zhu
Department of Information Science & Electronic Engineering, Zheda Road 38, Hangzhou, 310027, People's Republic of China
Yin Zhao
Department of Information Science & Electronic Engineering, Zhejiang University, Zheda Road 38, Hangzhou, 310027, People's Republic of China
Lu Yu
Graduate School of Engineering, Department of Electrical Engineering and, Nagoya University, Nagoya, 464-8603, Japan
Masayuki Tanimoto

Cite this chapter

Zhang, L., Vázquez, C., Huchet, G., Tam, W.J. (2013). DIBR-Based Conversion from Monoscopic to Stereoscopic and Multi-View Video. In: Zhu, C., Zhao, Y., Yu, L., Tanimoto, M. (eds) 3D-TV System with Depth-Image-Based Rendering. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9964-1_4

DOI: https://doi.org/10.1007/978-1-4419-9964-1_4
Published: 15 August 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9963-4
Online ISBN: 978-1-4419-9964-1
eBook Packages: Engineering, Engineering (R0)
