LDV Generation from Multi-View Hybrid Image and Depth Video



The technology around 3D-TV is evolving rapidly. Different stereo displays are already available, and auto-stereoscopic displays promise glasses-free 3D in the near future. All commercially available content today is purely image-based. Depth-based content, on the other hand, provides better flexibility and scalability with regard to future 3D-TV requirements and is considered the better long-term alternative for 3D-TV production. However, depth estimation is a difficult process, which threatens to become the main bottleneck in the whole production chain. Sophisticated depth-based formats such as LDV (layered depth video) or MVD (multi-view video plus depth) already exist, but no reliable production techniques for these formats are available today. Capturing is usually done with camera systems consisting of multiple color cameras. Such systems, however, rely on stereo matching for depth estimation, which often fails in the presence of repetitive patterns or textureless regions. Newer, hybrid systems offer a better alternative: they incorporate active sensors into the depth estimation process and make it possible to overcome the difficulties of standard multi-camera systems. This chapter presents a complete production chain for the 2-layer LDV format, based on a hybrid camera system of five color cameras and two time-of-flight (ToF) cameras. It includes real-time preview capabilities for quality control during shooting and post-production algorithms that generate high-quality LDV content consisting of a foreground layer and an occlusion layer.
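The 2-layer LDV idea described above can be illustrated with a minimal sketch: a frame holds a full foreground layer (color plus depth) and an occlusion layer storing content hidden behind foreground objects; a virtual view is then synthesized by depth image-based rendering (DIBR), warping the foreground and filling the resulting disocclusion holes from the occlusion layer. All names and the naive per-pixel warp below are illustrative assumptions, not the chapter's actual data structures or algorithms.

```python
import numpy as np

class LDVFrame:
    """Minimal 2-layer LDV frame (illustrative sketch): a full foreground
    layer plus an occlusion layer with content hidden behind objects."""
    def __init__(self, fg_color, fg_depth, occ_color, occ_depth):
        self.fg_color = fg_color    # H x W x 3 foreground color
        self.fg_depth = fg_depth    # H x W foreground depth
        self.occ_color = occ_color  # H x W x 3 occlusion-layer color
        self.occ_depth = occ_depth  # H x W occlusion-layer depth (0 = empty)

def render_virtual_view(frame, focal_baseline):
    """Naive DIBR for a horizontally shifted virtual view: forward-warp
    the foreground with disparity d = focal_baseline / depth under a
    z-buffer, then warp the occlusion layer into still-empty pixels only."""
    h, w, _ = frame.fg_color.shape
    out = np.zeros_like(frame.fg_color)
    filled = np.zeros((h, w), dtype=bool)
    zbuf = np.full((h, w), np.inf)

    def warp(color, depth, fill_holes_only):
        for y in range(h):
            for x in range(w):
                z = depth[y, x]
                if z <= 0:
                    continue                      # empty / invalid sample
                xt = x + int(round(focal_baseline / z))
                if not (0 <= xt < w):
                    continue                      # warped outside the view
                if fill_holes_only and filled[y, xt]:
                    continue                      # pixel already covered
                if z < zbuf[y, xt]:
                    zbuf[y, xt] = z               # nearest sample wins
                    out[y, xt] = color[y, x]
                    filled[y, xt] = True

    warp(frame.fg_color, frame.fg_depth, fill_holes_only=False)
    warp(frame.occ_color, frame.occ_depth, fill_holes_only=True)
    return out, filled
```

Pixels that the occlusion layer cannot cover (e.g. at the image border) remain unfilled; the chapter's post-production pipeline addresses exactly the problem of generating a dense, high-quality occlusion layer so such holes are minimized.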


Keywords: 3D-TV · Alignment · Depth estimation · Depth image-based rendering (DIBR) · Foreground layer · Grab-cut · Hybrid camera system · Layered depth video (LDV) · LDV compliant capturing · Multi-view video plus depth (MVD) · Occlusion layer · Post-production · Bilateral filtering · Stereo matching · Thresholding · Time-of-flight (ToF) camera · Warping



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Computer Science Department, Christian-Albrechts-University of Kiel, Kiel, Germany
