3D Video Compression

  • Karsten Müller
  • Philipp Merkle
  • Gerhard Tech


This chapter presents compression methods for 3D video (3DV), covering data formats, video and depth compression, evaluation methods, and analysis tools. First, the fundamental principles of video coding for classical 2D video content are reviewed, including signal prediction, quantization, transformation, and entropy coding. These methods are then extended toward multi-view video coding (MVC), where inter-view prediction is added to the 2D video coding methods to achieve higher coding efficiency. Next, 3DV coding principles are introduced, which differ from the previous coding methods: in 3DV, a generic input format is used for coding, and a dense set of output views is generated for different types of autostereoscopic displays. This influences format selection, encoder optimization, and evaluation methods, and it requires new modules, such as decoder-side view generation, as discussed in this chapter. Finally, different 3DV formats are compared and discussed with respect to their applicability to 3DV systems.


3D video (3DV); Analysis tool; Correlation histogram; Data format; Depth-image-based rendering methods (DIBR); Depth-enhanced stereo (DES); Distortion measure; Entropy coding; Evaluation method; Inter-view prediction; Layered depth video (LDV); Multi-view video coding; Multi-view video plus depth (MVD); Rate-distortion-optimization; Transform; Video coding



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Image Processing Department, Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut, Berlin, Germany
