Multimedia Tools and Applications, Volume 78, Issue 6, pp 6529–6558

Depth compression via planar segmentation

  • S. Hemanth Kumar
  • K. R. Ramakrishnan


Augmented Reality applications are set to revolutionize the smartphone industry due to the integration of RGB-D sensors into mobile devices. Given the large number of smartphone users, efficient storage and transmission of RGB-D data is of paramount interest to the research community. While Video Coding Standards such as HEVC and H.264/AVC exist for compressing the RGB/texture component, the coding of depth data is still an area of active research. This paper presents a method for coding depth videos, captured from mobile RGB-D sensors, by planar segmentation. The segmentation is formulated under Markov Random Field assumptions on depth data and solved using Graph Cuts. While all prior works based on this approach remain restricted to single images under noise-free conditions, this paper presents an efficient solution to planar segmentation in noisy depth videos. Also presented is a unique method to encode depth based on its segmented planar representation. Experiments on depth captured from a noisy sensor (Microsoft Kinect) show superior Rate-Distortion performance over the 3D extension of the HEVC codec.
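The abstract does not spell out the segmentation or the encoder, so the following is only an illustrative sketch of the underlying idea and not the authors' method. It uses plain RANSAC (which appears among the keywords) instead of the Markov Random Field and Graph Cuts formulation, and it assumes a simple plane model z = a*x + b*y + c. The function names, the inlier tolerance, and the synthetic data are all hypothetical. The point it tries to convey is that a noisy planar region of several hundred depth samples can be summarised by three plane parameters.

```python
import numpy as np

def fit_plane_least_squares(points):
    """Least-squares fit of z = a*x + b*y + c to an (N, 3) array of points."""
    A = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    return coeffs

def ransac_plane(points, n_iters=200, inlier_tol=0.02, seed=0):
    """Robust plane fit: sample 3 points, count inliers, refit on the best set.

    n_iters and inlier_tol are assumed values chosen for this toy example.
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        A = np.column_stack([sample[:, 0], sample[:, 1], np.ones(3)])
        if abs(np.linalg.det(A)) < 1e-9:
            continue  # skip samples that are (nearly) collinear in x-y
        a, b, c = np.linalg.solve(A, sample[:, 2])
        residuals = np.abs(a * points[:, 0] + b * points[:, 1] + c - points[:, 2])
        inliers = residuals < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if not best_inliers.any():
        raise ValueError("RANSAC found no consensus set")
    # Refine the model using every inlier of the best hypothesis.
    return fit_plane_least_squares(points[best_inliers]), best_inliers

# Synthetic "depth segment": a plane with sensor-like noise and gross outliers.
data_rng = np.random.default_rng(1)
xy = data_rng.uniform(0.0, 1.0, size=(500, 2))
true_plane = np.array([0.3, -0.2, 1.5])            # assumed ground truth (a, b, c)
z = xy @ true_plane[:2] + true_plane[2] + data_rng.normal(0.0, 0.005, 500)
z[:50] += data_rng.uniform(0.2, 1.0, 50)           # 10% outliers
points = np.column_stack([xy, z])

coeffs, inliers = ransac_plane(points)
print("estimated plane (a, b, c):", coeffs)
print("inliers:", int(inliers.sum()), "of", len(points))
```

In this toy setting the 500 samples reduce to three coefficients plus an inlier mask. How segment boundaries and residuals would be coded is not covered by the sketch and is not inferred from the abstract.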


Keywords: Depth map video; Segmentation; Graph cuts; Data compression; Noisy depth sensors; RANSAC




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Electrical Engineering, Indian Institute of Science (IISc), Bangalore, India
