3D hierarchical optimization for multi-view depth map coding

Duch, Marc Maceira; Varas, David; Rubió, Josep Ramon Morros; Ruiz-Hidalgo, Javier; Marques, Ferran

doi:10.1007/s11042-017-5409-z

Published: 28 November 2017

Multimedia Tools and Applications, volume 77, pages 19869–19894 (2018)

Marc Maceira Duch¹, David Varas¹, Josep Ramon Morros Rubió¹, Javier Ruiz-Hidalgo¹ & Ferran Marques¹

Abstract

Depth data has come into widespread use with the popularity of high-resolution 3D sensors. In multi-view sequences, depth information supplements the color data of each view. This article proposes a joint encoding of multiple depth maps with a single representation. The color and depth images of each view are segmented independently, and the two segmentations are combined in a rate-distortion-optimal fashion. The resulting partitions are projected to a reference view, where a coherent hierarchy for the multiple views is built. A rate-distortion optimization is then applied to obtain the final segmentation by choosing nodes of the hierarchy. This consistent segmentation is used to robustly encode the depth maps of multiple views, obtaining results competitive with the HEVC coding standards.
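The last step described above, choosing nodes of a hierarchy by rate-distortion optimization, is often implemented as a bottom-up Lagrangian pruning of a partition tree. The following is only a minimal sketch of that general idea, assuming a cost of the form D + λR; the `Node` class, the `prune` function, and all numeric values are hypothetical illustrations and are not taken from the article.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One region of the hierarchy (hypothetical structure for illustration)."""
    name: str
    distortion: float  # error if the region is coded as a single unit
    rate: float        # bits needed to code the region as a single unit
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, lam: float) -> Tuple[float, List[Node]]:
    """Return (cost, selected nodes) minimising D + lam * R in this subtree."""
    own_cost = node.distortion + lam * node.rate
    if not node.children:
        return own_cost, [node]

    # Cost of the best partition obtained by splitting into the children.
    split_cost = 0.0
    split_nodes: List[Node] = []
    for child in node.children:
        cost, nodes = prune(child, lam)
        split_cost += cost
        split_nodes.extend(nodes)

    # Keep the node itself when coding it whole is at least as cheap.
    if own_cost <= split_cost:
        return own_cost, [node]
    return split_cost, split_nodes


# Toy hierarchy: one root region that can be split into two leaves.
leaves = [Node("a", 1.0, 10.0), Node("b", 2.0, 12.0)]
root = Node("root", 20.0, 8.0, leaves)

cost_low, cut_low = prune(root, lam=0.1)    # rate is cheap: finer partition
cost_high, cut_high = prune(root, lam=2.0)  # rate is costly: coarser partition
print([n.name for n in cut_low])
print([n.name for n in cut_high])
```

In this toy example a small λ favors the two leaves (low distortion, more bits), while a large λ favors the single root region (fewer bits, higher distortion). Sweeping λ traces out different operating points on the rate-distortion curve.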

Acknowledgements

This work has been developed in the framework of projects TEC2013-43935-R and TEC2016-75976-R, financed by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).

Author information

Authors and Affiliations

Universitat Politècnica de Catalunya, C. Jordi Girona, 1-3, Barcelona, Spain
Marc Maceira Duch, David Varas, Josep Ramon Morros Rubió, Javier Ruiz-Hidalgo & Ferran Marques

Corresponding author

Correspondence to Marc Maceira Duch.

About this article

Cite this article

Duch, M.M., Varas, D., Rubió, J.R.M. et al. 3D hierarchical optimization for multi-view depth map coding. Multimed Tools Appl 77, 19869–19894 (2018). https://doi.org/10.1007/s11042-017-5409-z

Received: 15 February 2017
Revised: 30 September 2017
Accepted: 09 November 2017
Published: 28 November 2017
Issue Date: August 2018
DOI: https://doi.org/10.1007/s11042-017-5409-z
