Random access prediction structures for light field video coding with MV-HEVC

Avramelos, Vasileios; De Praeter, Johan; Van Wallendael, Glenn; Lambert, Peter

doi:10.1007/s11042-019-08605-x

Published: 23 January 2020

Volume 79, pages 12847–12867, (2020)

Multimedia Tools and Applications

Vasileios Avramelos ORCID: orcid.org/0000-0001-6601-0920¹,
Johan De Praeter¹,
Glenn Van Wallendael¹ &
…
Peter Lambert¹

Abstract

Computational imaging and light field technology promise to deliver the six degrees of freedom required for natural scenes in virtual reality. Existing extensions of standardized video coding formats, such as multi-view coding and multi-view plus depth, are currently the most conventional solutions for light field video coding. The latest multi-view coding format, a direct extension of the High Efficiency Video Coding (HEVC) standard, is multi-view HEVC (MV-HEVC). MV-HEVC treats each light field view as a separate video sequence and uses syntax elements similar to those of standard HEVC to exploit redundancies between neighboring views. To this end, inter-view and temporal prediction schemes are deployed with the aim of finding the best trade-off between coding performance and reconstruction quality. The number of possible prediction structures is unlimited, and many have been proposed in the literature. Although some of them are efficient in terms of compression ratio, they complicate random access because of their dependencies on previously decoded pixels or frames. Random access is an important feature in video delivery and a crucial requirement in multi-view video coding. In this work, we propose and compare different prediction structures for coding light field video with MV-HEVC, focusing on both compression efficiency and random accessibility. Experiments on three short-baseline light field video sequences show the trade-off between bit-rate and distortion, as well as the average number of views/frames that must be decoded to display any random frame at any time instance. The findings indicate the most appropriate prediction structure depending on the available bandwidth and the required degree of random access.
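
The random-access cost mentioned in the abstract (the average number of views that must be decoded before a requested view can be displayed) can be illustrated with a small sketch. The code below is not the authors' method or evaluation code: the function names `decode_set` and `average_decode_cost`, and the two five-view structures `chain` and `central`, are hypothetical and chosen only for illustration. It assumes a prediction structure can be modelled as a mapping from each view to the views it references, and that displaying a view requires decoding every view it depends on, directly or indirectly.

```python
def decode_set(view, refs):
    """Return all views that must be decoded to display `view`, itself included.

    `refs` maps each view index to the list of views it predicts from
    (a hypothetical, simplified model of inter-view dependencies).
    """
    needed, stack = set(), [view]
    while stack:
        v = stack.pop()
        if v not in needed:
            needed.add(v)
            stack.extend(refs.get(v, ()))
    return needed


def average_decode_cost(refs):
    """Average number of decoded views over all possible target views."""
    return sum(len(decode_set(v, refs)) for v in refs) / len(refs)


# Hypothetical row of five views, two illustrative structures:
# a chain (each view predicts from its left neighbour) and a
# central structure (every view predicts from the middle view only).
chain = {0: [], 1: [0], 2: [1], 3: [2], 4: [3]}
central = {2: [], 0: [2], 1: [2], 3: [2], 4: [2]}

print(average_decode_cost(chain))    # 3.0
print(average_decode_cost(central))  # 1.8
```

In this toy model the chain requires more decoded views on average than the central structure, which illustrates why random accessibility has to be weighed alongside compression efficiency. Temporal dependencies are ignored here for brevity.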

Acknowledgements

The research activities described in this article were funded by IDLab (Ghent University - imec), Flanders Innovation & Entrepreneurship (VLAIO), the Fund for Scientific Research Flanders (FWO Flanders), and the European Union. We would also like to express our gratitude to John Carmack (currently CTO of Oculus VR) for sharing his ideas via social media (Twitter), which triggered further investigation of specific parts of this work.

Author information

Authors and Affiliations

Department of Electronics and Information Systems, Ghent University - imec, IDLab, Technologiepark-Zwijnaarde 122, 9052, Ghent, Belgium
Vasileios Avramelos, Johan De Praeter, Glenn Van Wallendael & Peter Lambert

Corresponding author

Correspondence to Vasileios Avramelos.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Avramelos, V., De Praeter, J., Van Wallendael, G. et al. Random access prediction structures for light field video coding with MV-HEVC. Multimed Tools Appl 79, 12847–12867 (2020). https://doi.org/10.1007/s11042-019-08605-x

Received: 12 April 2019
Revised: 20 November 2019
Accepted: 20 December 2019
Published: 23 January 2020
Issue Date: May 2020
DOI: https://doi.org/10.1007/s11042-019-08605-x
