Abstract
Computational imaging and light field technology promise to deliver the required six degrees of freedom for natural scenes in virtual reality. Existing extensions of standardized video coding formats, such as multi-view coding and multi-view plus depth, are currently the most conventional light field video coding solutions. The latest multi-view coding format, a direct extension of the High Efficiency Video Coding (HEVC) standard, is multi-view HEVC (MV-HEVC). MV-HEVC treats each light field view as a separate video sequence and uses syntax elements similar to those of standard HEVC to exploit redundancies between neighboring views. To achieve this, inter-view and temporal prediction schemes are deployed with the aim of finding the best trade-off between coding performance and reconstruction quality. The number of possible prediction structures is practically unlimited, and many have been proposed in the literature. Although some of them are efficient in terms of compression ratio, they complicate random access because of their dependencies on previously decoded pixels or frames. Random access is an important feature in video delivery and a crucial requirement in multi-view video coding. In this work, we propose and compare different prediction structures for coding light field video using MV-HEVC, with a focus on both compression efficiency and random accessibility. Experiments on three different short-baseline light field video sequences show the trade-off between bit-rate and distortion, as well as the average number of views/frames that must be decoded to display any random frame at any time instant. The findings of this work indicate the most appropriate prediction structure depending on the available bandwidth and the required degree of random access.
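As a rough illustration of the random-access measure mentioned above (the average number of views/frames that must be decoded to display a given frame), the following minimal sketch counts dependencies in a prediction structure. It assumes the structure is given as a mapping from each frame to the frames it directly references. The function names `frames_to_decode` and `average_decode_cost`, and the two toy five-view structures `chain` and `central`, are hypothetical and are not taken from the article or from the MV-HEVC reference software.

```python
from typing import Dict, Hashable, List, Set


def frames_to_decode(deps: Dict[Hashable, List[Hashable]], target: Hashable) -> Set[Hashable]:
    """Return the target frame plus every frame it depends on, directly or indirectly."""
    needed: Set[Hashable] = set()
    stack = [target]
    while stack:
        frame = stack.pop()
        if frame in needed:
            continue  # already accounted for via another reference path
        needed.add(frame)
        stack.extend(deps.get(frame, ()))
    return needed


def average_decode_cost(deps: Dict[Hashable, List[Hashable]]) -> float:
    """Average number of frames decoded to display one frame, over all frames."""
    frames = list(deps)
    return sum(len(frames_to_decode(deps, f)) for f in frames) / len(frames)


# Hypothetical structures for one time instant of a 5-view row (assumption).
# chain: each view is predicted from its left neighbor.
chain = {0: [], 1: [0], 2: [1], 3: [2], 4: [3]}
# central: every view is predicted from the central view 2 only.
central = {2: [], 0: [2], 1: [2], 3: [2], 4: [2]}

print(average_decode_cost(chain))    # 3.0
print(average_decode_cost(central))  # 1.8
```

In this toy case the chain structure requires 3.0 decoded views on average, while the central-reference structure requires 1.8, which shows how longer dependency chains increase the cost of random access. The sketch says nothing about the compression efficiency of either structure.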
Acknowledgements
The research activities described in this article were funded by IDLab (Ghent University - imec), Flanders Innovation & Entrepreneurship (VLAIO), the Fund for Scientific Research Flanders (FWO Flanders), and the European Union. We would also like to express our gratitude to John Carmack (currently CTO of Oculus VR) for sharing his ideas via social media (Twitter), which triggered further investigation of specific parts of this work.
About this article
Cite this article
Avramelos, V., De Praeter, J., Van Wallendael, G. et al. Random access prediction structures for light field video coding with MV-HEVC. Multimed Tools Appl 79, 12847–12867 (2020). https://doi.org/10.1007/s11042-019-08605-x
DOI: https://doi.org/10.1007/s11042-019-08605-x