Multimedia Tools and Applications, Volume 77, Issue 17, pp 22985–23008

Perceptual rate distortion optimization of 3D–HEVC using PSNR-HVS

  • Sima Valizadeh
  • Panos Nasiopoulos
  • Rabab Ward


Improved compression efficiency is highly desirable for the transmission and storage of 3D video. 3D–HEVC achieves higher compression efficiency than simulcast HEVC or disparity-compensated multi-view video coding (MVC). In 3D–HEVC, the mean square error (MSE) is used to measure distortion in the rate distortion optimization (RDO) process. However, MSE is a poor measure of visual quality because it correlates poorly with human perception. We propose integrating a perceptual video quality metric into the rate distortion optimization process of 3D–HEVC. Specifically, PSNR-HVS, a metric based on the characteristics of the human visual system (HVS), is used as the distortion measure in the coding unit (CU) mode selection process. Performance evaluations show that the proposed approach improves the compression efficiency of 3D–HEVC for multi-view videos by 2.78%.
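As a rough illustration of the idea described above, the following minimal sketch shows Lagrangian mode selection, where each candidate is scored by J = D + λR and the distortion term D can be swapped between MSE and a perceptual measure. It is not the authors' implementation or the reference encoder's code: the function names, the 4×4 block size, the candidate modes, the bit counts and the frequency weights are all assumptions made for illustration, and the weights are placeholders rather than the actual PSNR-HVS contrast sensitivity table.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def mse(orig, recon):
    """Plain pixel-domain mean square error."""
    n = len(orig)
    return sum((orig[i][j] - recon[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

def weighted_dct_mse(orig, recon):
    """Toy perceptual distortion: squared DCT-domain error, weighted per frequency.

    The weights below are hypothetical placeholders that simply decrease
    with frequency; they are NOT the PSNR-HVS weighting table.
    """
    n = len(orig)
    diff = [[orig[i][j] - recon[i][j] for j in range(n)] for i in range(n)]
    coeffs = dct2(diff)
    total = 0.0
    for u in range(n):
        for v in range(n):
            w = 1.0 / (1.0 + u + v)  # assumed weight, for illustration only
            total += (w * coeffs[u][v]) ** 2
    return total / (n * n)

def select_mode(orig, candidates, lam, distortion):
    """Return (name, cost) of the candidate minimizing J = D + lam * R.

    candidates: iterable of (name, reconstructed_block, rate_in_bits).
    """
    best_name, best_cost = None, float("inf")
    for name, recon, rate_bits in candidates:
        cost = distortion(orig, recon) + lam * rate_bits
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost

# Two hypothetical candidate reconstructions of a flat 4x4 block.
orig = [[128.0] * 4 for _ in range(4)]
dc_shift = [[130.0] * 4 for _ in range(4)]
checker = [[128.0 + 2.2 * (-1) ** (i + j) for j in range(4)] for i in range(4)]
candidates = [("dc_shift", dc_shift, 12), ("checker", checker, 10)]

print(select_mode(orig, candidates, 0.1, mse)[0])
print(select_mode(orig, candidates, 0.1, weighted_dct_mse)[0])
```

In this toy setup the two distortion measures select different candidates, because the assumed weights penalize the low-frequency brightness shift more heavily than the high-frequency checkerboard error. A real encoder would also need a Lagrange multiplier matched to the scale of the chosen distortion measure, which this sketch does not address.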


Perceptual video coding; Rate distortion optimization (RDO); Human visual system (HVS); PSNR-HVS; 3D–HEVC; Coding unit structure



This work was supported by the NPRP grant # NPRP 4-463-2-172 from the Qatar National Research Fund (a member of the Qatar Foundation). The statements made herein are solely the responsibility of the authors.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. University of British Columbia, Vancouver, Canada
