Foveation-based content adaptive root mean squared error for video quality assessment

Abstract

When video is compressed and transmitted over heterogeneous networks, it is necessary to ensure satisfactory quality for the end user. Since human observers are the end users of video applications, human visual system (HVS) characteristics must be taken into account during video quality evaluation. This paper deals with video quality assessment (VQA) based on HVS characteristics and proposes a novel full-reference (FR) VQA metric called the Foveation-based content Adaptive Root Mean Squared Error (FARMSE). FARMSE uses several HVS characteristics that significantly influence the perception of distortions in a video, primarily foveated vision, the reduction of spatial acuity due to motion, and spatial masking. Foveated vision refers to the variable resolution of the HVS across the viewing field, with the highest resolution at the point of fixation. The point of fixation is projected onto the fovea, the area of the retina with the highest density of photoreceptors. The part of the image that falls on the fovea is perceived with the highest acuity, and spatial acuity decreases as the distance of an image part from the fovea increases. Spatial acuity decreases further if the eyes cannot track moving objects. Both mechanisms influence the contrast sensitivity of the HVS. Contrast sensitivity is frequency dependent, and FARMSE uses Haar filters to exploit this dependence. Furthermore, spatial masking is implemented in each frequency channel. The performance of FARMSE is compared to that of nine state-of-the-art VQA metrics on two different databases, LIVE and ECVQ. Additionally, the metrics are compared in terms of computational complexity. The experiments show that FARMSE achieves high performance when predicting the quality of videos with different resolutions, degradation types and content types.
FARMSE outperforms most of the analyzed metrics and is comparable to the best publicly available metrics, including the well-known MOtion-based Video Integrity Evaluation (MOVIE) index. In addition, the computational complexity of FARMSE is significantly lower than that of the metrics with comparable prediction accuracy.
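
To illustrate the general idea of combining Haar subbands with foveation weighting, the following minimal Python sketch computes a foveation-weighted RMSE for a single pair of grayscale frames. It is not the FARMSE formulation: the abstract gives no equations, so the weighting function `foveation_weight`, the `falloff` parameter, the per-band `band_gain` values and all function names are assumptions made purely for illustration. Motion-dependent acuity reduction and spatial masking are omitted.

```python
import math


def haar_subbands(frame):
    """One-level orthonormal 2D Haar transform of a frame with even dimensions.

    `frame` is a list of rows of numbers. Returns four subbands, each a
    quarter of the frame size, keyed by the kind of difference they hold.
    """
    rows, cols = len(frame), len(frame[0])
    bands = {"approx": [], "col_diff": [], "row_diff": [], "diag": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            a, b = frame[r][c], frame[r][c + 1]
            d, e = frame[r + 1][c], frame[r + 1][c + 1]
            out["approx"].append((a + b + d + e) / 2.0)    # local average
            out["col_diff"].append((a - b + d - e) / 2.0)  # difference across columns
            out["row_diff"].append((a + b - d - e) / 2.0)  # difference across rows
            out["diag"].append((a - b - d + e) / 2.0)      # diagonal difference
        for name in bands:
            bands[name].append(out[name])
    return bands


def foveation_weight(row, col, fixation, falloff):
    """Hypothetical weight that decays with pixel distance from the fixation point."""
    distance = math.hypot(row - fixation[0], col - fixation[1])
    return 1.0 / (1.0 + falloff * distance)


def foveated_rmse(reference, distorted, fixation, falloff=0.05, band_gain=None):
    """Foveation-weighted RMSE over the Haar subbands of the error frame.

    With falloff=0 and unit gains this reduces to the plain pixel RMSE,
    because the orthonormal Haar transform preserves energy.
    """
    gains = band_gain or {"approx": 1.0, "col_diff": 1.0, "row_diff": 1.0, "diag": 1.0}
    error = [[p - q for p, q in zip(ref_row, dist_row)]
             for ref_row, dist_row in zip(reference, distorted)]
    bands = haar_subbands(error)
    weighted_sum = 0.0
    weight_total = 0.0
    for r, row in enumerate(bands["approx"]):
        for c in range(len(row)):
            # Centre of the 2x2 pixel block behind this coefficient.
            w = foveation_weight(2 * r + 0.5, 2 * c + 0.5, fixation, falloff)
            weight_total += w
            for name, gain in gains.items():
                weighted_sum += gain * w * bands[name][r][c] ** 2
    # Each coefficient position covers four pixels, hence the factor 4.
    return math.sqrt(weighted_sum / (4.0 * weight_total))
```

In this sketch, an error located near the assumed fixation point contributes more to the score than the same error in the periphery, which mirrors the qualitative behaviour of foveated vision described above. The per-band gains are a placeholder for a frequency-dependent contrast sensitivity model.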

Acknowledgements

This work was supported by the J.J. Strossmayer University of Osijek business fund through the internal competition for research and artistic projects "IZIP-2016" (project title: "Providing of digital video signal based services in rural and rarely populated areas").

Author information

Corresponding author

Correspondence to Mario Vranješ.

About this article

Cite this article

Vranješ, M., Rimac-Drlje, S. & Vranješ, D. Foveation-based content adaptive root mean squared error for video quality assessment. Multimed Tools Appl 77, 21053–21082 (2018). https://doi.org/10.1007/s11042-017-5544-6

  • DOI: https://doi.org/10.1007/s11042-017-5544-6

Keywords

  • FARMSE
  • Foveated vision
  • Human visual system
  • Spatio-temporal activity
  • Video quality assessment