No-Reference Video Quality Assessment Based on Ensemble of Knowledge and Data-Driven Models

  • Li Su
  • Pamela Cosman
  • Qihang Peng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11296)


No-reference (NR) video quality assessment (VQA) aims to evaluate video distortion in line with human visual perception without referring to the corresponding pristine signal. Many methods design models from prior knowledge drawn from human experience. This is challenging because of the underlying complexity of video content and the relatively limited understanding of the intricate mechanisms of the human visual system. Recently, some learning-based NR-VQA methods have been proposed; these are regarded as data-driven methods. However, in many practical scenarios labeled data is quite limited, which significantly restricts the learning ability of such methods. In this paper, we first propose a data-driven model, V-CNN, which adaptively fits the spatial and temporal distortion of time-varying video content. By using a shallow neural network, the spatial part runs faster than traditional models. By introducing temporal SSIM jitter and hysteresis pooling, the temporal part is more consistent with human subjective perception. We then exploit the complementarity of V-CNN and a knowledge-driven model, VIIDEO. Compared to state-of-the-art full-reference, reduced-reference and no-reference VQA methods, the proposed ensemble model shows a better balance between performance and efficiency with limited training data.
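As a rough illustration of two ideas named in the abstract, the sketch below pools per-frame quality scores with a simplified hysteresis-style rule and then blends two model outputs into one ensemble score. It is not the paper's implementation: the function names, the window length `tau`, the mixing factor `alpha`, the ensemble `weight`, and the assumption that both model scores lie on a common scale (higher meaning better) are all hypothetical choices made for this example.

```python
import numpy as np


def hysteresis_pool(frame_scores, tau=3, alpha=0.8):
    """Simplified hysteresis-style pooling of per-frame scores (higher = better).

    Assumption: each frame's pooled value mixes the worst score seen in the
    previous `tau` frames (a memory term) with the current frame's score.
    """
    q = np.asarray(frame_scores, dtype=float)
    pooled = np.empty_like(q)
    for t in range(len(q)):
        past = q[max(0, t - tau):t]
        # With no history yet, fall back to the current score.
        memory = past.min() if past.size else q[t]
        pooled[t] = alpha * memory + (1.0 - alpha) * q[t]
    # The video-level score is the mean of the pooled frame values.
    return float(pooled.mean())


def ensemble_score(data_driven, knowledge_driven, weight=0.5):
    """Convex combination of two quality predictions on a shared scale."""
    return weight * data_driven + (1.0 - weight) * knowledge_driven


if __name__ == "__main__":
    scores = [1.0, 1.0, 0.0, 1.0, 1.0]  # one badly distorted frame
    print("plain mean:     ", np.mean(scores))
    print("hysteresis pool:", hysteresis_pool(scores))
    print("ensemble:       ", ensemble_score(0.6, 0.4))
```

Under these assumptions, a brief quality drop lowers the pooled score by more than plain averaging would, which mirrors the intuition that viewers remember poor moments longer than they last.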


No reference video quality assessment; Neural network; Prior knowledge; Spatial and temporal information



This work was supported in part by the China Scholarship Council Program and by the National Natural Science Foundation of China under grants 61472389, 61332016 and 61301154.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Chinese Academy of Sciences, Beijing, China
  2. University of California at San Diego, San Diego, USA
  3. University of Electronic Science and Technology of China, Chengdu, China
