No-Reference Video Quality Assessment Based on Ensemble of Knowledge and Data-Driven Models
No-reference (NR) video quality assessment (VQA) aims to evaluate video distortion in line with human visual perception without access to the corresponding pristine signal. Many methods design models from prior knowledge of human viewing experience, which is challenging because of the underlying complexity of video content and the relatively limited understanding of the intricate mechanisms of the human visual system. Recently, learning-based NR-VQA methods, regarded as data-driven methods, have been proposed. However, in many practical scenarios the labeled data is quite limited, which significantly restricts their learning ability. In this paper, we first propose a data-driven model, V-CNN, which adaptively fits the spatial and temporal distortion of time-varying video content. By using a shallow neural network, its spatial part runs faster than traditional models; its temporal part is made more consistent with human subjective perception by introducing temporal SSIM jitter and hysteresis pooling. We then exploit the complementarity between V-CNN and a knowledge-driven model, VIIDEO. Compared with state-of-the-art full-reference, reduced-reference, and no-reference VQA methods, the proposed ensemble model achieves a better balance between performance and efficiency with limited training data.
Keywords: No-reference video quality assessment · Neural network · Prior knowledge · Spatial and temporal information
This work was supported in part by the China Scholarship Council Program and by the National Natural Science Foundation of China (Grants 61472389, 61332016, and 61301154).
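The temporal components named in the abstract can be illustrated in a minimal sketch. Temporal SSIM jitter here is taken to mean the average frame-to-frame fluctuation of per-frame SSIM scores, and hysteresis pooling follows the general idea that viewers react sharply to quality drops but recover slowly; the window size `tau`, the memory weight `alpha`, and both function names are hypothetical choices for illustration, not the paper's actual parameterization.

```python
import numpy as np

def temporal_ssim_jitter(frame_scores):
    """Mean absolute difference of consecutive per-frame SSIM scores
    (one plausible reading of 'temporal SSIM jitter')."""
    s = np.asarray(frame_scores, dtype=float)
    return float(np.mean(np.abs(np.diff(s))))

def hysteresis_pool(frame_scores, tau=5, alpha=0.8):
    """Hedged sketch of hysteresis pooling: each frame's pooled score
    mixes the worst quality seen in a recent window (slow recovery
    from drops) with the current frame's score."""
    s = np.asarray(frame_scores, dtype=float)
    pooled = np.empty_like(s)
    for t in range(len(s)):
        memory = s[max(0, t - tau):t + 1].min()  # worst recent score
        pooled[t] = alpha * memory + (1 - alpha) * s[t]
    return float(pooled.mean())
```

With this weighting, a brief quality drop depresses the pooled score for `tau` subsequent frames, mimicking the lag in subjective recovery.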