Abstract
Deep learning techniques are used for most computer vision tasks. In particular, Convolutional Neural Networks (CNNs) have shown strong performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but disparity information, which is very important, has not been considered well. Most recently proposed deep learning-based methods use cost volumes to produce the stereo correspondence for large disparities. Because disparities can differ considerably between stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture stereo correspondence regardless of disparity variations. In this paper, we propose a new SVQA model that uses a 3D CNN-based base network and a modified PAM-based model for fusing left and right features. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while taking the disparity information into account, and we use fully connected layers to calculate the quality score of a stereoscopic video. We divide the input videos into cube patches for data augmentation and remove cubes that confuse our model from the training dataset. Two standard SVQA benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that the proposed model is highly competitive with state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset and achieves state-of-the-art performance on the LFOVIAS3DPh2 dataset.
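As an illustration of the left-right fusion idea mentioned in the abstract, the following is a minimal NumPy sketch of a basic parallax attention step. It is not the modified PAM of this paper, whose details are not given here. It assumes rectified views, so that correspondences lie on the same image row, and it omits the learned projections of a real module. The function name `parallax_attention_fuse`, the scaling by the square root of the channel count, and the choice to concatenate the warped right features with the left features are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def parallax_attention_fuse(left, right):
    """Fuse left/right feature maps with row-wise (epipolar) attention.

    left, right: arrays of shape (H, W, C) from rectified views.
    Returns the fused features (H, W, 2C) and the attention map (H, W, W).
    Hypothetical helper: learned query/key projections are omitted.
    """
    h, w, c = left.shape
    # scores[i, j, k]: similarity between left pixel (i, j) and
    # right pixel (i, k) on the same row; no fixed disparity range.
    scores = np.einsum("ijc,ikc->ijk", left, right) / np.sqrt(c)  # scaling is an assumption
    attn = softmax(scores, axis=-1)
    # Aggregate right-view features into the left view using the attention.
    right_to_left = np.einsum("ijk,ikc->ijc", attn, right)
    # Assumed fusion: concatenate along the channel axis.
    fused = np.concatenate([left, right_to_left], axis=-1)
    return fused, attn


rng = np.random.default_rng(0)
left = rng.standard_normal((4, 6, 8))
right = rng.standard_normal((4, 6, 8))
fused, attn = parallax_attention_fuse(left, right)
print(fused.shape, attn.shape)
```

Because the attention is computed over every position in a row, the sketch does not depend on a preset maximum disparity, which is the property the abstract attributes to PAM in contrast to cost volume methods.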
Acknowledgment
This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Outstanding International Researchers Program, Project No. 118C301.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Imani, H., Zaim, S., Islam, M.B., Junayed, M.S. (2022). Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module. In: Durakbasa, N.M., Gençyılmaz, M.G. (eds) Digitizing Production Systems. Lecture Notes in Mechanical Engineering. Springer, Cham. https://doi.org/10.1007/978-3-030-90421-0_4
DOI: https://doi.org/10.1007/978-3-030-90421-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90420-3
Online ISBN: 978-3-030-90421-0
eBook Packages: Engineering (R0)