Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module

  • Conference paper
Digitizing Production Systems

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Abstract

Deep learning techniques are used for most computer vision tasks. In particular, Convolutional Neural Networks (CNNs) have shown strong performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but the disparity information, which is very important, has not been considered adequately. Most recently proposed deep learning-based methods use cost volume methods to produce the stereo correspondence for large disparities. Because disparities can differ considerably between stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture the stereo correspondence regardless of disparity variations. In this paper, we propose a new SVQA model that combines a base 3D CNN-based network with a modified PAM-based left and right feature fusion model. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while taking the disparity information into account, and we use fully connected layers to calculate the quality score of a stereoscopic video. We divide the input videos into cube patches for data augmentation and remove from the training dataset some cubes that confuse our model. Two standard stereoscopic video quality assessment benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that our proposed model is very competitive with the state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset and achieves state-of-the-art performance on the LFOVIAS3DPh2 dataset.
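
The two data-handling ideas named in the abstract, cube patches and attention-based left/right fusion, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not describe how PAM is modified, so the code below shows only a generic parallax-style attention along image rows, with no learned projections. The function names `split_into_cubes` and `fuse_views`, the cube size, and the feature shapes are all hypothetical choices made for illustration.

```python
import numpy as np

def split_into_cubes(video, cube_shape):
    """Cut a (T, H, W) video into non-overlapping (t, h, w) cubes.

    Hypothetical helper: the abstract only says videos are divided into
    cube patches; the sizes and the lack of overlap are assumptions.
    """
    t, h, w = cube_shape
    T, H, W = video.shape[:3]
    return [video[i:i + t, j:j + h, k:k + w]
            for i in range(0, T - t + 1, t)
            for j in range(0, H - h + 1, h)
            for k in range(0, W - w + 1, w)]

def fuse_views(left, right):
    """Fuse (C, H, W) left/right features with row-wise attention.

    For each image row, every left position attends over all right
    positions on the same row, so no disparity range has to be fixed
    in advance. Generic sketch only, not the paper's modified PAM.
    """
    q = left.transpose(1, 2, 0)        # (H, W, C) queries from the left view
    k = right.transpose(1, 0, 2)       # (H, C, W) keys from the right view
    scores = q @ k                     # (H, W, W) row-wise similarities
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    v = right.transpose(1, 2, 0)       # (H, W, C) values from the right view
    warped = (attn @ v).transpose(2, 0, 1)  # right features aligned to left
    fused = np.concatenate([left, warped], axis=0)  # (2C, H, W)
    return fused, attn

# Example with random data standing in for real video and CNN features.
rng = np.random.default_rng(0)
cubes = split_into_cubes(rng.normal(size=(8, 64, 64)), (4, 32, 32))
fused, attn = fuse_views(rng.normal(size=(16, 6, 10)),
                         rng.normal(size=(16, 6, 10)))
```

In a full model, the fused features would presumably be passed to the fully connected layers that regress the quality score; that step is omitted here because the abstract gives no details about it.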

Acknowledgment

This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Outstanding International Researchers Program, Project No. 118C301.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Imani, H., Zaim, S., Islam, M.B., Junayed, M.S. (2022). Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module. In: Durakbasa, N.M., Gençyılmaz, M.G. (eds) Digitizing Production Systems. Lecture Notes in Mechanical Engineering. Springer, Cham. https://doi.org/10.1007/978-3-030-90421-0_4

  • DOI: https://doi.org/10.1007/978-3-030-90421-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-90420-3

  • Online ISBN: 978-3-030-90421-0

  • eBook Packages: Engineering, Engineering (R0)