
Leveraging visual attention and neural activity for stereoscopic 3D visual comfort assessment

Multimedia Tools and Applications


Visual comfort assessment (VCA) for stereoscopic three-dimensional (S3D) images is a challenging problem in the 3D quality of experience (3D-QoE) community. The goal of VCA is to automatically predict the degree of perceived visual discomfort in line with subjective judgment. The challenges of VCA typically lie in two aspects: 1) formulating effective visual comfort-aware features, and 2) finding an appropriate way to pool them into an overall visual comfort score. In this paper, a novel two-stage framework is proposed to address these problems. In the first stage, a primary predictive feature (PPF) and an advanced predictive feature (APF) are separately extracted and then integrated to reflect the perceived visual discomfort of 3D viewing. Specifically, we compute S3D visual attention-weighted disparity statistics to construct the PPF, and the neural activities of the middle temporal (MT) area of the human brain to construct the APF. In the second stage, the integrated visual comfort-aware features are fused into a single visual comfort score using random forest (RF) regression, which maps a high-dimensional feature space into a low-dimensional quality (visual comfort) space. Comparisons with five relevant state-of-the-art models on a standard benchmark database confirm the superior performance of the proposed method.
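To make the idea of attention-weighted disparity statistics concrete, the following is a minimal sketch and not the paper's actual feature definition, which this preview does not give. It assumes a disparity map and a saliency (visual attention) map of the same size, and it uses a saliency-weighted mean and standard deviation as illustrative statistics. The function name `attention_weighted_stats` and the toy maps are hypothetical.

```python
from math import sqrt


def attention_weighted_stats(disparity, saliency):
    """Return the saliency-weighted mean and standard deviation of a disparity map.

    Both arguments are 2D lists of equal size. Saliency values act as
    non-negative weights, so disparities at salient pixels dominate.
    This is an illustrative choice of statistics, not the paper's PPF.
    """
    d = [v for row in disparity for v in row]
    w = [v for row in saliency for v in row]
    if len(d) != len(w):
        raise ValueError("disparity and saliency maps must have the same size")
    total = sum(w)
    if total <= 0:
        raise ValueError("saliency map must contain positive weight")
    # Weighted mean: each disparity contributes in proportion to its saliency.
    mean = sum(wi * di for wi, di in zip(w, d)) / total
    # Weighted variance around the weighted mean.
    var = sum(wi * (di - mean) ** 2 for wi, di in zip(w, d)) / total
    return mean, sqrt(var)


# Toy 2x2 example (hypothetical values, in pixels of disparity).
disparity = [[0.0, 2.0], [4.0, 6.0]]
saliency = [[1.0, 1.0], [0.0, 2.0]]
mean, std = attention_weighted_stats(disparity, saliency)
print(mean, std)
```

In a full pipeline of the kind the abstract describes, statistics like these would be concatenated with other features into one vector per image, and a regressor such as a random forest would be trained to map those vectors to subjective comfort scores. That pooling step is not shown here.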





The authors would like to thank the editor and all of the reviewers for their valuable comments and suggestions, which have led to improvements in the quality and presentation of this paper. This work was supported in part by the Natural Science Foundation of China (grants 61271021 and U1301257) and in part by the Scientific Research Foundation of Graduate School of Ningbo University. It was also sponsored by the K.C. Wong Magna Fund in Ningbo University.

Author information


Corresponding author

Correspondence to Feng Shao.


Cite this article

Jiang, Q., Shao, F., Jiang, G. et al. Leveraging visual attention and neural activity for stereoscopic 3D visual comfort assessment. Multimed Tools Appl 76, 9405–9425 (2017).
