Comfort-driven disparity adjustment for stereoscopic video

Abstract

Pixel disparity (the offset of corresponding pixels between left and right views) is a crucial parameter in stereoscopic three-dimensional (S3D) video, as it determines the depth perceived by the human visual system (HVS). An unsuitable distribution of pixel disparity throughout an S3D video may lead to visual discomfort. We present a unified and extensible disparity adjustment framework for stereoscopic video, which improves the viewing experience by minimizing discomfort while preserving the perceived 3D appearance as far as possible. We first analyse disparity and motion attributes of S3D video in general, then derive a wide-ranging visual discomfort metric from existing perceptual comfort models. An objective function based on this metric is used as the basis of a hierarchical optimisation method to find a disparity mapping function for each input video frame. Warping-based disparity manipulation is then applied to the input video to generate the output video, using the desired disparity mappings as constraints. Our comfort metric takes into account disparity range, motion, and stereoscopic window violation; the framework could easily be extended to use further visual comfort models. We demonstrate the effectiveness of our approach using both animated cartoons and real S3D videos.
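
To make the idea of a per-frame disparity mapping concrete, the following is a minimal sketch only, not the metric or the hierarchical optimiser of the paper. It assumes a toy discomfort term (squared distance outside a hypothetical comfort range), a fidelity term that penalises changes to the original disparities, and a plain grid search over linear mappings d -> s*d + t. The names `toy_discomfort`, `fit_linear_mapping`, and `fidelity_weight`, and all numeric ranges, are illustrative assumptions.

```python
from typing import Sequence, Tuple


def toy_discomfort(disparities: Sequence[float],
                   comfort: Tuple[float, float]) -> float:
    """Hypothetical stand-in for a comfort metric.

    Returns the mean squared distance by which disparities fall
    outside the comfort range (lo, hi). Motion and stereoscopic
    window violation are deliberately ignored in this sketch.
    """
    lo, hi = comfort
    total = 0.0
    for d in disparities:
        excess = max(lo - d, 0.0, d - hi)
        total += excess * excess
    return total / len(disparities)


def fit_linear_mapping(disparities: Sequence[float],
                       comfort: Tuple[float, float],
                       fidelity_weight: float = 0.1) -> Tuple[float, float]:
    """Grid-search a mapping d -> s*d + t for one frame.

    The cost trades off the toy discomfort of the mapped disparities
    against how far they move from the originals. Only s > 0 is
    searched, so the mapping preserves depth ordering.
    """
    best_cost, best_s, best_t = float("inf"), 1.0, 0.0
    for si in range(1, 21):            # s in {0.05, 0.10, ..., 1.00}
        s = si / 20
        for ti in range(-40, 41):      # t in {-20.0, -19.5, ..., 20.0}
            t = ti * 0.5
            mapped = [s * d + t for d in disparities]
            change = sum((m - d) ** 2
                         for m, d in zip(mapped, disparities)) / len(disparities)
            cost = toy_discomfort(mapped, comfort) + fidelity_weight * change
            if cost < best_cost:
                best_cost, best_s, best_t = cost, s, t
    return best_s, best_t


if __name__ == "__main__":
    frame = [-40.0, 0.0, 40.0]         # made-up disparities in pixels
    s, t = fit_linear_mapping(frame, comfort=(-10.0, 10.0))
    print("scale", s, "shift", t)
```

When every disparity already lies inside the comfort range, the search returns the identity mapping, which mirrors the stated goal of leaving the perceived 3D appearance unchanged where no discomfort is predicted. A real system would replace the linear family with a richer mapping and the grid search with a proper optimiser.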

Author information

Corresponding author

Correspondence to Miao Wang.

Additional information

This article is published with open access at Springerlink.com

Miao Wang is currently a Ph.D. candidate at Tsinghua University, Beijing, China. He received his B.S. degree from Xidian University in 2011. His research interests include computer graphics, image processing, and computer vision.

Xi-Jin Zhang is currently a Ph.D. candidate at Tsinghua University, Beijing, China. He received his B.S. degree from Xidian University in 2014. His research interests include image and video processing, computer vision, and machine learning.

Jun-Bang Liang is currently an undergraduate student at Tsinghua University, Beijing, China. His research interests include computer vision and computer graphics.

Song-Hai Zhang received his Ph.D. degree in 2007 from Tsinghua University. He is currently an associate professor in the Department of Computer Science and Technology of Tsinghua University, Beijing, China. His research interests include image and video processing and geometric computing.

Ralph R. Martin is currently a professor at Cardiff University. He obtained his Ph.D. degree in 1983 from Cambridge University. He has published about 300 papers and 14 books, covering such topics as solid and surface modeling, intelligent sketch input, geometric reasoning, reverse engineering, and various aspects of computer graphics. He is a Fellow of the Learned Society of Wales, the Institute of Mathematics and its Applications, and the British Computer Society. He is on the editorial boards of Computer-Aided Design, Computer Aided Geometric Design, Geometric Models, and Computers and Graphics.

Open Access The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

About this article

Cite this article

Wang, M., Zhang, X., Liang, J. et al. Comfort-driven disparity adjustment for stereoscopic video. Comp. Visual Media 2, 3–17 (2016). https://doi.org/10.1007/s41095-016-0037-5

Keywords

  • stereoscopic video editing
  • video enhancement
  • perceptual visual computing
  • video manipulation