Saliency and Attention for Video Quality Assessment

From Human Attention to Computational Attention

Part of the book series: Springer Series in Cognitive and Neural Systems ((SSCNS,volume 10))

Abstract

This chapter presents an overview of published research on applying visual attention and saliency models to the problem of image and video quality assessment. Determining the perceptual quality of multimedia content is crucial for achieving quality-of-experience-driven multimedia services. The problem has been gaining significance in the wake of the recent explosion of visual and multimedia applications. Attention and saliency models have the potential to significantly improve the performance of state-of-the-art quality assessment algorithms and are generating increased interest within the research community.

Notes

  1. This work was supported in part by the FP7 QoSTREAM (no. 295220) and COST SoftStat IC0702 projects.

Author information

Corresponding author

Correspondence to Dubravko Culibrk.

Copyright information

© 2016 Springer Science+Business Media New York

About this chapter

Cite this chapter

Culibrk, D. (2016). Saliency and Attention for Video Quality Assessment. In: Mancas, M., Ferrera, V., Riche, N., Taylor, J. (eds) From Human Attention to Computational Attention. Springer Series in Cognitive and Neural Systems, vol 10. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-3435-5_20
