Abstract
It is exciting to witness the rapid development of Unmanned Aerial Vehicle (UAV) imaging, which opens the door to many new applications. With a view to developing rich and efficient services, we ask which strategy should be adopted to predict saliency in UAV videos. To that end, we introduce a benchmark of off-the-shelf state-of-the-art saliency prediction models. This benchmark examines two challenging aspects of saliency prediction: the peculiar characteristics of UAV content and the temporal dimension of videos. The paper identifies the strengths and weaknesses of current static, dynamic, supervised and unsupervised models on drone videos. Finally, we highlight several strategies for the development of visual attention models for UAV videos.
The presented work is funded by the ongoing research project ANR ASTRID DISSOCIE (Automated Detection of SaliencieS from Operators’ Point of View and Intelligent Compression of DronE videos) referenced as ANR-17-ASTR-0009.
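Benchmarks of this kind typically score a model's predicted saliency map against human eye-tracking data using standard metrics such as the linear correlation coefficient (CC) and the Normalized Scanpath Saliency (NSS). As a minimal sketch of how such scoring works (the helper names are illustrative, not the paper's actual evaluation code):

```python
import numpy as np

def cc(pred, gt):
    """Pearson correlation between a predicted saliency map and a
    ground-truth (fixation-density) saliency map."""
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    g = (gt - gt.mean()) / (gt.std() + 1e-8)
    return float((p * g).mean())

def nss(pred, fixation_mask):
    """Normalized Scanpath Saliency: mean of the z-scored predicted
    saliency taken at the pixels human observers actually fixated."""
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    return float(p[fixation_mask.astype(bool)].mean())
```

A perfect prediction yields CC close to 1, while an NSS above 0 means fixated locations receive above-average predicted saliency; per-frame scores are usually averaged over each video.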
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Perrin, AF., Zhang, L., Le Meur, O. (2019). How Well Current Saliency Prediction Models Perform on UAVs Videos?. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science(), vol 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29887-6
Online ISBN: 978-3-030-29888-3