Highly Efficient Salient Object Detection with 100K Parameters

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12351)

Abstract

Salient object detection models often demand considerable computation to make a precise prediction for each pixel, making them hardly applicable to low-power devices. In this paper, we aim to relieve the contradiction between computation cost and model performance by improving network efficiency to a higher degree. We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stage multi-scale features, while reducing representation redundancy through a novel dynamic weight decay scheme. The dynamic weight decay scheme stably boosts parameter sparsity during training and supports a learnable number of channels for each scale in gOctConv, allowing 80% of the parameters to be removed with a negligible performance drop. Utilizing gOctConv, we build an extremely lightweight model, namely CSNet, which achieves performance comparable to that of large models with only \({\sim }0.2\%\) of their parameters (100k) on popular salient object detection benchmarks. The source code is publicly available at https://mmcheng.net/sod100k/.
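To make the gOctConv idea concrete, the following is a minimal illustrative sketch, not the authors' released implementation (see https://mmcheng.net/sod100k/ for that): every input scale is convolved, resampled, and summed into every output scale, and the per-scale channel counts are plain hyperparameters here, whereas in the paper they are made learnable via the dynamic weight decay pruning scheme. The class and argument names (GOctConvSketch, in_channels, out_channels) are hypothetical.

```python
# Illustrative sketch of a generalized OctConv-style multi-scale convolution.
# Assumption: input scale i has spatial resolution halved i times relative to scale 0.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GOctConvSketch(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3):
        # in_channels[i]: channels of the i-th input scale (highest resolution first)
        # out_channels[j]: channels of the j-th output scale
        super().__init__()
        self.convs = nn.ModuleList([
            nn.ModuleList([
                nn.Conv2d(cin, cout, kernel_size,
                          padding=kernel_size // 2, bias=False)
                for cout in out_channels
            ])
            for cin in in_channels
        ])

    def forward(self, xs):
        # xs: list of feature maps, one per input scale
        outs = []
        n_out = len(self.convs[0])
        h, w = xs[0].shape[-2:]
        for j in range(n_out):
            target = (h >> j, w >> j)  # output scale j is j times downsampled
            acc = 0
            for i, x in enumerate(xs):
                y = self.convs[i][j](x)
                if y.shape[-2:] != target:
                    # resample each input scale to the output scale before fusing
                    y = F.interpolate(y, size=target, mode='bilinear',
                                      align_corners=False)
                acc = acc + y
            outs.append(acc)
        return outs


if __name__ == "__main__":
    m = GOctConvSketch(in_channels=[16, 8], out_channels=[12, 6])
    feats = [torch.randn(1, 16, 64, 64), torch.randn(1, 8, 32, 32)]
    for o in m(feats):
        print(o.shape)  # (1, 12, 64, 64) and (1, 6, 32, 32)
```

In the paper, the channel count of each scale branch is not fixed as above but is effectively learned: dynamic weight decay drives redundant channels toward zero during training so they can be pruned, which is how the reported 80% parameter reduction is obtained.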

Keywords

Salient object detection · Highly efficient

Notes

Acknowledgements

Ming-Ming Cheng is the corresponding author. This research was supported by Major Project for New Generation of AI under Grant No. 2018AAA0100400, NSFC (61922046), Tianjin Natural Science Foundation (18ZXZNGX00110), and the Fundamental Research Funds for the Central Universities, Nankai University (63201169).

Supplementary material

Supplementary material 1 (PDF, 216 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. College of Computer Science, Nankai University, Tianjin, China
  2. Yitu Technology, Shanghai, China