Multi-scale energy optimization for object proposal generation

Abstract

In this paper, we present an object proposal generation method that applies energy optimization to superpixel merging in a multi-scale framework in order to generate candidate object locations in an image. Because images in object detection datasets are highly diverse, we adopt two different energy functions at multiple scales. Our method thus combines the strength of global search, which is effective at locating salient objects because it considers the whole image at each merge iteration, with the strength of local search, which is more likely to recall non-salient instances. Moreover, unlike most superpixel merging algorithms, which rely on diversified segmentation results, our approach takes advantage of robust edge detection and segments each image only once, which greatly reduces the number of proposals. Experiments on the PASCAL VOC 2007 test set show that the proposed method outperforms most previous superpixel-merging-based methods and is competitive with state-of-the-art proposal generators.

Notes

  1. Intersection-over-Union (IoU) measures the overlap between a candidate box and the ground-truth box as the ratio of the area of their intersection to the area of their union.

  2. In practice we set 𝜖_e = 0.05.

  3. In practice we set 𝜖_s = 0.1.

  4. Here we use the Fast version of [41], which performs better than its Quality version while using fewer proposals.
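The Intersection-over-Union measure in note 1 can be illustrated with a minimal sketch. The function name `iou`, the `(x_min, y_min, x_max, y_max)` box layout, and the use of continuous coordinates (no +1 pixel offset) are assumptions made for this example and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max) in continuous coordinates;
    this layout is an assumption of the sketch, not the paper's convention.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes yield an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the doubly counted overlap.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    candidate = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(iou(candidate, ground_truth))  # overlap 1, union 7
```

A candidate box is typically counted as recalling a ground-truth object when this value exceeds a chosen threshold; the threshold itself is an evaluation setting and is not fixed by the definition.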

References

  1. Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Susstrunk S (2012) SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intell 34(11):2274–2282

  2. Alexe B, Deselaers T, Ferrari V (2010) What is an object? IEEE Conference on Computer Vision and Pattern Recognition (CVPR):73–80

  3. Alexe B, Deselaers T, Ferrari V (2012) Measuring the objectness of image windows. IEEE Trans Pattern Anal Mach Intell 34(11):2189–2202

  4. Arbelaez P, Pont-Tuset J, Barron J, Marques F, Malik J (2014) Multiscale combinatorial grouping. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):328–335

  5. Branson S, Beijbom O, Belongie S (2013) Efficient large-scale structured learning. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):1806–1813

  6. Bruce N, Tsotsos J (2006) Saliency based on information maximization. Advances in Neural Information Processing Systems (NIPS):155–162

  7. Carreira J, Sminchisescu C (2012) CPMC: Automatic object segmentation using constrained parametric min-cuts. IEEE Trans Pattern Anal Mach Intell 34(7):1312–1328

  8. Cheng MM, Mitra NJ, Huang X, Torr PH, Hu SM (2015) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582

  9. Cheng MM, Zhang Z, Lin WY, Torr PHS (2014) BING: Binarized normed gradients for objectness estimation at 300fps. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3286–3293

  10. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):886–893

  11. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L ImageNet large scale visual recognition competition 2012 (ILSVRC2012). http://www.image-net.org/challenges/LSVRC/2012/

  12. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):248–255

  13. Dollár P, Zitnick CL (2013) Structured forests for fast edge detection. IEEE International Conference on Computer Vision (ICCV):1841–1848

  14. Endres I, Hoiem D (2010) Category independent object proposals. pp 575–588

  15. Endres I, Hoiem D (2014) Category-independent object proposals with diverse ranking. IEEE Trans Pattern Anal Mach Intell 36(2):222–234

  16. Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338

  17. Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645

  18. Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59(2):167–181

  19. Fidler S, Mottaghi R, Yuille A, Urtasun R (2013) Bottom-up segmentation for top-down detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3294–3301

  20. Girshick R, Donahue J, Darrell T, Malik J (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158

  21. Gonzalez-Garcia A, Vezhnevets A, Ferrari V (2015) An active search strategy for efficient object class detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3022–3031

  22. Han J, He S, Qian X, Wang D, Guo L, Liu T (2013) An object-oriented visual saliency detection framework based on sparse coding representations. IEEE Trans Circ Syst Video Technol 23(12):2009–2021

  23. Han J, Zhang D, Hu X, Guo L, Ren J, Wu F (2015) Background prior-based salient object detection via deep reconstruction residual. IEEE Trans Circ Syst Video Technol 25(8):1309–1321

  24. Han J, Zhang D, Wen S, Guo L, Liu T, Li X (2016) Two-stage learning to predict human eye fixations via SDAEs. IEEE Trans Cybern 46(2):487–498

  25. Hare S, Golodetz S, Saffari A, Vineet V, Cheng MM, Hicks S, Torr P (2016) Struck: Structured output tracking with kernels. IEEE Trans Pattern Anal Mach Intell

  26. Hariharan B, Arbeláez P, Girshick R, Malik J (2014) Simultaneous detection and segmentation. pp 297–312

  27. Hariharan B, Malik J, Ramanan D (2012) Discriminative decorrelation for clustering and classification. European Conference on Computer Vision (ECCV):459–472

  28. Hosang J, Benenson R, Dollár P, Schiele B (2016) What makes for effective detection proposals? IEEE Trans Pattern Anal Mach Intell 38(4):814–830

  29. Hosang J, Benenson R, Schiele B (2014) How good are detection proposals, really? British Machine Vision Conference (BMVC)

  30. Humayun A, Li F, Rehg JM (2014) RIGOR: Reusing inference in graph cuts for generating object regions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):336–343

  31. Itti L, Koch C, Niebur E, et al. (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259

  32. Karianakis N, Fuchs TJ, Soatto S (2015) Boosting convolutional features for robust object proposals. arXiv preprint arXiv:1503.06350

  33. Krähenbühl P, Koltun V (2014) Geodesic object proposals. pp 725–739

  34. Li N, Ye J, Ji Y, Ling H, Yu J (2014) Saliency detection on light field. pp 2806–2813

  35. Li X, Lu H, Zhang L, Ruan X, Yang MH (2013) Saliency detection via dense and sparse reconstruction. pp 2976–2983

  36. Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: Common objects in context. pp 740–755

  37. Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. IEEE International Conference on Computer Vision (ICCV):89–96

  38. Manen S, Guillaumin M, Gool LV (2013) Prime object proposals with randomized Prim’s algorithm. IEEE International Conference on Computer Vision (ICCV):2536–2543

  39. Rantalankila P, Kannala J, Rahtu E (2014) Generating object segmentation proposals using global and local search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):2417–2424

  40. Van de Sande KE, Uijlings JR, Gevers T, Smeulders AW (2011) Segmentation as selective search for object recognition. IEEE International Conference on Computer Vision (ICCV):1879–1886

  41. Uijlings JR, Van de Sande KE, Gevers T, Smeulders AW (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171

  42. Valenti R, Sebe N, Gevers T (2009) Image saliency by isocentric curvedness and color. IEEE International Conference on Computer Vision (ICCV):2185–2192

  43. Wang L, Lu H, Ruan X, Yang MH (2015) Deep networks for saliency detection via local estimation and global search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3183–3192

  44. Wei Y, Wen F, Zhu W, Sun J (2012) Geodesic saliency using background priors. pp 29–42

  45. Yang C, Zhang L, Lu H, Ruan X, Yang MH (2013) Saliency detection via graph-based manifold ranking. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3166–3173

  46. Zhang Z, Warrell J, Torr PH (2011) Proposal generation for object detection using cascaded ranking SVMs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):1497–1504

  47. Zhu W, Liang S, Wei Y, Sun J (2014) Saliency optimization from robust background detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):2814–2821

  48. Zitnick CL, Dollár P (2014) Edge boxes: Locating object proposals from edges. pp 391–405

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61301238, 61201424), the China Scholarship Council (No. 201506205024), and the Natural Science Foundation of Tianjin, China (No. 14ZCDZGX00831).

Author information

Correspondence to Jufeng Yang.

About this article

Cite this article

Wang, C., Yang, J., Wang, K. et al. Multi-scale energy optimization for object proposal generation. Multimed Tools Appl 76, 10481–10499 (2017). https://doi.org/10.1007/s11042-016-3616-7

Keywords

  • Object proposal
  • Multi scales
  • Saliency
  • Superpixel merging