Superpixel Sampling Networks

  • Varun Jampani
  • Deqing Sun
  • Ming-Yu Liu
  • Ming-Hsuan Yang
  • Jan Kautz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11211)

Abstract

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks. Existing superpixel algorithms are not differentiable, making them difficult to integrate into otherwise end-to-end trainable deep neural networks. We develop a new differentiable model for superpixel sampling that leverages deep networks for learning superpixel segmentation. The resulting Superpixel Sampling Network (SSN) is end-to-end trainable, which allows learning task-specific superpixels with flexible loss functions, and it runs fast. Extensive experimental analysis indicates that SSNs not only outperform existing superpixel algorithms on traditional segmentation benchmarks, but can also learn superpixels for other tasks. In addition, SSNs can be easily integrated into downstream deep networks, resulting in performance improvements.
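
The abstract does not spell out how the sampling step is made differentiable. As a rough illustration only, the sketch below assumes one common relaxation: the hard nearest-center assignment of k-means-style clustering is replaced by a softmax over negative squared distances, so every step is a smooth function of the pixel features. The function names, the `temperature` parameter, and the use of plain NumPy are all assumptions made for this example; this is not the authors' implementation.

```python
import numpy as np

def soft_assign(features, centers, temperature=1.0):
    """Soft association of each point to each center (hypothetical helper).

    features: (N, D) array, centers: (K, D) array.
    Returns an (N, K) array whose rows sum to 1.
    """
    # Squared Euclidean distance between every point and every center.
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def update_centers(features, assoc, eps=1e-8):
    """Each center becomes the association-weighted mean of the features."""
    return (assoc.T @ features) / (assoc.sum(axis=0)[:, None] + eps)

def soft_cluster(features, centers, n_iters=5, temperature=1.0):
    """Alternate soft assignment and center update for a fixed number of steps."""
    for _ in range(n_iters):
        assoc = soft_assign(features, centers, temperature)
        centers = update_centers(features, assoc)
    # Final associations with respect to the last set of centers.
    return soft_assign(features, centers, temperature), centers

# Toy usage: four 2-D "pixel features" forming two well-separated groups.
features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
init_centers = np.array([[1.0, 1.0], [4.0, 4.0]])
assoc, centers = soft_cluster(features, init_centers)
hard_labels = assoc.argmax(axis=1)  # only needed when a hard segmentation is required
```

Because the associations are smooth in `features`, the same computation written in an automatic-differentiation framework would let a loss on the associations send gradients back to a network that produces the features. The `argmax` at the end is not differentiable and would only be applied at inference time in such a setup.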

Keywords

Superpixels, Deep learning, Clustering

Notes

Acknowledgments

We thank Wei-Chih Tu for providing evaluation scripts. We thank Ben Eckart for his help in the supplementary video.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Varun Jampani (1)
  • Deqing Sun (1)
  • Ming-Yu Liu (1)
  • Ming-Hsuan Yang (1, 2)
  • Jan Kautz (1)

  1. NVIDIA, Westford, USA
  2. UC Merced, Merced, USA