Abstract

Dense descriptors are becoming increasingly popular in a host of tasks, such as dense image correspondence, bag-of-words image classification, and label transfer. However, the extraction of descriptors on generic image points, rather than selecting geometric features, requires rethinking how to achieve invariance to nuisance parameters. In this work we pursue invariance to occlusions and background changes by introducing segmentation information within dense feature construction. The core idea is to use the segmentation cues to downplay the features coming from image areas that are unlikely to belong to the same region as the feature point. We show how to integrate this idea with dense SIFT, as well as with the dense scale- and rotation-invariant descriptor (SID). We thereby deliver dense descriptors that are invariant to background changes, rotation, and/or scaling. We explore the merit of our technique in conjunction with large displacement motion estimation and wide-baseline stereo, and demonstrate that exploiting segmentation information yields clear improvements.
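
The core idea can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the chapter's implementation: the per-pixel `embedding` of segmentation cues, the exponential affinity with decay parameter `lam`, the cell layout given by `offsets`, and both function names are hypothetical choices made for the example. Each descriptor cell is weighted by how similar its segmentation cue is to that of the feature point, so that cells likely to lie on a different region contribute less.

```python
import numpy as np

def soft_mask_weights(embedding, center, offsets, lam=1.0):
    """Weight each descriptor cell by its affinity to the feature point.

    embedding: (H, W, D) per-pixel segmentation cues (assumed input).
    center:    (row, col) of the feature point.
    offsets:   (K, 2) integer offsets of the K descriptor cells.
    Returns K weights in (0, 1]; a weight of 1 means the cell has the
    same segmentation cue as the feature point.
    """
    h, w, _ = embedding.shape
    r0, c0 = center
    # Clamp sampling locations so cells near the border stay inside the image.
    rows = np.clip(r0 + offsets[:, 0], 0, h - 1)
    cols = np.clip(c0 + offsets[:, 1], 0, w - 1)
    diff = embedding[rows, cols] - embedding[r0, c0]
    # Assumed affinity: decays with squared distance in embedding space.
    return np.exp(-lam * np.sum(diff ** 2, axis=1))

def mask_descriptor(cells, weights):
    """Scale each cell's histogram (K x B) by its weight, then L2-normalise."""
    masked = cells * weights[:, None]
    norm = np.linalg.norm(masked)
    return masked / norm if norm > 0 else masked

# Toy example: two flat regions separated by a vertical boundary.
embedding = np.zeros((5, 6, 1))
embedding[:, 3:, 0] = 1.0
offsets = np.array([[0, 0], [0, 1], [0, 3]])      # last cell crosses the boundary
weights = soft_mask_weights(embedding, (2, 1), offsets, lam=5.0)
descriptor = mask_descriptor(np.ones((3, 4)), weights)
```

In this toy example the two cells on the feature point's side of the boundary keep a weight of 1, while the cell across the boundary is scaled by `exp(-lam)`, so its histogram is strongly attenuated in the final descriptor.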

Notes

  1. We were unaware of this work when first publishing [43].

  2. https://github.com/etrulls/softseg-descriptors-release.

References

  1. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)

  2. Berg, A.C., Malik, J.: Geometric blur for template matching. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, vol. 1. IEEE, New York (2001)

  3. Borgefors, G.: Distance transformations in digital images. Comput. Vis. Graphics Image Process. 34(3), 344–371 (1986)

  4. Bovik, A.C., Clark, M., Geisler, W.S.: Multichannel texture analysis using localized spatial filters. Trans. Pattern Anal. Mach. Intell. 12(1), 55–73 (1990)

  5. Brox, T., Malik, J.: Berkeley motion segmentation dataset. http://lmb.informatik.uni-freiburg.de/resources/datasets/moseg.en.html (2010)

  6. Casasent, D., Psaltis, D.: Position, rotation, and scale invariant optical correlation. Appl. Opt. 15(7), 1795–1799 (1976)

  7. Deriche, R.: Using Canny’s criteria to derive a recursively implemented optimal edge detector. Int. J. Comput. Vis. 1(2), 167–187 (1987)

  8. Dice, L.R.: Measures of the amount of ecologic association between species. Ecology 26(3), 297–302 (1945)

  9. Dollár, P., Zitnick, C.L.: Structured forests for fast edge detection. In: Proceedings of the International Conference on Computer Vision, pp. 1841–1848. IEEE, New York (2013)

  10. Freeman, W.T., Adelson, E.H.: The design and use of steerable filters. Trans. Pattern Anal. Mach. Intell. 13(9), 891–906 (1991)

  11. Fulkerson, B., Vedaldi, A., Soatto, S.: Localizing objects with smart dictionaries. In: European Conference on Computer Vision, pp. 179–192. Springer, New York (2008)

  12. Hassner, T., Mayzels, V., Zelnik-Manor, L.: On SIFTs and their scales. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1522–1528. IEEE, New York (2012)

  13. Kokkinos, I., Yuille, A.: Scale invariance without scale selection. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2008)

  14. Kokkinos, I., Bronstein, M., Yuille, A.: Dense scale invariant descriptors for images and surfaces (2012). INRIA Research Report 7914

  15. Kokkinos, I., Bronstein, M.M., Litman, R., Bronstein, A.M.: Intrinsic shape context descriptors for deformable shapes. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 159–166. IEEE, New York (2012)

  16. Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)

  17. Leordeanu, M., Sukthankar, R., Sminchisescu, C.: Efficient closed-form solution to generalized boundary detection. In: European Conference on Computer Vision, pp. 516–529. Springer, New York (2012)

  18. Liu, C., Yuen, J., Torralba, A.: SIFT flow: Dense correspondence across scenes and its applications. Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)

  19. Liu, K., Skibbe, H., Schmidt, T., Blein, T., Palme, K., Brox, T., Ronneberger, O.: Rotation-invariant HOG descriptors using Fourier analysis in polar and spherical coordinates. Int. J. Comput. Vis. 106(3), 342–364 (2014)

  20. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  21. Maire, M., Arbeláez, P., Fowlkes, C., Malik, J.: Using contours to detect and localize junctions in natural images. In: Proceedings of the Conference on Computer Vision and Pattern Recognition. IEEE, New York (2008)

  22. Maire, M., Yu, S.X., Perona, P.: Object detection and segmentation from joint embedding of parts and pixels. In: Proceedings of the International Conference on Computer Vision, pp. 2142–2149. IEEE, New York (2011)

  23. Mallat, S.: Zero-crossings of a wavelet transform. Trans. Inf. Theory 37(4), 1019–1033 (1991)

  24. Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L.: A comparison of affine region detectors. Int. J. Comput. Vis. 65(1–2), 43–72 (2005)

  25. Nowak, E., Jurie, F., Triggs, B.: Sampling strategies for bag-of-features image classification. In: European Conference on Computer Vision, pp. 490–503. Springer, New York (2006)

  26. Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. Trans. Pattern Anal. Mach. Intell. 12(7), 629–639 (1990)

  27. Porat, M., Zeevi, Y.Y.: The generalized Gabor scheme of image representation in biological and machine vision. Trans. Pattern Anal. Mach. Intell. 10(4), 452–468 (1988)

  28. Ren, X., Malik, J.: Learning a classification model for segmentation. In: Proceedings of the International Conference on Computer Vision, pp. 10–17. IEEE, New York (2003)

  29. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D: Nonlinear Phenom. 60(1), 259–268 (1992)

  30. Schmid, C., Mohr, R.: Local grayvalue invariants for image retrieval. Trans. Pattern Anal. Mach. Intell. 19(5), 530–534 (1997)

  31. Schmidt, U., Roth, S.: Learning rotation-aware features: from invariant priors to equivariant descriptors. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 2050–2057. IEEE, New York (2012)

  32. Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2007)

  33. Shi, J., Malik, J.: Normalized cuts and image segmentation. Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)

  34. Simonyan, K., Vedaldi, A., Zisserman, A.: Descriptor learning using convex optimisation. In: European Conference on Computer Vision, pp. 243–256. Springer, New York (2012)

  35. Simonyan, K., Vedaldi, A., Zisserman, A.: Learning local feature descriptors using convex optimisation. Trans. Pattern Anal. Mach. Intell. 12, 25–70 (2014)

  36. Stein, A., Hebert, M.: Incorporating background invariance into feature-based object recognition. In: 7th IEEE Workshop on Applications of Computer Vision (WACV), vol. 1, pp. 37–44. IEEE, New York (2005)

  37. Strecha, C., Tuytelaars, T., Van Gool, L.: Dense matching of multiple wide-baseline views. In: Proceedings of the International Conference on Computer Vision, pp. 1194–1201. IEEE, New York (2003)

  38. Strecha, C., von Hansen, W., Van Gool, L., Fua, P., Thoennessen, U.: On benchmarking camera calibration and multi-view stereo for high resolution imagery. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2008)

  39. Strecha, C., Bronstein, A.M., Bronstein, M.M., Fua, P.: LDA-hash: improved matching with smaller descriptors. Trans. Pattern Anal. Mach. Intell. 34(1), 66–78 (2012)

  40. Tola, E., Lepetit, V., Fua, P.: Daisy: An efficient dense descriptor applied to wide-baseline stereo. Trans. Pattern Anal. Mach. Intell. 32(5), 815–830 (2010)

  41. Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Proceedings of the International Conference on Computer Vision, pp. 839–846. IEEE, New York (1998)

  42. Tron, R., Vidal, R.: Hopkins 155 dataset. http://www.vision.jhu.edu/data.htm (2007)

  43. Trulls, E., Kokkinos, I., Sanfeliu, A., Moreno-Noguer, F.: Dense segmentation-aware descriptors. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 2890–2897. IEEE, New York (2013)

  44. Trulls, E., Tsogkas, S., Kokkinos, I., Sanfeliu, A., Moreno-Noguer, F.: Segmentation-aware deformable part models. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 168–175. IEEE, New York (2014)

  45. Vedaldi, A., Fulkerson, B.: An Open and Portable Library of Computer Vision Algorithms, http://www.vlfeat.org (2008)

  46. Winder, S., Hua, G., Brown, M.: Picking the best daisy. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 178–185. IEEE, New York (2009)

  47. Wolberg, G., Zokai, S.: Robust image registration using log-polar transform. In: International Conference on Pattern Recognition, vol. 1, pp. 493–496. IEEE, New York (2000)

  48. Yao, J., Cham, W.K.: 3D modeling and rendering from multiple wide-baseline images by match propagation. Signal Process. Image Commun. 21(6), 506–518 (2006)

Acknowledgements

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness under project RobInstruct TIN2014-58178-R and ERA-Net Chistera project ViSen PCIN-2013-047; by EU projects AEROARMS H2020-ICT-2014-1-644271, MOBOT FP7-ICT-2011-600796, and RECONFIG FP7-ICT-600825; and by grant ANR-10-JCJC-0205 (HiCoRe).

Author information

Corresponding author

Correspondence to Iasonas Kokkinos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Trulls, E., Kokkinos, I., Sanfeliu, A., Moreno-Noguer, F. (2016). Dense Segmentation-Aware Descriptors. In: Hassner, T., Liu, C. (eds) Dense Image Correspondences for Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-319-23048-1_5

  • DOI: https://doi.org/10.1007/978-3-319-23048-1_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23047-4

  • Online ISBN: 978-3-319-23048-1

  • eBook Packages: Engineering, Engineering (R0)
