Dense Segmentation-Aware Descriptors

Trulls, Eduard; Kokkinos, Iasonas; Sanfeliu, Alberto; Moreno-Noguer, Francesc

doi:10.1007/978-3-319-23048-1_5

Abstract

Dense descriptors are becoming increasingly popular in a host of tasks, such as dense image correspondence, bag-of-words image classification, and label transfer. However, the extraction of descriptors on generic image points, rather than selecting geometric features, requires rethinking how to achieve invariance to nuisance parameters. In this work we pursue invariance to occlusions and background changes by introducing segmentation information within dense feature construction. The core idea is to use the segmentation cues to downplay the features coming from image areas that are unlikely to belong to the same region as the feature point. We show how to integrate this idea with dense SIFT, as well as with the dense scale- and rotation-invariant descriptor (SID). We thereby deliver dense descriptors that are invariant to background changes, rotation, and/or scaling. We explore the merit of our technique in conjunction with large displacement motion estimation and wide-baseline stereo, and demonstrate that exploiting segmentation information yields clear improvements.
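
The core idea above can be illustrated with a minimal sketch. It is not the authors' implementation (their released code is linked in the Notes). It assumes a hypothetical per-pixel soft segmentation embedding, and the function names `segmentation_weights` and `mask_descriptor`, the exponential weighting, and the sharpness parameter `lam` are illustrative assumptions. Each descriptor cell is weighted by how close its embedding is to that of the feature point, so cells that probably lie in a different region contribute less.

```python
import numpy as np


def segmentation_weights(embedding, center, samples, lam=1.0):
    """Weight each descriptor cell by its affinity to the feature point.

    embedding: (H, W, K) array, an assumed soft segmentation embedding
               in which pixels of the same region have similar vectors.
    center:    (row, col) of the feature point.
    samples:   (N, 2) integer array of (row, col) cell positions.
    lam:       assumed sharpness parameter; larger values suppress
               dissimilar cells more strongly.
    Returns an (N,) array of weights in (0, 1].
    """
    center_vec = embedding[center[0], center[1]]
    sample_vecs = embedding[samples[:, 0], samples[:, 1]]
    dist = np.linalg.norm(sample_vecs - center_vec, axis=1)
    return np.exp(-lam * dist)


def mask_descriptor(cell_features, weights):
    """Scale each cell's feature vector (N, D) by its weight (N,)."""
    return cell_features * weights[:, None]


# Toy example: the left and right halves of the image are two regions.
emb = np.zeros((4, 8, 1))
emb[:, 4:, 0] = 1.0

center = (2, 1)                                        # point in the left region
samples = np.array([[2, 0], [2, 2], [2, 5], [2, 7]])   # two cells per region
w = segmentation_weights(emb, center, samples, lam=5.0)

feats = np.ones((4, 3))                                # placeholder cell features
masked = mask_descriptor(feats, w)
```

In this toy case the two cells in the same region as the feature point keep their full weight, while the two cells across the region boundary are scaled close to zero. The same weighting could in principle be applied to the cells of any grid-based descriptor.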

Notes

1. We were unaware of this work when first publishing [43].
2. https://github.com/etrulls/softseg-descriptors-release

References

Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
Berg, A.C., Malik, J.: Geometric blur for template matching. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, vol. 1. IEEE, New York (2001)
Borgefors, G.: Distance transformations in digital images. Comput. Vis. Graphics Image Process. 34(3), 344–371 (1986)
Bovik, A.C., Clark, M., Geisler, W.S.: Multichannel texture analysis using localized spatial filters. Trans. Pattern Anal. Mach. Intell. 12(1), 55–73 (1990)
Brox, T., Malik, J.: Berkeley motion segmentation dataset. http://lmb.informatik.uni-freiburg.de/resources/datasets/moseg.en.html (2010)
Casasent, D., Psaltis, D.: Position, rotation, and scale invariant optical correlation. Appl. Opt. 15(7), 1795–1799 (1976)
Deriche, R.: Using Canny’s criteria to derive a recursively implemented optimal edge detector. Int. J. Comput. Vis. 1(2), 167–187 (1987)
Dice, L.R.: Measures of the amount of ecologic association between species. Ecology 26(3), 297–302 (1945)
Dollár, P., Zitnick, C.L.: Structured forests for fast edge detection. In: Proceedings of the International Conference on Computer Vision, pp. 1841–1848. IEEE, New York (2013)
Freeman, W.T., Adelson, E.H.: The design and use of steerable filters. Trans. Pattern Anal. Mach. Intell. 13(9), 891–906 (1991)
Fulkerson, B., Vedaldi, A., Soatto, S.: Localizing objects with smart dictionaries. In: European Conference on Computer Vision, pp. 179–192. Springer, New York (2008)
Hassner, T., Mayzels, V., Zelnik-Manor, L.: On SIFTs and their scales. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1522–1528. IEEE, New York (2012)
Kokkinos, I., Yuille, A.: Scale invariance without scale selection. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2008)
Kokkinos, I., Bronstein, M., Yuille, A.: Dense scale invariant descriptors for images and surfaces (2012). INRIA Research Report 7914
Kokkinos, I., Bronstein, M.M., Litman, R., Bronstein, A.M.: Intrinsic shape context descriptors for deformable shapes. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 159–166. IEEE, New York (2012)
Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)
Leordeanu, M., Sukthankar, R., Sminchisescu, C.: Efficient closed-form solution to generalized boundary detection. In: European Conference on Computer Vision, pp. 516–529. Springer, New York (2012)
Liu, C., Yuen, J., Torralba, A.: SIFT flow: Dense correspondence across scenes and its applications. Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2011)
Liu, K., Skibbe, H., Schmidt, T., Blein, T., Palme, K., Brox, T., Ronneberger, O.: Rotation-invariant HOG descriptors using Fourier analysis in polar and spherical coordinates. Int. J. Comput. Vis. 106(3), 342–364 (2014)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Maire, M., Arbeláez, P., Fowlkes, C., Malik, J.: Using contours to detect and localize junctions in natural images. In: Proceedings of the Conference on Computer Vision and Pattern Recognition. IEEE, New York (2008)
Maire, M., Yu, S.X., Perona, P.: Object detection and segmentation from joint embedding of parts and pixels. In: Proceedings of the International Conference on Computer Vision, pp. 2142–2149. IEEE, New York (2011)
Mallat, S.: Zero-crossings of a wavelet transform. Trans. Inf. Theory 37(4), 1019–1033 (1991)
Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L.: A comparison of affine region detectors. Int. J. Comput. Vis. 65(1–2), 43–72 (2005)
Nowak, E., Jurie, F., Triggs, B.: Sampling strategies for bag-of-features image classification. In: European Conference on Computer Vision, pp. 490–503. Springer, New York (2006)
Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. Trans. Pattern Anal. Mach. Intell. 12(7), 629–639 (1990)
Porat, M., Zeevi, Y.Y.: The generalized Gabor scheme of image representation in biological and machine vision. Trans. Pattern Anal. Mach. Intell. 10(4), 452–468 (1988)
Ren, X., Malik, J.: Learning a classification model for segmentation. In: Proceedings of the International Conference on Computer Vision, pp. 10–17. IEEE, New York (2003)
Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D: Nonlinear Phenom. 60(1), 259–268 (1992)
Schmid, C., Mohr, R.: Local grayvalue invariants for image retrieval. Trans. Pattern Anal. Mach. Intell. 19(5), 530–534 (1997)
Schmidt, U., Roth, S.: Learning rotation-aware features: from invariant priors to equivariant descriptors. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 2050–2057. IEEE, New York (2012)
Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2007)
Shi, J., Malik, J.: Normalized cuts and image segmentation. Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
Simonyan, K., Vedaldi, A., Zisserman, A.: Descriptor learning using convex optimisation. In: European Conference on Computer Vision, pp. 243–256. Springer, New York (2012)
Simonyan, K., Vedaldi, A., Zisserman, A.: Learning local feature descriptors using convex optimisation. Trans. Pattern Anal. Mach. Intell. 12, 25–70 (2014)
Stein, A., Hebert, M.: Incorporating background invariance into feature-based object recognition. In: Application of Computer Vision, vol. 1, pp. 37–44. IEEE, 7th IEEE Workshop on Applications of Computer Vision (WACV), New York (2005)
Strecha, C., Tuytelaars, T., Van Gool, L.: Dense matching of multiple wide-baseline views. In: Proceedings of the International Conference on Computer Vision, pp. 1194–1201. IEEE, New York (2003)
Strecha, C., von Hansen, W., Van Gool, L., Fua, P., Thoennessen, U.: On benchmarking camera calibration and multi-view stereo for high resolution imagery. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE, New York (2008)
Strecha, C., Bronstein, A.M., Bronstein, M.M., Fua, P.: LDA-hash: improved matching with smaller descriptors. Trans. Pattern Anal. Mach. Intell. 34(1), 66–78 (2012)
Tola, E., Lepetit, V., Fua, P.: Daisy: An efficient dense descriptor applied to wide-baseline stereo. Trans. Pattern Anal. Mach. Intell. 32(5), 815–830 (2010)
Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Proceedings of the International Conference on Computer Vision, pp. 839–846. IEEE, New York (1998)
Tron, R., Vidal, R.: Hopkins 155 dataset. http://www.vision.jhu.edu/data.htm (2007)
Trulls, E., Kokkinos, I., Sanfeliu, A., Moreno-Noguer, F.: Dense segmentation-aware descriptors. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 2890–2897. IEEE, New York (2013)
Trulls, E., Tsogkas, S., Kokkinos, I., Sanfeliu, A., Moreno-Noguer, F.: Segmentation-aware deformable part models. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 168–175. IEEE, New York (2014)
Vedaldi, A., Fulkerson, B.: An Open and Portable Library of Computer Vision Algorithms, http://www.vlfeat.org (2008)
Winder, S., Hua, G., Brown, M.: Picking the best daisy. In: Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 178–185. IEEE, New York (2009)
Wolberg, G., Zokai, S.: Robust image registration using log-polar transform. In: International Conference on Pattern Recognition, vol. 1, pp. 493–496. IEEE, New York (2000)
Yao, J., Cham, W.K.: 3D modeling and rendering from multiple wide-baseline images by match propagation. Signal Process. Image Commun. 21(6), 506–518 (2006)

Acknowledgements

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness under project RobInstruct TIN2014-58178-R and ERA-Net Chistera project ViSen PCIN-2013-047; by EU projects AEROARMS H2020-ICT-2014-1-644271, MOBOT FP7-ICT-2011-600796, and RECONFIG FP7-ICT-600825; and by grant ANR-10-JCJC-0205 (HiCoRe).

Author information

Authors and Affiliations

Institut de Robòtica i Informàtica Industrial (UPC/CSIC), C/ Llorens i Artigas 4-6, 08028, Barcelona, Spain
Eduard Trulls, Alberto Sanfeliu & Francesc Moreno-Noguer
Ecole Centrale Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Iasonas Kokkinos

Corresponding author

Correspondence to Iasonas Kokkinos.

Editor information

Editors and Affiliations

The Open University of Israel, Raanana, Israel
Tal Hassner
Google Research, Cambridge, Massachusetts, USA
Ce Liu

About this chapter

Cite this chapter

Trulls, E., Kokkinos, I., Sanfeliu, A., Moreno-Noguer, F. (2016). Dense Segmentation-Aware Descriptors. In: Hassner, T., Liu, C. (eds) Dense Image Correspondences for Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-319-23048-1_5

DOI: https://doi.org/10.1007/978-3-319-23048-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23047-4
Online ISBN: 978-3-319-23048-1
eBook Packages: Engineering, Engineering (R0)
