Abstract
A metric for natural image patches is an important tool for analyzing images. An efficient way to learn one is to train a deep network to map image patches to a vector space in which Euclidean distance reflects patch similarity. Previous attempts learned such an embedding in a supervised manner, requiring many annotated images. In this paper, we present an unsupervised embedding of natural image patches that avoids the need for annotated images. The key idea is that the similarity of two patches can be learned from the prevalence of their spatial proximity in natural images. Of course, under this simple principle many spatially nearby pairs are outliers; however, as we show, these outliers do not harm the convergence of the metric learning. We show that our unsupervised embedding approach is more effective than a supervised one, or one that uses deep patch representations. Moreover, it lends itself naturally to an efficient self-supervised domain adaptation technique for a target domain containing a common foreground object.
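The core idea above — treating spatially adjacent patches as positive pairs and distant patches as negatives — can be sketched with a standard triplet hinge loss. The sketch below is our own minimal illustration, not the paper's implementation: the embedding network is omitted (the loss is computed on raw patch pixels as a stand-in for embedded vectors), and the function names (`make_triplet`, `triplet_loss`) are hypothetical.

```python
import numpy as np

def sample_patch(img, y, x, s):
    """Extract an s x s patch with top-left corner (y, x)."""
    return img[y:y + s, x:x + s]

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss pulling the anchor toward the positive and pushing it
    away from the negative; here applied to raw pixels for illustration,
    in place of the learned embedding vectors."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def make_triplet(img, s=8, rng=None):
    """Anchor and positive are spatially adjacent patches; the negative
    is drawn from a far-away location in the same image. Some such
    triplets are outliers, which the abstract argues is tolerable."""
    rng = rng if rng is not None else np.random.default_rng(0)
    h, w = img.shape[:2]
    y = int(rng.integers(0, h - 2 * s))
    x = int(rng.integers(0, w - 2 * s))
    anchor = sample_patch(img, y, x, s)
    positive = sample_patch(img, y, x + s, s)  # horizontal neighbour
    ny = (y + h // 2) % (h - s)                # distant location
    nx = (x + w // 2) % (w - s)
    negative = sample_patch(img, ny, nx, s)
    return anchor, positive, negative
```

In a full training loop, each patch would first pass through the embedding network, and the loss would be minimized over many such triplets sampled from unlabeled natural images.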
Author information
Dov Danon is a Ph.D. student at the School of Computer Science, Tel-Aviv University. He received his B.Sc. (summa cum laude) degree in computer science and mathematics from Ben-Gurion University of the Negev in 2007 and his M.Sc. degree in computer science from Tel-Aviv University in 2016. His research interests include machine learning and, in particular, unsupervised learning in image processing.
Hadar Averbuch-Elor is a Ph.D. student at the School of Electrical Engineering, Tel-Aviv University, and a research scientist at Amazon. She received her B.Sc. (cum laude) degree in electrical engineering from the Technion in 2012. She worked as a computer vision algorithm developer in the defense industry from 2011 to 2015. Her research interests include computer vision and computer graphics, focusing on unstructured image collections and unsupervised techniques.
Ohad Fried is a postdoctoral research scholar at the School of Computer Science, Stanford University, and a fellow in the Brown Institute for Media Innovation. He received his B.Sc. (magna cum laude) degree in computer science and computational biology and M.Sc. (cum laude) degree in computer science, both from the Hebrew University, in 2010 and 2012 respectively. He received his Ph.D. degree from the Department of Computer Science at Princeton University in 2017. Currently, his main interests are visual communication methods at the intersection of graphics, vision, and HCI.
Daniel Cohen-Or is a professor at the School of Computer Science, Tel-Aviv University. He received his B.Sc. (cum laude) degree in mathematics and computer science and M.Sc. (cum laude) degree in computer science, both from Ben-Gurion University, in 1985 and 1986, respectively. He received his Ph.D. degree from the Department of Computer Science at the State University of New York at Stony Brook in 1991. He received the 2005 Eurographics Outstanding Technical Contributions Award, and in 2015 he was named a Thomson Reuters Highly Cited Researcher. His current interests span image synthesis, analysis and reconstruction, motion and transformations, and shapes and surfaces.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Danon, D., Averbuch-Elor, H., Fried, O. et al. Unsupervised natural image patch learning. Comp. Visual Media 5, 229–237 (2019). https://doi.org/10.1007/s41095-019-0147-y