Mining Self-similarity: Label Super-Resolution with Epitomic Representations

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12371)

Abstract

We show that simple patch-based models, such as epitomes (Jojic et al., 2003), can have superior performance to the current state of the art in semantic segmentation and label super-resolution, which uses deep convolutional neural networks. We derive a new training algorithm for epitomes which allows, for the first time, learning from very large data sets and derive a label super-resolution algorithm as a statistical inference over epitomic representations. We illustrate our methods on land cover mapping and medical image analysis tasks.
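
To make the term "epitome" concrete, here is a minimal sketch of the core inference step in a Gaussian epitome in the sense of Jojic et al. (2003): computing the posterior over epitome positions for one image patch. It is an illustration under stated assumptions, not the training algorithm derived in the paper. The function name `patch_position_posterior`, the uniform prior over positions, the wrap-around indexing, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def patch_position_posterior(patch, mean, var):
    """Posterior over top-left epitome positions for one grayscale K x K patch.

    mean, var: H x W arrays of per-pixel Gaussian parameters (the epitome).
    Assumes a uniform prior over positions and wrap-around indexing.
    """
    K = patch.shape[0]
    H, W = mean.shape
    offsets = np.arange(K)
    log_lik = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            r = (i + offsets) % H
            c = (j + offsets) % W
            mu = mean[np.ix_(r, c)]
            v = var[np.ix_(r, c)]
            # Sum of independent per-pixel Gaussian log-densities.
            log_lik[i, j] = -0.5 * np.sum(np.log(2 * np.pi * v) + (patch - mu) ** 2 / v)
    log_lik -= log_lik.max()  # stabilise before exponentiating
    post = np.exp(log_lik)
    return post / post.sum()


# Usage on synthetic data: cut a 7 x 7 patch out of the epitome mean itself,
# so the posterior should peak at the position the patch was taken from.
rng = np.random.default_rng(0)
mean = rng.random((16, 16))
var = np.full((16, 16), 0.01)
rows = np.arange(3, 3 + 7) % 16
cols = np.arange(5, 5 + 7) % 16
patch = mean[np.ix_(rows, cols)]
post = patch_position_posterior(patch, mean, var)
best = tuple(int(k) for k in np.unravel_index(post.argmax(), post.shape))
```

The explicit double loop keeps the sketch readable; a practical implementation would evaluate all positions at once with convolutions.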

Notes

  1. https://github.com/anthonymlortiz/epitomes_lsr.

  2. \(E_2^{(\ell)}\) is trained to model the patches poorly modeled by the self-diversifying \(E_1^{(\ell)}\). Hence, \(E_2^{(\ell)}\) simply has much higher posteriors and more diversity of texture.

  3. We found it helpful to work with \(2\times\) downsampled images and use \(7\times 7\) patches for embedding, with approximately \(0.05|W|^2\) patches sampled for tiles of size \(W \times W\).

  4. We used training settings identical to those of [18]. The training collapsed to a minimum in which the “water” class was not predicted, but the accuracy would be lower than that of all-tile epitomic LSR even if all water were predicted correctly.
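
The sampling budget in note 3 can be illustrated with a short sketch. It only works through the arithmetic (about \(0.05|W|^2\) patches of size \(7\times 7\) from a \(2\times\) downsampled tile); it is not the authors' code. The helper names `downsample2x` and `sample_patches`, the block-averaging downsampler, uniform random patch positions, and the reading of \(W\) as the tile side before downsampling are all assumptions.

```python
import numpy as np

PATCH = 7        # patch side, from note 3
FRACTION = 0.05  # patches sampled per tile pixel, from note 3


def downsample2x(img):
    """Average non-overlapping 2 x 2 blocks (one assumed way to downsample by 2)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2, *img.shape[2:]).mean(axis=(1, 3))


def sample_patches(tile, rng):
    """Draw about 0.05 * W^2 random 7 x 7 patches from the downsampled tile.

    Assumes a square tile, with W taken as its side before downsampling.
    """
    W = tile.shape[0]
    n = int(round(FRACTION * W * W))
    small = downsample2x(tile)
    # Top-left corners chosen uniformly so every patch fits inside the image.
    r = rng.integers(0, small.shape[0] - PATCH + 1, size=n)
    c = rng.integers(0, small.shape[1] - PATCH + 1, size=n)
    return np.stack([small[i:i + PATCH, j:j + PATCH] for i, j in zip(r, c)])


# Usage: a synthetic 256 x 256 tile with 4 channels.
rng = np.random.default_rng(0)
tile = rng.random((256, 256, 4))
patches = sample_patches(tile, rng)
print(patches.shape)  # (3277, 7, 7, 4), since 0.05 * 256**2 = 3276.8
```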

References

  1. Bazzani, L., Cristani, M., Perina, A., Murino, V.: Multiple-shot person re-identification by chromatic and epitomic analyses. Pattern Recogn. Lett. 33(7), 898–903 (2012)

  2. Brendel, W., Bethge, M.: Approximating CNNs with bag-of-local-features models works surprisingly well on ImageNet. In: International Conference on Learning Representations (ICLR) (2019)

  3. Chesapeake Conservancy: Land cover data project (2017). https://chesapeakeconservancy.org/wp-content/uploads/2017/01/LandCover101Guide.pdf

  4. Cheung, V., Jojic, N., Samaras, D.: Capturing long-range correlations with patch models. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition. pp. 1–8. IEEE (2007)

  5. Dai, J., He, K., Sun, J.: BoxSup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 1635–1643 (2015)

  6. Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google’s image search. In: Tenth IEEE International Conference on Computer Vision (ICCV 2005). vol. 2, pp. 1816–1823. IEEE (2005)

  7. Frey, B.J., Jojic, N.: Transformed component analysis: Joint estimation of spatial transformations and image components. In: Proceedings of the Seventh IEEE International Conference on Computer Vision. vol. 2, pp. 1190–1196. IEEE (1999)

  8. Ganchev, K., Gillenwater, J., Taskar, B., et al.: Posterior regularization for structured latent variable models. J. Mach. Learn. Res. 11, 2001–2049 (2010)

  9. Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W.: ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In: International Conference on Learning Representations (ICLR) (2019)

  10. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

  11. Homer, C., et al.: Completion of the 2011 National Land Cover Database for the conterminous United States-representing a decade of land cover change information. Photogramm. Eng. Remote Sens. 81(5), 345–354 (2015)

  12. Hong, S., Noh, H., Han, B.: Decoupled deep neural network for semi-supervised semantic segmentation. In: Advances in Neural Information Processing Systems. pp. 1495–1503 (2015)

  13. Hou, L., et al.: Sparse autoencoder for unsupervised nucleus detection and representation in histopathology images. Pattern Recogn. 86, 188–200 (2019)

  14. Jojic, N., Frey, B.J., Kannan, A.: Epitomic analysis of appearance and shape. In: ICCV. vol. 3, p. 34 (2003)

  15. Jojic, N., Perina, A., Murino, V.: Structural epitome: a way to summarize one’s visual experience. In: Advances in Neural Information Processing Systems. pp. 1027–1035 (2010)

  16. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006). vol. 2, pp. 2169–2178. IEEE (2006)

  17. Luo, W., Li, Y., Urtasun, R., Zemel, R.: Understanding the effective receptive field in deep convolutional neural networks. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems 29, pp. 4898–4906. Curran Associates, Inc. (2016). http://papers.nips.cc/paper/6203-understanding-the-effective-receptive-field-in-deep-convolutional-neural-networks.pdf

  18. Malkin, K., et al.: Label super-resolution networks. In: International Conference on Learning Representations (2019)

  19. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-2010). pp. 807–814 (2010)

  20. Ni, K., Kannan, A., Criminisi, A., Winn, J.: Epitomic location recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31(12), 2158–2167 (2009)

  21. Nilsback, M.E., Zisserman, A.: A visual vocabulary for flower classification. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006). vol. 2, pp. 1447–1454. IEEE (2006)

  22. Papandreou, G., Chen, L.C., Murphy, K.P., Yuille, A.L.: Weakly-and semi-supervised learning of a deep convolutional network for semantic image segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 1742–1750 (2015)

  23. Papandreou, G., Chen, L.C., Yuille, A.L.: Modeling image patches with a generic dictionary of mini-epitomes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2051–2058 (2014)

  24. Papandreou, G., Kokkinos, I., Savalle, P.A.: Modeling local and global deformations in deep learning: Epitomic convolution, multiple instance learning, and sliding window detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 390–399 (2015)

  25. Pathak, D., Krahenbuhl, P., Darrell, T.: Constrained convolutional neural networks for weakly supervised segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 1796–1804 (2015)

  26. Perina, A., Jojic, N.: Spring lattice counting grids: scene recognition using deformable positional constraints. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 837–851. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_60

  27. Robinson, C., et al.: Large scale high-resolution land cover mapping with multi-resolution data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 12726–12735 (2019)

  28. Robinson, C., Malkin, K., Hu, L., Dilkina, B., Jojic, N.: Weakly supervised semantic segmentation in the 2020 IEEE GRSS Data Fusion Contest. In: Proceedings of the International Geoscience and Remote Sensing Symposium (2020)

  29. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

  30. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)

  31. Saltz, J., Gupta, R., Hou, L., Kurc, T., Singh, P., Nguyen, V., Samaras, D., Shroyer, K.R., Zhao, T., Batiste, R., et al.: Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Rep. 23(1), 181 (2018)

  32. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

  33. Weber, M., Welling, M., Perona, P.: Unsupervised learning of models for recognition. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1842, pp. 18–32. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-45054-8_2

  34. Yeung, S., Kannan, A., Dauphin, Y., Fei-Fei, L.: Epitomic variational autoencoders (2016)

  35. Zhang, H., Fritts, J.E., Goldman, S.A.: Image segmentation evaluation: a survey of unsupervised methods. Comput. Vis. Image Underst. 110(2), 260–280 (2008)

  36. Zhou, N., et al.: Evaluation of nucleus segmentation in digital pathology images through large scale image synthesis. In: Medical Imaging 2017: Digital Pathology. vol. 10140, p. 101400K. International Society for Optics and Photonics (2017)

Acknowledgments

The authors thank Caleb Robinson for valuable help with experiments [28] and the reviewers for comments on earlier versions of the paper.

Author information

Corresponding author

Correspondence to Nikolay Malkin.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 5712 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Malkin, N., Ortiz, A., Jojic, N. (2020). Mining Self-similarity: Label Super-Resolution with Epitomic Representations. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12371. Springer, Cham. https://doi.org/10.1007/978-3-030-58574-7_32

  • DOI: https://doi.org/10.1007/978-3-030-58574-7_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58573-0

  • Online ISBN: 978-3-030-58574-7

  • eBook Packages: Computer Science (R0)
