Mining Mid-level Visual Patterns with Deep CNN Activations

Li, Yao; Liu, Lingqiao; Shen, Chunhua; Hengel, Anton van den

doi:10.1007/s11263-016-0945-y

Published: 29 August 2016

Volume 121, pages 344–364 (2017)

International Journal of Computer Vision

Abstract

The purpose of mid-level visual element discovery is to find clusters of image patches that are representative of, and discriminate between, the contents of the relevant images. Here we propose a pattern-mining approach to the problem of identifying mid-level elements within images, motivated by the observation that such techniques have been very effective and efficient in achieving similar goals when applied to other data types. We show that Convolutional Neural Network (CNN) activations extracted from image patches typically possess two appealing properties that enable seamless integration with pattern mining techniques. The marriage between CNN activations and a pattern mining technique leads to fast and effective discovery of representative and discriminative patterns from a huge number of image patches, from which mid-level elements are retrieved. Given the patterns and retrieved mid-level visual elements, we propose two methods to generate image feature representations. The first encoding method uses the patterns as codewords in a dictionary, in a manner similar to the Bag-of-Visual-Words model; we thus label this a Bag-of-Patterns representation. The second relies on mid-level visual elements to construct a Bag-of-Elements representation. We evaluate the two encoding methods on object and scene classification tasks, and demonstrate that our approach outperforms or matches the state of the art on these tasks.
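As a rough illustration of the pipeline the abstract describes, the following minimal Python sketch turns patch activations into transactions, mines patterns that are both frequent and discriminative for one class, and then builds a Bag-of-Patterns histogram. Several details are assumptions made for the example rather than statements from the abstract: each transaction is taken to be the set of indices of the k largest activations, patterns are scored with support and confidence thresholds, and itemsets of a fixed size are enumerated by brute force in place of an efficient miner such as Apriori. The function names, the toy activations, and the threshold values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def to_transaction(activation, k):
    """Assumed encoding: keep only the indices of the k largest activations."""
    order = sorted(range(len(activation)), key=lambda i: activation[i], reverse=True)
    return frozenset(order[:k])


def mine_patterns(transactions, labels, target, size, min_support, min_confidence):
    """Brute-force stand-in for a frequent itemset miner.

    Returns itemsets of the given size that occur often in patches of the
    target class (support) and rarely in other classes (confidence).
    """
    total = Counter()      # occurrences of each itemset over all patches
    in_target = Counter()  # occurrences restricted to target-class patches
    for items, label in zip(transactions, labels):
        for combo in combinations(sorted(items), size):
            total[combo] += 1
            if label == target:
                in_target[combo] += 1
    n = len(transactions)
    patterns = []
    for combo, count in in_target.items():
        support = count / n
        confidence = count / total[combo]
        if support >= min_support and confidence >= min_confidence:
            patterns.append(combo)
    return sorted(patterns)


def bag_of_patterns(transactions, patterns):
    """Histogram with one bin per pattern: how many patches contain it."""
    return [sum(1 for t in transactions if set(p) <= t) for p in patterns]


# Toy 4-dimensional "activations" for four patches (hypothetical values).
activations = [
    [0.9, 0.8, 0.0, 0.1],
    [0.7, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.8],
    [0.8, 0.0, 0.9, 0.1],
]
labels = [1, 1, 0, 0]  # 1 = target class, 0 = background

transactions = [to_transaction(a, k=2) for a in activations]
patterns = mine_patterns(transactions, labels, target=1, size=2,
                         min_support=0.5, min_confidence=0.9)
histogram = bag_of_patterns(transactions, patterns)
print(patterns, histogram)
```

In this toy setting the only pattern that survives both thresholds is the pair of dimensions that fire together in the two target-class patches. A practical system would replace the exhaustive enumeration with a dedicated itemset-mining implementation, since the number of candidate itemsets grows combinatorially with the activation dimension.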

Notes

Answer key: 1. aeroplane, 2. train, 3. cow, 4. motorbike, 5. bike, 6. sofa.
http://www.borgelt.net/apriori.html.

Acknowledgments

This work was supported in part by an ARC Future Fellowship (FT120100969). Y. Li and L. Liu contributed equally to this work.

Author information

Authors and Affiliations

The School of Computer Science, The University of Adelaide, Adelaide, Australia
Yao Li, Lingqiao Liu, Chunhua Shen & Anton van den Hengel

Corresponding author

Correspondence to Chunhua Shen.

Additional information

Communicated by Josef Sivic.

About this article

Cite this article

Li, Y., Liu, L., Shen, C. et al. Mining Mid-level Visual Patterns with Deep CNN Activations. Int J Comput Vis 121, 344–364 (2017). https://doi.org/10.1007/s11263-016-0945-y

Received: 27 January 2016
Accepted: 16 August 2016
Published: 29 August 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s11263-016-0945-y
