Abstract
The relationship between samples is often ignored when training neural networks for classification tasks. If properly utilized, this information can bring many benefits to the trained models. On the one hand, networks trained without regard for similarities between samples may represent different samples closely even when they belong to different classes, which undermines the discriminative ability of the trained models. On the other hand, regularizing inter-class and intra-class similarities in the feature space during training can effectively disentangle the representations of different classes and make the representations sparse. To this end, a new regularization method is proposed that penalizes positive inter-class similarities and negative intra-class similarities in the feature space. Experimental results show that the proposed method not only yields sparse and disentangled representations but also improves the performance of the trained models on many datasets.
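Since the abstract describes the penalty only at a high level, the following is a minimal sketch of how such a term could be computed from a batch of features. The function name gram_regularization, the cosine-normalized Gram matrix, and the hinge-style clamping are illustrative assumptions; the paper's exact formulation may differ.

```python
# A minimal sketch of the penalty described above, assuming PyTorch
# features from the penultimate layer. Names and the exact form of the
# penalty are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn.functional as F

def gram_regularization(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Penalize positive inter-class and negative intra-class
    cosine similarities within a batch.

    features: (batch, dim) feature vectors.
    labels:   (batch,) integer class labels.
    """
    f = F.normalize(features, dim=1)   # unit-norm rows
    gram = f @ f.t()                   # (batch, batch) cosine similarities

    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    off_diag = ~torch.eye(len(labels), dtype=torch.bool, device=labels.device)

    # Inter-class pairs should not be positively similar ...
    inter = gram[~same & off_diag].clamp(min=0.0)
    # ... and intra-class pairs should not be negatively similar.
    intra = (-gram[same & off_diag]).clamp(min=0.0)

    return inter.sum() + intra.sum()
```

In training, a term like this would typically be added to the standard cross-entropy loss with a weighting coefficient, e.g. loss = F.cross_entropy(logits, labels) + lam * gram_regularization(feats, labels).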
Funding
This work was supported by the National Natural Science Foundation of China [Grant No. 61432012].
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Gao, Z., Chen, Y., Guo, Q. et al. Gram regularization for sparse and disentangled representation. Pattern Anal Applic 25, 337–349 (2022). https://doi.org/10.1007/s10044-021-01033-4