
PCA-AE: Principal Component Analysis Autoencoder for Organising the Latent Space of Generative Networks

Journal of Mathematical Imaging and Vision

Abstract

Autoencoders and generative models produce some of the most spectacular deep learning results to date. However, understanding and controlling the latent space of these models presents a considerable challenge. Drawing inspiration from principal component analysis and autoencoders, we propose the principal component analysis autoencoder (PCA-AE). This is a novel autoencoder whose latent space satisfies two properties. Firstly, the dimensions are organised in decreasing importance with respect to the data at hand. Secondly, the components of the latent space are statistically independent. We achieve this by progressively increasing the size of the latent space during training, and by applying a covariance loss to the latent codes. The resulting autoencoder produces a latent space which separates the intrinsic attributes of the data into different components, in a completely unsupervised manner. We also describe an extension of our approach to the case of powerful, pre-trained GANs. We show results both on synthetic examples of shapes and on a state-of-the-art GAN. For example, we are able to separate the colour shade scale of hair, the pose of faces, and gender, without accessing any labels. We compare PCA-AE with other state-of-the-art approaches, in particular with respect to the ability to disentangle attributes in the latent space. We hope that this approach will contribute to a better understanding of the intrinsic latent spaces of powerful deep generative models.
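To make the description above more concrete, the following is a minimal, hypothetical PyTorch sketch of one plausible reading of the covariance loss: penalising the off-diagonal entries of the empirical covariance matrix of a batch of latent codes, which pushes the latent components towards decorrelation. The function name, the weighting factor lambda_cov, and the placement of the term in the training objective are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): a batch covariance penalty on latent codes.
import torch

def covariance_loss(z: torch.Tensor) -> torch.Tensor:
    """Penalise off-diagonal entries of the covariance of a batch of latent codes.

    z: latent codes of shape (batch_size, latent_dim).
    """
    z_centered = z - z.mean(dim=0, keepdim=True)          # centre each latent dimension
    cov = z_centered.T @ z_centered / (z.shape[0] - 1)     # (latent_dim, latent_dim) covariance
    off_diag = cov - torch.diag(torch.diag(cov))           # zero out the diagonal
    return (off_diag ** 2).sum()

# Illustrative use inside a training step (reconstruction term assumed to be an MSE):
#   z = encoder(x)
#   x_hat = decoder(z)
#   loss = torch.nn.functional.mse_loss(x_hat, x) + lambda_cov * covariance_loss(z)
#
# In PCA-AE the number of latent dimensions is also grown progressively during
# training; only the decorrelation term is sketched here.
```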


Notes

  1. The following argmin operations are to be understood up to a sign.

  2. https://github.com/YannDubs/disentangling-vae.

  3. PyTorch GAN zoo: https://github.com/facebookresearch/pytorch_GAN_zoo.

  4. StyleGAN Code: https://github.com/rosinality/style-based-gan-pytorch.


Author information


Corresponding author

Correspondence to Chi-Hieu Pham.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was funded by the Labex DIGICOSME.


About this article


Cite this article

Pham, CH., Ladjal, S. & Newson, A. PCA-AE: Principal Component Analysis Autoencoder for Organising the Latent Space of Generative Networks. J Math Imaging Vis 64, 569–585 (2022). https://doi.org/10.1007/s10851-022-01077-z
