Learning Gradient-Based ICA by Neurally Estimating Mutual Information

  • Hlynur Davíð Hlynsson
  • Laurenz Wiskott
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11793)


Several methods for estimating the mutual information of random variables have been developed in recent years. They can prove valuable for novel approaches to learning statistically independent features. In this paper, we use one of these methods, a mutual information neural estimation (MINE) network, to present a proof of concept of how a neural network can perform linear ICA. We minimize the mutual information, as estimated by a MINE network, between the output units of a differentiable encoder network. This is done by simple alternating optimization of the two networks. The method is shown to obtain a solution qualitatively equal to that of FastICA on blind source separation of noisy sources.
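
As a rough illustration of the quantity being minimized: MINE (Belghazi et al., 2018) trains a statistics network T to maximize the Donsker-Varadhan lower bound on mutual information, I(X;Y) ≥ E_joint[T] − log E_marginals[exp(T)]. The sketch below only evaluates this bound for a fixed, hand-written critic on paired samples. The function name `dv_lower_bound`, the toy critic, the toy data, and the use of shuffling to approximate samples from the product of marginals are illustrative assumptions, not the authors' implementation. In an alternating scheme of the kind described in the abstract, the critic would be updated to raise this value and the encoder to lower it.

```python
import math
import random


def dv_lower_bound(critic, pairs, rng):
    """Donsker-Varadhan estimate: mean of T over joint samples minus the
    log-mean-exp of T over approximate product-of-marginals samples."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    shuffled_ys = ys[:]
    rng.shuffle(shuffled_ys)  # breaking the pairing approximates p(x)p(y)
    joint_term = sum(critic(x, y) for x, y in pairs) / len(pairs)
    exp_mean = sum(
        math.exp(critic(x, y)) for x, y in zip(xs, shuffled_ys)
    ) / len(pairs)
    return joint_term - math.log(exp_mean)


def toy_critic(x, y):
    # Hypothetical fixed critic; MINE would use a trained neural network here.
    return 0.25 * x * y


rng = random.Random(0)

# Dependent pair: y is a noisy copy of x.
dependent = []
for _ in range(2000):
    x = rng.gauss(0.0, 1.0)
    dependent.append((x, x + 0.5 * rng.gauss(0.0, 1.0)))

# Independent pair: x and y are drawn separately.
independent = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(2000)]

bound_dependent = dv_lower_bound(toy_critic, dependent, rng)
bound_independent = dv_lower_bound(toy_critic, independent, rng)
print("dependent:", bound_dependent)
print("independent:", bound_independent)
```

On toy data like this, one would expect the estimate for the dependent pair to exceed that for the independent pair, although the exact values depend on the critic and the random seed. A fixed critic gives only a loose bound; the tightness of a MINE estimate comes from training the critic.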


Keywords: Adversarial training, Deep learning, Independent component analysis



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Ruhr University Bochum, Bochum, Germany