Visualizing and Understanding Nonnegativity Constrained Sparse Autoencoder in Deep Learning

  • Babajide O. Ayinde
  • Ehsan Hosseini-Asl
  • Jacek M. Zurada
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9692)


In this paper, we demonstrate how complex deep learning structures can be made understandable to humans when they are likened to isolated but understandable concepts, using the architecture of the Nonnegativity Constrained Autoencoder (NCAE). We show that constraining most of the weights in the network to be nonnegative, using both \(L_1\) and \(L_2\) nonnegativity penalization, yields a more understandable structure with only a minute deterioration in classification accuracy. The proposed approach also extracts sparser features and additionally sparsifies the output layer. The concept is illustrated on the MNIST and NORB datasets.
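The abstract does not spell out the penalty itself, so the following is only a minimal sketch of what a combined \(L_1\)/\(L_2\) nonnegativity penalty could look like. It assumes the penalty acts on negative weights only, with the form \(\alpha_1 |w| + \frac{\alpha_2}{2} w^2\) for \(w < 0\) and zero otherwise. The function names and the coefficients `alpha1` and `alpha2` are hypothetical and are not taken from the paper.

```python
def nonneg_penalty(weights, alpha1=1e-3, alpha2=1e-3):
    """Sum of alpha1*|w| + (alpha2/2)*w^2 over negative weights only.

    Assumed form of the penalty; nonnegative weights contribute nothing.
    `weights` is a list of rows (a plain 2-D list of floats).
    """
    total = 0.0
    for row in weights:
        for w in row:
            if w < 0:
                total += alpha1 * abs(w) + 0.5 * alpha2 * w * w
    return total


def nonneg_penalty_grad(weights, alpha1=1e-3, alpha2=1e-3):
    """Gradient of the assumed penalty with respect to each weight.

    For w < 0 the derivative is -alpha1 + alpha2*w (always negative), so a
    gradient-descent step pushes negative weights up toward zero.
    """
    return [
        [(-alpha1 + alpha2 * w) if w < 0 else 0.0 for w in row]
        for row in weights
    ]


# Toy weight matrix with two negative entries.
W = [[0.5, -1.0], [-2.0, 0.0]]
penalty = nonneg_penalty(W, alpha1=0.1, alpha2=0.2)
grad = nonneg_penalty_grad(W, alpha1=0.1, alpha2=0.2)
```

In training, a term of this kind would be added to the reconstruction (or classification) loss, so that negative weights are discouraged rather than strictly forbidden. This is consistent with the abstract's statement that most, not all, weights end up nonnegative.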


Deep architecture; Semi-supervised learning; White-box model; Part-based representation



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Babajide O. Ayinde (1)
  • Ehsan Hosseini-Asl (1)
  • Jacek M. Zurada (1, 2)
  1. Electrical and Computer Engineering Department, University of Louisville, Louisville, USA
  2. Information Technology Institute, University of Social Science, Lodz, Poland
