Linear Feature Transform and Enhancement of Classification on Deep Neural Network

Journal of Scientific Computing

Abstract

A weighted and convex regularized nuclear norm model is introduced to construct a rank-constrained linear transform on the feature vectors of deep neural networks. The feature vectors of each class are modeled by a subspace, and the linear transform aims to enlarge the pairwise angles between these subspaces. The weighting and convex regularization resolve the rank degeneracy of the linear transform. The model is solved by a difference of convex functions algorithm (DCA) whose descent and convergence properties are analyzed. Numerical experiments are carried out with convolutional neural networks on the CAFFE platform for 10-class handwritten digit images (MNIST) and small color images of objects (CIFAR-10), both in the public domain. The transformed feature vectors improve the accuracy of the network in the low-dimensional feature regime following dimensionality reduction via principal component analysis (PCA). The feature transform is independent of the network structure and can be applied to reduce the complexity of the final fully-connected layer without retraining the feature extraction layers of the network.
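For orientation, the generic DCA step for an objective \(f = g - h\) with \(g, h\) convex is \(x^{k+1} \in \arg\min_x \, g(x) - \langle y^k, x \rangle\), where \(y^k \in \partial h(x^k)\); the specific convex splitting used for the nuclear norm model is given in the full text. The following is a minimal, hypothetical Python sketch of the classification pipeline described above, not the authors' code: scikit-learn's PCA and LogisticRegression stand in for the dimension reduction and the final fully-connected layer, and the transform matrix A is a placeholder for the rank-constrained matrix that the DCA solver would produce.

    # Hypothetical sketch of the feature-transform pipeline (not the authors' code).
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    features = rng.standard_normal((1000, 64))  # stand-in for DNN feature vectors
    labels = rng.integers(0, 10, size=1000)     # 10 classes, as in MNIST/CIFAR-10

    # Step 1: dimensionality reduction via PCA (the low-dimensional regime).
    z = PCA(n_components=8).fit_transform(features)

    # Step 2: apply the learned linear transform. Identity is used here as a
    # placeholder; the paper computes it from the weighted, convex-regularized
    # nuclear norm model via DCA.
    A = np.eye(8)
    z_t = z @ A.T

    # Step 3: retrain only the final classifier on the transformed features;
    # the feature extraction layers are left untouched.
    clf = LogisticRegression(max_iter=1000).fit(z_t, labels)
    print("training accuracy:", clf.score(z_t, labels))

Note that only the last step involves training, which is how the transform can reduce the cost of the final fully-connected layer without retraining the feature extraction layers.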

Acknowledgements

The authors wish to thank Dr. Xiangxin Zhu of Google Inc., Dr. Yang Yang, and Dr. Xin Zhong for helpful conversations on DNN and CAFFE.

Author information

Corresponding author

Correspondence to Penghang Yin.

Additional information

This work was partially supported by NSF Grants DMS-1222507, DMS-1522383, and IIS-1632935.

About this article

Cite this article

Yin, P., Xin, J. & Qi, Y. Linear Feature Transform and Enhancement of Classification on Deep Neural Network. J Sci Comput 76, 1396–1406 (2018). https://doi.org/10.1007/s10915-018-0666-1
