Improving Deep Neural Networks by Adding Auxiliary Information

  • Sihyeon Seong
  • Chanho Lee
  • Junmo Kim
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 751)


Deep neural networks have recently solved many single-domain tasks, so the next generation of problems should involve multi-domain tasks. As a preliminary step, we investigate how auxiliary information affects a deep learning model. By defining a primary class and auxiliary classes, we can study how deep learning models behave when an additional task is added to the original task. In this paper, we give a theoretical consideration of additional information and conclude that, at the very least, random information should not affect deep learning models. We then propose an architecture capable of ignoring redundant information and show that it copes well with auxiliary information in practice. Finally, we present examples of auxiliary information that can improve the performance of our architecture.
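The idea of pairing a primary class with auxiliary classes can be illustrated with a generic joint objective. The following is only a minimal sketch under assumed conventions, not the architecture proposed in the paper: the function names and the `aux_weight` parameter are hypothetical, and a weighted sum of cross-entropy losses is a common multi-task formulation rather than something the abstract specifies.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


def joint_loss(primary_logits, primary_label,
               aux_logits, aux_label, aux_weight=0.5):
    """Primary loss plus a weighted auxiliary loss.

    aux_weight is a hypothetical knob: at 0.0 the auxiliary task is
    ignored entirely and the objective equals the primary loss.
    """
    primary = cross_entropy(primary_logits, primary_label)
    auxiliary = cross_entropy(aux_logits, aux_label)
    return primary + aux_weight * auxiliary


# Example: one sample with 3 primary classes and 2 auxiliary classes.
loss = joint_loss([2.0, 0.5, -1.0], 0, [0.0, 0.0], 1, aux_weight=0.5)
```

In this sketch, setting `aux_weight` to zero recovers single-task training, which is one simple way to picture a model that ignores uninformative auxiliary labels.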



This work was supported by the ICCTDP (No. 10063172) funded by MOTIE, Korea.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  1. Statistical Inference and Information Theory Laboratory, School of Electrical Engineering, KAIST, Daejeon, South Korea
