Improving Convolutional Neural Network Design via Variable Neighborhood Search

  • Teresa Araújo
  • Guilherme Aresta
  • Bernardo Almada-Lobo
  • Ana Maria Mendonça
  • Aurélio Campilho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10317)

Abstract

An unsupervised method for convolutional neural network (CNN) architecture design is proposed. The method relies on a variable neighborhood search (VNS)-based approach for finding CNN architectures and hyperparameter values that improve classification performance. For this purpose, t-Distributed Stochastic Neighbor Embedding (t-SNE) is applied to represent the solution space effectively in 2D. Then, k-means clustering divides this representation space, taking into account the relative distance between neighbors. The algorithm is tested on the CIFAR-10 image dataset. The obtained solution improves the CNN validation loss by over \(15\%\) and the respective accuracy by \(5\%\). Moreover, the network shows higher predictive power and robustness, validating our method for the optimization of CNN design.
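The neighborhood-partition step described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: each CNN configuration is assumed to already have a precomputed 2-D t-SNE coordinate, and a plain Lloyd's k-means then groups nearby configurations so the VNS search can treat each cluster as a neighborhood. All names and the toy data are hypothetical.

```python
# Sketch of the clustering step: k-means over an (assumed precomputed)
# 2-D t-SNE embedding of candidate CNN configurations.
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from the data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(range(k),
                            key=lambda c: math.dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return centroids, labels

# Toy "t-SNE embedding": two well-separated groups of configurations.
embedding = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
             (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels = kmeans(embedding, k=2)
```

In the paper's setting, each resulting cluster defines a neighborhood of structurally similar solutions that the VNS procedure can explore or perturb as a unit.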

Keywords

Machine learning · Convolutional neural network · Parameter optimization · Variable neighborhood search


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Teresa Araújo (1, 2)
  • Guilherme Aresta (1, 2)
  • Bernardo Almada-Lobo (1, 2)
  • Ana Maria Mendonça (1, 2)
  • Aurélio Campilho (1, 2)
  1. INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal
  2. Faculdade de Engenharia da Universidade do Porto, Porto, Portugal
