Improving Convolutional Neural Network Design via Variable Neighborhood Search
An unsupervised method for convolutional neural network (CNN) architecture design is proposed. The method relies on a variable neighborhood search-based approach to find CNN architectures and hyperparameter values that improve classification performance. For this purpose, t-Distributed Stochastic Neighbor Embedding (t-SNE) is applied to represent the solution space in 2D. Then, k-Means clustering divides this representation space, taking into account the relative distance between neighbors. The algorithm is tested on the CIFAR-10 image dataset. The obtained solution improves the CNN validation loss by over \(15\%\) and the respective accuracy by \(5\%\). Moreover, the network shows higher predictive power and robustness, validating our method for the optimization of CNN design.
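The search idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the 2D embedding has already been computed (random points stand in for the t-SNE output), uses a plain Lloyd's k-means written from scratch, and replaces CNN training with a hypothetical `toy_loss`. The names `kmeans`, `vns`, `toy_loss`, and `embedding`, and the choice to order neighborhoods by centroid distance, are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on 2D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return centroids, labels


def vns(labels, centroids, loss, start, max_evals=200, seed=0):
    """Variable neighborhood search over cluster-defined neighborhoods.

    Neighborhoods are the clusters, visited from the incumbent's own
    cluster outward (assumed ordering: centroid-to-centroid distance).
    """
    rng = random.Random(seed)
    best, best_loss = start, loss(start)
    evals = 1
    if len(labels) < 2:
        return best, best_loss
    while evals < max_evals:
        home = labels[best]
        order = sorted(range(len(centroids)),
                       key=lambda c: math.dist(centroids[c], centroids[home]))
        for c in order:
            candidates = [i for i, lab in enumerate(labels)
                          if lab == c and i != best]
            if not candidates:
                continue
            cand = rng.choice(candidates)  # "shake": random pick in neighborhood
            cand_loss = loss(cand)
            evals += 1
            if cand_loss < best_loss:
                # Improvement found: accept it and restart from the
                # closest neighborhood of the new incumbent.
                best, best_loss = cand, cand_loss
                break
            if evals >= max_evals:
                break
    return best, best_loss


# Stand-in for the t-SNE embedding of candidate configurations (assumption).
_rng = random.Random(42)
embedding = [(_rng.uniform(-10, 10), _rng.uniform(-10, 10)) for _ in range(200)]


def toy_loss(i):
    """Hypothetical stand-in for the validation loss of configuration i."""
    return math.dist(embedding[i], (3.0, -4.0))


centroids, labels = kmeans(embedding, k=5)
best_idx, best_loss = vns(labels, centroids, toy_loss, start=0)
print(best_idx, round(best_loss, 3))
```

In a real setting, `toy_loss` would be replaced by training the candidate CNN and returning its validation loss, which is far more expensive; the evaluation budget `max_evals` would then be the dominant cost to tune.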
Keywords: Machine learning, Convolutional neural network, Parameter optimization, Variable neighborhood search
Teresa Araújo and Guilherme Aresta equally contributed to this work. Project “NanoSTIMA: Macro-to-Nano Human Sensing: Towards Integrated Multimodal Health Monitoring and Analytics/NORTE-01-0145-FEDER-000016” is financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF). Teresa Araújo is funded by the FCT grant contract SFRH/BD/122365/2016. Guilherme Aresta is funded by the FCT grant contract SFRH/BD/120435/2016.