Automatic Structural Search for Multi-task Learning VALPs
The neural network research field continues to produce novel and improved models that outperform their predecessors. However, a large portion of the best-performing architectures are still fully hand-engineered by experts. Recently, methods that automate the search for optimal structures have started to match state-of-the-art hand-crafted structures. Nevertheless, replacing expert knowledge requires high efficiency from the search algorithm and flexibility from the model concept. This work proposes a set of structure-modifying operators designed specifically for the VALP, a recently introduced multi-network model for heterogeneous multi-task problems. To test the viability of a structure-exploring method built on these operators, they are used in a greedy multi-objective search algorithm with a non-dominance-based acceptance criterion. The experimental results indicate that the modifiers can indeed form part of intelligent searches over the space of VALP structures, which encourages further research in this direction.
Keywords: Heterogeneous multi-task learning, Deep learning, Structure optimization
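The greedy search with a non-dominance-based acceptance criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that all objectives are minimized and that a candidate is accepted whenever the current structure does not Pareto-dominate it. The names `dominates`, `greedy_search`, `modify` and `evaluate` are hypothetical, and a "structure" here is any Python object rather than an actual VALP.

```python
import random
from typing import Callable, Sequence, Tuple, TypeVar

S = TypeVar("S")  # hypothetical stand-in for a model structure


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance under minimization (an assumption of this sketch):
    a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def greedy_search(
    initial: S,
    modify: Callable[[S, random.Random], S],
    evaluate: Callable[[S], Tuple[float, ...]],
    steps: int,
    rng: random.Random,
) -> Tuple[S, Tuple[float, ...]]:
    """Apply a structure modifier repeatedly, keeping a candidate only
    if the current structure does not dominate it."""
    current, current_obj = initial, evaluate(initial)
    for _ in range(steps):
        candidate = modify(current, rng)
        candidate_obj = evaluate(candidate)
        # Non-dominance-based acceptance: reject only dominated candidates.
        if not dominates(current_obj, candidate_obj):
            current, current_obj = candidate, candidate_obj
    return current, current_obj
```

In this sketch a candidate that trades one objective for another is still accepted, so the search can move along incomparable solutions instead of stalling at the first one it finds. A stricter rule, accepting only candidates that dominate the current structure, would be a different design choice.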
This work has been supported by the TIN2016-78365-R (Spanish Ministry of Economy, Industry and Competitiveness, http://www.mineco.gob.es/portal/site/mineco) and the IT-1244-19 (Basque Government) programs. Unai Garciarena also holds a predoctoral grant (ref. PIF16/238) from the University of the Basque Country.
We also gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan X Pascal GPU used to accelerate the process of training the models used in this work.