Selective Filter Transfer
Deep learning has become a dominant tool for machine learning across application fields. However, because deep networks have a large number of parameters, a tremendous amount of data is required to avoid over-fitting. Data acquisition and labeling are done by humans, one sample at a time, and are therefore expensive. In many situations, a dataset of adequate size is difficult to acquire. To resolve this problem, transfer learning is adopted (Yosinski et al., Adv. Neural Inf. Process. Syst., 2014). Transfer learning delivers the knowledge learned from an abundant dataset, e.g. ImageNet, to the dataset of interest. The fundamental way to transfer knowledge is to reuse the weights learned from the large dataset. The transferred weights can be either frozen or fine-tuned on the new, small dataset. Transfer learning has shown an improvement in target performance. However, one drawback is that the performance depends on the similarity between the source and the target datasets (Azizpour et al., Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015). In other words, the two datasets must be alike for the transfer to be effective, and finding a similar source dataset then becomes another difficulty. To alleviate these problems, we propose a method that maximizes the effectiveness of the transferred weights regardless of which source data is used. Among the weights pre-trained on the source data, only the ones relevant to the target data are transferred. The relevance is measured statistically. In this way, we improved the classification accuracy on a downsized 50-sub-class version of ImageNet 2012 by 2%.
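The selection step described above can be illustrated with a minimal sketch. The abstract does not state which statistic is used to measure relevance, so the sketch assumes one possible choice: the variance of each filter's mean response over the target images. The function names `filter_relevance` and `selective_transfer`, the `keep_ratio` parameter, and the random re-initialization of non-selected filters are all hypothetical and are not taken from the paper.

```python
import numpy as np


def filter_relevance(activations):
    """Score each filter of one conv layer on target data.

    activations: array of shape (num_images, num_filters, height, width)
    holding the pre-trained layer's responses to target images.
    ASSUMPTION: relevance = variance of the per-image mean response.
    The paper only says relevance is "measured statistically".
    """
    per_image = activations.mean(axis=(2, 3))  # (num_images, num_filters)
    return per_image.var(axis=0)               # (num_filters,)


def selective_transfer(source_weights, scores, keep_ratio=0.5, rng=None):
    """Copy the highest-scoring source filters; re-initialize the rest.

    source_weights: array of shape (num_filters, in_channels, kh, kw).
    keep_ratio is a hypothetical knob, not a value from the paper.
    Returns the new weights and a boolean mask of transferred filters.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_filters = source_weights.shape[0]
    num_keep = max(1, int(round(keep_ratio * num_filters)))
    keep = np.argsort(scores)[::-1][:num_keep]  # indices of top scores
    mask = np.zeros(num_filters, dtype=bool)
    mask[keep] = True
    # Non-selected filters start from small random values (assumption).
    new_weights = rng.normal(0.0, 0.01, size=source_weights.shape)
    new_weights[mask] = source_weights[mask]
    return new_weights, mask


# Toy usage with random stand-ins for real weights and activations.
rng = np.random.default_rng(0)
source_weights = rng.normal(size=(8, 3, 3, 3))
activations = rng.normal(size=(16, 8, 5, 5))
scores = filter_relevance(activations)
weights, mask = selective_transfer(source_weights, scores, keep_ratio=0.5, rng=rng)
```

In a real setting, the activations would come from running the source-trained network on target images, and the resulting weights would then be fine-tuned on the target dataset.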
This work was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (2017-0-01780, The technology development for event recognition/relational reasoning and learning knowledge based system for video understanding).
- 1. Yosinski, J. et al.: How transferable are features in deep neural networks? Adv. Neural Inf. Process. Syst. (2014)
- 2. Azizpour, H. et al.: From generic to specific deep representations for visual recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2015)
- 3. Jia, Y. et al.: Caffe: convolutional architecture for fast feature embedding (2014). arXiv:1408.5093
- 4. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. (2012)
- 5. He, K. et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016)