
One Shot Learning with Margin

  • Xianchao Zhang
  • Jinlong Nie
  • Linlin Zong
  • Hong Yu
  • Wenxin Liang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11440)

Abstract

One shot learning is the task of learning from only a few examples, which poses a great challenge for current machine learning algorithms. One of the most effective approaches to one shot learning is metric learning. However, metric-based approaches suffer from the data shortage inherent in the one shot scenario. To alleviate this problem, we propose one shot learning with margin. The margin helps learn a more discriminative metric space. We integrate the margin into two representative one shot learning models, prototypical networks and matching networks, to enhance their generalization ability. Experimental results on benchmark datasets show that the margin effectively boosts the performance of one shot learning models.
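
The abstract describes adding a margin to metric-based one shot learners such as prototypical networks. Below is a minimal PyTorch-style sketch of how an additive margin can be folded into a prototypical-network episode loss; the function name, margin value, and interface are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: prototypical-network episode loss with an additive margin.
# Assumptions (not from the paper): PyTorch, squared Euclidean distance,
# and an additive margin applied to the true-class distance before the softmax.
import torch
import torch.nn.functional as F


def prototypical_margin_loss(support_emb, support_labels, query_emb, query_labels,
                             n_way, margin=0.5):
    """support_emb: (n_way * k_shot, d); query_emb: (n_query, d)."""
    # Class prototypes: mean embedding of each class's support examples.
    prototypes = torch.stack(
        [support_emb[support_labels == c].mean(dim=0) for c in range(n_way)]
    )  # (n_way, d)

    # Squared Euclidean distance from every query to every prototype.
    dists = torch.cdist(query_emb, prototypes) ** 2  # (n_query, n_way)

    # Additive margin: inflate the distance to the true class at training time,
    # forcing the embedding to keep a larger gap between classes.
    one_hot = F.one_hot(query_labels, num_classes=n_way).float()
    logits = -(dists + margin * one_hot)

    # Standard softmax cross-entropy over the margin-adjusted negative distances.
    return F.cross_entropy(logits, query_labels)
```

A matching-network variant would apply the same idea to the attention logits over the support set rather than to prototype distances.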

Keywords

One shot learning · Metric learning · Meta learning


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Xianchao Zhang (1, 2)
  • Jinlong Nie (1, 2)
  • Linlin Zong (1, 2)
  • Hong Yu (1, 2)
  • Wenxin Liang (3), corresponding author

  1. School of Software, Dalian University of Technology, Dalian, China
  2. Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, China
  3. School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China