Deep transfer learning for military object recognition under small training set condition

  • Zhi Yang
  • Wei Yu
  • Pengwei Liang
  • Hanqi Guo
  • Likun Xia
  • Feng Zhang
  • Yong Ma
  • Jiayi Ma
Original Article

Abstract

Convolutional neural networks are powerful for general object recognition, but their excellent performance depends heavily on huge training sets. For tasks such as military object recognition, where image samples for training are scarce, performance degrades sharply. To address this problem, a deep transfer learning method is proposed in this paper. The main idea consists of two parts: transfer learning for prior-knowledge embedding and a mixed-layer scheme for better feature extraction. It has been shown that the feature-extraction ability learned on a large dataset is helpful for related tasks and can be transferred to a new neural network. The transfer learning process is achieved by fixing the weights of some layers and retraining the remaining ones. The key problem in deep transfer learning is deciding which part should be transferred and which part should be retrained to adapt the network to the new task. We resolve this through extensive experiments and find that retraining the last three layers while transferring prior knowledge to the other layers yields the best performance. In addition, we use a mixed-layer scheme to exploit the available information: in each mixed layer, convolution filters of different sizes are combined, helping the network adapt to features at different scales. By employing these two methods, the proposed approach achieves a large improvement in military object recognition under small-training-set conditions. Experiments demonstrate that our method attains high recognition precision, superior to many competing algorithms.
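The two ideas above, freezing the early layers while retraining only the last few, and combining convolution filters of several sizes in one mixed layer, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain NumPy, a naive "same"-padded convolution, and random filters as stand-ins for learned weights, purely to show how a layer-freezing split is expressed and how multi-scale feature maps are produced and stacked.

```python
import numpy as np

# Transfer-learning split (illustrative): earlier layers keep their
# pretrained weights fixed; only the last three layers are retrained.
layers = [f"layer{i}" for i in range(1, 9)]
trainable = {name: name in layers[-3:] for name in layers}

def conv2d_same(x, k):
    """Naive single-channel 2D convolution with zero 'same' padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    h, w = x.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def mixed_layer(x, kernel_sizes=(1, 3, 5)):
    """Apply filters at several scales and stack the resulting maps
    along a channel axis, so later layers see all scales at once."""
    rng = np.random.default_rng(0)
    maps = []
    for s in kernel_sizes:
        k = rng.standard_normal((s, s)) / s  # stand-in for a learned filter
        maps.append(np.maximum(conv2d_same(x, k), 0.0))  # ReLU activation
    return np.stack(maps, axis=0)  # shape: (num_scales, H, W)

x = np.random.default_rng(1).standard_normal((8, 8))
features = mixed_layer(x)
print(features.shape)  # (3, 8, 8)
```

In a real network the per-scale outputs would be concatenated along the channel dimension of a multi-channel tensor and the filters learned by backpropagation; the sketch only shows the structural idea of the mixed layer and the freeze/retrain split.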

Keywords

Object recognition · Small training set · Military · Transfer learning · Convolutional neural network

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61773295, 61503288 and 61572076.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.


Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  1. College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology, Wuhan, China
  3. Electronic Information School, Wuhan University, Wuhan, China
  4. College of Information Engineering, Capital Normal University, Beijing, China
  5. China Academy of Electronics and Information Technology, Beijing, China
