Small-scale moving target detection in aerial image by deep inverse reinforcement learning

  • Wei Sun (corresponding author)
  • Dashuai Yan
  • Jie Huang
  • Changhao Sun


This paper proposes a deep inverse reinforcement learning method for detecting slow, weak moving targets in aerial video. Differential gray images of adjacent frames serve as the network input, and the feature network extracts candidate moving target regions through multi-layer convolution. The candidate target information is used as the initial layer of the policy network. Expert trajectories are used to adjust and optimize both the convolutional feature network and the fully connected policy network, which trains the reward function and the expert policy. In the autonomous policy improvement stage, the policy model is re-optimized on unlabeled aerial video, and deep inverse reinforcement learning with a nonlinear policy network is used to decide the position and size of each moving target. Targets in the multi-group aerial video test set measure 10 × 10 pixels. Experimental results show that the proposed algorithm benefits from the nonlinear policy of the neural network and detects targets more accurately than traditional moving target detection algorithms. It also shows clear accuracy advantages in aerial video moving target detection over the traditional maximum margin planning (MMP) method and the structured classification based inverse reinforcement learning (SCIRL) method.
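The input step described above, differential gray images of adjacent frames, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `to_gray` and `differential_gray`, the BT.601 luma weights, and the synthetic 64 × 64 frames are all assumptions made for illustration. In the paper, candidate regions are extracted by the convolutional feature network; the simple bounding-box step here only shows that a 10 × 10 target is visible in the difference image.

```python
import numpy as np


def to_gray(frame_rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB frame to float32 grayscale (assumed BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return frame_rgb.astype(np.float32) @ weights


def differential_gray(prev_rgb: np.ndarray, curr_rgb: np.ndarray) -> np.ndarray:
    """Absolute grayscale difference between two adjacent frames."""
    return np.abs(to_gray(curr_rgb) - to_gray(prev_rgb))


# Synthetic adjacent frames (hypothetical data): a 10 x 10 bright target
# appears in the second frame against a static black background.
prev_frame = np.zeros((64, 64, 3), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[20:30, 40:50] = 255

diff = differential_gray(prev_frame, curr_frame)

# Illustrative stand-in for candidate extraction: bounding box of changed pixels,
# reported as (top, left, bottom, right) with inclusive indices.
rows, cols = np.nonzero(diff > 0)
bbox = (int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max()))
```

With a static background, only the pixels covered by the moving target are non-zero in `diff`, which is why a difference image is a natural input for a detector aimed at small targets.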


Aerial image; Deep inverse reinforcement; Small-scale target detection



We would like to thank the anonymous reviewers and the associate editor for their valuable comments and suggestions, which improved the quality of the manuscript. This work was supported by the National Natural Science Foundation of China (NSFC) under Grants 61671356, 61703403, and 61601352.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflicts of interest regarding this work, and no commercial or associative interest that represents a conflict of interest in connection with the submitted work.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Aerospace Science and Technology, Xidian University, Xi’an, China
  2. Qian Xuesen Laboratory of Space Technology, China Academy of Space Technology, Beijing, China
