1 Introduction

As carriers for air transport and reconnaissance, rotorcraft play a significant role in both military and civil fields. However, it is difficult for conventional pulsed-Doppler radar to distinguish a rotorcraft from ground targets when the rotorcraft is flying or hovering at low speed and low altitude. For example, traditional radar signal processing algorithms can distinguish neither a helicopter from armored vehicles on a battlefield nor unmanned aerial vehicles from pedestrians in urban environments. Moreover, it is even more challenging to classify different types of rotorcraft. Thus, an automatic rotorcraft detection technique is in high demand [1, 2].

The concepts of micro-motion and micro-Doppler were introduced by Chen et al. [3]. In many cases, an object or a structural component of an object exhibits oscillatory motion, called micro-motion [3]. The source of micro-motion may be the rotating propeller of a fixed-wing aircraft, the rotating rotor blades of a helicopter, a rotating antenna, the flapping wings of a bird, or the swinging arms and legs of a walking person. The Doppler shift caused by micro-motion is called micro-Doppler. Target recognition based on micro-Doppler features extracts motion signatures of the target from the radar returns, which opens up new avenues for target recognition.

Rotorcraft blades are usually made of metal or composite materials, which produce strong radar reflections. Since the speed of the rotor tip is much higher than that of the fuselage or the micro-motions of other ground targets, the Doppler shift caused by rotor rotation occupies a distinctive region of the spectrum. Micro-Doppler is therefore well suited to rotorcraft recognition and classification. However, most existing recognition or classification algorithms devise discriminative features on post-processed signals rather than on the raw micro-Doppler signals. Such dependence on domain knowledge of micro-Doppler signals limits the scalability of those algorithms.
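As a rough illustration using the rotor parameters adopted later in Sect. 3 (blade length \( L = 6\,{\text{m}} \), wavelength \( \lambda = 0.06\,{\text{m}} \); taking the mid-range rotation rate \( \Omega = 4\,{\text{rev}}/{\text{s}} \) is our choice), the blade tip velocity and the corresponding maximum micro-Doppler shift are

\( v_{\text{tip}} = 2\pi \Omega L = 2\pi \cdot 4 \cdot 6 \approx 150.8\,{\text{m}}/{\text{s}}, \quad f_{\text{mD,max}} = \frac{2 v_{\text{tip}}}{\lambda} = \frac{2 \cdot 150.8}{0.06} \approx 5.0\,{\text{kHz}}, \)

which dwarfs the body Doppler of a slow-moving fuselage and explains why the rotor signature stands out so clearly in the spectrogram.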

Deep learning algorithms have revolutionized several applications, such as image classification and speech recognition, in recent years. Compared with previous state-of-the-art algorithms that rely principally on domain knowledge-based features, the classification accuracy of deep learning-based algorithms has improved significantly. Therefore, in this paper, we consider a deep learning approach as an alternative to overcome these limitations. To the best of our knowledge, deep learning has not yet been widely used in the radar community, particularly for rotorcraft recognition with micro-Doppler signatures.

In this paper, we propose to use deep convolutional neural networks (DCNNs) to recognize micro-Doppler signatures in spectrograms and thereby classify and recognize rotorcraft targets. By training the convolutional neural network on micro-Doppler spectrograms, the model learns the inner features automatically, so that it can recognize rotorcraft targets and distinguish different rotorcraft. In the experiments, we train the proposed DCNN on simulated micro-Doppler spectrograms and then apply the model to rotorcraft detection and helicopter classification. High recognition rates are observed in the experiments.

The remainder of this paper is organized as follows. Section 2 introduces related work. Section 3 presents our rotorcraft recognition method in detail. Section 4 presents rotorcraft classification. Finally, we conclude our work in Sect. 5.

2 Related Work

In recent years, most existing research in micro-Doppler target recognition has followed the same paradigm: first, explicitly extract from the micro-Doppler signals some unique features that are able to distinguish different targets; then, apply these features to target classification [4]. Molchanov et al. extract features of different types of targets from micro-Doppler signals and then classify ground moving targets [5, 6]. Molchanov et al. also classify helicopters and fixed-wing aircraft based on their differences in Doppler energy distribution [7]. Thayaparan et al. decompose micro-Doppler signals by wavelet decomposition and then extract the micro-Doppler features by time-frequency analysis [8]. Cai et al. extract the micro-Doppler signals by the Hilbert-Huang transform [9]. Since such approaches extract features explicitly, the algorithms rely heavily on prior knowledge and expert experience.

In recent years, with the development of high-performance processors such as graphics processing units (GPUs) and FPGAs, CNNs have achieved excellent empirical performance in a number of applications, such as speech recognition, image recognition, and natural language processing [10, 11]. A CNN employs a deep neural network structure that stacks multiple layers of simple neural networks and is trained with the supervised back-propagation algorithm; it extracts hierarchical abstractions from the training data. Compared with traditional image recognition algorithms, a CNN does not need hand-crafted empirical features. In the ImageNet LSVRC-2010 competition, a top-5 error rate of 17% was obtained when classifying 1.2 million high-resolution images of 1000 different classes [12], whereas the top-5 error rate of the best traditional algorithm was 25.7% [13]. In 2014, a face recognition method based on CNNs was presented and achieved an accuracy of 99.15% [14, 15].

3 Rotorcraft Recognition Through Micro-Doppler Signatures Using CNN

The main purpose of rotorcraft recognition is to find the signals from rotorcraft targets among the various echo signals received by the radar. In addition to the rotating rotor, we simulate the radar echoes of oscillating pendulums, spinning tops, walking people, and flying birds. The emulated radar echoes are processed with the short-time Fourier transform (STFT) to obtain micro-Doppler spectrograms. During the observation window, 100 radar micro-Doppler spectrograms are simulated for each class of targets. The corresponding sample spectrograms are shown in Fig. 1.
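As a minimal sketch of this processing step (the paper does not specify the STFT window parameters, so the window length and overlap below are illustrative assumptions), a complex baseband echo can be turned into a spectrogram as follows:

```python
import numpy as np
from scipy.signal import stft

def micro_doppler_spectrogram(echo, fs, nperseg=256, noverlap=192):
    """Compute a micro-Doppler spectrogram (dB magnitude) from a
    complex baseband radar echo sampled at fs Hz.

    nperseg/noverlap are illustrative choices, not values from the paper.
    """
    f, t, Z = stft(echo, fs=fs, nperseg=nperseg, noverlap=noverlap,
                   return_onesided=False)    # complex input: keep both Doppler signs
    S = np.fft.fftshift(np.abs(Z), axes=0)   # center zero Doppler
    f = np.fft.fftshift(f)
    return f, t, 20.0 * np.log10(S + 1e-12)  # log scale for display
```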

Fig. 1. Sample micro-Doppler spectrograms. (a) Rotor blades. (b) Oscillating pendulums. (c) Spinning and precessing top. (d) Walking humans. (e) Wing-flapping birds.

(a) Emulation of the radar backscattering from rotating rotor blades uses the following parameters: the radar is located at (X = 10 m, Y = 0 m, Z = 20 m) with a wavelength of 0.06 m at the C-band; the rotor center is located at (X = 0 m, Y = 0 m, Z = 0 m); the blade length is L = 6 m and the blade width is W = 1 m; the rotation rate Ω varies from 2 rev/s to 6 rev/s following a normal distribution; the observation window is 1 s and the number of blades is N = 3. A minimal simulation sketch for this rotor echo is given after item (e).

(b) When emulating the radar backscattering from an oscillating pendulum, the radar is located at (X = 10 m, Y = 0 m, Z = 20 m) and the pivot point of the pendulum is assumed to be at (X = 0 m, Y = 0 m, Z = 2 m). The string length L falls between 0.3 m and 2 m, and the mass of the small bob ranges from 10 g to 80 g. For the damped and driven case, the damping constant is γ = 0.07, the driving amplitude is A = 15, and the normalized driving frequency is f = 0.2. The radar wavelength is 0.03 m at the X-band, and the observation time is T = 10 s.

(c) Emulation of the radar backscattering from a precessing top. During the radar observation interval, the simulated spinning and precessing top has mass m = 25 kg and is located at (X = 0 m, Y = 0 m, Z = 2 m). The distance between the center of mass and the fixed tip point is L = 0.563 m, the moments of inertia are \( I_{1} = I_{2} = 0.117\,{\text{kg}} \cdot {\text{m}}^{2} \) and \( I_{3} = 8.5\,{\text{kg}} \cdot {\text{m}}^{2} \), and the initial nutation angle is \( \theta_{0} = 20^{\circ} \). The X-band radar, with a wavelength of 0.03 m, is located at (X = 10 m, Y = 0 m, Z = 20 m). The observation time is T = 8 s.

(d) Emulation of the radar backscattering from walking humans uses the human walking model in [16]. The relative velocity of the walking person ranges from \( 0.2\,{\text{m}}/{\text{s}} \) to \( 3\,{\text{m}}/{\text{s}} \) following a normal distribution, and the height of the person is \( H = 1.8\,{\text{m}} \). The radar is located at (X = 10 m, Y = 0 m, Z = 20 m) with a wavelength of 0.02 m. The starting point of the human base is located at (X = 0 m, Y = 0 m, Z = 0 m), and the observation window is 4 s.

(e) Emulation of the radar backscattering from the flapping wings of birds. The flapping frequency has mean 0.1 Hz with variance 0.1, the arm length has mean 0.5 cm with variance 0.1, and the bird flies with a velocity from 0.7 m/s to 1.3 m/s, where all the above parameters follow normal distributions. The X-band radar is located at (X = 20 m, Y = 0 m, Z = −10 m) and the observation window is 10 s.
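As referenced in item (a), the following is a minimal sketch of how such a rotor echo can be generated. It models each blade as a line of point scatterers and is our own illustrative implementation under the parameters of item (a), not the exact simulator used in the experiments:

```python
import numpy as np

def rotor_echo(fs=20000, T=1.0, wavelength=0.06, L=6.0, omega_rps=4.0,
               n_blades=3, n_scat=20, radar_pos=(10.0, 0.0, 20.0)):
    """Simulate the baseband echo of a rotating rotor (point-scatterer model).

    Each blade is sampled with n_scat point scatterers from hub to tip;
    the echo is the coherent sum of their returns. Illustrative sketch only.
    """
    t = np.arange(int(fs * T)) / fs
    radar = np.asarray(radar_pos)
    r = np.linspace(0.1, L, n_scat)                      # scatterer radii along a blade
    echo = np.zeros_like(t, dtype=complex)
    for b in range(n_blades):
        phi = 2*np.pi*omega_rps*t + 2*np.pi*b/n_blades   # rotation angle of blade b
        x = r[:, None] * np.cos(phi)[None, :]            # scatterer positions in the
        y = r[:, None] * np.sin(phi)[None, :]            # rotor plane (z = 0)
        R = np.sqrt((x - radar[0])**2 + (y - radar[1])**2 + radar[2]**2)
        echo += np.exp(-1j * 4*np.pi * R / wavelength).sum(axis=0)
    return t, echo
```

Feeding the returned echo into the micro_doppler_spectrogram sketch above yields a spectrogram with the characteristic periodic blade flashes of Fig. 1(a).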

We simulated 100 spectrograms for each target, obtaining 500 spectrograms in total. In this paper, we employ the spectrogram itself as the input to the CNN; in other words, we regard the spectrogram classification problem as an image recognition problem. The observation window is set long enough to capture the periodic micro-Doppler signatures. The size of the input spectrogram is normalized to 218 × 127 pixels. Among the 500 samples, 80% are used as the training dataset and the rest are used as the testing dataset.
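A minimal sketch of this preparation step (the library choices and the fixed random seed are our assumptions):

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

def prepare_dataset(spectrograms, labels, size=(218, 127), test_frac=0.2):
    """Resize spectrograms to a fixed size, scale to [0, 1], and split 80/20."""
    x = tf.image.resize(np.asarray(spectrograms)[..., None], size).numpy()
    x = (x - x.min()) / (x.max() - x.min())   # normalize intensities
    return train_test_split(x, np.asarray(labels), test_size=test_frac,
                            stratify=labels, random_state=0)
```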

As the number of training samples is relatively small, we designed a small-scale CNN for rotorcraft recognition. The proposed CNN consists of six layers: three convolution layers followed by three fully-connected layers. As shown in Fig. 2, the first convolution layer filters the 218 × 127 input image with 128 kernels of size 5 × 5 with a stride of 3 pixels (the distance between the receptive field centers of neighboring neurons in a kernel map). The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 3 × 3 with a stride of 3 pixels. The last convolutional layer has 256 convolution filters of size 3 × 3. For max pooling, we use 3 × 3 max pooling with a stride of 3 pixels after the first layer, 2 × 2 max pooling with a stride of 3 pixels after the second layer, and 3 × 3 max pooling with a stride of 2 pixels after the third layer. Finally, three fully-connected layers, each with 256 neurons, connect the output of the third pooling layer to the target classes.
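A minimal Keras sketch of this architecture follows. Activation functions, padding, and the softmax output layer are not specified in the text, so ReLU activations, 'same' padding, and the final classification layer are our assumptions; the kernel counts, kernel sizes, and strides follow the description above:

```python
from tensorflow.keras import layers, models

def build_recognition_cnn(input_shape=(218, 127, 1), n_classes=5):
    """Six-layer CNN (3 conv + 3 fully-connected) sketched from Sect. 3."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(128, 5, strides=3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=3, padding="same"),
        layers.Conv2D(256, 3, strides=3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=3, padding="same"),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=2, padding="same"),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),  # assumed output layer
    ])
```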

Fig. 2. CNN for rotorcraft recognition.

Figure 3 shows how the accuracy improves as training proceeds. The accuracy of the model reaches 99% after the 4th epoch. The training time of each epoch is less than 1 s; that is, the total training time of the deep neural network is less than 10 s.

Fig. 3. The accuracy and the loss of the DCNN model in each epoch.

Figure 4 visualizes the feature representations extracted by the DCNN model at different levels. The feature maps at each layer are more abstract than those at the previous layer: the feature maps at the very first layer contain more edges and structures, while the feature maps at subsequent layers encode more implicit representations of the spectrogram.
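Such visualizations can be produced by reading out intermediate activations. A minimal sketch, built on the build_recognition_cnn model above (itself a sketch under stated assumptions):

```python
import matplotlib.pyplot as plt
from tensorflow.keras import models

def show_feature_maps(model, image, layer_index=0, n_maps=8):
    """Plot the first n_maps activation maps of one layer for one input."""
    probe = models.Model(inputs=model.input,
                         outputs=model.layers[layer_index].output)
    acts = probe.predict(image[None, ...])   # add batch dimension
    fig, axes = plt.subplots(1, n_maps, figsize=(2 * n_maps, 2))
    for i, ax in enumerate(axes):
        ax.imshow(acts[0, :, :, i], cmap="viridis")
        ax.axis("off")
    plt.show()
```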

Fig. 4. Visualization of part of the features extracted in different layers of the DCNN model.

In order to test the robustness of the proposed DCNN model, we reduced the resolution of the spectrograms to 109 × 86 and 55 × 43, and conducted the same experiments on the two sets of data separately. The experimental results are presented in Table 1. When the resolution is 109 × 86, a recognition accuracy of 99% is reached at the 2nd epoch; when the resolution is reduced to 55 × 43, 99% accuracy is also reached, but only at the 10th epoch. This indicates that the proposed DCNN model is robust, as it maintains a high recognition accuracy when the resolution of the input is reduced.

Table 1. The accuracy of the DCNN model on the testing datasets with different resolutions.

4 Rotorcraft Classification Through Micro-Doppler Signatures Using CNN

Rotorcraft classification further classifies the rotorcraft after the rotor has been recognized. Table 2 lists several main-rotor features of different helicopters; these feature parameters are important for identifying the type of an unknown helicopter. In this emulation, the radar is located at (X = 500 m, Y = 0 m, Z = 20 m) with a wavelength of 0.06 m at the C-band, the observation window is 1 s, the range resolution is 0.5 m, and the number of pulses is 10240. The rotor center is located at (X = 0 m, Y = 0 m, Z = 0 m); the other features are shown in Table 2. For each type of rotorcraft, the blade length and rotation rate follow normal distributions with variance 0.1, and we simulated 500 spectrograms per helicopter.

Table 2. Main rotor features of typical helicopters

During the observation time of 0.2 s, 500 radar backscattering signals are simulated for each class of targets, and their sample spectrograms are shown in Fig. 5. The size of the extracted spectrograms is 218 × 172.

Fig. 5. Sample spectrograms of helicopters. (a) A-Helicopter. (b) B-Helicopter. (c) C-Helicopter. (d) D-Helicopter. (e) E-Helicopter. (f) F-Helicopter. (g) G-Helicopter.

Because the dataset and the number of classes are significantly larger than those of the first experiment, the CNN in this study has eight layers: the first five layers are convolution and pooling layers, and the remaining three are fully-connected layers. The first convolution layer filters the 218 × 172 input image with 128 kernels of size 7 × 7 with a stride of 3 pixels. The second has 256 kernels of size 3 × 3 with a stride of 2 pixels. For max pooling, we use 3 × 3 max pooling with a stride of 3 pixels after the first layer and 2 × 2 max pooling with a stride of 3 pixels after the second layer. The third and fourth convolutional layers have 256 convolution filters of size 3 × 3. The fifth layer is a max pooling layer with 5 × 5 pooling windows and a stride of 5 pixels over the 256 feature maps. Furthermore, three fully-connected layers, each with 256 neurons, connect the output of the last pooling layer to the seven target classes (see Fig. 6).
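As with the recognition network, a minimal Keras sketch of this architecture follows; padding, activations, and the softmax output are our assumptions, while the kernel counts, sizes, and strides follow the description above:

```python
from tensorflow.keras import layers, models

def build_classification_cnn(input_shape=(218, 172, 1), n_classes=7):
    """Eight-layer CNN for helicopter classification sketched from Sect. 4."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(128, 7, strides=3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=3, strides=3, padding="same"),
        layers.Conv2D(256, 3, strides=2, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=2, strides=3, padding="same"),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=5, strides=5, padding="same"),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),   # dropout on the final fully-connected layer
        layers.Dense(n_classes, activation="softmax"),
    ])
```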

Fig. 6. CNN for rotorcraft classification.

We used fourfold cross validation to evaluate the classification performance in this study. The training and test sets in each fold contained 2800 and 500 samples, respectively. For learning, we used mini-batch SGD with a learning rate of 0.001 and a batch size of 10, and dropout was applied to the final fully-connected layer with a probability of 0.5.
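A minimal sketch of this training configuration, using the classification model above (the loss function and epoch count are assumptions; the optimizer hyperparameters follow the text, and x_train/y_train are placeholder names for one cross-validation fold):

```python
from tensorflow.keras import optimizers

model = build_classification_cnn()
model.compile(optimizer=optimizers.SGD(learning_rate=0.001),  # mini-batch SGD, lr = 0.001
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(x_train, y_train, batch_size=10, epochs=10,
                    validation_data=(x_test, y_test))
```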

In this paper, we used the open-source toolkit Keras, which uses NVIDIA GPUs and the CUDA library (e.g., cuDNN [17]) to speed up the computation. We used an NVIDIA GeForce GTX Titan X GPU (with 12 GB of memory) and an Intel(R) Xeon(R) CPU (4 cores) with 8 GB of memory in our experiments. The resulting classification accuracy was 100%, as shown in Fig. 7.

Fig. 7. Accuracy and loss of the DCNN for each epoch in rotorcraft classification.

In addition, we also tested the recognition speed of the trained convolutional neural network; the results are shown in Table 3. The testing time for each fold with 4000 spectrograms was about 2 s on average, which meets the real-time requirements. Table 4 shows the accuracy of other machine learning algorithms; it can be seen that only the DCNN achieves high accuracy on this task.

Table 3. Training time of DCNN
Table 4. The classification accuracy for different classifiers
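The per-fold timing discussed above can be reproduced with a simple wall-clock measurement; a minimal sketch (the batch size is an assumption):

```python
import time

def measure_inference_time(model, spectrograms, batch_size=100):
    """Return the total seconds taken to classify a set of spectrograms."""
    start = time.perf_counter()
    model.predict(spectrograms, batch_size=batch_size, verbose=0)
    return time.perf_counter() - start
```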

5 Conclusion

In this paper, a deep learning-based moving target recognition method is proposed for rotorcraft recognition and classification. We propose to use DCNN models to recognize rotorcraft among ground moving targets directly on raw micro-Doppler spectrograms. Compared with traditional radar signal processing techniques, DCNN-based models automatically extract features from the micro-Doppler spectrograms rather than relying on explicit domain knowledge. Simulated data are used in the experiments. The proposed DCNN model successfully detected the rotorcraft among five types of moving objects with 100% accuracy. Similarly, the proposed DCNN classified seven types of helicopters with 100% accuracy. In addition to the superior accuracy, the prediction speed is high, 2 s for 4000 spectrograms on average, which satisfies the requirement of real-time prediction in real applications. In the future, we will focus on validating and optimizing the proposed models on real radar micro-Doppler data.