1 Introduction

The dual linear polarization radar can transmit horizontal and vertical polarization waves alternately or simultaneously, and it can also use different signal processing methods to deal with the echo signals from two polarization directions. Moreover, it is easy to obtain the horizontal reflectivity (Z H ), differential reflectivity (Z DR), co-polar correlation coefficient (ρ HV), differential propagation phase constant (K DP), and other polarization parameters. The difference in shape, size, or spatial orientation between different types of hydrometeor will produce different polarization parameters, and it can promote the development of hydrological meteorological classification by these polarization parameters. Compared with the conventional Doppler weather radar system, its ability to estimate the precipitation and recognize the hydrometeor phase has been improved significantly. What is more, it is an important tool in the fields of artificial influence on weather, aviation warning, and disaster monitoring [1,2,3,4,5].

Liu et al. [6] established a hydrometeor classification system based on fuzzy logic and neural network. In the system, the horizontal reflectivity, differential reflectivity, differential propagation phase shift, correlation coefficient, linear depolarization ratio, and the corresponding height are used as the inputs, and the neural network learning algorithm is applied to adjust the parameters. Finally, the inputs and parameters of the system are calculated to determine the type of hydrometeors [6,7,8].

Chandrasekar et al. [9] summarized the researches on echo classification and the identification of hydrological fluid, which are based on dual polarization radar in recent years. The classification principle of various types was described, and the characteristics of hydrometeor classification were analyzed. It promoted the study in the hydrometeor classification of the dual polarization radar greatly [9, 10].

Besic et al. [11] used a semi-supervised approach to realize the classification of hydrometeors. In this study, the K-medoids (KM) approach is used to cluster the sample data and the clustering results are evaluated by the Kolmogorov-Smirnov (KS) test method. Finally, the fuzzy logic algorithm is applied to realize the highly precise classification of hydrometeors based on the clustering results [11].

Hinton et al. [12] published an article in Science, which opened a gate for deep learning in the field of machine learning. Deep learning, as a kind of emerging learning algorithm of multi-layer neural network, has solved the local minimum defect in the traditional training algorithm. Moreover, it has been widely used in machine learning and computer vision and has aroused widespread concern in various fields [12,13,14,15]. Tao et al. [16] used the deep learning approach to realize the precipitation identification with bispectral satellite information. In this study, the effectiveness of the deep learning (DL) approaches to extract useful features from bispectral satellite information, infrared (IR), and water vapor (WV) channels and to produce rain/no-rain (R/NR) detection is explored [16].

Based on the data detected by WSR-98D/XD dual polarization Doppler weather radar, which has been upgraded, we designs a hydrometeor identification system based on deep learning and fuzzy logic methods. In this paper, there is a matrix of polarimetric measurements corresponding to each range bin, and it is composed of the data of the range bin and adjacent range bin, which covers approximately 1 km2. Therefore, the matrix of 21 × 21 ×4 can reflect the precipitation information of the current range bin. The system used the deep learning algorithm to extract the features from the matrix of polarimetric measurements, and the results are sent into the Softmax classifier to identify the precipitation pattern. Then, the fuzzy logic method is applied to judge the type of precipitation particles based on the precipitation pattern which was known. Finally, the optimal identification results of hydrometeor type can be obtained. The deep learning method is used for initial clustering, and then, the fuzzy logic method is used for accurate clustering; the result shows that it can improve the hydrometeor classification accuracy significantly.

The structure of this article is as follows. In Section 2, the dual polarization radar measurements used for various classifications, the deep learning method, and the fuzzy logic algorithm are briefly described. The process of experimental design is described in Section 3. In Section 4, the performance of the identification system of hydrometeor types is evaluated by the assessment data collected by the experiment center of atmospheric exploration in China.

2 Methods

2.1 Dual linear polarization Doppler weather radar

Polarization refers to the direction of vibration of the electric field when the electromagnetic wave propagates, and when the electric field vibrates in the horizontal direction, it is named the horizontal polarized wave. Otherwise, if in the vertical direction, it is named the vertical polarized wave. The dual linear polarization Doppler weather radar mainly alternately or simultaneously transmits horizontal and vertical polarization waves. Moreover, the distribution of the propagation medium in the space is not uniform, so the different polarization signal attenuation and phase shift will be different. As a result, it is possible to obtain the attenuation difference and the phase shift of the two polarization waves. When the frequency of the radar is higher, the impact of the propagation will be more serious. Through the corresponding processing and calculation, a series of polarimetric measurements such as horizontal reflectivity, vertical reflectivity, differential reflectivity factor, differential propagation phase change, differential propagation phase constant, and co-polar correlation coefficient can be obtained. For the above parameters, the detailed introduction is as follows.

2.1.1 Horizontal reflectivity factor Z H and vertical reflectivity factor Z V

When transmitting horizontal polarization, the expression of the horizontal reflectivity factor is

$$ {Z}_H=\underset{0}{\overset{D_{\mathrm{max}}}{\int }}N(D)\bullet {D_H}^6{d}_D $$
(1)

where D H is the size of the particle detected by the radar in the horizontal direction and N(D) is the drop distribution of the precipitation particles. The expression of the vertical reflectivity factor is

$$ {Z}_V=\underset{0}{\overset{D_{\mathrm{max}}}{\int }}N(D).{D_V}^6{d}_D $$
(2)

In the expression, N(D) is the size of the particle detected by radar in the vertical direction.

2.1.2 The differential reflectivity factor Z DR

The differential reflectivity factor Z DR is mainly calculated by the horizontal reflectivity Z H and vertical reflectivity factor Z V :

$$ {Z}_{\mathrm{DR}}=10\cdotp \log \left(\frac{Z_H}{Z_V}\right) $$
(3)

In the expression, Z DR is corresponding to the size and axial ratio of the precipitation particles, and the axis ratio is defined as a/b, in which a and b are respectively the horizontal and vertical axis radius of the particle.

2.1.3 The differential propagation phase change Φ DP and differential propagation phase constant K DP

The dual linear differential propagation phase change is defined as

$$ {\varPhi}_{\mathrm{DP}}={\varPhi}_{\mathrm{HH}}-{\varPhi}_{\mathrm{VV}} $$
(4)

where Φ HH and Φ VV are respectively the two-way phase angle at a certain distance from the arrival of the antenna when the radar signal is in the horizontal and vertical polarization. What is more, the differential propagation phase constant K DP can be defined as

$$ {K}_{\mathrm{DP}}=\frac{\varPhi_{\mathrm{DP}}\left({r}_2\right)-{\varPhi}_{\mathrm{DP}}\left({r}_1\right)}{2\left({r}_2-{r}_1\right)} $$
(5)

In the expression, K DP is the propagation constant difference of the horizontal and vertical polarization wave, and it contains the difference of the isotropic and anisotropic particles. Since the isotropic particles have similar phase shifts for the horizontal and vertical polarized waves, the difference of K DP is mainly due to the difference in the composition of the anisotropic particles. In general, K DP increases with the increase of the dielectric constant and ellipticity. Compared with the reflectivity factor, K DP is not sensitive to the change of particle distribution. Moreover, its measurement is not affected by partial beam blocking and isotropic particles; however, it depends on the density of particles.

2.1.4 The co-polar correlation coefficient ρ HV

The co-polar correlation coefficient is defined as the magnitude of the zero-lag correlation coefficient between the horizontal and the vertical polarization echo signal. What is more, it reflects the correlation of the backscatter characteristics between the horizontal and vertical polarized waves. As can be seen from the characteristics of ρ HV, the particle shape, the spatial orientation, and the number of precipitation particles are the main factors that affect its value.

Besides, the dual polarization Doppler radar also contains other polarimetric measurements such as linear depolarization ratio (LDR), covariance coefficient (CC), velocity (V), and spectral width (W).

2.2 The deep learning method for fuzzy logic classification

Deep learning aims at building a deep neural network model by simulating the learning process of brain. Combined with a lot of training data, the relationship among variables is to be obtained. Moreover, common network models include auto encoding network, deep belief, and convolution network. The model used in this paper is a convolutional neural network model [17], and it is a supervised multi-layer neural network, which is mainly composed of the convolution and the subsampling parts.

2.2.1 The selection of sample data

The data used in this study are Z H , Z DR, K DP, and ρ HV from the WSR-98D/XD dual polarization weather radar, which is located in Chengdu, Sichuan, China (30° 34′ N, 103° 55′ E), and its performance indicators are presented in Table 1.

Table 1 The main technical indicators of the WSR-98D/XD dual polarized weather radar

In most products of the dual polarization radar, the length of each range bin is 50 m, and the beam width is 1°. For the PPI scan mode, a volume scan consists of 14 elevation angles, and each elevation contains 350 to 370 radial directions; moreover, each radial direction includes 6000 range bin, while the valid data locate in the former 2000 range bins.

In this study, the data on each range bin is treated as a basic unit, so each basic unit should contains four polarization parameters, and they are Z H , Z DR, K DP, and ρ HV. Considering the high correlation of data between the range bin and its adjacent range bin, we choose to use a matrix of polarimetric measurements to reflect the type of precipitation particles corresponding to each range bin, and the matrix composed of the data of the range bin and adjacent range bin, which is near the range bin within 0.5 km. The implementation steps for obtaining the matrix are as follows.

Firstly, a sample data extraction window is established, and it is composed of 21 × 21 cells. Moreover, the length of each cell is 50 m, and it is equal to the length of a range bin.

Secondly, the data extraction window is used to obtain the matrix of polarimetric measurements corresponding to a range bin, and its center coincides with the range bin. For each cell of the data extraction window, when the range bin falls into it, the data from the range bin is used as the data of the cell.

Finally, through the above steps, we can obtain a sample of 21 × 21 × 4. In order to obtain the sample label, we adopt the voting principle, where the precipitation particle type with the most votes in the sample data is used as the label of the range bin.

In summary, for each range bin, a sample data matrix of 21 × 21 × 4 and a sample label data can be obtained.

2.2.2 The convolutional neural network

Convolutional neural network (CNN) is a multi-layer sensor, and it is inspired by the biological visual neural mechanism. It consists of multiple convolution and subsampling layers, with the ability to automatically extract the sample features. In the convolution layer, the neurons of each network layer are connected to the neurons of the upper layer only in a small neighborhood. Moreover, it can extract the features from input and keep the spatial structure of the original signal. Therefore, the image can be used for the input of the neural network, avoiding the complex process of data reconstruction and feature extraction in the traditional recognition algorithm. In the subsampling layer, the original data is compressed by sampling, which reduces the computational complexity and establishes the invariance of the spatial structure. In addition, CNN is more similar to the biological neural network by using the weight-sharing network structure, and it has achieved the best performance in pattern recognition. Nowadays, CNN has become an important research tool in many fields such as the image recognition and the automatic speech recognition.

CNN, a typical supervised learning algorithm, relies on a large number of labeled samples to train, and the back propagation (BP) algorithm is adopted in the training process. Once the original data is transmitted via the network, the corresponding output is obtained. If the actual output does not match the label data, the error is propagated backwards through the hidden layer to the input layer. It will be assigned to all the cells of each layer so as to obtain the error of signals in each unit, which are applied to correct the weights of the each unit. Therefore, the learning process can be summarized as the process of adjusting the weights by error back propagation, and it will continue to reduce the error until the error is less than the acceptable level or exceeds the number of iterations that has been set in advance.

2.3 The fuzzy classifier

The fuzzy logic method was first proposed by Zadeh in 1965. The traditional fuzzy logic method mainly includes four steps: fuzzification, inference, aggregation, and defuzzification. Considering the difference of the performance characteristics of precipitation particles among different precipitation patterns, different fuzzy logic classifiers are designed for the three precipitation patterns about rain, snow, and hail. These fuzzy logic classifiers of this paper use Z H ,Z DR, K DP, and ρ HV as input and then processed by fuzzification, inference, aggregation, and defuzzification. In the end, the results are transformed into a single precipitation particle type. The design process of the fuzzy logic method corresponding to the rainfall pattern is described in detail as follows, and its final output is 1 (drizzle), 2 (rain), and 3 (heavy rain). Its structure is presented in Fig. 1.

Fig. 1
figure 1

The structure of the fuzzy logic classifier corresponding to the rainfall pattern

2.3.1 Fuzzification and selection of membership function

The fuzzy logic algorithm for hydrometeor-type identification needs four input variables, which will be processed fuzzily firstly and be transformed into a fuzzy basis using the membership function. The certain input data can belong to different fuzzy basis, and it has different degree of membership in different fuzzy basis [9]. It is obvious that the selection of membership function is the most important part. By conducting a large number of comparative experiments, it is found that the beta membership function is best for the identification of hydrometeor type, and its expression is

$$ \beta \left(x,a,b,m\right)=\frac{1}{1+{\left|\frac{x-m}{a}\right|}^{2b}} $$
(6)

In the expression, the input variable is x, while the width of the function is a. Moreover, the slope of the function is b, and the range of output value is between 0 and 1.

2.3.2 Inference

After the previous analysis, the core of the classifier based on fuzzy logic mainly lies in the construction of membership function and rules. The IF-THEN rules for this hydrometeor-type classification can be described as follows:

IF (X 1 is MBF1j and X 2 is MBF2j and X 3 is MBF3j and X 4 is MBF4j ),

THEN hydrometeor is j.

where j = 1, 2, 3 corresponds to the three types of precipitation particles about drizzle, rain, and heavy rain and MBF ij represents the degree of membership function corresponding to the four input parameters. Moreover, the intensity R j of the j-type precipitation particles can be expressed by the following expression:

$$ {R}_j=\sum \limits_{i=1}^4\left[{W}_{ij}\times {\mathrm{MBf}}_{\mathrm{ij}}\left({X}_i\right)\right] $$
(7)

In the expression, W ij is the contribution of the ith parameter to the j-type precipitation particles and MBF ij (X i ) indicates the membership value of the characteristic parameter X i that corresponds to the jth-type particle.

2.3.3 Aggregation

The results obtained from the inference of individual rules are aggregated by the maximum aggregation method, and the maximum aggregation method regards the result of the maximum truth value as the final result. Moreover, the formula can be expressed as \( C=\underset{j}{\max}\left[{R}_j\right] \).

2.3.4 Defuzzification

The purpose of the fuzzy process is to find the index value which is corresponding to the maximum rule strength, and then, it will be outputted as the result.

3 Experimental design

The overview of the experimental design is presented in Fig. 2. In this study, two important modules are built to realize the hydrometeor-type identification system. The module of precipitation pattern classification system based on deep learning is applied to identify the precipitation pattern corresponding to a range bin, and the other module of precipitation particle identification system based on fuzzy logic is utilized to judge the hydrometeor-type corresponding to a range bin. For the above modules, the detailed introductions are as follows.

Fig. 2
figure 2

Overview of the hydrometeor-type identification system

3.1 The precipitation pattern classification system based on deep learning

In the system, the input is a matrix of 21 × 21 × 4 that reflects the precipitation pattern of a range bin, and the convolution neural network is applied to extract the features from the input. Moreover, the Softmax classifier is applied to classify the precipitation pattern such as rain, snow, and hail based on the features extracted by the convolution neural network algorithm. The structure diagram of this system is presented in Fig. 3.

Fig. 3
figure 3

The structure of the precipitation pattern identification system

As shown in Fig. 3, the system is composed of two convolution-subsampling layers and a Softmax classifier, and its workflow is mainly divided into the training process and the testing process. The detailed descriptions are as follows.

3.1.1 The training process

In the training process, the BP algorithm is used to adjust the parameters of the system based on a large number of labeled sample data, so that these parameters can meet the requirements of accurate classification. Its implementation steps are described as follows.

Firstly, the input data is convolution with the convolution layer, and the results are sampled by the subsampling layer. Repeat the above steps once; 40 features can be extracted from the input.

Secondly, the Softmax classifier is applied to identify the precipitation pattern based on the 40 features from step 1. The output including 1 (rain), 2 (snow), or 3 (hail) can be obtained.

Finally, if the actual output does not match the label data, the BP algorithm is applied to correct the weights of the each neuron in the system.

The training process can be summarized as the process of adjusting the weights by the BP algorithm, and it will continue to reduce the error until the error reaches to the acceptable level or exceeds the number of iterations that has been set in advance.

3.1.2 The testing process

The testing process uses some labeled samples to evaluate the accuracy of the system by comparing the output of the system with the label. In this study, we use many useful sample data detected in some special weather such as rain, snow, and hail to test the system; it is found that the system has a high accuracy with 81.17%.

3.2 The precipitation particle identification system based on fuzzy logic

The precipitation pattern corresponding to a range bin has been obtained in Section 3.1. Considering the different precipitation pattern including different precipitation particles, we need to build three different fuzzy logic classifiers corresponding to the three different precipitation patterns such as rain, snow, and hail. The relationship between the precipitation pattern and the hydrometeor type is presented in the Table 2.

Table 2 The relationship between the precipitation pattern and the hydrometeor type

As shown in Fig. 2, the system uses the Z H , Z DR, K DP, and ρ HV of the current range bin as input, and then, the input is processed by fuzzification, inference, aggregation, and defuzzification. In the end, the results are transformed into a single precipitation particle type.

4 Performance evaluation of the system

The assessment data is from a stratiform precipitation process detected at 1508 UTC 27 March 2017, and the main PPI products including the horizontal reflectivity (Z H ), differential reflectivity (Z DR), co-polar correlation coefficient (ρ HV), and differential propagation phase constant (K DP) are displayed in Fig. 4, which the corresponding detection elevation is 1.4°.

Fig. 4
figure 4

The radar measurements corresponding to the case of 27 March 2017. a Z H . b Z DR. c K DP. d ρ HV

The classification result of precipitation pattern corresponding to the assessment data is presented in Fig. 5.

Fig. 5
figure 5

The classification result of precipitation pattern corresponding to the case of 27 March 2017. a PPI of the classification result of precipitation pattern from the product of dual polarization weather radar. b PPI of the classification results of precipitation pattern from the classification system

As represented in Fig. 5, Fig. 5a shows the classification results from the radar product, while Fig. 5b shows the classification results from the precipitation pattern classification system. By comparing and analyzing the two images, it can be seen that their distribution is similar, and the difference is that Fig. 5b is more smoother than Fig. 5a. Some clutter interference has been removed, making the classification result better. It is mainly because that the system has used the feature matrix as the input, and the feature matrix consists of all the data of near the range bin within the scope of 1 km2 rather than a feature vector only from the range bin. Although this method will increase the complexity of data processing, it can remove the interference of clutter and noise. Moreover, the influence of measurement error in radar system on classification accuracy can be reduced effectively, so that the identification accuracy of the system can improve significantly.

As described in Section 3, the hydrometeor-type identification system has realized the identification of precipitation particles about crystals (CR), drizzle (LR), rain (RN), heavy rain (HR), wet snow (WS), dry snow (DS), ice crystals (IC), graupel (GR), ice hail (IH), and rain hail (RH), and the identification results corresponding to the assessment data are presented in Fig. 6.

Fig. 6
figure 6

The identification result of hydrometeor type corresponding to the case of 27 March 2017

The identification system in this paper mainly identifies the hydrometeor type within 100 km, and the distance between the range rings is 30 km. As presented in Fig. 6, rain is the main hydrometeor type in the region within 70 km from the radar station. Comparing to the Fig. 4, the intensity of the echo is ranging from 10 to 25 dBz, and the value of Z DR is mostly from − 1.5 to 1.5 dB. The value of K DP is ranging from − 1 to 1°/km, and the value of ρ HV is ranging from 0.95 to 1. From the distribution of these polarization parameters, it can be inferred that the precipitation particles in this region should be raindrops, and it is consistent with the classification results of the system.

As shown in these purple regions of the Fig. 6, its hydrometeor type mainly consists of ice crystals, graupel, and ice hail. The distance from the radar station is 70 km, and the height is about 1.7 km. Moreover, above these regions, its hydrometeor type is hail, and rain is the main hydrometeor type under these regions. It can be inferred that the freezing level is located in these regions. As presented in Fig. 4, the value of Z H is ranging from 25 to 40 dB, and the value of Z DR is mostly from 1 to 2.5 dB. Moreover, the value of ρ HV is ranging from 0.92 to 0.95. From the distribution of these polarization parameters, it can be inferred that these regions belong to the freezing level; and it is consistent with the identification results of the system.

5 Conclusions

An intelligent hydrometeor-type identification system for dual polarization Doppler weather radar data has been developed. It combines the main advantages of both deep learning and fuzzy logic algorithms, and it keeps the potential operational implementation reasonably simple. The hydrometeor-type identification system mainly consists of two modules. The first module, referred to as the precipitation pattern classification system based on deep learning in this paper, applies deep learning techniques to automatically extract useful features from the matrix of polarimetric measurements. Moreover, the Softmax classifier is used to classify the precipitation type based on the features extracted by the deep learning algorithm. The second module, referred to as the particle identification system based on fuzzy logic in this paper, applies the fuzzy logic methods to judge the hydrometeor type based on the Z H , Z DR, K DP, ρ HV, and the precipitation pattern which was classified by module 1.

In the system, because of the use of convolutional neural network algorithm for initial clustering, the interference of clutter and noise have been removed; meanwhile, the influence caused by the measurement error of polarization parameters can be reduced effectively. In addition, the fuzzy logic method is used for accurate clustering. The final classification result depends on the membership only rather than the specific values, so it won’t be affected by the inaccurate values of some parameters. From the March 27, 2017, case, we can see that the identification result agrees well with other polarized products.