1 Introduction

Owing to its flexibility, survivability and long-distance transmission capability, shortwave communication has long been retained and actively developed in the field of wireless communication. Automatic recognition of shortwave signals [1] is an important part of blind signal processing and an important basis for subsequent signal analysis, monitoring and countermeasures. With the development of modern shortwave communication technology, shortwave signals show a trend toward diversified types, finely differentiated specifications and the continual emergence of new signal types. Most traditional automatic signal recognition techniques operate at the closed-set level: when a new, unknown signal enters the system, the correct result cannot be obtained. Therefore, to meet the needs of convenience, intelligence and timeliness in modern blind signal processing, research on efficient open-set recognition of shortwave signals is of great value.

At present, most traditional signal recognition algorithms, as well as those based on deep learning, consider only the recognition of known signal types. When a new unknown signal type appears, it is recognized as one of the known signals, resulting in a discrimination error. To solve this problem, Ref. [2] proposed a support vector data description (SVDD) algorithm with density scaled classification margin (DSCM), which determines the margin between the hypersphere and the positive samples according to the relative density proportion of two types of positive training samples, and carries out open-set recognition in combination with support vector data description. However, the algorithm can distinguish only 2 types of positive sample signals, and it classifies all unknown signal types into one class. Ref. [3] extends the incremental support vector machine (ISVM) [4], combined with error correcting output codes (ECOC) [5], to multi-class incremental learning and recognition, but this algorithm cannot solve the forgetting problem in incremental learning. In addition, designing the coding matrix requires more prior information, the multi-class capability is restricted by the code length, and the model must be retrained every time a new signal is received, which leads to low efficiency.

The generative adversarial (GA) method has also been used to solve the open-set recognition problem. Ref. [6] combines an improved intra-class splitting (ICS) algorithm with the generative adversarial algorithm to obtain boundary signal samples, then trains on these boundary samples as unknown signal types to realize open-set recognition. However, the process of constructing boundary samples is complex and its effect is unstable, and the method cannot distinguish different types of unknown signals. Ref. [7] uses generative adversarial network theory to build a reconstruction and discrimination network (RDN) model to identify the modulation types of signals. However, the difference between the reconstructed signal data and real unknown signal data is difficult to control, and when there are more than 2 known signal types the classification and discrimination mechanism becomes very complex, which results in low operability. In addition, it is still unable to distinguish different types of unknown signals.

Among other methods, Ref. [8] uses the extreme-value Weibull distribution to fit the cut-off probability of the distance from a feature to its feature center, combines the categorical cross entropy with the center loss, and modifies the output of a dual-channel long short-term memory (DCLSTM) network to perform modulation recognition. This algorithm introduces the concepts of feature center and feature distance. In some cases it can distinguish different unknown signal types, but it cannot distinguish signals of different specifications that share the same modulation mode.

From the above analysis, the current open-set signal recognition algorithms have the following shortcomings: 1) some algorithms are applicable only to 2 types of known signals and no longer apply when the number of known signal types increases; 2) existing works focus on modulation recognition, and recognition of different specifications with the same modulation mode is hardly considered; 3) it is difficult to distinguish different types of unknown signals, since all unknown signals are assigned to a single class, called the ‘unknown class’.

In this paper, we propose a method that transforms the features of different signals into different regression values and uses these values to distinguish the signals. The contributions of the proposed method are as follows. Firstly, we design a dual-input neural network to fuse and map the feature information extracted from the signal data stream and the vector diagram. For better feature extraction, we design a network structure based on dense convolution theory. Secondly, unlike the traditional recognition network structure, we use the hyperbolic tangent (Tanh) activation function at the end of the network to perform numerical regression on signal features, establishing a one-to-one nonlinear mapping between signal feature and specific value. Thirdly, we test the network in the closed set, using the upper and lower quintile algorithm to obtain the regression discrimination threshold of each known signal and the center distance threshold for unknown signals. Finally, we perform open-set experiments to demonstrate the effectiveness of the proposed method.

2 Distinguishing Features of Shortwave Signal

2.1 Data Stream

Each shortwave standard has its own generation algorithm and transmission specification. These rules give its signal data stream a unique information organization format. Taking MIL-STD-188-110A (110A) [9], MIL-STD-188-141B (141B) [10] and Link11 SLEW [11] as examples, the typical information transmission formats are shown in Fig. 1.

Fig. 1. Typical transmission format for shortwave 110A, 141B and Link11 SLEW signals. The information format of the 110A signal consists of a preamble sequence, data sequence, end field and flush field. 141B consists of a protection sequence, preamble sequence and valid data. Link11 SLEW consists of a header sequence, phase reference sequence, start code, tactical data and supervised stop code.

We can conclude that the data transmission structure of each signal is unique, and that the bit lengths of the sequences and fields differ between signals. These differences give the received 110A, 141B and Link11 data streams the data characteristics of their respective signals. On this basis, if a high-performance and robust feature extraction algorithm can be found for signal data, the features extracted from the signal data stream can serve as the recognition criterion for distinguishing different shortwave signal types.

2.2 Vector Diagram

A vector diagram shows the symbol trajectory by reconstructing the two channels of received signal data in time order. It can distinguish not only frequency shift keying (FSK) from phase shift keying (PSK), but also signals with different PSK modulation modes, as shown in Fig. 2. The symbols of PSK signals have fixed phases, so their vector diagrams take the form of constellation points and symbol trajectories, whereas the phase of FSK signals is random during symbol transitions, so their vector diagrams take the form of a circle.

Fig. 2. Vector diagrams of shortwave signals. Signals with different modulation modes have different vector diagram forms.

In this paper, the signal vector diagram is used as a supplementary feature extraction source. Using the powerful feature processing ability of the neural network, the signal specification features represented by the data stream and the modulation features represented by the vector diagram are fused, and then learned and mapped, to further improve signal recognition performance.
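As a rough illustration of how a vector diagram might be turned into a fixed-size network input, the sketch below rasterizes two-channel samples into a hit-count image. It is only a sketch under stated assumptions: the two channels are taken to be in-phase and quadrature components, the function name `vector_diagram` and the hit-count representation are hypothetical, and the 128 × 128 size is the one quoted later in Sect. 4.

```python
import math
import random


def vector_diagram(i_samples, q_samples, size=128):
    """Rasterize an I/Q trajectory into a size x size hit-count image.

    Assumption: each sample pair marks one pixel; the paper does not
    specify how its vector diagrams are rendered.
    """
    peak = max(max(abs(v) for v in i_samples),
               max(abs(v) for v in q_samples)) or 1.0
    image = [[0] * size for _ in range(size)]
    for i_val, q_val in zip(i_samples, q_samples):
        # Map [-peak, peak] onto pixel indices [0, size - 1];
        # the Q axis is flipped so that positive Q points upward.
        col = min(size - 1, int((i_val / peak + 1.0) / 2.0 * size))
        row = min(size - 1, int((1.0 - q_val / peak) / 2.0 * size))
        image[row][col] += 1
    return image


# Illustrative input: noisy QPSK symbol points (synthetic, not paper data).
rng = random.Random(0)
phases = [math.pi / 4 + k * math.pi / 2 for k in range(4)]
symbols = [rng.choice(phases) for _ in range(2000)]
i_data = [math.cos(p) + rng.gauss(0, 0.05) for p in symbols]
q_data = [math.sin(p) + rng.gauss(0, 0.05) for p in symbols]
diagram = vector_diagram(i_data, q_data)
```

For a PSK-like input the hits concentrate around a few constellation points, whereas samples of constant envelope and freely rotating phase would spread around a ring, which matches the qualitative difference described above.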

3 Proposed Method

In this section, we first describe the dual-input neural network architecture of our method, then we present the algorithm for obtaining the discrimination threshold. Finally, we demonstrate the procedure of the proposed scheme.

3.1 Dual-Input Regression Neural Network

Regression analysis (RA) is a statistical method for determining the relationship between two or more variables. We construct a dual-input regression neural network to map the extracted signal features to specific values. By using the differences between the numerical regression results, we can distinguish different signals in the open-set range.

The proposed dual-input regression neural network is illustrated in Fig. 3. Feature extraction is conducted by 7 feature extraction modules, whose structure is shown in Fig. 4. Adjacent feature extraction modules are connected through a transformation module, each of which contains a 1 × 1 convolution and a 2 × 2 average pooling. After features are extracted by the above \((66{ + }18) \times 2 + 5 = 173\)-layer network and a 7 × 7 global average pooling is applied, the acquired feature information is fused by concatenation, and the nonlinear relationship between signal feature and specific value is then established by regression processing. Except at the end of the network, the rectified linear unit (ReLU) is used in each layer. During compilation and optimization of the network, the Adam algorithm is used to optimize the network parameters.

Fig. 3. Structure of dual-input regression neural network. The data matrix branch contains 4 feature extraction modules and the vector diagram branch contains 3. Each feature extraction module contains different numbers of connection nodes.

Fig. 4. Structure of the feature extraction module designed based on densely connected convolution [12], which has a better performance than residual structure [13].

At the end of the network, the Tanh activation function is used for regression from the signal feature vectors to the preset specific values:

$$ {\text{Tanh}} (x) = \frac{{e^{x} - e^{ - x} }}{{e^{x} + e^{ - x} }},x \in ( - \infty , + \infty ) $$

It is compared here with the Sigmoid activation function, which is widely used in regression operations:

$$ {\text{Sigmoid}}(x) = \frac{1}{{1 + e^{ - x} }},x \in ( - \infty , + \infty ) $$

The Sigmoid activation function may change the distribution of the original data to some extent, as shown in Fig. 5, while Tanh does not. Moreover, Tanh has a larger maximum gradient (1 at the origin, versus 0.25 for Sigmoid), so convergence is faster in the regression operation and a better training effect can be achieved.

Fig. 5. Comparison between the Tanh and Sigmoid activation functions. The Sigmoid output is not zero-mean and lies in (0,1), which changes the distribution of the original data to a certain extent. The Tanh output is zero-mean and lies in (−1,1), which avoids this problem.
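The two properties compared above (output centering and gradient size) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical and the check is illustrative rather than part of the proposed method.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Inputs placed symmetrically around zero.
xs = [k / 10.0 for k in range(-30, 31)]
mean_tanh = sum(math.tanh(x) for x in xs) / len(xs)
mean_sigmoid = sum(sigmoid(x) for x in xs) / len(xs)

print("mean Tanh output:   ", round(mean_tanh, 6))
print("mean Sigmoid output:", round(mean_sigmoid, 6))
print("gradients at x = 0: ", tanh_grad(0.0), sigmoid_grad(0.0))
```

For symmetric inputs the Tanh outputs average to zero while the Sigmoid outputs average to 0.5, and the gradient at the origin is 1 for Tanh against 0.25 for Sigmoid. Note that the advantage holds near the origin: for large |x| both functions saturate and both gradients approach zero.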

3.2 Discrimination Threshold

After regression of a specific signal over several samples, the resulting values fall into a small range. In this paper, the upper and lower quintile algorithm is used to obtain the interval threshold and the center distance threshold of the known signals. The interval threshold is the basis for distinguishing known from unknown signals, and the center distance threshold is the length used when intercepting the numerical clusters of unknown signals. Suppose that, after regression processing of a known signal S, the numerical distribution of several samples is as shown in Fig. 6.

Fig. 6. Diagram of upper and lower quintile algorithm. The outliers of the numerical regression results are removed through this algorithm, and the appropriate threshold is obtained.

Define \(\gamma_{low}\) as the lower quintile of the data set, meaning that only 1/5 of all data have values less than \(\gamma_{low}\). Similarly, define \(\gamma_{up}\) as the upper quintile of the data set, meaning that only 1/5 of all data have values greater than \(\gamma_{up}\). According to the upper and lower quintile algorithm, the interval threshold of the regression value for signal S is defined as:

$$ \left\{ \begin{gathered} \delta_{low} = \gamma_{low} - \mu (\gamma_{up} - \gamma_{low} ) \hfill \\ \delta_{up} = \gamma_{up} + \mu (\gamma_{up} - \gamma_{low} ) \hfill \\ \end{gathered} \right. $$

where \(\delta_{low}\) is the lower bound threshold of the regression value for signal S, \(\delta_{up}\) is the upper bound threshold, and \(\mu\) is the scale factor, set to 1.5 in this paper. In addition, \(\delta_{up} - \delta_{low}\) is the distance between the upper and lower thresholds of the regression for signal S. After the regression test of the known signals in the closed set, the center distance threshold D is calculated as:

$$ D = \lambda \frac{1}{2J}\sum\limits_{j = 1}^{J} {(\delta_{up}^{(j)} - \delta_{low}^{(j)} )} $$

D is used as the length for the subsequent center-distance interception of the numerical clusters of unknown signals. In Eq. (4), J is the number of known signal types, \(\delta_{up}^{(j)}\) and \(\delta_{low}^{(j)}\) are the upper and lower bound thresholds of the j-th known signal, and \(\lambda\) is the grace factor, set to 1.38 in this paper.
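The interval thresholds and the center distance threshold D defined above can be sketched in Python as follows. This is a minimal sketch rather than the authors' implementation: the function names are hypothetical, and the quantile estimator (`statistics.quantiles` with its default interpolation) is an assumption, since the text does not state how the quintiles are estimated from a finite sample.

```python
from statistics import quantiles

MU = 1.5       # scale factor given in the text
LAMBDA = 1.38  # grace factor given in the text


def interval_threshold(values, mu=MU):
    """Return (delta_low, delta_up) for one known signal's regression values."""
    cuts = quantiles(values, n=5)  # four cut points: 20 %, 40 %, 60 %, 80 %
    gamma_low, gamma_up = cuts[0], cuts[-1]
    spread = gamma_up - gamma_low
    return gamma_low - mu * spread, gamma_up + mu * spread


def center_distance_threshold(intervals, lam=LAMBDA):
    """D = lambda / (2J) * sum of interval widths over the J known signals."""
    widths = [up - low for low, up in intervals]
    return lam * sum(widths) / (2 * len(widths))


# Illustrative regression outputs for two known signals (made-up numbers).
known_outputs = {
    "signal_a": [-0.02, -0.01, 0.0, 0.0, 0.01, 0.01, 0.02, 0.03, 0.15],
    "signal_b": [0.97, 0.98, 0.99, 1.0, 1.0, 1.01, 1.02, 1.03, 1.04],
}
intervals = {name: interval_threshold(v) for name, v in known_outputs.items()}
D = center_distance_threshold(list(intervals.values()))
```

Because the thresholds are built from the quintiles rather than from the minimum and maximum, a single stray value such as the 0.15 in `signal_a` has little influence on the resulting interval.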

3.3 Algorithm Scheme

According to the above discussion, the open-set recognition process is as follows:

1) Preprocess the known shortwave signals and construct the training data sets;

2) Use the training data set to train the network. When the network’s loss value falls below the preset threshold, training is terminated and the network is saved;

3) Since the network cannot achieve zero-error regression, use the trained network to test the known signals. With the upper and lower quintile algorithm, obtain the interval threshold of each known signal and the center distance threshold, which serve as the standard for distinguishing known from unknown signals and for the subsequent interception of unknown signals;

4) In the open-set range, use the network to recognize the preprocessed signals. If the regression value of a signal falls within the interval threshold of a known signal from step 3), it is judged to be that known signal; if it falls outside the interval thresholds of all known signals, it is judged to be an unknown signal;

5) Use the kernel density clustering algorithm [14] to cluster all regression values identified as unknown signals, obtaining the number of categories, the numerical clusters and the corresponding density center coordinates. For each numerical cluster, intercept around the density center coordinate using the center distance threshold. The signal samples whose regression values fall within the interception range are identified as that unknown signal, which completes the open-set recognition.
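Steps 4) and 5) can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the paper: the kernel density clustering of [14] is replaced by a plain gap-based grouping of the sorted one-dimensional regression values, the group median stands in for the density center, and all function names and sample numbers are hypothetical.

```python
from statistics import median


def classify_known(value, intervals):
    """Step 4: return the known label whose interval contains value, else None."""
    for label, (low, up) in intervals.items():
        if low <= value <= up:
            return label
    return None


def group_unknown(values, gap):
    """Stand-in for kernel density clustering: split the sorted values
    wherever two neighbours are more than `gap` apart."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= gap:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


def open_set_recognize(values, intervals, center_distance):
    """Label each regression value as a known signal, 'unknown-k', or None."""
    labels = [classify_known(v, intervals) for v in values]
    unknown = [v for v, lab in zip(values, labels) if lab is None]
    centers = [median(g) for g in group_unknown(unknown, center_distance)]
    for idx, v in enumerate(values):
        if labels[idx] is not None:
            continue
        # Step 5: intercept around each center with the center distance threshold.
        for c_idx, center in enumerate(centers):
            if abs(v - center) <= center_distance:
                labels[idx] = f"unknown-{c_idx + 1}"
                break
    return labels


# Illustrative thresholds and regression values (made-up numbers).
intervals = {"110A": (-0.1, 0.1), "141B": (0.9, 1.1)}
values = [0.02, 1.05, 0.48, 0.50, 0.52, 1.60, 1.62]
labels = open_set_recognize(values, intervals, center_distance=0.06)
```

Values that fall outside every known interval and outside every interception range keep the label `None`, which corresponds to samples that are neither recognized as known nor assigned to an unknown cluster.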

4 Experimental Results

In this section, the recognition performance of the proposed method is simulated and tested. The experimental platform is configured with an Intel(R) Xeon(R) E-2276M processor, an NVIDIA Quadro RTX 5000 GPU and 32 GB of DDR4 memory.

The signals used in the experiment comprise 6 types: 110A, MIL-STD-188-110B (110B) [15], MIL-STD-188-141A (141A) [16], 141B, Link11 SLEW and PACTOR [17]. The signal settings of the experiment are shown in Table 1. In the experiment, 110A, 141B, Link11 SLEW and PACTOR are used as known signals for network training and are set to regress to the values 0, 1, 2 and 3, respectively. 110B and 141A serve as unknown signals and are not used for training. After the discrimination thresholds are obtained according to Sect. 3.2, 110B and 141A are used as network input together with the 4 known signals in the open-set test stage.

Table 1. Attributes of experimental signal samples

For the vector diagram, the size is set to 128 × 128 to fit the structure of the network. For the data stream, the network’s performance is affected by changes in the statistical distribution of the data, which lead to an inconsistent dynamic range across computation dimensions and a decline in learning performance. Therefore, the following normalization is adopted:

$$ {\text{Norm}}(data) = \frac{{data - \frac{\max (data) + \min (data)}{2}}}{\max (data) - \min (data)}{ + }0.5 $$

where \(data\) represents the signal data before normalization and \({\text{Norm}}(data)\) is the data after normalization. With normalization, the network processes data at the same scale and achieves better learning and regression performance. In addition, since the neural network operates efficiently on two-dimensional data structures, the normalized data are arranged into a 336 × 336 data matrix.
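The normalization and matrix construction can be sketched as below. This is a minimal sketch with hypothetical function names; the row-major fill order and the truncation to the first 336 × 336 samples are assumptions, as the text does not describe how the matrix is filled.

```python
def normalize(data):
    """Centre on the mid-range, scale by the range, then shift by 0.5."""
    hi, lo = max(data), min(data)
    span = hi - lo
    if span == 0:
        raise ValueError("constant input cannot be normalized")
    mid = (hi + lo) / 2.0
    return [(x - mid) / span + 0.5 for x in data]


def to_matrix(data, side=336):
    """Arrange the first side*side samples into a side x side matrix (row-major)."""
    needed = side * side
    if len(data) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(data)}")
    return [data[r * side:(r + 1) * side] for r in range(side)]


# Illustrative use on a synthetic ramp of 336 * 336 samples.
stream = [float(k % 1000) for k in range(336 * 336)]
matrix = to_matrix(normalize(stream))
```

Algebraically, the formula is equivalent to min-max scaling, \((data - \min (data))/(\max (data) - \min (data))\), so the normalized values lie in [0, 1].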

4.1 Recognition Performance

Table 2 shows the open-set recognition results of the proposed method at a signal-to-noise ratio (SNR) of 6 dB. After the regression operation, the 4 known signals 110A, 141B, Link11 SLEW and PACTOR do not regress exactly to their preset values but show slight deviations. Therefore, according to the upper and lower quintile algorithm in Sect. 3.2, the upper and lower bound regression thresholds of each known signal are obtained to distinguish known from unknown signals. The center distance threshold obtained for center-distance interception of unknown signals is 0.0581. The experimental results indicate that at an SNR of 6 dB the recognition accuracy of known signals exceeds 96%, which verifies the feasibility of the proposed method.

Table 2. Open-set recognition results of the proposed method

Once regression processing is complete, the kernel density clustering algorithm is used to obtain the numerical clusters and density centers of the unknown signals, which are then intercepted using the center distance threshold. The proposed method distinguishes unknown signal 1 (110B) with a recognition accuracy of 90.1%, and unknown signal 2 (141A) with a recognition accuracy of 99.20%.

Overall, traditional open-set recognition methods apply to few signal types, have difficulty distinguishing signals of different specifications with the same modulation mode, and have difficulty distinguishing different unknown signals. By contrast, the proposed method can effectively handle the open-set signal data set, in which 4 signals use the 8PSK modulation mode, and can distinguish different types of unknown signals.

4.2 Influence of Numerical Scale on Regression

This section discusses the influence of the training regression scale on network performance through comparative experiments. Table 3 shows the training regression values used in 2 experiments on the known signals 110A, 141B, Link11 SLEW and PACTOR. During the training stage, the 4 known signals are regressed to the values 0, 1, 2, 3 in one experiment and 0, 100, 200, 300 in the other.

In order to better observe the result, signal samples are input into the network in the order of signal type during the test stage. The corresponding relationship between signal sample type and signal serial number is shown in Table 4.

Table 3. Training regression value of each experiment

Each signal type has 1000 samples. The regression results of each experiment are shown in Fig. 7. It can be seen that when a different regression scale is set, the network carries out numerical regression according to the preset scale, and the results of both experiments show good discrimination.

Table 4. Corresponding relationship between signal sample type and serial number

This is because, although the numerical scales differ, once the network completes training at a given scale, it forms a nonlinear mapping that matches this scale. In other words, training at a different scale changes only the numerical magnitude of the regression results and does not affect the discrimination performance between signals.

Fig. 7. Numerical regression results at different scales of training regression value. The experimental results show that the numerical scale of the regression does not affect the discrimination of signals.

5 Conclusions

By combining the feature information of the shortwave signal data stream and the vector diagram, an open-set signal recognition method is proposed. Using the feature extraction ability of densely connected convolution and the feature processing and regression performance of the dual-input regression neural network, the open-set signal recognition task is accomplished. Experimental results show that, compared with traditional methods, the proposed method can distinguish different types of unknown signals while maintaining open-set recognition accuracy, and can effectively distinguish signals of different specifications with the same modulation mode. In addition, this paper proposes establishing a regression relationship between signal feature and specific value, so that the features of different signal types are embodied as different regression values. This idea of transforming feature information for processing provides a new approach for further research in this field.