1 Introduction

Ensuring signal authenticity is critical for Global Navigation Satellite System (GNSS) users, particularly since intentional attacks on GNSS receivers are feasible during signal transmission or when GNSS data are passed to location-based applications (Borio et al. 2017). In the field of Wireless Local Area Networks (WLAN), wireless device fingerprinting is a well-established strategy for addressing the issue of signal authenticity: fingerprints/signatures derived from device-specific metrics are used to identify individual devices or to separate different devices (Xu et al. 2015; Polak and Goeckel 2015).

GNSS receiver fingerprinting has accordingly been investigated to preliminarily discriminate receivers in static scenarios (Borio et al. 2016). The motivation is that receiver clock errors show a well-defined behaviour, i.e. frequency stability, owing to the clocks’ physical properties. Borio et al. (2017, 2016) showed that a combination of three clock-specific features enables the separation of a few geodetic receivers from a few mass-market receivers. Note that the Temperature-Compensated Crystal Oscillator (TCXO) embedded in GNSS receivers has limited long-term frequency stability and is highly sensitive to environmental factors such as accelerations and vibrations (Jain and Schön 2020). Thus, the reliability of clock-related features derived from an internal clock for fingerprinting receivers is not guaranteed. High-precision atomic clocks, by contrast, show stable performance. A comprehensive overview of the foundational principles of quartz oscillators and atomic clocks is given in Teunissen and Montenbruck (2017). By equipping GNSS receivers with precise clocks such as miniature atomic clocks (MAC) or chip-scale atomic clocks (CSAC), the vertical accuracy of the positioning results can be significantly improved, and navigation using only three satellites becomes possible (clock coasting; Sturza 1983; Weinbach and Schön 2011; Krawinkel and Schön 2014). It has also been shown that the holdover performance of a CSAC-aided GNSS receiver, i.e. its recovery from signal outages, is consistently better than that of a receiver with a standard oscillator once the outage exceeds 1 min (Fernández et al. 2017).

This paper investigates the potential of GNSS receiver fingerprinting in static and dynamic conditions by utilizing CSACs as the receivers’ external clocks. The clock-specific features used as fingerprints are presented in Sect. 2. Section 3 describes the approaches adopted for feature extraction. The GNSS measurements collected in various scenarios are introduced in Sect. 4. Finally, the feasibility of fingerprinting using clock-specific features is analyzed in Sect. 5, followed by the conclusions in Sect. 6.

2 Clock-Derived Features

Clock fingerprints derive from unique clock-related features tied to frequency stability, which describes how well the instantaneous frequency adheres to the nominal frequency over time and thus reflects a clock’s unique physical behaviour. Consequently, 13 such features, consistent with those in Borio et al. (2017, 2016), are extracted from the metrics summarized in the following.

2.1 Allan Deviation (ADEV)

ADEV, the standard deviation of the first differences of fractional frequency values, is the most common way to measure frequency stability in the time domain. The overlapping ADEV improves on the original ADEV by utilizing all possible overlapping sample combinations, leading to better confidence of the estimate (Riley 2008). Equation 1 gives the overlapping ADEV \(\sigma _y(\tau )\), where \(y\) represents the fractional frequency samples determined from the receiver clock drift and \(N\) is the total number of samples. The averaging time \(\tau \) is the product of the averaging factor \(n\) and the sampling interval \(T_s\) (\(\tau = n\cdot T_s\)).

$$\displaystyle \begin{aligned} \sigma_y^2(\tau) = \frac{1}{2n^2(N-2n+1)} \sum_{j=1}^{N-2n+1} \left( \sum_{i=j}^{j+n-1} \left( y_{i+n} - y_i \right) \right)^2 {} \end{aligned} $$
(1)
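As a minimal illustration of Eq. (1), the following Python sketch computes the overlapping ADEV for a single averaging factor; the function name and the NumPy-based implementation are our own assumptions and not part of the original processing chain.

```python
import numpy as np

def overlapping_adev(y, n, T_s=1.0):
    """Overlapping Allan deviation of fractional frequency samples y
    for averaging time tau = n * T_s, following Eq. (1)."""
    y = np.asarray(y, dtype=float)
    N = y.size
    if N < 2 * n + 1:
        raise ValueError("too few samples for this averaging factor")
    d = y[n:] - y[:-n]                               # y_{i+n} - y_i
    # inner sums over i = j .. j+n-1 for all N-2n+1 overlapping start indices j
    inner = np.convolve(d, np.ones(n), mode="valid")
    avar = np.sum(inner ** 2) / (2.0 * n ** 2 * (N - 2 * n + 1))
    return np.sqrt(avar)
```

Evaluating such a function over a range of averaging factors yields stability curves of the kind shown in Fig. 1a.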

Figure 1a gives an example of the overlapping ADEV for various oscillators. Evidently, the Rubidium frequency standard (SRS PRS10) has the best frequency stability in both the short and the long term. The advantage of the CSACs is their stable performance at long averaging times, where the stability of the quartz oscillators degrades; the OCXO, however, shows good short-term stability. Based on these different behaviours, the candidate features, denoted \(OA_x\), are chosen as the values at the short averaging times of 1 s (\(OA_1\)) and 30 s (\(OA_{30}\)) and their slope (\(OA_{slope}\)). The minimum value \(OA_{min}\) and the averaging time \(\tau _{min}\) at which it occurs are also expected to be useful for differentiating the clocks.
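A possible way to turn the ADEV curve into the five candidate features is sketched below, reusing the overlapping_adev function from the previous sketch. The log-log definition of the slope and the range of averaging factors are our own assumptions, since these details are not fixed in the text.

```python
import numpy as np

def adev_features(y, T_s=1.0, max_factor=100):
    """Five ADEV-derived candidate features: OA_1, OA_30, OA_slope, OA_min, tau_min."""
    factors = np.arange(1, max_factor + 1)
    taus = factors * T_s
    curve = np.array([overlapping_adev(y, int(n), T_s) for n in factors])
    OA_1 = curve[np.argmin(np.abs(taus - 1.0))]
    OA_30 = curve[np.argmin(np.abs(taus - 30.0))]
    # assumed definition: slope of the log-log ADEV curve between 1 s and 30 s
    OA_slope = (np.log10(OA_30) - np.log10(OA_1)) / (np.log10(30.0) - np.log10(1.0))
    OA_min = curve.min()
    tau_min = taus[curve.argmin()]
    return OA_1, OA_30, OA_slope, OA_min, tau_min
```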

Fig. 1

Metrics of frequency stability for oscillators including CSACs (Jackson Labs CSAC (blue), Microsemi CSAC (green)), the high-precision Rubidium standard Stanford Research Systems PRS10 (purple), and quartz oscillators (Jackson Labs OCXO (orange), TCXO (yellow)). The metrics are derived from static GNSS data collected for Project VENADU-A2 (Krawinkel and Schön 2014). (a) Overlapping ADEV. (b) MTIE (left, solid) and rmsTIE (right, dashed). (c) Autocorrelation

2.2 Time Interval Error (TIE)

Time Interval Error (TIE) is another measure of a clock’s time errors. It describes the time error variation over a time interval \(\tau \) starting from the epoch \(t_0\), as defined in Eq. 2. TE denotes the time error, i.e. the difference between the instantaneous time and the ideal time (Bregni 2002); it can be computed as the integral of the frequency errors (\(\sum _{i=0}^n y_i T_s\), with n the sample lag, analogous to the averaging factor n in Eq. 1) (Borio et al. 2016). On this basis, the maximum TIE (MTIE, Eq. 3) and the root mean square of TIE (\(\mathit {TIE}_{\mathit {rms}}\), Eq. 4) are meaningful for characterizing a clock’s stability behaviour. Different from measures determined by averaging data samples, MTIE refers to the largest peak-to-peak variation of TE within any window of length \(\tau \) over the measurement period T, as described in Eq. 3 (Bregni 2002).

$$\displaystyle \begin{aligned} \mathit{TIE}(\tau) &= \mathit{TE}(t_0+\tau)-\mathit{TE}(t_0){} \end{aligned} $$
(2)
$$\displaystyle \begin{aligned} \mathit{MTIE}(\tau) &= \max\limits_{t_0=0}^{T-\tau} \left( \max\limits_{t=t_0}^{t_0+\tau}\mathit{TE}(t) - \min\limits_{t=t_0}^{t_0+\tau}\mathit{TE}(t) \right) {} \end{aligned} $$
(3)
$$\displaystyle \begin{aligned} \mathit{TIE}_{\mathit{rms}}(\tau) &= \sqrt{\frac{1}{N-n}\sum\nolimits_{t_0=1}^{N-n}\mathit{TIE}(\tau)^2}{} \end{aligned} $$
(4)
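Under the same assumptions as before (our own function names, NumPy), Eqs. (2)–(4) can be evaluated for a single averaging time as sketched below; TE is obtained by integrating the frequency errors as described above.

```python
import numpy as np

def tie_metrics(y, n, T_s=1.0):
    """MTIE and rms TIE for averaging time tau = n * T_s (Eqs. 2-4)."""
    te = np.cumsum(np.asarray(y, dtype=float)) * T_s            # time error TE(t)
    # TIE(tau) for every start epoch t0 (Eq. 2)
    tie = te[n:] - te[:-n]
    rms_tie = np.sqrt(np.mean(tie ** 2))                        # Eq. (4)
    # MTIE: largest peak-to-peak TE excursion within any window of n+1 epochs (Eq. 3)
    win = np.lib.stride_tricks.sliding_window_view(te, n + 1)
    mtie = np.max(win.max(axis=1) - win.min(axis=1))
    return mtie, rms_tie
```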

From Fig. 1b, we can see that the \(\mathit {MTIE}\) and \(\mathit {TIE}_{\mathit {rms}}\) curves of the individual oscillators are less distinctive than \(\sigma _y(\tau )\). The curves rise with averaging time with very similar slopes, especially for the CSACs. Nevertheless, the values at 1 s and 30 s and their slope still broadly describe the clocks’ behaviour. Thus the features \(\mathit {MTIE}_{\mathit {1}}\), \(\mathit {MTIE}_{\mathit {30}}\), \(\mathit {MTIE}_{\mathit {slope}}\), \(\mathit {rmsTIE}_{\mathit {1}}\), \(\mathit {rmsTIE}_{\mathit {30}}\) and \(\mathit {rmsTIE}_{\mathit {slope}}\) are extracted as potential fingerprinting features.

2.3 Correlation Between Time Series

In Polak and Goeckel (2015) and Borio et al. (2016), the autocorrelation of normalized frequency errors is utilized to produce features serving as fingerprints of oscillators. The autocorrelation curves of several oscillators from short to long time lags are shown in Fig. 1c. It is noticeable that the internal clock generates a highly correlated time series (\(\sim \)1) until the lag increases to \(\sim 10^3\) s. The other quartz oscillator (orange) also shows correlation, whereas the time series of the CSACs decorrelate quickly within \(\sim \)30 s. To capture these distinguishing characteristics, the candidate features are chosen as the correlations at time lags of 20 s (\(R_{20}\)) and 60 s (\(R_{60}\)).
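A short sketch of the two correlation features follows, under the assumption that the sample autocorrelation of the z-normalized frequency errors is used (the function name is hypothetical).

```python
import numpy as np

def correlation_features(y, T_s=1.0):
    """Autocorrelation of normalized frequency errors at lags of 20 s and 60 s."""
    y = np.asarray(y, dtype=float)
    y = (y - y.mean()) / y.std()                    # normalize the frequency errors
    N = y.size

    def acf(lag_s):
        k = int(round(lag_s / T_s))                 # lag in samples
        return float(np.dot(y[:-k], y[k:]) / (N - k))

    return acf(20.0), acf(60.0)                     # R_20, R_60
```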

3 Feature Extraction

Feature extraction is a way to obtain practical features for fingerprinting. Attention should be paid to reducing the feature dimension because features can be redundant (Xu et al. 2015). The idea is to form a feature set by either creating important new features from, or directly selecting several essential features among, the candidate features. Three related machine learning approaches are presented in this section, essentially recasting clock fingerprinting as a classification problem.

3.1 Pre-Processing Procedures

First of all, a series of pre-processing steps is applied to obtain more precise frequency data, so that the candidate features better reflect the clocks’ real stability behaviour. Figure 2 outlines the pre-processing steps for the GNSS raw data. The receiver clock drift is first estimated by passing the Doppler observations through a Single Point Positioning (SPP) estimation. This raw frequency data then undergoes further processing: small gaps are filled, deterministic effects such as frequency offsets and frequency drifts are subtracted, and outliers are removed.

Fig. 2

Pre-processing flowchart of fingerprinting

To determine the minimum observation duration for extracting reliable features, the processed static data sampled at 1 Hz are divided into non-overlapping segments. The segment length increases from 20 min to 120 min in 10 min increments. For the kinematic data sampled at 10 Hz, the segment length starts at 30 s and increases in 1 min increments. Note that a longer segment duration allows the extracted features to represent the clocks’ frequency stability more broadly, but in our case reduces the sample size because the total data duration is fixed; a shorter segment duration has the opposite effect. For instance, for a five-day measurement sampled at 1 Hz, the number of segments ranges between 360 and 60. If five receivers are measuring simultaneously, the accumulated sample size n increases correspondingly, ranging from 1800 to 300. For each segment/sample, a feature vector of 13 features is then computed. Hence, for each dataset of a specific observation duration, a feature matrix \(A\) of dimension \(n\times 13\) is compiled for classification by stacking the feature vectors.
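The segmentation into feature samples might be organized as in the sketch below; clock_runs and extract_13_features are hypothetical placeholders for the pre-processed frequency series and for a routine returning the 13 candidate features of Sect. 2.

```python
import numpy as np

def build_feature_matrix(clock_runs, segment_len, extract_13_features):
    """Stack one 13-element feature vector per non-overlapping segment.

    clock_runs  : dict mapping a clock label to its pre-processed frequency series
    segment_len : segment length in samples (e.g. 30*60 for 30 min of 1 Hz data)
    """
    X, labels = [], []
    for label, y in clock_runs.items():
        for s in range(len(y) // segment_len):
            seg = y[s * segment_len:(s + 1) * segment_len]
            X.append(extract_13_features(seg))      # the 13 candidate features
            labels.append(label)
    return np.vstack(X), np.array(labels)           # feature matrix A is n x 13
```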

3.2 Singular Value Decomposition and Support Vector Machine

A widely-used method for reducing the data dimension is Principal Component Analysis (PCA) (Bishop and Nasrabadi 2006). It projects the features into a lower-dimensional space in which the most important information lies along the dimensions with the greatest variances; these dimensions are mutually orthogonal and therefore de-correlated. This can be realized by a Singular Value Decomposition (SVD) of the feature matrix, whose right singular vectors rank the importance of the information contained in the columns of the original data.

Equation 5 specifies this process. The columns \(C_i\) of the feature matrix A are first normalized by \(\frac {C_i-\min (C_i)}{\max (C_i)-\min (C_i)}\) because the features have different units and magnitudes. Decomposing the normalized A, we obtain a matrix \(\Sigma \) containing the singular values arranged on the diagonal from large to small, i.e. from important to unimportant. The matrix of right singular vectors V consists of the columns \(V_{i,13\times 1}\) corresponding to the singular values in the same order. If the cumulative proportion of the first m singular values exceeds the empirical \(95\%\) threshold, we consider the first m columns of V to contain sufficient information to describe A. Hence, the new feature matrix \(A'\) is derived by the multiplication in Eq. 5. Specifically, each new feature element is calculated as \(\sum _{i=1}^{13}v_ia_i\), where \(a_i\) is an element of the original feature vector \(A_{i,1\times 13}\) and \(v_i\) is an element of \(V_{i,13\times 1}\), acting as a weight. Thus, each feature vector is characterized by m newly generated features instead of the 13 original features. However, the new features cannot be physically interpreted like the original ones.

$$\displaystyle \begin{aligned} \begin{aligned} {} A_{n\times 13} &= U_{n\times n} \cdot \Sigma_{n\times 13} \cdot V^T_{13\times 13} \\ A^{\prime}_{n\times m} &= A_{n\times 13} \cdot \begin{bmatrix} V_1\quad V_2\quad \cdots\quad V_m \end{bmatrix} \end{aligned} \end{aligned} $$
(5)
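A compact sketch of this reduction is given below, assuming the cumulative proportion is taken over the singular values themselves, as stated above; the function name is our own.

```python
import numpy as np

def svd_reduce(A, threshold=0.95):
    """Project the n x 13 feature matrix onto the first m right singular vectors
    whose singular values reach the 95% cumulative threshold (Eq. 5)."""
    A = (A - A.min(axis=0)) / (A.max(axis=0) - A.min(axis=0))   # column-wise min-max normalization
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    cum = np.cumsum(s) / np.sum(s)
    m = int(np.searchsorted(cum, threshold)) + 1                # smallest m reaching the threshold
    A_new = A @ Vt[:m].T                                        # A' = A [V_1 ... V_m]
    return A_new, m
```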

The Support Vector Machine (SVM) is a widely-used classifier, well suited to scenarios with sparse data samples such as our experimental situation. The essence of SVM is to find an optimal hyperplane that maximizes the margin, i.e. the perpendicular distance from the hyperplane to the support vectors of each class. Support vectors are the data samples of each class nearest to the hyperplane. In our case, multi-class classification problems arise because the experiments are usually conducted with multiple clocks. These can be solved through the one-versus-the-rest approach, which trains multiple hyperplanes (Bishop and Nasrabadi 2006); each trained hyperplane separates one class from the rest.

Moreover, cross validation is performed ten times to obtain classifiers with robust generalization. Each iteration utilizes 90\(\%\) randomly-selected samples to train a classifier, which is subsequently tested on the remaining 10\(\%\). The test samples are assigned to the classes yielding the highest scores or probabilities. The feasibility of fingerprinting with this approach is then assessed by computing the overall accuracy (OA, not to be confused with the \(OA_x\) features above), precision and recall. Additionally, the optimal, i.e. minimum, observation duration is identified as the shortest duration achieving superior OA.
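Using scikit-learn, the one-versus-the-rest SVM with the repeated 90/10 split could be sketched as follows; the RBF kernel and the stratified splits are our assumptions, not details given in the text.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

def evaluate_svm(A_new, labels, n_runs=10):
    """Ten repetitions of a 90/10 split with a one-vs-rest SVM; returns mean OA, precision, recall."""
    oa, prec, rec = [], [], []
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            A_new, labels, test_size=0.1, stratify=labels, random_state=run)
        clf = OneVsRestClassifier(SVC(kernel="rbf")).fit(X_tr, y_tr)
        y_hat = clf.predict(X_te)
        oa.append(accuracy_score(y_te, y_hat))
        prec.append(precision_score(y_te, y_hat, average="macro"))
        rec.append(recall_score(y_te, y_hat, average="macro"))
    return np.mean(oa), np.mean(prec), np.mean(rec)
```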

3.3 Decision Tree

The key idea is to construct a binary tree by issuing a decision at each tree node, i.e. choosing the optimal attribute for partitioning the data samples. For multi-class classification problems, the optimal attribute for the root node should separate the dataset into two portions that contain as few samples of the same category as possible. The same procedure is then applied recursively to the two portions until all classes have been identified. This can be realized by measuring the information gain of each attribute at every node decision. The information gain is computed by comparing the entropy before and after partitioning the dataset, where the entropy describes the degree of impurity or disorder in a dataset.

$$\displaystyle \begin{aligned}{} \begin{aligned} \mathit{Gain}(D,a) &= \mathit{Ent}(D) - \sum\nolimits_{v=1}^V\frac{\lvert D_v \rvert}{\lvert D \rvert}\mathit{Ent}(D_v) \\ \mathit{Ent}(D) &= -\sum\nolimits_{k=1}^K p_k \log_2 p_k \end{aligned} \end{aligned} $$
(6)

Equation 6 expresses the information gain mathematically, in which D, a and v represent a dataset, a feature/attribute, and the possible values of that feature, respectively. \(\frac {\lvert D_v \rvert }{\lvert D \rvert }\) acts as a weight for value v, namely the proportion of samples in D that take the value v. k and \(p_k\) denote the classes and the proportion of samples belonging to class k (Wang et al. 2017). The greater the information gain, the greater the purity gain obtained by using feature a to partition the dataset D.
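As a worked example of Eq. (6) for a binary split on a continuous feature (the case relevant for a binary tree), consider the sketch below; the search over candidate split thresholds is omitted.

```python
import numpy as np

def entropy(labels):
    """Ent(D) = -sum_k p_k log2 p_k over the class proportions in D."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(feature_values, labels, threshold):
    """Gain(D, a) for the binary partition feature <= threshold vs. > threshold."""
    left = feature_values <= threshold
    if left.all() or not left.any():
        return 0.0                                   # degenerate split carries no information
    w = left.mean()                                  # |D_v| / |D| for the left partition
    return entropy(labels) - (w * entropy(labels[left]) + (1 - w) * entropy(labels[~left]))
```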

As with the SVD+SVM approach, cross validation is performed ten times for the decision tree. During the training phase, 90\(\%\) of the samples are used to train a decision tree, and several efficient features are selected through the node decisions; fewer features are needed for datasets with high purity, and vice versa. The remaining samples test the decision tree, yielding scores or probabilities for all classes as well as the quality measures, as sketched below.
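With scikit-learn, the training step of one cross-validation iteration might look as follows, reusing the 90/10 splits from the SVM sketch above; criterion="entropy" mirrors the information-gain criterion of Eq. (6), and the features actually used in node decisions are read from the non-zero feature importances.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_tree(X_tr, y_tr):
    """Train an entropy-based decision tree and report the features it actually uses."""
    clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_tr, y_tr)
    selected = np.flatnonzero(clf.feature_importances_ > 0)     # indices of features used in splits
    return clf, selected
```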

3.4 Filtering Method

For comparison with the approaches above, we adopt the filtering method developed in Borio et al. (2017) for fingerprinting various oscillators. The candidate features are grouped into all possible subsets of three, i.e. \(C_{13}^3\! =\! 286\) combinations. Essentially, the method defines a score function (Eq. 7) to rank these combinations and selects the one with the highest score for fingerprinting. The score function is the ratio between the minimum inter-class distance (between different classes i and j) and the maximum intra-class distance (within class i). F denotes a feature subset, cf. Borio et al. (2017).

$$\displaystyle \begin{aligned}{} \mathit{Score}\;G(F) = \frac{\min_{i\neq j}d_{i,j}(F)}{\max_id_i(F)} \rightarrow \max \end{aligned} $$
(7)

In our case, we apply the filtering method within the training procedure of the ten-times cross validation. The feature subset with the largest score G is then used in the testing procedure, in which the test samples are characterized by these three features. The classification is done by comparing the Mahalanobis distances from the samples to the class centers, followed by an evaluation process.
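A sketch of the filtering method is given below. The exact distance definitions \(d_{i,j}(F)\) and \(d_i(F)\) follow Borio et al. (2017); here Euclidean center distances and the largest sample-to-center distance are used purely for illustration, and the Mahalanobis classification of a test sample is shown alongside.

```python
import numpy as np
from itertools import combinations
from scipy.spatial.distance import mahalanobis

def best_triple(A, labels):
    """Select the 3-feature subset maximizing the score of Eq. (7)."""
    classes = np.unique(labels)
    best_F, best_score = None, -np.inf
    for F in combinations(range(A.shape[1]), 3):                 # 286 subsets of 13 features
        X = A[:, list(F)]
        centers = {c: X[labels == c].mean(axis=0) for c in classes}
        d_intra = max(np.linalg.norm(X[labels == c] - centers[c], axis=1).max()
                      for c in classes)                          # illustrative intra-class distance
        d_inter = min(np.linalg.norm(centers[i] - centers[j])
                      for i, j in combinations(classes, 2))      # illustrative inter-class distance
        score = d_inter / d_intra
        if score > best_score:
            best_F, best_score = F, score
    return best_F, best_score

def classify(sample, A, labels, F):
    """Assign a test sample to the class with the smallest Mahalanobis distance to its center."""
    best_c, best_d = None, np.inf
    for c in np.unique(labels):
        X_c = A[labels == c][:, list(F)]
        VI = np.linalg.inv(np.cov(X_c, rowvar=False))
        d = mahalanobis(sample[list(F)], X_c.mean(axis=0), VI)
        if d < best_d:
            best_c, best_d = c, d
    return best_c
```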

4 Overview on Experiments

GNSS data recorded in various scenarios are gathered to demonstrate the effectiveness of the approaches described above (Table 1). First, a static experiment was carried out on the institute’s roof top at a 1 Hz sampling rate for four days (Krawinkel and Schön 2014). A fast-driving experiment consists of tracks along a route comprising a highway, an urban area in the city of Siegen, three tunnels and a small cobblestone road, producing a \(\sim \)1.5 h dataset sampled at 10 Hz. A flight experiment was carried out in Dortmund with the same equipment setup, yielding \(\sim \)2.5 h of data at a 10 Hz sampling rate (Jain and Schön 2020).

Table 1 Experiment data summary and description

Each GNSS receiver, all of the same type, operates either with its built-in clock or connected to an external miniature clock, cf. Table 1. For each kinematic experiment, a reference trajectory is computed using a relative positioning approach based on high-quality observations from the Inertial Measurement Unit (IMU) of an IGI AEROcontrol system and GPS carrier-phase measurements. The same clock setup is also tested in the static scenario.

5 Fingerprinting Results

5.1 Static Scenarios

Figure 3 gives an overview of the performance of the three approaches with five clocks in static conditions. Only OA is shown here because precision and recall behave similarly. First, the accuracy of the three approaches exceeds 99\(\%\) for all observation durations, demonstrating their capability to fingerprint the clocks. The missing fraction of less than 1\(\%\) results from one or two wrongly classified samples, visualized in Fig. 4. Regarding the number of features necessary for fingerprinting, SVD+SVM reduces the feature dimension from 13 to 7, while the other two methods require only three features. This implies that SVD+SVM is not as efficient as the other two approaches in this experimental context. It also explains why SVD+SVM needs a minimum observation duration of only 30 min to achieve its best accuracy, as opposed to the 50 min required by the other two approaches.

Fig. 3

OA of classification results for datasets with various observation periods. The datasets are obtained from static Exp1 and processed using the three proposed approaches. f is the number of selected features

Fig. 4

3D visualization of fingerprinting results for data of static Exp1. (a) Filtering method: obtained feature combination of \(OA_1\), \(OA_{slope}\) and \(rmsTIE_{slope}\), selected for frequency data of 30 min time length. (b) SVD+SVM: first three of seven generated features, selected for frequency data of 20 min time length

The classification results of the filtering method are shown in Fig. 4a: 814 samples are denoted by five colors in the three dimensions of the selected features. The features of each sample are derived from a 30 min observation segment. The clusters of the five clocks are clearly distinct, demonstrating the feasibility of fingerprinting with this method. The CSAC clusters (red \(\&\) magenta) are relatively concentrated and their 3D locations are close, which can be explained by the two CSACs’ similar physical properties. In contrast, the samples of the quartz oscillators (green \(\&\) blue) are scattered, especially in the dimension of \(rmsTIE_{slope}\), indicating their instability. The class of the high-precision clock (cyan) is also easy to identify owing to its distinct location relative to the others. The sample marked by a black triangle is wrongly assigned because its Mahalanobis distance to the blue cluster center is shorter than that to the green one.

The performance of SVD+SVM is displayed in Fig. 4b: 1222 samples are expressed by the first three of seven new features, derived from 20 min observation segments. Again, the clusters are well separated from each other, and the CSACs (red and magenta symbols) are especially easy to distinguish. The five clusters appear closer together than those in Fig. 4a because all seven features are in fact used by the SVM to separate the classes. Admittedly, the new features derived via SVD lack a clear physical interpretation, which makes it difficult to assess the contribution of each candidate feature.

A different feature set, derived from the same 1222 samples, is selected by the decision tree. Figure 5 presents how the trained tree distinguishes the classes. In its left column, \(OA_{30}\) clearly isolates the internal clock (blue), whose features differ markedly from the others. In the middle column, \(rmsTIE_{slope}\) further separates the green class from the rest. Although both \(OA_{30}\) and \(OA_1\) can separate the cyan from the magenta class, the tree opts for \(OA_1\) because it provides a greater separation (\(\sim 6\times 10^{-11}\) in 1-D distance) between the two. The two CSACs (red and magenta) are again divided by \(OA_1\). Finally, the effectiveness of a feature in distinguishing classes can be read from the diagonal plots: overlapping bars suggest the feature has little discriminative power, and vice versa.

Fig. 5

\(3\times 3\) scatter plots of fingerprinting results using method decision tree for data of static Exp1. Three features \(OA_{30}\), \(rmsTIE_{slope}\) and \(OA_1\) are selected for frequency data of 20 min time length

In summary, CSACs are more identifiable because they form concentrated sample sets. Note that, although the selected features, minimum observation duration and fingerprinting results may vary with each execution, fingerprinting in static scenarios via the three approaches is proven feasible, provided that appropriate features are chosen. Moreover, the filtering method and the decision tree outperform SVD+SVM in efficiency because they require fewer features.

5.2 Dynamic Scenarios

Clocks employed in dynamic scenarios suffer from environmental factors such as temperature variations, accelerations and vibrations. Moreover, the data are recorded under difficult GNSS conditions such as urban areas and during flight maneuvers. The derived clock-related features are therefore likely to be contaminated by substantial noise, which undoubtedly complicates the clock fingerprinting process. Table 2 summarizes the classification results of the three approaches in such scenarios. In general, the OA of the dynamic scenarios (\(<75\%\)) is lower than that of the corresponding static scenario (\(\sim 90\%\)). The comparably low OA of the filtering method indicates that three features cannot adequately handle such a complicated situation. SVD+SVM performs slightly better, with more features required. The decision tree yields the most accurate classification results, although it requires a larger number of selected features here; this partly results from the high volume of samples, which increases the class confusion.

Table 2 The number of selected features (\(\#\)f) and OA of three approaches in two kinematic experiments and the corresponding static experiment

Figure 6 gives the classification results of the decision tree using kinematic data segmented in 30 s intervals. Note that the results are visualized in only three dimensions, whereas more than three features are needed to obtain the OA shown in Table 2. Evidently, the four clusters in the first two plots are not successfully distinguished, accompanied by many wrongly classified samples (black). This is also reflected in the confusion matrices of Fig. 6c. Nevertheless, the centroid of the magenta class (internal clock) in Fig. 6b is clearly isolated from the others, leading to a \(\sim 100\%\) classification precision for this class. In contrast, the sample sets of the remaining classes are densely mixed owing to their similarly high precision and stability. The four clusters in Fig. 6a are scattered, except that the green one (LCR900) is slightly apart from the others. This again indicates the comparable characteristics of the four clocks and the need to explore more efficient features to distinguish them.

Fig. 6

Fingerprinting results of kinematic data segmented in 30 s, via the decision tree. (a) Car exp2: \(\sim \)440 samples visualized by the first 3 of 5 selected features, OA \(\sim 65\%\); (b) Flight exp3: \(\sim \)1400 samples visualized by the first 3 of 9 selected features, OA \(\sim 79\%\); (c) Confusion matrices of car exp2 (top) and flight exp3 (bottom); the numbers are rounded percentages

6 Conclusion

In this paper, we investigate the feasibility of receiver fingerprinting aided by CSACs in static and kinematic conditions. 13 features related to the clocks’ frequency stability, derived from the overlapping ADEV, TIE and autocorrelation, are treated as fingerprints. The approaches SVD+SVM and decision tree are adopted to determine practical features for fingerprinting, and an existing filtering method is implemented for comparison.

The three approaches are proven effective for classifying clocks in static scenarios, with OA exceeding 99\(\%\). Moreover, CSACs are advantageous for clock identification owing to their extremely stable behaviour. For dynamic GNSS data, the decision tree performs best with over 70\(\%\) OA, followed by SVD+SVM, while the filtering method is less useful because its three selected features are insufficient. Lastly, the number of necessary features and the minimum observation duration for fingerprinting depend strongly on the complexity of the experimental data: static data with less noise and fewer deterministic effects normally require fewer features and shorter observation periods.