Introduction

Since the late 2010s, the growth in positioning applications has been accompanied by the emergence of low-cost dual-frequency Global Navigation Satellite System (GNSS) receivers. Several manufacturers have developed low-cost GNSS chips (e.g. Septentrio Mosaic X5, https://www.septentrio.com/en/products/gps/gnss-receiver-modules/mosaic-x5, last access: April 11, 2024; U-blox ZED-F9P, https://www.u-blox.com/en/product/zed-f9p-module, last access: April 11, 2024) used for a wide range of applications (smartphones, drone guidance, etc.) at a cost of 120–500 €. The use of these chips is facilitated by their integration on easy-to-interface boards and is also complemented by the market launch of low-cost antennas at a cost of 50–200 € (https://store-drotek.com/, last access: April 11, 2024; https://www.ardusimple.com/, last access: April 11, 2024). These receivers can enable precise positioning and their use in geoscience applications is a real opportunity, in particular thanks to the possibility of mass deployment at affordable prices. Taking advantage of these new technologies, the CentipedeRTK network was born in 2019. The CentipedeRTK network is a collaborative GNSS network that aims to provide free real-time centimeter positioning (Ancelin et al. 2023). The network was first developed in mainland France, before rapidly expanding to French overseas territories, other European countries, and now around the world. It has benefited from the combined resources of research institutes, public bodies, farmers and private companies. The network now includes more than 650 permanent stations (April 25, 2024), most of them being low-cost. The network is collaborative, and each station is built and maintained by its owner. The technical and IT management of the network is provided by the French National Research Institute for Agriculture, Food and the Environment (Institut national de recherche pour l’agriculture, l’alimentation et l’environnement, INRAE).

Since mid-2022, the raw GNSS data collected by these stations have been archived by the RÉNAG (RÉseau NAtional GNSS permanent) data center (Re3data.Org 2022). The aim of this archiving is, firstly, to assess the quality of the data acquired for geoscience applications such as tectonic studies, atmospheric observation, sea level monitoring, etc. If the quality of the data is proven to be sufficient for science applications, the second objective is to archive and disseminate these data on a long-term basis. This is justified by recent studies using low-cost receivers for scientific applications. Indeed, several studies have shown that low-cost systems can be used for accurate coordinate estimation and fine displacement tracking (Hohensinn et al. 2022; Ogutcu et al. 2023; Vidal et al. 2024). Hohensinn et al. (2022) conclude that the accuracy of displacements measured using a low-cost receiver and antenna is similar to that obtained using geodetic-grade equipment, i.e. about 1 cm on the horizontal component and 4 cm on the vertical component for precise point positioning (PPP) analysis using real-time products. These results are somewhat less conclusive for Ogutcu et al. (2023), where displacements in static (one position per day) and kinematic (one position per second) modes are slightly less well detected with low-cost equipment than with geodetic equipment: the degradation is of the order of a factor of about 2, with a detectability threshold for horizontal displacement of 20 mm in kinematic and 5.6 mm in static, 30 mm in kinematic and 8.4 mm in static for vertical displacement. The importance of the type of antenna used is also emphasized, with preference given to those with accurate calibration (Hohensinn et al. 2022). More recently, Vidal et al. (2024) studied the performance of two low-cost GNSS receivers (ZED-F9P and Mosaic X5), equipped with low-cost antennas (AS-ANT3BCAL), by comparing them with geodetic-grade receivers for 1 year. While the performance of the ZED-F9P receiver was less impressive over the period (more cycle slips, fewer observations available), that of the Mosaic X5 receiver appears to be on a par with that of the geodetic receivers. From the point of view of positioning quality, the various evaluations carried out (static, kinematic and real-time positioning) confirm the good quality of the positions from the Mosaic X5 receiver in terms of repeatability and low noise level, at the same level as the geodetic-grade receivers (1–2 mm horizontally, 3 mm vertically); the results for the ZED-F9P receivers are slightly worse, both in terms of repeatability (2–3 mm horizontally, 5 mm vertically) and noise level.

Several studies have also been carried out to assess the possibility of monitoring the space and time distribution of atmospheric water vapor using low-cost GNSS equipment. Indeed, the analysis of GNSS measurements requires the estimation of tropospheric propagation delays, which affect signal transmission and are related to atmospheric air density and water vapor content. To achieve this, zenith tropospheric delays (ZTD) are estimated during GNSS analysis along with the antenna position and clock offset. These delays are decomposed into a hydrostatic component and a wet component. From the wet component, the integrated water vapor content (IWV) can be derived with an accuracy of 1–2 kg m\(^{-2}\) (Guerova et al. 2016). The use of GNSS data from ground-based antennas is therefore now common practice, both for meteorological and climatological applications (Poli et al. 2007; Guerova et al. 2016; Bosser and Bock 2021).

The use of low-cost GNSS devices for tropospheric monitoring has therefore been evaluated in several case studies. These studies covered periods ranging from a few days to a few months, over limited geographical areas (at a single site or at the scale of an urban area). These case studies confirmed the ability of low-cost GNSS stations to provide high quality ZTD. Deviations from those produced by nearby conventional stations (geodetic-grade receivers and antennas) reach the millimeter range, with biases between about 1 and 3 mm and standard deviations between 2 and 5 mm (Krietemeyer et al. 2020; Marut et al. 2022; Stepniak and Paziewski 2022; Aichinger-Rosenberger et al. 2023). Krietemeyer et al. (2020) emphasized the importance of calibrating low-cost antennas, which allows a significant reduction in the root mean square (RMS) of differences with conventional antennas (from 15–20 mm to about 4 mm). Focusing on 2–3-day periods in winter and summer, Stepniak and Paziewski (2022) confirmed these results, noting that using a geodetic-grade antenna with a low-cost receiver reduces the bias (less than 1 mm) compared to using a low-cost antenna (about 2 mm). A similar bias (1–3 mm) was also shown by Aichinger-Rosenberger et al. (2023) over periods of varying length (from 36 h to 3 months), between two low-cost antennas and a geodetic-grade antenna. ZTD retrieval can also be achieved in real time, despite the use of lower quality satellite products. Differences with post-processed solutions had RMS values of less than 10 mm (Marut et al. 2022; Aichinger-Rosenberger et al. 2023). The tropospheric solutions provided by low-cost systems have also been evaluated using more conventional meteorological techniques. Stepniak and Paziewski (2022), Marut et al. (2022), and Aichinger-Rosenberger et al. (2023) compared estimates provided by low-cost GNSS systems with those derived from numerical weather models. The estimated ZTD compared by Stepniak and Paziewski (2022) with the European Centre for Medium-range Weather Forecasts (ECMWF) fifth-generation reanalysis (ERA5) had RMS of differences of 9–12 mm, with larger deviations in summer than in winter. Using a more spatially resolved model (around 10 km), the Weather Research & Forecasting (WRF) model, Marut et al. (2022) observed differences with a bias close to 0 mm and standard deviations between 6 and 10 mm (1 and 1.5 kg m\(^{-2}\) in terms of IWV) for a 1-month campaign implementing a network of 16 low-cost systems deployed at the scale of a conurbation (14 km\(\times\)24 km). Aichinger-Rosenberger et al. (2023) used an even denser numerical weather model with a 1.1 km \(\times\)1.1 km grid, Consortium for Small Scale Modeling (COSMO-1), to evaluate the tropospheric delays estimated by a low-cost station. The RMS of the ZTD differences calculated over a 3-week period were 17 mm and 24 mm for post-processing and real-time analysis, respectively.

In light of these results, we set out to confirm the quality of a low-cost receiver network specifically for meteorological and climatological applications. To the best of our knowledge, no study of this scale has ever been performed: this validation exploited the high potential of the low-cost CentipedeRTK network and was carried out at an extended regional scale, that of mainland France (about 1000 km \(\times\) 1000 km); we were interested in data collected over a whole year, in 2023, by a set of nearly 300 stations equipped with a low-cost receiver and antenna. This evaluation was based on systematic comparisons with conventional GNSS stations in the vicinity of the low-cost stations in the CentipedeRTK network. These comparisons are complemented by comparisons with the ERA5 reanalysis.

To achieve this, the paper is organized as follows. In the section following this introduction, we detail the data we used; the CentipedeRTK network is presented along with the conventional networks used as reference. We also describe our methodology for data analysis and comparisons and, finally, we present the method used to retrieve IWV from GNSS ZTD for comparison with ERA5 reanalysis data. In the third section, we evaluate the quality of positioning using low-cost stations and then present the results of the various intercomparisons using conventional GNSS stations and ERA5 products. These results are discussed in a fourth section. In a fifth section, we highlight the interest in tropospheric products from low-cost stations when describing an atmospheric river-type meteorological situation. In the sixth and final section, the main results are summarized and prospects for the use of such a low-cost network are outlined.

Data and methods

The CentipedeRTK real-time network

Launched in 2019, the CentipedeRTK network (https://centipede.fr/, last access: April 25, 2024) is a collaborative network of open GNSS base stations (Ancelin et al. 2022a, b) providing Real-Time Kinematic (RTK) positioning corrections available to anyone in the coverage area. The network has been extended by public institutions, private individuals, private actors such as farmers and other public partners. The objectives of the CentipedeRTK project were to develop an innovative processing chain based on freely available software and hardware components to create reliable, lightweight, low-cost and easy-to-use solutions, and to provide complete coverage of the country (starting with mainland France). The network is financially supported by INRAE and, since its launch in 2019, has benefited from the pooling of resources between research institutes, public bodies, farmers and private companies. Since its deployment in France in 2019, the network has expanded considerably, with more than 650 operational GNSS stations (April 25, 2024) and a wide range of applications in the public (17%) and private (66%) sectors, particularly in agriculture (Ancelin et al. 2023).

Time-delayed observations are collected for archiving purposes at the RÉNAG data center directly from the CentipedeRTK Networked Transport of RTCM (Radio Technical Commission for Maritime Services) via Internet Protocol (NTRIP) Caster. The real-time RTCM streams are converted into 1 h/1 s RINEX (Receiver Independent Exchange format) files, version 3, with the BKG NTRIP Client (BNC). Each day, the high-rate files are decimated and concatenated into 24 h/30 s files and, for each of the latter, a quality-check report is produced using the Anubis Pro software. Selected data quality metrics are stored in a database for dynamic visualization on the web (http://gnssfr.unice.fr/quality-check, last access: May 3, 2024).

We focused on stations equipped with a low-cost receiver and antenna and located in mainland France. This corresponded to 406 stations available for the whole year 2023. A second selection of stations was made, refining the list of stations to those that acquired dual-frequency measurements and presented data availability for at least 50% of the year 2023. This refined selection resulted in a subset of 331 stations that met the specified criteria. The geographic distribution of the stations is shown in Fig. 1.

Fig. 1

Geographical distribution of the GNSS stations considered in this study; circles represent CentipedeRTK low-cost stations, triangles represent conventional stations. CentipedeRTK stations that are not included in the study are shown in white. CentipedeRTK and conventional stations that were processed but not selected (as described in the text) are shown in gray. LIEN and CETT are the two stations from the CentipedeRTK network presented in Supplementary Materials

These 331 stations were all equipped with ZED-F9P GNSS receivers. The ZED-F9P is a dual-frequency, multi-constellation GNSS chip marketed by U-blox. This chip tracks a variety of GNSS signals, but not all (ZED-F9P product summary: https://content.u-blox.com/sites/default/files/ZED-F9P_ProductSummary_UBX-17005151.pdf, last access: April 11, 2024). In particular, phase measurements on the GPS L2 frequency are performed by tracking the L2C code; only the 25 GPS satellites that transmit this second civil signal (https://www.gps.gov/systems/gps/modernization/civilsignals/, last access: April 11, 2024) can therefore be tracked on 2 frequencies. The types of antennas installed on these stations were more diverse (Table 1). The vast majority of stations were equipped with a JCA228B0002 antenna supplied by Zhejiang JC Antenna Co. Fifty-one stations were equipped with antennas without calibration sheets, 43 of which used the same type of antenna, the ANN-MB-00 model supplied by U-blox. For calibrated antennas, only a relative calibration was available. This calibration was provided by the National Geodetic Survey (NGS, https://www.ngs.noaa.gov/ANTCAL/LoadFile?file=ngs14.atx, last access: June 25, 2024).

Table 1 Types of antennas for the 331 CentipedeRTK stations processed in this study

French conventional networks

We considered the stations of three conventional permanent GNSS networks in mainland France: the Réseau GNSS Permanent (RGP), the Réseau NAtional GNSS Permanent (RÉNAG), and the private Géodata-Orphéon network:

  • The RGP (https://rgp.ign.fr, last access: March 15, 2024) is a GNSS network managed by the Institut national de l’information géographique et forestière (IGN), which includes almost 500 permanent stations. A minority of these permanent stations belong to IGN, while the others are owned by public or private partner organizations (teaching and research institutions, private networks, etc.). Data from these stations are archived in the form of RINEX version 2 or 3 files and made available on the Internet via a file server (ftp://rgpdata.ign.fr, last access: March 15, 2024).

  • The RÉNAG network (Re3data.Org 2022; Epos-France 2023) is the National GNSS network of French research public laboratories in geosciences, and is an official National Observation Service from CNRS-INSU (Centre National de la Recherche Scientifique, Institut des Sciences de la Terre et de l’Univers). This network is dedicated to scientific research and Earth observation in internal and external geophysics and geodesy, and is composed of more than 80 continuous GNSS stations operating in metropolitan France. Because it aims at precisely measuring small-scale deformation processes, these stations are mostly equipped with geodetic receivers and calibrated antennas. 30 s/24 h data created at the receiver are available in RINEX 3 format on the data center server (http://renag.resif.fr/en/donnees/, last access: April 16, 2024), and through the Epos (European Plate Observing System) infrastructure.

  • The Geodata-Orphéon network (https://reseau-orpheon.fr/en/, last access: April 16, 2024) is operated by the Geodata Diffusion company for RTK precise positioning applications and is composed of about 200 continuous GNSS stations equipped with geodetic-quality receivers and antennas. Observations from these stations are collected from streamed data in 30 s/24 h RINEX 2 or 3 format. In partnership with the RÉNAG National Observing Service, the RINEX data are archived by the RÉNAG data center (Re3data.Org 2022) and distributed under the CC BY-NC 4.0 license.

First, we considered all the stations in these permanent networks that were less than 30 km away from CentipedeRTK stations with an elevation difference of less than 100 m to avoid altitude related discrepancies. This resulted in the consideration of a set of 337 reference stations. A more detailed selection of conventional stations was then made in order to obtain the longest and most reliable time series of comparable data. This selection is described in detail below.
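The pre-selection described above (horizontal distance below 30 km, height difference below 100 m) can be illustrated with a minimal sketch. The function names, the station dictionaries and the use of a spherical haversine distance are assumptions made for illustration; the distance formula actually used in the study is not specified.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation, assumed)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def candidate_references(low_cost, conventional, max_dist_km=30.0, max_dh_m=100.0):
    """Map each low-cost station to the conventional stations meeting both criteria.

    Both inputs are dicts {name: (lat_deg, lon_deg, height_m)} (hypothetical layout).
    """
    pairs = {}
    for name, (lat, lon, h) in low_cost.items():
        pairs[name] = [
            ref
            for ref, (rlat, rlon, rh) in conventional.items()
            if haversine_km(lat, lon, rlat, rlon) < max_dist_km
            and abs(h - rh) < max_dh_m
        ]
    return pairs
```

With made-up coordinates, a reference 11 km away and 30 m higher is kept, whereas a closer reference 350 m higher, or one 110 km away, is discarded.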

GNSS data processing

GNSS data from each network were analyzed in precise point positioning with Ambiguity Resolution (PPP-AR) mode using GipsyX 2.0 software (Bertiger et al. 2020) with the same computation strategy. PPP was preferred over relative positioning, in order to avoid network effects as observed in the past when analyzing data from a national network (Stepniak et al. 2022). GipsyX uses a Square Root Information Filter (SRIF) algorithm (Bierman 1977). The model parameters are described as stochastic (time-dependent parameters) or constant parameters (e.g. position). The variations of the stochastic parameters are modeled as random walk (e.g. tropospheric parameters) or white noise (e.g. receiver clock). At each epoch, a prediction of each unknown variable is made and then corrected by observations. Note that in the absence of observations (e.g. during a measurement interruption), the prediction is not corrected, but the predicted value is still considered as an estimate (with a larger uncertainty).

We used the no-net-rotation 300 s time resolution final satellite orbit and clock products provided by the Jet Propulsion Laboratory (JPL). Only data from the GPS constellation were analyzed, as no JPL products were available to resolve ambiguities for the other constellations. The GPS raw observations were processed over a 30-h window centered on midnight, then the solutions were extracted over the 00–24 h time slot to avoid edge effects at the day-boundaries. An ionosphere-free combination of the dual-frequency observations is made to eliminate the effect of the ionosphere; an ionospheric correction of order 2 is also applied (Kedar et al. 2003). Phase ambiguities were fixed using the wide-lane phase biases computed by JPL (Bertiger et al. 2010). The cut-off angle was fixed at 7\(^\circ\) with a \(\sqrt{\sin elev}\) down-weighting of low-elevation observations. The solid Earth and polar tides were corrected according to IERS conventions (Petit and Luzum 2010) and the ocean loading was corrected using the Finite Element Solution tide model FES2004 (Lyard et al. 2006) using the coefficients calculated by the ocean tide loading provider (Machiel Simon Bos and Hans-Georg Scherneck, http://holt.oso.chalmers.se/loading/, last access: March 15, 2024). We used GPS satellite antenna models provided by the International GNSS Service (IGS, https://files.igs.org/pub/station/general/igs14.atx, last access: June 25, 2024). For the ground antennas, we used the IGS absolute models if available; otherwise, we used the relative models provided by the National Geodetic Survey (NGS, https://www.ngs.noaa.gov/ANTCAL/LoadFile?file=ngs14.atx, last access: June 25, 2024). At the end of the analysis, the positions were transformed into the ITRF2014 after application of a 7-parameter Helmert transformation using JPL’s x-files (Bertiger et al. 2020).

Tropospheric delays were modeled by time-varying zenith hydrostatic delays (ZHD), zenith wet delays (ZWD), and horizontal gradients. To model their dependence on elevation, we used their respective Vienna Mapping Function 1 (VMF1) (Boehm et al. 2006) (the Vienna Mapping Function 3 (VMF3) was not used, as it had not yet been implemented in GipsyX 2.0). The a priori values for ZHD and ZWD were calculated from the VMF3 coefficient grids that are derived from the 6-hourly ECMWF operational analyses by Technische Universität Wien (TU-Wien, https://vmf.geo.tuwien.ac.at/, last access: March 15, 2024) (Re3data.Org 2016; Landskron and Böhm 2017). Corrections to a priori ZWD and horizontal gradient values were parameterized as random walk processes with 300 s time resolution and were estimated during data processing. The random walk parameters were set to 5 and 0.5 mm s\(^{-1/2}\) for ZWD and gradients, respectively; the uncertainties with respect to their a priori values were set at 50 cm and 10 cm, respectively. Zenith tropospheric delays (ZTD) were finally obtained by summing the a priori ZHD, the a priori ZWD, and the estimated ZWD correction.

GNSS output screening and selection

At the end of the processing, a quality check was performed to detect outliers and spurious estimates. This quality check is also useful to detect and reject estimates from the SRIF prediction stage alone in the absence of measurements. This quality assessment, called screening, is essential for the analysis of a large dataset such as the one used in this study. The screening method used here is similar to that used and validated in previous studies (Bosser and Bock 2021; Bock et al. 2021). To do this, we considered the estimated ZTD values and the associated formal errors, \(\sigma _{ZTD}\). The screening procedure is divided into 3 steps:

  • First, a range check was performed on both ZTD and \(\sigma _{ZTD}\): we rejected values outside the intervals [1; 3] m for ZTD and [0.1; 4] mm for \(\sigma _{ZTD}\).

  • Second, an outlier check was performed on the ZTD and \(\sigma _{ZTD}\):

    • ZTD values outside the interval \(\text {Md}(ZTD)\pm 0.5\) m, where "\(\text {Md}\)" denotes the median, were rejected.

    • \(\sigma _{ZTD}\) values above \(\text {Md}(\sigma _{ZTD})+3\times \text {IQR}(\sigma _{ZTD})\), where \(\text {IQR}\) denotes the interquartile range, were rejected.

  • Finally, a daily check was carried out to reject days where more than 50% of the ZTD were missing (due to rejection by the previous stages or lack of raw data).
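The three screening steps above can be sketched as follows for a single station. This is only an illustration under stated assumptions: the function name is hypothetical, the series is assumed to start at 00:00 and cover whole days of 288 five-minute epochs, and the median and interquartile range are assumed to be computed over the whole series on the values that passed the range check (the text does not specify the window used).

```python
import numpy as np


def screen_ztd(ztd_m, sigma_mm, epochs_per_day=288):
    """Return a boolean mask (True = kept) for one station's 5-min ZTD series.

    ztd_m    : ZTD estimates in metres (NaN where no estimate exists)
    sigma_mm : formal errors of the ZTD in millimetres
    """
    ztd_m = np.asarray(ztd_m, dtype=float)
    sigma_mm = np.asarray(sigma_mm, dtype=float)

    # Step 1: range check, [1; 3] m on ZTD and [0.1; 4] mm on the formal error.
    keep = np.isfinite(ztd_m) & np.isfinite(sigma_mm)
    keep &= (ztd_m >= 1.0) & (ztd_m <= 3.0)
    keep &= (sigma_mm >= 0.1) & (sigma_mm <= 4.0)

    # Step 2: outlier check, Md(ZTD) +/- 0.5 m and Md(sigma) + 3 x IQR(sigma).
    # Assumption: statistics are taken over the values kept by step 1.
    if keep.any():
        med_ztd = np.median(ztd_m[keep])
        med_sig = np.median(sigma_mm[keep])
        q75, q25 = np.percentile(sigma_mm[keep], [75, 25])
        keep &= np.abs(ztd_m - med_ztd) <= 0.5
        keep &= sigma_mm <= med_sig + 3.0 * (q75 - q25)

    # Step 3: daily check, reject days with more than 50% of epochs missing.
    n_days = keep.size // epochs_per_day
    daily = keep[: n_days * epochs_per_day].reshape(n_days, epochs_per_day).copy()
    daily[daily.mean(axis=1) < 0.5] = False
    return daily.ravel()
```

The returned mask can then be applied to the ZTD series; any trailing partial day is dropped in this sketch.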

Table 2 shows the rejection rates at the end of each screening stage for the CentipedeRTK and conventional networks. The overall rejection rate for conventional networks is rather high, above 8%. This was mainly due to around 80 stations for which data were systematically missing for part of the night during two periods: the first between January and May, the second in December. The rejection rate for these stations rose to over 30% over the whole year, because during these gaps only the values predicted by the SRIF were available, without correction by GNSS measurements. If this subnetwork of stations is excluded, the overall rejection rate drops to 2%, which is of the same order of magnitude as the rejection rates obtained in previous studies (Bosser and Bock 2021; Bock et al. 2021). For stations from the CentipedeRTK network, we obtain an overall rejection rate of 4.3%, which is still satisfactory. In each situation, the range check on the formal error of the ZTD rejects the most values. The outlier and range checks on the ZTD generally reject few values.

Table 2 Rejection rate after each stage of the screening process for the CentipedeRTK and conventional networks

At the end of the screening, only stations with estimates for at least 50% of the 5 min-epochs in 2023 were considered. This resulted in 265 stations for the CentipedeRTK network and 332 stations for the conventional networks.

The final stage of estimates selection for the stations from the conventional networks was to retain a single station for each CentipedeRTK station (at most) for comparison purposes. To do so, we first ensured that comparable stations had a coverage of more than 50%. We then selected the conventional station with the highest coverage. If two conventional stations offered a similar overlap of ±5%, we selected the one with the lowest RMS on ZTD differences. This final selection resulted in a set of 186 stations from the conventional networks (see Fig. 1), that were compared with the nearest CentipedeRTK stations. There were therefore 79 stations belonging to the CentipedeRTK network that could not be compared with stations from conventional networks.
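The selection rule described above can be sketched as a small function. The dictionary keys and the function name are hypothetical, and "a similar overlap of ±5%" is interpreted here as an overlap within 5 percentage points of the best candidate, which is an assumption.

```python
def select_reference(candidates, min_overlap=0.5, tie_margin=0.05):
    """Pick one conventional station for a given low-cost station.

    candidates: list of dicts with hypothetical keys
        'name'    : station identifier
        'overlap' : fraction of common epochs with the low-cost station (0..1)
        'rms_mm'  : RMS of the ZTD differences with the low-cost station
    Returns the selected name, or None if no candidate is comparable.
    """
    # Only stations sharing more than 50% of the epochs are comparable.
    eligible = [c for c in candidates if c["overlap"] > min_overlap]
    if not eligible:
        return None

    best_overlap = max(c["overlap"] for c in eligible)
    # Candidates within the tie margin of the best overlap are treated as
    # equivalent and ranked by the RMS of their ZTD differences instead.
    tied = [c for c in eligible if best_overlap - c["overlap"] <= tie_margin]
    return min(tied, key=lambda c: c["rms_mm"])["name"]
```

A low-cost station for which the function returns `None` corresponds to one of the stations that could not be compared with a conventional station.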

Figure 2 shows the number of stations with ZTD estimates at each 5-min epoch over the year 2023 for the conventional and CentipedeRTK networks. The upper part of the graph shows the estimates available at the end of the PPP processing with GipsyX, while the lower part shows the estimates available after screening. The number of estimates available for the CentipedeRTK stations increased significantly in June due to an update of the raw data retrieval flow at the RÉNAG data center. Over a short period at the beginning of October and again at the end of December, this number of estimates suddenly decreased, with a gradual return to normal, because of issues with the CentipedeRTK IT infrastructure. For conventional stations, the number of available estimates can vary considerably. These variations generally occurred over short periods (< 2 h). They mainly concerned two sub-networks of stations depending on the same infrastructure, which experienced periodic failures (every day at the same time) during the year 2023. There were two situations:

  • The failures occurred at the end or beginning of the 30 h processing session, so that the estimates were missing at the end of the GipsyX processing; this affected about 15 stations. This can be seen in Fig. 2a.

  • The failures occurred in the middle of the 30 h session: as explained above, the estimated values were those predicted by the SRIF and were therefore rejected during the screening process. About 60 stations were affected as can be seen in Fig. 2b.

Fig. 2

Number of stations available per 5-min epoch at the end of the GNSS analysis (top) and at the end of the screening (bottom). In blue, stations from the CentipedeRTK network; in orange, stations from conventional networks

Extrapolation of GNSS ZTD

Finally, even if the selected conventional stations differ in height from the CentipedeRTK stations by less than 100 m, it is necessary to consider an additional vertical correction corresponding to the fraction of the ZTD between the two stations. To do this, the ZTD at each conventional antenna was extrapolated to the "associated" CentipedeRTK antenna using a formulation based on Steigenberger et al. (2009) and Parracho et al. (2018):

$$\begin{aligned} ZHD_{CVT}(h_{CTP}) = {}& ZHD_{CVT}(h_{CVT}) \\ & - 10^{-6}\, k_{1}\, \frac{P(h_{CVT})}{T(h_{CVT})} \cdot \frac{g(h_{CVT})}{g_{atm}} \cdot (h_{CTP} - h_{CVT}) \end{aligned}$$
(1)

and:

$$\begin{aligned} ZWD_{{CVT}} (h_{{CTP}} ) = & ZWD_{{CVT}} (h_{{CVT}} ) \\ & \times \left[ {1 - k \times (h_{{CTP}} - h_{{CVT}} )} \right] \\ \end{aligned}$$
(2)

where \(h_{CTP}\) and \(h_{CVT}\) are the ellipsoid heights of the CentipedeRTK and conventional antennas respectively, \(T(h_{CVT})\), \(P(h_{CVT})\) and \(g(h_{CVT})=9.8062\) m s\(^{-2}\) are respectively the mean temperature, pressure and gravity between the two GNSS antennas; P and T can be approximated by a standard model such as GPT (Boehm et al. 2007); \(k_1\) is a refractivity constant (Thayer 1974) and \(g_{atm}=9.7840\) m s\(^{-2}\) is the approximate gravity at the center of mass of the atmosphere (Boehm and Schuh 2013); \(k=4\times 10^{-4}\) m\(^{-1}\) is the water vapor lapse rate (Parracho et al. 2018).
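Equations 1 and 2 can be written as a short numerical sketch. The function names are hypothetical, pressure is assumed to be in hPa and temperature in K, and the value \(k_1 = 77.604\) K hPa\(^{-1}\) is an assumption (a commonly quoted value attributed to Thayer 1974); the text does not give the value actually used.

```python
K1 = 77.604         # K/hPa, refractivity constant (assumed value)
G_STATION = 9.8062  # m/s^2, mean gravity between the two antennas (from the text)
G_ATM = 9.7840      # m/s^2, gravity at the atmosphere's center of mass (from the text)
K_WV = 4e-4         # 1/m, water vapor lapse rate (from the text)


def extrapolate_zhd(zhd_m, p_hpa, t_k, h_from_m, h_to_m):
    """Eq. 1: move a ZHD (m) from height h_from_m to height h_to_m."""
    dh = h_to_m - h_from_m
    return zhd_m - 1e-6 * K1 * (p_hpa / t_k) * (G_STATION / G_ATM) * dh


def extrapolate_zwd(zwd_m, h_from_m, h_to_m):
    """Eq. 2: move a ZWD (m) from height h_from_m to height h_to_m."""
    return zwd_m * (1.0 - K_WV * (h_to_m - h_from_m))


def extrapolate_ztd(zhd_m, zwd_m, p_hpa, t_k, h_from_m, h_to_m):
    """Extrapolated ZTD as the sum of the extrapolated ZHD and ZWD."""
    return (extrapolate_zhd(zhd_m, p_hpa, t_k, h_from_m, h_to_m)
            + extrapolate_zwd(zwd_m, h_from_m, h_to_m))
```

For a pressure of 1000 hPa and a temperature of 288.15 K, the hydrostatic term changes by roughly 0.27 mm per metre of height, so a 50 m height difference corresponds to about 13 mm of ZHD; this is why the correction cannot be neglected even for height differences below 100 m.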

The ERA5 reanalysis

Developed by the ECMWF, the ERA5 Reanalysis is one of the most recent and advanced atmospheric reanalyses currently available (Hersbach et al. 2020). It provides a global coverage with a spatial resolution of 0.25\(^\circ\) and an hourly temporal resolution. This reanalysis does not assimilate measurements from ground-based GNSS stations, and is therefore an independent data source for the evaluation of products derived from the analysis of CentipedeRTK stations. Comparisons with ERA5 are also useful for evaluating the potential of using measurements from low-cost GNSS stations for operational weather monitoring.

We first used the Total Column Water Vapour (TCWV) product from the ERA5 reanalysis surface grid to retrieve IWV. This product is given at the surface, the height of which follows the orography of the model. For comparison, the ERA5 data were extrapolated from the model surface to the height of the GNSS antennas using the following formulation (Parracho et al. 2018; Bosser and Bock 2021):

$$\begin{aligned} IWV_{ERA5}(h_{CTP}) = {}& IWV_{ERA5}(h_{ERA5}) \\ & \times \left[ 1 - k \times (h_{CTP}-h_{ERA5})\right] \end{aligned}$$
(3)

where \(h_{CTP}\) and \(h_{ERA5}\) are the heights of the CentipedeRTK antenna and the ERA5 orography respectively, and \(k=4\times 10^{-4}\) m\(^{-1}\) is the water vapor lapse rate (Parracho et al. 2018).

The IWV extracted from ERA5 are intended to be compared with the IWV calculated at each CentipedeRTK antenna. To avoid errors associated with this vertical extrapolation, IWV have only been extracted for antennas whose height difference from the ERA5 orography is less than 100 m. The CentipedeRTK ZTD are converted to IWV using the method described in Bosser and Bock (2021):

  • ZHD were calculated for each CentipedeRTK station using the ERA5 Surface Pressure (SP) analysis grid and the modified Saastamoinen formula (Saastamoinen 1972; Bosser et al. 2007).

  • ZWD were then calculated by subtracting these ZHD from the estimated ZTD and converted to IWV using the Bevis et al. (1992) formula:

    $$\begin{aligned} IWV = \kappa (T_m)\times ZWD \end{aligned}$$
    (4)

    where \(\kappa\) is a function given by Bevis et al. (1992) that depends on the integrated mean temperature, \(T_m\), which has been interpolated using values provided by the TU-Wien database (https://vmf.geo.tuwien.ac.at/, last access: March 15, 2024).
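The two conversion steps above can be sketched numerically. Several elements are assumptions and not taken from the text: the classic Saastamoinen expression is used in place of the modified formula of Bosser et al. (2007), the refractivity constants \(k_2'\) and \(k_3\) and the gas constant \(R_v\) are commonly used values, and all function names are hypothetical.

```python
import math

RV = 461.5     # J kg^-1 K^-1, specific gas constant of water vapor (assumed)
K2P = 22.1     # K/hPa, refractivity constant k2' (assumed value)
K3 = 3.739e5   # K^2/hPa, refractivity constant k3 (assumed value)


def zhd_saastamoinen(p_hpa, lat_deg, h_m):
    """Classic Saastamoinen ZHD (m) from surface pressure, latitude and height."""
    f = 1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg)) - 0.28e-6 * h_m
    return 0.0022768 * p_hpa / f


def kappa(tm_k):
    """Conversion factor of Eq. 4, in kg m^-2 of IWV per metre of ZWD."""
    # Refractivity constants are converted from per-hPa to per-Pa (1 hPa = 100 Pa).
    return 1e6 / (RV * (K3 / 100.0 / tm_k + K2P / 100.0))


def ztd_to_iwv(ztd_m, p_hpa, lat_deg, h_m, tm_k):
    """IWV (kg m^-2) from a ZTD (m), surface pressure (hPa) and mean temperature (K)."""
    zwd_m = ztd_m - zhd_saastamoinen(p_hpa, lat_deg, h_m)
    return kappa(tm_k) * zwd_m
```

With these constants, \(\kappa\) is about 154 kg m\(^{-3}\) at \(T_m = 270\) K, so a ZWD of 100 mm corresponds to roughly 15 kg m\(^{-2}\) of IWV.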

Assessment of CentipedeRTK retrievals

Comparison of GNSS retrievals

Positions

A first evaluation of the estimates retrieved for the CentipedeRTK network stations was performed by studying the repeatability of the estimated positions for each antenna of the network. The positions were first expressed in the ITRF2014 reference frame (Altamimi et al. 2016). The positions were then transformed into RGF93_v2b, the French national reference frame (Fages 2021), using the 14-parameter similarity transformation between ITRF2014 and ETRF2000(R14) (Altamimi 2018) and the plate rotation poles for the Eurasian plate from Altamimi et al. (2017). Repeatabilities were calculated from the daily positions estimated for each station. These positions were weighted by the number of epochs available after screening for each daily session (maximum of 288 epochs per day, with solutions calculated at 5 min time resolution).
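As an illustration of the weighting described above, the repeatability of one position component can be sketched as a weighted standard deviation. The exact definition used in the study is not given; this sketch assumes the weighted scatter of the daily positions about their weighted mean, with no trend removed, and the function name is hypothetical.

```python
import numpy as np


def weighted_repeatability(daily_pos_mm, n_epochs):
    """Weighted scatter (mm) of daily positions for one component (E, N or U).

    daily_pos_mm : daily position estimates in millimetres
    n_epochs     : number of 5-min epochs kept after screening for each day
                   (at most 288), used as the weight of that day
    """
    x = np.asarray(daily_pos_mm, dtype=float)
    w = np.asarray(n_epochs, dtype=float)

    # Ignore days without a position or without any retained epoch.
    ok = np.isfinite(x) & (w > 0)
    x, w = x[ok], w[ok]

    mean = np.average(x, weights=w)
    return float(np.sqrt(np.average((x - mean) ** 2, weights=w)))
```

A day with only a few retained epochs therefore contributes little to the repeatability, which limits the influence of poorly constrained daily solutions.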

Fig. 3

Distribution of the weighted position repeatabilities computed for each station of the CentipedeRTK network. From left to right: a East component (mm), b North component (mm), c Up component (mm). Framed numerical values indicate the mean and the standard deviation

Figure 3 shows the distribution of repeatabilities calculated on the East, North and Up components for all selected CentipedeRTK stations (265). For all stations, the average repeatability for the horizontal components is quite good, about 2 mm, with a standard deviation of about 1 mm. It can be seen that for the majority of stations, the repeatability on the horizontal components is less than 5 mm (96% for the East component, 99% for the North component). The repeatabilities computed for the vertical component are generally higher by a factor of 3, with an average value of \(6.7\pm 5.4\) mm; 95% of the stations have repeatabilities on the vertical component of less than 10 mm.

Fig. 4

Distribution of the weighted position repeatabilities computed for each station of the conventional networks (top, a–c) and the CentipedeRTK network (bottom, d–f); only the 186 comparable stations are considered. From left to right: a, d East component (mm), b, e North component (mm), c, f Up component (mm). Framed numerical values indicate the mean and the standard deviation

Figure 4 compares the repeatabilities calculated for stations in the conventional (top panels) and CentipedeRTK (bottom panels) networks. Only matching stations (186) are shown here: even though the stations are not strictly co-located and their environments may be very different, this gives an idea of the differences on a comparable sample. Conventional stations show better repeatability on average, with much lower dispersion than CentipedeRTK stations. This is true for all components. The average repeatability of the CentipedeRTK stations is 15–20% higher for each component, and dispersions are almost doubled for all position components. In absolute terms, the average repeatability degrades by less than 1 mm for the horizontal components, and by 1 mm for the vertical. On the vertical, the dispersion is much greater with CentipedeRTK stations: this may be due to the type of monument used (the roof of a farm building for many stations), the absence of an antenna model for some stations, or the higher sensitivity of these antennas to multipath. Overall, visual inspection of several position time series from CentipedeRTK and nearby conventional stations indicates general agreement between the two datasets.

In Supplementary Materials, Fig. S1 shows the time series of positions in East, North and Up for two CentipedeRTK stations (LIEN and CETT) and the corresponding nearby conventional stations (LROC and SETE). Although the time series are sometimes noisier for the CentipedeRTK stations, the overall variations are fairly similar; these common variations are probably due to deficiencies in the modeling of geophysical effects (troposphere model, tides and/or loading, etc.) or errors in the JPL products used.

Troposphere delays

We focus here on the ZTD estimated for the 186 station pairs of the CentipedeRTK and conventional networks. Figure 5 shows the histograms of bias, standard deviation and correlation calculated for each station pair. The average bias is 2.7 mm (estimates from CentipedeRTK stations being larger than those from conventional stations) and was shown to be statistically significant using a Student’s t-test. This value is close to that already reported in the literature when comparing ZTD estimated by low-cost and conventional antennas, and may be related to a deficiency in the calibration of low-cost antennas or to multipath, as suggested by Krietemeyer et al. (2020). The variability of this bias is large, about 5.5 mm, with about 80% of stations showing a bias within \(\pm 7.5\) mm. This high variability may be due to the variable environmental conditions of the CentipedeRTK antennas, which can sometimes be far from those preferred for geodetic and geoscience applications. The mean standard deviation of the differences is slightly less than 5 mm and is fairly stable over the entire network. This results in a mean RMS of differences of \(7.4\pm 3.0\) mm. The mean correlation of the ZTD time series is very high (0.995), highlighting a very good consistency in the temporal evolution of the ZTD retrieved by the two types of stations.
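
The per-pair statistics discussed above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the significance check assumes a one-sample t statistic applied to the per-pair biases, which may not be the exact form of the test used in the study.

```python
import numpy as np

def pair_statistics(ztd_lowcost_mm, ztd_conv_mm):
    """Bias, standard deviation, RMS and correlation for one station pair."""
    a = np.asarray(ztd_lowcost_mm, dtype=float)
    b = np.asarray(ztd_conv_mm, dtype=float)
    ok = np.isfinite(a) & np.isfinite(b)   # keep epochs present in both series
    a, b = a[ok], b[ok]
    d = a - b                              # CentipedeRTK minus conventional
    bias = d.mean()
    std = d.std(ddof=1)
    rms = np.sqrt(np.mean(d ** 2))
    corr = np.corrcoef(a, b)[0, 1]
    return bias, std, rms, corr

def one_sample_t(biases_mm):
    """t statistic for the null hypothesis that the mean per-pair bias is zero."""
    x = np.asarray(biases_mm, dtype=float)
    return x.mean() / (x.std(ddof=1) / np.sqrt(x.size))

# Same statistic evaluated from the summary values quoted in the text
# (mean bias 2.7 mm, dispersion 5.5 mm, 186 pairs)
t_reported = 2.7 / (5.5 / np.sqrt(186))
```

Under this assumption, the summary values give a t statistic of roughly 6.7, far above the two-sided 5% critical value of about 2 for 185 degrees of freedom, which is in line with the reported significance of the bias.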

Fig. 5

Comparison of ZTD estimated for stations in the CentipedeRTK and conventional networks (CentipedeRTK − Conventional): a bias, b standard deviation, c correlation. Framed numerical values indicate the mean and the standard deviation

In Supplementary Materials, Fig. S2 illustrates the good agreement of estimated ZTD with those from conventional stations (SETE and LROC) for two CentipedeRTK stations (CETT and LIEN).

Figure 6 shows the time variation of mean ZTD, bias, standard deviation and number of stations compared for the whole network. The number of pairs of values compared varies over the course of the year, for the reasons already mentioned in the previous section. The bias is stable over time, around 2–3 mm as noted above, with a few negative peaks associated with decreases in the number of comparable stations available. The standard deviations of the differences are generally lower (6–8 mm) in winter, when the mean ZTD is also lower; in summer, the standard deviations of the differences are higher (8–12 mm), coinciding with periods when the mean ZTD is also higher, due to the higher temperatures that allow the atmosphere to hold more water vapor.

Fig. 6

Time variation of spatial mean ZTD and corresponding standard deviation as a filled area (a), bias (b), standard deviation of the differences (c) and number of comparison points (d)

To investigate the origin of this bias, we first examined the spatial distribution of the differences (Fig. 7). No geographical pattern is apparent, either for the bias between networks or for the standard deviation of the differences. We also checked that there was no relationship between the height difference within a station pair and the observed biases (not shown here), which could have revealed a shortcoming in our extrapolation method.

Fig. 7

Spatial distribution of parameters derived from comparisons of estimated ZTD between CentipedeRTK and conventional stations. a bias, b standard deviation of differences

An intercomparison of the stations from conventional networks only was also carried out. To do this, we identified pairs of comparable stations within the conventional networks, using the same strategy as described in Sect. 2.4. This resulted in 176 pairs of stations from the conventional networks for comparison. The statistics of the differences in ZTD calculated on these pairs showed no significant bias (mean bias of 0.3 ± 4.1 mm) and the standard deviations were similar to those obtained for CentipedeRTK stations (4.6 ± 1.7 mm). This confirms that, on average, conventional stations do not have a systematic bias and that the vertical extrapolation method we use does not introduce one. The bias observed here with the CentipedeRTK stations may therefore be inherent to the system used (receiver and/or antenna) or to the environmental conditions of the stations. For each CentipedeRTK station, we checked the availability of observations from low-elevation satellites; the absence of such measurements could be explained by obstructions preventing good visibility of the sky and thus degrading the GNSS solution. Although 2 of the most biased stations have a low observation rate between 7 and 10\(^{\circ }\), no systematic behavior was found (not shown here).

The type of antenna used on each CentipedeRTK station was also examined. Figure 8 shows the bias and standard deviation of the differences observed between the 186 CentipedeRTK stations and conventional stations as a function of the type of antenna fitted to the CentipedeRTK station. Only antenna models fitted to at least 5 stations are considered, i.e. 3 models: JCA228B002 (122 stations, relative calibration), AS-ANT2BCAL (33 stations, relative calibration) and ANN-MB-00 (24 stations, no calibration). The stations equipped with the JCA228B002 antenna are the most numerous and have a statistically significant bias of about 2 mm. The AS-ANT2BCAL antenna has a higher average bias, despite the availability of a calibration sheet. Finally, the uncalibrated ANN-MB-00 antenna shows an average bias of 3.7 mm, with a station-to-station standard deviation of 6.5 mm. These last comparisons suggest that the CentipedeRTK stations have a systematic bias, which is more pronounced for the AS-ANT2BCAL and ANN-MB-00 antennas.
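
The grouping by antenna model can be sketched as below. The function name, the input layout and the sample values are hypothetical and serve only to illustrate the selection of models fitted to at least 5 stations; they are not the values of the study.

```python
from collections import defaultdict
from statistics import mean, stdev

def bias_by_antenna(pairs, min_stations=5):
    """Group per-station ZTD biases by antenna model.

    pairs: iterable of (antenna_model, bias_mm) tuples, one per CentipedeRTK station.
    Returns {model: (n_stations, mean_bias_mm, std_bias_mm)} for the models fitted
    to at least `min_stations` stations.
    """
    if min_stations < 2:
        raise ValueError("at least 2 stations are needed for a standard deviation")
    groups = defaultdict(list)
    for model, bias in pairs:
        groups[model].append(bias)
    return {
        model: (len(b), mean(b), stdev(b))
        for model, b in groups.items()
        if len(b) >= min_stations
    }

# Hypothetical biases (mm), for illustration only
sample = [("ANN-MB-00", v) for v in (3.0, 5.0, -1.0, 8.0, 4.0)] + [("OTHER", 1.0)]
summary = bias_by_antenna(sample)
```

In this toy input the model named "OTHER" is dropped because it equips a single station, mirroring the way rarely used antenna models are excluded from Fig. 8.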

Fig. 8

Means and standard deviations of the ZTD differences for the CentipedeRTK stations equipped with the JCA228B002, AS-ANT2BCAL and ANN-MB-00 antennas. The colors indicate the different antenna types. The numbers in brackets are the number of stations and the last values are the mean bias and standard deviation as a function of antenna type

Evaluation of CentipedeRTK IWV with ERA5

A second assessment compares the IWV extracted from the ERA5 reanalysis with those calculated from the ZTD for the stations in the CentipedeRTK network. In total, 240 CentipedeRTK stations are included in these comparisons; 25 could not be compared because their height differs from the ERA5 model orography by more than 100 m.

Figure 9 shows the histograms of bias, standard deviation and correlation calculated between ERA5 and each CentipedeRTK station. The mean differences, a bias of 0.06 ± 0.82 kg m\(^{-2}\) and a standard deviation of 1.48 ± 0.18 kg m\(^{-2}\), are consistent with those reported in the literature for similar comparisons, despite a slightly high dispersion of the bias per station. It is noteworthy that Bosser and Bock (2021) reported a bias of 0.36 ± 0.40 kg m\(^{-2}\) over Western Europe with a similar analysis strategy, over 2 months, for a set of stations from conventional networks (ERA5 wetter than GNSS). Previous results showed that ZTD from CentipedeRTK stations were almost 3 mm larger than those from conventional stations, which corresponds to roughly 0.4 kg m\(^{-2}\) in IWV. Since the two offsets have opposite signs and similar magnitudes, a small bias between ERA5 and CentipedeRTK was expected. The station with the largest bias with ERA5 (−3.12 kg m\(^{-2}\)) was also identified as having the largest bias with respect to a conventional station; it is also one of the stations with a low rate of observations for satellites with elevations between 7 and 10\(^{\circ }\). In terms of RMS of the differences, 88.8% of the stations have values below 2 kg m\(^{-2}\) and 99.2% below 3 kg m\(^{-2}\) (only 2 stations exceeding this value). The correlation coefficients are high, averaging 0.984 ± 0.003 and exceeding 0.97 for 99.6% of the stations (only one station being below 0.97, with a correlation coefficient of 0.966).
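
The conversion of a 2.7 mm delay bias into an IWV bias of about 0.4 kg m\(^{-2}\) can be checked with a short sketch. The refractivity constants, the gas constant and the mean temperature of 275 K are assumed values commonly quoted in the GNSS meteorology literature, not necessarily those used in this study, and the sketch attributes the whole ZTD bias to the wet delay.

```python
# Assumed constants (illustrative values from the GNSS meteorology literature)
K2_PRIME = 22.1   # K hPa^-1
K3 = 3.739e5      # K^2 hPa^-1
R_V = 461.5       # J kg^-1 K^-1, specific gas constant of water vapor

def kappa(tm_kelvin):
    """Delay-to-IWV factor, in kg m^-2 per metre of zenith wet delay."""
    k2p_si = K2_PRIME / 100.0   # converted to K Pa^-1
    k3_si = K3 / 100.0          # converted to K^2 Pa^-1
    return 1.0e6 / (R_V * (k3_si / tm_kelvin + k2p_si))

def delay_to_iwv(delay_mm, tm_kelvin=275.0):
    """IWV (kg m^-2) equivalent to a wet delay given in mm."""
    return kappa(tm_kelvin) * delay_mm * 1.0e-3

# Mean ZTD bias between CentipedeRTK and conventional stations, in mm
iwv_bias = delay_to_iwv(2.7)
```

With these assumptions the factor is close to 157 kg m\(^{-2}\) per metre of delay, so that 2.7 mm of delay maps to about 0.42 kg m\(^{-2}\), consistent with the order of magnitude quoted above. The result changes by only a few percent over the range of mean temperatures encountered at mid-latitudes.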

Fig. 9

Comparison of IWV calculated for stations in the CentipedeRTK network and ERA5 (CentipedeRTK − ERA5): a bias, b standard deviation, c correlation. Framed numerical values indicate the mean value and the standard deviation

An illustration of the agreement between ERA5 and the CentipedeRTK stations is shown in Supplementary Materials, Fig. S3, for the CETT and LIEN stations.

The hourly evolution of the mean IWV (CentipedeRTK only), its spatial variability and the differences in IWV with respect to ERA5 (spatial bias and standard deviation) are shown in Fig. 10. The general trend in IWV is that classically observed at mid-latitudes, with lower values in winter (10–20 kg m\(^{-2}\)) and higher values in summer (20–40 kg m\(^{-2}\)). The bias is centered around 0 throughout 2023, with larger variations (up to 2.5 kg m\(^{-2}\)) in summer when IWV variations are larger. The standard deviation of the differences is less than 2 kg m\(^{-2}\) in winter and reaches higher values in summer when IWV is higher (up to 4 kg m\(^{-2}\)). Over the year 2023, the RMS of the spatial differences is less than 2 kg m\(^{-2}\) in 82.8% of the epochs and less than 3 kg m\(^{-2}\) in 99.1% of the epochs.
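
The per-epoch network statistics shown in Fig. 10 can be sketched as follows, assuming the IWV values are stored as arrays with one row per hourly epoch and one column per station, with NaN marking missing data. The function names and this layout are hypothetical.

```python
import numpy as np

def epoch_statistics(iwv_gnss, iwv_model):
    """Spatial bias, standard deviation and RMS of the differences at each epoch.

    Both inputs have shape (n_epochs, n_stations); NaN marks missing data.
    """
    d = np.asarray(iwv_gnss, dtype=float) - np.asarray(iwv_model, dtype=float)
    bias = np.nanmean(d, axis=1)               # mean difference over stations
    std = np.nanstd(d, axis=1)                 # spread of the differences over stations
    rms = np.sqrt(np.nanmean(d ** 2, axis=1))  # combines bias and spread
    n = np.sum(np.isfinite(d), axis=1)         # number of comparison points
    return bias, std, rms, n

def fraction_below(rms, threshold):
    """Fraction of epochs whose RMS is below the given threshold."""
    rms = np.asarray(rms, dtype=float)
    valid = np.isfinite(rms)
    return float(np.mean(rms[valid] < threshold))

# Hypothetical values (kg m^-2): 2 epochs, 3 stations, one missing observation
gnss = [[10.0, 12.0, np.nan], [20.0, 21.0, 22.0]]
era5 = [[9.0, 11.0, 15.0], [20.0, 20.0, 20.0]]
bias, std, rms, n = epoch_statistics(gnss, era5)
```

Calling `fraction_below(rms, 2.0)` and `fraction_below(rms, 3.0)` on a full year of hourly epochs would give percentages of the kind quoted above.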

Fig. 10

Time variation of spatial mean IWV and corresponding standard deviation as a filled area (a), bias between CentipedeRTK stations and ERA5 (b), standard deviation of differences between CentipedeRTK stations and ERA5 (c) and number of comparison points (d)

Discussion

The differences observed between CentipedeRTK and conventional stations remain moderate, reflecting the ability of low-cost stations to provide high quality atmospheric water vapor measurements. However, there is a statistically significant bias between the two types of stations, with the low-cost antennas giving slightly larger delays. This bias appears to be antenna type dependent and has been noted in previous studies. Using an uncalibrated ANN-MB-00 antenna, Stepniak et al. (2022) showed a bias of the order of 2 mm (the low-cost antenna providing a larger tropospheric delay than a geodetic-class antenna). The same study also showed that the type of low-cost receiver (ZED-F9P) had no effect on this bias. Over a 36-h period, Aichinger-Rosenberger et al. (2023) also show that two types of low-cost antennas (including the AS-ANT2BCAL model also used by some CentipedeRTK stations) provide larger tropospheric delays than a geodetic-class antenna. The biases reported in these studies are of the same order of magnitude (and sign) as those obtained in this study, which was carried out over a longer period and a larger number of stations. A specific relative calibration of the antennas (to correct for variations in the antenna phase center and any multipath), as suggested by Krietemeyer et al. (2020), seems to be a good way to mitigate this bias. These results support the hypothesis that the type of low-cost antenna used, whether calibrated or not, is a possible origin of this bias.

The differences observed between the IWV estimated for the CentipedeRTK stations and the ERA5 reanalysis are slightly larger than those obtained for conventional GNSS stations (Bosser and Bock 2021). However, these differences remain small in magnitude, with almost all stations showing RMS differences of less than 3 kg m\(^{-2}\) and more than 80% of them showing RMS differences of less than 2 kg m\(^{-2}\). In view of the accuracy requirements for the use of GNSS retrievals for Numerical Weather Prediction (NWP) outlined by Offiler (2010), all these results confirm the benefits of low-cost GNSS stations for NWP; the differences observed are compatible with the "breakthrough" level, for nowcasting, local, regional and global NWP, with a required accuracy of 2.0 kg m\(^{-2}\). This "breakthrough" level should allow significant improvements for these different NWP applications, should such GNSS estimates be assimilated. For climatology applications, the requirements are more stringent, with an expected accuracy of the order of 1.5 kg m\(^{-2}\) for a "breakthrough" level. Given the observed differences, values from CentipedeRTK stations are likely to exceed this threshold; a reduction in the observed bias, probably through improved antenna calibration or multipath mitigation, could make it possible to meet this requirement.

Finally, it should be noted that these results have been obtained for a network of stations deployed for real-time centimeter-level positioning. Even if installation recommendations are given to the various participants, the immediate environment of the stations is not necessarily as favorable as that of more conventional stations. It should also be borne in mind that these results have been obtained using observations from the GPS constellation only; the use of data from other GNSS constellations can be expected to improve the quality of solutions from CentipedeRTK stations.

Monitoring an atmospheric river with CentipedeRTK stations

To illustrate the potential use of data from a network of low-cost stations for meteorological purposes, we studied the weather situation in mid-December 2023, when a Rhum-Express-type atmospheric river reached France. An atmospheric river is a meteorological phenomenon characterized by the large-scale transport of atmospheric moisture over long distances, often associated with extreme weather conditions. In practical terms, it refers to narrow and intense bands of water vapor moving rapidly through the atmosphere, driven by powerful weather systems. These atmospheric rivers cause a rapid increase in both surface temperature and integrated water vapor content, and can lead to heavy precipitation that can cause floods and other natural disasters (Corringham et al. 2019). The term Rhum-Express designates an atmospheric river extending from the Caribbean to Western Europe.

Figures 11 and 12 show the IWV maps retrieved by the GNSS CentipedeRTK stations, the associated IWV horizontal gradients and the rainfall rates measured over a 6 h period by the Météo-France network of surface meteorological stations (https://portail-api.meteofrance.fr/, last access: April 10, 2024). The IWV horizontal gradients are derived from the GNSS wet gradients using Eq. (4). The GNSS horizontal wet gradients are calculated using the same methodology as Ning and Elgered (2021): the hydrostatic tropospheric gradients are derived from the global 6 h grids provided by TU-Wien (Landskron and Böhm 2018), and subtracted from the GNSS total horizontal gradients to obtain the wet values. Figure S4 in Supplementary Materials shows how well the gradients from the CentipedeRTK stations agree with those from the conventional stations.

Fig. 11

Left: IWV (colored circles) and horizontal IWV gradients (gray arrows) for stations of the CentipedeRTK network. Right: 6 h rainfall rates measured at ground meteorological stations. From top to bottom: 10 December, 06:00; 10 December, 12:00

Fig. 12

Same as Fig. 11. From top to bottom: 11 December, 06:00; 12 December, 00:00

Fueled by a series of depressions over the North Atlantic, a warm, moist atmospheric flow spread from the West Indies and reached the European continent on 10 December 2023. The first result was a significant increase in the IWV, which rose within a few hours from 5–10 kg m\(^{-2}\) (Fig. 11, top left) to over 30–40 kg m\(^{-2}\) (Fig. 11, bottom left). This is particularly noticeable at coastal stations in central–western France. The arrival of the moisture front on the French coast was marked by locally heavy rainfall from midday on 10 December (Fig. 11, bottom right). The rain front then gradually spread across a band between 44 and 47\(^\circ\) north, crossing the whole of France (Fig. 12, top right). Around midnight on 12 December, the precipitation zone shifted slightly northwards, coinciding with a wet front that was also slightly further north (Fig. 12, bottom right). The event gradually ended between 13 and 14 December, with lower rainfall and IWV fluctuating spatially between 10 and 25 kg m\(^{-2}\) (Fig. 12, bottom left). Although they appear to be more spatially variable, with some stations likely to have anomalous values, the horizontal gradients are overall equally sensitive to the passage of the front. On 10 December at 6 a.m., no particular trend was observed; for the other periods shown, the majority of the stations showed horizontal gradients in agreement with the spatial distribution of IWV, especially in the areas with the highest IWV (west-central coastal region).

Fig. 13

IWV (kg m\(^{-2}\), left axis) calculated for the CentipedeRTK DELP station (blue) and from ERA5 (red); 1 h rainfall rate (mm, right axis) for Météo-France weather station number 86139001, less than 5 km away

In Fig. 13 we focus on the CentipedeRTK DELP station (circle framed in black in Figs. 11 and 12) by comparing the IWV retrieved at this station, the values extracted from ERA5 and the rainfall measured at a nearby meteorological station (less than 5 km away). There is a sharp increase in IWV from 6 am (IWV less than 10 kg m\(^{-2}\)) to around 2 pm (more than 30 kg m\(^{-2}\)). The timing of the increase is well captured by ERA5, although the model underestimates the peak by more than 3 kg m\(^{-2}\). The IWV remains high, between 25 and 35 kg m\(^{-2}\), for more than 2 days. During this wet event, the ERA5 IWV variations are regularly shifted by 1–2 h with respect to GNSS retrievals. IWV peaks are also underestimated (by 1–5 kg m\(^{-2}\)). ERA5 places the end of the wet event about 1–2 h early. From noon on 12 December, IWV values are at more conventional levels (around 15–20 kg m\(^{-2}\)). Both the significant increase in IWV at the beginning of the episode and the subsequent period of high IWV values are accompanied by heavy precipitation, with rainfall rates exceeding 4 mm h\(^{-1}\). Three sequences of heavy rainfall can be identified during this period: around 10 December at 15:00, then around 11 December at 2:00, and finally around 12 December at 2:00. The latter event was accompanied by a rapid decrease in IWV from 30 to 15 kg m\(^{-2}\).
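
A time shift between two hourly IWV series, such as the 1–2 h offsets noted above, could be quantified with a lagged correlation. The sketch below is one possible approach, not the method used in this study; the function name and the synthetic series are hypothetical.

```python
import numpy as np

def best_lag(reference, other, max_lag):
    """Lag (in samples) of `other` relative to `reference` that maximizes correlation.

    A positive result means that `other` lags behind `reference`.
    """
    ref = np.asarray(reference, dtype=float)
    oth = np.asarray(other, dtype=float)
    best, best_r = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = ref[: ref.size - lag], oth[lag:]
        else:
            a, b = ref[-lag:], oth[: oth.size + lag]
        r = np.corrcoef(a, b)[0, 1]   # correlation of the overlapping parts
        if r > best_r:
            best, best_r = lag, r
    return best, best_r

# Synthetic hourly series (kg m^-2): a moist episode and a copy delayed by 2 h
t = np.arange(72)
series_a = 20.0 + 10.0 * np.exp(-(((t - 30) / 6.0) ** 2))
series_b = np.roll(series_a, 2)
lag, r = best_lag(series_a, series_b, max_lag=6)
```

With hourly sampling the returned lag is directly expressed in hours. Real series would first need gaps and missing values to be handled, which this sketch does not do.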

Conclusion

In this work, we have evaluated the tropospheric delays retrieved by a regional network of low-cost GNSS stations located in mainland France. These GNSS stations are part of the CentipedeRTK network, a collaborative network of more than 650 low-cost GNSS stations used for real-time centimeter-level positioning. Originally established in France, this network has been widely exported abroad in recent years. We focused on raw data collected by more than 300 stations in mainland France in 2023. A screening procedure was used to detect outliers and spurious estimates, resulting in a set of 265 stations used in this work.

Overall, the tropospheric delays calculated for these stations agree quite well with those from conventional stations, which are much more commonly used for such applications, with an average RMS of differences of 7.4 ± 3.0 mm. It should be noted that these differences are accompanied by an average bias of 2.7 mm. This bias is significant and stable over time. A number of possible explanations for this bias have been investigated, but its origin could not be conclusively established. However, it should be noted that this bias seems to depend on the antenna used to equip the GNSS station, confirming the results of previous studies using a smaller dataset. Poor antenna calibration or the presence of multipath due to the station environment may explain these discrepancies. The IWV from low-cost GNSS stations were also compared with those extracted from the ERA5 reanalysis, with average differences similar to those found in the literature. Finally, a case study focusing on the impact of an atmospheric river over western France, accompanied by an episode of heavy precipitation, demonstrated the performance of a network of low-cost GNSS stations for monitoring the spatial and temporal distribution of atmospheric water vapor.

The main shortcoming of these low-cost systems therefore seems to be their antenna. One solution would be to use the relative calibration method proposed by Krietemeyer et al. (2020) to calibrate uncalibrated antennas and to verify the calibration of calibrated antennas. It may also be possible to ask companies specialized in GNSS antenna calibration to determine the absolute calibrations of low-cost antennas; it would then be interesting to verify the stability of the calibration model between different antennas of the same model. Finally, multipath mitigation by the stacking of line-of-sight post-fit phase residuals has already been applied with conclusive results (Shoji et al. 2004; Bosser et al. 2010) and could easily be adapted to stations from the CentipedeRTK network.

This study confirms the great potential of low-cost GNSS receivers for monitoring atmospheric water vapor. The use of such systems could make a significant contribution to the cost-effective densification of existing networks, and could be particularly beneficial in areas where permanent GNSS networks are still underdeveloped. We hope that the international development of the CentipedeRTK network will contribute to the growing use of low-cost GNSS data by the meteorological community.