1 Introduction

The accurate measurement of time and frequency is of fundamental importance to science and technology, with applications including the measurement of fundamental physical constants, global navigation satellite systems and the geodetic determination of physical heights. Time is also one of the seven base physical quantities within the International System of Units (SI), and the second is one of the seven base units. Since 1967 the second has been defined in terms of the transition frequency between the two hyperfine levels of the ground state of the caesium-133 atom. The second can also be realized with far lower uncertainty than the other SI units, with many other physical measurements relying on it.

Since the successful operation of the first caesium atomic frequency standard at the UK National Physical Laboratory (NPL) in 1955, the accuracy of caesium microwave frequency standards has improved continuously and the best primary standards now have fractional uncertainties of a few parts in \(10^{16}\) (Heavner et al. 2014; Levi et al. 2014; Guena et al. 2012; Weyers et al. 2012; Li et al. 2011). However, frequency standards based on optical, rather than microwave, atomic transitions have recently demonstrated a performance exceeding that of the best microwave frequency standards, which is likely to lead to a redefinition of the SI second. The optical clocks operate at frequencies about five orders of magnitude higher than the caesium clocks and thus achieve a higher frequency stability, approaching fractional stabilities and uncertainties of one part in \(10^{18}\) (Chou et al. 2010; Hinkley et al. 2013; Bloom et al. 2014; Nicholson et al. 2015; Ushijima et al. 2015; Huntemann et al. 2016). Reviews of optical frequency standards and clocks are given in Ludlow et al. (2015) and Margolis (2010). In this context, a thorough definition of frequency standards and clocks is beyond the scope of this contribution, and hence, for simplicity, both terms are used interchangeably in the following.

One key prerequisite for a redefinition of the second is the integration of optical atomic clocks into the international timescales TAI (International Atomic Time) and UTC (Coordinated Universal Time). This requires a coordinated programme of clock comparisons to gain confidence in the new generation of optical clocks within the international metrology community and beyond, to validate the corresponding uncertainty budgets, and to anchor their frequencies to the present definition of the second. Such a comparison programme has, for example, been carried out within the collaborative European project “International Timescales with Optical Clocks” (ITOC; Margolis et al. 2013).

Due to the demonstrated performance of atomic clocks and time transfer techniques, the definition of timescales and clock comparison procedures must be handled within the framework of general relativity. Einstein’s general relativity theory (GRT) predicts that ideal clocks will in general run at different rates with respect to a common (coordinate) timescale if they move or are under the influence of a gravitational field, which is associated with the relativistic redshift effect (one of the classical general relativity tests). Considering the usual case of two earthbound clocks at rest, the relativistic redshift effect is directly proportional to the corresponding difference in the gravity (gravitational plus centrifugal) potential W at both sites, where one part in \(10^{18}\) clock frequency shift corresponds to about \(0.1\,\hbox {m}^{2}\,\hbox {s}^{-2}\) in terms of the gravity potential difference, which is equivalent to 0.01 m in height. Hence, geodetic knowledge of heights and the Earth’s gravity potential can be used to predict frequency shifts between local and remote (optical) clocks, and vice versa, frequency standards can be used to determine gravity potential differences. The latter technique has variously been termed “chronometric levelling”, “relativistic geodesy”, and “chronometric geodesy” (e.g. Bjerhammar 1975, 1985; Vermeer 1983; Delva and Lodewyck 2013). It offers the great advantage of being independent of any other geodetic data and infrastructure, with the prospect of overcoming some of the limitations inherent in the classical geodetic approaches. For example, it could be used to interconnect tide gauges on different coasts without direct geodetic connections and help to unify various national height networks, even in remote areas.
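The order-of-magnitude correspondence between a fractional frequency shift, a gravity potential difference, and a height difference can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function names and the nominal surface gravity of \(9.81\,\hbox {m}\,\hbox {s}^{-2}\) are assumptions introduced here and are not taken from the text.

```python
# Speed of light in vacuum (exact SI value), in m/s.
C = 299_792_458.0

# Nominal surface gravity in m/s^2; an assumed round value for illustration.
G_NOMINAL = 9.81


def potential_difference(fractional_shift: float) -> float:
    """Gravity potential difference (m^2/s^2) matching a fractional frequency shift."""
    return fractional_shift * C**2


def height_equivalent(fractional_shift: float, g: float = G_NOMINAL) -> float:
    """Approximate height difference (m), assuming delta_W ~ g * delta_H."""
    return potential_difference(fractional_shift) / g


if __name__ == "__main__":
    shift = 1e-18
    print(f"delta W ~ {potential_difference(shift):.3f} m^2/s^2")
    print(f"delta H ~ {height_equivalent(shift):.4f} m")
```

For a shift of one part in \(10^{18}\), this gives roughly \(0.09\,\hbox {m}^{2}\,\hbox {s}^{-2}\) and 0.009 m, consistent with the rounded figures of \(0.1\,\hbox {m}^{2}\,\hbox {s}^{-2}\) and 0.01 m.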

In the context of general relativity, it is important to distinguish between proper quantities that are locally measurable and coordinate quantities that depend on conventions. An ideal clock can only measure local time, and hence, it defines its own timescale that is only valid in the vicinity of the clock, i.e. proper time. On the other hand, coordinate time is the time defined for a larger region of space with associated conventional spacetime coordinates. Time metrology provides specifications for the unit of proper time as well as the relevant models for coordinate timescales. In this context, the SI second, as defined in 1967, has to be considered as an ideal realization of the unit of proper time (e.g. Soffel and Langhans 2013). Usually, the graduation unit of coordinate time is also named the “second” due to the mathematical link to the SI second as a unit of proper time, but some authors consider relativistic coordinates as dimensionless; for a recent discussion of this topic, see Klioner et al. (2010).

Furthermore, the notion of simultaneity is not defined a priori in relativity, and thus, a conventional choice has to be made, which is usually done by considering two events in some (spacetime) reference system as simultaneous if they have equal values of coordinate time in that system (e.g. Klioner 1992). This definition of simultaneity is called coordinate simultaneity (and is associated with coordinate synchronization), making clear that it is entirely dependent on the chosen reference system and hence is relative in nature. Accordingly, syntonization is defined as the matching of corresponding clock frequencies.

For the construction and dissemination of international timescales, the (spacetime) Geocentric Celestial Reference System (GCRS) and the associated Geocentric Coordinate Time (TCG) play a fundamental role. However, for an earthbound clock (at rest) near sea level, realizing proper time, the sum of the gravitational and centrifugal potential generates a relative frequency shift of approximately \(7 \times 10^{-10}\) (corresponding to about 22 ms/year) with respect to TCG, because the GCRS is a geocentric and non-rotating system. In order to avoid this inconvenience for all practical timing issues at or near the Earth’s surface, Terrestrial Time (TT) was introduced as another coordinate time associated with the GCRS. TT differs from TCG just by a constant rate, which was first specified by the International Astronomical Union (IAU) within Resolution A4 (1991) through a conversion constant (denoted as \(L_{G})\) based on the “SI second on the rotating geoid”, noting that \(L_{G}\) is directly linked to a corresponding (zero) reference gravity potential value, usually denoted as \(W_{0}\). However, due to the intricacy and problems associated with the definition, realization and changes of the geoid, the IAU decided in Resolution B1.9 (2000) to turn \(L_{G}\) into a defining constant with zero uncertainty. The numerical value of \(L_{G}\) was chosen to maintain continuity with the previous definition, where it is important to note that the corresponding zero potential \(W_{0}\) also has no uncertainty (as it is directly related to the defining constant \(L_{G})\). Therefore, regarding the relativistic redshift correction for a clock at rest on the Earth’s surface, contributing to international timescales, the absolute gravity potential W is required relative to a given conventional value \(W_{0}\), whereby only the uncertainty of W matters (as \(W_{0}\) has zero uncertainty), while potential differences suffice for local and remote clock comparisons.
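The rate offset between TCG and a clock near sea level, and the link between \(L_{G}\) and \(W_{0}\), can be illustrated numerically. The sketch below is a hedged example: the numerical value of \(L_{G}\) is the defining constant from IAU Resolution B1.9 (2000), which is not quoted in the text above, the relation \(W_{0} = L_{G}\,c^{2}\) is used to first order in \(c^{-2}\), and all names are hypothetical.

```python
# Speed of light in vacuum (exact SI value), in m/s.
C = 299_792_458.0

# Defining constant L_G from IAU Resolution B1.9 (2000); dimensionless.
# This value is not given in the surrounding text and is quoted here for illustration.
L_G = 6.969290134e-10

# Length of a Julian year in seconds (365.25 days of 86400 s).
SECONDS_PER_JULIAN_YEAR = 365.25 * 86_400.0


def accumulated_offset(rate: float, interval_s: float) -> float:
    """Time offset (s) accumulated over an interval by a constant fractional rate difference."""
    return rate * interval_s


def reference_potential(l_g: float = L_G) -> float:
    """Zero reference potential W0 (m^2/s^2) implied by L_G, to first order in c^-2."""
    return l_g * C**2


if __name__ == "__main__":
    offset_ms = accumulated_offset(L_G, SECONDS_PER_JULIAN_YEAR) * 1e3
    print(f"TT - TCG drift per year ~ {offset_ms:.2f} ms")
    print(f"W0 ~ {reference_potential():.1f} m^2/s^2")
```

The yearly drift evaluates to about 22 ms, matching the figure stated above. Because \(L_{G}\) is a defining constant, the implied \(W_{0}\) carries no uncertainty in this construction.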

From the above, it becomes clear that TT is a theoretical (conventional) timescale, which can have different realizations, such as TAI, UTC, and Global Positioning System (GPS) time, which differ mainly by time offsets. For civil timekeeping all over the Earth, the timescales TAI and UTC are of primary importance and probably represent the most important application of general relativity in worldwide metrology today. While some local realizations of UTC are provided in real time, TAI has never been disseminated directly, but is constructed as a weighted average of over 450 free-running clocks worldwide. The resulting timescale is then steered to a combination of the world’s best primary frequency standards after doing a relativistic transformation of the (local) proper time observations into TT; in other words, a relativistic redshift correction (with a relative value of about \(1 \times 10^{-13}\) per kilometre altitude) is applied so that TAI is indeed a realization of TT, associated with a virtual clock located on the (zero) reference gravity potential surface (\(W_{0})\). The computation of the relativistic redshift is regularly re-evaluated to account for progress in the knowledge of the gravity potential and in the standards, see, for example, Pavlis and Weiss (2003, 2017) for such work on frequency standards at NIST (National Institute of Standards and Technology), Boulder, Colorado, USA (United States of America), or Calonico et al. (2007) for similar activities for the Italian metrology institute INRIM (Istituto Nazionale di Ricerca Metrologica), Torino. More information on time metrology, the general relativity framework, and international timescales (especially TAI) is given in the review papers by Guinot (2011), Guinot and Arias (2005), Arias (2005), and Petit et al. (2014).

Beyond this, optical clock networks and corresponding link technologies are currently established or are under discussion, from which distinct advantages are expected for the dissemination of time, geodesy, astronomy and basic and applied research; for a review, see Riehle (2017). Also under investigation is the installation of optical master clocks in space, where spatial and temporal variations of the Earth’s gravitational field are smoothed out, such that the space clocks could serve as a reference for ground-based optical clock networks (e.g. Gill et al. 2008; Bongs et al. 2015; Schuldt et al. 2016). However, clock networks as well as their possible impact on science, including the establishment of a new global height reference system, are beyond the scope of this work.

The main objectives of this contribution are to review the scientific background of international timescales and geodetic methods to determine the relativistic redshift, to discuss the conditions and requirements to be met, and to provide some practical results for the state of the art, aiming at the geodetic community on the one hand and the physics, time, frequency, and metrology community on the other hand. Section 2 starts with an introduction to the relativistic background of reference systems and timescales, including proper time and coordinate time, the relativistic redshift effect, as well as TT and the realization of international timescales. Section 3 describes geodetic methods for determining the gravity potential, considering both the geometric levelling approach and the GNSS/geoid approach, together with corresponding uncertainty considerations. In Sect. 4, some practical redshift results based on geometric levelling and the GNSS/geoid approach are presented for three clock sites in Germany (Braunschweig, Hannover, Garching near Munich) and one site in France (Paris). Section 5 contains a discussion of the results obtained and some conclusions. Further details on some fundamentals of physical geodesy, regional gravity field modelling, the implementation of geometric levelling and GNSS observations in general and specifically for the clock sites in Germany and France, as well as a list of the abbreviations used are given in “Appendices 1–5”.

2 Relativistic background of reference systems and timescales

2.1 Spacetime reference systems

Measurement techniques in metrology, astronomy, and space geodesy have reached accuracies that require routine modelling within the general relativity rather than the Newtonian framework. Einstein’s GRT is founded on spacetime as a four-dimensional manifold, the equivalence principle, and the Einstein field equations, including the postulate of a finite invariant speed of light. The GRT is a metric theory with an associated metric tensor, which explains gravitation (occasionally also called a “fictitious force”) via the curvature of four-dimensional spacetime; spacetime curvature tells matter how to move, and matter tells spacetime how to curve (e.g. Misner et al. 1973). Based on the equivalence principle, no privileged reference systems exist within the framework of GRT, but local inertial systems may be constructed in any sufficiently small (infinitesimal) region of spacetime; in other words, all local, freely falling, non-rotating laboratories are fully equivalent for the performance of physical experiments. In contrast to this, Newtonian mechanics describes gravitation as a force caused by matter with an associated Newtonian potential and presumes the existence of universal absolute time and three-dimensional (Euclidean) space, i.e. globally preferred inertial (Galilean) coordinate systems with a direct physical meaning exist (where Newton’s laws of mechanics apply everywhere), and idealized clocks show absolute time everywhere in space. However, considering the most accurate measurements today (e.g. in time and frequency metrology), Newtonian theory is unable to describe fully the effects of gravitation, even in a weak gravitational field and when objects move with low velocities in the field.

Spacetime is a four-dimensional continuum, in which points, denoted as events, can be located by their coordinates; these can be in general four real numbers, but usually three spatial coordinates and one time coordinate are considered. A world line is the unique path that an object travels through spacetime, and the world lines of freely falling point particles or objects are geodesics in curved spacetime. Usually, spacetime coordinates are curvilinear and have no direct physical meaning, and according to the principle of covariance, different reference systems may be chosen to model observations and to describe the outcome of experiments. This freedom to choose the reference system can be used to simplify the models or to make the resulting parameters more physically adequate (Soffel et al. 2003). With regard to the terminology, it is fundamental to distinguish between a “reference system”, which is based on theoretical considerations or conventions, and its realization, the “reference frame”, to which users have access, e.g. in the form of position catalogues. Furthermore, it is important to distinguish between proper and coordinate quantities (Wolf 2001); proper quantities (e.g. proper time and length) are the direct result of local measurements, while coordinate quantities are dependent on conventional choices, e.g. a spacetime coordinate system or a convention for synchronization (Petit and Wolf 2005). In addition, due to the curvature of spacetime, the relation between proper and coordinate quantities is in general not constant and depends on the position in spacetime, in contrast to Newtonian theory.

In order to adequately describe modern observations in astronomy, geodesy, and metrology, several relativistic reference systems are needed. This was first recognized by the IAU with Resolution A4 (1991). The corresponding IAU Resolution B1 (2000) extended and clarified the relativistic framework, providing the relativistic definition of the BCRS (Barycentric Celestial Reference System) and the GCRS with origins at the solar system barycentre and the geocentre of the Earth, respectively, and including the choice of harmonic coordinates, the definition of corresponding metric tensors and timescales, and the relevant four-dimensional spacetime transformations. The BCRS with coordinates (t, x), where t is the Barycentric Coordinate Time (TCB), is useful for modelling the motion of bodies within the solar system, while the GCRS with coordinates (T, X), with T being TCG, is appropriate for modelling all processes in the near-Earth environment, including the Earth’s gravity field and the motion of Earth’s satellites. The BCRS can be considered to a good approximation as a global quasi-inertial system, while the GCRS can be regarded as a local system (Müller et al. 2008). While the BCRS and GCRS provide the general conceptual (relativistic) framework for a barycentric and a geocentric reference system, the absolute (spatial) orientation of either system was first left open and then fixed by employing the International Celestial Reference System (ICRS), as recommended in the IAU Resolutions B2 (1997) and B2 (2006). The ICRS is maintained by the International Earth Rotation and Reference Systems Service (IERS) and realized by the International Celestial Reference Frame (ICRF) as a catalogue of adopted positions of extragalactic radio sources observed by Very Long Baseline Interferometry (VLBI).

The GCRS also provides the foundation for the International Terrestrial Reference System (ITRS), and both systems differ by just a (time-dependent) spatial rotation (or a series of rotations); accordingly the ITRS coordinate time coincides with TCG (e.g. Soffel et al. 2003; Kaplan 2005). The spatial rotation involves the Earth orientation parameters (EOP), and according to the IAU 2000 and 2006 resolutions, the transformation between both systems is based on an “intermediate system” with the Celestial Intermediate Pole (CIP, pole of the nominal rotation axis), as well as precession, nutation, frame bias, Earth Rotation Angle (ERA), and polar motion parameters; for details, see the IERS Conventions 2010 (Petit and Luzum 2010). The ITRS is an “Earth-fixed” system, co-rotating with the Earth in its diurnal motion in space, in which points at the solid Earth’s surface undergo only small variations with time (e.g. due to geophysical effects related to tectonics); it is therefore a convenient choice for all disciplines requiring terrestrial positions, such as geodesy, navigation, and geographic information services. The ITRS origin is at the centre of mass of the whole Earth including its oceans and atmosphere (geocentre), the scale unit of length is the metre (SI), the scale is consistent with TCG, the orientation is equatorial and initially given by the Bureau International de l’Heure (BIH) terrestrial system at epoch 1984.0, and the time evolution of the orientation is ensured by using a no-net-rotation condition with regard to the horizontal tectonic motions over the whole Earth. The Z-axis is directed towards the IERS reference pole (i.e. the mean terrestrial North Pole), and the axes X and Y span the equatorial plane, with the X-axis being defined by the IERS reference meridian (Greenwich), such that the coordinate triplet XYZ forms a right-handed Cartesian system. The International Union of Geodesy and Geophysics (IUGG) formally adopted the ITRS at its General Assembly in 2007, with the IERS being the responsible body.

The ITRS is realized by the International Terrestrial Reference Frame (ITRF), which consists of the three-dimensional positions and velocities of stations observed by space geodetic techniques, where the positions are regularized in the sense that high-frequency time variations are removed by conventional corrections. These corrections are mainly geophysical ones, such as solid Earth tides and ocean tidal loading (for full details, see the IERS Conventions 2010, Petit and Luzum 2010); the purpose of these corrections is to obtain positions with more regular time variation, which better conform to the linear time-variable coordinate modelling approach used and thus improve the transformation to a certain reference epoch (to obtain a quasi-static state). The most recent realization of the ITRS is the ITRF2014 (Altamimi et al. 2016); however, all results in the present paper are based on the previous realization ITRF2008 with the reference epoch 2005.0. The uncertainty (standard deviation) of the geocentric Cartesian coordinates (XYZ) is at the level of 1 cm or better; for further details, see Petit and Luzum (2010).

Besides the ITRS and its frames (ITRF), various other national, regional, and global systems are in use. Of some relevance are the World Geodetic System 1984 (WGS84) for GPS users, and the European Terrestrial Reference System (ETRS) for European users. While WGS84 is intended to be as closely coincident as possible with the ITRS (the latest realization, i.e. “Reference Frame G1762”, agrees with the ITRF at the level of 1 cm; NGA 2014; NGA—National Geospatial-Intelligence Agency, USA), the ETRS89 reference system is attached to the stable part of the Eurasian plate in order to compensate for the movement of the Eurasian tectonic plate (of roughly 2.5 cm per year). The latter reference system and corresponding frames (ETRF, European Terrestrial Reference Frame) are implemented in most European countries, as this results in much smaller station velocities. However, for global work in connection with international timescales, the ITRS and corresponding reference frames should be employed.

2.2 Proper time and coordinate time

For time metrology, the most fundamental quantities are the proper time, as observed by a local ideal clock, and the coordinate time of a conventional spacetime reference system. The relation between both quantities can be derived in general from the relativistic line element ds and the coordinate-dependent metric tensor \(g_{\alpha \beta }\), whose components are in general not globally constant. Hence, the measured proper quantities between two events (e.g. proper time observed by an ideal clock) depend in principle on the path followed by a particle (e.g. a clock) between these two events. Considering spacetime coordinates \(x^{\gamma } = (x^{0}, x^{1}, x^{2}, x^{3})\) with \(x^{0} = ct\), where c is the speed of light in vacuum, and t is the coordinate time, the line element along a time-like world line is given by

$$\begin{aligned} \mathrm{d}s^{2}=g_{\alpha \beta } (x^{\gamma })\mathrm{d}x^{\alpha }\mathrm{d}x^{\beta }=-c^{2}\mathrm{d}\tau ^{2}, \end{aligned}$$

where \(\tau \) is the proper time along that world line. In this context, Einstein’s summation convention over repeated indices is employed, with Greek indices ranging from 0 to 3 and Latin indices taking values from 1 to 3. The relation between proper and coordinate time is obtained by rearranging the above equation, resulting in

$$\begin{aligned} \left( {\frac{\mathrm{d}\tau }{\mathrm{d}t}} \right) ^{2}= & {} -g_{00} -2g_{0i} \frac{1}{c}\frac{\mathrm{d}x^{i}}{\mathrm{d}t}-g_{ij} \frac{1}{c^{2}}\frac{\mathrm{d}x^{i}}{\mathrm{d}t}\frac{\mathrm{d}x^{j}}{\mathrm{d}t}\nonumber \\= & {} -g_{00} -2g_{0i} \frac{v^{i}}{c}-g_{ij} \frac{v^{i}v^{j}}{c^{2}}, \end{aligned}$$

where \(v^{i}(t)\) is the coordinate velocity along the path \(x^{i}(t)\).

Inserting the GCRS metric, as recommended by IAU Resolution B1 (2000), and using a binomial series expansion lead to

$$\begin{aligned} \frac{\mathrm{d}\tau }{\mathrm{d}T_{(TCG)} }= & {} 1-\frac{1}{c^{2}}\left( {V+\frac{v^{2}}{2}} \right) +O(c^{-4})\nonumber \\= & {} 1-\frac{1}{c^{2}}\left( {V_\mathrm{E} +V_\mathrm{ext} +\frac{v^{2}}{2}} \right) + O(c^{-4}) . \end{aligned}$$

In this equation, V is the usual gravitational scalar potential (denoted as “W” in the IAU 2000 resolutions, a generalized Newtonian potential), which is split up into two parts arising from the gravitational action of the Earth itself (\(V_\mathrm{E}\)) and external parts due to tidal and inertial effects (\(V_\mathrm{ext} = V_\mathrm{tidal}+V_\mathrm{iner}\)), and v is the coordinate velocity of the observer in the GCRS, while terms of the order \(c^{-4}\) are omitted. In this context, it should be noted that all potentials are defined here with a positive sign, which is consistent with geodetic practice, but in contrast to most physics literature, where usually the opposite sign (conceptually closer to potential energy) is employed (Jekeli 2009). The above equation corresponds to the first post-Newtonian approximation and is accurate to a few parts in \(10^{19}\) for locations from the Earth’s surface up to geostationary orbits, which is fully sufficient, as in practice the limiting factor is the uncertainty with which \(V_\mathrm{E}\) can be determined in the vicinity of the Earth. Consequently, contributions from \(V^{2}/c^{4}\) and the so-called gravitomagnetic vector potential (with the notation chosen in formal analogy to classical magnetism theory) have been neglected in the above equation as they do not exceed a few parts in \(10^{19}\), while the inertial terms in \(V_\mathrm{ext}\) remain below 2 parts in \(10^{20}\); for further details, see Soffel et al. (2003).

Regarding the (local) gravitational potential of the Earth (\(V_\mathrm{E})\) within an Earth-fixed system, for many applications it is advantageous to utilize series expansions (with basically constant coefficients), which usually converge for points outside the Earth’s surface. Within the relativistic context, multipole expansions, which have great similarities with corresponding Newtonian series, are very useful. These presently have more than sufficient accuracy and lead in the end to the well-known spherical harmonic expansion; however, in contrast to the classical theory, all relevant parameters have to be interpreted within a relativistic scope. A summary of this approach, including a detailed discussion on the post-Newtonian interpretation and neglected terms, can be found in Soffel et al. (2003).

Equation (3) shows that the proper time interval \(\mathrm{d}\tau \), separating two events, is less than the corresponding coordinate time interval dT, by an amount that depends on the gravitational potential V (zero at infinity, increasing towards the attracting masses) and the velocity v relative to the chosen reference system. Consequently, when compared to coordinate time, clocks run slower (tick slower, show less time) when they move or are affected by gravitation; this slowing of time is called time dilation and may be separated into a (relative) velocity and a gravitational time dilation, sometimes also called special and general relativistic time dilation, respectively, where the special relativistic time dilation is measurable by the transverse (or second-order) Doppler shift. On the other hand, for a stationary clock at infinity (i.e. \(v=0\) and \(V = 0\), a “distant” observer at rest), the proper time interval approaches the coordinate time interval \((\mathrm{d}\tau _{\infty }= {\mathrm{d}T})\), and hence offers a way, in principle at least, to directly observe coordinate time.
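To give a feeling for the magnitudes in Eq. (3), the following Python sketch evaluates the order \(c^{-2}\) rate deficit \(1-\mathrm{d}\tau /\mathrm{d}T\) under strongly simplifying assumptions: the Earth’s potential is reduced to a point-mass term \(GM/r\), \(V_\mathrm{ext}\) is ignored, and the clock co-rotates with the Earth on the equator. The constants are nominal values assumed for illustration, and the function names are hypothetical, so only the leading digits of the result are meaningful.

```python
# Speed of light in vacuum (exact SI value), in m/s.
C = 299_792_458.0

# Nominal Earth parameters, assumed here for illustration only.
GM_EARTH = 3.986004418e14   # geocentric gravitational constant, m^3/s^2
OMEGA_EARTH = 7.292115e-5   # angular velocity of Earth rotation, rad/s
R_EQUATOR = 6_378_137.0     # equatorial radius, m


def co_rotating_speed(distance_from_axis_m: float) -> float:
    """Speed (m/s) in the non-rotating system of a clock at rest on the rotating Earth."""
    return OMEGA_EARTH * distance_from_axis_m


def rate_deficit(radius_m: float, speed_m_s: float) -> float:
    """Approximate 1 - dtau/dT to order c^-2, using V = GM/r (point mass, V_ext ignored)."""
    potential = GM_EARTH / radius_m
    return (potential + 0.5 * speed_m_s**2) / C**2


if __name__ == "__main__":
    equator = rate_deficit(R_EQUATOR, co_rotating_speed(R_EQUATOR))
    distant = rate_deficit(1e20, 0.0)  # stationary clock very far from the Earth
    print(f"equator clock: 1 - dtau/dT ~ {equator:.3e}")
    print(f"distant clock: 1 - dtau/dT ~ {distant:.3e}")
```

With these assumptions the equatorial clock runs slow relative to TCG by roughly 7 parts in \(10^{10}\), while the deficit for the distant stationary clock is negligible, in line with the limiting case \(\mathrm{d}\tau _{\infty } = \mathrm{d}T\).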

2.3 The gravitational redshift effect

The gravitational time dilation is closely related to the gravitational redshift effect, which considers an electromagnetic wave (light) travelling from an emitter (em—located at A) to a receiver (rec—located at B) in conjunction with two ideal (zero uncertainty) frequency standards (clocks) at A and B, measuring proper time. Here the ideal clocks should show the same time under the same conditions, e.g. they should be atomic or nuclear clocks based on the emission of an electromagnetic wave at a certain (natural) frequency. In textbooks (see, for example, Cheng 2005; Lambourne 2010; Misner et al. 1973; Moritz and Hofmann-Wellenhof 1993; Schutz 2003, 2009; Will 1993) the redshift effect is usually explained by assuming a stationary gravitational field (with time-independent metric) as well as a static emitter and receiver (\(v=0\)), and for better illustration, it is mostly supposed (although not required) that both the emitter and receiver are positioned on the same vertical, one above the other, noting that the famous Pound and Rebka experiment was also carried out in this way (Pound and Rebka 1959). Under these circumstances (stationary gravitational field, static scenario), the trajectories (world lines) of successive wave crests of the emitted signal are identical, and hence, expressed in coordinate time, the interval between emission and reception (observation) of successive wave crests is the same \((\mathrm{d}t_\mathrm{em} = {\mathrm{d}t}_\mathrm{rec})\). However, the ideal clocks at the points of emission (A) and reception (B) of the wave crests are measuring proper time, i.e. based on Eqs. (2) and (3), the lower clock runs slower than the corresponding upper clock; in other words, an observer at the upper station will find that the lower clock is running slow with respect to his own clock. On the other hand, since the (proper) frequency is inversely proportional to the proper time interval with \(f=1/\mathrm{d}\tau \), Eq. (2) can be used (together with \(\mathrm{d}t_\mathrm{em}=\mathrm{d}t_\mathrm{rec}\)) to derive

$$\begin{aligned} \frac{{\mathrm{d}\tau _\mathrm{rec} }\big /{\mathrm{d}t_\mathrm{rec} }}{{\mathrm{d}\tau _\mathrm{em} }\big /{\mathrm{d}t_\mathrm{em} }}=\frac{\mathrm{d}\tau _\mathrm{rec} }{\mathrm{d}\tau _\mathrm{em} }=\frac{f_\mathrm{em} }{f_\mathrm{rec} }=\frac{\sqrt{-(g_{00} )_\mathrm{rec} }}{\sqrt{-(g_{00} )_\mathrm{em} }}, \end{aligned}$$

where \(f_\mathrm{em}\) and \(f_\mathrm{rec}\) are the proper frequencies of the light as observed at points A (em) and B (rec) by the corresponding ideal clocks, respectively. Rearranging the above relationship and considering Eq. (3) give

$$\begin{aligned} \frac{\Delta f}{f_\mathrm{rec} }= & {} \frac{f_\mathrm{rec} -f_\mathrm{em} }{f_\mathrm{rec} }=1-\frac{f_\mathrm{em} }{f_\mathrm{rec} }=1-\frac{\left( {1-{V_\mathrm{rec} }\big /{c^{2}}} \right) }{\left( {1-{V_\mathrm{em} }\big /{c^{2}}} \right) }\nonumber \\= & {} \frac{V_\mathrm{rec} -V_\mathrm{em} }{c^{2}}+O\left( {c^{-4}} \right) \approx \frac{-bH}{c^{2}}, \end{aligned}$$

where gravitational potential terms of the order \(V^{2}/c^{4}\) have been neglected, while b is the gravitational acceleration, and H is the vertical distance between points A (em) and B (rec) counted positive upward. Here it is worth mentioning again that the above equation holds for two arbitrary points with corresponding gravitational potentials, while only the rightmost part of the equation depends (to some extent) on the assumption that both points lie on the same vertical. Hence, in the following discussion, the meaning of “above” and “below” relates to two arbitrary points on corresponding equipotential surfaces above or below each other. For the case where point B (rec) is located above point A (em), the potential difference \(V_\mathrm{rec}-V_\mathrm{em}\) is negative, and hence, \(\Delta f=f_\mathrm{rec}-f_\mathrm{em}\) is negative, i.e. the received (observed) frequency \(f_\mathrm{rec}\) is lower than the corresponding emitted frequency \(f_\mathrm{em}\), and thus, blue light becomes more red, explaining the term “redshift effect”. On the other hand, if point B (rec) is below point A (em), i.e. the light signal is sent from the top to the bottom station, \(V_\mathrm{rec}-V_\mathrm{em}\) is positive, and hence, \(\Delta f=f_\mathrm{rec}-f_\mathrm{em}\) is also positive, leading to an increase in frequency and thus a “blueshift”. Finally, instead of using frequencies, the redshift effect can also be formulated in terms of corresponding wavelengths, and Eq. (5) can be extended for moving observers by introducing velocities v.
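The sign behaviour of Eq. (5) can be illustrated with a short Python sketch. It evaluates the linearized expression \((V_\mathrm{rec}-V_\mathrm{em})/c^{2}\) and its approximation \(-bH/c^{2}\); the height of 10 m and the acceleration of \(9.81\,\hbox {m}\,\hbox {s}^{-2}\) are hypothetical illustration values, and the function names are assumptions introduced here.

```python
# Speed of light in vacuum (exact SI value), in m/s.
C = 299_792_458.0


def fractional_shift(v_rec: float, v_em: float) -> float:
    """Linearized Eq. (5): (V_rec - V_em) / c^2, with potentials taken positive."""
    return (v_rec - v_em) / C**2


def fractional_shift_vertical(height_m: float, b: float = 9.81) -> float:
    """Approximation -b*H/c^2 for a receiver a height H above (H > 0) the emitter."""
    return -b * height_m / C**2


if __name__ == "__main__":
    up = fractional_shift_vertical(10.0)     # receiver 10 m above the emitter
    down = fractional_shift_vertical(-10.0)  # receiver 10 m below the emitter
    print(f"receiver above: {up:.3e} (redshift)")
    print(f"receiver below: {down:.3e} (blueshift)")
```

The linearized form is used on purpose: evaluating \(1-(1-V_\mathrm{rec}/c^{2})/(1-V_\mathrm{em}/c^{2})\) directly in double-precision arithmetic would suffer from cancellation, because the result is of the order of \(10^{-15}\) while the operands are close to 1.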

It is also noteworthy that the redshift and the time dilation effect can be derived solely on the basis of the (weak) equivalence principle (physics in a frame freely falling in a gravitational field is equivalent to physics in an inertial frame without gravitation) and the energy conservation law; in other words, a photon climbing in the (Earth’s) gravitational field will lose energy and will consequently be redshifted. On the other hand, any theory of gravitation can predict the redshift effect if it respects the equivalence principle, i.e. gravitational redshift experiments can be considered as tests of the equivalence principle.

2.4 Time and the gravity potential

The fundamental relationship between proper time and coordinate time according to Eq. (3) refers to the (non-rotating) GCRS and therefore all quantities depend on the coordinate time T (TCG) due to Earth rotation. However, for many practical applications it is more convenient to work with an Earth-fixed system (e.g. ITRS, co-rotating with the Earth), which can be considered as static in the first instance. Then, for an observer (clock) at rest in an Earth-fixed system, the velocity v in Eq. (3) is simply given by \(v=\omega \,p\), where \(\omega \) is the angular velocity about the Earth’s rotation axis, and p is the distance from the rotation axis. Taking all this into account, Eq. (3) may be rearranged and expressed in the Earth-fixed system (for an observer at rest) as

$$\begin{aligned} \frac{\mathrm{d}\tau }{\mathrm{d}T_{(TCG)} }=1-\frac{1}{c^{2}}W(t) + O(c^{-4}), \end{aligned}$$

where W(t) is the slightly time-dependent (Newtonian) gravity potential related to the Earth-fixed system, as employed in classical geodesy. W(t) may be decomposed into

$$\begin{aligned} W(t)=W^\mathrm{static}(t_0 )+W^\mathrm{temp}(t-t_0 ), \end{aligned}$$

where \(W^\mathrm{static}(t_{0})\) is the dominant static (spatially variable) part of the gravity potential at a certain reference epoch \(t_{0}\), while \(W^\mathrm{temp}(t)\) incorporates all temporal components of the gravity potential (including tidal effects) and indeed contains the temporal variations of all three terms (\(V_\mathrm{E}, V_\mathrm{ext}, v^{2}/2\)) in Eq. (3); for a discussion of the relevant terms and magnitudes, see, for example, Wolf and Petit (1995), Petit et al. (2014) and Voigt et al. (2016). The accuracy of Eq. (6) is similar to that of Eq. (3), where terms below a few parts in \(10^{19}\) have been neglected. If the observer is not at rest within the Earth-fixed system, then two more terms enter in Eq. (6), see, for example, Nelson (2011).

The focus here is on the static part of the gravity potential, denoted simply as W in the following, which is defined as the sum of the gravitational potential \(V_\mathrm{E}\) and the centrifugal potential \(Z_\mathrm{E}\) in the form (see also “Appendix 1”)

$$\begin{aligned} W=V_\mathrm{E} +Z_\mathrm{E} ,\quad \quad Z_\mathrm{E} =\frac{\omega ^{2}}{2}p^{2}, \end{aligned}$$

showing that the term \(v^{2}/2\) in Eq. (3) is exactly the centrifugal potential \(Z_\mathrm{E}\) as defined above. In general, the main advantage of the Earth-fixed reference system is that all gravity field quantities as well as station coordinates can be regarded as time-independent in the first instance, assuming that (small) temporal variations can be taken into account by appropriate reductions or have been averaged out over sufficiently long time periods.
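As a rough numerical illustration of Eq. (8), the following Python sketch evaluates \(v=\omega \,p\) and \(Z_\mathrm{E}=\omega ^{2}p^{2}/2\) for a clock at rest on the equator. The numerical values of \(\omega \) and p (a nominal Earth rotation rate and the GRS80 semimajor axis) are assumptions chosen for illustration and are not taken from the text above; the function names are hypothetical.

```python
# Illustrative sketch: centrifugal potential Z_E = (omega^2 / 2) * p^2
# for a clock at rest in the Earth-fixed system.

C_LIGHT = 299_792_458.0   # speed of light in m/s (fixed value)
OMEGA = 7.292115e-5       # assumed nominal Earth rotation rate in rad/s
P_EQUATOR = 6_378_137.0   # assumed distance from the rotation axis in m (equator)


def rotation_velocity(omega: float, p: float) -> float:
    """Velocity v = omega * p of a point at rest in the Earth-fixed system."""
    return omega * p


def centrifugal_potential(omega: float, p: float) -> float:
    """Centrifugal potential Z_E = omega^2 * p^2 / 2 in m^2/s^2."""
    return 0.5 * omega**2 * p**2


v = rotation_velocity(OMEGA, P_EQUATOR)
z_e = centrifugal_potential(OMEGA, P_EQUATOR)

print(f"v            = {v:.2f} m/s")
print(f"Z_E          = {z_e:.1f} m^2/s^2")
print(f"v^2/2 - Z_E  = {0.5 * v**2 - z_e:.3e} m^2/s^2")
print(f"Z_E / c^2    = {z_e / C_LIGHT**2:.3e}")
```

Under these assumed values, the centrifugal term contributes at the level of about one part in \(10^{12}\) to \(W/c^{2}\) at the equator and vanishes at the poles, where p = 0.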

The largest component in \(W^\mathrm{temp}\) is due to solid Earth tide effects, which lead to predominantly vertical movements of the Earth’s surface with a global maximum amplitude of about 0.4 m (equivalent to about \(4\,\hbox {m}^{2}\,\hbox {s}^{-2}\) in potential). The next largest contribution is the ocean tide effect with a magnitude of roughly 10–15% of the solid Earth tides, but with significantly increased values towards the coast. All other time-variable effects are a further order of magnitude smaller and originate from atmospheric mass movements (on a global scale, ranging from hourly to seasonal variations), hydro-geophysical mass changes (on regional and continental scales, seasonal variations), and polar motion (pole tides). For details regarding the computation, magnitudes, and main time periods of all relevant time-variable components, see Voigt et al. (2016). Based on this, the temporal and static components can be added according to Eq. (7) to obtain the actual gravity potential value W(t) at time t, as needed, for example, for the evaluation of clock comparison experiments.

In this context, geodynamic effects also lead to additional velocity contributions, and the effect of these on time and frequency comparisons has to be considered (see Gersl et al. 2015). For instance, Earth tides (as by far the largest time-variable component) lead to periodic variations of the Earth’s surface with amplitudes up to about 0.4 m and time periods of roughly 12 h, associated with maximum kinematic velocities of about \(5\times 10^{-5}\,\hbox {m}\,\hbox {s}^{-1}\); hence, they contribute well below \(10^{-25}\) to the (rightmost) second-order Doppler shift term in Eq. (3), which depends on the square of the velocity, and therefore can safely be neglected in the foreseeable future. Corresponding contributions to the (classical) first-order Doppler effect may be significant, as mentioned in Mai (2013), but fortunately first-order Doppler shifts do not play a role in optical clock comparisons through (two-way) fibre links, which presently is the only technique that has achieved link performances at the level of 1 part in \(10^{18}\) and below (Droste et al. 2013).
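The order-of-magnitude argument for the tidal velocity contribution can be checked with a few lines of Python. The sketch below assumes a purely sinusoidal vertical motion (an idealization not stated in the text), so that the maximum velocity is \(2\pi A/T_\mathrm{p}\) for amplitude A and period \(T_\mathrm{p}\); the corresponding second-order Doppler term is \(v^{2}/(2c^{2})\).

```python
import math

# Illustrative sketch: second-order Doppler contribution of the tidal
# surface motion, assuming a sinusoidal displacement A * sin(2*pi*t/T).

C_LIGHT = 299_792_458.0      # speed of light in m/s
AMPLITUDE = 0.4              # tidal displacement amplitude in m (from the text)
PERIOD = 12.0 * 3600.0       # tidal period in s (roughly 12 h, from the text)


def max_velocity(amplitude: float, period: float) -> float:
    """Peak velocity of a sinusoidal motion with given amplitude and period."""
    return 2.0 * math.pi * amplitude / period


def second_order_doppler(velocity: float) -> float:
    """Fractional second-order Doppler term v^2 / (2 c^2)."""
    return velocity**2 / (2.0 * C_LIGHT**2)


v_max = max_velocity(AMPLITUDE, PERIOD)
shift = second_order_doppler(v_max)

print(f"maximum tidal velocity     : {v_max:.2e} m/s")
print(f"second-order Doppler term  : {shift:.2e}")
```

With these inputs the velocity is of the order of \(5\times 10^{-5}\,\hbox {m}\,\hbox {s}^{-1}\) and the resulting fractional term stays below \(10^{-25}\), in line with the statement above.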

2.5 Terrestrial Time and the realization of international timescales

The development of atomic timekeeping is described in the SI Brochure Appendix 2 (Practical realization of the definition of the unit of time; http://www.bipm.org/en/publications/mises-en-pratique/; BIPM, Bureau International des Poids et Mesures) as well as in the review papers from Guinot (2011), Arias (2005), and Guinot and Arias (2005). The unit of time is the SI second, which is based on the value of the caesium ground-state hyperfine transition frequency, as adopted by the 13th Conférence Générale des Poids et Mesures (CGPM) in 1967 (Terrien 1968). This definition should be understood as the definition of the unit of proper time (within a sufficiently small laboratory). The SI second is realized by different caesium standards in national metrology institutes (NMIs), and an expedient combination of all the results (including results from a number of hydrogen masers) leads to an ensemble timescale, denoted as TAI. The 14th CGPM approved TAI in 1971 (Terrien 1972) and endorsed the definition proposed by the Comité Consultatif pour la Définition de la Seconde (CCDS) in 1970, which states that TAI is established from “atomic clocks operating ... in accordance with the definition of the second”. In the framework of general relativity, this definition was completed by the CCDS in 1980, stating that “TAI is a coordinate time scale defined in a geocentric reference frame with the SI second as realized on the rotating geoid as the scale unit” (Giacomo 1981). This definition was amplified in the context of IAU Resolution A4 (1991), stating that TAI is considered as a realized timescale whose ideal form is TT. While only a best estimate for the constant rate between TT and TCG was given in the IAU 1991 resolution, this rate was fixed in the year 2000 by a further IAU Resolution B1.9 (2000), giving

$$\begin{aligned} \frac{\mathrm{d}T_{(TT)} }{\mathrm{d}T_{(TCG)} }=1-L_G ,\quad L_G =6.969290134\times 10^{-10}, \end{aligned}$$

where \(L_{G}\) is now a defining constant with a fixed value and no uncertainty. The main idea behind the IAU 2000 definition of TT is to produce a timescale for the construction and dissemination of time for all practical purposes at or near the Earth’s surface, which is consistent with the previous definitions, but avoids explicit mention of the geoid due to the intricacy and problems associated with the definition, realization, and changes of the geoid. Of course, the numerical value of \(L_{G}\) was chosen to maintain continuity with the previous definition, and as \(L_{G}=W_{0}/c^{2}\) according to Eq. (6), this implicitly defines a zero gravity potential value of

$$\begin{aligned} W_0 =62{,}636{,}856.00\hbox { m}^{{2}}\,\hbox {s}^{{-2}}. \end{aligned}$$

As the speed of light c is also fixed \((299{,}792{,}458\,\hbox {m}\,\hbox {s}^{-1})\) and has no uncertainty, the parameters \(L_{G}\) and \(W_{0}\) can be considered as equivalent, both having zero uncertainty. In this context, Pavlis and Weiss (2003) somewhat misleadingly mention that the zero potential (\(W_{0}\)) uncertainty “implies a limitation on the realization of the second” and therefore “the second can be realized to no better than \(\pm 1 \times 10^{-17}\)”.
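The equivalence of \(L_{G}\) and \(W_{0}\) can be verified numerically. The following Python sketch simply evaluates \(W_{0}=L_{G}\,c^{2}\) with the two fixed values quoted above; the variable names are chosen here for illustration only.

```python
# Illustrative sketch: zero gravity potential implied by the defining
# constant L_G of IAU Resolution B1.9 (2000), W_0 = L_G * c^2.

C_LIGHT = 299_792_458.0    # speed of light in m/s (fixed value)
L_G = 6.969290134e-10      # defining constant d(TT)/d(TCG) = 1 - L_G

w_0 = L_G * C_LIGHT**2     # implied zero gravity potential in m^2/s^2

print(f"W_0 = {w_0:.2f} m^2/s^2")
```

The result agrees with the value in Eq. (10) to well within \(0.01\,\hbox {m}^{2}\,\hbox {s}^{-2}\), which is the resolution implied by the ten significant digits of \(L_{G}\).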

The above zero gravity potential value \(W_{0}\) was the best estimate at that time, and it is also listed in the IERS conventions 2010 (Petit and Luzum 2010) as the value for the “potential of the geoid”. Although the latest definition of TT gets along without the geoid and a numerical value for \(W_{0}\), the realization of TT is linked to signal frequencies that an ideal clock would generate on the zero level reference surface, the latter being implicitly defined by the constant \(L_{G}\). This leads to the relativistic redshift correction, analogous to Eq. (5), giving for a clock at rest on the Earth’s surface

$$\begin{aligned} \frac{\Delta f}{f_P }=\frac{f_P -f_0 }{f_P }=1-\frac{\mathrm{d}\tau _P }{\mathrm{d}\tau _0 }=\frac{W_P -W_0 }{c^{2}}+O(c^{-4})\approx \frac{-gH}{c^{2}}, \end{aligned}$$

where \(f_{P}\) and \(f_{0}\) are the proper frequencies of an electromagnetic wave as measured at points P at the Earth’s surface and \(P_{0}\) on the zero level surface, respectively (cf. Sect. 2.3), while \(W_{P}\) and \(W_{0}\) are the corresponding gravity potential values, H is the vertical distance of point P relative to point \(P_{0}\), measured positive upwards, and g is the gravity acceleration. An exact relation for the potential difference is given by \(W_{0}-W_{P}=C_{P}=\overline{{g}}H=\overline{{\gamma }}H^{N}\), where \(C_{P}\) is the geopotential number, and H and \(H^{N}\) are the orthometric and normal heights with corresponding mean gravity and normal gravity values \(\overline{{g}}\) and \(\overline{{\gamma }}\), respectively (for further details, see Sect. 3.1 and Torge and Müller 2012).

The above equation assumes that the two clocks at P and \(P_{0}\) are earthbound and at rest within the (rotating) Earth-fixed system, i.e. both points are affected by the Earth’s gravity (gravitational plus centrifugal) field and relative velocities between them are non-existent. Therefore, the term “relativistic redshift effect” is preferred over “gravitational redshift effect”, as not just the gravitational, but also the centrifugal potential is involved; similar conclusions are drawn by Delva and Lodewyck (2013), Pavlis and Weiss (2003) and Petit and Wolf (1997). Furthermore, the sign considerations related to Eq. (5) apply equally to Eq. (11), as the gravity potential W is dominated by the gravitational part V. Assuming again that P is located above \(P_{0}\), \(W_{P}- W_{0}\) is negative and so is \(\Delta f\), i.e. the received (observed) frequency \(f_{P}\) is lower than the emitted frequency \(f_{0}\), which explains the term “redshift effect” (cf. Sect. 2.3). Regarding international timescales, realizing TT, the frequency or rate of an individual clock at P above the zero level surface must be reduced by an amount \(\Delta f\), given by Eq. (11), in order to produce the desired signal corresponding to a hypothetical clock at the zero level surface. In other words, the clock frequency at P has to be adjusted such that it will no longer observe a redshift of the clock signal emitted at \(P_{0}\). As all existing primary frequency standards are situated above the zero level surface, in practice, the relativistic redshift correction is always negative and can become quite significant; for example, Pavlis and Weiss (2003, 2017) estimated a correction \(\Delta f/f\) of about \(-1800 \times 10^{-16}\) for the clocks at NIST in Boulder, Colorado, USA, with an altitude of about 1650 m. Furthermore, for the realization of TT based on Eq. (11), in principle only the absolute gravity potential (W) at the clock and its uncertainty matters, as the parameters \(L_{G}\) and \(W_{0}\) are equivalent and both have zero uncertainty.
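The size of the correction quoted for Boulder can be reproduced approximately with the rightmost part of Eq. (11). The Python sketch below is only an order-of-magnitude check: the gravity value of \(9.80\,\hbox {m}\,\hbox {s}^{-2}\) is an assumed nominal value, and the altitude of 1650 m is the rounded figure from the text. A second (hypothetical) helper uses the exact relation \(W_{P}-W_{0}=-C_{P}\) for cases where a geopotential number is available.

```python
# Illustrative sketch: relativistic redshift correction of Eq. (11).

C_LIGHT = 299_792_458.0   # speed of light in m/s


def redshift_from_height(g: float, height: float) -> float:
    """Approximate fractional correction -g*H/c^2 (H positive upwards)."""
    return -g * height / C_LIGHT**2


def redshift_from_geopotential_number(c_p: float) -> float:
    """Fractional correction (W_P - W_0)/c^2 = -C_P/c^2."""
    return -c_p / C_LIGHT**2


G_ASSUMED = 9.80      # assumed nominal gravity acceleration in m/s^2
H_BOULDER = 1650.0    # approximate altitude of NIST Boulder in m (from the text)

shift = redshift_from_height(G_ASSUMED, H_BOULDER)
print(f"redshift correction: {shift:.3e}  ({shift / 1e-16:.0f} x 10^-16)")
```

The result is close to \(-1800 \times 10^{-16}\); a rigorous evaluation would replace the product gH by the geopotential number of the clock site.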

As mentioned above, TAI is a realization of TT, a coordinate time scale in the geocentric system. As UTC is directly derived from TAI, it is by definition also a realization of TT. Similarly, all other timescales, which aim at keeping in synchrony with UTC, using coordinate synchronization, can also be considered as realizations of TT. Such timescales include the UTC(k), i.e. local realizations of UTC at the laboratory k, and the reference timescales for GNSS. By construction, these become realizations of TT without explicitly using Eq. (11) to correct the frequency of participating clocks. At this point, it should be noted that an ambiguity remains between TT, now defined by IAU Resolution B1.9 (2000) without mention of the geoid, and TAI, whose definition is still related to “the geopotential on the geoid”. This comes from the use of similar terms for the definition of TAI and TT before the redefinition of TT in 2000, which meant that both TT and TAI suffered from the uncertainty in the determination of the geoid and the corresponding geopotential value. For this reason, several authors (e.g. Wolf and Petit 1995) suggested that \(L_{G}\) should be turned into a defining constant, a suggestion eventually adopted in the IAU (2000) redefinition of TT. However, this change was not passed into a new definition of TAI. This problem is currently being addressed by the Consultative Committee for Time and Frequency (CCTF) and is expected to be solved with a new definition for TAI.

With respect to the above value for \(W_{0}\) (consistent with the international recommendations for the definition of TT), a further complication has emerged from the fact that the International Association of Geodesy (IAG) has introduced another numerical value for \(W_{0}\); for further details see “Appendix 1”. Therefore, since it seems most likely that the geodetic, time, and IAU communities will not be willing to change their definitions in the near future, the problem of different (conflicting) \(W_{0}\) values must be solved by transformations and comprehensive documentation of the relevant steps.

2.6 Orders of magnitude of relativistic terms and chronometric geodesy

Equation (11) is the classic formula relating frequency differences and gravity potential differences, where a fractional frequency shift of 1 part in \(10^{18}\) corresponds to about \(0.1\,\mathrm{m}^{2}\,\hbox {s}^{-2}\) in terms of the gravity potential difference, which is equivalent to about 0.01 m in height. In addition, Eq. (11) can be used to estimate the magnitude of the relativistic redshift correction (e.g. about \(-1800 \times 10^{-16}\) for NIST in Boulder, see above), and it makes clear that the absolute gravity potential (W) is required for contributions to international timescales, while potential differences suffice for local and remote clock comparisons, with the proviso that the actual potential values and the clock frequency measurements must refer to the same epochs. The latter requirement means that the magnitude of time-variable effects in the gravity potential due to solid Earth and ocean tides as well as other effects must be taken into account for all clock measurements at a performance level below roughly 5 parts in \(10^{17}\). This is especially important for contributions to international timescales and remote clock comparisons over large distances in cases where relatively short averaging times are used, since in such situations the time-variable gravity potential components may not average out sufficiently. Moreover, the tidal peak-to-peak signal could also prove useful for evaluating the performance of optical clocks, by providing a detectability test.
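The conversion factors stated above follow directly from Eq. (11) and can be sketched in Python as follows. The gravity value used for the conversion to height (\(9.81\,\hbox {m}\,\hbox {s}^{-2}\)) is an assumed nominal value, and the function names are hypothetical.

```python
# Illustrative sketch: converting a fractional frequency shift into a
# gravity potential difference and an approximate height difference.

C_LIGHT = 299_792_458.0   # speed of light in m/s


def potential_difference(fractional_shift: float) -> float:
    """Gravity potential difference in m^2/s^2 for a given fractional shift."""
    return fractional_shift * C_LIGHT**2


def height_difference(fractional_shift: float, g: float = 9.81) -> float:
    """Approximate height difference in m, using an assumed gravity value g."""
    return potential_difference(fractional_shift) / g


dw = potential_difference(1e-18)
dh = height_difference(1e-18)

print(f"1e-18 in frequency  ->  {dw:.4f} m^2/s^2  ->  {dh * 1000:.1f} mm")
```

The same functions show that the threshold of roughly 5 parts in \(10^{17}\) mentioned for time-variable effects corresponds to a few \(\hbox {m}^{2}\,\hbox {s}^{-2}\) in potential, which is the order of magnitude of the solid Earth tide signal given in Sect. 2.4.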

Another important consequence of Eq. (11) is that geodetic knowledge of the Earth’s gravity potential (and corresponding gravity field related heights) can be used to predict frequency shifts between local and remote (optical) clocks and, vice versa, that the clocks can be used to determine gravity potential differences. To the knowledge of the authors, the latter technique was first mentioned in the geodetic literature by Bjerhammar (1975) within a short section on a “new physical geodesy”. Vermeer (1983) introduced the term “chronometric levelling”, while Bjerhammar (1985) discussed the clock-based levelling approach under the title “relativistic geodesy” and also included a definition of a relativistic geoid as the “surface closest to mean sea level, where clocks run with the same speed”. Regarding the terminology, Delva and Lodewyck (2013) consider that “relativistic geodesy” should cover all geodetic topics based on a relativistic framework and suggest, like Petit et al. (2014), that the term “chronometric geodesy” should be used for all geodetic disciplines employing (atomic) clocks. This definition of terms is well conceived, and in this contribution, the term “chronometric levelling”, although somewhat restrictive, is preferred, as it characterizes quite accurately the clock-based levelling approach based on Eq. (11).

With regard to the definition of the geoid, when considering Eq. (11) and the underlying level of approximation (of better than \(10^{-18}\) in frequency or 1 cm in height), both the classical (geodetic) definition and the relativistic (chronometric) definition given by Bjerhammar (1985) relate to a selected level surface within the Earth’s gravity field, as defined in classical physical geodesy within the Newtonian framework. Further refinements of a relativistic geoid definition are discussed, for example, in Soffel et al. (1988), Kopeikin (1991), or Müller et al. (2008), where additional terms at the few mm level show up, but the option for a transformation between the different relativistic definitions and the classical version also exists. Therefore, even if future optical atomic clocks (operationally) work at the level of \(10^{-19}\) or below and deliver corresponding gravity potential differences, these may still be integrated into the framework of classical physical geodesy and gravity field modelling by considering appropriate corrections. Consequently, as most terrestrial geodetic applications do not require a relativistic treatment, with only a few areas (e.g. reference systems and time, ephemerides and satellite orbits, global geopotential modelling in connection with satellite observations) needing some relativistic background, the geodetic community is unlikely to switch soon to a (much more complicated) fully relativistic framework; this is because the classical Newtonian formulations are usually sufficient, far simpler to handle, and only exceptionally require some relativistic corrections. For a discussion of the classical geoid definition and different existing numerical values for the zero level surface, see “Appendix 1”.

3 Geodetic methods for determining the gravity potential

This section deals with geodetic methods for determining the gravity potential, needed for the computation of relativistic redshift corrections for optical clock observations. The focus is on the determination of the static (spatially variable) part of the potential field, while temporal variations in the station coordinates and the potential quantities are assumed to be taken into account through appropriate reductions or by using sufficiently long averaging times (see also Sect. 2.4). This is common geodetic practice and leads to a quasi-static state (e.g. by referring all quantities to a given epoch), such that the Earth can be considered as a rigid and non-deformable body, uniformly rotating about a body-fixed axis. Hence, all gravity field quantities including the level surfaces are considered in the following as static quantities, which do not change in time.

In this context, a note on the handling of the permanent (time-independent) parts of the tidal corrections is appropriate; for details, see, for example, Mäkinen and Ihde (2009), Ihde et al. (2008), or Denker (2013). The IAG has recommended that the so-called zero-tide system should be used (resolutions no. 9 and 16 from the year 1983; cf. Tscherning 1984), where the direct (permanent) tide effects are removed, but the indirect deformation effects associated with the permanent tidal deformation are retained. Unfortunately, geodesy and other disciplines do not strictly follow the IAG resolutions for the handling of the permanent tidal effects, and therefore, depending on the application, appropriate corrections may be necessary to refer all quantities to a common tidal system (see Sect. 4 and the aforementioned references).

“Appendix 1” outlines some necessary fundamentals of physical geodesy. This includes the introduction of the gravity potential and its components according to Eq. (8) as the fundamental quantity, from which all other relevant gravity field parameters can be derived, the definition of the geoid as a selected equipotential surface and its relation to mean sea level, as well as the choice of different (conflicting) zero potential values (\(W_{0}\) issue), being largely a matter of convention. In the following, two geodetic approaches for deriving gravity potential values are discussed.

3.1 The geometric levelling approach

The classical and most direct way to obtain gravity potential differences is based on geometric levelling and gravity observations, denoted here as the geometric levelling approach. Based on Eq. (33) in “Appendix 1”, the gravity potential differential can be expressed as

$$\begin{aligned} \mathrm{d}W=\frac{\partial W}{\partial x}\mathrm{d}x+\frac{\partial W}{\partial y}\mathrm{d}y+\frac{\partial W}{\partial z}\mathrm{d}z=\mathrm{grad}\,W\cdot \mathrm{d}\mathbf{s}=\mathbf{g}\cdot \mathrm{d}\mathbf{s}=-g\,\mathrm{d}n, \end{aligned}$$

where ds is the vectorial line element, g is the magnitude of the gravity vector, and dn is the distance along the outer normal of the level surface (zenith or vertical), which by integration leads to the geopotential number C in the form

$$\begin{aligned} C^{(i)}=W_0^{(i)} -W_P =-\int \limits _{P_{0(i)} }^P {\mathrm{d}W} =\int \limits _{P_{0(i)} }^P {g \mathrm{d}n} , \end{aligned}$$

where P is a point at the Earth’s surface, (i) refers to a given height datum, and \(P_{0(i)}\) is an arbitrary point on the selected zero level or height reference surface (with gravity potential \(W_0^{(i)}\)). Thus, in addition to the raw levelling results (dn), gravity observations (g) are needed along the path between \(P_{0(i)}\) and P. The spacing and uncertainty required for these gravity points is discussed in standard geodesy textbooks, e.g. Torge and Müller (2012). The geopotential number C is defined such that it is positive for points P above the zero level surface, similar to heights. The zero level surface and the corresponding potential are typically selected in an implicit way by connecting the levelling to a fundamental national tide gauge, but the exact numerical value of the reference potential \(W_0^{(i)}\) is usually unknown. As mean sea level deviates from a level surface within the Earth’s gravity field due to the dynamic ocean topography, this leads to inconsistencies of more than 0.5 m between different national height systems across Europe, the extreme being Belgium, which differs by more than 2 m from all other European countries due to the selection of low tide water as the reference (instead of mean sea level).
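In practice, the integral in Eq. (13) is replaced by a sum over levelling sections. The following Python sketch shows one simple discretization, in which each levelled height difference \(\delta n\) is multiplied by the mean of the gravity values at the two ends of the section. This averaging rule, the function name, and all numerical values are assumptions made for illustration; they are not the procedure of any specific national levelling network.

```python
from typing import Sequence

# Illustrative sketch: geopotential number C = integral of g dn,
# approximated by a sum over levelling sections.


def geopotential_number(height_diffs: Sequence[float],
                        gravity: Sequence[float]) -> float:
    """Sum of (mean section gravity) * (levelled height difference).

    height_diffs: levelled differences dn per section in m (n sections)
    gravity:      gravity values in m/s^2 at the n + 1 benchmarks
    """
    if len(gravity) != len(height_diffs) + 1:
        raise ValueError("need one gravity value per benchmark (n + 1 values)")
    c = 0.0
    for i, dn in enumerate(height_diffs):
        g_mean = 0.5 * (gravity[i] + gravity[i + 1])  # mean gravity of section i
        c += g_mean * dn
    return c


# Hypothetical levelling line with three sections (values invented for illustration)
dn_sections = [1.2345, -0.4321, 2.0000]                 # m
g_benchmarks = [9.81234, 9.81230, 9.81232, 9.81225]     # m/s^2

c_number = geopotential_number(dn_sections, g_benchmarks)
print(f"geopotential number difference: {c_number:.4f} m^2/s^2")
```

Because the result is a potential difference, it does not depend on the levelling path, in contrast to the plain sum of the raw height differences.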

Geometric levelling (also called spirit levelling) itself is a quasi-differential technique, which provides height differences \(\delta n\) (backsight minus foresight reading) with respect to a local horizontal line of sight. The uncertainty of geometric levelling is rather low over shorter distances, where it can reach the sub-millimetre level, but it is susceptible to systematic errors up to the decimetre level over 1000 km distance (see also Sect. 3.3). In addition, the non-parallelism of the level surfaces cannot be neglected over larger distances, as it results in a path dependence of the raw levelling results \((\oint {\mathrm{d}n} \ne 0)\), but this problem can be overcome by using potential differences, which are path-independent because the gravity field is conservative (\(\oint {\mathrm{d}W} =0\)). For this reason, geopotential numbers are almost exclusively used as the foundation for national and continental height reference systems (vertical datum) worldwide, but one can also work with heights and corresponding gravity corrections to the raw levelling results (cf. Torge and Müller 2012).

Although the geopotential numbers are ideal quantities for describing the direction of water flow, they have the unit \(\hbox {m}^{2 }\,\hbox {s}^{-2}\) and are thus somewhat inconvenient in disciplines such as civil engineering. A conversion to metric heights is therefore desirable, which can be achieved by dividing the C values by an appropriate gravity value. Widely used are the orthometric heights (e.g. in the USA, Canada, Austria, and Switzerland) and normal heights (e.g. in Germany and many other European countries). Heights also play an important role in gravity field modelling due to the strong height dependence of various gravity field quantities.

The orthometric height H is defined as the distance between the surface point P and the zero level surface (geoid), measured along the curved plumb line, which explains the common understanding of this term as “height above sea level” (Torge and Müller 2012). The orthometric height can be derived from Eq. (13) by integrating along the plumb line, giving

$$\begin{aligned} H^{(i)}=\frac{C^{(i)}}{\overline{{g}}},\quad \overline{{g}}=\frac{1}{H^{(i)}}\int \limits _0^{H^{(i)}} {g \mathrm{d}H} , \end{aligned}$$

where \(\overline{{g}}\) is the mean gravity along the plumb line (inside the Earth). As \(\overline{{g}}\) cannot be observed directly, hypotheses about the interior gravity field are necessary, which is one of the main drawbacks of the orthometric heights. Therefore, in order to avoid hypotheses about the Earth’s interior gravity field, the normal heights \(H^{N}\) were introduced by Molodensky (e.g. Molodenskii et al. 1962) in the form

$$\begin{aligned} H^{N(i)}=\frac{C^{(i)}}{\overline{{\gamma }}},\quad \overline{{\gamma }}=\frac{1}{H^{N(i)}}\int \limits _0^{H^{N(i)}} {\gamma \mathrm{d}H^{N}} , \end{aligned}$$

where \(\overline{{\gamma }}\) is a mean normal gravity value along the normal plumb line (within the normal gravity field, associated with the level ellipsoid), and \(\gamma \) is the normal gravity acceleration along this line. Consequently, the normal height \(H^{N}\) is measured along the slightly curved normal plumb line (Torge and Müller 2012).
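Since \(\overline{{\gamma }}\) in Eq. (15) itself depends on \(H^{N}\), the normal height is usually obtained iteratively. The Python sketch below illustrates this under several assumptions that are not part of the text above: the GRS80 constants, the Somigliana formula for normal gravity on the ellipsoid, and the common second-order textbook approximation for the mean normal gravity along the normal plumb line. All function names are hypothetical.

```python
import math

# Illustrative sketch: normal height H^N = C / mean normal gravity,
# solved by fixed-point iteration. GRS80 constants are assumed.

A = 6_378_137.0              # semimajor axis in m
F = 1.0 / 298.257222101      # flattening
M = 0.00344978600308         # omega^2 a^2 b / GM
GAMMA_E = 9.7803267715       # normal gravity at the equator in m/s^2
K = 0.001931851353           # Somigliana constant
E2 = 0.00669438002290        # first eccentricity squared


def normal_gravity_ellipsoid(lat_deg: float) -> float:
    """Normal gravity on the ellipsoid (Somigliana formula)."""
    s2 = math.sin(math.radians(lat_deg)) ** 2
    return GAMMA_E * (1.0 + K * s2) / math.sqrt(1.0 - E2 * s2)


def mean_normal_gravity(lat_deg: float, hn: float) -> float:
    """Mean normal gravity between ellipsoid and height hn (series approximation)."""
    s2 = math.sin(math.radians(lat_deg)) ** 2
    gamma0 = normal_gravity_ellipsoid(lat_deg)
    return gamma0 * (1.0 - (1.0 + F + M - 2.0 * F * s2) * hn / A + (hn / A) ** 2)


def normal_height(c: float, lat_deg: float, tol: float = 1e-6,
                  max_iter: int = 20) -> float:
    """Iterate H^N = C / mean_normal_gravity(H^N) until the change is below tol (m)."""
    hn = c / normal_gravity_ellipsoid(lat_deg)   # first guess
    for _ in range(max_iter):
        hn_new = c / mean_normal_gravity(lat_deg, hn)
        if abs(hn_new - hn) < tol:
            return hn_new
        hn = hn_new
    return hn


hn_example = normal_height(1000.0, 45.0)   # hypothetical C = 1000 m^2/s^2 at 45 deg
print(f"normal height: {hn_example:.4f} m")
```

The iteration converges within a few steps because \(\overline{{\gamma }}\) varies only weakly with height. An orthometric height would follow the same pattern, but with a mean gravity value \(\overline{{g}}\) that requires a hypothesis about the interior gravity field.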

While the orthometric and normal heights are related to the Earth’s gravity field, the ellipsoidal heights h, as derived from GNSS observations, are purely geometric quantities, describing the distance (along the ellipsoid normal) of a point P from a conventional reference ellipsoid. As the geoid and quasigeoid serve as the zero height reference surface (vertical datum) for the orthometric and normal heights, respectively, the following relation holds

$$\begin{aligned} h=H^{(i)}+N^{(i)}=H^{N(i)}+\zeta ^{(i)}, \end{aligned}$$

where \(N^{(i)}\) is the geoid undulation, and \(\zeta ^{(i)}\) is the quasigeoid height or height anomaly; for further details on the geoid and quasigeoid (height anomalies) see, for example, Torge and Müller (2012). Equation (16) neglects the fact that strictly the relevant quantities are measured along slightly different lines in space, but the maximum effect is only at the sub-millimetre level (for further details cf. Denker 2013).

Lastly, the geometric levelling approach gives only gravity potential differences, but the associated constant zero potential \(W_0^{(i)}\) can be determined by at least one (better several) GNSS and levelling points in combination with the (gravimetrically derived) disturbing potential, as described in the next section. Rearranging the above equations gives the desired gravity potential values in the form

$$\begin{aligned} W_P =W_0^{(i)} -C^{(i)}=W_0^{(i)} -\overline{{g}}H^{(i)}=W_0^{(i)} -\overline{{\gamma }}H^{N(i)}, \end{aligned}$$

and hence the geopotential numbers and the heights \(H^{(i)}\) and \(H^{N(i)}\) are fully equivalent.

3.2 The GNSS/geoid approach

For the determination of the gravity potential W, one of the primary goals of geodesy, gravity measurements form one of the most important data sets. However, since gravity (represented as \(g = \vert \mathbf{g}\vert \) = length of the gravity vector g) and other relevant observations depend in general in a nonlinear way on the potential W, the observation equations must be linearized by introducing an a priori known reference potential as well as a corresponding reference surface (positions). Usually, the normal gravity field related to the level ellipsoid is employed for this, requiring that the ellipsoid surface is a level surface of its own gravity field. The level ellipsoid is chosen as a conventional system, because it is easy to compute (from just four fundamental parameters, e.g. two geometrical parameters for the ellipsoid plus the total mass M and the angular velocity \(\omega \)), useful for other disciplines, and also utilized for describing station positions. However, today spherical harmonic expansions based on satellite data could also be employed (cf. Denker 2013).

The linearization process leads to the disturbing (or anomalous) potential T defined as

$$\begin{aligned} T_P =W_P -U_P , \end{aligned}$$

where U is the normal gravity potential associated with the level ellipsoid. Accordingly, the gravity vector and other parameters are approximated by corresponding reference quantities, leading to gravity anomalies, gravity disturbances, vertical deflections, height anomalies, geoid undulations, etc. (cf. Torge and Müller 2012). The main advantage of the linearization process is that the residual quantities (with respect to the known reference field) are in general four to five orders of magnitude smaller than the original ones, and in addition they are less position-dependent.

Accordingly, the gravity anomaly is given by

$$\begin{aligned} \Delta g_P =g_P -\gamma _Q =-\frac{\partial T}{\partial h}+\frac{1}{\gamma }\frac{\partial \gamma }{\partial h}T-\frac{1}{\gamma }\frac{\partial \gamma }{\partial h}\left( W_0^{(i)} -U_0 \right) , \end{aligned}$$

where \(g_{P}\) is the gravity acceleration at the observation point P (at the Earth’s surface or above), \(\gamma _{Q}\) is the normal gravity acceleration at a known linearization point Q (telluroid, Q is located on the same ellipsoidal normal as P at a distance \(H^{N}\) above the ellipsoid, or equivalently \(U_{Q}=W_{P}\); for further details, see Denker 2013), the derivatives are with respect to the ellipsoidal height h, and \(\delta W_0^{(i)} =W_0^{(i)} -U_0\) is the potential difference between the zero level height reference surface (\(W_0^{(i)}\)) and the normal gravity potential \(U_{0}\) at the surface of the level ellipsoid. Equation (19) is also denoted as the fundamental equation of physical geodesy; it represents a boundary condition that has to be fulfilled by solutions of the Laplace equation for the disturbing potential T, sought within the framework of geodetic boundary value problems (GBVPs). Moreover, the subscripts P and Q are dropped on the right side of Eq. (19), noting that it must be evaluated at the known telluroid point (boundary surface).

In a similar way, the height anomaly is obtained by Bruns’s formula as

$$\begin{aligned} \zeta ^{(i)}=h-H^{N(i)}=\frac{T}{\gamma }-\frac{W_0^{(i)} -U_0 }{\gamma }=\frac{T}{\gamma }-\frac{\delta W_0^{(i)} }{\gamma }=\zeta +\zeta _0^{(i)} , \end{aligned}$$

which also implies that \(\zeta ^{(i)} \) and \(\zeta \) are associated with corresponding zero level surfaces \(W=W_0^{(i)} \) and \(W=U_{0}\), respectively. The \(\delta W_0^{(i)}\) term is also denoted as height system bias and is frequently omitted in the literature, implicitly assuming that \(W_0^{(i)}\) equals \(U_{0}\). However, when aiming at a consistent derivation of absolute potential values, the \(\delta W_0^{(i)} \) term has to be taken into consideration.
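The effect of the height system bias term in Eq. (20) can be illustrated with a short Python sketch. All input values are assumptions for illustration: the disturbing potential T and the normal gravity \(\gamma \) are invented, \(U_{0}\) is taken as the GRS80 value, and the IAU/IERS value of Eq. (10) is used as a stand-in for the zero potential \(W_0^{(i)}\) of a hypothetical height datum.

```python
# Illustrative sketch: height anomaly with height system bias term,
# zeta_i = T/gamma - (W0_i - U0)/gamma = zeta + zeta0_i  (Eq. 20).

U_0 = 62_636_860.850      # assumed normal potential of the GRS80 ellipsoid in m^2/s^2
W_0_DATUM = 62_636_856.0  # hypothetical zero potential of height datum (i) in m^2/s^2


def height_anomaly(t: float, gamma: float) -> float:
    """Bias-free height anomaly zeta = T / gamma (Bruns's formula)."""
    return t / gamma


def height_system_bias(w0_datum: float, u0: float, gamma: float) -> float:
    """Bias term zeta0_i = -(W0_i - U0) / gamma."""
    return -(w0_datum - u0) / gamma


T_EXAMPLE = 450.0         # hypothetical disturbing potential in m^2/s^2
GAMMA_EXAMPLE = 9.806     # hypothetical normal gravity at the telluroid in m/s^2

zeta = height_anomaly(T_EXAMPLE, GAMMA_EXAMPLE)
zeta0 = height_system_bias(W_0_DATUM, U_0, GAMMA_EXAMPLE)
zeta_i = zeta + zeta0

print(f"zeta     = {zeta:.4f} m")
print(f"zeta0(i) = {zeta0:.4f} m")
print(f"zeta(i)  = {zeta_i:.4f} m")
```

With these assumed values the bias term amounts to roughly half a metre, which shows why it cannot be omitted when absolute potential values at the centimetre level are sought.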

Accordingly, the disturbing potential T takes over the role of W as the new fundamental target quantity, to which all other gravity field quantities of interest are related. As T has the important property of being harmonic outside the Earth’s surface and regular at infinity, solutions of T are developed in the framework of potential theory and GBVPs, i.e. solutions of the Laplace equation are sought that fulfil certain boundary conditions. Now, the first option to compute T is based on the well-known spherical harmonic expansion, using coefficients derived from satellite data alone or in combination with terrestrial data (e.g. EGM2008; EGM—Earth Gravitational Model; Pavlis et al. 2012), yielding

$$\begin{aligned} T(\theta ,\lambda ,r)=\sum _{n=0}^{n_{\max }} \left( \frac{a}{r} \right) ^{n+1}\sum _{m=-n}^{n} \overline{T} _{nm} \overline{Y} _{nm} (\theta ,\lambda ), \end{aligned}$$


$$\begin{aligned} \overline{Y} _{nm} (\theta ,\lambda )= & {} \overline{P} _{n\left| m \right| } (\cos \theta )\left\{ {\begin{array}{l} \cos m\lambda \\ \sin \left| m \right| \lambda \\ \end{array}} \right\} , \nonumber \\ \overline{T} _{nm}= & {} \frac{GM}{a}\left\{ {{\begin{array}{l} {\Delta \overline{C} _{nm} } \\ {\Delta \overline{S} _{nm} } \\ \end{array} }} \right\} \quad \hbox {for } \left\{ {{\begin{array}{l} {m\ge 0} \\ {m<0} \\ \end{array} }} \right\} , \end{aligned}$$

where \((\theta , \lambda , r)\) are spherical coordinates, n and m are integers denoting the degree and order, GM is the geocentric gravitational constant (gravitational constant G times the mass of the Earth M), a is in the first instance an arbitrary constant, but is typically set equal to the semimajor axis of a reference ellipsoid, \(\overline{P}_{nm} (\cos \theta )\) are the fully normalized associated Legendre functions of the first kind, and \(\Delta \overline{C}_{nm} ,\Delta \overline{S}_{nm} \) are the (fully normalized) spherical harmonic coefficients (also denoted as Stokes’s constants), representing the difference in the gravitational potential between the real Earth and the level ellipsoid.
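As a rough illustration of Eqs. (21) and (22), the following Python sketch evaluates the disturbing potential from a set of fully normalized coefficients. The function names, the triangular coefficient layout `dC[n][m]`, `dS[n][m]` and the single non-zero coefficient are assumptions made here for illustration only. The Legendre functions are computed with the standard forward column recursion, which is adequate for low and moderate degrees but not for ultra-high-degree models.

```python
import math

def normalized_legendre(n_max, theta):
    """Fully normalized associated Legendre functions P[n][m] at cos(theta)."""
    t, u = math.cos(theta), math.sin(theta)
    P = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    P[0][0] = 1.0
    if n_max >= 1:
        P[1][1] = math.sqrt(3.0) * u
    for m in range(2, n_max + 1):          # sectorial terms (n == m)
        P[m][m] = u * math.sqrt((2 * m + 1) / (2 * m)) * P[m - 1][m - 1]
    for m in range(n_max + 1):             # remaining terms, column by column
        for n in range(m + 1, n_max + 1):
            a_nm = math.sqrt((4 * n * n - 1) / ((n - m) * (n + m)))
            value = a_nm * t * P[n - 1][m]
            if n - 2 >= m:
                b_nm = math.sqrt((2 * n + 1) * (n + m - 1) * (n - m - 1)
                                 / ((n - m) * (n + m) * (2 * n - 3)))
                value -= b_nm * P[n - 2][m]
            P[n][m] = value
    return P

def disturbing_potential(theta, lam, r, dC, dS, GM, a):
    """Evaluate T(theta, lam, r) in m^2 s^-2; cosine terms use dC, sine terms dS."""
    n_max = len(dC) - 1
    P = normalized_legendre(n_max, theta)
    total = 0.0
    for n in range(n_max + 1):
        inner = sum(
            P[n][m] * (dC[n][m] * math.cos(m * lam) + dS[n][m] * math.sin(m * lam))
            for m in range(n + 1)
        )
        total += (a / r) ** (n + 1) * inner
    return GM / a * total

# Illustrative call with one made-up coefficient (not a real model value).
GM = 3.986004418e14    # m^3 s^-2, a conventional value assumed here
a = 6378137.0          # m, GRS80 semimajor axis
dC = [[0.0] * 3 for _ in range(3)]
dS = [[0.0] * 3 for _ in range(3)]
dC[2][0] = 1.0e-6
T_equator = disturbing_potential(math.pi / 2, 0.0, a, dC, dS, GM, a)
```

Folding the orders m < 0 into the sine terms of m > 0 is equivalent to the summation from −n to n in Eq. (21), given the definition of the surface harmonics in Eq. (22).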

Regarding the uncertainty of a gravity field quantity computed from a global spherical harmonic model up to some fixed degree \(n_\mathrm{max}\), the coefficient uncertainties lead to the so-called commission error based on the law of error propagation, and the omitted coefficients above degree \(n_\mathrm{max}\), which are not available in the model, lead to the corresponding omission error. With dedicated satellite gravity field missions such as GRACE (Gravity Recovery and Climate Experiment) and GOCE (Gravity Field and Steady-State Ocean Circulation Explorer), the long-wavelength geoid and quasigeoid can today be determined with low uncertainty, e.g. about 1 mm at 200 km resolution (\(n = 95\)) and 1 cm at 150 km resolution (\(n = 135\)) from GRACE (e.g. Mayer-Gürr et al. 2014), and 1.5 cm at about 110 km resolution (\(n = 185\)) from GOCE (e.g. Mayer-Gürr et al. 2015; Brockmann et al. 2014). However, the corresponding omission error at these wavelengths is still quite significant with values at the level of several decimetres, e.g. 0.94 m for \(n = 90\), 0.42 m for \(n = 200\), and 0.23 m for \(n= 360\). For the ultra-high-degree geopotential model EGM2008 (Pavlis et al. 2012), which combines satellite and terrestrial data and is complete up to degree and order 2159, the omission error is 0.023 m, while the commission error is about 5–20 cm, depending on the region and the corresponding data quality. The above uncertainty estimates are based on the published potential coefficient standard deviations as well as a statistical model for the estimation of corresponding omission errors, but do not include the uncertainty contribution of GM (zero degree term in Eq. (21)); hence, the latter term, contributing about 3 mm in terms of the height anomaly (corresponds to about 0.5 ppb; see Smith et al. 2000; Ries 2014), has to be added in quadrature to the figures given above. Further details on the uncertainty estimates can be found in Denker (2013).
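The size of the GM contribution can be checked with a short calculation. The sketch below converts a relative GM uncertainty into a height-anomaly uncertainty through the zero-degree term GM/r and adds it in quadrature to another component. The numerical value of GM, the evaluation radius and the average gravity value are assumptions chosen for illustration, as are the function names.

```python
import math

GM = 3.986004418e14   # m^3 s^-2, conventional value (assumed here)
R = 6_371_000.0       # m, mean Earth radius used as evaluation radius (assumed)
GAMMA = 9.81          # m s^-2, average gravity (assumed)

def gm_height_uncertainty(rel_unc_gm, r=R, gamma=GAMMA):
    """Height-anomaly uncertainty (m) caused by a relative uncertainty of GM."""
    return rel_unc_gm * GM / r / gamma

def add_in_quadrature(*sigmas):
    """Combine uncorrelated standard deviations."""
    return math.sqrt(sum(s * s for s in sigmas))

sigma_gm = gm_height_uncertainty(0.5e-9)        # 0.5 ppb, as quoted in the text
sigma_total = add_in_quadrature(0.015, sigma_gm)  # e.g. with the 1.5 cm GOCE figure
print(f"GM term: {sigma_gm * 1000:.1f} mm, combined: {sigma_total * 1000:.1f} mm")
```

With these assumed constants the GM term comes out at roughly 3 mm, in line with the figure quoted above, and it changes a centimetre-level uncertainty only marginally.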

Based on these considerations it is clear that satellite measurements alone will never be able to supply the complete geopotential field with sufficient accuracy, which is due to the signal attenuation with height and the required satellite altitudes of a few 100 km. Only a combination of the highly accurate and homogeneous (long wavelength) satellite gravity fields with high-resolution terrestrial data (mainly gravity and topography data with a resolution down to 1–2 km and below) can cope with this task. In this respect, the satellite and terrestrial data complement each other in an ideal way, with the satellite data accurately providing the long-wavelength field structures, while the terrestrial data sets, which have potential weaknesses in large-scale accuracy and coverage, mainly contribute to the short-wavelength features.

This directly leads to the development of regional solutions for the disturbing potential and other gravity field parameters, which typically have a higher resolution (down to 1–2 km) than global spherical harmonic models. Based on the developments of Molodensky (e.g. Molodenskii et al. 1962), the disturbing potential T can be derived from a series of surface integrals, involving gravity anomalies and heights over the entire Earth’s surface, which in the first instance can be symbolically written as

$$\begin{aligned} T=\mathbf{M}(\Delta g), \end{aligned}$$

where M is the Molodensky operator and \(\Delta g\) are the gravity anomalies over the entire Earth’s surface.

Further details on regional gravity field modelling are given in “Appendix 2”, including the solution of Molodensky’s problem, the remove–compute–restore (RCR) procedure, the spectral combination approach, data requirements, and uncertainty estimates for the disturbing potential and quasigeoid heights. The investigations show that quasigeoid heights can be obtained with an uncertainty of 1.9 cm, where the major contributions come from the spectral band below spherical harmonic degree 360. This uncertainty estimate represents an optimistic scenario and is only valid for the case that a state-of-the-art global satellite model is employed and sufficient high-resolution and high-quality terrestrial gravity and terrain data are available around the point of interest (e.g. with a spacing of 2–4 km out to a distance of 50–100 km). Fortunately, such a data situation exists for most of the metrology institutes with optical clock laboratories—at least in Europe. Furthermore, the perspective exists to improve the uncertainty of the calculated quasigeoid heights (see “Appendix 2”).

Now, once the disturbing potential values T are computed, either from a global geopotential model by Eq. (21), or from a regional solution by Eq. (23) based on Molodensky’s theory, the gravity potential W, needed for the relativistic redshift corrections, can be computed most straightforwardly as

$$\begin{aligned} W_P =U_P +T_P , \end{aligned}$$

where the basic requirement is that the position of the given point P in space must be known accurately (e.g. from GNSS observations), as the normal potential U is strongly height-dependent, while T is only weakly height-dependent with a maximum vertical gradient of a few \(10^{-3}\,\hbox {m}^{2}\,\hbox {s}^{-2}\) per metre. The above equation also makes clear that the predicted potential values \(W_{P}\) are in the end independent of the choice of \(W_{0}\) and \(U_{0}\) used for the linearization. Furthermore, by combining Eq. (24) with Eq. (20), and representing U as a function of \(U_{0}\) and the ellipsoidal height h, the following alternative expressions for W (at point P) can be derived as

$$\begin{aligned} W_P =U_0 -\overline{{\gamma }}(h-\zeta )=U_0 -\overline{{\gamma }}(h-\zeta ^{(i)})+\delta W_0^{(i)} , \end{aligned}$$

which demonstrates that ellipsoidal heights (e.g. from GNSS) and the results from gravity field modelling in the form of the quasigeoid heights (height anomalies) \(\zeta \) or the disturbing potential T are required, whereby a similar equation can be derived for the geoid undulations N. Consequently, the above approach (Eqs. (24) and (25)) is denoted here somewhat loosely as the GNSS/geoid approach, which is also known in the literature as the GNSS/GBVP approach (the geodetic boundary value problem is the basis for computing the disturbing potential T; see, for example, Rummel and Teunissen 1988; Heck and Rummel 1990).
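A minimal numerical sketch of Eq. (25) is given below. It evaluates both forms of the equation and shows that they agree when the same gravity value is used for \(\gamma \) and \(\overline{\gamma }\), which is a simplification made here. The station values and the function names are hypothetical; only \(U_{0}\) is the GRS80 value quoted in this paper.

```python
U0_GRS80 = 62_636_860.850   # m^2 s^-2, normal potential of the GRS80 level ellipsoid

def potential_from_zeta(h, zeta, gamma_mean, U0=U0_GRS80):
    """First form of Eq. (25): W_P = U0 - gamma_mean * (h - zeta)."""
    return U0 - gamma_mean * (h - zeta)

def potential_from_biased_zeta(h, zeta_i, delta_W0, gamma_mean, U0=U0_GRS80):
    """Second form of Eq. (25), using a height anomaly that includes the bias term."""
    return U0 - gamma_mean * (h - zeta_i) + delta_W0

# Hypothetical station (values for illustration only).
h = 130.0           # m, ellipsoidal height
zeta = 43.5         # m, height anomaly referring to W = U0
gamma_mean = 9.81   # m s^-2, assumed mean normal gravity
delta_W0 = -2.943   # m^2 s^-2, assumed height system bias W0(i) - U0

zeta_i = zeta - delta_W0 / gamma_mean   # Eq. (20) with gamma taken equal to gamma_mean
W_a = potential_from_zeta(h, zeta, gamma_mean)
W_b = potential_from_biased_zeta(h, zeta_i, delta_W0, gamma_mean)
print(W_a, W_b)
```

The agreement of the two results illustrates the statement that the predicted potential does not depend on the zero level chosen for the linearization.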

The GNSS/geoid approach depends strongly on precise gravity field modelling (disturbing potential T, metric height anomalies \(\zeta \) or geoid undulations N) and precise GNSS positions (ellipsoidal heights h) for the points of interest, with the advantage that it delivers the absolute gravity potential W, which is not directly observable and is therefore always based on the assumption that the gravitational potential is regular (zero) at infinity (see “Appendix 1”). In addition, the GNSS/geoid approach allows the derivation of the height system bias term \(\delta W_0^{(i)} \) based on Eq. (20) together with at least one (better several) common GNSS and levelling stations in combination with the gravimetrically determined disturbing potential T.

3.3 Uncertainty considerations

The following uncertainty considerations are based on heights, but corresponding potential values can easily be obtained by multiplying the metric values with an average gravity value (e.g. \(9.81\,\hbox {m}\,\hbox {s}^{-2}\) or roughly \(10\,\hbox {m}\,\hbox {s}^{-2})\). Regarding the geometric levelling and the GNSS/geoid approach, the most direct and accurate way to derive potential differences over short distances is the geometric levelling technique, as standard deviations of 0.2–1.0 mm can be attained for a 1-km double-run levelling with appropriate technical equipment (Torge and Müller 2012). However, the uncertainty of geometric levelling depends on many factors, with some of the levelling errors behaving in a random manner and propagating with the square root of the number of individual set-ups or the distance, respectively, while other errors of systematic type may propagate with distance in a less favourable way. Consequently, it is important to keep in mind that geometric levelling is a differential technique and hence may be susceptible to systematic errors; examples include the differences between the second and third geodetic levelling in Great Britain (about 0.2 m in the north–south direction over about 1000 km distance; Kelsey 1972), corresponding differences between an old and new levelling in France (about 0.25 m from the Mediterranean Sea to the North Sea, also mainly in north–south direction, distance about 900 km; Rebischung et al. 2008), and inconsistencies of more than ±1 m across Canada and the USA (differences between different levellings and with respect to an accurate geoid; Véronneau et al. 2006; Smith et al. 2010, 2013). In addition, a further complication with geometric levelling in different countries is that the results are usually based on different tide gauges with offsets between the corresponding zero level surfaces, which, for example, reach more than 0.5 m across Europe. 
Furthermore, in some countries the levelling observations are about 100 years old and thus may no longer represent the actual situation, because vertical crustal movements may have occurred since then.

With respect to the GNSS/geoid approach, the uncertainty of the GNSS positions is today more or less independent of the interstation distance. For instance, the station coordinates provided by the International GNSS Service (IGS) or the IERS (e.g. ITRF2008) reach vertical accuracies of about 5–10 mm (cf. Altamimi et al. 2011, 2016, or Seitz et al. 2013). The uncertainty of the quasigeoid heights (height anomalies) is discussed mainly in “Appendix 2”, but also mentioned in the previous subsection, showing that a standard deviation of 1.9 cm is possible in a best-case scenario and that the values are nearly uncorrelated over longer distances, with a correlation of less than 10% beyond a distance of about 180 km. Aiming at the determination of the absolute gravity potential W according to Eqs. (24) or (25), which is the main advantage of the GNSS/geoid over the geometric levelling approach, both the uncertainties of GNSS and the quasigeoid have to be considered. Assuming a standard deviation of 1.9 cm for the quasigeoid heights and 1 cm for the GNSS ellipsoidal heights without correlations between both quantities, a standard deviation of 2.2 cm is finally obtained (in terms of heights) for the absolute potential values based on the GNSS/geoid approach. Thus, for contributions of optical clocks to timescales, which require the absolute potential \(W_{P}\) relative to a conventional zero potential \(W_{0}\) (see Sect. 2.5), the relativistic redshift correction can be computed with an uncertainty of about \(2 \times 10^{-18}\). This is the case more or less everywhere in the world where high-resolution regional gravity field models have been developed on the basis of a state-of-the-art global satellite model in combination with sufficient terrestrial gravity field data. On the other hand, for potential differences over larger distances of a few 100 km (i.e. typical distances between different NMIs), the statistical correlations of the quasigeoid values virtually vanish, which then leads to a standard deviation for the potential difference of 3.2 cm in terms of height, i.e. \(\sqrt{2}\) times the figure given above for the absolute potential (according to the law of error propagation), which again has to be considered as a best-case scenario. This would also hold for intercontinental connections between metrology institutes, provided again that sufficient regional high-resolution terrestrial data exist around these places. Furthermore, in view of future refined satellite and terrestrial data (see “Appendix 2”), the perspective exists to improve the uncertainty of the relativistic redshift corrections from the level of a few parts in \(10^{18}\) to one part in \(10^{18}\) or below. Accordingly, over long distances across national borders, the GNSS/geoid approach should be preferable to geometric levelling. Finally, “Appendix 3” gives some general recommendations for the implementation of geometric levelling and GNSS observations at typical clock sites.
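The link between the height uncertainties quoted in this subsection and the fractional redshift uncertainty can be sketched in a few lines of Python. The conversion multiplies the height uncertainty by an average gravity value and divides by the square of the speed of light; the gravity value of \(9.81\,\hbox {m}\,\hbox {s}^{-2}\) is the average mentioned above, and the function name is chosen here for illustration.

```python
C_LIGHT = 299_792_458.0   # m s^-1, speed of light (exact by definition)

def redshift_uncertainty(sigma_height_m, g=9.81):
    """Fractional frequency uncertainty for a height uncertainty given in metres."""
    return g * sigma_height_m / C_LIGHT ** 2

for sigma_h in (0.022, 0.032, 0.010):   # absolute, difference, and 1 cm cases
    print(f"{sigma_h * 100:.1f} cm -> {redshift_uncertainty(sigma_h):.2e}")
```

A height uncertainty of 1 cm thus corresponds to roughly one part in \(10^{18}\), which is why centimetre-level potential determination is the relevant target for optical clock comparisons.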

Fig. 1 Map showing the locations of the PTB, LUH, MPQ, and OBSPARIS sites

4 Practical results for optical clock sites in Germany and France

4.1 Geometric levelling and GNSS observations

In order to demonstrate the performance of geodetic methods for determining the gravity potential and corresponding differences at national and international scales, this section discusses some practical results for three optical clock sites in Germany and one in France. Following the recommendations for geometric levelling and GNSS observations outlined in “Appendix 3”, corresponding surveys were carried out at the Physikalisch-Technische Bundesanstalt (PTB) in Braunschweig, Germany, the Leibniz Universität Hannover (LUH) in Hannover, Germany, the Max-Planck-Institut für Quantenoptik (MPQ) in Garching (near Munich), Germany, and the Paris Observatory (l’Observatoire de Paris, OBSPARIS), Paris, France. The locations of the four selected clock sites are shown in Fig. 1; the linear distances between the German sites range from 52 km for PTB–LUH to 457 km for PTB–MPQ and 480 km for LUH–MPQ, while the corresponding distances between OBSPARIS and the German sites are 690 km (PTB), 653 km (LUH), and 690 km (MPQ).

The coordinates of all GNSS stations were referred to the ITRF2008 at its associated standard reference epoch 2005.0. The geometric levelling results were based in the first instance on the corresponding national vertical reference networks and then converted to the EVRS (European Vertical Reference System) using its latest realization EVRF2007 (European Vertical Reference Frame), which is based on a common adjustment of all available European levelling observations (UELN, United European Levelling Network). The measurements within the UELN originate from very different epochs, but reductions for vertical crustal movements were only applied for the (still ongoing) post-glacial isostatic adjustment (GIA) in northern Europe, using the Nordic Geodetic Commission (NKG) model NKG2005LU (Ågren and Svensson 2007) with the epoch 2000.0; for further details on EVRF2007, see Sacher et al. (2008). However, as GIA hardly affects the aforementioned clock sites, while other sources of vertical crustal movements are not known, the EVRF2007 heights are considered as stable in time in the following.

Further details on the local levelling results and the conversion to EVRF2007 as well as the corresponding GNSS observations and results are given in “Appendix 4”. In general, the uncertainty of geometric levelling is at the few mm level, and the uncertainty of the GNSS ellipsoidal heights is estimated to be better than 10 mm.

Before checking the consistency between the GNSS and levelling heights at each site, it is important to handle the permanent parts of the tidal corrections in a consistent manner. While the European height reference frame EVRF2007 and the European gravity field modelling performed at LUH follow the IAG resolutions to use the zero-tide system, most GNSS coordinates (including the ITRF and IGS results) refer to the “non-tidal (or tide-free) system”. Therefore, for consistency with the IAG recommendations and the other quantities involved (EVRF2007 heights, quasigeoid), the ellipsoidal heights from GNSS were converted from the non-tidal to the zero-tide system based on the following formula from Ihde et al. (2008) with

$$\begin{aligned} h_{zt} =h_{nt} +60.34-179.01\sin ^{2}\varphi -1.82\sin ^{4}\varphi \quad \quad \hbox {(mm)}, \end{aligned}$$

where \(\varphi \) is the ellipsoidal latitude, and \(h_{nt}\) and \(h_{zt}\) are the non-tidal and zero-tide ellipsoidal heights, respectively. Hence, the zero-tide heights over Europe are about 3–5 cm smaller than the corresponding non-tidal heights.
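Equation (26) is simple enough to be applied directly. The following Python sketch implements the conversion with the coefficients given above; the function name and the example latitudes are chosen here for illustration and are not the actual station coordinates.

```python
import math

def nontidal_to_zero_tide(h_nt_m, lat_deg):
    """Convert a non-tidal ellipsoidal height (m) to the zero-tide system, Eq. (26)."""
    s2 = math.sin(math.radians(lat_deg)) ** 2
    correction_mm = 60.34 - 179.01 * s2 - 1.82 * s2 ** 2
    return h_nt_m + correction_mm / 1000.0

# Approximate mid-European latitudes (illustrative values only).
for lat in (48.0, 50.0, 52.0):
    shift_mm = (nontidal_to_zero_tide(0.0, lat) - 0.0) * 1000.0
    print(f"latitude {lat:.0f} deg: h_zt - h_nt = {shift_mm:.1f} mm")
```

For latitudes between about 48° and 52° the correction is negative and amounts to roughly 4–5 cm, consistent with the range stated above.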

All results for the PTB, LUH, MPQ, and OBSPARIS sites are documented in Table 1, which contains the ellipsoidal heights, the normal heights referring to the national and the EVRF2007 height networks, and the gravimetric quasigeoid heights based on the European Gravimetric (quasi)Geoid EGG2015. Here, it should be noted that the zero level of the French levelling network based on the tide gauge in Marseille at the Mediterranean Sea is almost half a metre below the zero level surface of the German and EVRF2007 height networks related to the Amsterdam tide gauge at the North Sea. Furthermore, ellipsoidal coordinates are given in Table 2 at the epoch 2005.0, the standard reference epoch associated with the ITRF2008.

4.2 The European gravimetric quasigeoid model EGG2015

In this contribution, the latest European gravimetric quasigeoid model EGG2015 (Denker 2015) is employed. The major differences between EGG2015 and the previous EGG2008 model (Denker 2013) are the inclusion of additional gravity measurements carried out recently around all major European optical clock sites within the ITOC project (Margolis et al. 2013; Margolis 2014; see also http://projects.npl.co.uk/itoc/) and the use of a newer geopotential model based on the GOCE satellite mission instead of EGM2008. EGG2015 was computed from surface gravity data in combination with topographic information and the geopotential model GOCO05S (Mayer-Gürr et al. 2015) based on the RCR technique, as outlined in “Appendix 2”. The estimated uncertainty (standard deviation) of the absolute quasigeoid values is 1.9 cm; further details including correlation information can be found in “Appendix 2” as well as Denker (2013). In order to make EGG2015 consistent with GNSS and the EVRF2007, the final model values were computed according to Eq. (20) as

$$\begin{aligned} \zeta ^{(\mathrm{EGG2015})}= & {} \frac{T}{\gamma }-\frac{\delta W_0^{(\mathrm{EGG2015})} }{\gamma }=\zeta +\zeta _0^{\mathrm{(EGG2015)}},\quad \hbox {with}\nonumber \\ \zeta _0^{\mathrm{(EGG2015)}}= & {} +\,\hbox {0.300 m}, \end{aligned}$$

where a slightly rounded value for \(\zeta _0^{\mathrm{(EGG2015)}} \) was implemented; the original value of +0.305 m with a formal uncertainty of 0.002 m resulted from the comparison with 1139 stations from the EUVN_DA GNSS/levelling data set (EUVN_DA: European Vertical Reference Network – Densification Action; Kenyeres et al. 2010). The rounded value of +0.300 m for \(\zeta _0^{\mathrm{(EGG2015)}} \) implies a zero potential for the EGG2015 model of \(W_0^{\mathrm{(EGG2015)}} = 62{,}636{,}857.91\,\hbox {m}^{2}\,\hbox {s}^{-2}\), while the original value of +0.305 m leads to a corresponding zero potential of \(W_0^{(\mathrm{EVRF2007})} = 62{,}636{,}857.86\,\hbox {m}^{2}\,\hbox {s}^{-2}\) for EVRF2007, both having a formal uncertainty of \(0.02\,\hbox {m}^{2}\,\hbox {s}^{-2}\).
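The relation between the offset \(\zeta _0^{(i)}\) and the zero potential follows from Eq. (20): since \(\zeta _0^{(i)} = -(W_0^{(i)} - U_0)/\gamma \), one obtains \(W_0^{(i)} = U_0 - \gamma \,\zeta _0^{(i)}\). The Python sketch below reproduces the two values quoted above to within the stated rounding, assuming an average gravity value of \(9.81\,\hbox {m}\,\hbox {s}^{-2}\) for \(\gamma \) (the value actually used is not given here) and the GRS80 normal potential \(U_{0}\).

```python
U0_GRS80 = 62_636_860.850   # m^2 s^-2, normal potential of the GRS80 level ellipsoid

def zero_potential(zeta0_m, gamma=9.81, U0=U0_GRS80):
    """Zero potential W0 implied by a height-anomaly offset zeta0 (rearranged Eq. (20))."""
    return U0 - gamma * zeta0_m

W0_egg2015 = zero_potential(0.300)    # rounded offset implemented in EGG2015
W0_evrf2007 = zero_potential(0.305)   # original offset from the EUVN_DA comparison
print(f"{W0_egg2015:.2f}  {W0_evrf2007:.2f}")
```

The formal uncertainty of 0.002 m in the offset likewise maps to about \(0.02\,\hbox {m}^{2}\,\hbox {s}^{-2}\) when multiplied by the same gravity value.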

Table 1 Evaluation of GNSS and levelling data for the PTB, LUH, MPQ, and OBSPARIS sites by considering the (quasi)geoid as a horizontal plane and by utilizing the European Gravimetric (Quasi)Geoid model EGG2015 with corresponding residuals #1 and #2, respectively

4.3 Consistency check of GNSS and levelling heights

In order to check the consistency between the GNSS and levelling heights, corresponding quasigeoid heights were computed as \(\zeta _{\mathrm{GNSS}} = h_{zt} - H^{N (\mathrm{EVRF2007})}\), where \(h_{zt}\) is the ellipsoidal height from GNSS (ITRF2008, epoch 2005.0, zero-tide system) and \(H^{N (\mathrm{EVRF2007})}\) is the normal height based on the EVRF2007, see Table 1. As the maximum distance between the GNSS stations at each NMI is only about 600 m for the PTB site, and less for the three other sites, the quasigeoid at each site can be approximated in the first instance as a horizontal plane, i.e. a constant value; the remaining residuals (#1) are listed in Table 1 in column (8), attaining maximum values of 11 mm (RMS 7 mm; RMS: root mean square) for PTB, 9 mm (RMS 5 mm) for LUH, 2 mm (RMS 2 mm) for MPQ, and 5 mm (RMS 4 mm) for OBSPARIS. Besides this simple internal evaluation, a comparison with the independent gravimetric quasigeoid model EGG2015 is also performed, which is considered as an external evaluation. Table 1 shows the EGG2015 quasigeoid heights \(\zeta _{\mathrm{EGG2015}}\), the (raw) differences \(\zeta _{\mathrm{GNSS}}- \zeta _{\mathrm{EGG2015}}\) and the residuals about the mean difference in columns 9–11, respectively. Of most interest are the residuals (#2) about the mean difference (in column 11), which attain a maximum value of only 6 mm (RMS 4 mm) for PTB, 1 mm (RMS 1 mm) for LUH, 2 mm (RMS 1 mm) for MPQ, and 5 mm (RMS 4 mm) for OBSPARIS, proving the excellent consistency of the GNSS and levelling results. Although initial results were worse for the PTB and OBSPARIS sites, the problem was traced to an incorrect identification of the corresponding antenna reference points (ARPs); at the PTB site, an error of 16 mm was found for station PTBB, and at OBSPARIS, there was a difference of 29 mm between the ARP and the levelling benchmark and an additional error in the ARP height of 8 mm at station OPMT. 
It should be noted that, due to the high consistency of the GNSS and levelling data at all four sites, even quite small problems in the ARP heights (below 1 cm) could be detected and corrected after on-site inspections and additional verification measurements. This strongly supports the recommendation of “Appendix 3” to have sufficient redundancy in the GNSS and levelling stations. The mean differences are 23 mm or below for the three German sites, whereas 106 mm is obtained for the OBSPARIS site. The magnitude of the mean values for the German sites is excellent and proves the low uncertainty of both the geometric levelling and the GNSS/quasigeoid results in Germany as well as the correct implementation of the corresponding height system bias terms \((\delta W_0^{(i)} )\). The somewhat larger value for OBSPARIS is believed to be mainly related to accumulated systematic levelling errors (see Sect. 5 for a detailed discussion). Nevertheless, the results show that the different zero levels of the German and French height reference networks have been properly taken into account, recalling that the difference between the French and German zero level surfaces is about half a metre.
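The external evaluation described in this subsection amounts to a few arithmetic steps, sketched below in Python. The station values are invented for illustration and are not taken from Table 1; the function name and the RMS definition (about the mean, with divisor n) are likewise assumptions.

```python
import math

def consistency_check(h_zt, H_N, zeta_model):
    """Return mean difference, residuals about the mean, and their RMS (all in metres)."""
    # zeta_GNSS - zeta_model for every common GNSS and levelling station
    diffs = [h - H - z for h, H, z in zip(h_zt, H_N, zeta_model)]
    mean = sum(diffs) / len(diffs)
    residuals = [d - mean for d in diffs]
    rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return mean, residuals, rms

# Hypothetical site with three common GNSS/levelling stations.
h_zt = [130.201, 143.514, 123.708]     # ellipsoidal heights, zero-tide system
H_N = [87.442, 100.748, 80.953]        # normal heights
zeta_model = [42.740, 42.742, 42.739]  # quasigeoid heights from a model

mean, residuals, rms = consistency_check(h_zt, H_N, zeta_model)
print(f"mean difference {mean * 1000:.1f} mm, RMS {rms * 1000:.1f} mm")
```

A single residual that stands out from the others by a centimetre or more would, in such a check, point to a station-specific problem such as a misidentified antenna reference point rather than to the quasigeoid model.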

4.4 Gravity potential determination

In order to apply the GNSS/geoid approach according to Eq. (24) or (25), ellipsoidal heights are required for all stations of interest. However, initially GNSS coordinates are only available for a few selected points at each NMI site, while for most of the other laboratory points near the clocks, only levelled heights exist. Therefore, based on Eq. (20), a quantity \(\delta \zeta \) is defined as \(\delta \zeta =(h-H^{N(i)})-\zeta ^{(i)}\). This should be zero in theory, but is not in practice due to the uncertainties in the quantities involved (GNSS, levelling, quasigeoid). However, if a high-resolution quasigeoid model is employed (such as EGG2015), the term \(\delta \zeta \) should be small and represent only long-wavelength features, mainly due to systematic levelling errors over large distances as well as long-wavelength quasigeoid errors. In this case an average (constant) value \(\overline{\delta \zeta } \) can be used at each NMI site to convert all levelled heights into ellipsoidal heights by using

$$\begin{aligned} h^{(\mathrm{adj.})}=H^{N(i)}+\zeta ^{(i)}+\overline{\delta \zeta } =H^{N(i)}+(\zeta +\zeta _0^{(i)} )+\overline{\delta \zeta } , \end{aligned}$$

which is based on Eq. (16). This has the advantage that locally (at each NMI) the consistency is kept between the levelling results on the one hand and the GNSS/quasigeoid results on the other hand and consequently that the final potential differences between stations at each NMI are identical for the GNSS/geoid and geometric levelling approach, which is reasonable, as locally the uncertainty of levelling is usually lower than that of the GNSS/quasigeoid results. The quantity \(\overline{\delta \zeta } \) is in fact the mean value given in column (10) of Table 1 for each of the four sites. Furthermore, regarding the common GNSS and levelling stations, their (adjusted) ellipsoidal heights according to Eq. (28) (see Table 2) are identical with the observed ellipsoidal heights (column 3 in Table 1) plus the corresponding residual #2 (column 11 in Table 1).
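Equation (28) can be illustrated with a short Python sketch for a point that has only a levelled height. All numerical values and the function name are hypothetical; the mean value \(\overline{\delta \zeta }\) would in practice be taken from the common GNSS and levelling stations of the site.

```python
def adjusted_ellipsoidal_height(H_N, zeta_i, mean_delta_zeta):
    """Eq. (28): ellipsoidal height from a levelled normal height (all in metres)."""
    return H_N + zeta_i + mean_delta_zeta

# Hypothetical laboratory points with levelled heights only.
mean_delta_zeta = 0.020                  # m, site mean of (h - H_N) - zeta_i
point_a = adjusted_ellipsoidal_height(80.000, 42.740, mean_delta_zeta)
point_b = adjusted_ellipsoidal_height(81.250, 42.740, mean_delta_zeta)
print(point_a, point_b, point_b - point_a)
```

Because the same constant is applied to all points of a site, the height difference between two nearby points with the same quasigeoid height equals their levelled height difference, which is the local consistency described above.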

Based on the GNSS and levelling results at the selected four sites, the gravity potential values can finally be derived for all relevant stations. The gravity potential values obtained are documented in Table 2 for all GNSS and levelling points given in Table 1 plus two additional points on the PTB campus as an example for stations that have only levelled heights (KB01, KB02; points at the Kopfermann building, hosting the caesium fountains). The gravity potential values are given in Table 2 in the form of geopotential numbers according to Eq. (13) with

$$\begin{aligned} C=W_0 (\hbox {IERS2010})-W_P , \end{aligned}$$

where the conventional value \(W_{0}\) (IERS2010) according to Eq. (10) is used, following the IERS2010 conventions and the IAU resolutions for the definition of TT. The geopotential numbers C are more convenient than the absolute potential values \(W_{P}\) due to their smaller numerical values and direct usability for the derivation of the relativistic redshift corrections according to Eq. (11). Table 2 gives the geopotential numbers C according to Eq. (29) in the geopotential unit (gpu; \(1\,\hbox {gpu} = 10\, \hbox {m}^{2}\,\hbox {s}^{-2})\), resulting in numerical values of C that are about 2% smaller than the numerical height values, for both the geometric levelling approach (\(C^{(\mathrm{lev})})\) and the GNSS/geoid approach (\(C^{(\mathrm{GNSS/geoid})})\) based on Eqs. (17) or (24) and (25), respectively. Regarding the geometric levelling approach, the above-mentioned value \(W_0^{(\mathrm{EVRF2007})} = 62{,}636{,}857.86\,\hbox {m}^{2}\,\hbox {s}^{-2}\) based on the European EUVN_DA GNSS/levelling data set from Kenyeres et al. (2010) is utilized in Eq. (17). The GNSS/geoid approach according to Eqs. (24) and (25) is based on the disturbing potential T or the corresponding height anomaly values \(\zeta \) from the EGG2015 model (see above and Eq. (27)), as well as the normal potential \(U_{0} = 62{,}636{,}860.850\,\hbox {m}^{2}\,\hbox {s}^{-2}\), associated with the surface of the underlying GRS80 (Geodetic Reference System 1980; see Moritz 2000) level ellipsoid, and furthermore, the mean normal gravity values \(\overline{{\gamma }}\) are also based on the GRS80 level ellipsoid; for further details, see Torge and Müller (2012). Besides the geopotential numbers derived by the two independent approaches, Table 2 also shows the differences between the two approaches as well as corresponding ITRF2008 coordinates and normal heights referring to EVRF2007.
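The conversion in Eq. (29) and the relation between geopotential numbers in gpu and metric heights can be sketched as follows. The potential value of the example point is hypothetical and constructed from an assumed height of 100 m and an average gravity of \(9.81\,\hbox {m}\,\hbox {s}^{-2}\); only the IERS2010 value of \(W_{0}\) is taken from this paper.

```python
W0_IERS2010 = 62_636_856.00   # m^2 s^-2, conventional zero potential (IERS2010)

def geopotential_number_gpu(W_P, W0=W0_IERS2010):
    """Eq. (29), expressed in geopotential units (1 gpu = 10 m^2 s^-2)."""
    return (W0 - W_P) / 10.0

# Hypothetical point about 100 m above the zero level surface.
height_m = 100.0
W_P = W0_IERS2010 - 9.81 * height_m
C = geopotential_number_gpu(W_P)
print(f"C = {C:.3f} gpu, ratio C/height = {C / height_m:.3f}")
```

The ratio of about 0.98 is the reason why the numerical values of C are roughly 2% smaller than the corresponding heights in metres.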

Table 2 Ellipsoidal coordinates (latitude, longitude, height; \(\varphi \), \(\lambda \), \(h^{(\mathrm{adj.})})\) referring to ITRF2008 reference frame (epoch 2005.0; GRS80 ellipsoid; zero-tide system), normal heights \(H^{N (\mathrm{EVRF2007})}\) based on EVRF2007, geopotential numbers based on the geometric levelling (\(C^{(\mathrm{lev})})\) and GNSS/geoid approach (\(C^{(\mathrm{GNSS}/\mathrm{geoid})})\) relative to the IERS2010 conventional reference potential \(W_{0}= 62{,}636{,}856.00\,\hbox {m}^{2}\,\hbox {s}^{-2}\) and differences \(\Delta C\) thereof, as well as the relativistic redshift correction based on the GNSS/geoid approach; for further details, see Sects. 4 and 5

5 Discussion and conclusions

The differences between the geopotential numbers from the two approaches (geometric levelling and GNSS/geoid) range from +0.014 to −0.109 gpu, which is equivalent to +0.014 to −0.111 m in terms of height, and again the metric (height) values are preferred in the following, as earlier. Regarding gravity potential differences between two stations, for the nearby sites PTB and LUH (52 km linear distance) the difference between both approaches is only 0.010 m, while the corresponding figures for the connection PTB–MPQ (457 km distance) and LUH–MPQ (480 km distance) are 0.032 m and 0.042 m, respectively. On the other hand, the (international) potential differences between OBSPARIS (France) and the German sites show larger differences between the geometric levelling and the GNSS/geoid approach of the order of 0.10 m (0.094 m for the difference to PTB, 0.084 m for LUH, and 0.125 m for MPQ) over distances between about 650 and 700 km.

The significance of the 1 dm discrepancies in the potential differences between the two geodetic approaches for the German stations and OBSPARIS has to be assessed in relation to the corresponding uncertainties of levelling, GNSS, and the quasigeoid model (EGG2015). The geometric levelling results are based on the EVRF2007 adjustment, where a posteriori standard deviations of about \(1\,\hbox{mm} \times \sqrt{d}\) (d is the distance in km) are reported for the German levelling data and \(2\,\hbox{mm} \times \sqrt{d}\) for the French data (Sacher et al. 2008). A rough uncertainty estimate for the levelling connections between the German stations and OBSPARIS, assuming that the levelling is half in Germany and half in France and considering that the levelling lines will be longer than the linear distances of about 700 km, gives a standard deviation of about 50 mm. However, a more thorough uncertainty estimate has to consider that the EVRF2007 heights are not the result of single-line connections but rest upon an adjustment of a whole levelling network, which leads to a corresponding standard deviation of about 20 mm (M. Sacher, Bundesamt für Kartographie und Geodäsie, BKG, Leipzig, Germany, personal communication, 10 May 2017), indicating a factor of 2.5 improvement due to the network; nevertheless, neither uncertainty estimate considers any systematic levelling error contributions. Furthermore, the GNSS ellipsoidal heights have uncertainties below 10 mm, as they are directly based on permanent reference stations or connected to such stations located nearby. The uncertainty of EGG2015 has been discussed in “Appendix 2” and above, yielding a standard deviation of 19 mm for the absolute values and about 27 mm for corresponding differences between the German and French sites over about 700 km distance, presuming that the correlations are insignificant over these distances. Finally, assuming that all three quantities involved in the comparison (levelling, GNSS, quasigeoid) are uncorrelated, the corresponding uncertainty components add in quadrature, giving standard deviations of 59 and 36 mm based on the single-line levelling and the corresponding network uncertainty estimates, respectively. Hence, the discrepancies of about 100 mm between the two geodetic approaches have to be considered statistically significant at a confidence level of 95% for the network-based levelling uncertainty estimates, but not for the simple single-line estimates.
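
The uncertainty budget described above can be retraced with a short sketch. Several inputs are assumptions made here for illustration and are not stated in the text: a levelling line length of 1000 km (the text only says the lines are longer than about 700 km), a GNSS standard deviation of exactly 10 mm entering once at each of the two stations (the text gives "below 10 mm"), and a two-sided coverage factor of 1.96 for the 95% level. All function names are hypothetical.

```python
import math

# Minimal sketch of the quadrature budget discussed in the text.
# ASSUMPTIONS (not stated in the source): a 1000 km levelling line, a GNSS
# standard deviation of 10 mm at each of the two stations, and k = 1.96 for
# the 95% confidence level.


def levelling_single_line_mm(line_km, share_germany=0.5,
                             sigma_de=1.0, sigma_fr=2.0):
    """Std. dev. (mm) of one levelling line, using sigma * sqrt(d) per country."""
    d_de = share_germany * line_km
    d_fr = (1.0 - share_germany) * line_km
    return math.sqrt(sigma_de ** 2 * d_de + sigma_fr ** 2 * d_fr)


def combined_mm(sigma_levelling, sigma_gnss_station=10.0,
                sigma_quasigeoid_diff=27.0):
    """Combine uncorrelated components in quadrature (all values in mm)."""
    # The GNSS term is counted twice, once per station (assumption).
    return math.sqrt(sigma_levelling ** 2
                     + 2.0 * sigma_gnss_station ** 2
                     + sigma_quasigeoid_diff ** 2)


def is_significant(discrepancy_mm, sigma_mm, k=1.96):
    """True if the discrepancy exceeds k standard deviations."""
    return abs(discrepancy_mm) > k * sigma_mm


if __name__ == "__main__":
    single_line = levelling_single_line_mm(1000.0)
    for label, sigma_lev in (("single line", single_line), ("network", 20.0)):
        total = combined_mm(sigma_lev)
        print(f"{label}: levelling {sigma_lev:.0f} mm, total {total:.0f} mm, "
              f"100 mm significant: {is_significant(100.0, total)}")
```

Under these assumptions the sketch gives totals that round to the 59 mm and 36 mm quoted in the text, and the 100 mm discrepancy exceeds 1.96 standard deviations only for the network-based estimate.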

Nevertheless, as the quasigeoid model EGG2015 is based on an up-to-date, consistent and quite homogeneous GOCE satellite model and also includes high-quality and high-resolution terrestrial gravity field data around the investigated clock sites, the corresponding uncertainty estimates are considered realistic, while the GNSS component plays only a minor role. It is therefore hypothesized that the largest uncertainty contribution comes from geometric levelling, which is very accurate over short distances, but susceptible to systematic errors at the decimetre level over larger distances of the order of 1000 km. In France, differences between the old and the new levelling exist, mainly in the north–south direction (about 0.25 m from the Mediterranean Sea to the North Sea over a distance of about 900 km, i.e. 28 mm/100 km), but to a lesser extent also in the east–west direction (about 0.04 m from Strasbourg to Brest, i.e. 5 mm/100 km); an evaluation with independent GNSS/quasigeoid data clearly shows a much better agreement with the new levelling results (Rebischung et al. 2008; Denker 2013). Since the old levelling (NGF-IGN69: Nivellement Général France–L’Institut National de l’Information Géographique et Forestière, IGN, 1969; see “Appendix 4”) is the basis for the results in Tables 1 and 2, using the new French levelling could significantly reduce the existing discrepancies of about 1 dm. In fact, a preliminary EVRF adjustment from 2017 with new levelling data for France and other countries (M. Sacher, BKG, personal communication, 10 May 2017) indicates a reduction in the current differences between the German stations and OBSPARIS by 68 mm.

Based on the preceding discussion, the geometric levelling approach, which gives in the first instance only height and potential differences, is recommended over shorter distances of up to several tens of kilometres, where it can give millimetre uncertainties. However, it is problematic over long distances, the data may not be up-to-date due to recent vertical crustal movements, and it is further complicated across national borders due to different zero levels. Consequently, over longer distances (of more than about 100 km) and across national borders, the GNSS/geoid approach is preferable. Moreover, this approach has the advantage that it gives absolute gravity potential values, presently with an uncertainty of about two centimetres in terms of heights (best-case scenario, see above). Hence, for contributions of optical clocks to international timescales, which require absolute potential values \(W_{P}\) relative to a conventional zero potential \(W_{0}\), the relativistic redshift corrections can be derived from the GNSS/geoid approach with a present uncertainty of about \(2 \times 10^{-18}\). This holds more or less everywhere in the world where sufficient terrestrial gravity field data sets exist, and there is still potential for further improvements (see Sect. 3.3 and “Appendix 2”). Based on this reasoning, Table 2 gives the relativistic redshift corrections according to Eq. (11) only for the GNSS/geoid approach, which can be considered the recommended values. On the other hand, for optical clock (frequency) comparisons over shorter distances, requiring only potential differences, the geometric levelling approach can also be employed, giving the differential redshift corrections between the clock sites with uncertainties down to a few parts in \(10^{19}\).
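
The correspondence between height uncertainties and fractional redshift uncertainties used above can be illustrated with a rough sketch. It uses the near-surface approximation that a height change \(\Delta h\) changes the gravity potential by about \(g\,\Delta h\), so that the fractional frequency shift is about \(g\,\Delta h/c^{2}\). The round gravity value of 9.81 m/s² and the function name are assumptions made here; this is an order-of-magnitude check, not an evaluation of Eq. (11).

```python
# Minimal sketch: fractional frequency (redshift) change for a height change.
# ASSUMPTIONS: near-surface approximation dW ~ g * dh with a round gravity
# value of 9.81 m/s^2; the function name is hypothetical.
SPEED_OF_LIGHT = 299_792_458.0  # m/s (exact by definition in the SI)
MEAN_GRAVITY = 9.81             # m/s^2, assumed round value


def height_to_fractional_shift(delta_h_m, gravity=MEAN_GRAVITY):
    """Return the fractional frequency shift for a height change in metres."""
    return gravity * delta_h_m / SPEED_OF_LIGHT ** 2


if __name__ == "__main__":
    # 2 cm (GNSS/geoid, best case), 1 cm, and 3 mm (short-range levelling).
    for delta_h in (0.02, 0.01, 0.003):
        shift = height_to_fractional_shift(delta_h)
        print(f"{delta_h * 1000:5.1f} mm -> {shift:.2e}")
```

With these inputs, 2 cm corresponds to roughly \(2 \times 10^{-18}\) and a few millimetres to a few parts in \(10^{19}\), in line with the figures given in the text.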

Finally, the results given in Tables 1 and 2 show that the geodetic heights and corresponding gravity potential values derived from the geometric levelling approach and the GNSS/geoid approach are presently inconsistent at the decimetre level across Europe. For this reason, the more or less direct observation of gravity potential differences through optical clock comparisons (with targeted fractional accuracies of \(10^{-18}\), corresponding to 1 cm in height) is eagerly awaited as a means of resolving the existing discrepancies between different geodetic techniques and remedying the geodetic height determination problem over large distances. A first attempt in this direction was the comparison of two strontium optical clocks between PTB and OBSPARIS via a fibre link, showing an uncertainty and agreement with the geodetic results of about \(5 \times 10^{-17}\) (Lisdat et al. 2016). This was mainly limited by the uncertainty and instability of the participating clocks, which are likely to improve in the near future. For clocks with performance at the \(10^{-17}\) level and below, time-variable effects in the gravity potential, especially solid Earth and ocean tides, have to be considered and can also serve as a means of evaluating the performance of the optical clocks (i.e. a detectability test). After further improvements in optical clock performance, conclusive geodetic results can be anticipated, and clock networks may also contribute to the establishment of the International Height Reference System (IHRS).
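
To put the clock comparison figures into geodetic terms, the following rough sketch converts a fractional frequency uncertainty into an equivalent height. It uses the near-surface approximation \(\Delta h \approx (\Delta f/f)\,c^{2}/g\) with an assumed round gravity value of 9.81 m/s²; the function name is hypothetical and the result is only an order-of-magnitude indication.

```python
# Minimal sketch: equivalent height for a fractional frequency uncertainty.
# ASSUMPTIONS: near-surface approximation with a round gravity value of
# 9.81 m/s^2; the function name is hypothetical.
SPEED_OF_LIGHT = 299_792_458.0  # m/s (exact by definition in the SI)
MEAN_GRAVITY = 9.81             # m/s^2, assumed round value


def fractional_shift_to_height(fractional_shift, gravity=MEAN_GRAVITY):
    """Return the height in metres equivalent to a fractional frequency shift."""
    return fractional_shift * SPEED_OF_LIGHT ** 2 / gravity


if __name__ == "__main__":
    # The fibre-link comparison level and the targeted clock accuracy.
    for shift in (5e-17, 1e-18):
        print(f"{shift:.0e} -> {fractional_shift_to_height(shift):.3f} m")
```

Under this approximation, \(5 \times 10^{-17}\) corresponds to a few decimetres in height, which is larger than the decimetre-level discrepancy between the geodetic approaches, whereas \(10^{-18}\) corresponds to about 1 cm.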