1 Introduction

One of the more challenging environments in which to provide accurate geodetic models and control is the top of a mountain peak. The obvious reason is the physical environment: the ruggedness and poor accessibility typically associated with high mountains. Just as importantly, many geodetic techniques, along with the theories, assumptions, and simple models behind them, degrade significantly on steep slopes and/or at high elevations.

The majority of high mountain peaks in the USA were included in the trans-continental triangulation network with elevations determined using vertical angle observations performed by the United States Geological Survey (USGS) and the U.S. Coast and Geodetic Survey, predecessor to the National Geodetic Survey (NGS). The highest peak in the conterminous US (CONUS), Mt. Whitney in California, was also included in first-order leveling lines in 1925, 1928, and 1940. However, none of the high Colorado peaks, including all 58 of its “Fourteeners” (peaks with an elevation greater than 14,000 feet = 4267.2 m), have had such treatment. This is true of even the more accessible summits like Pikes Peak and Mt. Blue Sky (formerly Mt. Evans). Rather, most Colorado peak elevations were “simply” determined through triangulation or photogrammetry, and finding a comprehensive source for their currently accepted values, let alone their uncertainties, is surprisingly difficult (see United States Geological Survey 2001; Roach 2011).

This paper focuses on Colorado’s Fourteeners, highlighting the geoid variability and the additional inconsistencies in geodetic theory and modeling that must be addressed to accurately determine the peaks’ elevations using ellipsoidal heights and a geoid model. Due to a lack of geodetic-quality GNSS observations on these peaks, airborne laser swath mapping (ALSM), or airborne LiDAR, data from the USGS 3-D Elevation Program (3DEP) (Sugarbaker et al. 2014) are used to provide the geometric component (i.e., ellipsoidal height) of each peak. LiDAR data are commonly used to determine the elevation of prominent locations for a variety of applications in navigation, hydrology, geomorphology, forestry, etc. The quality, assessment, and usage of 3DEP LiDAR (and LiDAR data in general) are well documented in the literature (Raber et al. 2007; Stoker et al. 2014; Arundel and Sinha 2020; Stoker and Miller 2022). However, all of these investigations are from a LiDAR perspective, relying on the geodetic framework as a given, whereas the focus of this paper is specifically on the geodetic components and their careful treatment.

That said, the accuracy of the LiDAR component cannot be neglected. Since the late 1990s, the reported accuracy of LiDAR data has varied widely, depending on a number of physical and instrument factors. Stoker and Miller (2022) report 13 cm root mean square error (RMSE) and 53 cm RMSE for the USGS Web Coverage Service when compared to project-specific LiDAR survey checkpoints and NGS OPUS (Online Positioning User Service) Share survey marks, respectively. Unless stated otherwise, all accuracy and uncertainty values in this paper are given either as RMSE or 1 standard deviation (“1σ”), which both represent approximately 68% confidence for normally distributed values. Additionally, Arundel and Sinha (2020) implemented a stepping scheme that traverses a digital elevation model to arrive at the horizontal location of the most prominent peak in a region and extract its elevation; the scheme was tested over a subset of peaks throughout the USA. It is computationally efficient to implement on a nationwide scale but does not perform perfectly, as false summits (local maxima) were identified in 2.5% of cases. Since our investigation is limited to the Colorado Fourteeners and requires the utmost accuracy, this paper uses a more accurate method at the cost of greater computational demand.

The overall quality of the 3DEP LiDAR data is extremely high. All of the LiDAR datasets used in this investigation meet USGS Quality Level (QL) 2 specifications, which include > 2 points per square meter and a vertical RMSE of 10 cm or less for non-vegetated areas (United States Geological Survey 2022). The vertical RMSE value is larger than currently accepted geodetic uncertainties but is comparable with historical surveying techniques like zenith distances computed from reciprocal observations, which for 10 km observation lengths, can be determined at the ± 0.10 m level (Heiskanen and Moritz 1967, p 175). But, as will be demonstrated, these vertical uncertainty estimates seem to be fairly pessimistic for the situation encountered here, and we estimate the LiDAR vertical uncertainty in the ~ 0.05–0.06 m range (see Sect. 3.1).

Additionally, mountainous areas are notoriously challenging for gravimetric geoid modeling due to the sparsity of in situ gravity observations and the impact of rugged terrain. One benefit of focusing on Colorado mountains is that they were the subject of the IAG Working Group 2.2 “Colorado Experiment” where Wang et al. (2021) reported geoid model accuracies between ± 2.1 and ± 5.6 cm over the greater Colorado region based on models computed by fourteen individual institutions. However, the USGS 3DEP datasets are typically provided with respect to the North American Vertical Datum of 1988, NAVD 88 (Zilkoski et al. 1992), which is realized via a number of different hybrid geoid models (e.g., GEOID03, GEOID12B, and the latest model GEOID18). Hybrid geoid models are used effectively by many countries to provide access via GNSS to a leveling-based vertical datum (Smith and Roman 2001; Brown et al. 2018; Hwang et al. 2020; etc.). A hybrid geoid model warps a purely gravimetric geoid model to fit a regional or nationwide GNSS/leveling dataset and in doing so provides access to the vertical datum. A hybrid geoid model likely has a much larger, unrealized uncertainty in mountain regions devoid of the GNSS/leveling passive marks used as constraints. For example, Ahlgren et al. (2020) report a standard deviation of ± 0.022 m between GEOID18 and GNSS/leveling marks in Colorado, the largest value for any state.

In addition to the inherent error in both LiDAR data and geoid models, there is also a systematic effect caused by inconsistent (and inadvertent) combinations of vertical datum definitions. In the scenario encountered here, NAVD 88 (defined via leveling data) uses the Helmert approximated orthometric height whereas the gravimetric geoid models use a mixture of different definitions depending on the individual model. This is inconsequential in flat, low-lying terrain; however, in mountainous areas, the differences arising solely from definitional inconsistencies can easily reach the 1-m level and are even more pronounced on mountain summits.

The above considerations are especially timely, given that NGS is set to replace the current U.S. vertical datum, NAVD 88, with a geopotential-based datum, the North American-Pacific Geopotential Datum of 2022 (NAPGD2022), in the next few years (National Geodetic Survey 2021b). This is a collaborative effort between NGS, the Canadian Geodetic Survey (CGS), and Mexico’s National Institute of Statistics and Geography (INEGI). Changing a country’s official datum happens infrequently but is not without precedent. Many countries have transitioned or are in the process of transitioning from classical datums realized by geodetic leveling to datums realized with GNSS-geoid models, including Australia (Featherstone et al. 2018), Canada (Véronneau and Huang 2016), Japan (Matsuo and Kuroishi 2020), and Taiwan (Hwang et al. 2020). Previously, the U.S. changed its vertical datum from the National Geodetic Vertical Datum of 1929 (NGVD 29) to NAVD 88, which increased the Fourteener summit elevations by 1–2 m (Smith and Bilich 2019). Estimates for the change from NAVD 88 to NAPGD2022 for the conterminous US are shown in Fig. 1 (where the expected change in elevation for all of Colorado is about − 0.550 m). The expected change has a prominent tilt of approximately − 1.3 m from southeast to northwest, which is primarily caused by the propagation of systematic errors in the leveling-based NAVD 88 datum (National Geodetic Survey 2021b).

Fig. 1 Estimated difference between NAPGD2022 and NAVD 88, from Fig. 4-1 of National Geodetic Survey (2021b). Used with permission

This paper is organized as follows. Section 2 provides the geographical distribution of the Fourteeners in Colorado and the historical context of any collocated geodetic stations, a description of the LiDAR data analysis performed to arrive at a geometric estimate of the mountain elevation, and a brief description of the geoid models that will be evaluated. Section 3 presents the results of this investigation in terms of the relative geoid model accuracy/variability over the GSVS17 validation profile (van Westrum et al. 2021) and estimates of the peak elevations based on a number of geoid models. Additionally, a significant systematic effect was uncovered in the process by which the historical heights were determined, warranting a full readjustment of the zenith angle observation network along with a description of the new results. Finally, Sect. 4 presents the conclusions of this study along with opportunities for future work.

2 Data and methods

This section is separated into a detailed discussion of the historical geodetic data, the LiDAR data analysis, and the geoid models used in this paper. Sixty peaks in Colorado are included in this investigation: 58 have elevations over 4267 m, and 2 are just below that threshold. The geographic distribution of the Fourteeners investigated in this paper is shown in Fig. 2. Point clouds for 48 of the 60 peaks are found at the USGS 3DEP data repository (United States Geological Survey 2019), and the remaining 12 peaks have point clouds available from the Colorado Water Conservation Board (Colorado Water Conservation Board 2023).

Fig. 2 Distribution of Fourteeners throughout Colorado. Orange triangles represent stations with LiDAR coverage and historical triangulation-derived orthometric heights. Blue triangles represent summits with LiDAR coverage only. Red closely-spaced symbols in the southern part of the state represent the GSVS17 stations

2.1 Historical data

NGS publishes the nation’s official positional information for geodetic control on any given active or passive mark’s “Datasheet.” While NGS does not publish the elevation of the ground surface or mountain summit, many summits have historical geodetic passive control marks that provide the mark’s elevation with respect to a particular datum. Note that these marks are all embedded in the bedrock and are flush with the ground surface. No corrections for a “pillar height” are necessary. As of 2023, 32 of the 60 peaks have historical elevations derived from triangulation that can be found in the NGS database with heights published to the nearest 0.1 m. These elevations were based on observations obtained between 1890 and 1960. They were originally provided in NGVD 29 and translated to NAVD 88 using a conversion surface utility called VERTCON (Smith and Bilich 2019). Note that while the triangulation station might not be exactly collocated with the summit, the elevations of these stations are often taken as a proxy for the summit. These station elevations—provided by NGS Datasheets, and taken as the official and currently accepted values—will provide the historical baseline against which we will compare predicted elevation changes when transitioning to a modernized, geopotential datum. A general overview of historical, current, and future datums and reference frames adopted in the U.S. and used throughout this paper is illustrated in Fig. 3, including conversions between frames.

Fig. 3 Relationship between various vertical datums, geometric frames, and their realizations used historically, currently, and in the future

In addition to the Fourteener summits, the GSVS17 profile of 222 stations (see Fig. 2) is used to provide an external, independent validation of LiDAR-based heights and geoid models. Specifically, the GSVS17 ellipsoidal heights from GNSS and rigorous orthometric heights from spirit leveling are used for these comparisons, consistent with van Westrum et al. (2021) and Wang et al. (2021), respectively. The overall accuracy of these observations is estimated at ± 0.0036 m for the GNSS ellipsoidal heights and ± 0.01 to ± 0.015 m for the spirit-leveled heights (see van Westrum et al. 2021 for further details).

2.2 Vertical datum definitions

The elevations and geoid models encountered and employed in this investigation are often based on slightly different definitions, approximations, and assumptions. This requires careful handling, and so a brief description of these definitional differences is presented in the following section.

The orthometric height H is defined as (Heiskanen and Moritz 1967 (4–21)):

$$H=\frac{C}{\overline{g} }$$
(1)

where \(C\) is the geopotential number and \(\overline{g }\) is the mean value of gravity along the plumb line between the geoid and the point of consideration.

The mean value of gravity is defined as:

$$\overline{g }=\frac{1}{H}{\int }_{0}^{H}g\left(z\right) {\text{d}}z$$
(2)

Since \(\overline{g }\) is defined using an integral, a summation or other approximation is used in practice to evaluate (2). The NAVD 88 datum is defined with Helmert orthometric heights and uses the simplified Poincaré-Prey reduction (Heiskanen and Moritz 1967) with a density of 2670 kg/m³ to determine \(\overline{g }\) according to:

$$\overline{g }=g+0.0424H (\text{g in gal},H\,\text{in km})$$
(3)

where \(g\) is the value of the Earth’s gravity at its surface (with 1 gal \(\equiv \) 0.01 m/s²). As (3) assumes the topography is approximated by a Bouguer plate and omits much of the topographic variation, many authors have investigated how to more closely evaluate (2) in a practical sense, resulting in rigorous orthometric heights (Kingdon et al. 2005; Santos et al. 2006; Flury and Rummel 2009; Odera and Fukuda 2015). In addition to this definitional difference, it should be highlighted that even a fairly modest error in \(g\) of 20 mGal leads to an error of 0.084 m in the Helmert orthometric height for the mountains over 4267 m that are investigated in this paper.
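The sensitivity of the Helmert height to an error in surface gravity can be sketched numerically. The following is a minimal illustration of (1) and (3) in SI units, not the NGS implementation; the function names, the surface gravity of 9.79 m/s², and the 4300 m height are hypothetical values chosen only for illustration.

```python
# Sketch of Eqs. (1) and (3) in SI units. All numerical inputs are
# hypothetical illustration values, not data from this study.

# 0.0424 gal/km from Eq. (3), expressed in (m/s^2) per metre of height.
PREY_GRADIENT = 0.0424e-5


def mean_gravity(g_surface, height_m):
    """Eq. (3): approximate mean gravity along the plumb line (m/s^2)."""
    return g_surface + PREY_GRADIENT * height_m


def helmert_height(geopotential_number, g_surface, tol=1e-9):
    """Eq. (1) with Eq. (3): solve H = C / (g + k*H) by fixed-point iteration.

    geopotential_number in m^2/s^2, g_surface in m/s^2, result in metres.
    """
    height = geopotential_number / g_surface  # first guess ignores the k*H term
    for _ in range(50):
        updated = geopotential_number / mean_gravity(g_surface, height)
        if abs(updated - height) < tol:
            return updated
        height = updated
    return height


# Hypothetical summit: surface gravity 9.79 m/s^2 and a height of 4300 m.
g_obs = 9.79
c_number = 4300.0 * mean_gravity(g_obs, 4300.0)

h_ref = helmert_height(c_number, g_obs)
h_biased = helmert_height(c_number, g_obs + 20e-5)  # 20 mGal error in g
height_error = h_biased - h_ref  # negative: larger g gives a smaller H
```

For these assumed inputs the height changes by slightly less than 0.09 m, the same order as the figure quoted above; the exact value depends on the gravity and height adopted.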

However, there is another method to determine \(H\) given ellipsoidal heights and a geoid model:

$$H=h-N,$$
(4)

where \(h\) is the ellipsoidal height measured with GNSS (during the LiDAR survey) and \(N\) is the geoid undulation.

Heights determined according to (4) in NAVD 88 from LiDAR and a hybrid geoid model (e.g., GEOID18) are rarely equivalent to NAVD 88 heights determined via geodetic leveling (using (1) and (3)). The former is a mixture of Helmert orthometric heights and “true” orthometric heights, depending on the distance to a GNSS/leveling constraint, whereas the latter is NAVD 88 Helmert orthometric heights. These definitional details are omitted in much of the literature; to be as transparent as possible, the specific height definition under discussion is stated explicitly throughout this paper. The value of \(H\) computed according to (4) is also affected by a number of factors, including the accuracy of the gravity data, the gridding process, how the topographic effect on the geoid is applied, the assumption of constant topographic density, etc.

2.3 Ellipsoidal heights from a LiDAR dataset

LiDAR data from the USGS 3D Elevation Program (3DEP) are used to extract the mountain peak orthometric elevations (NAVD 88) based on a digital elevation model constructed for each summit. USGS provides a number of elevation products such as the seamless 1/3 arcsecond DEM and the dynamic elevation service, delivered via a web coverage service (WCS) (Stoker and Miller 2022). However, instead of using these derived products, the raw LAS point-cloud files are used so that it is possible to investigate (sub)cm-level datum changes and other effects. These LAS files are provided in a mixture of horizontal coordinates (UTM, State Plane Coordinates, and other projected coordinate reference systems) and NAVD 88 orthometric heights obtained using different hybrid geoid models (GEOID12B and GEOID18). Using metadata and project reports from USGS and LiDAR contractors, the coordinates are re-projected back into geometric components relative to the North American Datum of 1983 (NAD 83(2011) epoch 2010.00). Horizontally, this can be achieved with little concern for sub-meter errors. However, extreme care must be taken to ensure that the same hybrid geoid is ‘removed’ that was applied in the LiDAR processing as shown in (5) to arrive at the native NAD 83(2011) ellipsoidal height:

$${h}_{{\text{NAD}}83(2011)}={H}_{{\text{NAVD}}88\,(\text{from LiDAR})}+{N}_{\text{hybrid}\,(\text{GEOID12B or GEOID18})}.$$
(5)

Because the exact horizontal location of the summit is unknown, the following scheme is employed to estimate it: the LiDAR point cloud is interpolated into a coarse DEM around the approximate peak location, a high-resolution grid is then computed around the highest point of that DEM, and finally, the coordinates with the maximum elevation are assumed to represent the summit. This assumption is not without some concern: unattached boulders/rocks, ice, vegetation, buildings, etc. are all features that can sit above the desired highest point of the solid earth. For this investigation in Colorado, only unattached boulders/rocks are of any concern, and they would lead to errors at the + 1 m level at most. These mountains are generally not covered by significant ice and have neither vegetation nor man-made features.
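The coarse first pass of this scheme can be sketched as follows. This is a simplified stand-in, not the procedure used in the study: instead of interpolating the point cloud, it bins returns into square cells and keeps the highest return per cell. The function names, the 2 m cell size, and the sample returns are all hypothetical.

```python
# Sketch of a coarse first pass: bin returns into square cells, keep the
# highest return per cell, and take the highest cell as the approximate summit.
# Max-binning is an assumed simplification of the coarse DEM interpolation.

def coarse_dem(points, cell_size):
    """Map (col, row) cell indices to the highest elevation found in that cell."""
    cells = {}
    for x, y, z in points:
        key = (int(x // cell_size), int(y // cell_size))
        if key not in cells or z > cells[key]:
            cells[key] = z
    return cells


def approximate_summit(points, cell_size):
    """Return the centre (x, y) and elevation of the highest coarse cell."""
    cells = coarse_dem(points, cell_size)
    (col, row), z_max = max(cells.items(), key=lambda item: item[1])
    centre = ((col + 0.5) * cell_size, (row + 0.5) * cell_size)
    return centre, z_max


# Hypothetical ground returns as (easting, northing, elevation) in metres.
returns = [
    (1.2, 0.8, 4360.1), (3.9, 1.1, 4361.7), (6.4, 2.2, 4363.0),
    (6.9, 5.8, 4364.6), (7.3, 6.1, 4364.9), (9.8, 9.5, 4362.2),
]
centre, elevation = approximate_summit(returns, cell_size=2.0)
```

The returned cell centre would then serve only as the origin for the finer grid; the coarse elevation itself is not taken as the summit height.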

In addition to uncertainty in the summit location, there is also the problem of (vertical) noise in the LiDAR returns/point cloud. The point cloud is restricted to ground-classified last returns as specified in the LAS files (American Society for Photogrammetry and Remote Sensing 2019), which removes nearly all of the vegetation and noise. Figure 4 shows the LiDAR-derived DEM for Uncompahgre Peak (~ 4365 m) as an example; it illustrates that some high-frequency noise (a north–south band just west of the center) persists in the derived coarse DEM. To alleviate this as much as possible, a median spatial filter at 1 m is applied to the coarse DEM. The horizontal location of the highest elevation is then extracted, and a 12 m radius, 0.25 m resolution elevation grid is calculated around this point using least squares collocation (Moritz 1980) to ultimately determine the summit elevation (Fig. 5). Numerous alternative interpolation schemes could be employed here (bilinear, inverse distance weighted, splines, etc.); least squares collocation is used because it is widely applied in physical geodesy and allows for a stochastic interpolation of the elevation point data. Further investigation of alternative interpolation schemes is beyond the scope of this paper. The first, low-resolution step of this procedure is necessary for computational efficiency, given the enormous number of point returns in the LiDAR data. It is also important to note that if the actual horizontal location is known a priori, as it is for the GSVS17 profile investigated in Sect. 3.1, only the least squares collocation elevation estimate at that exact location needs to be computed.
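A single-point least squares collocation prediction can be sketched as below. The Gaussian covariance model, the signal variance, the correlation length, the noise level, and the sample returns are all assumptions made for illustration; the covariance settings actually used for the summit grids are not given in this section.

```python
import math

# Sketch of least squares collocation at one prediction point.
# Covariance model and all parameter values are assumed, not from the study.

def covariance(distance, variance, corr_length):
    """Gaussian signal covariance as a function of horizontal distance."""
    return variance * math.exp(-(distance / corr_length) ** 2)


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def collocate(points, target, variance=1.0, corr_length=3.0, noise_sd=0.05):
    """Predict the elevation and its formal 1-sigma uncertainty at target (x, y)."""
    xy = [(x, y) for x, y, _ in points]
    mean_z = sum(z for _, _, z in points) / len(points)
    # Signal covariance between observations, plus noise variance on the diagonal.
    c_obs = [[covariance(math.dist(a, b), variance, corr_length) for b in xy] for a in xy]
    for i in range(len(xy)):
        c_obs[i][i] += noise_sd ** 2
    # Signal covariance between each observation and the prediction point.
    c_cross = [covariance(math.dist(a, target), variance, corr_length) for a in xy]
    weights = solve(c_obs, c_cross)  # c_obs is symmetric, so this is C_obs^-1 c_cross
    residuals = [z - mean_z for _, _, z in points]
    estimate = mean_z + sum(w * r for w, r in zip(weights, residuals))
    error_var = variance - sum(w * c for w, c in zip(weights, c_cross))
    return estimate, math.sqrt(max(error_var, 0.0))


# Hypothetical returns (x, y, z in metres) around a summit.
sample = [(0.0, 0.0, 4364.0), (1.0, 0.0, 4364.5), (0.0, 1.0, 4364.3),
          (1.0, 1.0, 4364.9), (0.5, 0.5, 4364.6)]
estimate, sigma = collocate(sample, (0.5, 0.5))
```

The returned uncertainty is a formal value that depends entirely on the assumed covariance parameters, so it reflects internal consistency rather than absolute accuracy.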

Fig. 4 LiDAR point cloud in NAVD 88 for Uncompahgre Peak (units: m). A 200 m search radius is shown with 10 m contour interval. White square is location of highest elevation extracted from the filtered terrain model along with a 12 m radius for high resolution summit estimation (black circle), which is shown in detail in Fig. 5

Fig. 5 High-resolution gridded terrain model at Uncompahgre Peak (units: m). Grid size is 0.25 m over a 12 m radius. White square is location of estimated summit. Individual LiDAR returns are shown in gray illustrating the typical density provided. The upper 2-m level of the color ramp is separated at 0.25 m increments to accentuate this desired portion of the model

After extracting the highest elevation from the high-resolution DEM, the NAVD 88 orthometric height is converted to an ellipsoidal height in NAD 83(2011) using (5) and then transformed to ITRF2014 epoch 2010.00 using the appropriate Helmert transformation (National Geodetic Survey 2021a). At the extracted location, a gravimetric geoid model can then be used according to (6) to arrive at an estimate of the orthometric height in NAPGD2022* (where an asterisk is used throughout this paper to signify this is just a “stand in” model for the future gravimetric datum, NAPGD2022, which has yet to be formally defined):

$${H}_{{\text{NAPGD}}2022*}={h}_{{\text{ITRF}}}-{N}_{{\text{gravimetric}}}$$
(6)

NAPGD2022 will be based on ITRF2020 at epoch 2020.00, but the difference in ellipsoidal heights between ITRF2020 and ITRF2014 is less than 0.001 m at epoch 2020.0 everywhere in the conterminous U.S. (cf. Table 5 in Dennis et al. 2022).
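The chain from (5) to (6) can be sketched as a short calculation. Every number below is hypothetical, and `frame_shift` is an assumed placeholder for the height change produced by the NAD 83(2011) to ITRF2014 Helmert transformation, which is not implemented here.

```python
# Sketch of Eqs. (5) and (6). All values are hypothetical illustration numbers.

def ellipsoidal_from_lidar(h_navd88, n_hybrid):
    """Eq. (5): restore the ellipsoidal height by adding back the hybrid undulation."""
    return h_navd88 + n_hybrid


def orthometric_from_gravimetric(h_itrf, n_gravimetric):
    """Eq. (6): orthometric height from an ellipsoidal height and a gravimetric geoid."""
    return h_itrf - n_gravimetric


lidar_height = 4365.00   # NAVD 88 height stored in the LAS file (m)
n_geoid12b = -14.20      # hybrid undulation applied in the LiDAR processing (m)
n_geoid18 = -14.17       # a different hybrid model at the same location (m)
frame_shift = -0.85      # assumed height change, NAD 83(2011) -> ITRF2014 (m)
n_gravimetric = -14.50   # gravimetric undulation at the same location (m)

# Correct chain: remove the same hybrid model that was originally applied.
h_nad83 = ellipsoidal_from_lidar(lidar_height, n_geoid12b)
h_itrf = h_nad83 + frame_shift
h_new = orthometric_from_gravimetric(h_itrf, n_gravimetric)

# Removing the wrong hybrid model carries the model difference into the result.
h_nad83_wrong = ellipsoidal_from_lidar(lidar_height, n_geoid18)
mismatch = h_nad83_wrong - h_nad83
```

With these assumed inputs, removing GEOID18 where GEOID12B had been applied would bias the ellipsoidal height by 0.03 m, which illustrates why the metadata for each LiDAR project has to be checked.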

2.4 Geoid models

A major objective of this study is to investigate the degree of variability that exists in geoid models when they are evaluated at the extreme: high mountain peaks. There are a number of aspects of a geoid model that can be tailored for different applications, such as different input datasets, different parameters (e.g., \(W_0\)), different methodologies, different resolutions, etc. These aspects can range from “choices” to “rules,” and to make the search domain more manageable, the geoid models selected for comparison have some common elements, including the \(W_0\) value (= 62 636 856.0 m²/s²) and a tide-free tidal system that is compatible with GNSS positions (Mäkinen and Ihde 2009). However, it is critically important for geoid users to recognize that, without any change to the geoid model itself, different \(W_0\) values can lead to geoid biases of 0.40 m or more (everywhere), and different tidal systems can lead to a very minor 0.0065 m north–south tilt (just across Colorado).

Of the dozens of models that are available from the International Service for the Geoid [ISG (Reguzzoni et al. 2021)] and the International Centre for Global Earth Models [ICGEM (Ince et al. 2019)], eight models are chosen and investigated based on this study’s specific objectives: three global geopotential models (GGM), a terrain-enhanced GGM, and four high-resolution, regional geoid models. A description of these models is provided in the following section; however, it must be acknowledged that one cannot assume that any of these models are independent from one another. To some degree, all the models rely on the same underlying gravity data in this region.

The three models determined directly from a GGM are EGM2008 (Pavlis et al. 2012), EIGEN-6C4 (Förste et al. 2014), and SGG-UGM-2 (Liang et al. 2020) and are available from ICGEM. EGM2008 and EIGEN-6C4 have been studied extensively over a wide variety of regions, and SGG-UGM-2 is a newer model that is provided at the same maximum degree/order equal to 2190. While the details of these models are a bit tangential to the NAPGD2022-definition objective of this study, they are preliminarily included to demonstrate their overall performance relative to other high-resolution models and to highlight the inconsistency in implementing these models for geoid-based applications. This inconsistency requires additional emphasis and is extremely relevant at high elevations like those encountered in this study as will be apparent in Sects. 3.3 and 3.4. This comes from the incompleteness and lack of resolution provided by the correction term to convert the quasigeoid to the geoid (Rapp 1997) which is typically approximated by (Heiskanen and Moritz 1967):

$$N-\zeta \approx \frac{{\Delta g}_{B}}{\overline{\gamma }}H$$
(7)

where N is the geoid undulation, \(\zeta \) is the height anomaly, \({\Delta g}_{B}\) is the Bouguer anomaly, \(\overline{\gamma }\) is the mean normal gravity, and H is the orthometric height.
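The dependence of this term on elevation can be illustrated with a short calculation of (7). The Bouguer anomaly of − 200 mGal, the mean normal gravity of 9.80 m/s², and the two heights are hypothetical round numbers, not values from this study.

```python
# Sketch of Eq. (7). All inputs are hypothetical round numbers.

MGAL = 1.0e-5  # 1 mGal expressed in m/s^2


def geoid_quasigeoid_separation(bouguer_mgal, mean_normal_gravity, height_m):
    """Eq. (7): approximate N - zeta in metres."""
    return bouguer_mgal * MGAL / mean_normal_gravity * height_m


# Same assumed Bouguer anomaly at a low-lying site and at a high summit.
low = geoid_quasigeoid_separation(-200.0, 9.80, 100.0)
high = geoid_quasigeoid_separation(-200.0, 9.80, 4300.0)
```

Under these assumptions the separation is about − 0.02 m at 100 m elevation but about − 0.88 m at 4300 m, so any approximation in the term is magnified on high summits.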

This correction term, ζ-to-N, is not specifically determined for the majority of GGMs, which leads to an ambiguity in implementing such a model in practice. In addition to how this term is specifically calculated, a major shortcoming is the implied use of a nominal, constant mass density (typically 2670 kg/m³). EGM2008 is one model that does include both a terrain model to compute the correction term and the correction term itself (see Pavlis et al. 2007, 2012 for details). ICGEM implements this correction term using a terrain model [ETOPO1 (Amante and Eakins 2009)] within its computation service for all models. Consequently, the three GGMs described above are evaluated using both the EGM2008 correction term and the ICGEM correction term.

Three previously published, high-resolution geoid models are also included: xGEOID19B (Li et al. 2019), xGEOID20B (Wang et al. 2021), and CGG2013a (Véronneau and Huang 2016). Since EGM2008 has a lower spatial resolution, an additional model is constructed by adding a high-frequency term generated from the topography from ERTM (Hirt et al. 2014), resulting in ‘EGM2008 + ERTM’. Finally, a new prototype geoid model called ‘pxGEOID’ was constructed specifically for this investigation and is described below. Specifications for the evaluated geoid models are shown in Table 1.

Table 1 Specifications of the geoid models used in this study; the reference models are described in the references for each geoid model

The pxGEOID model is constructed in a fashion similar to that of previous xGEOID models: Molodensky method on Earth’s surface, xREF20B reference model to \(n_{\max}\) = 2190, the same 3″ digital elevation model, and a 1′ geoid model. Features that differ from those models include the use of additional, newly acquired terrestrial gravity data, a different interpolation scheme, and a geoid-quasigeoid separation term slightly modified from that of Wang et al. (2023). The terrestrial gravity data distribution is illustrated in Fig. 6 with new data highlighted. The interpolation scheme used here relies on a 3D logarithmic covariance function (Forsberg 1987; Ahlgren and Krcmaric 2020), where the refined Bouguer anomalies (\({\Delta g}_{B}\)) at point-level are interpolated onto the Earth’s surface at 1′. The Bouguer plate and terrain correction terms are then restored on this 1′ grid, resulting in a free-air anomaly grid, which is then differenced with the reference free-air anomaly grid. The residual quasigeoid is computed using Stokes’ equation with a Wong-Gore cutoff at n = 800. The complete geoid-quasigeoid separation term uses the same method as Wang et al. (2023), as determined by (8), with the change coming from the newly determined Bouguer anomaly grid.

$$N-\zeta =\frac{{\Delta g}_{B}}{\overline{\gamma }}H+\frac{{V}_{t}\left(Q\right)-{V}_{t}\left(P\right)}{\overline{\gamma }}-\frac{{H}^{2}}{2\overline{\gamma }}\frac{\partial {\Delta g}_{B}}{\partial H},$$
(8)

where \({V}_{t}\left(P\right)\) and \({V}_{t}\left(Q\right)\) are the gravitational potentials of the topographic masses at P on the earth’s surface and Q on the geoid, respectively.

Fig. 6 Terrestrial gravity distribution in Colorado used in the geoid modeling. New stations are symbolized in pink with all other stations shown in light blue. Orange triangles represent Fourteener summits with LiDAR coverage and historical triangulation-derived orthometric heights. Blue triangles represent summits with LiDAR coverage only. Red closely-spaced symbols in the southern part of the state represent the GSVS17 stations

The gradient component in the third term of (8) is found by downward continuing the refined Bouguer anomaly grid on the Earth’s surface to the geoid using the Heiskanen and Moritz (1967) integral Eq. (885). The numerator of this term can then be determined based on the original, surface anomaly and the downward continued anomaly.

3 Results and discussion

This section is divided into four components: the accuracy of LiDAR-based heights (ellipsoidal and orthometric), the relative accuracy of geoid models in the Colorado mountains, “preliminary” estimated elevations and elevation changes to the mountain summits, and finally, a necessary readjustment of historic data to arrive at final predictions of the summit elevation changes.

3.1 Accuracy of LiDAR-based heights

The LiDAR datasets used in this analysis are extremely consistent in terms of both space and time. Nearly all of the LiDAR data used here were collected between 2018 and 2020, with the exception of two summits that were surveyed in 2016. Additionally, the LiDAR data were generally collected during the summer: July 1st to October 1st for the Fourteener summits and May 23rd to November 2nd for GSVS17. In terms of spatial consistency, the top of Fig. 7 shows the estimated uncertainty based on the LiDAR gridding process applied to the GSVS17 profile, and the top of Fig. 8 shows the same for the Fourteener summits. The bottom of each of these figures shows the LiDAR point density within a 3 m radius of each desired location. Statistics for these figures are shown in Table 2.

Fig. 7 LiDAR data properties over GSVS17. (Top) estimated uncertainty (1-sigma) based on the least squares collocation gridding process (units: m) (bottom) LiDAR density in the immediate 3 m vicinity (units: points/m²)

Fig. 8 LiDAR data quality evaluated at the Fourteener summit locations. (Top) estimated uncertainty (1-sigma) based on the least squares collocation gridding process (units: m) (bottom) LiDAR density within the immediate 3 m vicinity (units: points/m²)

Table 2 Statistics for LiDAR data over GSVS17 and Fourteener summit locations

The uncertainties and LiDAR densities provide even further confidence in the overall quality of the LiDAR datasets. However, it is important to reiterate that the uncertainty provided here is only a reflection of the internal consistency of the LiDAR data, as this estimate does not have an ‘absolute’ reference to compare against in this situation. There is a significant increase in the uncertainty estimates for the Fourteener summits (mean = ± 0.01 m) compared to the GSVS17 results (mean = ± 0.004 m). Overall, the magnitude is still fairly small, but this increase reflects the increased variability of the surface at the Fourteener summits compared with the ‘smoother’ surface around the GSVS17 station locations. Another significant difference is that the Fourteener summits have over twice the LiDAR point density compared with the GSVS17 stations. While this increase in density does not appear to result in an improved uncertainty, it does support the high-resolution gridding employed in Sect. 2.3. Both of these factors give some confidence that the two scenarios investigated here can be considered of the same quality and accuracy and results from the GSVS17 external validation (as described below) can be extrapolated to the Fourteener summits for the LiDAR components.

Prior to including any geoid model in this investigation, the consistency and accuracy of the LiDAR-derived heights (orthometric and ellipsoidal) need to be quantified. These are plotted in Fig. 9 for each of the GSVS17 survey marker locations. The NAVD 88 orthometric height is directly provided by the LiDAR data (in the LAS files) and shows an overall agreement with the GSVS17 profile at ± 0.074 m RMSE. A much better agreement is shown with the geometric, ellipsoidal height at ± 0.054 m RMSE, an improvement of nearly 28%. The primary reason to show this comparison is to illustrate the high-quality, high-accuracy data one can expect from LiDAR, especially when using the native, geometric vertical component. This will be informative for error estimation when used in combination with a geoid model in subsequent sections (see Sect. 3.4). Secondarily, and presumably of special interest for the LiDAR community, it should be highlighted that a significant degradation in accuracy is observed when using LiDAR-derived orthometric heights (e.g., NAVD 88). This stems from errors in the hybrid geoid(s) that are applied to the original LiDAR data. From a purely geodetic perspective, the differences between the ellipsoidal height residual and the orthometric height residual (blue and red traces in Fig. 9, respectively) highlight areas where the applied geoid model(s) exhibit noticeable errors and/or systematic definitional inconsistencies. As an example, in the section from 75 to 120 km, there is a sizeable 10+ cm deviation between the residuals. This is likely caused by errors in the hybrid geoid model applied in the original LiDAR processing. The exact cause of this error is beyond the scope of this paper, but the comparison represents a novel use of LiDAR data to evaluate geoid model error.

Fig. 9

LiDAR derived heights differenced from GSVS17 survey values. Profile in blue is based on the ellipsoidal heights by removing the hybrid geoid applied to the LiDAR data (min: − 0.165, max: 0.092, mean: − 0.016, std dev: ± 0.051, RMSE: ± 0.054). Profile in red is based on the NAVD 88 orthometric height provided in the processed LiDAR data (min: − 0.173, max: 0.183, mean: 0.034, std dev: ± 0.067, RMSE: ± 0.074). Topographic elevation is shown in black

3.2 Omission error of LiDAR-based heights at mountain summits

The results from the previous section demonstrate the overall consistency between the LiDAR data over GSVS17 and the Fourteener summit scenarios. However, there is an aspect of this LiDAR scheme that requires comment. The horizontal locations of the GSVS17 stations were determined at the 0.001 m level, whereas the horizontal locations of the summits are only known to a few meters, at best. This requires that a connection be established between the highest LiDAR return, the overall LiDAR data distribution, and the omission error from that relationship. Some level of omission inevitably occurs with any type of gridding process, which directly conflicts with the goal here of estimating the highest elevation in a small area: the estimated summit elevation is almost always slightly lower than the highest LiDAR return(s). Additionally, it is extremely challenging to assess the level of omission that is present in geodetic applications, especially for a dataset like the LiDAR employed here. However, an effort is made to approximate the maximum omission error at each of the mountain summits as follows. In the top portion of Fig. 10, each summit is plotted with all of the LiDAR returns that are higher than 1 m below the estimated summit elevation and within a 3 m radius. The estimated summit elevation is subtracted from all of the returns so that any positive returns give an estimate of the omission error in the gridding process. The bottom of Fig. 10 illustrates the maximum omission error at each summit (estimated summit elevation minus maximum LiDAR return value). This figure provides a more complete perspective on the complexity of the data distribution around these summits. Overall, the median value of this omission is − 0.038 m, and for 80% of the summits its magnitude is less than 0.10 m. For the summits that have a much larger omission error (− 0.20 m level and greater), it is apparent that only a very small number of returns are causing the error to manifest to this level, and in some cases, it is only a single return as illustrated at North Maroon Peak, Mt. Yale, and Mt. Lindsey. These returns are often vertically separated by 0.2–0.5 m from the rest of the distribution of returns in the summit vicinity.
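The selection and differencing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and arguments are hypothetical, coordinates are taken to be planar (projected) metres, and the 3 m radius and 1 m window are the values quoted in the text.

```python
import numpy as np

def max_omission_error(x, y, h, summit_xy, summit_height, radius=3.0, window=1.0):
    """Return (maximum omission error, relative heights of the selected returns).

    x, y, h       : LiDAR return coordinates and heights (metres, planar x/y assumed)
    summit_xy     : (x, y) of the estimated summit location
    summit_height : gridded (estimated) summit elevation
    The omission error is the estimated summit elevation minus the highest
    selected return, so a negative value means a return lies above the estimate.
    """
    x, y, h = (np.asarray(a, dtype=float) for a in (x, y, h))
    dist = np.hypot(x - summit_xy[0], y - summit_xy[1])
    keep = (dist <= radius) & (h > summit_height - window)
    if not keep.any():
        return None, np.array([])
    relative = h[keep] - summit_height      # positive values: possible omission
    return summit_height - h[keep].max(), relative

# Synthetic example: one return 0.04 m above the gridded summit height.
omission, relative = max_omission_error(
    x=[0.0, 1.0, 2.0, 10.0], y=[0.0, 0.0, 0.0, 0.0],
    h=[4299.90, 4300.04, 4298.50, 4301.00],
    summit_xy=(0.0, 0.0), summit_height=4300.00,
)
```

In this synthetic case the return at 10 m distance is excluded by the radius and the return 1.5 m below the summit by the height window, leaving two returns.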

Fig. 10

LiDAR data distribution at the Fourteener summits. The upper panel plots every LiDAR return (blue) that is higher than 1 m below the final, estimated peak elevation (shown in red). The peak elevations have been subtracted, so that any return higher than zero is a possible omission error. The lower panel quantifies the maximum possible omission error for each summit: a more negative value indicates how low the final, estimated peak elevation might be due to the LiDAR contribution

As stated above, the median omission error of the LiDAR-derived ellipsoidal height is − 0.038 m. Note that we did not incorporate it as an additional contribution to the Fourteener summit uncertainty estimate. The value of ± 0.054 m from Sect. 3.1 is the total estimated uncertainty for the LiDAR-based ellipsoidal heights, including this omission component. It is reassuring though that the omission error is at most the same magnitude (and likely even smaller). It must be pointed out that it is possible that the Fourteener summit elevations are skewed too low by a few centimeters.

3.3 Relative geoid model accuracy: the GSVS17 control dataset

The evaluation of the eight geoid models described in Sect. 2.4 is performed in the following section making use of the high-accuracy GSVS17 profile in Southern Colorado.

To evaluate the relative accuracy of any particular geoid model over the GSVS17 profile, a residual height, r, is determined using the GNSS/leveling-derived geoid undulations:

$$r\equiv h-H-N.$$
(9)

The residual profiles of the GGM-based models and the high-resolution geoid models are shown in Figs. 11 and 12, respectively. The GGM-based models are evaluated using both the EGM2008 \(\zeta \)-to-N correction coefficients, which are labeled ‘isw = 82’, and the ICGEM-derived corrections. A constant of − 60 cm is removed from all residual profiles so that differences at the cm level can be visualized without difficulty. The − 60 cm bias is a result of the nominal shift between NAVD 88 and NAPGD2022 across this profile. Statistics for these residuals are shown in Table 3.
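The residual of Eq. (9) and the removal of the constant bias can be sketched as follows. This is a minimal illustration: the function name and the sample values are hypothetical, and only the − 0.60 m constant is taken from the text.

```python
import numpy as np

def geoid_residuals(h, H, N, bias=-0.60):
    """Residual r = h - H - N of Eq. (9) with a constant bias removed (metres).

    h : GNSS ellipsoidal heights
    H : leveled (NAVD 88) orthometric heights
    N : geoid undulations from the model under test
    """
    h, H, N = (np.asarray(a, dtype=float) for a in (h, H, N))
    r = h - H - N
    return r - bias

# Synthetic marks (illustrative values only, not GSVS17 data).
res = geoid_residuals(h=[1999.42, 2499.45], H=[2015.00, 2514.90], N=[-14.98, -14.87])
print("mean:", res.mean(), "std:", res.std(ddof=1))
```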

Fig. 11

GNSS/leveling residual along GSVS17 profile for the GGM-based models (units: m). A constant (− 0.60 m) is removed from all residuals to account for the approximate bias in NAVD 88 at this location. The topographic elevation profile (right axis) is shown in black

Fig. 12

GNSS/leveling residual along GSVS17 profile for the high-resolution geoid models (units: m). A constant (− 0.60 m) is removed from all residuals to account for the approximate bias in NAVD 88 at this location. The topographic elevation profile (right axis) is shown in black

Table 3 GSVS17 residual statistics (units: m)

The best-performing GGM-based model is EIGEN-6C4 with a standard deviation of ± 0.029 m, which is only slightly worse than the high-resolution models. The other GGM-based models, including the terrain enhanced EGM2008 + ERTM, have significantly higher standard deviations between ± 0.035 and ± 0.041 m. The best-performing high-resolution model is CGG2013a, with a standard deviation of ± 0.018 m, followed by pxGEOID and xGEOID20B, each at approximately ± 0.022 m.

Additionally, and presumably expectedly, there is more variability in the geoid models over the two mountain passes (at 120 and 300 km). However, there are numerous locations where an individual model can exhibit 0.02–0.04 m variability over short distances (10–20 km). This is even more pronounced in the GGM-based models, where both passes exhibit a roughly 0.10 m increase in the residual for all such models (equivalent to a 0.10 m decrease in the geoid model).

The different \(\zeta \)-to-N correction terms applied to the GGM-based models provide a negligible improvement in terms of standard deviation. However, a noticeable difference is present in terms of the bias where the residual is 0.015 m higher overall with the ICGEM corrections compared with the EGM2008 corrections (alternatively, the geoid models are 0.015 m lower when applying the ICGEM corrections). The correction term differences are illustrated in Fig. 13. This might seem like a non-issue as models can be expected to have different amplitudes over small, non-global regions. However, it is not the models that are biased with one another, but rather the use of a correction term that leads to systematic changes in the operational model. This has serious consequences for modern, high-accuracy geoid model applications such as optical atomic clock control, geoid models as vertical datum definitions, etc. This correction term difference is even more pronounced at the higher mountain summits (see Sect. 3.4).

Fig. 13

Difference between \(\zeta \)-to-N correction terms (ICGEM correction term - EGM2008 isw = 82 term) evaluated along the GSVS17 profile (units: m)

Overall, all of the geoid models perform satisfactorily over this profile. However, the high-resolution, regional models are significantly more accurate than the GGM models with standard deviations of ± 0.023 and ± 0.035 m, respectively, an improvement of nearly 35%.

3.4 Preliminary Fourteener elevations

The estimated orthometric heights for the 60 summits are discussed in the following section. Because the results in Sect. 3.3 highlight significant differences at higher elevations, and because no GNSS/leveling-based geoid values are available for comparison at the summits, the geoid variability about a mean value is presented here. Additionally, for those summits that have historical triangulation data with published heights in NAVD 88, the expected change in orthometric height is determined and investigated (i.e., the orthometric height change from the old datum (NAVD 88) to the new datum (NAPGD2022)).

The variabilities in geoid undulations about a mean value for the 60 summits are shown in Figs. 14 and 15 for the GGM-based models and high-resolution models, respectively. All statistics are provided in Table 4. A mean geoid value for each summit is removed to highlight the internal consistency of the models, which is calculated from the GGM-based models with the ICGEM correction term and all of the high-resolution models. The reason for omitting the GGM models that use the EGM2008 correction in the mean geoid is that these models had a significantly different bias than the other models for the GSVS17 scenario, which is even more pronounced for the Fourteener summits evaluated here. Additionally, only one version of the GGM-based models should be included for statistical purposes. The difference in \(\zeta \)-to-N correction terms is also illustrated in Fig. 16, which highlights an even greater discrepancy at higher elevations than was found for the GSVS17 scenario.

Fig. 14

Geoid undulation variability evaluated at the Fourteener summits for GGM-based models (units: m)

Fig. 15

Geoid undulation variability evaluated at the Fourteener summits for high-resolution geoid models (units: m)

Table 4 Geoid undulation variability statistics at Fourteener summits (units: m)

Fig. 16

Difference between \(\zeta \)-to-N correction terms (ICGEM correction term - EGM2008 isw = 82 term) evaluated at the Fourteener summits (units: m)

It is nearly impossible to determine which model is closer to reality based on these data; however, in general, all of the models agree at roughly dm levels. Curiously, a significant ‘inversion’ in models compared to the GSVS17 situation is evident: the GGM-based models all have systematically lower geoid values compared to the high-resolution models while this group of models was generally the highest across the GSVS17 profile (lowest residuals from Fig. 11). Upon closer inspection of Fig. 11, there is a similar behavior at the high mountain passes (km 120 and 300), where the geoid model values decline or dip significantly. So, in both the GSVS17 and the Fourteener summit scenarios, the GGM-based models are systematically underestimating the geoid surface. To some extent, this is likely caused by omission error, and on a practical level, it results in systematic artifacts of up to or even greater than 0.10 m at these summits. However, there is evidence that error in the \(\zeta \)-to-N correction term is also causing some of this disagreement as the GGM models that use the ICGEM correction term are quite a bit closer to the high-resolution models. GGM models with the ICGEM \(\zeta \)-to-N correction term applied are approximately 0.06 m closer to the high-resolution models compared to GGM models using the EGM2008 correction term. The GSVS17 scenario also shows a similar result at much smaller magnitudes (0.015 m convergence) caused by the much lower terrain. While it is possible the high-resolution models are all in error, it does not seem so likely due to two factors: (1) the superior performance of the high-resolution models over the entire GSVS17 profile (especially over the mountain passes) and (2) the improvement with the ICGEM correction term.

Another noticeable difference is apparent in xGEOID19B, xGEOID20B, and CGG2013a, which were all fairly consistent across GSVS17. Here, there is a substantial shift in the mean values between xGEOID20B and the other two models for the Fourteener summits of approximately − 0.05 m (mean values were nearly identical for GSVS17 and now xGEOID20B is quite different). The exact cause of this is unknown and will be the subject of future work.

Generally, the geoid models exhibit consistency at the ~ 0.05 m level for these mountain summits. However, there are a few summits where more variability is evident including Blanca Peak, Crestone Needle, Ellingwood Point, Mt. Lindsey, Little Bear Peak, and Redcloud Peak. With the exception of Redcloud Peak, these peaks are in the Sangre de Cristo Mountain range (see Fig. 2), which has a more rugged terrain than other portions of Colorado, and has very sparse terrestrial gravity data. Redcloud Peak has adequate surrounding terrestrial gravity data, but these values have higher uncertainties associated with them than is typical. Even though the geoid models might rely on the same gravity data, the interpolation schemes can be vastly different with respect to how uncertainties are handled or neglected. Both of these types of errors explain why the geoid models differ from one another, but only new gravity observations and better/more consistent modeling will mitigate the errors.

Based on the results here and from the previous section, a determination needs to be made as to which model or combination of models should be applied to determine the Fourteener summit elevations. In order to facilitate a “stand in” for NAPGD2022 for this study, the decision was made to take an equally-weighted mean of the five high-resolution geoid models. This rationale is supported by three factors: (1) it is similar to what the official geoid realization will be (i.e., a mean model computed by NGS, CGS, and potentially others); (2) all of the models have a similar spatial resolution and do not omit portions of the geoid spectrum; and (3) they provide the most consistent and highest accuracies in comparison to the GSVS17 profile. Finally, a “best estimate” of the orthometric height for each peak is then calculated using the LiDAR ellipsoidal heights via (4). It is important to stress that since no official geoid model (GEOID2022) has been defined to realize NAPGD2022, these values should all be considered experimental, possibly “close” approximations, but in no way “official.”
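The equally-weighted mean model and the resulting height estimate can be sketched as follows. This is a minimal illustration under the assumption that Eq. (4), which is not reproduced in this section, has the standard form H = h − N; the function name and the sample undulations are hypothetical.

```python
import numpy as np

def summit_orthometric_heights(h_ellipsoidal, model_undulations):
    """Orthometric heights from an equally-weighted mean of geoid models.

    h_ellipsoidal     : shape (n_summits,), LiDAR ellipsoidal heights (m)
    model_undulations : shape (n_models, n_summits), undulation N per model (m)
    Assumes H = h - N (standard relation; stands in for Eq. (4) of the paper).
    """
    h = np.asarray(h_ellipsoidal, dtype=float)
    N = np.asarray(model_undulations, dtype=float)
    N_mean = N.mean(axis=0)          # equally-weighted mean model per summit
    deviations = N - N_mean          # each model's variability about the mean
    return h - N_mean, N_mean, deviations

# Two synthetic summits and two synthetic models (illustrative values only).
H, N_mean, dev = summit_orthometric_heights(
    [4400.0, 4300.0],
    [[-14.0, -15.0], [-14.2, -15.2]],
)
```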

The resulting estimated elevations of the mountain summits are illustrated in Fig. 17 with the 4267.2 m (14,000’) level highlighted (see Table 7 in the appendix for details on each of the 60 peaks considered here). All of the summits continue to be above the 4267.2 m level, with the three lowest Fourteeners (Huron Peak, Sunshine Peak, and Mt. of the Holy Cross) all residing just ~ 1 m above the Fourteener threshold. Additionally, we can now estimate the uncertainty of these elevations through a combination of uncertainties from the LiDAR-derived geometric component and the gravimetric geoid model component. To our knowledge, this is the first time uncertainty estimates have been determined for these summits. Based on Sect. 3.1, we assume a uniform ± 0.054 m geometric uncertainty at all summits. Based on Sect. 3.3 and the preceding discussion, the geoid uncertainty is clearly not uniform across all summits, so we take the standard deviation of the five geoid models’ undulations as an estimated uncertainty for each peak. The geoid contribution to the uncertainty ranges from ± 0.016 to ± 0.103 m with a median of ± 0.032 m. The uncertainty estimates determined here are roughly two times larger than estimates provided with the xGEOID20B model, which specifically states that the model uncertainty is ‘too optimistic’ over the GSVS17 profile (Wang et al. 2022). Assuming these two components are uncorrelated, we add the geometric and geoid components in quadrature and estimate that the total uncertainty for each peak’s elevation is between ± 0.056 and ± 0.116 m, with a median of ± 0.063 m.
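The quadrature combination can be checked directly from the values quoted above. The sketch below uses only numbers from the text; the function name is hypothetical.

```python
import math

def total_uncertainty(sigma_geometric, sigma_geoid):
    """Root-sum-square of two uncorrelated 1-sigma components (metres)."""
    return math.hypot(sigma_geometric, sigma_geoid)

SIGMA_LIDAR = 0.054  # uniform geometric uncertainty from Sect. 3.1

# Geoid contribution: minimum, median, and maximum quoted in the text.
totals = {
    label: round(total_uncertainty(SIGMA_LIDAR, sigma_geoid), 3)
    for label, sigma_geoid in [("min", 0.016), ("median", 0.032), ("max", 0.103)]
}
print(totals)  # {'min': 0.056, 'median': 0.063, 'max': 0.116}
```

Because the geometric term is constant, the total is a monotonic function of the geoid term, which is why the median geoid contribution maps to the median total.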

Fig. 17

Estimated orthometric height of mountain summits under a preliminary modernized datum (units: m). The 4267 m (= 14,000 ft) level is highlighted with a horizontal black line. Horizontal axis is based on the original summit ranking (highest to lowest). Change in ranking is shown with + or − ranking change

The expected elevation change (due to the datum shift from NAVD 88 to NAPGD2022*) at available, historical triangulation stations is shown in Fig. 18, with statistics in Table 5. These range from about − 0.40 to − 0.60 m (highlighted in red in Fig. 18) and are calculated via the xGEOID20 interpolation tool (National Geodetic Survey 2020). By way of comparison, and as seen in Sect. 3.3, the GSVS17 profile has a mean shift of − 0.60 m. However, the estimated elevation changes computed here have a mean of − 1.49 m, almost 1 m (mean = − 0.94 m) lower than the predicted datum shift. This is a significant difference and definitely unexpected. This residual difference is the discrepancy between the red and blue symbols in Fig. 18 and the residual triangulation-derived elevation change in Table 5. It is possible that uncertainties in the data and models can explain some of this discrepancy, and it is true that both the elevations in NAVD 88 at these stations (the historical elevation) and the new LiDAR/geopotential model elevations estimated here have errors that are difficult to quantify.

Fig. 18

Estimated change in orthometric height between NAVD 88 and NAPGD2022* (units: m) evaluated at triangulation stations. All NAVD 88 heights are from triangulation vertical angles in NGVD 29 and vertically converted to NAVD 88 (shown in blue) except for Pikes Peak, which has a GNSS-derived NAVD 88 elevation (shown in yellow). The expected, predicted change (from NAVD 88 to NAPGD2022) is shown in red

Table 5 Statistics for estimated elevation change (units: m)

Based on results from the previous sections (a LiDAR vertical uncertainty of ± 0.054 m and a geoid model uncertainty of ± 0.03 m), a conservative uncertainty estimate for the new elevations is determined to be ± 0.06 m (standard deviation). The estimated uncertainty of the original, triangulation-based NAVD 88 elevations is nearly impossible to determine. NGS publishes the value to the nearest 0.1 m, but this is likely not a reflection of the actual uncertainty.

Investigating the only station that has modern geodetic observations, Pikes Peak (see footnote 1), we note a significant discrepancy: the GNSS-derived NAVD 88 height published in 2022 is 4299.32 m, which is roughly 1.3 m lower than the (previously published) NAVD 88 triangulation-based height (4300.6 m). The historic, triangulation-based orthometric height is consistent with other (historic) station height differences shown in Fig. 18 (the blue square for Pikes Peak is in line with the majority of other blue squares). Additionally, it should be noted that the newly published NAVD 88 height is remarkably consistent with estimates for the shift from NAVD 88 to NAPGD2022 (yellow square and red dots in Fig. 18).

Thus, Fig. 18 presents a major problem: nearly all of the derived height differences are too negative by at least a meter (either historic heights are generally too high or new heights are too low). There are a number of plausible sources for this disagreement, including any combination of the following: errors in the original NGVD 29 orthometric heights observed with vertical angles, errors in the NGVD 29-to-NAVD 88 VERTCON conversion, errors in the LiDAR derived geometric height, errors in the gravimetric geoid model, and/or errors in the estimated NAVD 88-to-NAPGD2022 conversion (which also includes a hybrid geoid model, GEOID18). Vertical deformation is unlikely to be a factor at these magnitudes as current rates from continuous GNSS generally show subsidence at less than ~ 0.001 m/yr (Blewitt et al. 2016). Based on results from the previous section, there is some confidence in how well the LiDAR/geoid method performs. We must therefore turn our attention to other possible error sources: historical, published orthometric heights derived from zenith angles, the NGVD 29-to-NAVD 88 conversion, and/or the NAVD 88-to-NAPGD2022 conversion. The previously discussed Pikes Peak example supports the hypothesis that the NAVD 88-to-NAPGD2022 conversion is consistent; and in any case, further study of the latter two error sources will be the subject of future work. Thus, the focus is directed on the historical, zenith angle-derived heights in the next section.

3.5 Readjustment of triangulated zenith angles

Investigating possible errors in the original triangulation observations proves to be quite challenging. There are very few records of measured zenith angles in the NGS Integrated Database as many of these observations were never digitized or otherwise retained. Additionally, it is very difficult to get a detailed description of how these zenith angles were observed, reduced, and adjusted (for more information see Poling 1947). Since there is no way to access all of the zenith angles or to reproduce the historical computations, a complete readjustment of the available zenith angles is performed according to Heiskanen and Moritz (1967) (p. 175), where the ellipsoidal height difference between stations 1 and 2 can be found from (10):

$$ h_{2} - h_{1} = s\left( {1 + \frac{{h_{m} }}{R} + \frac{{s^{2} }}{{12R^{2} }}} \right)\tan \frac{{z_{2} - z_{1} }}{2}, $$
(10)

where \({h}_{1}\) is the ellipsoidal height of station 1, \({h}_{2}\) is the ellipsoidal height of station 2, \(s\) is the horizontal distance between the stations along the ellipsoid, \({h}_{m}\) is the mean height, \(R\) is the mean spherical radius between station 1 and 2 using the principal radii of curvature of the ellipsoid at each station, and \({z}_{1}\), \({z}_{2}\) are reciprocal, ellipsoidal zenith angles corrected for the deflection of the vertical according to (11):

$$ \begin{aligned} z_{1} & = z_{1}^{\prime} + \xi_{1} \cos \alpha + \eta_{1} \sin \alpha , \\ z_{2} & = z_{2}^{\prime} - \xi_{2} \cos \alpha - \eta_{2} \sin \alpha , \\ \end{aligned} $$
(11)

where \({z}_{1}^{\prime}\) and \({z}_{2}^{\prime}\) are the observed zenith angles, \(\alpha \) is the azimuth from station 1 to station 2, \({\xi }_{1}\) and \({\xi }_{2}\) are the north–south components of the deflection of the vertical, and \({\eta }_{1}\) and \({\eta }_{2}\) are the east–west components of the deflection of the vertical.
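The deflection correction of Eq. (11) can be sketched as follows. This is a minimal illustration: the function name is hypothetical, and all angles (zenith angles, azimuth, deflection components) are assumed to be supplied in radians.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians

def ellipsoidal_zenith_angles(z1_obs, z2_obs, alpha, xi1, eta1, xi2, eta2):
    """Apply Eq. (11): observed zenith angles -> ellipsoidal zenith angles.

    alpha is the azimuth from station 1 to station 2; xi and eta are the
    north-south and east-west deflection components at each station.
    """
    eps1 = xi1 * math.cos(alpha) + eta1 * math.sin(alpha)
    eps2 = xi2 * math.cos(alpha) + eta2 * math.sin(alpha)
    return z1_obs + eps1, z2_obs - eps2

# Synthetic example: a line due north with 5" and 3" north-south deflections.
z1, z2 = ellipsoidal_zenith_angles(
    z1_obs=math.radians(89.0), z2_obs=math.radians(91.1), alpha=0.0,
    xi1=5.0 * ARCSEC, eta1=-2.0 * ARCSEC, xi2=3.0 * ARCSEC, eta2=1.0 * ARCSEC,
)
```

For a line due north only the north-south components contribute, so the two zenith angles change by + 5″ and − 3″ in this example.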

This derivation assumes that reciprocal angles are observed simultaneously at stations 1 and 2 in order to mitigate the effects of refraction. The observations utilized here do include zenith angles at both ends of each profile, but in general, they were not collected at the same time (or even on the same day). Consequently, we assume that they do not completely remove the atmospheric refraction effect, and we compute the last element in (10) as follows:

$$\frac{{z}_{2}-{z}_{1}}{2}=\frac{\pi }{2}+\frac{s}{2R}-{z}_{1}.$$
(12)
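Equations (10) and (12) together give a one-sided height difference, which can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, a fixed mean Earth radius of 6371 km stands in for the mean radius derived from the principal radii of curvature, and the mean height is passed in directly (in practice it depends on the unknown height of station 2, so the computation would be iterated).

```python
import math

def height_difference(s, z1, h_mean, R=6_371_000.0):
    """Ellipsoidal height difference h2 - h1 from a single zenith angle.

    s      : horizontal distance along the ellipsoid (m)
    z1     : ellipsoidal zenith angle at station 1 (rad), i.e. after Eq. (11)
    h_mean : mean ellipsoidal height of the two stations (m)
    R      : mean radius (m); the default is an assumed spherical value
    """
    half_angle = math.pi / 2 + s / (2 * R) - z1        # Eq. (12)
    scale = 1 + h_mean / R + s**2 / (12 * R**2)        # bracket in Eq. (10)
    return s * scale * math.tan(half_angle)            # Eq. (10)

# A horizontal sight line (z1 = 90 deg) over 10 km yields the Earth-curvature
# term, approximately s^2 / (2R).
dh = height_difference(s=10_000.0, z1=math.pi / 2, h_mean=0.0)
```

No refraction term appears here; the text treats refraction separately through an additional adjustment parameter.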

Of the 32 Fourteener summits with triangulation stations, it is possible to reconstruct and readjust two unconnected (sub)networks that include a total of 15 Fourteeners, as illustrated in Fig. 19. While these networks are very close to one another, unfortunately they do not have any overlap. Consequently, they are adjusted individually in the same manner: a single leveling station is used as a constraint (with a NAD83(2011) ellipsoidal height), the observations are equally weighted, deflection components are taken from the xDEFLEC20 model (Wang et al. 2022), and all estimated ellipsoidal heights are converted back into NAVD 88 orthometric heights using a hybrid geoid model (GEOID18). The constraint station for each network was selected based on centrality to the overall network in addition to the requisite leveled height information; however, there are very few stations in the triangulation network that have leveled heights. Horizontal distances between the stations were determined using latitude and longitude coordinate differences on the ellipsoid (Vincenty 1975). Additionally, the deflection model used here is determined at the topographic surface by applying a Bouguer plate correction term to account for the curvature of the plumbline between the geoid and the topography (see 5–32 in Heiskanen and Moritz 1967). Over the GSVS17 profile, Wang et al. (2022) found this model to be accurate at approximately ± 0.9″ in both directional components.

Fig. 19

Historical triangulation networks available from the NGS Integrated Database included in reprocessing with topographic elevation shown in m. Network 1 (in blue) includes four Fourteener summits and Network 2 (in purple) includes eleven Fourteener summits. Constraint stations are shown as white squares

After the initial adjustment, an inspection of the estimated residuals of these networks reveals a clearly systematic artifact correlated with the distance, as illustrated in the left panel of Fig. 20. The non-centered residuals violate one of the key assumptions of least squares and signify the existence of an unmodeled effect. It is well known (cf. Bomford 1962; Heiskanen and Moritz 1967; Torge 2001) that zenith angle observations are prone to errors caused by atmospheric refraction, and an assumption is made that this is the most likely cause of these systematic residuals. So, while the data do not have true reciprocal angles, it is likely that these observations were made under fairly similar atmospheric conditions (out of necessity they required clear visibility for the long sight lines with no adverse weather), in similar afternoon hours of the day (Gossett 1959), and at high altitudes with sightlines well above the Earth’s surface where refraction is slightly less variable (Hirt et al. 2010). Thus, an additional estimated parameter is included in the least squares adjustment that depends on the distance squared (\(\tau {s}^{2}\)). The inclusion of this term clearly mitigates nearly all of the systematic behavior of the post-adjustment residuals (Fig. 20 right panel). The estimated \(\tau \) parameter for Network 1 and Network 2 is − 0.00828 m/km² and − 0.00813 m/km², respectively. The level of agreement between these two values is reassuring. Defining the coefficient of refraction k as (Torge 2001),

$$k\equiv \frac{R}{r}\approx -2R\tau ,$$
(13)

where R is the radius of the Earth, r is the radius of the line of sight—we find k values of 0.1055 and 0.1036, respectively. These values are slightly smaller than the nominal value of k = 0.13 from Gauss (Brunner 1984), but this is consistent with conditions found at high mountain peaks: lower pressure and lower temperature.
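The conversion of Eq. (13) can be checked against the quoted values, and the idea of a distance-squared term can be illustrated with a simplified fit. The sketch below makes two loud assumptions: an Earth radius of 6371 km (the text does not state the value used), and a stand-alone least-squares fit of τ to residuals, whereas the paper estimates τ as an additional parameter inside the network adjustment. Function names are hypothetical.

```python
import numpy as np

R_KM = 6371.0  # assumed mean Earth radius (km)

def fit_tau(s_km, residual_m):
    """Least-squares fit of v = tau * s^2 through the origin (tau in m/km^2).

    Simplified stand-in for estimating tau inside the network adjustment.
    """
    s2 = np.asarray(s_km, dtype=float) ** 2
    v = np.asarray(residual_m, dtype=float)
    return float(s2 @ v / (s2 @ s2))

def refraction_coefficient(tau_m_per_km2, radius_km=R_KM):
    """Eq. (13): k = -2 R tau, with tau converted from m/km^2 to 1/km."""
    return -2.0 * radius_km * (tau_m_per_km2 / 1000.0)

k1 = refraction_coefficient(-0.00828)   # Network 1
k2 = refraction_coefficient(-0.00813)   # Network 2
print(round(k1, 4), round(k2, 4))       # 0.1055 0.1036

# Synthetic residuals that follow a pure distance-squared law.
s = np.array([10.0, 20.0, 30.0])
tau_hat = fit_tau(s, -0.008 * s**2)
```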

Fig. 20

Post-adjustment residuals of triangulation network height differences (units: m). Blue and red squares are residuals from Network 1 and Network 2, respectively. (Left) original adjustment and (right) adjustment including a distance squared parameter

The historical, published NAVD 88 orthometric heights can now be compared with the new, refraction-corrected NAVD 88 orthometric heights for the 15 peaks that were in the original adjustment networks (see Fig. 21 with statistics in Table 5). The impact of this correction is readily apparent: All stations show convergence towards the expected datum-induced height change (all purple symbols are closer to the red symbols when compared with the blue symbols). Additionally, it should be noted that the (mean) magnitude of the height changes is 0.46 m (from blue to purple symbols), and even stations that have inconsistent historical elevations (i.e., comparatively too low instead of too high, like Snowmass Mountain and Pyramid Peak) exhibit the same magnitude of improvement but in the opposite direction. Finally, Castle Peak, which showed some of the most disagreement in Fig. 18, experiences a − 1.7 m height change and is now within 0.10 m of expectations. Ultimately, the inclusion of the refraction parameter improves the results, but errors are not completely eliminated to a satisfactory level (Table 6).

Fig. 21

Estimated change in orthometric height between NAVD 88 and NAPGD2022* (units: m) evaluated at triangulation stations. Published NAVD 88 heights (from triangulation vertical angles in NGVD 29 and vertically converted to NAVD 88) are shown in blue (readjusted) or gray (not readjusted) (cf. Fig. 13). Pikes Peak, which has a GNSS-derived NAVD 88 elevation, is shown in yellow. Stations included in a readjustment of zenith angles are shown in purple. The expected change (from NAVD 88 to NAPGD2022) is shown in red

Table 6 Statistics for estimated elevation change (units: m)

4 Conclusions

This investigation provides an estimate for the elevation of high mountain peaks in Colorado based on a preliminary geopotential datum. Due to the nearly total lack of modern geodetic data on the peaks, LiDAR data are utilized from the USGS 3DEP project to estimate ellipsoidal heights, and then the point cloud data are transformed into orthometric height space by applying a number of gravimetric geoid models. For the vast majority of the peaks that were investigated, the geoid models are quite consistent with one another at the ± 0.05 m level. However, there are differences of 0.10 m or more at approximately 10% of summits due to insufficient terrestrial gravity data, inconsistent modeling, and/or inconsistent gravity data interpolation with uncertainties.

While one might think that precise knowledge of remote, mountain summit elevations is either a bit academic, or even frivolous, there are important, real-world applications that stand to benefit from the techniques and results presented here. These applications include high resolution geodynamical studies that require the accurate determination of elevations (and rates of change) or the testing of linked optical atomic clocks that requires large, quantified geopotential differences, ideally at the cm (or better) level.

While there is no way to determine which geoid model is the most accurate at the Fourteener summits, a comparison with the GSVS17 profile can provide some guidance as to how the various geoid models perform in a mountainous environment. Along GSVS17, the CGG2013a model provides the most consistency with the GNSS/leveling data (σ = ± 0.019 m) followed by pxGEOID and xGEOID20B (σ = ± 0.022 m). Additionally, all of the GGM-based models exhibit a persistently lower geoid undulation at the mountain passes in the GSVS17 comparison and in inter-model comparisons at the Fourteener summits. This would result in elevations being approximately 0.05–0.10 m higher with the GGM-based models compared to the regional geoid models, and while this is likely the result of omission, it should caution users in need of cm-level geoid accuracy. Additionally, there is a noticeable divergence in xGEOID20B with respect to CGG2013a and xGEOID19B at the Fourteener summits, where xGEOID20B exhibits a − 0.05 m shift from these other models. At lower elevations along GSVS17, there was effectively no offset between these models. Finally, there is a noticeable increase in the geoid variability for summits that have sparse or highly uncertain terrestrial gravity data.

An orthometric height for each Fourteener summit based on LiDAR-derived ellipsoidal heights and a “stand in” NAPGD2022* geopotential datum is calculated. While the uncertainty in the LiDAR derived ellipsoidal heights is not quite at the level of achievable geodetic standards, the observed ± 0.054 m vertical RMSE over GSVS17 would be equivalent to the vertical RMSE values obtained by using 40-min static GNSS occupations using various Online Positioning User Service (OPUS) products (Gillins et al. 2019). This is not an endorsement of using LiDAR ellipsoidal heights as precise geodetic control, but in challenging environments and other unique situations, this study demonstrates that LiDAR ellipsoidal heights have some utility for geodetic users. Additionally, vertical components derived from LiDAR ellipsoidal heights instead of orthometric heights are much more accurate: with an RMSE improvement of nearly 28% (from ± 0.074 to ± 0.054 m). This highlights the importance of retaining the ellipsoidal height component in LiDAR point clouds.

Based on these estimates, all of the Fourteeners would remain above the 14,000′ threshold. The three lowest Fourteeners would get even lower, ending up only about 4′ above 14,000′. Additionally, a small number of summits would change position in the overall ranking. No peaks below the 14,000′ threshold would be “elevated” to Fourteener status. Total uncertainties for the peak elevation values range from ± 0.056 to ± 0.116 m, with a median of ± 0.063 m.

The variability among the geoid models studied here is not uniform across all peaks but is location-specific. Indeed, this variability (± 0.016 to ± 0.103 m, with a median of ± 0.032 m) is used to estimate the geoid component of the uncertainty in the composite NAPGD2022* model. It is expected that when the official realization of NAPGD2022 is ultimately released, its predicted undulations will be in concordance with this uncertainty range. That is, the final peak elevations are not expected to change by more than their uncertainties listed in Table 7.
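
The total uncertainties quoted in this section are numerically consistent with a simple root-sum-square of the LiDAR and geoid components. The sketch below is illustrative only: it assumes that the ± 0.054 m LiDAR ellipsoidal-height RMSE from GSVS17 and the per-summit geoid variability are independent and combine in quadrature. That model is an assumption made here, not a statement of the exact error budget behind Table 7, and the function name is hypothetical.

```python
import math

LIDAR_RMSE_M = 0.054  # vertical RMSE of LiDAR ellipsoidal heights over GSVS17 (m)


def total_uncertainty(geoid_sigma_m, lidar_sigma_m=LIDAR_RMSE_M):
    """Root-sum-square of two components assumed to be independent (metres)."""
    return math.hypot(lidar_sigma_m, geoid_sigma_m)


# Minimum, median, and maximum geoid variability quoted in this section.
for label, geoid_sigma in [("min", 0.016), ("median", 0.032), ("max", 0.103)]:
    print(f"{label}: +/- {total_uncertainty(geoid_sigma):.3f} m")
# prints:
# min: +/- 0.056 m
# median: +/- 0.063 m
# max: +/- 0.116 m
```

Under this assumption the LiDAR term dominates at most summits, and the geoid term only becomes the larger contributor where the inter-model variability exceeds about ± 0.054 m.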

The amount of elevation change expected for each summit with the upcoming datum redefinition was also investigated. Because the historical summit elevation in NAVD 88 could not be accurately established for every Fourteener summit, the currently published NAVD 88 orthometric height at historical triangulation stations was used as a substitute, and the change from this value is estimated. Nearly all of the mountain triangulation stations would appear to drop significantly (mean = − 1.49 m) in the new NAPGD2022* datum. This is quite surprising given that the expected shift between NAVD 88 and NAPGD2022 at these locations is only about − 0.50 m. The most likely cause of a problem of this magnitude is poor handling of zenith angle observations (omission of deflection of the vertical, incomplete atmospheric refraction mitigation, etc.). More importantly, we conclude that the currently published NAVD 88 elevations of geodetic marks on the summits are systematically too high by at least 1.0 m with respect to NAVD 88.

To mitigate the problems with the historic zenith angle observations as far as possible, the two available subnetworks, which include roughly half (15/32) of the summit geodetic stations, are rederived and readjusted. This improves all of the evaluated stations, with height changes at the 0.5 m level. While the majority of stations see their current elevation decrease in the readjustment, the three stations that showed evidence of being too low see their elevations increase and are now more closely aligned with the estimates.

There are still unresolved differences (at the ~ 0.5 m level) between the derived elevation differences and the predicted NAVD 88 to NAPGD2022 shift; these are most likely caused by errors and inconsistencies in the historic triangulation data and will be the subject of further study. Additionally, several topics for future work would help address some of the questions that have been uncovered. First, acquiring new static GNSS observations on the geodetic stations would provide some reassurance of the new height estimates. Although that would only decrease the error budget by approximately ± 0.03 m, a relatively small amount compared to the discrepancies shown in Fig. 21, it would provide significant evidence to address the questionable nature of the historical triangulated heights. Second, efforts to further clarify historical zenith angle observations in the region from the Coast and Geodetic Survey and USGS are underway. Third, additional possible errors from the NGVD 29-to-NAVD 88 conversion will be investigated. Finally, there are still improvements to be made to provide realistic and accurate geoid model uncertainties.