The open data HELI-DEM DTM for the western alpine area: computation and publication

Two problems can arise in analyzing Digital Elevation Models (DEMs). Firstly, one interest region could be patched by several partly overlapping models that present similar accuracies and spatial resolutions; they should be merged in one unified model. Moreover, even when the interest region is completely covered by one model, local data with better accuracy could cover partial areas and should be properly merged. All these problems have been addressed within Helvetia-Italy Digital Elevation Model, a project funded by the European Regional Development Fund (ERDF). One specific aim of the project was the creation and the publication of a unified DEM for a part of the Alps between Italy and Switzerland. The interest area is prevalently mountainous, with elevations that range from about 200 to 4600 m. Various elevation datasets were available that were gridded with different spatial resolutions, in different reference frames and coordinates. Moreover, one high resolution LiDAR model was available for some areas. Firstly, a validation was performed that allowed the identification of few blunders and confirmed the general accuracy of the input data. Two DEMs have been produced. Both of them cover the whole project area (boundaries λ = 7.80° East and λ = 10.70° East, φ = 45.10° North and φ = 46.70° North) and are gridded in ETRF2000 geographical coordinates, with a spatial resolution of 2 × 10 − 4 degrees. The former (named HD-1) has been obtained by interpolating and merging all the low resolution models on a new common grid. HD-1 has been locally corrected by the LiDAR model, where it was available; to avoid sharp discontinuities, the corrections have been filtered by Fast Fourier Transform before applying them. The resulting model has been called HD-2. HD-1 and HD-2 are published by an open access geoservice.


Introduction
Digital Elevation Models (DEMs, Li et al. 2005) provide a numerical description of the terrain elevation. Stored elevations are typically orthometric heights; when the acquisition technique produces ellipsoidal heights, these are converted into orthometric. DEMs can be classified in Digital Surface Models (DSMs), which represent the actual Earth surface (including buildings, vegetation, etc.) and Digital Terrain Models (DTMs), which represent the elevation of the bare soil.
Different models can be adopted to store and transmit a DEM; grids represent the most popular standard (El-Sheimy et al. 2005). Grids are georeferenced regular matrices of (x i , y i ) nodes, whose elevations H i are stored. The horizontal coordinates of the nodes can be either in a cartographic projection (x: East, y: North) or geographic (x: λ, y: φ). Typically, the horizontal spacing between nodes (the grid resolution Δx and Δy) are equal in x and y directions; this is not a strict requirement but quite a standard. The georeferencing of a grid requires the knowledge of the reference frame, the coordinate system, the grid origin (typically, the coordinates of the upper left node), Δx, Δy, and the total number of rows and columns; additional metadata are needed and are stored in the header. The vertical standard deviation (ZS) of a DEM defines the accuracy of the stored elevations.
The production of global DEMs by remote sensing techniques (Gesch et al. 2006) has grown significantly in the last decade. Their resolutions have improved from about 90 m of SRTM (v4.1, http://www.cgiar-csi.org/data/srtm-90m-digitalelevation-database-v4-1) to the 12 m of TerraSAR-X add-on for Digital Elevation Measurement (TanDEM-X). Accuracies reach now the order of few meters; for example, the absolute accuracies of TanDEM-X and ALOS World 3D are respectively of 4 m (http://www.dlr.de/eo/en/desktopdefault.aspx/ tabid-5727/10086_read-21046/) and 5 m (http://www.eorc.jaxa. jp/ALOS/en/aw3d30/). They are fundamental for global applications and are necessary also at the local scale, where local models are not available. However, local DEMs are preferred where they are available because typically, they are characterized by better accuracies and resolutions. In particular, regional or national models can be produced by aerial photogrammetry with resolution and accuracy of some meters (Li et al. 2005). By LiDAR technique (Brovelli et al. 2004;Liu 2008), high resolution and high accuracy (both better than 1 m) models can be produced.
The merging of local bordering models is a major objective especially in alpine areas where hydrogeological analyses are needed (Li and Lihao 2004;Grimaldi et al. 2010); there, one model which covers the whole interest area could be useful to analyze the scenario. However, bordering or overlapping local models cannot be immediately merged, because typically, they are not in the same grid. Moreover, possible inconsistencies must be carefully checked in the border overlaps.
In recent years, Italy and Switzerland have produced different DTMs that have been acquired by different data sources, are characterized by different reference frames, accuracies, and resolutions, and cover different areas of the Alps. HELI-DEM (Helvetia-Italy Digital Elevation Model, Biagi et al. 2011)  In the following sections, all the performed tasks are described. In particular, the BData preprocessing^section summarizes the data preprocessing, the BComputation of HELI-DEM-1^section discusses the merging of low resolution DTMs, the BComputation of HELI-DEM-2^section discusses the corrections with a high resolution LiDAR DTM, and the BHD-1 and HD-2 publication^section presents the implementation of a geoservice for the publication of the final results.

Data preprocessing
Preprocessing includes all the needed operations to prepare the input data. Firstly, the available data are collected and then they are checked. Our available data can be mainly classified in the following ways (Biagi et al. 2012 Input data are in different reference frames. The output model should be in ETRF2000, i.e., the present official European Reference Frame. Therefore, to harmonize the data, transformations of reference frames are needed; for this purpose, specific libraries (Biagi et al. 2012)  The results are extensively discussed in Biagi et al. (2013b). Here, only their synthesis is reported: PI-DTM and DTMCH-25 are fully consistent; neither global nor local biases exist in their overlaps and the standard deviations of the differences are consistent with the nominal accuracies. LO-DTM and DTMCH-25 are only partly consistent; no biases can be identified in their overlaps but in some areas, the standard deviations of the differences exceed the nominal accuracies. Discrepancies are not correlated with the morphology complexity of the terrain.
Moreover, LO-DTM contains clear problems in two small areas at the border with PI-DTM. As example, one of them is reported in Fig. 2; in the southern river-head (Ticino valley) of Lago Maggiore, PI-DTM reports realistic elevations while LO-DTM is surely wrong because it extends the lake (conventionally at an elevation of 193 m) for at least 12 km 2 in the Ticino valley. The two areas have been removed from the LO-DTM dataset before further processing.
Checks of low resolution models with PST-A have generally provided satisfactory results; this was in some way expected, because PST-A covers only the low valleys where less problems are expected than on slopes and mountains. Only in Val Pola (Valtellina area), anomalous localized values exist with a maximum difference of about 100 m (Fig. 3). This is due to a huge landslide that occurred in 1987 (Costa 1991); LO-DTM was produced before the event, while PST-A is more recent.
Ad hoc real-time kinematic (RTK) surveys in the anomalous areas confirmed the correctness of PST-A (Biagi et al. 2013a), whose use to correct less accurate values should generally be planned.

Computation of HELI-DEM-1
One DTM which covers the whole interest area must be created. In particular, the output DTM must be in ETRF2000 geographic coordinates and must cover the rectangle inside the following boundaries: λ = 7.80 ∘ E, λ = 10.70 ∘ E, φ = 45.10 ∘ N and φ = 46.70 ∘ N. The adopted spatial resolution is 2 ⋅ 10 −4 sexagesimal degrees (about 15 m in longitude and 22 m in latitude).
All the input DTMs must be transformed to ETRF2000. Note that a reference frame transformation of a point is a three dimensional transformation of its X, Y, Z (or φ, λ, h) coordinates. However, a point of a DTM is a node of an horizontal grid whose orthometric height is stored; therefore, the reference frame transformation is applied only to the horizontal coordinates of the node. The output is a list of transformed horizontal points and their orthometric heights; the points are again almost regularly distributed but no more on an exactly oriented grid. Moreover, a re-gridding could be needed with a different resolution or in a different coordinate system. Different procedures can be adopted to transform, re-grid, and merge more DTMs; in general, three independent operations are necessary regardless of the order: & transformation of the input DTMs to a common reference frame, & re-gridding of the DTMs on the output grid, & and merging of the re-gridded data.
In order to perform them, at least two different approaches can be applied.
1. Firstly, the input DTMs are transformed to the output reference frame; the resulting datasets are individually re-gridded on the output grid. Finally, the obtained grids are merged by averaging in the overlapping areas. This approach is called direct. 2. The horizontal coordinates of the output grid nodes are back transformed from the output reference frame to those of the input models. Each input DTM, still gridded in its own reference frame, is interpolated on these horizontal points. The interpolated elevations are then assigned to the output nodes. Finally, the obtained grids are merged by averaging in the overlapping areas. This approach is called inverse.

Re-gridding choices
To re-grid, firstly, a parametric model is chosen; the parameters are then estimated by the input data. Many approaches and models exist; a first classification can be in interpolation and approximation (Christakos 1992). In interpolation, a model is estimated that passes through all the observations. In approximation, statistical methods are applied to estimate a smoother model from the observations. When raw observations are used to estimate a model, approximation must be adopted because this allows the filtering of errors and outliers; one example is discussed in Biagi and Negretti (2004). On the contrary, in our case, the input data came from models that have been already filtered and checked against errors. In this case, actual details can be lost by smoothing while interpolation should provide better results.
Typically, geographic information system (GIS) software (O'Sullivan and Unwin 2003) implements either splines or polynomial surfaces (Kidner 2003) to interpolate grid nodes. Particularly, we have focused on the use of bilinear and bicubic surfaces; previous works (see for example Rees 2000) stated the better accuracy of bicubic surfaces and our preliminary analyses confirmed that. A bicubic surface is given by the following: x 2 y þ a 12 xy 2 þ a 03 y 3 þ a 31 x 3 y þ a 22 x 2 y 2 þ a 13 xy 3 þ a 32 x 3 y 2 þ a 23 x 2 y 3 þ a 33 x 3 y 3 Firstly, let us consider the generic case of sparse observations and their interpolation or approximation in a point P = (x P , y P ). At least 16 (x i , y i , H i ) observations around P are needed to estimate a ij parameters; the system can be written as follows: If exactly 16 observations are used, the parameters are estimated by the iso-determined system.
Clearly, the solution requires that A matrix is well conditioned. Finally, the estimated parameters are used to estimate elevations in P. In this case, the bicubic surface acts as interpolator.
If more than 16 observations are used, a redundant system is built and can be solved by the least squares (LS, Koch 1987) approach.
Also in this case, the good conditioning of the normal system is required; again, the estimated parameters are used to estimate elevations in P. In this case, the bicubic surface acts as approximation.
In the case of direct approach, input points must be managed as sparse observations and Eqs. (3) or (4) must be implemented; methodological and technical problems of this approach are extensively discussed in Biagi and Carcano (2015). In the paper, a particular attention is given to the numerical regularization of ill-conditioned cases that can occur in specific geometric configurations, even if the input data are regularly distributed.
On the opposite, let us now consider the bicubic interpolation of gridded data; in this case, to interpolate in a point P, the 16 nodes of the 4 rows ×4 columns around it can be used. The system Eq. (2) is solved by five consecutive cubic interpolations along lines, without any matrix inversion, as is shown in Press et al. (2007) and applied in Carcano (2014). Let us suppose that P is between rows i and i + 1 and columns j and j + 1 of the matrix DTM; let us call I and J the row and column real valued coordinates of P. In the first step, the four points in each row, k , k = i − 1 , i , i + 1 , i + 2, are used to interpolate a point in k, J: In the second step, H i − 1 , J , H i , J , H i + 1 , J , and H i + 2 , J are interpolated in I: The same approach applied in the opposite order produces the same results that are exactly equal to the solution of Eq. (2) for this particular spatial configuration. No system inversion is needed and the solution is always well conditioned. This twostep process is simply an application of the Lagrange polynomial interpolator in the special case of equidistant data. This is exactly the interpolation needed in the inverse regridding, where input data are still gridded. In case less than 16 elevations exist around P, the interpolation cannot be computed and the result is put to Bno data^; this happens at the borders of the input grid (and clearly outside it).

Final interpolation of HD-1
Firstly, direct approach was implemented to compute HD-1; the applied procedure is described in Biagi et al. (2013b). Unfortunately, when extensively applied to the whole input dataset, the direct approach provided numerical problems in a significant number of points.
Therefore, computation strategy has changed and the inverse approach has been implemented and tested. The ETRF2000 horizontal coordinates of the output nodes are transformed to the reference frame of the three input DTMs that are interpolated by Eq. (5a) and (5b) on these points. The interpolated elevations are assigned to the output nodes. To cross check, the process can be reversed; interpolated grids are back interpolated on the input nodes. Original and twiceinterpolated elevations are compared. Table 1 reports the cross checks for LO-DTM, PI-DTM, and DTMCH-25. In case of PI-DTM and DTMCH-25, results are completely satisfactory; the differences are not significant with respect to the nominal accuracies of input data. LO-DTM results are satisfactory but worse than others; the worst differences are in very rough areas.
At the end, three interpolated DTMs are available and must be merged; they share the same grid (origin, spatial resolution, and number of nodes) but each one contains elevations only inside the domain of its input DTM, while the external nodes contain Bno data^.
Most of the nodes of the final grid contain just one elevation that is the final stored elevation. On the contrary, interpolated DTMs overlap at their borders. To merge them, existing discontinuities must be smoothed; a weighted approach has been applied. Let P be a node where more DTMs overlap; the horizontal distances of P from the borders of all the overlapping DTM can be computed by standard GIS functions (Fig. 4). The final elevation in P is computed by the following: where h i are the elevations of overlapping DTMs in P, d i are the distances between P and the borders of overlapping DTMs. Note that this weighting has the purpose to avoid discontinuities at the borders of overlaps and not statistical reasons. The final DTM (HD-1) is shown in Fig. 5 and its statistics are summarized in Table 2.

Computation of HELI-DEM-2
The high resolution PST-A DTM is available in the river basins areas and can be used to improve HD-1. The simple replacement of HD-1 values with PST-A can introduce local significant discontinuities and a proper filtering has to be applied. The proposed procedure can be schematized in the following ways: 1. computation of the corrections, 2. filtering of the corrections, 3. and application of the filtered corrections to HD-1.
Computation of the corrections PST-A is subsampled on HD-1 nodes: firstly, all the nodes of PST-A within the square domain of each HD-1 node are averaged and then the average is assigned to the HD-1 node. The matrix containing the  N number of points (in million), Mean mean difference, Std. dev. standard deviation, Min. minimum difference, Max. maximum difference differences between PST-A and HD-1 is computed: the differences corresponding to PST-A Bno data^are set to zero. This is the correction model.
Filtering of the corrections by low pass filter Due to the subsampling of PST-A, the correction model is already smoothed; however, discontinuities at the borders can be in principle sharp. A low pass filter allows to smooth the corrections at their borders; it can be described as the convolution of a signal with a mask whose integral is equal to one. The low pass Butterworth filter is defined as follows: where n is the order of the filter, D 0 is the halving distance. In the bi-dimensional discrete case, the mask is a matrix of coefficients that are null outside a given window.
Here, indexes i, j are the node positions with respect to the origin of the filter (convolution application). In our application, we chose D 0 = 5 and n = 2. The dimension of the square filter was 11 × 11; k is a constant to normalize the sum of its coefficients.
The filtering has been implemented by a Fast Fourier Transform (FFT) approach (Brigham 1974): 1. construction of the Butterworth filter matrix.

computation of FFTs of both correction model and
Butterworth filter, 3. product of the two FFTs, 4. and inverse FFT of the product.
To visually analyze the filtering benefits, just the results of a small case study (Oglio river) are here reported. Figure 6 shows the original and the filtered corrections; the filtering effect is evident at the borders.
The filtered corrections have been applied to the unified DTM. The output DTM is called HD-2. Internal and external validations are useful to evaluate if the correction actually leads to an improvement of the final product. As internal validation, HD-1 and HD-2 are compared to PST-A DTM. The statistics of the differences demonstrate the usefulness of the correction. In Oglio river valley, the mean and the standard deviation equal respectively to −1.9 and 5.8 m before the correction and 0.0 and 1.1 m after the correction. In the complete HD-1, they are equal to 0.3 and 6.0 m before the correction and 0.0 and 0.4 m after.
As external validation, HD-1 and HD-2 are compared with about 1000 checkpoints that are available in some test areas. They have been surveyed by GNSS RTK, with an accuracy surely better than 10 cm and have been used in a previous step of the project to check LO-DTM where it has provided anomalous values (Biagi et al. 2013a); in particular, they are located   (Barzaghi et al. 2007). The subsampled PST-A and HD-2 have been interpolated by bicubic interpolation on the test points, and the differences with respect to test values have been computed (Table 3). The statistics prove the reliability of the applied correction; indeed, the results of HD-2 are significantly better than those of HD-1 and are comparable with those of PST-A. As example, Fig. 7 shows on the same graphs the differences between HD-1, PST-A (in this case not subsampled), HD-2, and the 64 points of one area that were surveyed along an almost straight path. A bias of HD-1 is clear, with some points that present differences up to 11 m. The differences between PST-A and RTK are hardly recognizable at the scale of representation. HD-2 differences are more similar to the high resolution case.

HD-1 and HD-2 publication
In publishing HD-1 and HD-2 data, our main goal was to ensure the data access in a simple way. To realize a high level of data accessibility and make easy to use them, we decided to implement standards; Open Geospatial Consortium (OGChttp://www.opengeospatial.org/) provides the main benchmarks. OGC is an international consortium of more than 500 members: companies, government agencies and universities participating to the development and use of international standards and supporting services that promote geospatial interoperability.
For our application, the OGC web service (OWS) standards must be adopted to implement the GeoWeb services that Fig. 6 The filtering effect shown for a small case study (Oglio river, 600 × 500 cells). Above: original correction model (differences between subsample PST-A and HD-1). Left: differences. Right: slopes. Below: filtered correction model. Left: differences. Right: slopes. Mean mean difference, Std dev standard deviation, Max maximum difference are needed to distribute spatial data (Peng andTsou 2003 andBrovelli et al. 2011). Using OWS, data and geoservices can be published to anyone that is standard compliant, independently from an operative system, GIS software, or other client properties or configuration (Michaelis and Ames 2007). Figure 8 shows the basic scheme of an OWS service. The client sends a standard request to the server. Then, the server accesses to the data, processes them according to the request, and finally sends the response to the client. Many different requests can be managed, depending on the type of OWS service implemented. Basically, they can be grouped in four main kinds: & BGet capabilities^requests to which the server responds with a list the process or data available, & BDescribe^requests to which the server responds with the description of a particular datum or process, and & BGet data^or BGet service^requests to which the server responds delivering data or executing specific processes.
Moreover, whether the server implements transaction functionalities, also data editing requests exist. Different standards are defined, according to the type of the data published in the geoserver; our data are grids (geo referenced raster data). Web coverage service (WCS) (Baumann 2012) and Web map service (WMS) (De La Beaujardiere et al. 2006) have been implemented. WCS (http://www.opengeospatial.org/standards/wcs) publishes raster data as georeferenced numeric matrices; the user can use them for processing and any other needed activity. Data published by WCS can be heavy; if you need them only for background images, the full numeric matrix is actually useless. WMS (http://www.opengeospatial.org/standards/wms) publishes data as simple georeferenced image. Data published by WMS are lighter; therefore, we have implemented also this service.
To implement WCS and WMS, we adopted GeoServer (http://geoserver.org/). GeoServer is a Java-based software that allows to access and edit geospatial data; it is free and open source and uses the open standards set by the OGC Our geoservice with HD-1 and HD-2 is available at the following address of Geomatics Laboratory of Como Campus of Politecnico di Milano (http://geomatica.como.polimi.it; http://www.helidemdataserver.como.polimi.it:8080/ geoserver/ows).

Conclusions
The HELI-DEM project was aimed at developing a unique DTM which covers the alpine area between Lombardy, Piedmont, Ticino, and Grisons Cantons. For this project, different DTMs were available, acquired with different accuracies and resolutions and in different reference frames: at low/ Fig. 7 Differences (cm) between HD-1/HD-2/PST-A and RTK points in area VT02 in Valtellina Valley   Fig. 8 Basic scheme of an OWS service medium resolution and accuracy, the two official DTMs of Italian Regions Lombardy and Piedmont and the national Swiss DTM and at high resolution, a LiDAR DTM which covers the main Italian river basins. These datasets have been merged to create one model.
To do that, several preliminary analyses have been performed, with the purpose to cross check and to assess the accuracy of all the input datasets. The performed analyses confirmed the general quality of the available data and allowed to identify and eliminate few anomalies.
Output model HD-1 has been obtained by re-gridding and merging of the low resolution available DTMs. Cross checks of HD-1 with input datasets confirm the correctness of adopted methods. In area where geodetic benchmarks are available and ad hoc RTK GNSS surveys were made (mainly in Valtellina Valley), the vertical accuracy of the model can be externally checked; it is of the order of 5 m.
Moreover, a method to correct a low resolution model by using high resolution data without introducing discontinuities has been studied and implemented. This method has been applied to HD-1 by using the LiDar model PST-A that was produced by the Italian Ministry of Environment for Italian alpine valleys. The correction, implemented through a Fast Fourier Transform, proved to be effective and the final product is called HD-2. Where corrections existed and have been applied, the final accuracy of HD-2 increases to about 1.7 m.
Both HD-1 and HD-2 are freely published by a FOSS geoservice.