## Abstract

Cross-validation is a popular method for assessing and improving the quality of prediction by least squares collocation (LSC). We present a formula for direct estimation of the vector of cross-validation errors (CVEs) in LSC that is much faster than element-wise CVE computation. We show that a quadratic form of the CVEs follows a chi-squared distribution, and we derive an a posteriori noise variance factor from this quadratic form. To detect blunders in the observations, we propose the estimated standardized CVE as a test statistic that can be applied whether the noise variances are known or unknown. We use LSC together with the proposed methods to interpolate crustal subsidence on the northern coast of the Gulf of Mexico. The results show that after detecting and removing outliers, the root mean square (RMS) of the CVEs and the estimated noise standard deviation are reduced by about 51% and 59%, respectively. In addition, the RMS of the LSC prediction error at data points and the RMS of the estimated noise of the observations decrease by 39% and 67%, respectively. However, the RMS of the LSC prediction error on a regular grid of interpolation points covering the area is reduced by only about 4%, a consequence of the sparse distribution of data points in this case study. The influence of gross errors on the LSC prediction results is also investigated using lower cutoff CVEs. After elimination of outliers, the RMS of this type of error is also reduced, by 19.5% for a vicinity radius of 5 km. We propose a method that uses standardized CVEs to classify the dataset into three groups with presumed different noise variances. The noise variance components of each group are estimated by the restricted maximum-likelihood method via the Fisher scoring technique. Finally, LSC assessment measures are computed for the estimated heterogeneous noise variance model and compared with those of the homogeneous model. The advantage of the proposed method is the reduction in estimated noise levels for the groups with fewer noisy data points.

## References

Amiri-Simkooei A (2007) Least-squares variance component estimation: theory and GPS applications. Doctoral dissertation, TU Delft, Delft

Arabelos DN, Forsberg R, Tscherning CC (2007) On the a priori estimation of collocation error covariance functions: a feasibility study. Geophys J Int 170:527–533

Baarda W (1968) A testing procedure for use in geodetic networks. In: Geodesy, New series, vol 2, issue 5. Netherlands Geodetic Commission, Delft

Burden RL, Faires JD (2011) Numerical analysis, 9th edn. Brooks/Cole, Pacific Grove

Darbeheshti N, Featherstone WE (2009) Non-stationary covariance function modelling in 2D least-squares collocation. J Geod 83(6):495–508

Dokka RK (2011) The role of deep processes in late 20th century subsidence of New Orleans and coastal areas of southern Louisiana and Mississippi. J Geophys Res Solid Earth 116:B06403. https://doi.org/10.1029/2010jb008008

El-Fiky G, Kato T, Fuji Y (1997) Distribution of vertical crustal movement rates in the Tohoku district, Japan, predicted by least-squares collocation. J Geod 71(7):432–442

Eshagh M, Sjöberg LE (2011) Determination of gravity anomaly at sea level from inversion of satellite gravity gradiometric data. J Geodyn 51(5):366–377

Featherstone WE, Sproule DM (2006) Fitting AusGeoid98 to the Australian height datum using GPS-levelling and least squares collocation: application of a cross-validation technique. Surv Rev 38(301):573–582

Grafarend EW (1976) Geodetic applications of stochastic processes. Phys Earth Planet Inter 12(3):151–179

Grodecki J (1999) Generalized maximum-likelihood estimation of variance components with inverted gamma prior. J Geod 73(7):367–374

Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer, New York

Jarmołowski W (2013) A priori noise and regularization in least squares collocation of gravity anomalies. Geod Cartogr 62(2):199–216

Jarmołowski W (2015) Least squares collocation with uncorrelated heterogeneous noise estimated by restricted maximum likelihood. J Geod 89(6):577–589

Jarmołowski W, Bakuła M (2014) Precise estimation of covariance parameters in least-squares collocation by restricted maximum likelihood. Stud Geophys Geod 58(2):171–189

Kitanidis PK (1983) Statistical estimation of polynomial generalized covariance functions and hydrologic applications. Water Resour Res 19(4):909–921

Koch KR (1977) Least squares adjustment and collocation. Bull Geod 51(2):127–135

Koch KR (1986) Maximum likelihood estimate of variance components. Bull Geod 60(4):329–338

Koch KR (1999) Parameter estimation and hypothesis testing in linear models, 2nd edn. Springer, Berlin

Koch KR (2007) Introduction to Bayesian statistics, 2nd edn. Springer, New York

Koch KR, Kusche J (2002) Regularization of geopotential determination from satellite data by variance components. J Geod 76(5):259–268

Krakiwsky EJ, Biacs ZF (1990) Least squares collocation and statistical testing. Bull Geod 64(1):73–87

Krarup T (1969) A contribution to the mathematical foundation of physical geodesy, pub. 44. Dan Geod Inst, Copenhagen

Kusche J, Klees R (2002) Regularization of gravity field estimation from satellite gravity gradients. J Geod 76(6–7):359–368

Mikhail EM, Ackermann F (1976) Observations and least squares. Harper and Row, New York

Moritz H (1962) Interpolation and prediction of gravity and their accuracy, rep. 24. Inst Geod Phot Cart, Ohio State University, Columbus

Moritz H (1972) Advanced least-squares methods, vol 175. Department of Geodetic Science, Ohio State University, Columbus

Moritz H (1980) Advanced physical geodesy. Herbert Wichmann Verlag, Karlsruhe

Pope AJ (1976) The statistics of residuals and the detection of outliers. NOAA technical report NOS 65 NGS 1

Rummel R, Schwarz KP, Gerstl M (1979) Least squares collocation and regularization. Bull Geod 53(4):343–361

Sadiq M, Tscherning CC, Ahmad Z (2009) An estimation of the height system bias parameter N0 using least squares collocation from observed gravity and GPS-levelling data. Stud Geophys Geod 53(3):375–388

Schaffrin B (2001) Softly unbiased prediction. Part 2: the random effects model. Boll Geod Sci Affini 60(1):49–62

Shinkle KD, Dokka RK (2004) Rates of vertical displacement at benchmarks in the lower Mississippi Valley and the northern Gulf Coast. US Department of Commerce, NOAA technical report NOS/NGS 50

Snow KB (2012) Topics in total least-squares adjustment within the errors-in-variables model: singular cofactor matrices and prior information. Doctoral dissertation, The Ohio State University

Stein ML (1999) Interpolation of spatial data: some theory for kriging. Springer, New York

Teunissen PJG (2000) Testing theory: an introduction. Series on mathematical geodesy and positioning. Delft University Press, Delft

Tscherning CC (1991a) Strategy for gross-error detection in satellite altimeter data applied in the Baltic-sea area for enhanced geoid and gravity determination. In: Determination of the geoid. Springer, New York, pp 95–107

Tscherning CC (1991b) The use of optimal estimation for gross-error detection in databases of spatially correlated data. Bull d’Inf 68:79–89

Vaníček P, Krakiwsky EJ (1986) Geodesy: the concepts. North Holland, Amsterdam

Vestøl O (2006) Determination of postglacial land uplift in Fennoscandia from leveling, tide-gauges and continuous GPS stations using least squares collocation. J Geod 80(5):248–258

Wei M (1987) Statistical problems in collocation. Manuscr Geod 12:282–289

Yang Y, Zeng A, Zhang J (2009) Adaptive collocation with application in height system transformation. J Geod 83(5):403–410

## Acknowledgements

The US National Oceanic and Atmospheric Administration and the US National Geodetic Survey are gratefully acknowledged for providing access to the observational data for this research. We thank the editors and reviewers for their many constructive and insightful comments, which led to major improvements of the manuscript. Dr. Soheil Vasheghani is acknowledged for proofreading the English of the manuscript.

## Appendices

### Appendix A

### 1.1 A lemma in linear algebra

*Notation* For an arbitrary matrix \(\mathbf{D}=\{d_{ij}\}\), \(\mathbf{d}_{i,-i}\) is the *i*th row of \(\mathbf{D}\) with its *i*th element removed, and \(\mathbf{D}_{-i,-i}\) is the same matrix with its *i*th row and column removed.

### Lemma

If \(\mathbf{A}\) represents an arbitrary symmetric positive definite matrix and \(\mathbf{B}=\mathbf{A}^{-1}\), then

### Proof

We define the vector \(\mathbf{d}^{(i)}\) by

where \(\mathbf{b}_i\) is the *i*th row of \(\mathbf{B}\). The *k*th element of \(\mathbf{d}^{(i)}\) then follows directly as

where \(\delta _{ik} \) is the Kronecker delta. Considering the arbitrary vector \(\mathbf{e}^{(i)}\) that is defined by

and using Eq. (A3), one can conclude that:

Finally, the following relations are deduced from Eq. (A5)

Therefore,

Note that any principal submatrix of a positive definite matrix is also positive definite (Harville 1997, p. 214). Therefore, for the positive definite matrix \(\mathbf{A}\), \(\mathbf{A}_{-i,-i}\) is always invertible. \(\square\)
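As a hedged numerical sketch, the following checks a standard identity that is consistent with the notation of this appendix; that it matches the lemma's exact form is an assumption. For symmetric positive definite \(\mathbf{A}\) with \(\mathbf{B}=\mathbf{A}^{-1}\), the relation \(\mathbf{b}_i\mathbf{A}=\mathbf{e}_i^{\mathrm{T}}\) gives \(\mathbf{a}_{i,-i}\mathbf{A}_{-i,-i}^{-1}=-\mathbf{b}_{i,-i}/b_{ii}\) and \(1/b_{ii}=a_{ii}-\mathbf{a}_{i,-i}\mathbf{A}_{-i,-i}^{-1}\mathbf{a}_{i,-i}^{\mathrm{T}}\). The test matrix, the data vector `y`, and all function names are illustrative, and the leave-one-out comparison assumes a zero-mean model without trend parameters.

```python
import numpy as np


def drop(M, i):
    """Return M with its i-th row and column removed (M_{-i,-i})."""
    return np.delete(np.delete(M, i, axis=0), i, axis=1)


def loo_errors_elementwise(C, y):
    """Leave-one-out errors, solving one reduced system per point."""
    n = len(y)
    e = np.empty(n)
    for i in range(n):
        c_i = np.delete(C[i], i)          # c_{i,-i}: row i without element i
        y_rest = np.delete(y, i)          # observations without point i
        e[i] = y[i] - c_i @ np.linalg.solve(drop(C, i), y_rest)
    return e


def loo_errors_direct(C, y):
    """All leave-one-out errors from a single inverse: (B y)_i / b_ii."""
    B = np.linalg.inv(C)
    return (B @ y) / np.diag(B)


rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6))
A = G @ G.T + 6.0 * np.eye(6)             # hypothetical SPD test matrix
B = np.linalg.inv(A)
y = rng.standard_normal(6)                # hypothetical observations

i = 2
a_row = np.delete(A[i], i)                             # a_{i,-i}
lhs = a_row @ np.linalg.inv(drop(A, i))                # a_{i,-i} A_{-i,-i}^{-1}
rhs = -np.delete(B[i], i) / B[i, i]                    # -b_{i,-i} / b_ii
identity_holds = np.allclose(lhs, rhs)
schur_holds = np.isclose(1.0 / B[i, i], A[i, i] - lhs @ a_row)
shortcut_holds = np.allclose(loo_errors_elementwise(A, y),
                             loo_errors_direct(A, y))
print(identity_holds, schur_holds, shortcut_holds)
```

The direct variant needs one inverse of the full matrix instead of one reduced solve per data point, which is the kind of saving that motivates computing all cross-validation errors at once.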

### Appendix B

### 1.1 LSC prediction errors and noise estimation

The LSC prediction error at an unobserved point \(p_0\) is computed by (Moritz 1972, p. 47; Mikhail and Ackermann 1976, p. 422)

where \(\hat{y}_0\) is the prediction of *y* at \(p_0\), \(c_{s_0 s_0}\) is the signal variance, \(\mathbf{c}_{s_0 \mathbf{s}}\) is the cross-covariance vector between the predicted point and the data points, \(\mathbf{a}_0\) is the trend vector for the predicted point, and \(\mathbf{C}_{\hat{\mathbf{x}}\hat{\mathbf{x}}}\) denotes the covariance matrix of the estimated trend parameters, which is computed by the following formula
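For orientation, the standard form of these two expressions in LSC with trend parameters can be sketched as follows. The symbol \(\mathbf{C}\) for the covariance matrix of the observations (signal plus noise) and the reading of \(\mathbf{A}\) as the trend design matrix are assumptions here, and \(\sigma^2_{\hat{y}_0}\) is used as a hypothetical name for the prediction error variance; the original notation may differ.

```latex
\sigma^2_{\hat{y}_0}
  = c_{s_0 s_0}
  - \mathbf{c}_{s_0\mathbf{s}}\,\mathbf{C}^{-1}\,\mathbf{c}_{s_0\mathbf{s}}^{\mathrm{T}}
  + \left(\mathbf{a}_0 - \mathbf{c}_{s_0\mathbf{s}}\,\mathbf{C}^{-1}\mathbf{A}\right)
    \mathbf{C}_{\hat{\mathbf{x}}\hat{\mathbf{x}}}
    \left(\mathbf{a}_0 - \mathbf{c}_{s_0\mathbf{s}}\,\mathbf{C}^{-1}\mathbf{A}\right)^{\mathrm{T}},
\qquad
\mathbf{C}_{\hat{\mathbf{x}}\hat{\mathbf{x}}}
  = \left(\mathbf{A}^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{A}\right)^{-1}
```

The first two terms are the error variance for a known trend; the third term accounts for the uncertainty of the estimated trend parameters.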

The LSC internal error (adopted from Darbeheshti and Featherstone 2009) is the LSC prediction error at an observed point \(p_i\)

where \(\hat{y}_i\) is the prediction of *y* at \(p_i\), \(c_{s_i s_i}\) is the signal variance, \(\mathbf{c}_{s_i \mathbf{s}}\) is the cross-covariance vector between the predicted point and the data points, and \(\mathbf{a}_i\) is the *i*th row of \(\mathbf{A}\).

The noise of the observations in Eq. (1) is always unknown. It can be estimated by the following formula (Moritz 1980, p. 119)
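A minimal numerical sketch of this noise estimate, assuming the usual decomposition \(\mathbf{y}=\mathbf{A}\mathbf{x}+\mathbf{s}+\mathbf{n}\) with \(\mathbf{C}=\mathbf{C}_{ss}+\mathbf{C}_{nn}\), in which the estimated noise takes the form \(\hat{\mathbf{n}}=\mathbf{C}_{nn}\mathbf{C}^{-1}(\mathbf{y}-\mathbf{A}\hat{\mathbf{x}})\). The Gaussian covariance model, the point coordinates, and all numerical values are hypothetical, and the function name `lsc_noise_estimate` is introduced only for illustration.

```python
import numpy as np


def lsc_noise_estimate(A, C_ss, C_nn, y):
    """Return (x_hat, s_hat, n_hat) for the model y = A x + s + n."""
    C = C_ss + C_nn                               # covariance of the observations
    Ci_A = np.linalg.solve(C, A)                  # C^{-1} A
    C_xx = np.linalg.inv(A.T @ Ci_A)              # (A^T C^{-1} A)^{-1}
    x_hat = C_xx @ (Ci_A.T @ y)                   # generalized least-squares trend
    w = np.linalg.solve(C, y - A @ x_hat)         # C^{-1} (y - A x_hat)
    return x_hat, C_ss @ w, C_nn @ w              # trend, signal, noise estimates


rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 100.0, size=(8, 2))        # hypothetical coordinates (km)
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
C_ss = 4.0 * np.exp(-(dist / 30.0) ** 2)          # assumed Gaussian signal covariance
C_nn = 0.25 * np.eye(8)                           # assumed homogeneous noise variance
A = np.column_stack([np.ones(8), pts])            # assumed planar trend
y = rng.standard_normal(8)                        # placeholder observations

x_hat, s_hat, n_hat = lsc_noise_estimate(A, C_ss, C_nn, y)
residual_closure = np.allclose(A @ x_hat + s_hat + n_hat, y)
print(residual_closure)
```

Because the estimated signal and noise are both built from the same vector \(\mathbf{C}^{-1}(\mathbf{y}-\mathbf{A}\hat{\mathbf{x}})\), they always sum to the trend residual, which is what the final check confirms.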

## About this article

### Cite this article

Behnabian, B., Mashhadi Hossainali, M. & Malekzadeh, A. Simultaneous estimation of cross-validation errors in least squares collocation applied for statistical testing and evaluation of the noise variance components.
*J Geod* **92**, 1329–1350 (2018). https://doi.org/10.1007/s00190-018-1122-6

DOI: https://doi.org/10.1007/s00190-018-1122-6

### Keywords

- Cross-validation errors
- Least squares collocation
- Statistical tests
- Blunder detection
- Estimation of noise variance components