Abstract
Cross-validation is a popular method for assessing and improving the quality of prediction by least squares collocation (LSC). We present a formula for direct estimation of the vector of cross-validation errors (CVEs) in LSC, which is much faster than element-wise CVE computation. We show that a quadratic form of the CVEs follows a chi-squared distribution. Furthermore, an a posteriori noise variance factor is derived from this quadratic form. To detect blunders in the observations, the estimated standardized CVE is proposed as a test statistic that can be applied whether the noise variances are known or unknown. We use LSC together with the proposed methods to interpolate crustal subsidence on the northern coast of the Gulf of Mexico. The results show that after detecting and removing outliers, the root mean square (RMS) of the CVEs and the estimated noise standard deviation are reduced by about 51% and 59%, respectively. In addition, the RMS of the LSC prediction error at data points and the RMS of the estimated noise of the observations decrease by 39% and 67%, respectively. However, the RMS of the LSC prediction error on a regular grid of interpolation points covering the area is reduced by only about 4%, a consequence of the sparse distribution of data points in this case study. The influence of gross errors on LSC prediction results is also investigated using lower cutoff CVEs. After elimination of outliers, the RMS of this type of error is also reduced, by 19.5% for a 5 km radius of vicinity. We propose a method that uses standardized CVEs to classify the dataset into three groups with presumed different noise variances. The noise variance components for each group are estimated by the restricted maximum-likelihood method via the Fisher scoring technique. Finally, LSC assessment measures are computed for the estimated heterogeneous noise variance model and compared with those of the homogeneous model. The advantage of the proposed method is the reduction in estimated noise levels for those groups with fewer noisy data points.
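To illustrate why a direct formula can replace element-wise computation, the following minimal sketch assumes a trend-free LSC model with uncorrelated noise. In that case the standard leave-one-out identity \(e_i = [\mathbf{C}^{-1}\mathbf{y}]_i / [\mathbf{C}^{-1}]_{ii}\) holds, with \(\mathbf{C}=\mathbf{C}_{ss}+\mathbf{C}_{nn}\). This is a textbook identity and may differ from the formula derived in the paper, which also handles trend parameters. The function names, the Gaussian covariance model and all numerical values are hypothetical.

```python
import numpy as np

def cve_direct(C, y):
    """All leave-one-out errors at once: e_i = [C^{-1} y]_i / [C^{-1}]_ii.

    Assumes no trend parameters and a diagonal noise covariance, so that the
    off-diagonal part of C equals the signal covariance.
    """
    B = np.linalg.inv(C)
    return (B @ y) / np.diag(B)

def cve_elementwise(C, y):
    """Reference computation: drop point i, predict it from the rest, difference."""
    n = len(y)
    e = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Weights C_{-i,-i}^{-1} c_{-i,i}; one linear solve per data point.
        w = np.linalg.solve(C[np.ix_(keep, keep)], C[keep, i])
        e[i] = y[i] - w @ y[keep]
    return e

def cve_quadratic_form(C, e):
    """Quadratic form e^T D C D e with D = diag(C^{-1}).

    Algebraically equal to y^T C^{-1} y, which is chi-squared with n degrees
    of freedom if y ~ N(0, C) (trend-free assumption).
    """
    v = np.diag(np.linalg.inv(C)) * e
    return v @ C @ v

def demo_problem(n=30, seed=0):
    """Hypothetical test data: Gaussian signal covariance plus white noise."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(0.0, 100.0, size=(n, 2))
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    C = 4.0 * np.exp(-(d / 20.0) ** 2) + 1.0 * np.eye(n)
    y = np.linalg.cholesky(C) @ rng.standard_normal(n)
    return C, y

if __name__ == "__main__":
    C, y = demo_problem()
    e = cve_direct(C, y)
    print(np.allclose(e, cve_elementwise(C, y)))
    print(cve_quadratic_form(C, e) / len(y))  # variance-factor-like ratio
```

The direct version needs one matrix inversion, whereas the element-wise loop solves a separate system of size \(n-1\) for each of the \(n\) points.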
References
Amiri-Simkooei A (2007) Least-squares variance component estimation: theory and GPS applications. Doctoral dissertation, TU Delft, Delft
Arabelos DN, Forsberg R, Tscherning CC (2007) On the a priori estimation of collocation error covariance functions: a feasibility study. Geophys J Int 170:527–533
Baarda W (1968) A testing procedure for use in geodetic networks. In: Geodesy, New series, vol 2, issue 5. Netherlands Geodetic Commission, Delft
Burden RL, Faires JD (2011) Numerical analysis, 9th edn. Brooks/Cole, Pacific Grove
Darbeheshti N, Featherstone WE (2009) Non-stationary covariance function modelling in 2D least-squares collocation. J Geod 83(6):495–508
Dokka RK (2011) The role of deep processes in late 20th century subsidence of New Orleans and coastal areas of southern Louisiana and Mississippi. J Geophys Res Solid Earth 116:B06403. https://doi.org/10.1029/2010jb008008
El-Fiky G, Kato T, Fuji Y (1997) Distribution of vertical crustal movement rates in the Tohoku district, Japan, predicted by least-squares collocation. J Geod 71(7):432–442
Eshagh M, Sjöberg LE (2011) Determination of gravity anomaly at sea level from inversion of satellite gravity gradiometric data. J Geodyn 51(5):366–377
Featherstone WE, Sproule DM (2006) Fitting AusGeoid98 to the Australian height datum using GPS-levelling and least squares collocation: application of a cross-validation technique. Surv Rev 38(301):573–582
Grafarend EW (1976) Geodetic applications of stochastic processes. Phys Earth Planet Inter 12(3):151–179
Grodecki J (1999) Generalized maximum-likelihood estimation of variance components with inverted gamma prior. J Geod 73(7):367–374
Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer, New York
Jarmołowski W (2013) A priori noise and regularization in least squares collocation of gravity anomalies. Geod Cartogr 62(2):199–216
Jarmołowski W (2015) Least squares collocation with uncorrelated heterogeneous noise estimated by restricted maximum likelihood. J Geod 89(6):577–589
Jarmołowski W, Bakuła M (2014) Precise estimation of covariance parameters in least-squares collocation by restricted maximum likelihood. Stud Geophys Geod 58(2):171–189
Kitanidis PK (1983) Statistical estimation of polynomial generalized covariance functions and hydrologic applications. Water Resour Res 19(4):909–921
Koch KR (1977) Least squares adjustment and collocation. Bull Geod 51(2):127–135
Koch KR (1986) Maximum likelihood estimate of variance components. Bull Geod 60(4):329–338
Koch KR (1999) Parameter estimation and hypothesis testing in linear models, 2nd edn. Springer, Berlin
Koch KR (2007) Introduction to Bayesian statistics, 2nd edn. Springer, New York
Koch KR, Kusche J (2002) Regularization of geopotential determination from satellite data by variance components. J Geod 76(5):259–268
Krakiwsky EJ, Biacs ZF (1990) Least squares collocation and statistical testing. Bull Geod 64(1):73–87
Krarup T (1969) A contribution to the mathematical foundation of physical geodesy, pub. 44. Dan Geod Inst, Copenhagen
Kusche J, Klees R (2002) Regularization of gravity field estimation from satellite gravity gradients. J Geod 76(6–7):359–368
Mikhail EM, Ackermann F (1976) Observations and least squares. Harper and Row, New York
Moritz H (1962) Interpolation and prediction of gravity and their accuracy, rep. 24. Inst Geod Phot Cart, Ohio State University, Columbus
Moritz H (1972) Advanced least-squares methods, vol 175. Department of Geodetic Science, Ohio State University, Columbus
Moritz H (1980) Advanced physical geodesy. Herbert Wichmann Verlag, Karlsruhe
Pope AJ (1976) The statistics of residuals and the detection of outliers. NOAA technical report NOS 65 NGS 1
Rummel R, Schwarz KP, Gerstl M (1979) Least squares collocation and regularization. Bull Geod 53(4):343–361
Sadiq M, Tscherning CC, Ahmad Z (2009) An estimation of the height system bias parameter N0 using least squares collocation from observed gravity and GPS-levelling data. Stud Geophys Geod 53(3):375–388
Schaffrin B (2001) Softly unbiased prediction. Part 2: the random effects model. Boll Geod Sci Affini 60(1):49–62
Shinkle KD, Dokka RK (2004) Rates of vertical displacement at benchmarks in the lower Mississippi Valley and the northern Gulf Coast, US Department of Commerce NOAA technical report NOS/NGS 50
Snow KB (2012) Topics in total least-squares adjustment within the errors-in-variables model: singular cofactor matrices and prior information. Doctoral dissertation, The Ohio State University
Stein ML (1999) Interpolation of spatial data: some theory for kriging. Springer, New York
Teunissen PJG (2000) Testing theory: an introduction. Series on mathematical geodesy and positioning. Delft University Press, Delft
Tscherning CC (1991a) Strategy for gross-error detection in satellite altimeter data applied in the Baltic-sea area for enhanced geoid and gravity determination. Determination of the geoid. Springer, New York, pp 95–107
Tscherning CC (1991b) The use of optimal estimation for gross-error detection in databases of spatially correlated data. Bull d’Inf 68:79–89
Vaníček P, Krakiwsky EJ (1986) Geodesy: the concepts. North Holland, Amsterdam
Vestøl O (2006) Determination of postglacial land uplift in Fennoscandia from leveling, tide-gauges and continuous GPS stations using least squares collocation. J Geod 80(5):248–258
Wei M (1987) Statistical problems in collocation. Manuscr Geod 12:282–289
Yang Y, Zeng A, Zhang J (2009) Adaptive collocation with application in height system transformation. J Geod 83(5):403–410
Acknowledgements
We thank the US National Oceanic and Atmospheric Administration and the US National Geodetic Survey for providing access to the observational data for this research. We also thank the editors and reviewers for many constructive and insightful comments that led to major improvements of the manuscript. Dr. Soheil Vasheghani is acknowledged for proofreading the English of the manuscript.
Appendices
Appendix A
A lemma in linear algebra
Notation: For an arbitrary matrix \(\mathbf{D}=\left\{ {d_{ij} } \right\} \), \(\mathbf{d}_{i,-i} \) is the ith row of \(\mathbf{D}\) with its ith element removed, and \(\mathbf{D}_{-i,-i}\) is the same matrix with its ith row and column removed.
Lemma
If \(\mathbf{A}\) represents an arbitrary symmetric positive definite matrix and \(\mathbf{B}=\mathbf{A}^{-1}\), then
Proof
We define the vector \(\mathbf{d}^{(i)}\) by
where \(\mathbf{b}_i \) is the ith row of \(\mathbf{B}\). The kth element of \(\mathbf{d}^{(i)}\) then follows as
where \(\delta _{ik} \) is the Kronecker delta. Considering the arbitrary vector \(\mathbf{e}^{(i)}\) that is defined by
and using Eq. (A3), one can conclude that:
Finally, the following relations are deduced from Eq. (A5)
Therefore,
Note that any principal submatrix of a positive definite matrix is also positive definite (Harville 1997, p. 214). Therefore, for the positive definite matrix \(\mathbf{A}\), \(\mathbf{A}_{-i,-i} \) is always invertible. \(\square \)
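The displayed equations of the lemma are not available in this text, so the following is only a numerical sanity check of closely related, standard block-inverse (Schur complement) identities. For a symmetric positive definite \(\mathbf{A}\) with \(\mathbf{B}=\mathbf{A}^{-1}\), these identities express \(\mathbf{A}_{-i,-i}^{-1}\) through elements of \(\mathbf{B}\). Whether they coincide exactly with the lemma as stated in the paper is an assumption. The helper names and the random test matrix are hypothetical.

```python
import numpy as np

def drop(M, i):
    """Return (M_{-i,-i}, m_{i,-i}): M without row/column i, and row i without element i."""
    keep = np.arange(M.shape[0]) != i
    return M[np.ix_(keep, keep)], M[i, keep]

def check_identities(A, i):
    """Check three standard identities linking A_{-i,-i}^{-1} to B = A^{-1}."""
    B = np.linalg.inv(A)
    A_sub, a_row = drop(A, i)
    B_sub, b_row = drop(B, i)
    A_sub_inv = np.linalg.inv(A_sub)
    b_ii = B[i, i]
    # (1) A_{-i,-i}^{-1} = B_{-i,-i} - b_{-i,i} b_{i,-i} / b_ii
    ok_inverse = np.allclose(A_sub_inv, B_sub - np.outer(b_row, b_row) / b_ii)
    # (2) a_{i,-i} A_{-i,-i}^{-1} = -b_{i,-i} / b_ii
    ok_row = np.allclose(a_row @ A_sub_inv, -b_row / b_ii)
    # (3) a_ii - a_{i,-i} A_{-i,-i}^{-1} a_{i,-i}^T = 1 / b_ii
    ok_schur = np.isclose(A[i, i] - a_row @ A_sub_inv @ a_row, 1.0 / b_ii)
    return ok_inverse, ok_row, ok_schur

def random_spd(n=6, seed=0):
    """Hypothetical symmetric positive definite test matrix."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

if __name__ == "__main__":
    A = random_spd()
    for i in range(A.shape[0]):
        print(i, check_identities(A, i))
```

Identities of this kind are what allow the inverse of every reduced matrix \(\mathbf{A}_{-i,-i}\) to be obtained from a single inversion of \(\mathbf{A}\).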
Appendix B
LSC prediction errors and noise estimation
The LSC prediction error at an unobserved point \(p_0 \) is computed by (Moritz 1972, p. 47; Mikhail and Ackermann 1976, p. 422)
where \(\hat{{y}}_0 \) is the prediction of y at \(p_0 \), \(c_{s_0 s_0 } \) is the signal variance, \(\mathbf{c}_{s_0 \mathbf{s}} \) is the cross-covariance vector between the predicted point and the data points, \(\mathbf{a}_0 \) is the trend vector for the predicted point, and \(\mathbf{C}_{{\hat{\mathbf{x}}\hat{\mathbf{x}}}} \) denotes the covariance matrix of the estimated trend parameters, which is computed by the following formula
The LSC internal error (adopted from Darbeheshti and Featherstone 2009) is the LSC prediction error at an observed point \(p_i \)
where \(\hat{{y}}_i \) is the prediction of y at \(p_i \), \(c_{s_i s_i } \) is the signal variance, \(\mathbf{c}_{s_i \mathbf{s}}\) is the cross-covariance vector between the predicted point and the data points, and \(\mathbf{a}_i \) is the ith row of \(\mathbf{A}\).
The noise of the observations in Eq. (1) is always unknown. It can be estimated by the following formula (Moritz 1980, p. 119)
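The quantities described in this appendix can be sketched numerically as follows. The sketch assumes the standard textbook forms for LSC with trend parameters, with \(\mathbf{C}=\mathbf{C}_{ss}+\mathbf{C}_{nn}\): the trend covariance \((\mathbf{A}^T\mathbf{C}^{-1}\mathbf{A})^{-1}\), the prediction error variance \(c_{s_0 s_0}-\mathbf{c}_{s_0\mathbf{s}}\mathbf{C}^{-1}\mathbf{c}_{s_0\mathbf{s}}^T+\mathbf{g}\,\mathbf{C}_{\hat{\mathbf{x}}\hat{\mathbf{x}}}\,\mathbf{g}^T\) with \(\mathbf{g}=\mathbf{a}_0-\mathbf{c}_{s_0\mathbf{s}}\mathbf{C}^{-1}\mathbf{A}\), and the noise estimate \(\mathbf{C}_{nn}\mathbf{C}^{-1}(\mathbf{y}-\mathbf{A}\hat{\mathbf{x}})\). These forms are assumed to correspond to the formulas cited above, and the function names are hypothetical.

```python
import numpy as np

def lsc_fit(A, C_ss, C_nn, y):
    """Estimate trend parameters under the assumed model y = A x + s + n.

    Returns x_hat, its covariance C_xx = (A^T C^{-1} A)^{-1}, and
    r = C^{-1} (y - A x_hat), which is reused for signal and noise estimates.
    """
    C = C_ss + C_nn
    Ci_A = np.linalg.solve(C, A)             # C^{-1} A
    C_xx = np.linalg.inv(A.T @ Ci_A)
    x_hat = C_xx @ (Ci_A.T @ y)              # C symmetric, so Ci_A^T = A^T C^{-1}
    r = np.linalg.solve(C, y - A @ x_hat)
    return x_hat, C_xx, r

def prediction_error_variance(a0, c0s, c00, A, C, C_xx):
    """Error variance of the LSC prediction at one point (assumed standard form).

    a0  : trend vector of the prediction point
    c0s : cross-covariance between the prediction point and the data points
    c00 : signal variance at the prediction point
    """
    w = np.linalg.solve(C, c0s)              # C^{-1} c_{s s0}
    g = a0 - w @ A                           # a0 - c_{s0 s} C^{-1} A
    return c00 - c0s @ w + g @ C_xx @ g

def estimated_noise(C_nn, r):
    """Noise estimate n_hat = C_nn C^{-1} (y - A x_hat), given r from lsc_fit."""
    return C_nn @ r
```

For the internal error at an observed point \(p_i \), the same function can be called with `a0` set to the ith row of `A` and `c0s` set to the ith row of `C_ss`. The signal estimate at the data points is `C_ss @ r`, so trend, signal and noise estimates add up to the observations.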
Cite this article
Behnabian, B., Mashhadi Hossainali, M. & Malekzadeh, A. Simultaneous estimation of cross-validation errors in least squares collocation applied for statistical testing and evaluation of the noise variance components. J Geod 92, 1329–1350 (2018). https://doi.org/10.1007/s00190-018-1122-6
DOI: https://doi.org/10.1007/s00190-018-1122-6
Keywords
- Cross-validation errors
- Least squares collocation
- Statistical tests
- Blunder detection
- Estimation of noise variance components