The Missing Data Problem for a Two-Dimensional Surface

Griffith, Daniel A.

doi:10.1007/978-94-009-2758-2_6

Part of the book series: Advanced Studies in Theoretical and Applied Econometrics (ASTA, volume 12)

Abstract

The topics, themes, and subject matter of Chapters 1 through 5 are of a reference nature. Beyond the complexities discussed in those chapters, the field of spatial statistics also faces several prominent, difficult, and unresolved problems. One of these concerns incomplete data, which is the topic of this chapter. Geographical data sets sometimes contain missing observations that need to be estimated. An exact maximum likelihood solution to this problem is discussed, in terms of both parameter and missing value estimation, for multivariate normal spatial data sets satisfying the first-order spatial Markov property with constant mean. Information at neighboring or contiguous observed sites is used to estimate the missing values, and the completed spatial distribution is then used to estimate model parameters. The solution procedure is iterative and is akin to the Orchard and Woodbury missing information principle. Results are reported for extensions to a second-order, simultaneous model, and for an empirical example used to explore the behavior of these estimates. Tentative simulation experiment findings are also reviewed.
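The alternation described in the abstract (estimate missing values from neighboring observed sites, then re-estimate parameters from the completed surface, and repeat) can be illustrated with a small sketch. This is not the chapter's own derivation, which is not reproduced here. It assumes a conditional autoregressive Gaussian model on a regular lattice with rook adjacency, precision matrix proportional to I − ρW, a known ρ, and only the constant mean μ left to estimate. The function names `rook_adjacency` and `impute_and_estimate_mean`, the lattice size, and the value of ρ are all hypothetical choices made for illustration.

```python
import numpy as np

def rook_adjacency(nrows, ncols):
    """Binary rook (edge-sharing) adjacency matrix for an nrows x ncols lattice."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:            # neighbor in the next row
                W[i, i + ncols] = W[i + ncols, i] = 1.0
            if c + 1 < ncols:            # neighbor in the next column
                W[i, i + 1] = W[i + 1, i] = 1.0
    return W

def impute_and_estimate_mean(y, missing, W, rho, n_iter=200, tol=1e-10):
    """Alternate between imputing missing sites and re-estimating the mean.

    Assumption: precision matrix proportional to Q = I - rho * W with rho known.
    The variance scale cancels in both steps, so it is omitted.
    """
    n = len(y)
    Q = np.eye(n) - rho * W
    obs = ~missing
    ones = np.ones(n)
    Q_mm = Q[np.ix_(missing, missing)]
    Q_mo = Q[np.ix_(missing, obs)]

    def conditional_mean(mu):
        # E[y_m | y_o] = mu - Q_mm^{-1} Q_mo (y_o - mu); uses neighboring sites only
        return mu - np.linalg.solve(Q_mm, Q_mo @ (y[obs] - mu))

    y_hat = y.copy()
    mu = y[obs].mean()                   # starting value: mean of observed sites
    for _ in range(n_iter):
        y_hat[missing] = conditional_mean(mu)             # fill in missing values
        mu_new = (ones @ Q @ y_hat) / (ones @ Q @ ones)   # GLS mean of completed data
        converged = abs(mu_new - mu) < tol
        mu = mu_new
        if converged:
            break
    y_hat[missing] = conditional_mean(mu)  # final imputation at the converged mean
    return mu, y_hat

# Usage on simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
W = rook_adjacency(5, 5)
rho = 0.2                                 # small enough that I - rho*W is positive definite
Q = np.eye(25) - rho * W
y_full = 10.0 + np.linalg.cholesky(np.linalg.inv(Q)) @ rng.standard_normal(25)
missing = np.zeros(25, dtype=bool)
missing[[6, 12, 18]] = True
y = y_full.copy()
y[missing] = np.nan                       # mask three interior sites
mu_hat, y_completed = impute_and_estimate_mean(y, missing, W, rho)
```

Under these assumptions the iteration settles on the same mean that a direct generalized least squares fit to the observed sites alone would give, which is a convenient check on the code. Estimating ρ as well would require an additional likelihood maximization step that this sketch leaves out.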

Author information

Authors and Affiliations

University of New York, Buffalo, USA
Daniel A. Griffith

About this chapter

Cite this chapter

Griffith, D.A. (1988). The Missing Data Problem for a Two-Dimensional Surface. In: Advanced Spatial Statistics. Advanced Studies in Theoretical and Applied Econometrics, vol 12. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-2758-2_6

DOI: https://doi.org/10.1007/978-94-009-2758-2_6
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-7739-2
Online ISBN: 978-94-009-2758-2
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics