Comparing Exact and Approximate Spatial Auto-regression Model Solutions for Spatial Data Analysis

  • Baris M. Kazar
  • Shashi Shekhar
  • David J. Lilja
  • Ranga R. Vatsavai
  • R. Kelley Pace
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3234)

Abstract

The spatial auto-regression (SAR) model is a popular spatial data analysis technique, which has been used in many applications with geo-spatial datasets. However, exact solutions for estimating SAR parameters are computationally expensive due to the need to compute all the eigenvalues of a very large matrix. Recently we developed a dense-exact parallel formulation of the SAR parameter estimation procedure using data parallelism and a hybrid programming technique. Though this parallel implementation showed scalability up to eight processors, the exact solution still suffers from high computational complexity and memory requirements. These limitations have led us to investigate approximate solutions for SAR model parameter estimation with the main objective of scaling the SAR model for large spatial data analysis problems. In this paper we present two candidate approximate-semi-sparse solutions of the SAR model based on Taylor series expansion and Chebyshev polynomials. Our initial experiments showed that these new techniques scale well for very large data sets, such as remote sensing images having millions of pixels. The results also show that the differences between exact and approximate SAR parameter estimates are within 0.7% and 8.2% for Chebyshev polynomials and Taylor series expansion, respectively, and have no significant effect on the prediction accuracy.
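To make the cost trade-off in the abstract concrete, here is a minimal sketch of the log-determinant term that dominates SAR maximum-likelihood estimation. It assumes the standard identities ln|I - ρW| = Σ ln(1 - ρλ_i) (exact, via all eigenvalues of the spatial weight matrix W) and ln|I - ρW| = -Σ_k ρ^k tr(W^k)/k (Taylor series, truncated). The function names, the small 6 x 6 grid, the 4-neighbor row-normalized W, and the dense NumPy arrays are all illustrative assumptions and are not taken from the paper, which targets much larger problems with sparse operations.

```python
import numpy as np


def grid_weight_matrix(rows, cols):
    """Row-normalized 4-neighbor weight matrix for a rows x cols grid (hypothetical helper)."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    # Divide each row by its sum so that every row of W sums to 1.
    return W / W.sum(axis=1, keepdims=True)


def logdet_exact(W, rho):
    """ln|I - rho*W| from all eigenvalues of W (the expensive, exact route)."""
    lam = np.linalg.eigvals(W)
    return float(np.sum(np.log(1.0 - rho * lam)).real)


def logdet_taylor(W, rho, terms=30):
    """Truncated Taylor series: -sum_{k=1..terms} rho^k * tr(W^k) / k."""
    total = 0.0
    Wk = np.eye(W.shape[0])
    for k in range(1, terms + 1):
        Wk = Wk @ W  # Wk now holds W^k
        total -= rho**k * np.trace(Wk) / k
    return total


W = grid_weight_matrix(6, 6)
exact = logdet_exact(W, 0.5)
approx = logdet_taylor(W, 0.5, terms=40)
print(exact, approx)
```

In this sketch the series converges because |ρ| < 1 and the eigenvalues of a row-normalized W lie in [-1, 1]; convergence slows as |ρ| approaches 1, so more terms are needed there. A practical implementation would avoid forming dense powers of W.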

Keywords

  • Chebyshev Polynomial
  • Taylor Series Expansion
  • Large Problem Size
  • Spatial Weight Matrix
  • Spatial Data Analysis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Baris M. Kazar (1)
  • Shashi Shekhar (2)
  • David J. Lilja (1)
  • Ranga R. Vatsavai (2)
  • R. Kelley Pace (3)
  1. Electrical and Computer Engineering Department, University of Minnesota, Twin-Cities, USA
  2. Computer Science and Engineering Department, University of Minnesota, Twin-Cities, USA
  3. LREC Endowed Chair of Real Estate, 2164B CEBA, Department of Finance, E.J. Ourso College of Business, Louisiana State University, Baton Rouge, USA
