From Spatial Data Mining in Precision Agriculture to Environmental Data Mining

  • Georg Ruß
Part of the Studies in Computational Intelligence book series (SCI, volume 445)


The first part of this article outlines the main results of applying data mining methods and algorithms to spatial precision agriculture data sets. In particular, yield prediction is treated as a spatial regression problem. Because the data sets are spatial, several modeling pitfalls that result from spatial autocorrelation are addressed. Based on a cross-validation approach, the yield prediction setting is then used to determine spatial variable importance. A second task, management zone delineation, is briefly outlined: a novel hierarchical, spatially constrained clustering algorithm is presented which aims to provide a tradeoff between the spatial contiguity of the resulting clusters and cluster similarity. The treatment of these two tasks summarizes [26]. The second part of the article briefly lays out the emerging field of environmental data mining.


Random Forest, Spatial Autocorrelation, Support Vector Regression, Hierarchical Agglomerative Cluster, Bayesian Maximum Entropy




  1. Brenning, A.: Spatial prediction models for landslide hazards: review, comparison and evaluation. Natural Hazards and Earth System Science 5(6), 853–862 (2005)
  2. Brenning, A., Lausen, B.: Estimating error rates in the classification of paired organs. Statistics in Medicine 27(22), 4515–4531 (2008)
  3. Cressie, N.A.C.: Statistics for Spatial Data. Wiley, New York (1993)
  4. El-Beltagy, S.R., Rafea, A., Mabrouk, S.: Agrimine: A tool for mining agricultural problems and their solutions. In: Proc. of Int. Computer Engineering Conference (ICENCO), pp. 81–85 (2010)
  5. Equihua, M.: Fuzzy clustering of ecological data. Journal of Ecology 78(2), 519–534 (1990)
  6. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: The KDD process for extracting useful knowledge from volumes of data. Commun. ACM 39, 27–34 (1996)
  7. Fielding, A.: An introduction to machine learning methods. In: Fielding, A. (ed.) Machine Learning Methods for Ecological Applications, pp. 1–35. Kluwer Academic Publishers, Dordrecht (1999)
  8. Gibert, K., Spate, J., Sànchez-Marrè, M., Athanasiadis, I.N., Comas, J.: Data mining for environmental systems. In: Jakeman, A.J., Voinov, A.A., Rizzoli, A.E., Chen, S. (eds.) Environmental Modelling, Software and Decision Support, Developments in Integrated Environmental Assessment, vol. 3, pp. 205–228. Elsevier (2008)
  9. Griffith, D.A.: Spatial Autocorrelation and Spatial Filtering. Advances in Spatial Science. Springer, New York (2003)
  10. Heege, H., Reusch, S., Thiessen, E.: Prospects and results for optical systems for site-specific on-the-go control of nitrogen-top-dressing in Germany. Precision Agriculture 9(3), 115–131 (2008)
  11. Iisaka, J., Sakurai-Amano, T.: Spatial association analysis for radar image interpretation. In: International Geoscience and Remote Sensing Symposium, IGARSS, pp. 1200–1203 (1993)
  12. Kanevski, M. (ed.): Advanced Mapping of Environmental Data: Geostatistics, Machine Learning and Bayesian Maximum Entropy. ISTE, London (2010)
  13. Kanevski, M., Parkin, R., Pozdnukhov, A., Timonin, V., Maignan, M., Demyanov, V., Canu, S.: Environmental data mining and modeling based on machine learning algorithms and geostatistics. Environmental Modelling and Software 19(9), 845–855 (2004)
  14. Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of International Joint Conference on Artificial Intelligence (1995)
  15. Lek, S., Guegan, J.: Application of artificial neural networks in ecological modelling. Ecological Modelling 120(2-3) (1999)
  16. Marcot, B.G., Holthausen, R.S., Raphael, M.G., Rowland, M.M., Wisdom, M.J.: Using Bayesian belief networks to evaluate fish and wildlife population viability under land management alternatives from an environmental impact statement. Forest Ecology and Management 153, 29–42 (2001)
  17. Meier, U.: Entwicklungsstadien mono- und dikotyler Pflanzen. In: Biologische Bundesanstalt für Land- und Forstwirtschaft, Braunschweig, Germany (2001)
  18. Moran, P.A.P.: Notes on continuous stochastic phenomena. Biometrika 37, 17–33 (1950)
  19. Papajorgji, P.J., Pardalos, P.M. (eds.): Advances in Modeling Agricultural Systems. Springer Optimization and Its Applications, vol. 25. Springer (2009)
  20. Petcu, D., Zaharie, D., Panica, S., Hussein, A.S., Sayed, A., El-Shishiny, H.: Fuzzy clustering of large satellite images using high performance computing. In: Proc. of SPIE. SPIE, vol. 8183 (2011)
  21. Pozdnoukhov, A., Foresti, L., Kanevski, M.: Data-driven topo-climatic mapping with machine learning methods. Natural Hazards 50(3), 497–518 (2009)
  22. Prasad, A.M., Iverson, L.R., Liaw, A.: Newer classification and regression tree techniques: Bagging and random forests for ecological prediction. Ecosystems 9, 181–199 (2006)
  23. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2009), ISBN 3-900051-07-0
  24. Ruß, G.: Data Mining of Agricultural Yield Data: A Comparison of Regression Models. In: Perner, P. (ed.) ICDM 2009. LNCS (LNAI), vol. 5633, pp. 24–37. Springer, Heidelberg (2009)
  25. Ruß, G.: Hacc-spatial: Hierarchical agglomerative spatially constrained clustering. In: Bichindaritz, I., Perner, P., Ruß, G. (eds.) 11th ICDM Conference, New York, USA, Workshop Proceedings. IBaI Publishing, Leipzig (2011)
  26. Ruß, G.: Spatial Data Mining in Precision Agriculture. PhD thesis, Otto-von-Guericke-Universität Magdeburg (2012)
  27. Ruß, G., Brenning, A.: Data Mining in Precision Agriculture: Management of Spatial Information. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds.) IPMU 2010. LNCS, vol. 6178, pp. 350–359. Springer, Heidelberg (2010)
  28. Ruß, G., Kruse, R., Schneider, M., Wagner, P.: Estimation of neural network parameters for wheat yield prediction. In: Artificial Intelligence in Theory and Practice II. IFIP, vol. 276, pp. 109–118. Springer, Boston (2008)
  29. Ruß, G., Kruse, R., Schneider, M., Wagner, P.: Optimizing wheat yield prediction using different topologies of neural networks. In: Verdegay, J.L., Ojeda-Aciego, M., Magdalena, L. (eds.) Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2008), pp. 576–582. University of Málaga (2008)
  30. Stein, M.L.: Interpolation of Spatial Data: Some Theory for Kriging. Springer Series in Statistics. Springer (1999)
  31. Su, F., Zhou, C., Lyne, V., Du, Y., Shi, W.: A data-mining approach to determine the spatio-temporal relationship between environmental factors and fish distribution. Ecological Modelling 174, 421–431 (2004)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. TecData AG, Uzwil, Switzerland
