Protocol for automating error removal from yield maps

  • Andrés Vega
  • Mariano Córdoba
  • Mauricio Castro-Franco
  • Mónica Balzarini


Yield mapping is one of the most widely used precision farming technologies. However, the value of the maps can be compromised by systematic and random errors in raw within-field data. In this paper, an automated method to clean yield maps is proposed to ensure the quality of further data processing and management decisions. First, data were screened by filtering null and edge yield values as well as global outliers. Second, spatial outliers, or local defective observations, were deleted. The local Moran’s index of spatial autocorrelation and Moran’s plot were used as tools to identify the spatial outliers. The protocol to filter out global and local outliers was evaluated on 595 real yield datasets from different grain crops. Significant improvements in the distribution and spatial structure of the yield datasets were found. Approximately 30% of the data were removed from each monitor dataset, with one third of the removal occurring during filtering of spatial outliers. Automated removal of null values, edge yield values and global outliers improved yield distributions, whereas the cleaning of local outliers affected the yield spatial structure for all yield maps and crops. The proposed algorithm is easy to apply for preprocessing the growing number of available yield maps.
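
The two-stage screening described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the global filter uses a mean ± k standard deviations rule with k = 3, neighbours are defined by a fixed distance band (`radius`), and a point is flagged as a spatial outlier whenever its local Moran's index falls below a hypothetical `cutoff` of 0. Edge removal, the significance test on the local index, and the Moran's plot diagnostics are all omitted. The function names and the toy dataset are hypothetical.

```python
import math
from statistics import mean, pstdev


def screen_global(records, k=3.0):
    """Drop null or non-positive yields, then values beyond mean +/- k*SD.

    records: list of (x, y, yield) tuples. The k = 3 rule is an assumption.
    """
    valid = [r for r in records if r[2] is not None and r[2] > 0]
    ys = [r[2] for r in valid]
    m, s = mean(ys), pstdev(ys)
    return [r for r in valid if abs(r[2] - m) <= k * s]


def local_moran(records, radius):
    """Local Moran's I_i with row-standardised distance-band weights.

    Returns None for points that have no neighbour within `radius`.
    """
    ys = [r[2] for r in records]
    m = mean(ys)
    z = [y - m for y in ys]                  # deviations from the mean
    m2 = sum(d * d for d in z) / len(z)      # second moment (variance)
    out = []
    for i, (xi, yi, _) in enumerate(records):
        nb = [j for j, (xj, yj, _) in enumerate(records)
              if j != i and math.hypot(xi - xj, yi - yj) <= radius]
        if not nb:
            out.append(None)
            continue
        lag = sum(z[j] for j in nb) / len(nb)  # mean deviation of neighbours
        out.append(z[i] * lag / m2)
    return out


def remove_spatial_outliers(records, radius, cutoff=0.0):
    """Keep points whose local index is >= cutoff (hypothetical rule).

    A negative index means the point differs from its neighbourhood.
    Points without neighbours are kept, since the index is undefined.
    """
    ii = local_moran(records, radius)
    return [r for r, v in zip(records, ii) if v is None or v >= cutoff]


# Toy field: a low-yield zone (x < 3) and a high-yield zone (x >= 3),
# with one low value planted inside the high-yield zone at (4, 2).
grid = [(float(x), float(y), 4.0 if x < 3 else 8.0)
        for x in range(6) for y in range(6)]
grid = [(x, y, 4.0) if (x, y) == (4.0, 2.0) else (x, y, v)
        for x, y, v in grid]

# Add a null record, a zero yield and an implausibly large yield.
raw = grid + [(9.0, 9.0, None), (9.0, 8.0, 0.0), (9.0, 7.0, 60.0)]

screened = screen_global(raw)                            # stage 1
cleaned = remove_spatial_outliers(screened, radius=1.5)  # stage 2
print(len(raw), len(screened), len(cleaned))
```

In this toy field the planted value at (4, 2) is not a global outlier, because 4.0 is a common yield in the low zone, so only the local step can detect it. This mirrors the distinction the abstract draws between filtering global outliers and cleaning local ones.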


Keywords: Global outliers, Local outliers, Spatial data mining



We thank the Argentinian National Scientific and Technological Promotion Agency (ANPCyT-PICT 2014-1071), Ministry of Science and Technology of Córdoba province (MinCyT-PIODO), Science and Technology Secretary of National University of Córdoba (SECyT-UNC), and the National Scientific and Technical Research Council (CONICET), for their support of this research.

Supplementary material

Supplementary material 1: 11119_2018_9632_MOESM1_ESM.txt (TXT, 5 kB)



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Chair of Statistics and Biometrics, School of Agricultural Sciences, National University of Córdoba (UNC), Córdoba, Argentina
  2. CONICET, National Scientific and Technical Research Council, Buenos Aires, Argentina
