Abstract
Yield maps are recognized as a valuable tool for managing upcoming crop production, but they can contain a large amount of defective data that may lead to misleading decisions. These anomalies must be removed before further processing to ensure the quality of future decisions. This paper proposes a new holistic methodology to filter out defective observations likely to be present in yield datasets. The notion of spatial neighbourhood is refined to account for the specific characteristics of such on-the-go, vehicle-based datasets. Observations are compared with their newly defined spatial neighbourhood, and the most abnormal ones are classified as defective using a density-based clustering algorithm. The approach was designed to be as non-parametric and automated as possible, so that a growing number of datasets can be pre-processed without supervision. The proposed approach showed promising results on real yield datasets, detecting well-known sources of error such as filling and emptying times, speed changes and a partially used cutting bar.
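The general idea of comparing each observation with its spatial neighbourhood and then flagging the most abnormal ones with a density-based rule can be illustrated with a minimal sketch. This is not the authors' implementation: the plain k-nearest-neighbour neighbourhood, the function names, the toy data and the values of `k`, `eps` and `min_pts` are all assumptions made for illustration, whereas the paper refines the neighbourhood definition and aims to avoid hand-set parameters.

```python
import math
from statistics import median

def neighbour_deviation(points, k=4):
    """Yield of each (x, y, yield) point minus the median yield of its
    k nearest spatial neighbours (assumed, simplified neighbourhood)."""
    deviations = []
    for i, (xi, yi, zi) in enumerate(points):
        # Sort every other point by Euclidean distance to point i.
        others = sorted(
            (math.hypot(xi - xj, yi - yj), zj)
            for j, (xj, yj, zj) in enumerate(points)
            if j != i
        )
        deviations.append(zi - median(z for _, z in others[:k]))
    return deviations

def dbscan_noise_1d(values, eps, min_pts):
    """Indices that a DBSCAN-style rule labels as noise for 1-D values.

    A point is 'core' if at least min_pts values (itself included) lie
    within eps of it. Noise points are non-core points that have no core
    point within eps.
    """
    n = len(values)
    neigh = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_pts for nb in neigh]
    return [
        i for i in range(n)
        if not core[i] and not any(core[j] for j in neigh[i])
    ]

# Hypothetical harvest pass: ten points 1 m apart, yields in t/ha.
# The fifth point mimics a defective record (e.g. during filling time).
yields = [7.0, 7.1, 6.9, 7.2, 1.0, 7.0, 7.1, 6.8, 7.0, 7.2]
points = [(float(x), 0.0, z) for x, z in enumerate(yields)]

deviations = neighbour_deviation(points, k=4)
flagged = dbscan_noise_1d(deviations, eps=1.0, min_pts=3)
print(flagged)  # [4]
```

Using the median of the neighbours rather than the mean keeps a single defective neighbour from distorting the reference value, so only the defective point itself ends up isolated in deviation space.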








Cite this article
Leroux, C., Jones, H., Clenet, A. et al. A general method to filter out defective spatial observations from yield mapping datasets. Precision Agric 19, 789–808 (2018). https://doi.org/10.1007/s11119-017-9555-0
DOI: https://doi.org/10.1007/s11119-017-9555-0
