Feature Selection for Wheat Yield Prediction

  • Georg Ruß
  • Rudolf Kruse
Conference paper


Carrying out effective and sustainable agriculture has become an important issue in recent years. Agricultural production has to keep up with an ever-increasing population by taking advantage of a field’s heterogeneity. Nowadays, modern technology such as the global positioning system (GPS) and a multitude of newly developed sensors enables farmers to measure their fields’ heterogeneities more precisely. For this small-scale, precise treatment the term precision agriculture has been coined. However, the large amounts of data that are (literally) harvested during the growing season have to be analysed. In particular, the farmer is interested in knowing whether a newly developed heterogeneity sensor is potentially advantageous or not. Since the sensor data are readily available, this issue should be seen from an artificial intelligence perspective, where it can be treated as a feature selection problem. The additional task of yield prediction can be treated as a multi-dimensional regression problem. This article presents an approach towards solving these two practically important problems using artificial intelligence and data mining ideas and methodologies.
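The two tasks named in the abstract — judging whether a sensor adds predictive value (feature selection) and predicting yield from sensor readings (regression) — can be sketched together as greedy forward selection driven by cross-validated root mean square error of a support vector regressor. This is a minimal illustration, not the authors' procedure: the data are synthetic, scikit-learn is assumed available, and the feature names (EC25, REIP32, N1, NOISE) are hypothetical stand-ins for typical precision-agriculture sensor variables.

```python
# Sketch: yield prediction as regression plus greedy forward feature
# selection. All data and feature names here are synthetic placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n = 200
# Four hypothetical sensors; the last one carries no signal at all.
X = rng.normal(size=(n, 4))
y = (5.0 + 1.5 * X[:, 0] + 0.8 * X[:, 1] + 0.3 * X[:, 2]
     + rng.normal(scale=0.5, size=n))
names = ["EC25", "REIP32", "N1", "NOISE"]

def cv_rmse(features):
    """5-fold cross-validated RMSE of an SVR on the given feature subset."""
    scores = cross_val_score(SVR(), X[:, features], y,
                             scoring="neg_root_mean_squared_error", cv=5)
    return -scores.mean()

# Greedy forward selection: repeatedly add the feature that most
# reduces the cross-validated RMSE; stop when no feature improves it.
selected, remaining, best = [], list(range(X.shape[1])), float("inf")
while remaining:
    errors = {f: cv_rmse(selected + [f]) for f in remaining}
    f, err = min(errors.items(), key=lambda kv: kv[1])
    if err >= best:
        break  # adding any further feature no longer helps
    selected.append(f)
    remaining.remove(f)
    best = err

print("selected:", [names[f] for f in selected])
```

A sensor that survives this selection demonstrably lowers the prediction error and is thus, in the abstract's sense, "potentially advantageous"; one that never enters the subset adds no predictive value under this model.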


Keywords: Global Positioning System, Root Mean Square Error, Feature Selection, Regression Tree, Support Vector Regression

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag London 2010

Authors and Affiliations

  1. Otto-von-Guericke-Universität, Magdeburg, Germany
