Agribusiness Intelligence: Grape Production Forecast Using Data Mining Techniques

  • Rosana Cavalcante de Oliveira
  • João Mendes-Moreira
  • Carlos Abreu Ferreira
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 747)


Agribusiness volatility is related to the uncertainty of the environment, rising demand, falling prices and new technologies. However, the volume of agricultural data generated has increased in recent years, and this data can support a growing number of data mining applications in agriculture. A multidisciplinary approach that integrates computer science with agriculture supports the decisions needed to mitigate risks and maximize profits. The present study analyzes different regression methods applied to a case study of grape production forecasting. The selected methods were multivariate linear regression, regression trees, lasso and random forest. Their performance was compared against the predictions obtained by the company using the mean squared error and the coefficient of variation. All four regression methods obtained better predictive results than the method used by the company, at a statistical significance level below 0.5%.
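As a rough illustration of the two comparison measures named above, the sketch below computes the mean squared error of two sets of forecasts and the coefficient of variation of their squared errors. It makes several assumptions: the abstract does not say which quantity the coefficient of variation is applied to, so applying it to per-observation squared errors is only one possible reading; the function names are hypothetical; and the numbers are toy values, not data from the study.

```python
from statistics import mean, pstdev


def mean_squared_error(actual, predicted):
    """Average of the squared differences between observed and forecast values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed definition)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(values) / m


# Hypothetical toy numbers, NOT data from the study.
observed = [10.0, 12.0, 9.0, 11.0]
baseline_forecast = [13.0, 9.0, 12.0, 8.0]  # stands in for the company's method
model_forecast = [11.0, 12.0, 8.0, 11.0]    # stands in for a regression model

baseline_mse = mean_squared_error(observed, baseline_forecast)
model_mse = mean_squared_error(observed, model_forecast)

# Spread of the model's squared errors relative to their mean (assumed usage).
model_sq_errors = [(a - p) ** 2 for a, p in zip(observed, model_forecast)]
model_cv = coefficient_of_variation(model_sq_errors)

print(f"baseline MSE = {baseline_mse}, model MSE = {model_mse}, model CV = {model_cv}")
```

A lower mean squared error indicates forecasts that are closer to the observed values on average. A significance test on the paired errors would be a separate step and is not shown here.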


Production forecast; Data mining; Agribusiness; Grapes



Rosana Cavalcante acknowledges the grant from project Smartfarming (POCI-01-0247-FEDER-018029), co-financed by the ERDF through COMPETE 2020 under P2020 Partnership Agreement and by the Portuguese Agency ANI. This work is also funded by the ERDF through COMPETE 2020 within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT as part of project UID/EEA/50014/2013.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Rosana Cavalcante de Oliveira (1) (corresponding author)
  • João Mendes-Moreira (1)
  • Carlos Abreu Ferreira (1)
  1. LIAAD-INESC TEC, Campus da FEUP, Porto, Portugal