Agribusiness Intelligence: Grape Production Forecast Using Data Mining Techniques
Volatility in agribusiness is related to environmental uncertainty, rising demand, falling prices and new technologies. However, the volume of agricultural data generated has grown in recent years, and these data can support a growing number of data mining applications in agriculture. A multidisciplinary approach that integrates computer science with agriculture supports the decisions needed to mitigate risks and maximize profits. The present study analyzes different regression methods applied to a case study of grape production forecasting. The selected methods were multivariate linear regression, regression trees, lasso and random forest. Their performance was compared against the predictions obtained by the company, using the mean squared error and the coefficient of variation. All four regression methods produced better predictions than the method used by the company, with statistical significance < 0.5%.
Keywords: Production forecast, Data mining, Agribusiness, Grapes
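The two comparison metrics named in the abstract can be illustrated with a minimal sketch. It assumes the coefficient of variation is taken over the absolute forecast errors (standard deviation divided by mean); the abstract does not state how the study applied it, so this is one plausible reading rather than the authors' procedure. All figures and the names `actual`, `baseline` and `model` are hypothetical.

```python
from statistics import mean, pstdev


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean([(a - p) ** 2 for a, p in zip(actual, predicted)])


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (dimensionless spread)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / m


# Hypothetical yields (not data from the study).
actual = [10.0, 12.0, 14.0, 16.0]    # observed production
baseline = [8.0, 15.0, 11.0, 19.0]   # stand-in for the company's forecast
model = [9.0, 13.0, 13.0, 17.0]      # stand-in for a regression model's forecast

mse_baseline = mean_squared_error(actual, baseline)
mse_model = mean_squared_error(actual, model)

# Assumption: the coefficient of variation is computed over absolute errors.
abs_err_baseline = [abs(a - p) for a, p in zip(actual, baseline)]
abs_err_model = [abs(a - p) for a, p in zip(actual, model)]
cv_baseline = coefficient_of_variation(abs_err_baseline)
cv_model = coefficient_of_variation(abs_err_model)

print(f"MSE baseline={mse_baseline:.2f}, model={mse_model:.2f}")
print(f"CV of abs. errors baseline={cv_baseline:.3f}, model={cv_model:.3f}")
```

Under this reading, a lower mean squared error indicates smaller forecast errors on average, while a lower coefficient of variation indicates that the errors are more uniform across observations.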
Rosana Cavalcante acknowledges the grant from project Smartfarming (POCI-01-0247-FEDER-018029), co-financed by the ERDF through COMPETE 2020 under P2020 Partnership Agreement and by the Portuguese Agency ANI. This work is also funded by the ERDF through COMPETE 2020 within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT as part of project UID/EEA/50014/2013.