A Rainfall Prediction Tool for Sustainable Agriculture Using Random Forest

  • Cristian Valencia-Payan
  • Juan Carlos Corrales
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11288)


In recent years, the world’s governments have focused their efforts on developing sustainable agriculture, in which all resources, especially water resources, are used in a more environmentally friendly manner. In this paper, we present an approach for estimating daily accumulated rainfall from multi-spatial-scale, multi-source data using Machine Learning algorithms for three High Andean Basins (HABs) in the Andean Region of Colombia, where agriculture is one of the main productive activities. The proposed approach uses data from different rain-related variables, such as vegetation index, elevation, rain rate, and temperature, with the aim of developing a rain forecast able to respond to local or large-scale rain events. The results show that the trained model can detect local rain events even when no meteorological station data are used.
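As a rough illustration of the kind of model the abstract describes, the following minimal sketch fits a Random Forest regressor to the four predictor types named above. It is not the authors' pipeline: the use of scikit-learn, the variable names, the value ranges, and the synthetic target relationship are all assumptions made only so the example runs, and the data are random placeholders rather than satellite or station measurements.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Hypothetical predictors, one row per grid cell and day (synthetic placeholders).
ndvi = rng.uniform(0.1, 0.9, n)             # vegetation index
elevation = rng.uniform(1500.0, 4000.0, n)  # elevation in metres
rain_rate = rng.gamma(2.0, 1.5, n)          # rain rate in mm/h
temperature = rng.uniform(5.0, 25.0, n)     # temperature in degrees Celsius
X = np.column_stack([ndvi, elevation, rain_rate, temperature])

# Synthetic daily accumulated rainfall (mm): an invented relationship,
# clipped at zero, used only to give the model something to learn.
noise = rng.normal(0.0, 0.5, n)
y = np.clip(3.0 * rain_rate + 2.0 * ndvi - 0.1 * temperature + noise, 0.0, None)

# Hold out 20% of the rows to check the fit on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
rmse = float(np.sqrt(np.mean((predictions - y_test) ** 2)))
print(f"RMSE on held-out synthetic data: {rmse:.2f} mm")
```

With real inputs, each column would come from a different source at its own spatial resolution, so the rows would first have to be resampled to a common grid; that step is outside the scope of this sketch.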


Rainfall, Machine learning, Cubist, CART, Random forest, Multiscale data, High Andean Basin, Sustainable agriculture



The authors are grateful to the Telematics Engineering Group (GIT) and the Optics and Laser Group (GOL) of the University of Cauca, the University of Cauca, Meteoblue, the RICCLISA Program, and the AgroCloud project for supporting this research, as well as to the AQUARISC program for the PhD support granted to Cristian Valencia-Payan.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Universidad del Cauca, Popayán, Colombia
