Predicting coffee yield based on agroclimatic data and machine learning

  • Original Paper
Theoretical and Applied Climatology

Abstract

Climate influences agriculture both directly and indirectly and is the main driver of low and high yields. Advance knowledge of yield helps coffee farmers make decisions and plan for the coming harvest, avoiding unnecessary costs and losses during harvesting. We therefore sought to predict coffee yield with regression models using meteorological data from the state of Paraná, Brazil. The study covered 15 localities that produce Coffea arabica in the state. Climate data for 1989 to 2020 were collected from the NASA/POWER platform, and arabica coffee yield data (bags ha−1) for 2003 to 2018 were obtained from CONAB. Reference evapotranspiration was calculated with the Penman–Monteith method, and the climatological water balance (WB) was calculated following Thornthwaite and Mather (1955). Multiple linear regression was used to model the data, with C. arabica yield as the dependent variable and air temperature, precipitation, solar radiation, water deficit, water surplus, and soil water storage as the independent variables. The estimation models were compared with the actual data using the RMSE (accuracy) and the adjusted coefficient of determination (R2adj) (precision). Multiple linear regression models can predict arabica coffee yield in the state of Paraná 2 to 3 months before harvest. Maximum air temperature is the climate element that most influences coffee plants, especially during fruit formation (March). Maximum air temperatures of 31.01 °C in March can reduce coffee production. Wenceslau Braz, Jacarezinho, and Ibaiti presented the highest yields, with mean values of 32.5, 29.9, and 29.3 bags ha−1, respectively. The models calibrated for localities with Argisol had the highest mean accuracy, with an RMSE of 2.68 bags ha−1.
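
The water deficit, water surplus, and soil water storage used as predictors come from the climatological water balance. The bookkeeping can be sketched in Python as a sequential Thornthwaite and Mather style balance. This is only an illustration under stated assumptions: the function name `water_balance`, the available water capacity of 100 mm, the full soil profile at the start, and the two example periods are hypothetical and are not taken from the study.

```python
import math


def water_balance(precip, et0, awc=100.0):
    """Sequential Thornthwaite-Mather style water balance (illustrative sketch).

    precip, et0: per-period precipitation and reference evapotranspiration (mm).
    awc: available water capacity of the soil (mm); 100 mm is an assumption.
    Returns one dict per period with storage, actual ET, deficit and surplus.
    """
    storage = awc  # assumption: the soil starts at full capacity
    rows = []
    for p, e in zip(precip, et0):
        diff = p - e
        if diff < 0:
            # Drying period: storage decays exponentially with the
            # accumulated potential water loss implied by current storage.
            acc_neg = awc * math.log(storage / awc) + diff
            new_storage = awc * math.exp(acc_neg / awc)
        else:
            # Wetting period: refill the soil, capped at capacity.
            new_storage = min(storage + diff, awc)
        change = new_storage - storage
        if diff < 0:
            eta = p - change  # rainfall plus water withdrawn from the soil
            surplus = 0.0
        else:
            eta = e  # demand is fully met
            surplus = diff - change  # water left after refilling the soil
        rows.append({
            "storage": new_storage,
            "eta": eta,
            "deficit": e - eta,
            "surplus": surplus,
            "change": change,
        })
        storage = new_storage
    return rows


# Hypothetical two-period example: one wet period followed by one dry period.
example = water_balance(precip=[150.0, 20.0], et0=[100.0, 100.0])
for row in example:
    print({k: round(v, 2) for k, v in row.items()})
```

In each period, precipitation equals actual evapotranspiration plus surplus plus the change in storage, which is a quick consistency check on any implementation of this balance.
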
The best models were calibrated for Paranavaí (Latosol), with an RMSE of 0.78 bags ha−1 and R2adj of 0.89, and Ibaiti (Argisol), with an RMSE of 3.09 bags ha−1 and R2adj of 0.83. The mean difference between actual and estimated coffee yield was lowest in Paranavaí (0.86 bags ha−1) and highest in Wenceslau Braz (9.17 bags ha−1). The models can be used to predict arabica coffee yield and to support planning by coffee farmers in the northern region of the state of Paraná.
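
To illustrate the evaluation statistics reported above, the following sketch fits a multiple linear regression by ordinary least squares and computes RMSE and the adjusted coefficient of determination. It uses NumPy and synthetic data only. The helper names, the two predictors, and the coefficients are hypothetical and unrelated to the authors' scripts or results.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a predictor matrix."""
    return coef[0] + X @ coef[1:]


def rmse(y, y_hat):
    """Root mean square error, in the units of y."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def adjusted_r2(y, y_hat, n_predictors):
    """R2 penalised for the number of predictors in the model."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1))


# Synthetic data (NOT the study's data): 16 seasons, two made-up predictors
# standing in for March maximum temperature (deg C) and water deficit (mm).
rng = np.random.default_rng(0)
n_years = 16
X = np.column_stack([
    rng.normal(30.0, 1.5, n_years),
    rng.normal(120.0, 40.0, n_years),
])
y = 80.0 - 2.0 * X[:, 0] + 0.05 * X[:, 1] + rng.normal(0.0, 1.0, n_years)

coef = fit_mlr(X, y)
y_hat = predict(coef, X)
print("coefficients:", np.round(coef, 3))
print("RMSE:", round(rmse(y, y_hat), 3))
print("adjusted R2:", round(adjusted_r2(y, y_hat, X.shape[1]), 3))
```

With only 16 observations per locality, adjusted R2 is a more conservative summary than plain R2 because each added predictor reduces the residual degrees of freedom.
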


Figs. 1 to 13 (images not included). Fig. 2 is adapted from Aparecido et al. (2018).

Data availability

The data and materials are openly available.

Code availability

The software used was Python, and the scripts are available.

Funding

This study was funded by IFMS Campus of Naviraí and IFSULDEMINAS Campus Muzambinho.

Author information

Contributions

Lucas Eduardo de Oliveira Aparecido: formal analysis, writing—original draft, writing—review and editing, investigation, conceptualization, methodology, supervision, project administration; João Antonio Lorençone: writing—review and editing; Pedro Antonio Lorençone: writing—review and editing; Guilherme Botega Torsoni: writing—review and editing; Rafael Fausto Lima: writing—review and editing; José Reinaldo da Silva Cabral de Moraes: writing—review and editing.

Corresponding author

Correspondence to Lucas Eduardo de Oliveira Aparecido.

Ethics declarations

Ethics approval.

Not applicable.

Consent to participate.

Not applicable.

Consent for publication.

Not applicable.

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 4965 KB)

About this article

Cite this article

de Oliveira Aparecido, L.E., Lorençone, J.A., Lorençone, P.A. et al. Predicting coffee yield based on agroclimatic data and machine learning. Theor Appl Climatol 148, 899–914 (2022). https://doi.org/10.1007/s00704-022-03983-z

  • DOI: https://doi.org/10.1007/s00704-022-03983-z