Abstract
Accurate prediction of rice yield is essential for national food security and the development of the national economy. Currently, owing to the influence of data sources and model parameters, it is difficult to obtain simple and highly accurate models for rice yield prediction. In this study, nine typical rice ecological observation stations in China were selected to build a rice yield prediction model that integrates multi-source data, based on the least squares support vector machine (LSSVM). To improve the accuracy of the model, the genetic optimization algorithm (GA), particle swarm optimization algorithm (PSO), and grey wolf optimization algorithm (GWO) were used to optimize the LSSVM parameters. The significance of the correlation between yield and the influencing factors followed the order: total solar radiation (Ra) > number of spikes (NS) > plant height (H) > average pressure (P) > maximum temperature (Tmax) > relative humidity (RH) > precipitation (Pre) > average surface temperature (Ts) > minimum temperature (Tmin) > sunshine hours (n) > accumulated temperature (Ta). Yield was highly significantly correlated with meteorological data (P = 63.1%) and significantly correlated with phenotypic data (P = 36.9%). As the number of input factors increased, model accuracy first increased and then decreased. Among the input combinations tested, prediction accuracy was highest when the inputs were Ra, NS, H, P, Tmax, RH, Pre, Ts, Tmin, and n (R2 = 0.712–0.841, RMSE = 1.139–1.458 ton/ha, MAE = 0.814–1.085 ton/ha, and NSE = 0.702–0.831). Introducing further input variables did not significantly improve the accuracy of the rice yield prediction model. Compared with the stand-alone LSSVM model, the hybrid models that couple LSSVM with an optimization algorithm significantly improved prediction accuracy.
Among the GA-, GWO-, and PSO-optimized LSSVM models, GWO-LSSVM had the highest accuracy, with R2 = 0.841, RMSE = 1.139 ton/ha, MAE = 0.814 ton/ha, and NSE = 0.831. The best accuracies of PSO-LSSVM and GA-LSSVM were R2 = 0.782, RMSE = 1.233 ton/ha, MAE = 0.882 ton/ha, and NSE = 0.781, and R2 = 0.818, RMSE = 1.169 ton/ha, MAE = 0.863 ton/ha, and NSE = 0.798, respectively. This study suggests that optimization algorithms are important for tuning the hyperparameters of the LSSVM model, and the GWO-LSSVM yield prediction model is recommended for predicting rice yields in China.
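The coupling of LSSVM with a swarm optimizer described above can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the search ranges, the toy data, and all function names (`lssvm_fit`, `gwo_minimize`, `make_toy_data`, and so on) are assumptions made for illustration, and the optimizer is a compact variant of grey wolf optimization that minimizes RMSE on a held-out validation set.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Squared Euclidean distance between every row of A and every row of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    # LSSVM regression reduces to one linear system in the bias b and dual weights alpha.
    n = X.shape[0]
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def validation_rmse(log_params, X_tr, y_tr, X_va, y_va):
    # Hyperparameters are searched in log10 space (an assumption of this sketch).
    gamma, sigma = 10.0 ** np.asarray(log_params, dtype=float)
    b, alpha = lssvm_fit(X_tr, y_tr, gamma, sigma)
    pred = lssvm_predict(X_tr, b, alpha, X_va, sigma)
    return float(np.sqrt(np.mean((pred - y_va) ** 2)))

def gwo_minimize(fitness, lower, upper, n_wolves=8, n_iter=15, seed=0):
    # Compact grey wolf optimizer: each wolf moves toward the three best wolves.
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    wolves = rng.uniform(lower, upper, size=(n_wolves, lower.size))
    scores = np.array([fitness(w) for w in wolves])
    best = int(np.argmin(scores))
    best_pos, best_score = wolves[best].copy(), float(scores[best])
    for t in range(n_iter):
        leaders = wolves[np.argsort(scores)[:3]].copy()  # alpha, beta, delta
        a = 2.0 * (1.0 - t / n_iter)                     # shrinks from 2 toward 0
        for i in range(n_wolves):
            moves = []
            for leader in leaders:
                r1, r2 = rng.random(lower.size), rng.random(lower.size)
                A, C = 2.0 * a * r1 - a, 2.0 * r2
                moves.append(leader - A * np.abs(C * leader - wolves[i]))
            wolves[i] = np.clip(np.mean(moves, axis=0), lower, upper)
            scores[i] = fitness(wolves[i])
            if scores[i] < best_score:                   # remember the best position seen
                best_pos, best_score = wolves[i].copy(), float(scores[i])
    return best_pos, best_score

def make_toy_data(seed=0, n_train=60, n_val=40):
    # Synthetic stand-in for the station data, which are not public.
    rng = np.random.default_rng(seed)
    X = rng.uniform(-3.0, 3.0, size=(n_train + n_val, 2))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, n_train + n_val)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

if __name__ == "__main__":
    X_tr, y_tr, X_va, y_va = make_toy_data()
    best, score = gwo_minimize(
        lambda p: validation_rmse(p, X_tr, y_tr, X_va, y_va), [-1.0, -1.0], [3.0, 1.0]
    )
    print("log10(gamma), log10(sigma):", best, "validation RMSE:", score)
```

Swapping `gwo_minimize` for a GA or PSO routine with the same signature would give the other two hybrid variants; the LSSVM part stays unchanged because the optimizer only sees the fitness function.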
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Abbreviations
- Ra: Total solar radiation
- NS: Number of spikes
- H: Plant height
- P: Average pressure
- Tmax: Maximum temperature
- RH: Relative humidity
- Pre: Precipitation
- Ts: Average surface temperature
- Tmin: Minimum temperature
- n: Sunshine hours
- Ta: Accumulated temperature
- LSSVM: Least squares support vector machine
- GA: Genetic optimization algorithm
- PSO: Particle swarm optimization algorithm
- GWO: Grey wolf optimization algorithm
- GA-LSSVM: GA optimization model based on LSSVM
- PSO-LSSVM: PSO optimization model based on LSSVM
- GWO-LSSVM: GWO optimization model based on LSSVM
- GBDT: Gradient boosting decision tree
- F1: Single factor of rice
- F2: Double factors of rice
- F3: Three factors of rice
- F4: Four factors of rice
- F5: Five factors of rice
- F6: Six factors of rice
- F7: Seven factors of rice
- F8: Eight factors of rice
- F9: Nine factors of rice
- F10: Ten factors of rice
- F11: Eleven factors of rice
- R2: Coefficient of determination
- RMSE: Root mean square error
- MAE: Mean absolute error
- NSE: Nash–Sutcliffe efficiency
- GPI: Global performance indicator
- MLR: Multiple linear regression
- RF: Random forest
- SVR: Support vector regression
- SVM: Support vector machine
- LSTM: Long short-term memory
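The four accuracy measures listed above (R2, RMSE, MAE, NSE) can be computed as in the following sketch. The RMSE, MAE, and NSE formulas are the standard ones; treating R2 as the squared Pearson correlation between observed and predicted yield is an assumption, made here because the article reports R2 and NSE as separate values. The function name `evaluate` and the sample yields are hypothetical.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return R2, RMSE, MAE and NSE for paired observed and predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    r = np.corrcoef(obs, pred)[0, 1]  # Pearson correlation coefficient
    return {
        "R2": r**2,  # assumption: squared correlation, not 1 - SSE/SST
        "RMSE": np.sqrt(np.mean(err**2)),
        "MAE": np.mean(np.abs(err)),
        # NSE compares squared error with the variance of the observations.
        "NSE": 1.0 - np.sum(err**2) / np.sum((obs - obs.mean()) ** 2),
    }

if __name__ == "__main__":
    # Hypothetical yields in ton/ha with a constant over-prediction of 0.5.
    print(evaluate([6.0, 7.0, 8.0, 9.0], [6.5, 7.5, 8.5, 9.5]))
```

With the constant 0.5 ton/ha offset in this example, R2 equals 1 while NSE equals 0.8, which illustrates why the two measures can differ: the squared correlation ignores systematic bias, whereas NSE penalizes it.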
Acknowledgements
We would like to thank the National Climatic Centre of the China Meteorological Administration for providing the climate database used in this study. This work was also supported by the National Natural Science Foundation of China (Grant No. 51922072), the Key R&D and Promotion Projects in Henan Province (Science and Technology Development) (Grant Nos. 222102110452 and 232102110264), the PhD Research Startup Foundation of Henan University of Science and Technology (Nos. 13480025 and 13480033), and the Key Scientific Research Projects of Colleges and Universities in Henan Province (No. 22B416002).
Author information
Contributions
LZ: conceptualization, methodology, supervision, funding acquisition. SQ: writing (original draft), formal analysis, software. FW: investigation, data curation, software. HW: visualization, software. HM: visualization, funding acquisition. YS: investigation. NC: software, writing (review and editing).
Ethics declarations
Conflict of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, L., Qing, S., Wang, F. et al. Prediction of Rice Yield Based on Multi-Source Data and Hybrid LSSVM Algorithms in China. Int. J. Plant Prod. 17, 693–713 (2023). https://doi.org/10.1007/s42106-023-00266-z
DOI: https://doi.org/10.1007/s42106-023-00266-z