Abstract
Temperature data are a basic input to meteorological, hydrological and climatic studies, and their completeness is important for the reliability of research. This study compared the performance of machine learning methods such as support vector machines (SVM), adaptive neuro-fuzzy inference system (ANFIS) and decision tree (DT) for infilling missing air temperature data. Monthly average temperature data from 1968 to 2017 (50 years) were used to develop the models. The data were split 80/20: 1968–2007 for training and 2008–2017 for testing. Data from neighbouring stations such as Sarıkamış, Tortum and Ağrı, which are highly correlated with Horasan, were used as inputs to estimate the temperature at the Horasan station. The most suitable machine learning method was chosen according to the mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (R²) of the training and test results. The ANFIS model with four sub-sets, a triangular membership function, a hybrid learning algorithm and 300 iterations was selected as the most suitable model. ANFIS is recommended for estimating monthly air temperatures in the northeastern part of Turkey and possibly in other semi-arid climatic regions around the world.
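The evaluation set-up described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names `chronological_split` and `error_metrics` are hypothetical, and R² is assumed here to be 1 − SS_res/SS_tot, which may differ from the definition used in the study (for example, a squared correlation coefficient).

```python
import math


def chronological_split(years, first_test_year):
    """Split record indices by calendar year rather than at random.

    Records before `first_test_year` go to training, the rest to testing.
    """
    train = [i for i, y in enumerate(years) if y < first_test_year]
    test = [i for i, y in enumerate(years) if y >= first_test_year]
    return train, test


def error_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values.

    Assumption: R2 = 1 - SS_res / SS_tot (coefficient of determination).
    """
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mse = ss_res / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# One entry per monthly record, 1968-2017 (50 years x 12 months).
years = [y for y in range(1968, 2018) for _ in range(12)]
train_idx, test_idx = chronological_split(years, first_test_year=2008)
print(len(train_idx), len(test_idx))  # 480 training and 120 test records
```

Splitting by year keeps the test period (2008–2017) strictly after the training period, which matches the 80/20 division stated above. The same `error_metrics` call would be applied to the training and test predictions of each candidate model.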
Data and material availability
Not applicable.
Acknowledgements
The author thanks the General Directorate of Meteorology of Turkey for providing the observed monthly temperature data, and the Editor and the anonymous reviewers for their contributions to the content and development of this paper.
Author information
Contributions
This is a single-author paper. The author, O. M. Katipoğlu, was solely responsible for the study conception, analysis, and manuscript preparation.
Ethics declarations
Ethical approval
The manuscript complies with all ethical requirements; the paper has not been published in any other journal.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Conflict of interest
The author declares no competing interests.
Additional information
Responsible Editor: Zhihua Zhan
Part of this work was presented orally at the IV. International Conference on Data Science and Applications 2021.
About this article
Cite this article
Katipoğlu, O. Prediction of missing temperature data using different machine learning methods. Arab J Geosci 15, 21 (2022). https://doi.org/10.1007/s12517-021-09290-7
DOI: https://doi.org/10.1007/s12517-021-09290-7