Abstract
A reliable water consumption forecast supports the management of water supply and demand in urban areas. The accuracy of a forecasting model depends heavily on the quality of its input data. Despite technological advances, water consumption in some places is still recorded manually by operators, so the resulting databases often contain approximate and incomplete values. Methods for predicting water demand should therefore be able to handle the uncertainty in the dataset. To this end, a structured hybrid approach was designed to cluster customers and predict their water demand under this uncertainty. First, a fuzzy-based algorithm combining the Forward-Filling, Backward-Filling, and Mean methods was proposed to impute the missing data. Then, a multi-dimensional time series k-means clustering technique was developed to group the consumers by consumption behavior, with the missing data estimated as fuzzy numbers. Finally, a forecasting model based on Long Short-Term Memory (LSTM) networks was fitted for each cluster to predict the monthly water demand from the lagged demand and the temperature. The approach was applied to the water consumption time series of residential consumers in Yazd, Iran, from January 2011 to November 2020. Evaluated by the Root Mean Squared Error (RMSE), the proposed approach predicted the water demand of all clusters with an acceptable level of accuracy.
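The abstract does not state how the three fill methods are combined into a fuzzy estimate. As a minimal sketch of one possible reading, the Python below assumes that each missing value becomes a triangular fuzzy number whose bounds are the smallest and largest of the forward-fill, backward-fill, and series-mean candidates. The function names (impute_fuzzy, defuzzify, rmse) and the centroid defuzzification are illustrative assumptions, not the authors' implementation.

```python
from math import sqrt
from statistics import mean
from typing import List, Optional, Sequence, Tuple

# Triangular fuzzy number as (low, mid, high); this representation is an assumption.
Fuzzy = Tuple[float, float, float]


def impute_fuzzy(series: List[Optional[float]]) -> List[Fuzzy]:
    """Replace each missing value (None) with a triangular fuzzy number built
    from the forward-fill, backward-fill, and mean candidates."""
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    overall_mean = mean(observed)

    # Forward fill: last observed value at or before each position.
    ffill, last = [], None
    for x in series:
        if x is not None:
            last = x
        ffill.append(last)

    # Backward fill: next observed value at or after each position.
    bfill, nxt = [], None
    for x in reversed(series):
        if x is not None:
            nxt = x
        bfill.append(nxt)
    bfill.reverse()

    result: List[Fuzzy] = []
    for x, f, b in zip(series, ffill, bfill):
        if x is not None:
            result.append((x, x, x))  # observed values stay crisp
        else:
            # Candidates that exist at this position, in ascending order.
            cands = sorted(c for c in (f, b, overall_mean) if c is not None)
            result.append((cands[0], cands[len(cands) // 2], cands[-1]))
    return result


def defuzzify(t: Fuzzy) -> float:
    """Centroid of a triangular fuzzy number."""
    return sum(t) / 3.0


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


if __name__ == "__main__":
    monthly = [10.0, None, 14.0, None]  # made-up readings, not data from the study
    for value in impute_fuzzy(monthly):
        print(value, defuzzify(value))
```

Keeping the three candidates as a fuzzy number, rather than collapsing them to a single value at once, preserves the spread between them so that a later clustering step can take the uncertainty into account.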
Data availability
Not applicable.
Author information
Contributions
Conceptualization, A.N. and H.K.Z.; data collection, A.N.; methodology, A.N.; software, A.N.; validation, A.N., H.K.Z., H.H., and A.M.; formal analysis, A.N. and H.K.Z.; investigation, A.N. and H.K.Z.; resources, A.N., H.K.Z., H.H., and A.M.; writing (original draft preparation), A.N.; writing (review and editing), A.N., H.K.Z., H.H., and A.M.; visualization, A.N. and H.K.Z.; supervision, H.K.Z. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Ethical approval
The authors assure that this article has not been published in any other journal and that no plagiarism has occurred.
Consent to participate
The authors assure that this article has not been published in any other journal and that no plagiarism has occurred.
Conflict of interests
Regarding the content of this article, the authors have no competing interests to declare.
Additional information
Communicated by: H. Babaie
About this article
Cite this article
Niknam, A., Zare, H.K., Hosseininasab, H. et al. A hybrid approach combining the multi-dimensional time series k-means algorithm and long short-term memory networks to predict the monthly water demand according to the uncertainty in the dataset. Earth Sci Inform 16, 1519–1536 (2023). https://doi.org/10.1007/s12145-023-00976-y
DOI: https://doi.org/10.1007/s12145-023-00976-y