Forecasting temperature of the Saudi Arabian Province of Makkah using a discrete state-space modeling approach

The maximum and minimum air temperatures (Tmax and Tmin) are key variables in climate studies. This study proposes a discrete-time identified state-space modeling approach in which the temperature fluctuation is modeled as a state-space system with the temperature time series as inputs. We aim to provide a tool for projecting future scenarios of Tmax and Tmin. The research employs a prediction-focused approach to system identification, with the overarching goal of developing a realistic and dynamic system model. Tmin and Tmax data recorded in the Saudi Arabian province of Makkah are used to test the accuracy and robustness of the proposed methodology. The model was developed using 120 years (1901–2020) of historical monthly time series data on Tmax and Tmin and was applied to forecast temperatures over the following 60 years (up to 2080). For maximum temperature projections, the fit to the data (prediction focus) was 87.04% and 85.14% for the identification (training) and validation phases of model development, respectively. Akaike’s Final Prediction Error (FPE) and Mean Squared Error (MSE) values were 0.37 °C and 0.34 °C, respectively. For minimum temperature projections, the prediction focus was 86.25% and 84.78% for the identification and validation phases, respectively, and the FPE and MSE values were 0.41 °C and 0.37 °C, respectively. The findings demonstrate that the proposed discrete state-space modeling approach can be used to predict future temperature variations.


Introduction
The planet has experienced major regional and global temperature increases of more than 1 °C since 1850 [1]. Greenhouse gas emissions, which are mostly caused by human activities such as the burning of fossil fuels, deforestation, and agriculture, are believed to be substantially responsible for the rise in the earth's surface temperature and global warming [2]. Greenhouse gas emissions have a significant influence on the ecological environment [3,4], the social economy [5,6], and human health [6][7][8]. Drought is brought on by weather conditions such as little or no rainfall in a region, which in turn causes a substantial decline in crop productivity. Furthermore, difficulties can arise from severe temperature conditions. Extremely high or low temperatures can cause discomfort and lead to a variety of health problems, which can eventually increase morbidity and mortality. People working outdoors, the elderly, and young children, for instance, can all be afflicted by serious health conditions such as heat stroke and diarrhea. Numerous studies have evaluated future temperature using statistical methods at both the global and regional levels, and it is generally agreed that temperature will rise but that the rate of increase will vary by region [9][10][11][12][13][14]. Variability in climate, geography, and social context may be responsible for the discrepancies in temperature rise across earlier research [9][15][16][17]. As a result, more research is required, especially at the regional scale, to assess the regional applicability of data-driven statistical modeling approaches for temperature prediction, which is useful for providing focused evidence on the response of regional temperature variation to climate change. Short- and long-term temperature changes can be predicted using data-driven statistical forecast models, which have proven suitable for modeling and simulating complex, nonlinear phenomena.
The goal of this effort is to provide a non-physical method for accurate temperature forecasting in the Makkah region of Saudi Arabia.
Compared with conventional modeling approaches, data-driven modeling tools are increasingly being used because they require less data and are easier to deploy [18]. In recent years, temperature projections in various hydrologic zones have been successfully modeled using data-driven models [19]. Modern machine learning methods are the foundation of the widely used data-driven techniques for predicting future temperature fluctuations. The ability of various machine learning algorithms to predict temperature fluctuations in various hydrological settings has been the subject of several recent studies. All these investigations, which focused on specific regions, showed that temperature projection models based on machine learning algorithms could make adequate predictions. A variety of modeling techniques have evidently been used to forecast temperature variations with differing degrees of prediction accuracy. Furthermore, recommending a specific prediction model for a specific problem of projecting minimum and maximum temperature changes is extremely challenging, if not impossible. Therefore, improving prediction accuracy still calls for more sophisticated approaches to temperature projection. To the best of the author's knowledge, no attempt has been made to date to forecast the minimum and maximum temperatures in the Makkah region of Saudi Arabia using a prediction-focused method based on the framework of a dynamic system model. The minimum and maximum temperatures recorded at a weather station in the Makkah region of Saudi Arabia are therefore forecasted using the discrete state-space modeling approach [20] proposed in this study.
According to assessments of data-driven modeling applications in hydrological forecasting, these modeling techniques can produce a respectable prediction performance that may be on par with, or in some cases more accurate than, that of numerical simulation models [19][21][22][23]. The results of these investigations have generally demonstrated that machine learning technologies based on artificial intelligence can capture and map nonlinear interactions. Another intriguing strategy is to train and test a data-driven model using solely the historical values of temperature time series data, which can then be applied to estimate future values. The goal of the current study is to use the discrete state-space modeling approach to model minimum and maximum temperature data of the Saudi Arabian Province of Makkah. This study suggests that temperature variations can be predicted and forecasted effectively using a dynamic system-based, data-driven modeling tool that applies a discrete state-space method. In the past, attempts to project temperatures depended on a variety of nondynamic machine learning-based prediction modeling methodologies. This work makes use of a dynamic system's capacity for prediction and a prediction-focused approach to the system identification process to build a solid and trustworthy model that can then be applied to forecasting. Additionally, the system identification method uses a prediction-focused technique to perform model identification and validation simultaneously. To the best of the author's knowledge, this method has never been used to forecast and predict temperature fluctuations. Taking these factors into consideration, this work aims to put into practice a discrete state-space modeling approach [20] as a tool for temperature forecasting in the Makkah region of Saudi Arabia.

Materials and methods
We estimate temperature using a system-based, data-driven modeling methodology that is prediction focused. We describe the study area and the data in Sect. 2.1. Next, we present the modeling approach in Sect. 2.2. Further, we describe the statistics used to evaluate performance in Sect. 2.3. Lastly, we discuss the results of our approach.

Study area and the data
With coordinates of 21°40′N and 39°850′E, Makkah covers an area of 153,128 km² [24]. Located at an elevation of 277 m above sea level, it is a narrow valley. The population was 7.7 million [24], and it has been growing steadily since then. Makkah was the place where the Holy Quran was first revealed to the Prophet Muhammad (peace be upon him). A pilgrimage to Makkah, the holiest city in Islam, is obligatory at least once in a believer's lifetime. Makkah's population and built environment have grown in recent decades. Over the past few decades, Makkah has welcomed upwards of 15 million Muslims per year, with many more visiting during the Hajj. Makkah has grown into one of the most multicultural and cosmopolitan metropolises in the Muslim world. Summertime highs in Makkah can easily reach 45 °C. Many pilgrims, especially those who are not acclimated, can suffer from heat exhaustion or heat stroke if they are forced to spend extended periods of time in hot environments [24]. Almazroui et al. [25] studied the impact of urbanization on air temperature rise in Saudi Arabia and concluded that urbanization changes resulting from population increase are unlikely to be the primary cause of the air temperature increase (Fig. 1).
Historical monthly time series of temperature data from Makkah over the past 120 years (01 January 1901 to 31 December 2020) were used to create the time series model, which is then used to predict future temperatures over the following 60 years (up to 2080). The long-term monthly variations of Tmax and Tmin in the Makkah region of Saudi Arabia are depicted in Fig. 2. Tmax was highest in June (37.65 °C) and lowest in January (24.07 °C). The highest Tmin was recorded in August (25.80 °C) and the lowest in January (12.61 °C). These are monthly average values from 1901 to 2020.
The original temperature time series is divided into identification (training) and validation data. The forecast models are trained using 80% of the total time series data, and the remaining 20% is used to validate the resulting models. Once trained and validated, the models are used for future predictions. Figure 3 shows the division of the full datasets into identification and validation components.
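The 80/20 partition can be illustrated with a short Python sketch. It assumes that the split is chronological (the first 80% of months for identification, the last 20% for validation), which the text does not state explicitly, and it uses a synthetic seasonal series in place of the Makkah record; the function name `chronological_split` is hypothetical.

```python
import numpy as np

def chronological_split(series, train_fraction=0.8):
    """Split a time series into identification and validation parts without shuffling."""
    series = np.asarray(series, dtype=float)
    n_train = int(round(train_fraction * len(series)))
    return series[:n_train], series[n_train:]

# 120 years of monthly values (1901-2020) give 1440 samples.
# Synthetic placeholder data: a seasonal cycle, NOT the observed Makkah series.
n_months = 120 * 12
months = np.arange(n_months)
tmax_synthetic = 31.0 + 6.5 * np.sin(2 * np.pi * (months - 3) / 12)

ident, valid = chronological_split(tmax_synthetic)
print(len(ident), len(valid))  # 1152 288
```

Keeping the order of the samples intact matters for a dynamic model, because the validation segment must follow the identification segment in time.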

Modeling approach
In this study, future projections of the minimum and maximum temperatures are performed using a discrete state-space model [20]. A "state-space system" is used to model the temperature variations, with noisy inputs (predictors) and observed temperature data serving as responses (outputs). The detailed methods are described in [26]. The observed temperature is proportional to the system's current condition (state of the system) such that

$$x_{n+1} = A x_n + K e_n \tag{1}$$

$$y_n = C x_n + e_n \tag{2}$$

where A, C, and K are the parameters that must be determined by system identification; $x_n$ is the state vector that contains the monthly temperature data; $y_n$ is the model output from the observed data; and $e_n$ represents the noise.
State-space modeling identifies a model that effectively predicts the behavior of a dynamic system with respect to an input variable. Model responses can be generated in two ways: (a) through simulation, which computes the model responses using initial conditions and input data, and (b) through prediction, which uses both current and historical entries of observed input-output patterns, in addition to the initial conditions, to determine the model response at a given future time. The objective of the current work is to develop a reliable and dynamic system model that can be used for forecasting. This is achieved by using the prediction-focused methodology of the system identification procedure. A one-step prediction focus is employed during the model identification process since it typically yields the best results. The input and output observations of the temperature data are used in this one-step prediction approach to help account for the trends and peculiarities of instabilities. Usually, incorporating these instabilities or disturbances yields the best solutions.
Fig. 3 Partitioning of datasets into groups for identification and validation: a maximum temperature, b minimum temperature
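As an illustration of Eqs. (1) and (2), the following Python sketch simulates a small innovations-form model and then runs the corresponding one-step-ahead predictor, in which the state estimate is corrected at each step by the latest prediction error. The matrices A, K, and C below are arbitrary illustrative values, not the parameters identified in this study, and the function names are hypothetical.

```python
import numpy as np

def simulate_innovations_model(A, K, C, e, x0):
    """Generate y_n from x_{n+1} = A x_n + K e_n and y_n = C x_n + e_n."""
    x = np.array(x0, dtype=float)
    y = np.empty(len(e))
    for n, e_n in enumerate(e):
        y[n] = C @ x + e_n
        x = A @ x + K * e_n
    return y

def one_step_predictions(A, K, C, y, x0):
    """One-step-ahead predictor: y_hat_n = C x_hat_n, with the state
    updated by the observed prediction error y_n - y_hat_n."""
    x_hat = np.array(x0, dtype=float)
    y_hat = np.empty(len(y))
    for n, y_n in enumerate(y):
        y_hat[n] = C @ x_hat
        x_hat = A @ x_hat + K * (y_n - y_hat[n])
    return y_hat

# Illustrative two-state, single-output model (assumed values).
A = np.array([[0.9, 0.1], [0.0, 0.5]])
K = np.array([0.4, 0.2])
C = np.array([1.0, 0.0])
x0 = np.zeros(2)

rng = np.random.default_rng(0)
e = rng.normal(scale=0.5, size=200)   # noise sequence e_n
y = simulate_innovations_model(A, K, C, e, x0)
y_hat = one_step_predictions(A, K, C, y, x0)

# With the true parameters and initial state, the one-step prediction
# errors reproduce the noise sequence.
print(np.allclose(y - y_hat, e))
```

In identification, A, K, and C are unknown and are chosen so that these one-step prediction errors are as small as possible.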

Prediction-focused methodology
The term "prediction" or "forecasting" refers to the process of looking into the future, typically k steps ahead, and estimating the output from the observed values of the input and output variables of the model in question. The term k is the prediction horizon, which corresponds to obtaining the expected output at time $kT_s$, where $T_s$ is the sampling time. Mathematically, the prediction gives the output $y_p(t_{N+k})$ for the observed inputs $u_m(t_1), \dots, u_m(t_{N+k})$ and observed outputs $y_m(t_1), \dots, y_m(t_N)$. For instance, if a physical system's input and output signals are $u_m(t)$ and $y_m(t)$, respectively, then the system's first-order equation can be written as

$$y(t+1) = a\,y(t) + b\,u(t)$$

where y is the output, u is the input, and a and b are constant coefficients.
Specifically, to forecast the response of the model k steps ahead ($k \ge 1$) from the current time t, one must know the input values up to time step t + k and the output values up to time step t. The dynamic model is identified using the initial states as follows:

$$y_p(t+k) = f\big(u_m(t+k), \dots, u_m(0),\; y_m(t), \dots, y_m(0)\big)$$

where the initial states are represented by $u_m(0)$ and $y_m(0)$. The predictor f() is a dynamic model whose form is determined by the model's structure.
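A k-step-ahead prediction can be sketched in Python for the simplest case. The example assumes a first-order model of the form y(t+1) = a y(t) + b u(t), where the coefficients a and b and the function name are hypothetical and chosen only for illustration. For this particular structure the prediction of y(t+k) needs the measured output at time t and the inputs from t to t+k-1.

```python
def k_step_prediction(a, b, y_t, u_future, k):
    """Predict y(t+k) for y(t+1) = a*y(t) + b*u(t).

    y_t      : last measured output y_m(t)
    u_future : measured inputs u_m(t), ..., u_m(t+k-1)
    """
    if len(u_future) < k:
        raise ValueError("need at least k input samples")
    y_pred = float(y_t)
    for j in range(k):
        # Each pass advances the prediction by one sampling interval.
        y_pred = a * y_pred + b * u_future[j]
    return y_pred

# Two steps ahead with a = 0.5, b = 2.0, y_m(t) = 4.0, inputs [1.0, 0.0]:
# step 1: 0.5*4.0 + 2.0*1.0 = 4.0; step 2: 0.5*4.0 + 2.0*0.0 = 2.0
print(k_step_prediction(0.5, 2.0, 4.0, [1.0, 0.0], 2))  # 2.0
```

Only the first prediction step uses a measured output; every later step feeds the previous prediction back in, which is why errors tend to grow with the horizon k.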
By utilizing the tools and procedures available in MATLAB [27], we can identify a discrete state-space model of the temperature data.

Akaike's final prediction error (FPE)
The effectiveness of model performance is assessed using FPE. This criterion assesses a model's performance by simulating several scenarios in which the model is evaluated on different test data sets. According to Akaike's theory, the most accurate model is the one with the minimum FPE value, defined as

$$\mathrm{FPE} = \det\!\left(\frac{1}{N}\sum_{t=1}^{N} e(t,\hat{\theta}_N)\, e(t,\hat{\theta}_N)^{T}\right)\frac{1 + d/N}{1 - d/N}$$

where N denotes the total number of observations used for the estimation, the estimated parameters are denoted by $\hat{\theta}_N$, the number of parameters is denoted by d, and $e(t)$ is the prediction error vector. According to Akaike's FPE theorem, the number of parameters must be less than the total number of samples. FPE will not be calculated during model estimation if the number of parameters is greater than the number of samples.
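For a single-output model the determinant in the FPE expression reduces to the mean squared prediction error, so the criterion can be sketched in a few lines of Python. This is a minimal illustration under that single-output assumption, not the routine used in the study; the function name is hypothetical and the residuals are made-up numbers.

```python
import numpy as np

def final_prediction_error(residuals, n_params):
    """Akaike's FPE for a single-output model.

    residuals : one-step prediction errors e(t), t = 1..N
    n_params  : number of estimated parameters d (must satisfy d < N)
    """
    e = np.asarray(residuals, dtype=float)
    n = e.size
    if n_params >= n:
        raise ValueError("FPE is undefined unless d < N")
    loss = np.mean(e ** 2)            # (1/N) * sum of e(t)^2
    return loss * (1 + n_params / n) / (1 - n_params / n)

# Four residuals of magnitude 1 and one parameter:
# loss = 1.0, penalty = (1 + 0.25) / (1 - 0.25) = 5/3
print(final_prediction_error([1.0, -1.0, 1.0, -1.0], 1))
```

The penalty factor grows with the ratio d/N, so a model with many parameters is only preferred if it reduces the prediction error enough to offset the penalty.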

Mean squared error (MSE)
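Mean squared error (MSE) measures the error of a statistical model as the average squared difference between observed and predicted values:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{Actual}_i - \mathrm{Predicted}_i\right)^2$$

where N is the number of data points, $\mathrm{Actual}_i$ is the observed value, and $\mathrm{Predicted}_i$ is the predicted value.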
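The MSE can be computed as in the following Python sketch. The second function is an assumption: the text does not define how the "fit to the data" percentage is calculated, so a commonly used normalized root-mean-square definition, 100 × (1 − ‖y − ŷ‖ / ‖y − mean(y)‖), is shown purely for illustration. Both function names and the sample values are hypothetical.

```python
import numpy as np

def mse(actual, predicted):
    """Average squared difference between observed and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean((actual - predicted) ** 2)

def fit_percent(actual, predicted):
    """Assumed fit measure: 100 * (1 - ||y - y_hat|| / ||y - mean(y)||)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    num = np.linalg.norm(actual - predicted)
    den = np.linalg.norm(actual - actual.mean())
    return 100.0 * (1.0 - num / den)

observed = [1.0, 2.0, 3.0]
print(mse(observed, [1.0, 2.0, 5.0]))          # (0 + 0 + 4) / 3
print(fit_percent(observed, observed))         # 100.0 for a perfect match
print(fit_percent(observed, [2.0, 2.0, 2.0]))  # 0.0 when predicting the mean
```

Under this definition a fit of 100% means a perfect match, and 0% means the model does no better than predicting the mean of the observations.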

Model development
The model is developed in three steps:
Step 1: A 1-step ahead prediction is carried out, and the accuracy of the prediction is assessed using the FPE and MSE values.
Step 2: The identified models minimize the prediction errors for a one-step prediction by using the prediction-focused approach. The model identified in Step 1 is then validated by means of a 10-step ahead predictor, that is, it is used to forecast the 10-step ahead output $y_{n+10}$.
Step 3: To further verify the effectiveness of the model, forecasting is finally used. This is done by comparing the original validation dataset with the predicted responses over the time range of the validation datasets.
The following paragraphs provide a detailed explanation of these steps.
Both maximum and minimum temperatures are subjected to a one-step ahead prediction as the initial stage of model construction. For the maximum temperature dataset, the system identifies 440 free coefficients to create a state-space model, and the fit to the estimation data (prediction focus) is 88.25%. The findings indicate a strong prediction model, with FPE and MSE values of 0.372 and 0.335, respectively. The state-space model developed for minimum temperature prediction also has 440 free coefficients. The values of prediction focus (fit to the estimation data), MSE, and FPE for the maximum and minimum temperature predictions are provided in Table 2. Table 2 shows that the state-space model developed for predicting the minimum temperature yields a lower estimation data fit than the model developed for predicting the maximum temperature, and that it produces higher values of FPE and MSE. Nevertheless, these differences in prediction performance are relatively small, are reasonable for a state-space model, and are satisfactory for the proposed temperature modeling context.
The models identified at this stage use the prediction-focused strategy to reduce the prediction errors for a 1-step ahead prediction. These identified models are then validated through a 10-step ahead predictor (using a prediction-focused approach). In other words, for a given set of historical temperature data $y_0, \dots, y_n$, the identified state-space models are used to forecast $y_{n+10}$ in advance. It should be emphasized that the 10-step forecasts ($y_{n+10}$) are produced using the observed and projected values, $y_0 - \hat{y}_0, \dots, y_n - \hat{y}_n$. The results of the 10-step ahead predictions for the identification (training) and validation (testing) data for the maximum and minimum temperatures are illustrated in Figs. 4 and 5, respectively. Figs. 4 and 5 show that the predictor fits the observed temperature data with high accuracy for both the identification and validation data, for both maximum and minimum temperatures.
In the next step, forecasting is carried out inside the validation data range to further validate the constructed state-space forecast models. The observed record of the maximum and minimum temperature data series $y_0, y_1, \dots, y_n - \hat{y}_n$ is used in this forecasting phase to estimate the model state at time step n. This estimated value acts as the starting point for predicting how the model will respond when projecting future values of the maximum and minimum temperatures. This study involves forecasting the model responses during the time period of the validation data and comparing observed and forecasted temperature values for the validation datasets for both the maximum and minimum temperature predictions (Figs. 4 and 5).
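The forecasting step described above can be sketched in Python as follows: the observed record is first passed through the one-step predictor to estimate the state at time step n, and the model is then propagated forward with the future noise set to zero. The scalar model values are illustrative assumptions, not the identified Makkah model, and the function name `forecast` is hypothetical.

```python
import numpy as np

def forecast(A, K, C, y_observed, x0, horizon):
    """Forecast `horizon` steps beyond the observed record.

    Phase 1: run the predictor over y_0..y_n to estimate the state at n.
    Phase 2: propagate x <- A x with no new measurements (noise set to 0).
    """
    x_hat = np.array(x0, dtype=float)
    for y_n in y_observed:
        x_hat = A @ x_hat + K * (y_n - C @ x_hat)

    y_future = np.empty(horizon)
    for j in range(horizon):
        y_future[j] = C @ x_hat
        x_hat = A @ x_hat
    return y_future

# Illustrative one-state model (assumed values).
A = np.array([[0.8]])
K = np.array([0.3])
C = np.array([1.0])

# State after two observations of 1.0, starting from zero:
# x_1 = 0.3, x_2 = 0.8*0.3 + 0.3*(1 - 0.3) = 0.45
print(forecast(A, K, C, [1.0, 1.0], [0.0], 3))  # approximately [0.45, 0.36, 0.288]
```

Because no measurements are available beyond step n, the forecast of a stable model decays toward its equilibrium as the horizon increases.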

Results and Discussion
Using historical temperature data from a weather station in Makkah, Saudi Arabia, this study predicts the possible future scenarios of the maximum and minimum

Future projections of maximum temperature
Future projections of maximum temperature in 10-year increments are presented in Figs. 8, 9, 10, 11, 12, and 13. In these figures, the green curve shows the measured identification data, whereas the blue curve shows the measured validation data that spans 1-120 months, 1-240 months, 1-360 months, 1-480 months, 1-600 months and 1-720 months. The red curve is the forecasted response for 120, 240, 360, 480, 600, and 720 months beyond the measured data's time range. The numeric values of maximum temperatures in successive years did not vary significantly and showed a decreasing trend with reference to the base year (December 2020), when the temperature was 25.60 °C. A difference in temperature of 0.41 °C was observed between December 2020 and January 2021. This apparently small difference in temperature is associated with modeling error, which is relatively small in the context of forecast modeling. A comparatively small increase in the maximum temperature was observed in the successive years between 2021 and 2080. For instance, the projected maximum temperatures were 25

Future projections of minimum temperature
Future projections of minimum temperature in 10-year increments are presented in Figs. 14, 15, 16, 17, 18, and 19. The green curve shows the measured identification data, whereas the blue curve shows the measured validation data that spans 1-120 months, 1-240 months, 1-360 months, 1-480 months, 1-600 months and 1-720 months. The red curve is the forecasted response for 120, 240, 360, 480, 600, and 720 months beyond the measured data's time range. The minimum temperatures were found to be 13.63 °C, 16

Conclusions
This research proposes a discrete-time identified state-space modeling strategy, wherein the temperature fluctuations were modeled as a state-space system with the temperature time series as inputs. The aim is to create a tool capable of predicting future minimum and maximum temperatures. The study takes a prediction-focused approach to system identification to build a more realistic and adaptable model of the system. Minimum and maximum temperature records from Makkah, Saudi Arabia, are used to evaluate the proposed method. The identification (training) and validation phases of model construction yielded a fit to the data (prediction focus) of 87.04% and 85.14%, respectively, for maximum temperature projections. Akaike's Final Prediction Error (FPE) and Mean Squared Error (MSE) were 0.37 °C and 0.34 °C, respectively. For minimum temperature projections, the prediction focus during the identification and validation phases was 86.25% and 84.78%, respectively, and the FPE and MSE values were 0.41 °C and 0.37 °C, respectively. The findings of the combined training, validation, prediction, and forecasting reveal that the proposed modeling approach maps the observed minimum and maximum temperature data quite adequately. Therefore, the findings indicate that the proposed discrete state-space modeling approach could be used to make predictions about potential temperature changes in the future. The results presented here are specific to the study area. Nevertheless, the methodology can be replicated in many other study areas around the globe.
Data availability Data will be available on request from the corresponding author.

Conflict of interest The author declares no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.