Study area and data availability
Iraq is a Middle Eastern country bordered by six countries (Iran, Kuwait, Saudi Arabia, Jordan, Syria, and Turkey) and by a small stretch of the Arab Gulf. Iraq consists of 18 governorates, and its capital city, Baghdad, lies in Baghdad governorate, as seen in Fig. 1. The population of Iraq is about 40,278,833 people, which is approximately 0.52% of the world’s total population, based on the latest update of the United Nations database.
The spread of the COVID-19 pandemic in Iraq began on February 24, 2020, when the Iraqi Ministry of Health announced the first confirmed case of COVID-19 infection in Najaf Province in southern Iraq. The patient was placed in quarantine at Al-Hakim Hospital, and nine people who had been in contact with the patient were quarantined in the same hospital. The disease then spread rapidly to all Iraqi governorates, especially Baghdad (the capital), Najaf, and Kirkuk. The data used in this study were collected from the Iraqi Ministry of Health (the daily epidemiological situation of recorded infections of the emerging coronavirus in Iraq), the World Health Organization (WHO) (Url-1), and the Worldometer website (Url-2), which provide users with useful charts and figures regarding the spread of COVID-19 all over the world. A total of 38,000 patients were diagnosed with COVID-19 in Iraq from February 25, 2020, to July 15, 2020. In this study, the dataset was divided into three parts according to the available data, as illustrated in Fig. 2, namely the total cases, the total healed (recovered) cases, and the total deaths; these datasets were prepared for the training and validation processes as an essential step in designing the proposed model.
Proposed COVID-19 prediction model
The study’s main objective was to develop an AIT-based predictive model by choosing suitable ANN procedures to forecast the prognostic factors for the rapid spread of COVID-19 in Iraq. The methodology of the developed model was divided into five stages: preparing data, selecting ANN functions, modeling processes, statistical model criteria, and model window interface design. Three ANN functions, described in Fig. 3, were selected for their main characteristics and their mathematical representation. The MATLAB workspace was used in this study because it offers a convenient computing environment for designing network structures, performing statistical analysis, and estimating performance accuracy for various types of artificial neural networks (MATLAB R 2018a, 2018).
Preparing data
Preparing data is a fundamental stage in developing artificial intelligence models (Zhou et al. 2010). Therefore, data organization, imputation, reduction, discretization, and normalization were performed to obtain high-performance models (Zhang et al. 2003). The complete dataset was organized and imported into the MATLAB workspace. Out of the 38,000 cases, 70% (26,600), chosen randomly, were used in the training process; 15% (5700) were used to measure the generalization of the designed networks; and the last 15% (5700) were used as new data for the validation process, that is, data that the networks had not seen or dealt with before.
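The 70/15/15 partition described above can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB workflow used in the study; the function name `split_indices` and the random seed are assumptions made only for this example.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly partition sample indices into three disjoint subsets."""
    rng = np.random.default_rng(seed)          # seed is an illustrative choice
    order = rng.permutation(n_samples)         # random ordering of all indices
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]             # remaining samples
    return train, val, test

train_idx, val_idx, test_idx = split_indices(38_000)
print(len(train_idx), len(val_idx), len(test_idx))  # 26600 5700 5700
```

Because the three index sets are slices of one permutation, no case can appear in more than one subset.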
Selecting ANN functions
RBF, FCM, and NARX, three ANN functions, were selected and used to develop the COVID-19 prediction model. These procedures are regarded as powerful ANN functions because of their mathematical ability to analyze data and predict results when they are used in developing artificial intelligence models (Mirmozaffari 2019; Yahya and Seker 2019; Ahmad et al. 2020).
First, RBFs are simply a class of functions that could, in principle, be employed in any nonlinear model and in single-layer or multilayer neural networks (Uysal 2016). An RBF network can be defined as a multilayer feedforward neural network, a robust tool offering many advantages in nonlinear system modeling. Such functions are commonly used in research applications such as time series forecasting, pattern recognition, and speech processing, and they can be described mathematically as (Rahmati and Tatar 2019)
$$ y(t)={\sum}_{j=1}^n{w}_j{\varphi}_j+{w}_0 $$
(1)
where y(t) represents the RBF output, wj is the weight associated with the jth RBF, φj represents the set of basis functions, and w0 is the bias weight of the output. The activation function used in the hidden layer in this study is a nonlinear Gaussian function, written as Eq. 2, where x represents the training data, xj is the center (mean) of the jth Gaussian function, and σ represents the Gaussian function width.
$$ {\varphi}_j(x)=\exp \left[-\frac{{\left\Vert x-{x}_j\right\Vert}^2}{2{\sigma}^2}\right] $$
(2)
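Eqs. 1 and 2 can be illustrated with a small numerical sketch. It is written in Python (not the MATLAB environment used in the study), fits the output weights and bias by ordinary least squares, and, for brevity, places the Gaussian centers on an evenly spaced grid, which is an assumption of this example. The target curve, the number of centers, and the width are likewise illustrative values.

```python
import numpy as np

def gaussian_design(x, centers, sigma):
    """Eq. 2 for every (sample, center) pair: exp(-(x - center)^2 / (2 sigma^2))."""
    sq_dist = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_rbf(x, y, centers, sigma):
    """Least-squares estimate of the output weights and the bias w0 in Eq. 1."""
    phi = gaussian_design(x, centers, sigma)
    design = np.hstack([phi, np.ones((len(x), 1))])    # last column carries w0
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]

def predict_rbf(x, centers, sigma, weights, bias):
    """Eq. 1: weighted sum of the Gaussian basis functions plus the bias."""
    return gaussian_design(x, centers, sigma) @ weights + bias

# Synthetic one-dimensional example (illustrative data only)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)
centers = np.linspace(0.0, 1.0, 10)    # assumed grid of centers
sigma = 0.1                            # assumed Gaussian width
weights, bias = fit_rbf(x, y, centers, sigma)
y_hat = predict_rbf(x, centers, sigma, weights, bias)
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
print(f"training RMSE: {rmse:.4f}")
```

Once the centers and width are fixed, Eq. 1 is linear in the weights, which is why a single least-squares solve is sufficient for the output layer.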
In this study, the suggested RBF neural network was trained in two steps. The centers and the width of the Gaussian functions in the hidden layer were learned first, with the k-means clustering method applied to the data instances to locate the Gaussian function centers. Once the Gaussian function parameters of the hidden units had been identified, the weights from these units to the output unit were adjusted using linear regression.
Second, NARX is a suitable estimation algorithm for modeling and predicting time series in moderately nonlinear dynamic systems (Lin et al. 1996; Ruiz et al. 2016). The NARX network developed in this study can be used for multi-step prediction by applying the prediction process recursively to the input dataset. The NARX model can be described by Eq. 3 as follows:
$$ y\left(d+1\right)={\mathrm{f}}_{\mathrm{ANN}}\left(y(d),y\left(d-1\right),\dots ,y\left(d-n+1\right),u(d),u\left(d-1\right),\dots ,u\left(d-m+1\right)\right)+\varepsilon (d) $$
(3)
where y(d + 1) is the model-predicted output, fANN is the nonlinear function representing the system behavior, y(d), u(d), and ε(d) are the output, input, and approximation-error vectors at time step d, and n and m are the orders of y(d) and u(d), respectively.
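The structure of Eq. 3 (lagged outputs and lagged exogenous inputs as regressors, with predictions fed back for multi-step forecasting) can be sketched as follows. To keep the example short and exactly checkable, a linear least-squares map stands in for fANN, so this is an ARX stand-in and not the neural network used in the study. The synthetic series, the orders n = 2 and m = 1, and all function names are assumptions of this illustration.

```python
import numpy as np

def lagged_regressors(y, u, n, m):
    """Rows are [y(d), ..., y(d-n+1), u(d), ..., u(d-m+1)]; targets are y(d+1)."""
    start = max(n, m) - 1
    rows, targets = [], []
    for d in range(start, len(y) - 1):
        past_y = y[d - n + 1:d + 1][::-1]
        past_u = u[d - m + 1:d + 1][::-1]
        rows.append(np.concatenate([past_y, past_u]))
        targets.append(y[d + 1])
    return np.array(rows), np.array(targets)

def closed_loop_forecast(y, u, coef, n, m, steps):
    """Recursive use of Eq. 3: each prediction becomes the newest y value.
    u must cover the forecast horizon (length >= len(y) + steps - 1)."""
    y_ext = list(y)
    for _ in range(steps):
        d = len(y_ext) - 1
        past_y = y_ext[d - n + 1:d + 1][::-1]
        past_u = list(u[d - m + 1:d + 1][::-1])
        features = np.array(past_y + past_u + [1.0])   # trailing 1.0 is the bias term
        y_ext.append(float(features @ coef))
    return np.array(y_ext[len(y):])

# Synthetic, noise-free series (illustrative only)
rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, 60)
y = np.zeros(60)
for d in range(1, 59):
    y[d + 1] = 0.6 * y[d] - 0.2 * y[d - 1] + 0.5 * u[d]

n_order, m_order = 2, 1
X, target = lagged_regressors(y[:50], u[:50], n_order, m_order)
design = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(design, target, rcond=None)   # linear stand-in for f_ANN
forecast = closed_loop_forecast(y[:50], u, coef, n_order, m_order, steps=10)
print(np.max(np.abs(forecast - y[50:])))
```

Replacing the least-squares map with a trained network changes only how each one-step prediction is computed; the regressor layout and the feedback loop stay the same.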
Third, image segmentation, medical diagnosis, clustering, and time series forecasting are among the problems most commonly solved using the FCM algorithm (Yokoi et al. 2011). Clustering with the FCM function is the operation of grouping feature vectors into self-organizing categories. Fuzzy C-Means Clustering (FCM) is an unsupervised clustering algorithm based on the fuzzy concept introduced by Bezdek (1981), and it belongs to the family of fuzzy clustering algorithms. The function splits a single dataset or time series into several clusters, each with different attributes. In this analysis, the fuzzy weights were determined from the reciprocal distance so as to minimize the cumulative weighted mean-square error (Abebe et al. 2000). The FCM design procedure can follow the mathematical steps proposed by Başkır and Türkşen (2013):
Step 1. The FCM algorithm allocates each object to the clusters by using fuzzy memberships. It splits a set of n data objects into c (1 < c < n) fuzzy clusters with specific centroids. A fuzzy matrix M with c rows and n columns describes the fuzzy clustering, where c is the number of clusters and n is the number of data objects. The membership degree of the jth data point in the ith cluster is denoted by mij; it is initialized randomly subject to the following constraints:
$$ 0\le \sum \limits_{j=1}^n{m}_{ij}\le n,\forall i=1,\dots ,c $$
(4)
$$ \sum \limits_{i=1}^c{m}_{ij}=1,\forall j=1,\dots ,n $$
(5)
The objective function to be minimized is
$$ K\left(M,{c}_1,{c}_2,\dots ,{c}_c\right)=\sum \limits_{i=1}^c{K}_i=\sum \limits_{i=1}^c\sum \limits_{j=1}^n{m}_{ij}^r{d}_{ij}^2 $$
(6)
where mij ranges between 0 and 1, ci is the centroid of cluster i, dij is the Euclidean distance between the ith centroid (ci) and the jth data point (xj), and r ∈ [1, ∞) is a weighting exponent. The centroids and the memberships are then updated as
$$ {c}_i=\frac{\sum_{j=1}^n{m}_{ij}^r{x}_j}{\sum_{j=1}^n{m}_{ij}^r} $$
(7)
$$ {m}_{ij}=\frac{1}{\sum_{k=1}^c{\left(\frac{d_{ij}}{d_{kj}}\right)}^{\frac{2}{r-1}}} $$
(8)
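The alternating updates of the FCM procedure (random memberships whose columns sum to one, centroid update, membership update, objective check) can be sketched as below. This is a minimal Python illustration using the standard fuzzy c-means update rules, not the MATLAB routine used in the study. The weighting exponent of 2, the stopping tolerance, the seed, and the two synthetic clusters are assumptions of this example.

```python
import numpy as np

def fcm(x, c, r=2.0, max_iter=500, min_improvement=1e-6, seed=0):
    """Fuzzy c-means on x with shape (n, features).
    Returns centroids (c, features), memberships (c, n), objective history."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    memb = rng.random((c, n))
    memb /= memb.sum(axis=0, keepdims=True)      # memberships of each point sum to 1
    history = []
    for _ in range(max_iter):
        mr = memb ** r
        centroids = (mr @ x) / mr.sum(axis=1, keepdims=True)    # centroid update
        dist = np.linalg.norm(x[None, :, :] - centroids[:, None, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # guard against division by zero
        history.append(float(np.sum(mr * dist ** 2)))           # objective value
        # ratio[i, k, j] = (d_ij / d_kj) ** (2 / (r - 1)); sum over k updates m_ij
        ratio = (dist[:, None, :] / dist[None, :, :]) ** (2.0 / (r - 1.0))
        memb = 1.0 / ratio.sum(axis=1)
        if len(history) > 1 and abs(history[-2] - history[-1]) < min_improvement:
            break
    return centroids, memb, history

# Two well-separated synthetic groups (illustrative data only)
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(0.0, 0.1, (20, 1)), rng.normal(5.0, 0.1, (20, 1))])
centroids, memberships, history = fcm(data, c=2)
print(np.sort(centroids.ravel()), len(history))
```

The loop ends either at the iteration cap or when the objective changes by less than the chosen tolerance between two consecutive iterations.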
Modeling processes
The procedure of developing the proposed model was divided into three stages, one for each proposed ANN. First, to create the RBF neural network structure, the function (newrb) was used, in which hidden nodes are added until the specified RMSE goal is met in approximating the function during the training process. To ensure that this neural network works properly, three settings were considered. Firstly, the stopping criterion of the training process was configured through the minimum RMSE error; in this study, the RMSE goal was set to 0.000001, and the training process terminates when this value is reached. Secondly, the RBF spread was left at its default value of 1.0. Thirdly, to guarantee that the network works satisfactorily without unnecessary complexity, a suitable number of hidden nodes was chosen by a trial-and-error strategy; the most appropriate number of hidden nodes was found to be 140.
Second, to create the NARX neural network, three parameters were entered as input data for the constructed network, with one exogenous input parameter and two feedback delays. The developed NARX network is structured as two hidden layers, with 28 neurons in the first layer and 10 neurons in the second layer. Using the divide block method, the cumulative data were split into two classes: 70% of the total data were used as the training dataset, and 30% were used in the validation and testing processes. The initial random weight was set to 12, and the functions (narxnet) and (closeloop) were used with sigmoid functions as the activation functions. In addition, the Levenberg–Marquardt approach was used for network training through the (trainlm) function to achieve optimal performance.
Third, to design the FCM neural network, the actual data were first read and pre-processed to prepare the FCM neural network parameters. The genfis3 function was used to build the fuzzy inference system structure (Astakhova et al. 2015). To call the (genfis3) function, the input parameters to be taken into account are the input dataset matrix, the matrix dimensionality, and the radii as a vector. This automatically calculates the expected number of clusters, which was three. In this context, those clusters are used only as a starting point for the FCM algorithm. Membership Functions (MF) were then created such that the sum of all MF equals 1.0. The number of MF rules produced was 23. The neural network was trained using a hybrid learning algorithm to identify the membership function parameters of the single output, and a Sugeno-type fuzzy inference system was implemented to complete the training period. The FCM clustering process is iterative. The process stops when the maximum number of iterations is reached, or when the improvement of the objective function between two consecutive iterations is less than the specified minimum amount. For this neural network, the maximum number of iterations was set to 500, and the process stopped after 148 iterations.
The performance accuracy of the developed models is a critical part of designing any AIT model, and the training, testing, and cross-validation processes play an essential role in improving a designed model. Various training algorithms can be used for training an ANN, for instance Backpropagation, Levenberg–Marquardt, Hebbian-based, Bayesian regularization, or One-step secant (Bello 1992). The MATLAB development environment offers a wide range of these algorithms, which can be chosen according to the proposed neural network. In this study, three training algorithms were used for training the proposed ANNs: the Backpropagation algorithm was chosen for RBF, the Levenberg–Marquardt algorithm for NARX, and the Hebbian-based algorithm for FCM. The testing process evaluates the developed model by checking the performance of the developed ANN through an estimate of the testing error. When the testing error decreases, the performance of the developed ANN increases, and vice versa. Cross-validation is a process for detecting overfitting: the proposed ANN is trained on a subset of the available input dataset and evaluated on the complementary subset of the data.
Statistical model criteria
The best performance is obtained when the correlation is high and the errors are as small as possible. To verify the performance accuracy of the designed ANN models, several widely used error criteria were chosen to evaluate the testing process (Cadenas et al. 2016; Yalur 2019): the Coefficient of Determination (R2), Root Mean Square Error (RMSE), Nash-Sutcliffe coefficient (CE), and mean absolute percentage error (MAPE). RMSE, CE, and MAPE are represented mathematically in Eqs. 9, 10, and 11, respectively.
$$ RMSE=\sqrt{\frac{\sum_{i=1}^n{\left({X}_{actual,i}-{X}_{predict,i}\right)}^2}{n}} $$
(9)
$$ \mathrm{CE}=1-\frac{\sum_{i=1}^n{\left({X}_{actual,i}-{X}_{predict,i}\right)}^2}{\sum_{i=1}^n{\left({X}_{actual,i}-\overline{X_{actual}}\right)}^2} $$
(10)
$$ \mathrm{MAPE}=\frac{100}{n}\sum_{i=1}^n\left|\frac{{X}_{actual,i}-{X}_{predict,i}}{{X}_{actual,i}}\right| $$
(11)
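As a worked illustration of the four criteria, the following Python sketch evaluates them on a small made-up series. It uses the standard textbook definitions (R2 as the squared Pearson correlation, CE as one minus the ratio of the residual sum of squares to the variance of the observations about their mean, MAPE in percent); the function names and the numbers are hypothetical and are not results from the study.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def nash_sutcliffe(actual, predicted):
    """Nash-Sutcliffe coefficient: 1 - residual SS / SS about the observed mean."""
    residual = np.sum((actual - predicted) ** 2)
    spread = np.sum((actual - np.mean(actual)) ** 2)
    return float(1.0 - residual / spread)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))

def r_squared(actual, predicted):
    """Coefficient of determination as the squared Pearson correlation."""
    return float(np.corrcoef(actual, predicted)[0, 1] ** 2)

# Hypothetical daily case counts and model outputs
actual = np.array([100.0, 200.0, 300.0, 400.0])
predicted = np.array([110.0, 190.0, 310.0, 390.0])

print(rmse(actual, predicted))            # 10.0
print(nash_sutcliffe(actual, predicted))  # 0.992
print(round(mape(actual, predicted), 4))  # 5.2083
print(r_squared(actual, predicted))
```

A CE of 1 indicates a perfect match, while values at or below 0 mean the model predicts no better than the mean of the observations.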
Analysis using GIS environment
One of the best-known capabilities of GIS is providing a quick, comparative view of areas at high risk or hazard through spatial distribution technologies. In this study, Arc View V. 10.4 software was used as the GIS environment, and the Inverse Distance Weighted (IDW) interpolation technique built into the software was adopted. This technique is an algorithm for spatially interpolating different kinds of data as well as estimating or predicting data (Murugesan et al. 2020). In this part of the study, both the actual infection data and the predicted results obtained from the developed model were mapped spatially. The outputs of this method are maps that compare the spread of COVID-19 in the Iraqi governorates between the current period and the predicted period.
The GIS environment was employed to prepare short-term spatial distribution maps for two periods, actual and predicted infected cases, covering 3 months over Iraq. These spatial distribution maps were created from the numbers of infected cases in the actual stage compared with the numbers of infected cases obtained from the proposed model.
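The idea behind IDW interpolation can be shown with a short Python sketch: the estimate at an unsampled location is a weighted average of the known values, with weights proportional to the inverse of the distance raised to a power. This is a generic illustration, not the routine of the GIS software used in the study; the power of 2, the coordinates, and the values are assumptions made for the example.

```python
import numpy as np

def idw(known_xy, known_values, query_xy, power=2.0):
    """Inverse Distance Weighted estimate at each query point."""
    diff = query_xy[:, None, :] - known_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=2)              # shape (queries, known points)
    estimates = np.empty(len(query_xy))
    for q, row in enumerate(dist):
        exact = row == 0.0
        if exact.any():
            estimates[q] = known_values[exact][0]    # query coincides with a sample
        else:
            weights = 1.0 / row ** power
            estimates[q] = np.sum(weights * known_values) / np.sum(weights)
    return estimates

# Hypothetical sample locations (x, y) and case counts
known_xy = np.array([[0.0, 0.0], [2.0, 0.0]])
known_values = np.array([10.0, 30.0])
query_xy = np.array([[1.0, 0.0], [0.0, 0.0]])

print(idw(known_xy, known_values, query_xy))  # [20. 10.]
```

The midpoint receives the plain average because both samples are equally distant, and a query placed on a sample returns that sample's value.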