Introduction

COVID-19 has spread rapidly to countries worldwide, and the WHO declared the outbreak a global pandemic of international concern [1,2,3]. The ease with which COVID-19 spreads led countries to impose restrictions such as travel bans, quarantine, postponement or cancellation of events, social distancing, and lockdowns [4]. The pandemic also aggravates global public health problems and indirectly causes harm beyond physical health, such as mental stress, economic instability, and disruption of daily activities. As a result, governments in many countries have had to make public policy decisions regarding health and the economy [5, 6].

Controlling COVID-19 requires reliable case forecasts, but predicting cases with conventional techniques is challenging [7]. Forecasting and studying the pattern of disease spread can support control strategies. Such forecasts are built with predictive models, and their quality depends on the underlying forecasting algorithm. The number of positive cases changes daily, which makes the spread difficult to contain. Deep learning (DL) can provide better projections in this setting [8]. Hence, intelligent methods play an essential role in predicting the spread of COVID-19, providing early warning, and assessing the effect of interventions on the spread of the virus. Accordingly, recent models apply various intelligent techniques to predict and analyze COVID-19 case trends [1].

Several papers used a recurrent neural network (RNN) with long short-term memory (LSTM) to estimate confirmed cases [9, 10]. Other reports used Deep LSTM, Convolutional LSTM, and Bi-LSTM to forecast the daily and weekly increase in positive cases [4]. Other studies also predicted COVID-19 cases with LSTM, GRU [11], and Bi-LSTM learning models [1]. The technique can produce more effective results than ANN with a random test set [12]. Another paper explored an artificial neural network (ANN) alongside mathematical models to predict confirmed cases [7].

To predict the spread, several papers present a model based on mediative fuzzy logic that analyzes contradictory parameters [13]. Another study used ANFIS and LSTM to predict newly infected cases. The LSTM architecture has four gates: input, forget, control, and output gates. ANFIS, in contrast, takes the parameters in its input layer, passes them through successive layers, normalizes the function in a later layer, and sends the result to the final output layer. ANFIS has multiple layers and thus requires more hardware resources and training time than LSTM [8].

Another study performed a short-term spatial analysis of confirmed cases using ARIMA. The ARIMA model combines an autoregressive (AR) component, which considers past values, with a moving-average (MA) component, which uses the current and preceding residuals [14]. Other work applied ARIMA together with three regression techniques: SVR, NN, and Linear Regression [15, 16]. The accuracy of the ARIMA model in forecasting future COVID-19 cases demonstrates its usefulness for epidemiological surveillance. The ARIMA technique can predict events with relatively higher intervals [17].

Predicting COVID-19 to measure the trend of its spread remains challenging. A recent paper predicted positive cases of COVID-19 with a convolutional neural network (CNN), analyzing and predicting new cases and deaths with a DL method. CNN can produce accurate forecasts in time series analysis because it learns the features present in the dataset [18, 19]. That paper evaluated its CNN model with the RMSE and MAPE indicators. Based on the experimental results, the model can better predict COVID-19 case trends [5, 20].

Therefore, in this study, we propose a technique for constructing a CNN-based prediction model that accurately represents the new cases and new deaths of COVID-19. We make the following contributions to COVID-19 prediction research:

  1. We introduce a new technique for predicting new cases and new deaths that trains a CNN on the dataset to develop viable models. We use several features to build our learning model, including the date, recent cases, and death cases.

  2. We establish a model to predict positive cases of COVID-19. This prediction model can help identify cases in real time. Moreover, the proposed technique is a promising way to improve on conventional models and achieve a higher score.

  3. We test the proposed model to quickly and accurately predict new cases and new deaths in the next few days. We tune the parameters and train on extensive features to obtain the best accuracy.

This paper is organized as follows. Section Related Work reviews past findings. Section Background describes the problem. Section Experimental Setup explains the experimental design, including the feature learning algorithm, the dataset, and preprocessing, while Section Result and Analysis presents the findings and an extensive analysis. Finally, Section Conclusion summarizes the study and its unresolved difficulties.

Related Work

A study proposed ANFIS, a variant of ANN modified with fuzzy logic, for COVID-19 prediction. In another study, ANFIS and LSTM achieved high performance on non-linear data when estimating new COVID-19 infections. However, ANFIS and LSTM can only predict; they do not support significant assumptions about the outbreak. This helps explain why ML-based COVID-19 prediction models are prone to failure and struggle to demonstrate accuracy, especially on small datasets. The experimental results show that LSTM outperforms ANFIS [8].

Another paper uses mediative fuzzy logic to predict COVID-19 outcomes. The mediative fuzzy correlation technique relates the increase in positive patients to the passage of time. In statistical analysis, the correlation coefficient measures the linear relationship between two variables. The mediative fuzzy correlation coefficient additionally accounts for conflicting parameters, and it was calculated from the lower and upper bounds of the fuzzy correlation. For Fuzzy Logic, Intuitionistic Fuzzy Logic, and Mediative Fuzzy Logic, the correlations obtained are [0.25, 0.37], [0.2294, 0.39], and [0.21, 0.4092], respectively [13].

Several papers presented deep learning variants of the RNN to predict daily and weekly cases with minimal error. The method achieves high accuracy for short-term prediction, with an error of less than 3% for daily forecasts and less than 8% for weekly forecasts [4]. Another paper forecast the number of cases over a month using data-driven estimation approaches such as LSTM and curve fitting, and studied the impact of preventive measures such as social isolation and lockdown on the spread of COVID-19. The numbers of recovered and positive cases are used to predict certain factors [9].

Other papers proposed LSTM, GRU, and Bi-LSTM to predict confirmed, death, and recovered cases. LSTM [30] uses hidden layer units known as memory cells to overcome the limitations of the RNN. A memory cell stores and controls the temporal state through its gates. The GRU is a simpler variant that has no separate memory cell for storing information, so it can only control the data inside the unit. Bi-LSTM increases learning capacity and can memorize longer sequences. In particular, the DL method using Bi-LSTM produces smaller errors and higher precision. The combination of LSTM, GRU, and Bi-LSTM achieves an MAE of 0.0070 and an RMSE of 0.0077 [1].

A study explored ANN with different training methods, including Levenberg–Marquardt and Resilient Propagation, to predict COVID-19 deaths. The Levenberg–Marquardt method solves numerical problems for non-linear functions effectively and converges quickly. The Resilient Propagation method adapts the weight step directly from local gradient information. Resilient Propagation produces an MSE ten times greater than the MSE of the Levenberg–Marquardt method. As a result, ANN with a random set can perform better than ANN with a particular group [12].

Another study used ANN as a computational method to predict COVID-19 cases, with the Gompertz and Logistic models as mathematical methods. The ANN model uses a three-layer design: the input layer assigns a weight to each parameter, and the hidden layer computes the internal output with a transfer function. The result is a simulated value built from the signal weights in the hidden layer and the added coefficient in the output layer. The parameters of the mathematical models are fitted analytically to determine the difference between the observed and predicted cases in each model. Based on the estimated and observed total confirmed cases, the models achieve R2 values of 0.9998 for Gompertz, 0.9996 for Logistic, and 0.9999 for ANN. The ANN model is therefore superior to the Gompertz and Logistic models [7].

Several groups also proposed spatial distribution analysis with ARIMA. Although efficiency metrics such as the increased yield index, MAE, and RMSE were found unsuitable for judging model precision, the accuracy of the ARIMA model in forecasting future COVID-19 epidemics demonstrates its usefulness for epidemiological surveillance [17]. Other studies used ARIMA to gather data and applied estimation methods to the COVID-19 dataset. One study used ARIMA to generate a simple average based on the performance of the regression techniques. Applying the ARIMA approach does not increase the error metrics and only produces the best regression, namely RMSE 286,879, MSE 78604,436, and MAE 175,672 [15]. The ARIMA model can predict future COVID-19 infections, but some researchers explain that ARIMA is unsuitable for non-linear relationships, especially for large and dynamically complex problems: ARIMA is limited in that it cannot capture hidden non-linear structure in a time series. In that experiment, the paper uses the Pearson correlation coefficient at a 95% confidence level, and the point estimate is 0.996 [14].

Recent papers proposed a technique for predicting new cases and deaths based on CNN. The study gathered the daily new cases and new deaths as the dataset. The experiment shows that feeding the early features and the subsequent inputs into the CNN structure helps predict cases [21]. The proposed CNN model achieves the highest predictive effectiveness and accuracy compared with other DL methods. Several experiments show that CNN gives the best result among the tested peers [5].

Another study proposed CNN to predict positive cases and compared it with other models, such as LSTM, GRU, and MCNN. Based on the analysis results, the CNN model outperforms the other DL models in validation accuracy and forecasting consistency. The proposed CNN model performs strongly on long-term time series analysis because it learns critical features, is invariant to distortion, and captures temporal dependencies. CNN is the best forecasting model in that study because of its deep feature learning power [18].

Therefore, to deal with those issues, we propose a model that predicts new cases with a new CNN method based on dataset features. We collect a large set of COVID-19 case records to build the dataset used to train our model. By training on several informative features, the model can predict COVID-19 cases in the next few days.

Background

This section provides a formal definition of the research problem and introduces some of the concepts used in this paper.

Problem Definitions

CNN has three layers: input, hidden, and output [22]. Common CNNs use the input layer to project the extracted features as an input matrix. The dataset has features \(x = (x_1, x_2, x_3, \dots, x_n)\), and the training procedure is as follows:

\(s \leftarrow\) number of training samples, \(c \leftarrow\) batch size, \(a \leftarrow 0\)

for \(i = 1\) to epoch do

  for \(k = 1\) to \(s\) do

    \(j = k \bmod c\)

$$\text{if } j = 0 \text{ then } j = c \text{ end}$$
(1)
$$\text{if } j - a > 0 \text{ then}$$
(2)

      Compute the output for the training sample in the current batch.

    else

      Compute the loss using the loss function, update the weights using the optimizer, and compute the output for the training sample in the next batch.

    end

    \(a = j\)

  end

end

Testing:

The CNN model predicts the testing set and compares the actual values with the predicted values, including calculating the prediction error.
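The batch-wise training procedure in this section can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function names `iterate_batches` and `train` are hypothetical, and a one-weight linear model with a squared-error loss stands in for the CNN so that the example stays self-contained.

```python
def iterate_batches(num_samples, batch_size):
    """Yield lists of 1-based sample indices, one list per batch.

    Mirrors the pseudocode: j = k mod c, with j reset to c when it is 0,
    and a batch boundary whenever j - a is no longer positive.
    """
    batch = []
    previous_j = 0  # 'a' in the pseudocode
    for k in range(1, num_samples + 1):
        j = k % batch_size
        if j == 0:
            j = batch_size
        if j - previous_j <= 0 and batch:
            yield batch  # the batch is complete: hand it to the optimizer
            batch = []
        batch.append(k)
        previous_j = j
    if batch:
        yield batch  # last, possibly shorter, batch


def train(xs, ys, batch_size, epochs, lr=0.05):
    """Mini-batch gradient descent on y ~ w * x + b (stand-in for the CNN)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for batch in iterate_batches(len(xs), batch_size):
            grad_w = grad_b = 0.0
            for k in batch:
                err = (w * xs[k - 1] + b) - ys[k - 1]
                grad_w += 2 * err * xs[k - 1]
                grad_b += 2 * err
            # Update the weights once per batch with the mean gradient.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


print(list(iterate_batches(7, 3)))  # [[1, 2, 3], [4, 5, 6], [7]]
```

The weights are updated once per completed batch, which is the role of the "else" branch in the procedure.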

Proposed Method

This study uses CNN to build a COVID-19 case prediction model, a deep neural network (DNN) with multiple hidden layers that we train and test. We use gradient descent over the model parameters to minimize the objective function. We adopt a pooling layer that differs from the usual choice in order to optimize the network and speed up training. The CNN combines convolution and pooling layers, whose kernel size and pooling settings shape the neural units. With the CNN model, we calculate the losses of the training and testing processes to achieve the best results across diverse input vectors. This study uses a 1D time series and processes the dataset with proper hyperparameter settings [23]. We establish a supervised learning model by defining the calculation over the neural network as follows:

Input features \(x^{(i)} \in \mathbb{R}^{n}\)

Outputs \(y^{(i)} \in Y\) (e.g., \(\mathbb{R}\), \(\{0, 1\}\), \(\{1, \dots, p\}\))

Model parameters \(\theta \in \mathbb{R}^{k}\)

Hypothesis function \(h_{\theta} : \mathbb{R}^{n} \to \mathbb{R}\)

Loss function \(\ell : \mathbb{R} \times Y \to \mathbb{R}_{+}\)

In this study, we calculate the optimization problem as follows:

$$\underset{\theta}{\text{minimize}} \sum_{i=1}^{m} \ell\left(h_{\theta}\left(x^{(i)}\right), y^{(i)}\right).$$
(3)

In this paper, the hypothesis function \(h_{\theta} : \mathbb{R}^{n} \to \mathbb{R}\) is computed by a neural network. On a CNN, we calculate a forward pass and a backward pass to obtain the gradient of the loss function. The forward pass convolves the input matrix \(x_{i}\) with the filter \(W_{i}\) to produce the convolution output \(z_{i}\) as follows:

$$f : \mathbb{R}^{n} \to \mathbb{R}^{m}$$
(4)
$$z_{i}\left(x_{i}\right) = W_{i} x_{i} + b.$$

The filters \(W_{i}\) and the bias term \(b\) are the parameters of the convolutional layer that are learned during training. The CNN is a supervised learning model with input, representation, and metrics that computes tensors in the hidden layers. It shares many identical neurons across the layers, which lets it run large models with a small number of parameters. Each layer receives a single input (the feature maps) and computes new feature maps as its output by convolving the filters across the input. The filters are learned during training through back-propagation. In the backward pass, for the vector-valued function \(f : \mathbb{R}^{n} \to \mathbb{R}^{m}\), we compute the \(m \times n\) Jacobian matrix:

$$\frac{\partial f(x)}{\partial x} = \left[\begin{array}{cccc} \frac{\partial f_{1}(x)}{\partial x_{1}} & \frac{\partial f_{1}(x)}{\partial x_{2}} & \dots & \frac{\partial f_{1}(x)}{\partial x_{n}} \\ \frac{\partial f_{2}(x)}{\partial x_{1}} & \frac{\partial f_{2}(x)}{\partial x_{2}} & \dots & \frac{\partial f_{2}(x)}{\partial x_{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_{m}(x)}{\partial x_{1}} & \frac{\partial f_{m}(x)}{\partial x_{2}} & \dots & \frac{\partial f_{m}(x)}{\partial x_{n}} \end{array}\right] \in \mathbb{R}^{m \times n}.$$
(5)
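As a rough illustration of the forward pass and the Jacobian, the sketch below applies a 1D convolution to a short series and approximates its Jacobian by finite differences. The helper names `conv1d` and `jacobian`, the "valid" (no padding) convolution, and the example filter values are assumptions made for this example; they are not taken from the proposed model.

```python
def conv1d(x, w, b):
    """Forward pass of a 'valid' 1D convolution: z[t] = sum_k w[k] * x[t + k] + b."""
    n, k = len(x), len(w)
    return [sum(w[i] * x[t + i] for i in range(k)) + b for t in range(n - k + 1)]


def jacobian(f, x, eps=1e-6):
    """Finite-difference approximation of the m x n Jacobian of f at x."""
    base = f(x)
    columns = []
    for j in range(len(x)):
        bumped = list(x)
        bumped[j] += eps
        out = f(bumped)
        # Column j holds d f_i / d x_j for every output i.
        columns.append([(o - s) / eps for o, s in zip(out, base)])
    # Transpose so that row i, column j is d f_i / d x_j.
    return [[columns[j][i] for j in range(len(x))] for i in range(len(base))]


x = [1.0, 2.0, 3.0, 4.0]   # n = 4 inputs
w = [0.5, -1.0]            # hypothetical filter of size 2
z = conv1d(x, w, b=0.1)    # m = 3 outputs
J = jacobian(lambda v: conv1d(v, w, 0.1), x)
for row in J:
    print([round(v, 3) for v in row])
```

Because the convolution is linear in its input, each row of the Jacobian contains the filter weights shifted by one position, which is the structure back-propagation exploits.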

In CNN, the pooling layer replaces the output at a given location with a summary statistic of the nearby outputs. Pooling makes the representation more robust, that is, approximately invariant to minor changes in the input. In addition, the pooling layer reduces the size of the intermediate representations, thereby reducing the computation needed to produce COVID-19 predictions [24].
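The effect of pooling can be seen in a small example. The sketch below assumes max pooling with non-overlapping windows, which is one common choice; the text does not state which pooling variant the proposed model uses, and `max_pool1d` is a hypothetical helper.

```python
def max_pool1d(values, pool_size=2, stride=None):
    """Summarize each window of `pool_size` neighbors by its maximum."""
    stride = stride or pool_size
    last_start = len(values) - pool_size
    return [max(values[i:i + pool_size]) for i in range(0, last_start + 1, stride)]


feature_map = [1, 3, 2, 5, 4, 4]
perturbed = [1.1, 3, 2.2, 5, 3.9, 4]   # small changes to the input

print(max_pool1d(feature_map))  # [3, 5, 4]
print(max_pool1d(perturbed))    # [3, 5, 4]
```

The pooled output is half the length of the input, and the small perturbation leaves it unchanged, which illustrates both the reduced representation and the approximate invariance.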

Experimental Setup

Main Idea

The primary purpose of this paper is to create a prediction model based on daily new cases and new deaths using the CNN algorithm. CNN computes the most informative features using several hidden layers in the training process and analyzes the model performance [9]. CNN performs strongly on long-term time series analysis because of its feature learning [18]. Therefore, several groups have used CNN to build models that predict the COVID-19 trend. The purpose of this method is to organize the data into pre-existing categories. We adopt the architecture because its high effectiveness and accuracy suit long-term forecasting problems [25, 26].

Dataset

In this experiment, we collect a COVID-19 dataset that consists of entity, code, day, new cases, and new deaths. The dataset is a time series from March 2, 2020 to November 16, 2021 and comes from the Our World in Data website. We divide it into a training set to construct the model and a testing set to evaluate the model performance. The split is 80% for training and 20% for testing of the total dataset, which contains 625 records with five features. Table 1 shows the details of the dataset distribution:

Table 1 Distribution of the dataset
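The date range and the 80/20 split described above can be checked with a short sketch. It assumes one record per day and a chronological split (earliest records for training); the text does not say whether the split is chronological or random, and the helper names are hypothetical.

```python
from datetime import date, timedelta


def daily_dates(start, end):
    """All calendar days from start to end, inclusive."""
    return [start + timedelta(days=d) for d in range((end - start).days + 1)]


def chronological_split(rows, train_fraction=0.8):
    """Keep the earliest rows for training and the most recent rows for testing."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


days = daily_dates(date(2020, 3, 2), date(2021, 11, 16))
train_days, test_days = chronological_split(days)
print(len(days), len(train_days), len(test_days))  # 625 500 125
```

Under these assumptions, the 625 daily records divide into 500 training and 125 testing records.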

Data Preprocessing

In data preprocessing, we process the raw dataset by cleaning, filtering, and combining data into information. We then handle missing values by eliminating inaccurate records so that the data remain relevant. After handling the missing values, we inspect the data type of each variable in the dataset and check for NA or empty entries [27]. Then, we delete the noisy data so that the data become effective [28]. Finally, we perform the vectorization, normalization, and feature extraction stages [29].
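A minimal sketch of two of these steps, removing records with missing values and normalizing a feature, is shown below. Min-max scaling to [0, 1] is an assumption, since the text does not specify the normalization method, and the field names and values are purely illustrative.

```python
def drop_missing(rows):
    """Remove records in which any field is missing (None)."""
    return [row for row in rows if all(v is not None for v in row.values())]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative records, not real data.
raw = [
    {"day": "2020-03-02", "new_cases": 10, "new_deaths": 0},
    {"day": "2020-03-03", "new_cases": None, "new_deaths": 1},
    {"day": "2020-03-04", "new_cases": 20, "new_deaths": 1},
    {"day": "2020-03-05", "new_cases": 30, "new_deaths": 2},
]
clean = drop_missing(raw)
scaled_cases = min_max_scale([row["new_cases"] for row in clean])
print(scaled_cases)  # [0.0, 0.5, 1.0]
```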

Prediction Method

In this study, we adopt CNN to establish a prediction model that analyzes the trend of COVID-19 cases. We first collect a dataset that includes entity, code, day, new cases, and new deaths, and then divide it into training and testing datasets.

In the second stage, we preprocess the data to convert the raw data format and handle missing values so that the samples are relevant. We then check each variable for empty entries and reduce the noise in the dataset. Eliminating the noisy data gives us a clean dataset for training and testing.

After the preprocessing stage, we use the training dataset to train the model and the testing dataset to evaluate its performance. In the testing process, we provide unseen data and modify some hyperparameters to obtain the model with the best accuracy. Finally, the proposed approach yields an effective model for predicting positive cases of COVID-19 [30].

Result and Analysis

Prediction Test

In this study, we construct our model from the most informative features for predicting the number of COVID-19 cases. The study gathers a dataset of new positive cases and deaths. The experiment uses a 30-day model to predict COVID-19 cases in the next 10 days. Figure 1 shows the prediction of new cases, and Fig. 2 shows the prediction of new deaths.

Fig. 1 Prediction graph for new cases in the next 10 days

Fig. 2 Prediction graph for new deaths in the next 10 days

In Figure 1, the dashed line shows the predicted new cases, and the blue line shows the actual daily new COVID-19 cases. The predictions are close to the actual new cases of COVID-19 on a scale of 0–550 new cases in the next 10 days. In Figure 2, the dashed line shows the predicted deaths, and the red line shows the actual daily COVID-19 deaths. The predictions are close to the actual new deaths of COVID-19 on a scale of 0–30 new deaths in the next 10 days.

In both graphs, the x-axis is the date over the 10-day prediction period, and the y-axis is the daily number of new cases or deaths. The dashed line allows the predictions to be compared with the actual values. The graphs indicate that the prediction model achieves results close to the actual daily cases. Based on the experimental results, our proposed CNN model produces predictions that follow the pattern of the actual case line. After several training phases, the prediction technique reaches higher accuracy with a tiny loss.
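The 30-day model and 10-day prediction described in this section can be framed as a supervised learning problem by slicing the series into sliding windows. The sketch below assumes that each sample pairs 30 consecutive days of input with the following 10 days as the target; whether the proposed model predicts the 10 days in one step or recursively is not stated, and `make_windows` is a hypothetical helper.

```python
def make_windows(series, window=30, horizon=10):
    """Pair each `window`-day input slice with the next `horizon` days as the target."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        targets = series[start + window:start + window + horizon]
        samples.append((inputs, targets))
    return samples


# A synthetic 50-day series stands in for the daily new-case counts.
series = list(range(50))
samples = make_windows(series)
print(len(samples))                          # 11
print(samples[0][0][:3], samples[0][1][:3])  # [0, 1, 2] [30, 31, 32]
```

A series of length n yields n - 39 samples with these settings, so the 500 training days assumed by an 80% split would give 461 windows.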

Evaluation Metric

Predictive models can be evaluated with a variety of assessment criteria. We test the prediction model by calculating the RMSE and MAPE of its predictions. Smaller RMSE and MAPE values indicate lower error rates and more accurate results. The RMSE and MAPE are calculated using the following formulas:

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}$$
(6)
$$MAPE=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{{y}_{i}-{\widehat{y}}_{i}}{{y}_{i}}\right|\times 100\mathrm{\%},$$
(7)

where \({y}_{i}\) represents the actual value, \({\widehat{y}}_{i}\) represents the predicted value, and \(n\) represents the number of data points.
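Both metrics are straightforward to compute. The sketch below uses the standard definitions, with RMSE as the square root of the mean squared error and MAPE as the mean absolute percentage error; the function names and sample values are illustrative only and are not results from the experiment.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean of squared differences."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Undefined if any actual value is 0."""
    n = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / n


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(rmse(actual, predicted), 4))  # 12.9099
print(round(mape(actual, predicted), 4))  # 6.6667
```

Note that MAPE divides by the actual value, so days with zero reported cases or deaths need to be excluded or handled separately before it can be computed.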

Table 2 reports the error values obtained with epochs = 1000. The COVID-19 predictions made with the CNN algorithm show low error values: new cases have an RMSE of 0.00082 and a MAPE of 0.02440, and new deaths have an RMSE of 0.00468 and a MAPE of 0.06446. These small RMSE and MAPE values indicate that our proposed model has a strong prediction capability.

Table 2 Results of RMSE and MAPE scores

Conclusion

To track the spread of COVID-19, some researchers use conventional techniques to predict positive cases [6, 11]. However, conventional methods still have drawbacks in accurately predicting the actual trend of COVID-19 cases. To solve this problem, we construct a novel CNN-based learning model that effectively predicts COVID-19 cases in Indonesia. This study uses daily COVID-19 cases as the dataset for the prediction model. We use the case trend dataset to predict the trend with different parameters in the hidden layers, and we tune various hyperparameters to achieve high-accuracy results.

In testing, we tune several hyperparameters to achieve high accuracy and a tiny loss by setting a large number of epochs and different regularizers. Based on the experimental results, the model obtains RMSE 0.00070 and MAPE 0.02440 for new cases, and RMSE 0.00468 and MAPE 0.06446 for new deaths. Furthermore, this study not only calculates accuracy and loss but also evaluates the model to confirm its performance. Therefore, the proposed model can be a promising solution to the COVID-19 prediction problem in real time.

To improve the prediction results, future work could adopt a GCN architecture for this model. GCN is an extension of CNN that operates directly on graphs. The GCN architecture could yield higher accuracy with additional hyperparameter settings.