1 Introduction

COVID-19 is a contagious disease caused by the novel coronavirus (SARS-CoV-2) and was first detected in Wuhan, in the Hubei province of China, in December 2019. Singh et al. (2021) found that higher temperatures may bring down the number of confirmed COVID-19 cases and proposed that better access to sanitation facilities is a reliable measure to contain the spread of the virus. Bashir (2020) reported a significant positive association between temperature and COVID-19 cases. In another study, Bilal (2021b) found temperature, humidity, environmental quality index, PM2.5, and rainfall to be significant factors related to the COVID-19 pandemic in the 10 most affected states of the USA.

COVID-19 soon turned into a pandemic, leading the world into a lockdown that is speculated to have an economic cost greater than that of the Great Depression of the 1930s (IMF 2020). According to the IMF, several developed and undeveloped countries that had weak economic growth before the current pandemic will suffer more under economic deceleration and decline (IMF 2020). The pandemic has yet to recede in intensity, with more than 63 million cases and 1.47 million deaths as of November 30, 2020. The volume and spread of infection have increased exponentially since the first week of March in the United States, which now has the highest number of both confirmed cases and deaths (Worldometer 2020). The effect of the COVID-19 pandemic on the US economy has been largely disruptive, adversely affecting financial markets, travel, shipping, employment, and other industries (CBO 2020). The US stock market index Dow Jones Industrial Average (DJI) fell by more than 3% as COVID-19 spread rapidly across the world over the weekend. The sharpest declines since 2008 in the US stock market indices, including the DJI, NASDAQ-100 and S&P 500, were reported on February 27. On February 28, stock markets across the world reported their largest single-week decline since 2008, with markets closing down worldwide, and the market fell to a new record low on March 6 (YahooFinance 2020b; Wikipedia 2020).

Forecasting the Dow Jones Industrial Average (DJI) together with the number of confirmed cases of COVID-19 in the USA is important for understanding the extent of the lockdown and the pandemic. Such studies have been carried out previously. The current work compares different forecasting methods to identify the most effective one for predicting the behavior of the DJI index in relation to the COVID-19 crisis. This is especially important given the huge economic costs of the current pandemic and its widespread impact on the investment and spending behavior, and the returns, of people across the world.

Many researchers have applied a diverse and increasingly effective range of methods and indicators to predict the spread of infections and pandemics, and also to map and forecast the current COVID-19 crisis. Some of these have been included in the current study. The Susceptible-Exposed-Infected-Recovered (SEIR) model has been used to predict the transmission of the pandemic by Gatto et al. (2020) in Italy and by Kuniya (2020) in Japan. The Susceptible-Infected-Recovered (SIR) model has been used by Das (2020) to predict the reproduction number of COVID-19 at both centre and state levels in India. A temporal deep learning method, derived from the time-aware long short-term memory (T-LSTM) neural network, has been suggested by Li et al. (2020) to forecast COVID-19 in Wuhan, China, while Arora et al. (2020) used deep learning-based models to forecast the number of reported COVID-19 positive cases in 32 states and union territories of India. Time series prediction logistic models based on machine learning have been applied in Brazil, Russia, India, Peru and Indonesia for global predictions (Wang et al. 2020). GROOMs is a combination of five forecasting models proposed by Fong et al. (2020). Singh et al. (2020) used the ARIMA model to forecast the infection trajectory for the coming two months. In Indian states, the epidemiological spread has been predicted using ARIMA (Roy et al. 2020). Castillo and Melin (2020) used a hybrid approach to time series forecasting that combines fractal theory and fuzzy logic. The Limited Failure Population concept has been used by Koutsellis and Nikas (2020) to predict the spread of the virus in European countries and the US. Singhal et al. (2020) applied a mathematical model and the Fourier decomposition method (FDM) in India, Italy and the US. The Network Inference based Prediction Algorithm (NIPA) was used in Hubei, China and the Netherlands, and was found to be superior to other forecasting algorithms by Achterberg et al. (2020). Eydoux et al. (2021) reported that the application of a purified and highly active SARS-CoV replication/transcription complex (RTC) provides a new strategy for the rapid identification of potential anti-SARS inhibitors. Bashir et al. (2020) found that the restriction of economic activities helped to reduce the pollution level in the ecosystem, but concluded that this change is not permanent and that pollution may increase again in the future. Ahmar et al. (2020) have used the SutteARIMA method, a short-term prediction method, for COVID-19 in the USA and for COVID-19 confirmed cases and the stock market in Spain. Salisu et al. (2020) applied the Global Fear Index (GFI) to understand the predictability of commodity price levels during COVID-19. Nabipour et al. (2020) used machine learning algorithms to predict future values of the Tehran stock market, while Garcia et al. (2020) applied kernel adaptive filtering (KAF) with stock market interdependence to systematically predict stock market returns. Xie et al. (2011) developed support vector machine (SVM) and multivariate discriminant analysis (MDA) models to make predictions for Chinese listed companies. Pang et al. (2020) advocate the use of a deep long short-term memory (LSTM) neural network with an embedded layer, and of an LSTM neural network with an automatic encoder, to forecast stock market data. Ahmar et al. (2018) compared ARIMA, Holt-Winters, SARIMA and the α-Sutte indicator, which are widely used along with NNAR in predicting stock market and time series data. The SutteARIMA prediction model has again been used to predict COVID-19 confirmed cases in Spain (Ahmar and Boj 2020a, b). Ojugo and Yoro (2020) have used the ARIMA model to predict the oil market and its price direction.
In the preceding literature, the SutteARIMA model tends to yield lower MAPE and MSE values. To identify the more effective model, the current work compares the SutteARIMA model with the traditional ARIMA model.

2 Literature review

2.1 ARIMA

The process \(S_{t}\) is an autoregressive moving average, or ARMA(p, q), model if it fulfils:

$$\phi _{p} \left( C \right)S_{t} = \theta _{q} \left( C \right)a_{t} ,a_{t} \sim WN\left( {0,\sigma ^{2} } \right),\phi _{p} ,\theta _{q} \in \mathbb{R},t \in \mathbb{Z}$$
(1)

with \(\phi_{p}(C) = 1 - \phi_{1}C - \phi_{2}C^{2} - \ldots - \phi_{p}C^{p}\) (for AR(p))

and \(\theta_{q}(C) = 1 - \theta_{1}C - \theta_{2}C^{2} - \ldots - \theta_{q}C^{q}\) (for MA(q)).

If differencing of order d is applied, the ARIMA model becomes:

$${\phi }_{p}\left(C\right){\left(1-C\right)}^{d}{S}_{t}={\theta }_{q}\left(C\right){a}_{t},{a}_{t}\sim WN\left(0,{\sigma }^{2}\right),{\phi }_{p},{\theta }_{q}\in \mathbb{R},t\in \mathbb{Z}.$$

with \(\phi_{p}(C) = 1 - \phi_{1}C - \phi_{2}C^{2} - \ldots - \phi_{p}C^{p}\) (for AR(p)), \((1 - C)^{d}\) (for non-seasonal differencing) and \(\theta_{q}(C) = 1 - \theta_{1}C - \theta_{2}C^{2} - \ldots - \theta_{q}C^{q}\) (for MA(q)).
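As a small illustration of the differencing operator \((1 - C)^{d}\), the following Python sketch applies it d times to a short series. The function name `difference` and the sample values are hypothetical and are not taken from the study's data or software.

```python
def difference(series, d=1):
    """Apply the differencing operator (1 - C)^d to a sequence of numbers."""
    values = list(series)
    for _ in range(d):
        # (1 - C) S_t = S_t - S_{t-1}; each pass shortens the series by one.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Illustrative cumulative counts (made up, not data from the study).
cumulative = [100, 110, 125, 145, 170]
print(difference(cumulative, d=1))  # [10, 15, 20, 25]
print(difference(cumulative, d=2))  # [5, 5, 5]
```

In this made-up example the second differences are constant, so d = 2 would be a natural order of differencing before fitting the ARMA part.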

2.2 α-Sutte indicator

The α-Sutte indicator forecasts a variable from its own previous values (Ahmar et al. 2018). The model applies an adapted version of the moving average (MA) method, which is generally used to detect and forecast trends in time series data. The α-Sutte indicator's predictions are based on the four preceding observations, i.e., \(S_{t-1}\), \(S_{t-2}\), \(S_{t-3}\), and \(S_{t-4}\) (Ahmar et al. 2018). The principal equation of the indicator is (Ahmar et al. 2018):

$$S_{t} = \frac{{\gamma \left( {\frac{{\Delta k}}{{\frac{{\gamma + \delta }}{2}}}} \right) + \beta \left( {\frac{{\Delta m}}{{\frac{{\beta + \gamma }}{2}}}} \right) + \alpha \left( {\frac{{\Delta s}}{{\frac{{\alpha + \beta }}{2}}}} \right)}}{3}~$$
(2)

where:

$$\begin{array}{c}\delta ={S}_{t-4}\\ \gamma ={S}_{t-3}\\ \beta ={S}_{t-2}\\ \alpha ={S}_{t-1}\end{array}$$
$$\begin{array}{c}\Delta k=\gamma -\delta ={S}_{t-3}-{S}_{t-4}\\ \Delta m=\beta -\gamma ={S}_{t-2}-{S}_{t-3}\\ \Delta s=\alpha -\beta ={S}_{t-1}-{S}_{t-2}\end{array}$$

\(S_{t}\) = data at time t, \(S_{t-r}\) = data at time (t − r).
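The indicator can be sketched in Python as follows. This is only a literal transcription of Eq. (2) as printed above, not the SutteForecastR implementation; the function name `alpha_sutte` and its argument names are hypothetical.

```python
def alpha_sutte(s_tm4, s_tm3, s_tm2, s_tm1):
    """Evaluate Eq. (2) from the four preceding observations.

    Arguments are S_{t-4}, S_{t-3}, S_{t-2}, S_{t-1}, in that order.
    A ZeroDivisionError is raised if two adjacent observations sum to zero.
    """
    delta, gamma, beta, alpha = s_tm4, s_tm3, s_tm2, s_tm1

    # Successive changes, as defined below Eq. (2).
    dk = gamma - delta
    dm = beta - gamma
    ds = alpha - beta

    # Each change is scaled by the mean of the two observations it spans.
    term_k = gamma * dk / ((gamma + delta) / 2)
    term_m = beta * dm / ((beta + gamma) / 2)
    term_s = alpha * ds / ((alpha + beta) / 2)

    return (term_k + term_m + term_s) / 3


# Made-up inputs: only the last step changes (1 -> 3).
print(alpha_sutte(1.0, 1.0, 1.0, 3.0))  # 1.0
```

With these inputs only the Δs term is non-zero: 3 · 2 / ((3 + 1) / 2) = 3, and dividing by 3 gives 1.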

2.3 SutteARIMA

SutteARIMA is a forecasting method that combines ARIMA with the α-Sutte indicator (Ahmar et al. 2018). The SutteARIMA forecast is the average of the forecasts from ARIMA and the α-Sutte indicator.

The model in Eq. (1) can be written out as:

$$\begin{array}{*{20}c} {\left( {1 - \phi _{1} C - \phi _{2} C^{2} - ... - \phi _{p} C^{p} } \right)S_{t} = \left( {1 - \theta _{1} C - \theta _{2} C^{2} - ... - \theta _{q} C^{q} } \right)a_{t} } \\ {S_{t} - \phi _{1} CS_{t} - \phi _{2} C^{2} S_{t} - ... - \phi _{p} C^{p} S_{t} = a_{t} - \theta _{1} Ca_{t} - \theta _{2} C^{2} a_{t} - ... - \theta _{q} C^{q} a_{t} } \\ \end{array}$$
(3)

Eq. (3) can be reduced by using the backward shift operator (\(C^{p}S_{t} = S_{t-p}\)):

$$\begin{array}{*{20}c} {S_{t} - \phi _{1} S_{{t - 1}} - \phi _{2} S_{{t - 2}} - \ldots - \phi _{p} S_{{t - p}} } \\ { = a_{t} - \theta _{1} a_{{t - 1}} - \theta _{2} a_{{t - 2}} - ... - \theta _{q} a_{{t - q}} } \\ {S_{t} = \phi _{1} S_{{t - 1}} + \phi _{2} S_{{t - 2}} + \ldots + \phi _{p} S_{{t - p}} } \\ { + a_{t} - \theta _{1} a_{{t - 1}} - \theta _{2} a_{{t - 2}} - ... - \theta _{q} a_{{t - q}} } \\ \end{array}$$
(4)
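The recursion in Eq. (4) can be sketched in Python for a one-step-ahead prediction. This sketch assumes the coefficients are already estimated and replaces the unknown current shock \(a_{t}\) by its mean, zero; the names `arma_one_step`, `history`, `residuals`, `phi` and `theta` are hypothetical.

```python
def arma_one_step(history, residuals, phi, theta):
    """One-step-ahead prediction of S_t from Eq. (4), with a_t set to zero.

    history   : past observations, oldest first (history[-1] is S_{t-1})
    residuals : past shocks, oldest first (residuals[-1] is a_{t-1})
    phi       : AR coefficients [phi_1, ..., phi_p]
    theta     : MA coefficients [theta_1, ..., theta_q]
    """
    # phi_1 S_{t-1} + ... + phi_p S_{t-p}
    ar_part = sum(p * history[-i] for i, p in enumerate(phi, start=1))
    # theta_1 a_{t-1} + ... + theta_q a_{t-q}
    ma_part = sum(c * residuals[-j] for j, c in enumerate(theta, start=1))
    # Eq. (4) subtracts the MA terms.
    return ar_part - ma_part


# Made-up ARMA(2, 1) example.
prediction = arma_one_step(
    history=[1.0, 2.0, 3.0],
    residuals=[0.5, -0.2],
    phi=[0.5, 0.2],
    theta=[0.4],
)
print(round(prediction, 2))  # 1.98
```

Here the AR part is 0.5 · 3 + 0.2 · 2 = 1.9 and the MA part is 0.4 · (−0.2) = −0.08, so the prediction is 1.9 + 0.08 = 1.98.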

If we define:

$$\begin{array}{c}\delta ={S}_{t-4}\\ \gamma ={S}_{t-3}\\ \beta ={S}_{t-2}\\ \alpha ={S}_{t-1}\end{array}$$

then Eq. (4) can be written as:

$$S_{t} = \phi _{1} \alpha + \phi _{2} \beta + \phi _{3} \gamma + \phi _{4} \delta + \ldots + \phi _{p} S_{{t - p}} + a_{t} - \theta _{1} a_{{t - 1}} - \theta _{2} a_{{t - 2}} - ... - \theta _{q} a_{{t - q}}$$
(5)

and Eq. (2) can be simplified as:

$${S}_{t}=\frac{\gamma \left(\frac{\Delta k}{\frac{\gamma +\delta }{2}}\right)+\beta \left(\frac{\Delta m}{\frac{\beta +\gamma }{2}}\right)+\alpha \left(\frac{\Delta s}{\frac{\alpha +\beta }{2}}\right)}{3}$$
$${S}_{t}=\frac{\frac{\gamma \Delta k}{\frac{\gamma +\delta }{2}}+\frac{\beta \Delta m}{\frac{\beta +\gamma }{2}}+\frac{\alpha \Delta s}{\frac{\alpha +\beta }{2}}}{3}$$
$${S}_{t}=\frac{\gamma \Delta k}{\frac{3\gamma +3\delta }{2}}+\frac{\beta \Delta m}{\frac{3\beta +3\gamma }{2}}+\frac{\alpha \Delta s}{\frac{3\alpha +3\beta }{2}}$$
$${S}_{t}=\frac{2\gamma \Delta k}{3\gamma +3\delta }+\frac{2\beta \Delta m}{3\beta +3\gamma }+\frac{2\alpha \Delta s}{3\alpha +3\beta }$$
$${S}_{t}=\gamma \frac{2\Delta k}{3\gamma +3\delta }+\beta \frac{2\Delta m}{3\beta +3\gamma }+\alpha \frac{2\Delta s}{3\alpha +3\beta }$$

Adding Eq. (5) to the simplified form of Eq. (2), we find:

$$\begin{array}{c}2{S}_{t}={\phi }_{1}\alpha +{\phi }_{2}\beta +{\phi }_{3}\gamma +{\phi }_{4}\delta +...+{\phi }_{p}{S}_{t-p}+{a}_{t}-{\theta }_{1}{a}_{t-1}-{\theta }_{2}{a}_{t-2}-...-{\theta }_{q}{a}_{t-q}+\\ \gamma \frac{2\Delta k}{3\gamma +3\delta }+\beta \frac{2\Delta m}{3\beta +3\gamma }+\alpha \frac{2\Delta s}{3\alpha +3\beta }\end{array}$$
$$\begin{array}{*{20}c} {S_{t} = \alpha \left( {\frac{{\phi _{1} }}{2} + \frac{{\Delta s}}{{3\alpha + 3\beta }}} \right) + \beta \left( {\frac{{\phi _{2} }}{2} + \frac{{\Delta m}}{{3\beta + 3\gamma }}} \right) + \gamma \left( {\frac{{\phi _{3} }}{2} + \frac{{\Delta k}}{{3\gamma + 3\delta }}} \right) + } \\ {\frac{{\phi _{4} \delta }}{2} + ... + \frac{{\phi _{p} S_{{t - p}} }}{2} + \frac{{a_{t} }}{2} - \frac{{\theta _{1} a_{{t - 1}} }}{2} - \frac{{\theta _{2} a_{{t - 2}} }}{2} - ... - \frac{{\theta _{q} a_{{t - q}} }}{2}} \\ \end{array}$$
(6)

Eq. (6) is thus the formula of SutteARIMA.
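Since the SutteARIMA forecast is described as the average of the ARIMA and α-Sutte forecasts, the combination can be sketched in Python as below. For brevity the sketch assumes a pure AR model on four lags with the shock and MA terms set to zero; the names `alpha_sutte`, `sutte_arima`, `lags` and `phi` are hypothetical, and this is not the SutteForecastR code.

```python
def alpha_sutte(delta, gamma, beta, alpha):
    """Eq. (2) evaluated from S_{t-4}, S_{t-3}, S_{t-2}, S_{t-1}."""
    dk, dm, ds = gamma - delta, beta - gamma, alpha - beta
    return (
        gamma * dk / ((gamma + delta) / 2)
        + beta * dm / ((beta + gamma) / 2)
        + alpha * ds / ((alpha + beta) / 2)
    ) / 3


def sutte_arima(lags, phi):
    """Average of an AR forecast and the alpha-Sutte value.

    lags : [S_{t-1}, S_{t-2}, S_{t-3}, S_{t-4}], most recent first
    phi  : AR coefficients [phi_1, ...]; assumed already estimated
    """
    alpha, beta, gamma, delta = lags
    # AR part of Eq. (5); shock and MA terms are assumed to be zero here.
    arima_part = sum(p * s for p, s in zip(phi, lags))
    sutte_part = alpha_sutte(delta, gamma, beta, alpha)
    # Averaging the two forecasts corresponds to dividing their sum by 2.
    return (arima_part + sutte_part) / 2


# Made-up example: AR(1) with phi_1 = 0.5 and lags 3, 1, 1, 1.
print(sutte_arima([3.0, 1.0, 1.0, 1.0], [0.5]))  # 1.25
```

In this example the AR part is 0.5 · 3 = 1.5 and the α-Sutte part is 1.0, so the average is 1.25.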

After an overview of related models, the next section discusses the methodology.

3 Methodology

COVID-19 data for this research were collected from Worldometer (2020) and CSSE, JHU (2020) for 20 January to 06 December 2020, and DJI data from YahooFinance (2020) for 21 January 2020 to 04 December 2020. The data are divided into two parts: training data and test data. Training data run from 20 January 2020 to 29 November 2020 for COVID-19 confirmed cases and from 21 January 2020 to 24 November 2020 for the DJI, while test data cover 30 November to 06 December 2020 for COVID-19 and 25 November to 04 December 2020 for the DJI. Based on the fitted data, this study produced short-term forecasts for five future periods. Ahmar (2018) developed the SutteForecastR package in R, which compares the results of different forecasting methods. To interpret the predictions, we used the forecasting accuracy measure MAPE (Mean Absolute Percentage Error) (Kim and Kim 2016; Ahmar 2020):

$${\text{MAPE}} = \frac{1}{N}\mathop \sum \limits_{{t = 1}}^{N} ~\left| {\frac{{A_{t} - F_{t} }}{{A_{t} }}} \right|$$
(7)

where \(A_{t}\) is the actual value at time t and \(F_{t}\) is the forecast value at time t.
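Eq. (7) can be sketched in Python as follows. The function name `mape` and the sample numbers are hypothetical; the function returns a fraction exactly as Eq. (7) is written, so it would need to be multiplied by 100 to be read as a percentage.

```python
def mape(actual, forecast):
    """Mean absolute percentage error as in Eq. (7), returned as a fraction."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    # Every actual value must be non-zero, since it appears in the denominator.
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up test window: both forecasts are off by 10% of the actual value.
error = mape([100.0, 200.0], [110.0, 180.0])
print(round(error, 4))  # 0.1
```

When two methods are compared on the same test window, the one with the smaller value is preferred.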

4 Results and discussion

Short-term daily estimates of COVID-19 cases and stock prices are important for making strategic decisions for the future. In the case of COVID-19, daily forecasts can provide information that helps in finding ways to prevent its spread. Figure 1 suggests that confirmed daily cases of COVID-19 in the US will continue to increase until the curve turns downward. The graph also shows that the curve has passed through its first peak period and that the addition of confirmed COVID-19 cases in the US seems to be decreasing or fluctuating.

Fig. 1 Daily New Cases of COVID-19 in the USA

The stock market was strongly affected because, with the outbreak of the pandemic, investors began panic selling of stocks, which resulted in a drop in stock prices. This trend intensified after the WHO declared COVID-19 a pandemic. The United States has seen a large number of cases and deaths since then, so the impact on the US is particularly important. Figure 2 shows the closing price of the DJI, which has fluctuated since the beginning of COVID-19 in the USA.

Fig. 2 Closing Price of DJI in the USA

As described in the methodology, forecasting was conducted using the ARIMA and SutteARIMA methods. The results of the two methods are presented in Table 1 for confirmed COVID-19 cases and in Table 3 for the DJI closing price in the USA.

Table 1 Results of fitting confirmed cases of COVID-19 in the USA

The SutteARIMA method was found to be the most appropriate method for predicting COVID-19 confirmed cases and the DJI, with MAPE values of 0.56 and 0.60 respectively (smaller than the MAPE values of ARIMA), as shown in Tables 1 and 3. The technique has therefore been used to predict COVID-19 confirmed cases for five days (Table 2 and Fig. 3). Based on Table 3, the SutteARIMA method, with a MAPE value of 0.60, is also more suitable than the ARIMA method for the DJI. It has therefore been used to forecast the DJI from 05 to 09 December 2020 (Table 4 and Fig. 4). Finally, based on the forecasting results for COVID-19 and the DJI, we may conclude that SutteARIMA is the most suitable method for forecasting confirmed COVID-19 cases and the closing price of the DJI in the USA. This is verified by the forecasting accuracy measure (MAPE), and the SutteARIMA method is thus considered the best method for all data. Predicted values of COVID-19 confirmed cases and the DJI closing price are also shown in Figs. 3 and 4 respectively, with 99% upper and lower limits.

Table 2 Forecast for confirmed cases of COVID-19 in the USA
Fig. 3 Forecast for confirmed cases of COVID-19 in the USA (with 99% Upper and Lower Limit)

Table 3 Results of fitting data of DJI Stock
Table 4 Forecast for Closing Stock Price DJI in the USA
Fig. 4 Forecast for Closing Stock Price DJI in the USA (with 99% Upper & Lower Limit)

5 Conclusion

Predicting COVID-19 cases and the DJI in the USA can provide useful guidance for policymakers in the future. In fitting data of confirmed COVID-19 cases in the USA from 07 to 11 December 2020, SutteARIMA is more suitable than the ARIMA method. In fitting DJI data in the USA from 02 to 06 December 2020, SutteARIMA is likewise found to be more suitable than the ARIMA method. Finally, SutteARIMA produced five-day daily forecasts of COVID-19 confirmed cases for 07 to 11 December 2020 (14,812,981, 14,996,471, 15,179,962, 15,363,452 and 15,546,942; Table 2) and of the DJI stock price for 05 to 09 December 2020 (30,063, 30,052, 30,041, 30,030 and 30,020; Table 4).

6 Policy implication and further research

The findings of the paper will be helpful for policymakers in ascertaining the future impact of the pandemic and can be used as a tool in policy formulation. Tracking changes in consumption and investment behavior and their impact on returns is important for devising and initiating suitable policy prescriptions for the target population. The current study opens a new arena of research for examining the relation between stock markets and other economic indices in countries and economies across the world. This will help in gaining a holistic understanding of the economic impact of the current pandemic with new and advanced indicators. For further research, this method can be compared with other methods, for example, Holt-Winters, the α-Sutte indicator, NNAR, Theta, the time series linear model (TSLM) or other forecasting methods.