1 Introduction

The COVID-19 pandemic has thrown the world into economic chaos. It is the worst turmoil experienced since 1930 (IMF, 2021). On the one hand, the International Monetary Fund (IMF) estimated that global growth was −3.2% in 2020, with advanced (emerging market) economies experiencing negative growth of 4.6% (2.1%) (IMF, 2021). On the other hand, the global economy is projected to grow 6.0% in 2021 and 4.9% in 2022 (IMF, 2021). Global financial markets experienced steep declines during the pandemic. For example, by March 20, 2020, the Dow Jones Industrial Average (DJIA) had dropped by 33% from its value on December 31, 2019, when the World Health Organisation (WHO) reported the first confirmed case of the coronavirus in Wuhan, China (World Economic Forum, 2020). Also, the DJIA experienced its third and sixth largest daily drops on March 16 and March 12, 2020, respectively (World Economic Forum, 2020). Other leading indices, such as the Financial Times Stock Exchange 100 (FTSE 100) in the United Kingdom and the Nikkei 225 in Japan, also experienced large declines (Bloomberg, 2020; The Guardian, 2020).

Financial crises, such as the Global Financial Crisis of 2007–2008 and the crisis induced by the COVID-19 pandemic, affect both commodity and financial markets, and these effects spill over into returns and volatility (Cheng, 2020; Corbet et al., 2020), portfolio allocations (Yoshino et al., 2021), efficiency in foreign exchange markets (Aslam et al., 2020; Narayan, 2020), and oil prices (Salisu et al., 2020a, b). Since the beginning of the COVID-19 crisis, the foreign exchange market has experienced unprecedented volatility (Aslam et al., 2020). The foreign exchange market is the largest financial market in average trading volume (Bank for International Settlements, 2019). Hence, it is critical for portfolio investors, regulators, and policymakers to understand the volatility in foreign exchange rates. Central banks across the world have attempted to manage the stability of their foreign exchange markets in a bid to withstand the COVID-19 pandemic (The Reserve Bank of Australia, 2020).

Several studies have examined the impact of COVID-19 on the foreign exchange market (e.g., Aslam et al., 2020; Hofmann et al., 2020; Umar & Gubareva, 2020). These studies applied traditional regression techniques to investigate the volatility of foreign exchange markets. However, these studies failed to consider the predictability of exchange rates during COVID-19 and non-COVID-19 periods. Our study attempts to apply the best machine learning and deep learning algorithms to predict the foreign currency exchange rates during the COVID-19 pandemic and compare them with the rates during the normal non-COVID-19 period. It is important to predict the exchange rate accurately because it helps policymakers and businesspeople to improve the quality and quantity of appropriate management decisions and plan their finances more precisely. Different methods are used to predict the foreign currency exchange rate (Mahmoud & Hosseini, 1994; Maya & Gómez, 2008; Rambaldi et al., 2015; Windsor & Thyagaraja, 2001); most of them have been based on statistical analysis. This study applies deep learning approaches to predict the exchange rates of 21 currencies against USD. We use the Long Short-Term Memory (LSTM) approach, a recurrent neural network algorithm that is one of the most promising current approaches in the field of deep learning. LSTM performs particularly well in analysing time-series data. One of the limitations of neural networks other than LSTM is that there is no memory associated with them, and this causes a major problem for time-series data. LSTM overcomes this issue by creating both a short-term and a long-term memory component. However, a traditional recurrent neural network (RNN) such as LSTM cannot handle the data when the input and output are of different sizes. Bi-directional LSTM (Bi-LSTM) overcomes this problem by considering both previous and future data patterns. This in turn leads to effective learning of long-term dependencies in the time-series data. 
For comparative purposes, this study also uses support vector regression (SVR) and random forest (RF) regression. The main advantages of SVR are that the computation does not depend on dimensionality and that it has an excellent generalisation capability with a high prediction accuracy. As a result, SVR works well for large datasets and can handle a large number of input variables without deletion. The most significant advantage of RF is its versatility. In fact, ensemble methods have been neglected in the existing literature on foreign currency market predictions even though ensemble methods are very powerful in reducing the variance inherently present in complex and volatile financial markets (Carta et al., 2021). Therefore, we propose to use an ensemble deep learning approach, Bi-LSTM BR, by combining Bi-LSTM, a state-of-the-art deep learning-based model, with Bagging Ridge (BR) regression.

This study shows that the proposed ensemble deep learning approach provides a robust prediction performance, performing well for different clusters of time-series data in our study. Our study analyses the prediction performance by clustering data based on the waves of confirmed COVID-19 cases and on the timing of fiscal and monetary policies meant to combat the economic consequences of the COVID-19 pandemic. Four clusters are defined based on the waves of confirmed COVID-19 cases in the United States, and three clusters are based on the U.S. government’s monetary and fiscal policy, given that the United States was the country most affected by the pandemic and that the USD is the common currency in every currency pair. All algorithms are applied to each dataset, the prediction performances are recorded separately, and the accuracy curve is generated. In predicting the foreign currency exchange rate, we measured the performance of the algorithms using the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Results demonstrate that our proposed ensemble deep learning approach performed well in predicting the exchange rate during the non‒COVID-19 and COVID-19 periods.

The rest of the paper is organized as follows. Section 2 briefly reviews existing research work on predicting the foreign exchange market and the impact of COVID-19 on this market. Section 3 presents our experimental design and the data used for experiments. Section 4 outlines the proposed Bi-LSTM BR hybrid model and the research methodology used. Section 5 presents the experimental analysis, and Sect. 6 checks the robustness of the proposed model. The results are discussed in Sect. 7, and Sect. 8 concludes.

2 Related literature

This section reviews previous studies on predicting the foreign exchange market and provides the theoretical justification for investigating the impact of COVID-19 on the predictability of foreign exchange markets.

Nonlinear time-series models used for predicting the foreign exchange market fall into two categories, namely (1) traditional statistical parametric models (Mahmoud & Hosseini, 1994; Windsor & Thyagaraja, 2001; Rambaldi et al., 2005) and (2) advanced machine learning methods. Model specification is required for the former category of models, and it limits their accuracy because many nonlinear time-series patterns cannot be captured. Because advanced machine learning methods, such as support vector machine (SVM) and neural networks (NNs), outperformed the traditional parametric models in recent studies, we present these approaches in Table 1.

Table 1 Summary of previous studies using advanced machine learning methods for predicting foreign exchange market rates

Among the advanced machine learning methods, SVM and extreme learning machines were particularly effective when handling multidimensional time-series data obtained by using technical indicators for preprocessing. However, these machine learning methods were proven ineffective when handling large and noisy data (Munkhdalai et al., 2019). In contrast, LSTM-based models were highly effective due to their capacity to capture high-level temporal features from the foreign exchange time-series data (Ahmed et al., 2020; Rundo, 2019; Yıldırım et al., 2021). The main limitation of existing approaches is that only single LSTMs were used without considering both previous and future data patterns, something that is needed to effectively learn long-term dependencies in the data. Moreover, ensemble methods have been overlooked even though the variance was substantially reduced and a more robust performance was achieved by combining multiple base LSTM models in a related stock market prediction problem (Borovkova & Tsiamas, 2019).

Several studies revealed a significant impact of COVID-19 on the volatility of foreign exchange markets. Hofmann et al. (2020) showed that borrowing through local currency bonds did not protect emerging market economies from the financial shock triggered by COVID-19, because the local currency bond spreads rose sharply. As a result, portfolio investors faced amplified losses. A high coherence was also found between the Coronavirus Panic Index and foreign exchange markets, indicating that new cross-currency hedges should be introduced to withstand the adverse effects of global economic turmoil (Umar & Gubareva, 2020). A decline in forex market efficiency during the COVID-19 outbreak was observed by Aslam et al. (2020). These findings inspired us to explore the prediction capacity of deep learning-based models during COVID-19 and non-COVID-19 periods, and that is the main contribution of this study.

3 Experimental data

The dataset came from Kaggle and Oanda (see Footnote 1). We used 21 currency exchange rates against USD. The data during COVID-19 came from Oanda and were divided into several clusters. The dataset description is provided in Table 2. Data clusters were divided based on the number of confirmed cases in the United States and the timeline of events related to the COVID-19 pandemic. The experimental datasets were divided into training and testing sets. The training-to-testing ratio was 80:20. The pre-COVID-19 period (2000–2019) comprised 5216 daily observations for all the currencies used, and the COVID-19 period comprised 571 observations. Tenfold cross-validation was applied to the training set of the pre-COVID-19 dataset to minimise the training error and enhance the generalizability of the forecasting outcome (Abedin et al., 2019).
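The 80:20 split and the tenfold cross-validation described above can be illustrated with a minimal sketch. It assumes that the split is chronological and that the folds are contiguous blocks, neither of which is stated in the text. The function names and the synthetic series are hypothetical placeholders, not the study's data or code.

```python
def chronological_split(series, train_ratio=0.8):
    """Split a time-ordered sequence into train/test parts without shuffling."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Synthetic placeholder with the same length as the pre-COVID-19 sample.
rates = [1.0 + 0.001 * i for i in range(5216)]
train, test = chronological_split(rates)
folds = list(kfold_indices(len(train), k=10))
```

Keeping the test block at the end of the series avoids using future observations to predict earlier ones, which is one common reason to prefer a chronological split for exchange-rate data.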

Table 2 Description of datasets and data clusters of datasets

4 Methodology

Statistical models, machine learning, and deep learning models have been used in the literature to predict financial asset prices (Abedin et al., 2020; Akyildirim et al., 2021; Cui et al., 2020; Fischer & Krauss, 2018; Guotai et al., 2017; Hajek & Abedin, 2020; Jiang et al., 2020; Kyriakou et al., 2021; Shajalal et al., 2021; Xia et al., 2020). We apply machine learning and deep learning algorithms, measure several types of prediction error, and identify the best-performing model for each currency against USD.

4.1 Regression algorithms

We compared several algorithms to select the method with the lowest prediction error for the exchange rate movements of the selected currencies. Specifically, we applied the regression tree (RT), SVR, RF, BR, LSTM, Bi-LSTM, and Bi-LSTM BR to estimate the prediction accuracy (see Footnotes 2 and 3).

4.1.1 Regression tree

Using a decision tree is a common practical approach for supervised learning (Chen, 2011; Delen et al., 2013). It is used for both classification and regression estimations. The decision tree is a tree-structured model that consists of three types of nodes, namely, the root node, interior nodes, and leaf nodes. The root node is the initial node that represents the whole sample. The interior nodes represent the characteristics of a data set. Lastly, the leaf nodes provide the outcome. For a particular data point, the decision tree is run by answering true/false questions until the data point reaches a leaf node. The final prediction is calculated by finding the average value of the dependent variable in that leaf node. In this way, the tree can predict a proper value for the data point through several iterations (Fig. 1). The decision tree is advantageous because it is simple to understand and requires less data cleaning. Like ridge and lasso regression, decision tree regression may have overfitting problems. An ensemble of decision trees (e.g., the RF algorithm) can overcome these problems.

Fig. 1 A decision tree flow chart
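The leaf-averaging idea behind a regression tree can be shown with a minimal sketch of a depth-one tree (a single true/false question) on one feature. This is an illustration only: the function names and toy data are hypothetical, and the squared-error split criterion is a common choice assumed here rather than one confirmed by the text.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean (the cost of one leaf)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Find the threshold on one feature that minimises the total leaf cost."""
    best = None
    candidates = sorted(set(xs))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            # Each leaf predicts the average target value of its samples.
            best = (cost, threshold, mean(left), mean(right))
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value


def predict_stump(stump, x):
    """Answer the single true/false question and return the leaf average."""
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


xs = [1, 2, 3, 10, 11, 12]                 # toy feature values
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]        # toy targets
stump = fit_stump(xs, ys)
```

A full regression tree repeats this split search recursively inside each leaf, which is also where the overfitting risk mentioned above comes from.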

4.1.2 Support vector regression

Support vector regression (SVR) is a widely used, high-performing algorithm (Balabin & Lomakina, 2011; Gazzola & Jeong, 2021). Support vector methods are supervised machine learning algorithms used for classification and regression purposes (Weston et al., 1997). This study applies three different kernels, namely linear, polynomial, and radial basis function (RBF), while training the SVR model. SVR makes a decision boundary based on the support vector points and, accordingly, it forecasts sample points within this boundary. The applied kernels are trained for non-linear data (Hsu & Lin, 2002).
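For illustration, the three kernels named above can be written in their textbook forms, together with the epsilon-insensitive loss that standard SVR formulations use to define the tolerance band around the regression function. This is a sketch under assumptions: the hyperparameters `degree`, `gamma`, `coef0`, and `epsilon` are hypothetical defaults, since the text does not report the values used.

```python
import math


def linear_kernel(a, b):
    """Inner product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def polynomial_kernel(a, b, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <a, b> + coef0) ** degree; defaults are illustrative."""
    return (gamma * linear_kernel(a, b) + coef0) ** degree


def rbf_kernel(a, b, gamma=1.0):
    """exp(-gamma * ||a - b||^2); equals 1 when a == b."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors inside the epsilon band cost nothing; outside it, cost is linear."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

Only training points that fall on or outside the epsilon band become support vectors, which is why the fitted function depends on a subset of the data.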

4.1.3 Random forest

Random forest is an ensemble algorithm that builds a set of independent and non-identical decision trees following the idea of randomisation (Provost et al., 2016). This algorithm is used for both classification and regression purposes, and it is a combination of tree predictors. Each decision tree employs a random vector as a parameter, randomly chooses the attributes of samples, and then chooses a sample subset as its training dataset (Bradter et al., 2013). The generalisation error of a forest of trees depends on the strength of the individual trees and on the correlation between them. Deep decision trees might suffer from overfitting (Bramer, 2007). RF prevents overfitting by generating random subsets of attributes and constructing trees using these subsets (Breiman, 2001).

4.1.4 LSTM

LSTM is a recurrent neural network algorithm from the deep learning family (Alhagry et al., 2017; Hochreiter & Schmidhuber, 1997). LSTM aims to capture long-term dependencies and determine the optimal lag order in time-series analysis. We have applied LSTM neural networks to predict exchange rates (Sun et al., 2020). The hidden state \(s_{t}\) is determined as below:

$$ s_{t} = f\left( {Ux_{t} + Ws_{t - 1} } \right), $$
(1)

where \(f\) is an activation function, \(x_{t}\) are the inputs, \(U\) is the weight of the hidden layers, \(V\) is the weight of the output layers, and \(W\) is the transition weight of the hidden state.

LSTM is an effective way to overcome the problem of a vanishing gradient by using the memory cells. The input gate, the forget gate, the output gate and the self-recurrent neuron are central units in a memory cell. The values of the input gate \(i_{t}\) and the memory cell’s candidate state \(\tilde{C}_{t}\) are estimated as below:

$$ i_{t} = \sigma \left( {W_{i} x_{t} + U_{i} h_{t - 1} + b_{i} } \right), $$
(2)
$$ \tilde{C}_{t} = {\text{tanh}}\left( {W_{c} x_{t} + U_{c} h_{t - 1} + b_{c} } \right), $$
(3)

where \(x_{t}\) is the memory cell’s input; \(W_{i}\), \(W_{f}\), \(W_{c}\), \(W_{o}\), \(U_{i}\), \(U_{f}\), \(U_{c}\), \(U_{o}\), and \(V_{o}\) are the weight matrices; \(b_{i}\), \(b_{f}\), \(b_{c}\), and \(b_{o}\) are biases; and \(h_{t}\) is the memory cell’s value. The value of the forget gate \(f_{t}\) and the cell state vector \(C_{t}\) are estimated as follows:

$$ f_{t} = \sigma \left( {W_{f} x_{t} + U_{f} h_{t - 1} + b_{f} } \right), $$
(4)
$$ C_{t} = i_{t} * \tilde{C}_{t} + f_{t} * C_{t - 1} . $$
(5)

Finally, \(o_{t}\) and \(h_{t}\), the values of the output gate and the memory cell, respectively, are estimated as below:

$$ o_{t} = \sigma \left( {W_{o} x_{t} + U_{o} h_{t - 1} + V_{o} C_{t} + b_{o} } \right), $$
(6)
$$ h_{t} = o_{t} * {\text{tanh}}\left( {C_{t} } \right). $$
(7)

The LSTM network consists of the delays and the hidden layers’ sizes obtained from the time-series data by applying training data.
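To make the gate computations concrete, the following minimal sketch evaluates one LSTM time step for scalar inputs and states, following Eqs. (2) to (7) with the forget gate using its own weights \(W_{f}\) and \(U_{f}\). Scalars replace the weight matrices purely for readability, and the parameter dictionary `p` and the function names are hypothetical; this is not the implementation used in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps parameter names to illustrative values."""
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])        # input gate, Eq. (2)
    c_tilde = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])  # candidate state, Eq. (3)
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])        # forget gate, Eq. (4)
    c_t = i_t * c_tilde + f_t * c_prev                                  # cell state, Eq. (5)
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev
                  + p["V_o"] * c_t + p["b_o"])                          # output gate, Eq. (6)
    h_t = o_t * math.tanh(c_t)                                          # memory cell value, Eq. (7)
    return h_t, c_t


def run_lstm(xs, p):
    """Feed a sequence through the cell, starting from zero states."""
    h, c = 0.0, 0.0
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        states.append(h)
    return states
```

The additive update in Eq. (5) is what lets information persist across many steps: when the forget gate stays close to one, the previous cell state passes through almost unchanged.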

4.1.5 Bi-LSTM

A Bi-LSTM deep learning-based recurrent neural network (Fig. 2) can analyze time-series data more effectively than traditional statistical time-series models, such as the autoregressive integrated moving average (ARIMA), seasonal ARIMA, and ARIMAX models, because it processes input patterns bidirectionally (Sezer et al., 2020; Sunny et al., 2020). While LSTM works only with previous data patterns, the Bi-LSTM model considers both previous and future data during training, and this makes the Bi-LSTM model more effective than LSTM. This behavior of Bi-LSTM helps learn the present status of data both from past data and future data through its forward layer and backward layer. It can capture not only local features but also extract global features in the time-series data. In the Bi-LSTM layer, there are no hidden-to-hidden connections between forward and backward layers. This helps one to understand information from both the backward layer and forward layer in each Bi-LSTM unit.

Fig. 2 Bi-directional long short-term memory (Bi-LSTM) model
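The bidirectional mechanism can be sketched independently of the cell type. In the hedged example below, a simple tanh recurrence stands in for the LSTM cell to keep the code short; the weights and function names are hypothetical. One pass reads the window from past to future, a second pass reads it from future to past, and the two states are paired per time step without any connection between the two directions.

```python
import math


def simple_cell(x_t, h_prev, w_x, w_h):
    """Stand-in recurrent cell (plain tanh RNN), used here instead of an LSTM."""
    return math.tanh(w_x * x_t + w_h * h_prev)


def run_direction(xs, w_x, w_h):
    """Run the cell over xs in the given order, starting from a zero state."""
    h, states = 0.0, []
    for x in xs:
        h = simple_cell(x, h, w_x, w_h)
        states.append(h)
    return states


def bidirectional(xs, fwd=(0.5, 0.3), bwd=(0.4, 0.2)):
    """Pair forward and backward states for every time step."""
    forward = run_direction(xs, *fwd)
    # The backward layer reads the reversed window; re-reverse to align by time.
    backward = run_direction(xs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))
```

In this sketch the forward state at a given step depends only on earlier inputs, whereas the backward state at the same step depends only on later inputs within the window.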

4.1.6 Bagging ridge

A bagging regressor is an ensemble procedure that can take any regression task and predict the target values more accurately by combining multiple simple regression models while reducing their overall variance. In this paper, we combined ridge regression with a bagging regressor to predict exchange rates in order to improve prediction performance. The primary purpose of bagging ridge regression is to increase the stability of the final model and reduce the error in testing data. The important aspect of BR for this study is that it performs well in cases where the size of the data is limited. Specifically, to understand the behaviour of prediction models before and during the COVID-19 pandemic, we divided the dataset into many subsets, and this ensemble regressor showed higher accuracy than the base ridge regression model.
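A minimal sketch of bagging applied to ridge regression is given below for a single feature, where the ridge solution has a simple closed form. The number of estimators, the penalty `alpha`, the seed, and all function names are assumptions chosen for illustration; the study's actual settings are not reported in this section.

```python
import random


def fit_ridge_1d(xs, ys, alpha=1.0):
    """Closed-form ridge fit with an unpenalised intercept for one feature."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha > 0 shrinks the slope towards zero
    return slope, y_bar - slope * x_bar


def fit_bagging_ridge(xs, ys, n_estimators=10, alpha=1.0, seed=0):
    """Fit one ridge model per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_ridge_1d([xs[i] for i in idx], [ys[i] for i in idx], alpha))
    return models


def predict_bagging(models, x):
    """Average the predictions of all base ridge models."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)
```

Averaging over bootstrap resamples is what reduces the variance of the final prediction relative to a single ridge fit, which matters most when each cluster contains few observations.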

4.1.7 Bi-LSTM bagging ridge

Bi-LSTM BR is an ensemble algorithm that combines Bi-LSTM and BR. There are many approaches to constructing an ensemble algorithm. We applied the averaging approach of ensemble formation (Ribeiro et al., 2020). We generated an additional training dataset by applying the repetitions procedure (Hennig et al., 2021). Our study combined the results of BR and Bi-LSTM to form a novel Bi-LSTM BR regression model, as depicted in Fig. 3.

Fig. 3 Bi-directional long short-term memory bagging ridge (Bi-LSTM BR)
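The averaging approach to ensemble formation can be sketched as follows, assuming equal weights for the two base models; the weighting scheme and the function name are assumptions, since the text only states that the results of BR and Bi-LSTM were combined by averaging.

```python
def average_ensemble(pred_a, pred_b):
    """Combine two aligned prediction sequences by an equal-weight average."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction sequences must have equal length")
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]
```

For example, if one base model predicts 1.0 and 2.0 and the other predicts 3.0 and 4.0 for the same two days, `average_ensemble` returns 2.0 and 3.0.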

4.2 Performance measures

We measured and compared the performance of the different algorithms mentioned above by computing the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). RMSE is a standard metric for computing a numerical prediction error; because it squares each forecast error, it puts more weight on larger errors. This is a desirable feature in evaluating prediction performance in foreign exchange markets (Islam & Hossain, 2020). RMSE is calculated as:

$$ RMSE = \sqrt {\frac{{\sum\nolimits_{t = 1}^{N} \left( {R_{t} - \hat{R}_{t} } \right)^{2} }}{N}} , $$
(8)

where \(R_{t}\) is the actual value, \(\hat{R}_{t}\) is the forecasted value, and \(N\) is the number of forecasts. MAE and MAPE are other standard evaluation metrics used in currency exchange rate prediction (Dash, 2018):

$$ MAE = \frac{1}{N}\sum_{t=1}^{N}\left| R_{t} - \hat{R}_{t} \right|, $$
(9)
$$ MAPE = \frac{1}{N}\sum_{t=1}^{N}\left| \frac{R_{t} - \hat{R}_{t}}{R_{t}} \right| \times 100. $$
(10)
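The three metrics in Eqs. (8) to (10) translate directly into code. The sketch below uses plain Python lists; the function names are illustrative, and MAPE assumes that no actual value is zero, which holds for exchange rates.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error, Eq. (8)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error, Eq. (9)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, Eq. (10); actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)
```

With actual values 1, 2, 4 and forecasts 1, 1, 2, the errors are 0, 1, and 2, so MAE is 1 while RMSE is the square root of 5/3 (about 1.29), showing how RMSE weights the larger error more heavily.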

5 Experimental analysis

5.1 Description of clusters

We have divided the data into two clusters: before the COVID-19 pandemic and during the COVID-19 pandemic. The cluster during COVID-19 was divided into seven sub-clusters based on confirmed COVID-19 cases and events. The dataset during COVID-19 was divided into four sub-clusters based on confirmed cases in the United States and three sub-clusters based on the timing of efforts by the U.S. government to combat the COVID-19 pandemic. These three sub-clusters have been used to check the robustness of the accuracy of the deep learning ensemble approaches in our study.

The pre-COVID-19 period is from January 3, 2000, to December 31, 2019. On December 31, 2019, the first confirmed case of COVID-19 was reported in Wuhan, China, by the WHO. The first sub-cluster (C1) of the COVID-19 period is between January 31 and March 11, 2020, when the WHO declared that a global pandemic was occurring. The 7-day moving average of U.S. confirmed cases was approximately 40,000 for most of the second sub-cluster (C2), between March 12 and July 25, 2020. The third sub-cluster (C3), between July 26 and September 8, 2020, witnessed a negative trend of cases after sub-cluster 2. Finally, the fourth sub-cluster (C4), between September 9 and December 14, 2020, experienced a large increase in daily confirmed cases, with 246,716 new cases recorded on December 11, 2020.

The first sub-cluster (E1) based on the timeline of events related to the COVID-19 pandemic was between December 31, 2019, and March 11, 2020, when the WHO declared that a COVID-19 global pandemic was occurring. During this period, the U.S. Congress was debating the passage of the Coronavirus Aid, Relief, and Economic Security (CARES) Act. The second sub-cluster (E2), between March 12 and June 8, 2020, was the period in which the U.S. Congress passed an economic relief package exceeding USD 2 trillion to boost the economy. The third sub-cluster (E3), between June 9 and December 14, 2020, is when the U.S. government adopted several monetary and fiscal policies. For example, the Federal Reserve launched the Primary Market Corporate Credit Facility on June 29, 2020, and on July 28, 2020, the Federal Reserve extended the operation of emergency lending programs. During the third sub-cluster, on December 11, 2020, the Food and Drug Administration issued the first emergency use authorization for the COVID-19 vaccine offered by Pfizer-BioNTech.
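The cluster boundaries listed above can be encoded as a small lookup, which is useful when slicing a daily series into sub-clusters. The dates are taken from the text; the constant and function names are hypothetical.

```python
from datetime import date

# Sub-clusters based on waves of confirmed U.S. cases.
CASE_CLUSTERS = [
    ("C1", date(2020, 1, 31), date(2020, 3, 11)),
    ("C2", date(2020, 3, 12), date(2020, 7, 25)),
    ("C3", date(2020, 7, 26), date(2020, 9, 8)),
    ("C4", date(2020, 9, 9), date(2020, 12, 14)),
]

# Sub-clusters based on the timeline of policy events.
EVENT_CLUSTERS = [
    ("E1", date(2019, 12, 31), date(2020, 3, 11)),
    ("E2", date(2020, 3, 12), date(2020, 6, 8)),
    ("E3", date(2020, 6, 9), date(2020, 12, 14)),
]


def assign_cluster(day, clusters):
    """Return the label of the cluster whose inclusive date range contains day."""
    for label, start, end in clusters:
        if start <= day <= end:
            return label
    return None  # the day falls outside every cluster in this scheme
```

Note that the two schemes start on different dates: a day such as January 15, 2020 belongs to E1 but to none of the case-based clusters.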

We applied all the algorithms to train the clusters separately, calculate the errors, test the significance level, and choose the best algorithm for the large datasets, as well as for the cluster datasets. Figure 4 provides the flowchart of the experiment.

Fig. 4 Methodologic flowchart

5.2 Experiment for data in the pre-COVID-19 period

The cluster for the non-COVID-19 period extends from January 3, 2000 to January 30, 2020. We trained the models on this dataset and plotted the results. Figure 5 shows the predicted exchange rates versus the actual exchange rates of 21 currencies against USD. Because a deep learning algorithm can perform feature selection by itself and scale the data as required (Mathew et al., 2020), we present actual versus predicted exchange rates with automatically scaled values generated by our proposed BR ensemble deep learning approach over time (see Fig. 5 and Footnote 4). Table 3 presents the RMSE of all algorithms and shows that the best-suited algorithm varied from one currency to another during the non-COVID-19 period. For example, BR appeared to be the best model for predicting the AUD/USD exchange rate, but Bi-LSTM was the best model for predicting the EUR/USD rate. Figure 6 shows the performance of algorithms based on RMSE. We used the Diebold-Mariano (DM) test (see Diebold & Mariano, 2002) to examine the accuracy of our proposed Bi-LSTM BR deep learning algorithm against the benchmark algorithms in our study for major currencies (e.g., AUD, CNY, GBP, and JPY; see Footnote 5). The DM test results demonstrated that our proposed Bi-LSTM BR ensemble deep learning approach was effective in terms of its RMSE loss function against the benchmark algorithms in our study (Table 4).
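The DM comparison can be illustrated with a simplified sketch of the test statistic under squared-error loss. It assumes one-step-ahead forecasts, omits the autocovariance correction used for longer horizons, and relies on the asymptotic normal approximation for the p-value; the function name is hypothetical, and the exact variant used in the study is not specified in this section.

```python
import math


def diebold_mariano(actual, forecast_a, forecast_b):
    """Simplified DM statistic and two-sided p-value under squared-error loss."""
    # Loss differential: negative values favour forecast_a.
    d = [(y - fa) ** 2 - (y - fb) ** 2
         for y, fa, fb in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    d_bar = sum(d) / n
    variance = sum((x - d_bar) ** 2 for x in d) / (n - 1)  # must be non-zero
    stat = d_bar / math.sqrt(variance / n)
    # Two-sided p-value from the standard normal distribution.
    normal_cdf = 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0)))
    p_value = 2.0 * (1.0 - normal_cdf)
    return stat, p_value
```

A significantly negative statistic indicates that the first forecast has a lower expected squared-error loss than the second; swapping the two forecasts flips the sign of the statistic and leaves the p-value unchanged.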

Fig. 5 Actual versus predicted exchange rates (scaled) for before COVID-19 data (January 3, 2000 to December 31, 2019)

Table 3 Evaluation of algorithms using RMSE for pre-COVID-19 dataset (January 3, 2000 to December 31, 2019)
Fig. 6 Performance of algorithms based on RMSE for before COVID-19 data (January 3, 2000 to December 31, 2019)

Table 4 Robustness test of Bi-LSTM BR ensemble deep learning approach for pre-COVID-19 dataset (January 3, 2000 to December 31, 2019)

5.3 Experiment for data from the period of the COVID-19 pandemic

5.3.1 Cluster 1 of the COVID-19 dataset

The first cluster of the COVID-19 period (C1) contains data from January 31 to March 11, 2020. The number of days in this sample is 41. Table 5 shows that the best-suited algorithm varied from one currency to another in the C1 data of the COVID-19 period. For example, Bi-LSTM BR appears to be the best model to predict the AUD/USD exchange rate, and Bi-LSTM is the best model for CNY/USD. Figure 7 shows that the predicted exchange rates are closely aligned with actual exchange rates, providing evidence that our proposed BR ensemble deep learning approach performed well in predicting the exchange rate during the COVID-19 period (see Footnote 6). Figure 8 shows the performance results of algorithms based on RMSE. Table 6 shows the results of the DM test in examining the accuracy of our proposed Bi-LSTM BR deep learning algorithm against the benchmark algorithms. The DM test results demonstrate that our proposed approach had superior forecasting effectiveness in terms of RMSE against the compared methods.

Table 5 Evaluation of algorithms using RMSE for Cluster C1 (January 31 to March 11, 2020)
Fig. 7 Actual versus predicted exchange rates (scaled) for Cluster C1 (January 31 to March 11, 2020)

Fig. 8 Performance of algorithms based on RMSE for Cluster C1 (January 31 to March 11, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

Table 6 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster C1 (January 31 to March 11, 2020)

5.3.2 Cluster 2 of the COVID-19 dataset

The second cluster of data during the COVID-19 pandemic (C2) extends from March 12 to July 25, 2020 (41 days in the sample). Table 7 shows that Bi-LSTM BR was a superior model for predicting the GBP/USD exchange rate, whereas Bi-LSTM was the best model for EUR/USD. Figure 9 provides evidence that our proposed Bi-LSTM BR approach was effective in predicting testing data during this COVID-19 period. Figure 10 confirms this good performance with regard to RMSE. Table 8 provides the results of the DM test in comparing the performance of our proposed Bi-LSTM BR deep learning model in terms of RMSE. These results show that significant improvements were achieved using the proposed model.

Table 7 Evaluation of algorithms using RMSE for Cluster C2 (March 12 to July 25, 2020)
Fig. 9 Actual versus predicted exchange rates (scaled) for Cluster C2 (March 12 to July 25, 2020)

Fig. 10 Performance of algorithms based on RMSE for Cluster C2 (March 12 to July 25, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

Table 8 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster C2 (March 12 to July 25, 2020)

5.3.3 Cluster 3 of the COVID-19 dataset

The third cluster of COVID-19 data (C3) is for the period from July 26 to September 8, 2020, or 45 days. Table 9 shows that the best-performing algorithms varied from one currency to another in the C3 data for the COVID-19 period. For instance, Bi-LSTM BR outperformed other models in predicting GBP/USD, whereas Bi-LSTM was the best model for EUR/USD. Figures 11 and 12 provide additional evidence for the good predictive capacity of our proposed Bi-LSTM BR approach during this COVID-19 period. The results of the DM test in Table 10 confirm the superiority of the proposed prediction model.

Table 9 Evaluation of algorithms using RMSE for Cluster C3 (July 26 to September 8, 2020)
Fig. 11 Actual versus predicted exchange rates (scaled) for Cluster C3 (July 26 to September 8, 2020)

Fig. 12 Performance of algorithms based on RMSE for Cluster C3 (July 26 to September 8, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

Table 10 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster C3 (July 26 to September 8, 2020)

5.3.4 Cluster 4 of the COVID-19 dataset

The fourth cluster of COVID-19 data (C4) is for the period from September 9 to December 14, 2020, or 45 days. Table 11 shows that Bi-LSTM BR performed best in predicting the EUR/USD exchange rate, whereas Bi-LSTM performed best for AUD/USD. Figures 13 and 14 provide evidence that our proposed BR ensemble deep learning approach performed well in predicting the exchange rate during this COVID-19 period. Table 12 reports the results of the DM test, which examined the accuracy of our proposed Bi-LSTM BR deep learning algorithm against the benchmark algorithms. The DM test results demonstrate that our approach was effective in terms of an RMSE loss function against the benchmark algorithms in our study.

Table 11 Evaluation of algorithms using RMSE for Cluster C4 (September 9 to December 14, 2020)
Fig. 13 Actual versus predicted exchange rates (scaled) for Cluster C4 (September 9 to December 14, 2020)

Fig. 14 Performance of algorithms based on RMSE for Cluster C4 (September 9 to December 14, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

Table 12 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster C4 (September 9 to December 14, 2020)

5.4 Comparison of prediction performance before and during the COVID-19 pandemic

To compare the performance of the proposed Bi-LSTM BR deep learning model for the two periods, before and during the COVID-19 pandemic, we evaluated the results in terms of the RMSE, MAE and MAPE. For the COVID-19 period, we reported the average performance of clusters C1 to C4 and E1 to E3 for the period from January 31 to December 31, 2020. In other words, we investigated the differences in the prediction capacity of the deep learning-based model between the two periods. Table 13 shows that there were substantial differences not only between the two periods but also among the currencies used. Generally, we can find two patterns in the results: one for the currencies whose predictive capacity deteriorated significantly and one for those on which no significant effect was observed. The currencies that were least affected include AUD, GBP, CNY, HKD, NOK, and THB. In contrast, the largest decline in model performance occurred for EUR and other European currencies. To further investigate the differences across the studied foreign exchange markets, we compared the overall performance for the two periods in terms of the RMSE, MAE and MAPE, as shown in Fig. 15. On average, the prediction performance of Bi-LSTM BR deteriorated during the COVID-19 period, as confirmed by the significant differences obtained using the DM test at a p-value less than 0.05 for all the performance metrics used.

Table 13 Comparison of Bi-LSTM BR performances for pre-COVID-19 and COVID-19 periods
Table 14 Evaluation of algorithms using RMSE for Cluster E1 (December 31, 2019 to March 11, 2020)
Table 15 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster E1 (December 31, 2019 to March 11, 2020)
Table 16 Evaluation of algorithms using RMSE for Cluster E2 (March 12 to June 8, 2020)
Table 17 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster E2 (March 12 to June 8, 2020)
Table 18 Evaluation of algorithms using RMSE for Cluster E3 (June 9 to December 14, 2020)
Table 19 Robustness test of Bi-LSTM BR ensemble deep learning approach for Cluster E3 (June 9 to December 14, 2020)
Fig. 15 Prediction performance of Bi-LSTM BR in pre-COVID-19 and COVID-19 periods

Fig. 16 Actual versus predicted exchange rates (scaled) for Cluster E1 (December 31, 2019 to March 11, 2020)

Fig. 17 Actual versus predicted exchange rates (scaled) for Cluster E2 (March 12 to June 8, 2020)

Fig. 18 Actual versus predicted exchange rates (scaled) for Cluster E3 (June 9 to December 14, 2020)

Fig. 19 Performance of algorithms based on RMSE for Cluster E1 (December 31, 2019 to March 11, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

Fig. 20 Performance of algorithms based on RMSE for Cluster E2 (March 12 to June 8, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

Fig. 21 Performance of algorithms based on RMSE for Cluster E3 (June 9 to December 14, 2020). Note: Korean Won (KRW) has no error since the actual exchange rate equals the predicted exchange rate

6 Robustness check

To check the robustness of the results, we applied the algorithms to three clusters defined by the timeline of events related to U.S. government policies to combat the COVID-19 pandemic. Results from the event-based clusters are qualitatively similar to those from the clusters based on confirmed cases. Results in Tables 14, 16 and 18 show that Bi-LSTM, BR, and Bi-LSTM BR consistently outperformed the other methods across the clusters and for most of the currencies. Consistent with the results for the clusters based on confirmed cases, Figs. 16, 17 and 18 provide evidence that our proposed Bi-LSTM BR ensemble deep learning approach performed well in predicting the exchange rate during the highly volatile COVID-19 period. Figures 19, 20 and 21 confirm this by depicting the RMSE achieved for clusters E1 to E3. Tables 15, 17 and 19 provide the results of the DM test, which examines the accuracy of our proposed Bi-LSTM BR deep learning algorithm against the benchmark algorithms. The results demonstrate that our proposed approach had superior forecasting effectiveness in terms of the RMSE loss function.
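The logic of the DM test referred to above can be sketched as follows. This is a simplified illustration rather than the exact procedure used in the study: it assumes one-step-ahead forecasts (so no autocovariance correction is applied), a squared-error loss, and a two-sided p-value from the standard normal distribution. The function name `dm_test` and the sample series are hypothetical.

```python
import math

import numpy as np


def dm_test(actual, pred_a, pred_b):
    """Simplified Diebold-Mariano test under squared-error loss.

    Assumes one-step-ahead forecasts, so the long-run variance of the loss
    differential is estimated by its sample variance alone.
    Returns (statistic, two-sided p-value); a negative statistic means
    that model A has the lower average squared error.
    """
    y = np.asarray(actual, dtype=float)
    e_a = y - np.asarray(pred_a, dtype=float)
    e_b = y - np.asarray(pred_b, dtype=float)
    d = e_a ** 2 - e_b ** 2            # loss differential series
    n = d.size
    var_d = float(np.var(d))           # population variance (ddof=0)
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance; statistic undefined")
    stat = float(np.mean(d) / math.sqrt(var_d / n))
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical series, for illustration only.
actual = [1.0, 1.0, 1.0, 1.0]
pred_a = [0.9, 1.2, 0.9, 1.0]   # smaller errors
pred_b = [0.5, 1.4, 0.4, 0.7]   # larger errors
stat, p_value = dm_test(actual, pred_a, pred_b)
print(stat, p_value)
```

For multi-step forecasts the variance term would need to include autocovariances of the loss differential, and small samples are often handled with a corrected statistic; neither refinement is shown here.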

7 Discussion

The main motivation for our Bi-LSTM BR hybrid prediction model was to take full advantage of state-of-the-art deep learning models by combining them in an ensemble learning manner. To demonstrate the efficiency of the proposed Bi-LSTM BR model, Table 20 compares our results with those of existing studies. It is worth noting that only level estimation models were selected to achieve a fair comparison. Compared with existing approaches, the proposed model achieved a superior average performance in terms of the RMSE, MAE, and MAPE. It must be acknowledged that the best performance so far has been reported by Islam and Hossain (2020). However, unlike in other studies, the authors did not use a daily prediction horizon but, rather, a 30-min prediction horizon, and this substantially reduced the average error. Overall, our model was superior to the compared neural network-based prediction models. In addition, a wide range of 21 currencies, including emerging foreign currency markets, provided strong experimental support for our results. Therefore, one implication of this study is that the advantages of combining state-of-the-art deep learning-based Bi-LSTM models with variance-reducing BR are reflected in a better prediction performance compared with more traditional deep learning-based models.

Table 20 Comparison of Bi-LSTM BR with existing prediction models

The demonstrated robustness of the proposed model supports confidence in its predictions during the COVID-19 period. We have found additional empirical support for the previous findings of Umar and Gubareva (2020) that foreign exchange markets have been highly volatile during the pandemic. This can also be attributed to the decline in foreign exchange market efficiency during the COVID-19 period (Aslam et al., 2020). Our results suggest that as the number of pandemic cases in a country increased, the predictability of that country’s exchange rate deteriorated further. This information can be used to develop targeted interventions aimed at stabilizing foreign exchange markets.

In this study, we examined the predictability of 21 major currency pairs, including currencies of the world’s largest economies. Therefore, the results of our study can help investors and other stakeholders to evaluate their risks and their effects on business decisions. In fact, exchange rate volatility is essential for the valuations of assets and liabilities and the pricing of derivative instruments. The predictions provided by the proposed model can be incorporated into existing volatility models to improve their prediction accuracy. Predicting foreign exchange market volatility during the pandemic period is also critical for policymakers in reducing systematic risks when planning for and implementing fiscal and monetary policies.

The findings of this study have a number of important implications for stakeholders. In accordance with work by Umar and Gubareva (2020), cross-currency hedges are suggested to address the higher currency risk posed by the pandemic. Indeed, during the COVID-19 period, it became crucial for stakeholders, including banks and private companies, to anticipate the effects of the pandemic on their business and on the financial risks associated with increased foreign currency market volatility. These stakeholders are now increasingly checking the reliability of the foreign market data used as inputs for the valuation of their assets, liabilities, and contracts. The proposed prediction model may not only support this checking process but also help identify the most seriously affected exposures. This in turn can prompt a reconsideration of current hedging strategies. Therefore, our research suggests that stakeholders should take more advantage of derivative markets by hedging with options and cross-currency basis swaps.

Another challenging issue is the post-pandemic scenario in the foreign exchange market. Our results indicate that, for countries less affected by COVID-19, the predictability of the foreign exchange market is close to its pre-pandemic level. Hence, post-pandemic foreign exchange markets are expected to be more predictable, allowing investors to find more opportunities in foreign exchange markets.

8 Conclusion

In this study, we proposed an ensemble deep learning approach combining Bi-LSTM and BR. As benchmarks, we also applied other machine learning algorithms, such as RT, SVR, and RF, and deep learning algorithms, such as LSTM and Bi-LSTM. Results from the DM test show that our proposed Bi-LSTM BR ensemble deep learning approach has superior forecasting effectiveness in terms of an RMSE loss function against these benchmark algorithms. Our findings also suggest that, in general, the prediction performance of the model worsened during the COVID-19 period. The performance deteriorated especially for exchange rates of the most adversely affected countries, and this can be attributed to a higher currency volatility induced by the pandemic. However, it should be noted that although this increased volatility translates into more challenging foreign exchange market predictions, our model still performed well compared with existing prediction models in terms of the RMSE, MAE, and MAPE. The highly competitive prediction capacity of the proposed model in both periods, pre-COVID-19 and COVID-19, is beneficial for policymakers, entrepreneurs, foreign exchange brokers, and dealers when addressing currency risks, particularly during the highly volatile COVID-19 period. This study’s findings can support market participants as they try to find the best investment strategy that can generate profits during the crisis period while also managing the increased currency risk.

This study has revealed several questions in need of further investigation. Our proposed Bi-LSTM BR ensemble deep learning approach was shown to be highly effective for foreign exchange markets. It would be interesting to investigate its effectiveness in other financial and commodity markets, such as in forecasting stock prices, crude oil prices, and prices of gold and other precious metals. Our primary focus was on exchange rate forecasting, and we have not included other factors such as interest rate differentials and inflation differentials. Additional determinants should therefore be incorporated into the proposed prediction model. The forecasting power may be enhanced if these factors are integrated into our proposed Bi-LSTM BR ensemble deep learning approach. This provides an opportunity for future research. The financial implications for investors of trading strategies based on the predictions of the proposed model should also be considered in further work.