Introduction

In recent decades, cryptocurrencies have witnessed spectacular development. They are growing rapidly and are used for many different applications in the economy due to their ability to facilitate electronic payments between individuals without the involvement of a (trusted) third party. Nakamoto (2008) was the first to document Bitcoin as the most well-known and prominent decentralized digital cryptocurrency based on blockchain technology. Bitcoin currently has the largest market capitalization among cryptocurrencies and has been widely used as a means of electronic payment in recent years due to the anonymity, safety, transparency, and cost effectiveness that it offers (Yermack 2013; Kim 2017; Yuneline 2019). The increasing popularity of Bitcoin has led researchers to develop other digital cryptocurrencies, such as Ethereum, XRP, and CRO. Presently, the digital crypto markets have grown rapidly in a short period of time; as of December 15, 2020, the global crypto market cap was $563.68 bn with 4028 cryptocurrencies (www.coinmarketcap.com). With the growing appeal of digital cryptocurrencies, finance analysts, economists, traders, and investors are focusing on predicting their future potential investment value. Therefore, numerous studies have recently been conducted to verify the influence of financial variables (e.g., gold prices, oil prices, currency exchange rates, commodities, stock market prices) on the fundamental and speculative value of cryptocurrencies, specifically with relation to their volatility as well as in terms of bubbles (e.g., Kondor et al. 2014; Kristoufek 2013; Ciaian et al. 2016; Bariviera et al. 2017; Zhu et al. 2017; Panagiotidis et al. 2018; Lahmiri et al. 2018; Nasir et al. 2019; Dennery 2020; Faghih Mohammadi Jalali and Heidari 2020; Hakim das Neves 2020; Huynh et al. 2020b, 2020c, 2020d; Makarov and Schoar 2020). 
In early 2017, the following group set out to objectively measure the overall growth and movement in the blockchain sector: Igor Rivin and Carlo Scevola (the team leaders of a group of mathematicians, fund managers, and quants); CS&P presidents and economists; as well as engineer Robert Davis. They designed the Crypto Currency Index 30 (CCI30) to track the top 30 cryptocurrencies by market capitalization, excluding stablecoins (www.cci30.com), and suggested that the CCI30 could serve as an investment tool for passive investors and investment managers. Limited literature on crypto markets has explored the use of the CCI30 index (Senarathne and Jianguo 2020; Pontoh and Rizkianto 2020; Petukhina, et al. 2020). To fill this gap, this research explores the stochastic behavior of the CCI30 index, especially for the purpose of helping crypto market policymakers and investors interested in portfolio diversification.

Over the last 9 months, the most prominent global health threat has been COVID-19, which was first detected in Wuhan, China on December 31, 2019, and was subsequently declared a global pandemic by the World Health Organization (WHO) on March 11, 2020 (WHO 2020). The pandemic spread rapidly throughout the world—despite quarantines, lockdowns, and social distancing—thereby upending the lives of millions of people. After COVID-19 was declared a global pandemic, the world economy was drastically affected. Worldwide, sales and production fell, companies became burdened financially, unemployment rose, and consumer behaviors changed (Lahmiri and Bekiros 2020a). The pandemic severely impacted financial markets, which, in turn, compelled many researchers to explore its effect on financial contagion and market stability (e.g., Akhtaruzzaman et al. 2020; Ali et al. 2020; Al-Awadhi et al. 2020; He et al. 2020; Okorie and Lin 2020; Sharif et al. 2020; Zaremba et al. 2020; Zhang et al. 2020; Shear et al. 2021). The rapid spread of COVID-19 in 2020 posed a serious threat to the crypto market as well. After the declaration of the pandemic, the largest 1-day fall in the price of Bitcoin (36%) occurred on March 13, 2020 (Yousaf and Ali 2021). Researchers started focusing on understanding the dynamics of the cryptocurrency market, especially the connections that have existed among the various cryptocurrencies during the COVID-19 crisis (e.g., Conlon et al. 2020; Conlon and McGee 2020; Corbet et al. 2020; Lahmiri and Bekiros 2020b; Mnif et al. 2020; Umar and Gubareva 2020; Yousaf and Ali 2020; James et al. 2021; Iqbal et al. 2021).

Several studies (e.g., Liu and Tsyvinski 2018; Rognone et al. 2020) have found that cryptocurrencies behave differently from traditional assets, such as currencies, commodities, and equities. Furthermore, returns in crypto markets were found to be influenced by investor enthusiasm, which is, in turn, affected by unusual events in the news. Because the COVID-19 pandemic has been an unprecedented event, researchers have analyzed how it affected the cryptocurrency markets, especially in light of the conflicting behavioral evidence. For example, Lahmiri and Bekiros (2020a) explored the evolution of informational efficiency in 45 cryptocurrency markets, including the CCI30 index, and 16 international stock markets from September 2019 to April 2020. They reported that cryptos showed more instability and more irregularity during the COVID-19 pandemic than international stock markets. Mariana et al. (2021) evaluated the impact of COVID-19 on Bitcoin, Ethereum, gold, and the S&P500 from July 1, 2019, to April 6, 2020. All of the returns during COVID-19 were found to be more volatile than in the pre-COVID-19 period. They also reported that the volatility of cryptos is higher than that of both gold and the S&P500. Since valid, well-tested treatments and preventative strategies for COVID-19 are still lacking, these effects are expected to continue; thus, this research explores the impact of COVID-19 on the crypto market with the aim of providing investors and policymakers with a better understanding of the market dynamics of cryptocurrencies when allocating cryptos to their portfolios.

A widely used asset pricing model in portfolio applications that provides a guideline for crypto market investors is the capital asset pricing model (CAPM), which was independently developed by Sharpe (1964), Lintner (1965) and Mossin (1966) based on the Markowitz (1952) market portfolio model. The benchmark linear specification of the market model (LMM) is the data generating process (DGP) of the CAPM. In addition, the time invariant beta risk parameter, which is the slope coefficient of the CAPM, captures the global linearity between the financial asset returns and entire market returns and is commonly estimated via ordinary least squares (OLS). The linearity limitation of the CAPM beyond the benchmark LMM, however, has been explored by researchers due to rapidly changing global economic conditions (e.g., Kou et al. 2014; Chao et al. 2019) such as real-life credit and bankruptcy risks, trade-based money transaction methods, unemployment, inflation, and exchange rates. Jagannathan and Wang (1996) developed the time-varying linearity specification of the market model (Tv-LMM), which allows for a time-varying beta risk parameter known as the DGP of the conditional CAPM (C-CAPM). Here, Tv-LMM is designed in a state space model form via the Kalman filter algorithm (Kalman 1960) due to its performance (e.g., Mergner and Bulla 2008 in the pan-European industry; Zhang and Choudhry 2017 in European banks; Dębski et al. 2020 in the Polish, Czech, and Hungarian stock exchange; Neslihanoglu et al. 2020 in developed and emerging stock markets). While the Tv-LMM via the Kalman filter algorithm has been investigated extensively in several stock markets, firms, and industries, there is limited research on its use in crypto markets (e.g., Raimundo Júnior et al. 2020; Bianchi et al. 2020). Moreover, Neslihanoglu et al. (2017) proposed the generalized additive model (GAM) for flexibility in the rigid linearity shape of the LMM in developed and emerging stock markets, but no studies have used this approach to explore crypto markets. Given the lack of existing literature in this area, this research explores extensions of prior research on crypto markets.

The objective of the comparative analysis in this research is to shed light on the extensions of the LMM for modeling and forecasting cryptocurrency prices. Indeed, this is the first such comparison to be undertaken in the literature. To conduct the comparison, two extensions of the LMM are investigated: the GAM, which allows for flexibility in the rigid linear shape of the LMM, and the Tv-LMM, in the mean reverting form of the state space model via the Kalman filter (KFMR) algorithm. This comparison is performed using daily data on the price index of 10 cryptocurrencies from two time periods: pre-COVID-19 (from January 1, 2019, to March 10, 2020) and COVID-19 (from March 12, 2020, to November 1, 2020). The starting point of COVID-19 was chosen as March 11, 2020, the date on which the WHO declared COVID-19 a global pandemic (WHO 2020), following Mariana et al. (2021). The CCI30 served as a market proxy following Chowdhury et al. (2020), and the 1-month USD London Interbank Offered Rate (LIBOR) served as the risk-free rate proxy following Anyfantaki and Topaloglou (2018). For both time periods, a 30-day forecast period is examined using 1-day and 7-day ahead predictions. The models’ performance is compared using the mean absolute error (MAE), the mean square error (MSE), and a graphical summary.

This research makes several contributions to the literature on cryptocurrencies, some of them first attempts. First, it investigates the impact of the COVID-19 pandemic on the financial stability of daily cryptocurrency prices based on modeling and forecasting using different forward prediction horizons. Second, it evaluates the effectiveness of the LMM, in which the time invariant beta risk captures the global linearity between cryptocurrency returns and the CCI30's returns and which is commonly estimated via OLS. Third, it evaluates the nonlinearity extension of the market model via GAM underpinning the polynomial model on the cryptocurrency price. Fourth, it evaluates the local linearity extension of the market model in the state space model form via the Kalman filter algorithm on the cryptocurrency price while also accounting for the time-varying behavior of the beta risk parameter of the cryptocurrency price. Finally, it evaluates the impact of COVID-19 on the stochastic behavior of the time-varying beta risk of cryptos with the aim of providing investors with a quantifiable metric with which to build their crypto portfolios and to better understand the possible risks and rewards of each cryptocurrency.

The rest of this paper is organized as follows. The second section gives an overview of the data, while the third section details the methodologies of the proposed models. The fourth section presents the empirical outcomes from the comparison of the aforementioned models and shows the parameter estimates of the best model. Finally, the fifth section summarizes the research.

Data description

The daily data used in this research span two time periods: pre-COVID-19 (from January 1, 2019, to March 10, 2020) and COVID-19 (from March 12, 2020, to November 1, 2020). The data pertain specifically to the price index of 10 cryptocurrencies. The CCI30, which tracks the top 30 cryptocurrencies by adjusted market capitalization (www.cci30.com), serves as the market proxy in this research. The main criteria for selecting these cryptocurrencies were that they had the highest market capitalizations among the 30 cryptocurrencies on November 1, 2020, and that they were continuously listed on the CCI30 during the pre-COVID-19 and COVID-19 periods, which are separated by March 11, 2020 [the day the WHO declared COVID-19 a global pandemic (WHO 2020)]. As suggested by Alexander and Dakos (2020) and Huynh et al. (2020a), the validity of the data set was checked using different data sources for crypto prices. Table 1 provides an overview of the variables, their abbreviations, and their data sources.

Table 1 Variables, abbreviations, and sources

The daily data returns of the 10 cryptocurrencies and the CCI30 as the log difference of the daily closing price index in USD are determined as follows.

$${R}_{it}=\text{log}\left({P}_{it}\right)-\text{log}\left({P}_{i t-1}\right)$$
(1)

where i = 0,1,…,10 and t = 2,…,T. Here, i = 0 refers to the CCI30 (\({R}_{mt}\)), and \({R}_{it}\) with \(1\le i\le 10\) refers to each cryptocurrency. \({P}_{it}\) is the daily closing price index of asset i on day t. The 1-month USD LIBOR interest rate, in percent per annum, serves as the risk-free rate (\({R}_{ft}\)) proxy at time t.
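As a minimal illustration of Eq. (1), the sketch below computes log returns from a short list of made-up closing prices. The function names, the sample prices, and in particular the conversion of an annualized percentage rate into a daily rate (simple division by an assumed 360-day year) are assumptions for illustration; the text does not state which convention was used.

```python
import math

def log_returns(prices):
    """Log returns R_t = log(P_t) - log(P_{t-1}), as in Eq. (1)."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]

def excess_returns(returns, annual_rf_percent, days_per_year=360):
    """Subtract a daily risk-free rate from each return.

    Assumption: the per-annum percentage rate is turned into a daily rate
    by simple division; this convention is not taken from the paper.
    """
    return [r - rf / 100.0 / days_per_year
            for r, rf in zip(returns, annual_rf_percent)]

prices = [100.0, 110.0, 99.0]          # hypothetical closing prices
returns = log_returns(prices)          # one fewer element than prices
excess = excess_returns(returns, [1.8, 1.8])
```

Because log returns are additive, the sum of the daily returns equals the log of the ratio of the last price to the first.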

Table 2 displays descriptive statistics for the returns of the CCI30, the 10 cryptocurrencies, and the 1-month USD LIBOR interest rate during the pre-COVID-19 and COVID-19 periods, and thus provides the key empirical features of the data. The mean returns of the cryptocurrencies (0.00154 (average)) and the CCI30 (0.00076) during the pre-COVID-19 period were lower than those of the cryptocurrencies (0.00412 (average)) and the CCI30 (0.00373) during the COVID-19 period. This means that investors realized greater financial gains during the COVID-19 period than during the pre-COVID-19 period. Moreover, the standard deviations (unconditional volatility) of returns in the cryptocurrencies (0.05233 (average)) and in the CCI30 (0.03658) in the pre-COVID-19 period are greater than those of returns in the cryptocurrencies (0.04314 (average)) and in the CCI30 (0.03373) during the COVID-19 period. This suggests that cryptocurrencies and the CCI30 were less risky investments during the COVID-19 period than during the pre-COVID-19 period when risk is measured by unconditional volatility. These results suggest that the cryptocurrencies have been considerably affected by the COVID-19 global pandemic.

Table 2 Descriptive statistics of daily data during the pre-COVID-19 and COVID-19 periods

The distributions of cryptocurrency returns exhibit positive average skewness (1.1490; 0.3648) in the pre-COVID-19 and COVID-19 periods, respectively, while the returns of the CCI30 (−0.2901; 0.0872) and Rf (risk-free rate) (−0.7352; 1.7173) exhibit negative skewness in the pre-COVID-19 period but positive skewness in the COVID-19 period. This signifies that there were frequent small dips and a few large increases in returns in all variables during the COVID-19 period. In addition, the return distributions for all cryptocurrencies, the CCI30, and Rf are leptokurtic, which implies fatter tails than those of a normal distribution and a higher probability of extreme outcomes for all variables. This implies that there was a higher possibility of extreme financial gains or losses on cryptocurrency investments during the pre-COVID-19 and COVID-19 periods. The normality of all variables is also rejected at the 5% significance level via the Jarque–Bera test. The null hypothesis of no autocorrelation is rejected at the 5% significance level only for CRO during the pre-COVID-19 period and for CCI30, BPI, ETH, and BNB during the COVID-19 period. According to the ADF test, the time series of excess returns of the 10 cryptocurrencies and the CCI30 were stationary during the pre-COVID-19 and COVID-19 periods.

To sum up, the key characteristics of the data are positive mean returns, volatility, asymmetry (left and right skew), and leptokurtosis (fat tails) in both time periods. These features match those regularly reported in cryptocurrency studies, especially Catania et al. (2019) and Yousaf and Ali (2020). These results justify considering extensions of the linear relationship between each cryptocurrency and the CCI30 for both time periods.

For the sake of brevity, only the time series plots of the returns of the top four cryptocurrencies (BPI, ETH, XRP, and BCH) and the CCI30 during the pre-COVID-19 and COVID-19 periods are shown, in Figs. 1 and 2, respectively. These figures provide some key insights. Large fluctuations in the returns of the CCI30 and the cryptos are a common characteristic of both periods. After the declaration of COVID-19 as a global pandemic (on March 11, 2020), the largest 1-day fall in the price of Bitcoin was 36% on March 13, 2020 (Yousaf and Ali 2021); the following week showed an increase in the fluctuations of crypto returns, as can be observed in Fig. 2.

Fig. 1 Time series plots of the CCI30 and four cryptocurrencies’ returns during the pre-COVID-19 period

Fig. 2 Time series plots of the CCI30 and four cryptocurrencies’ returns during the COVID-19 period

Methodology

Linear market model (LMM)

The benchmark LMM was independently developed by Sharpe (1964), Lintner (1965), and Mossin (1966) and is known as the DGP of the CAPM. This model is defined as follows.

$${R}_{it}-{R}_{ft}={\alpha }_{i}+{\beta }_{im}\left({R}_{mt}-{R}_{ft}\right)+{\varepsilon }_{it} \;\;\;\;\;\; {\varepsilon }_{it} \sim N\left(0,{\sigma }_{i}^{2}\right)$$
(2)

Let \({R}_{it}\) be the returns in cryptocurrency i \(\left(i=1,\cdots ,10\right)\), \({R}_{mt}\) be the returns in the CCI30 index, and \({R}_{ft}\) be the risk-free rate at time t \(\left(t=1,\cdots ,T\right)\). Additionally, the residuals are \({\varepsilon }_{it}\) with \({\varepsilon }_{it} \sim N\left(0, {\sigma }_{i}^{2}\right)\), \(E\left({\varepsilon }_{it}{\varepsilon }_{kt}\right)=0\) for \(i\ne k\), and \(E\left({\varepsilon }_{it}{\varepsilon }_{i\;t+j}\right)=0\) for \(j>0\). Here, \({\alpha }_{i}\) is the regression intercept, and \({\beta }_{im}\) is the regression slope, which accounts for the time invariant beta risk. OLS, which is briefly outlined by Wood (2006), is used to estimate the coefficients of Eq. (2). Note that the OLS analyses were computed using the lm function in the R software (R Core Team 2018).
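For a single regressor, the OLS estimates of Eq. (2) have a closed form, which the following sketch spells out in plain Python. It is only an illustration of the estimator, not the lm-based analysis reported here; the function name `fit_lmm` and the sample excess returns are hypothetical.

```python
def fit_lmm(y, x):
    """Closed-form OLS for y_t = alpha + beta * x_t + eps_t.

    y: excess returns of one cryptocurrency (R_it - R_ft)
    x: excess returns of the market proxy (R_mt - R_ft)
    Returns (alpha, beta, residual variance estimate).
    """
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx                      # slope: time invariant beta risk
    alpha = y_bar - beta * x_bar          # intercept
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)
    return alpha, beta, sigma2

# Hypothetical, noise-free data generated with alpha = 0.001 and beta = 1.2
x = [0.01, -0.02, 0.03, 0.00, -0.01]
y = [0.001 + 1.2 * xi for xi in x]
alpha, beta, sigma2 = fit_lmm(y, x)
```

On noise-free data the estimator recovers the generating intercept and slope up to floating-point error, which is a convenient sanity check before applying it to real returns.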

Generalized additive model (GAM)

Developed by Hastie and Tibshirani (1990), the GAM is used to evaluate whether the rigid parametric shapes of the LMM (Eq. 2) are too restrictive in this research. The GAM extension of the LMM is given as follows.

$${R}_{it}-{R}_{ft}={\alpha }_{i}+{f}_{i}\left({R}_{mt}-{R}_{ft}\right)+{\varepsilon }_{it} \;\;\;\;\;\; {\varepsilon }_{it} \sim N(0,{\sigma }_{i}^{2})$$
(3)

Here, \({\varepsilon }_{it} \sim N\left(0, {\sigma }_{i}^{2}\right)\) with \(E\left({\varepsilon }_{it}{\varepsilon }_{kt}\right)=0\) for \(i\ne k\) and \(E\left({\varepsilon }_{it}{\varepsilon }_{i\;t+j}\right)=0\) for \(j>0\). Let \({f}_{i}\left({R}_{mt}-{R}_{ft}\right)\) be a smooth function of \({R}_{mt}-{R}_{ft}\) that has a potentially non-linear relationship with \({R}_{it}-{R}_{ft}\). Here, the GAM provides a nonlinearity extension of the market model with \({\left({R}_{mt}{-R}_{ft}\right)}^{\tau }\), where \(\tau\) may take any positive value rather than only the integer values that represent the polynomial extension of the LMM (Eq. 2). The shape of the fitted model via GAM is estimated from the data itself (Simpson 2018). The GAM parameter estimation procedure is briefly outlined by Wood (2006). Note that the GAM analyses were computed using the mgcv package in the R software (R Core Team 2018).

Time-varying linear market model (Tv-LMM)

The time-varying extension of LMM, which allows for the time-varying beta risk parameter and is also known as the DGP of C-CAPM developed by Jagannathan and Wang (1996), is referred to here as the Tv-LMM. This extension is defined as the mean reverting form of the state space model and is divided into two equations: the observation equation (Eq. 4) and the state equation (Eq. 5) (Rosenberg 1973). The unknown parameters of these equations are estimated via the Kalman filter algorithm (Kalman 1960), which is a widely used method for the linear state space model and is referred to here as the KFMR. It is expressed as follows.

$${R}_{it}-{R}_{ft}={\alpha }_{i}+{\beta }_{imt}\left({R}_{mt}-{R}_{ft}\right)+{\varepsilon }_{it}\;\;\;\;\;\; {\varepsilon }_{it}\sim N(0,{H}_{i})$$
(4)
$${\beta }_{imt}={\overline{\beta }}_{im}+{\phi }_{i}\left({\beta }_{im\;t-1}-{\overline{\beta }}_{im}\right)+{w}_{it} \;\;\;\;\;\;\;{w}_{it}\sim N\left(0,{Q}_{i}\right)$$
(5)

where \({\overline{\beta }}_{im}=\frac{1}{T}{\sum }_{t=1}^{T}{\beta }_{imt}\). Here, \({\varepsilon }_{it}\) and \({w}_{it}\) are assumed to be mutually independent residuals and normally distributed with mean 0 and variances \({H}_{i}\) and \({Q}_{i}\), respectively. \({\phi }_{i}\) evaluates the temporal autocorrelation in \({\beta }_{imt}\) in cryptocurrency i \(\left(i=1,\dots ,10\right)\). Note that if \({\phi }_{i}=1\), this model becomes a random walk form of the state space model (Samuelson 1965), while if \({\phi }_{i}=0\), the model becomes a random coefficient form of the state space model (Schaefer et al. 1975). The prior parameter of KFMR is defined as follows.

$${\beta }_{im0 }\sim N\left({\mu }_{{\beta }_{im}},{\Sigma }_{{\beta }_{im}}\right)$$
(6)

Here, \({\mu }_{{\beta }_{im}}\) and \({\Sigma }_{{\beta }_{im}}\) are the initial estimates for \({\beta }_{im0}\) and are derived from the data in the estimation process. Optimal, updated, one-step-ahead, linear, and unbiased estimators of the unobservable state \({\beta }_{imt}\) are here provided by the Kalman filter algorithm. Briefly, the Kalman filter is a recursive algorithm that produces, at each time t, an estimator of the state vector \({\beta }_{imt}\), which is given by the orthogonal projection of the state vector onto the observed variables up to that time (Costa and Monteiro 2016).
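To make the mean reverting dynamics of Eqs. (4) and (5) concrete, the sketch below simulates a beta path and the corresponding excess returns for given parameter values. All names and numbers are hypothetical, and the simulation only illustrates the data generating process; it is not part of the estimation procedure.

```python
import random

def simulate_tv_lmm(x, alpha, beta_bar, phi, q, h, beta0=None, seed=0):
    """Simulate Eqs. (4)-(5) for one asset.

    x: market excess returns; q, h: state and observation variances.
    beta0: starting beta (defaults to the long-run mean beta_bar).
    """
    rng = random.Random(seed)
    beta = beta_bar if beta0 is None else beta0
    betas, ys = [], []
    for x_t in x:
        # State equation (5): beta reverts towards beta_bar at rate phi
        beta = beta_bar + phi * (beta - beta_bar) + rng.gauss(0.0, q ** 0.5)
        betas.append(beta)
        # Observation equation (4)
        ys.append(alpha + beta * x_t + rng.gauss(0.0, h ** 0.5))
    return betas, ys

x = [0.01, -0.02, 0.015]   # hypothetical market excess returns
# With both variances set to zero the path is deterministic:
betas, ys = simulate_tv_lmm(x, alpha=0.0, beta_bar=1.0, phi=0.5,
                            q=0.0, h=0.0, beta0=2.0)
```

With the noise switched off, the gap between beta and its long-run mean halves at each step when phi is 0.5, which shows the role of \({\phi }_{i}\) as the persistence parameter.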

The linear state space model via the Kalman filter and smoother algorithm (Shumway and Stoffer 2006; Neslihanoglu et al. 2021) is used to estimate the Tv-LMM in this study. This model is defined as follows:

$${Y}_{t}={A}_{t}{\kappa }_{t}+{\varepsilon }_{t}\;\;\;\;\;\;{\varepsilon }_{t} \sim N\left(0, H\right)$$
(7)
$${\kappa }_{t}=\Phi {\kappa }_{t-1}+{w}_{t}\;\;\;\;\; {w}_{t} \sim N\left(0, Q\right)$$
(8)

Let \({Y}_{t}\) be a \(q \times 1\) vector of observations and \({\text{A}}_{t}\) be a \(q \times p\) observation matrix and \({\kappa }_{t}\) is a \(p \times 1\) unobserved state vector at each time \(t \;(t=1,\dots ,n).\) Here, the transition parameter, \(\Phi\), is a \(p \times p\) matrix. Let \({\varepsilon }_{t}\) be a \(q \times 1\) vector of observation residuals, which are independent and identically distributed with \({\varepsilon }_{t} \sim N\left(0, H\right)\), and \({w}_{t}\) is a \(p \times 1\) vector of state residuals independent and identically distributed with \({w}_{t} \sim N\left(0, Q\right).\) Here, the model is wholly identified with the use of two other assumptions. First, it is assumed that \({\varepsilon }_{t}\) and \({w}_{t}\) are mutually independent for all t. Second, it is assumed that the initial state vector is \({\kappa }_{0}\sim N({\mu }_{0},{\Sigma }_{0})\).

The main purpose of this procedure is to estimate the unobserved state vector, \({\kappa }_{t}\), at time \(t\) given the observations \({Y}_{n}=\left\{{Y}_{1},{Y}_{2},\cdots ,{Y}_{n}\right\}\) up to time n. A prediction problem arises when \(t > n\), a filtering problem when \(t = n\), and a smoothing problem when \(t < n\). To solve these problems, the Kalman filter and smoother algorithms are used. These are defined below.

The forward recursion steps of the Kalman filter and smoother algorithm with initial conditions \({\kappa }_{0}^{0}={\mu }_{0}\) and \({P}_{0}^{0}={\Sigma }_{0}\), for \(t=1,\dots ,n\) can be implemented to mitigate the prediction (\(t > n\)) and filtering (\(t= n\)) problems. These steps are outlined as follows.

Prediction steps:

$$\text{Set the state prediction }\;\;\;\;\; {\kappa }_{t}^{t-1}={\Phi \kappa }_{t-1}^{t-1}$$
(9)
$$\text{Set the state variance prediction } \;\;\;\;\;{P}_{t}^{t-1}={\Phi P}_{t-1}^{t-1}{\Phi }^{^{\prime}}+Q$$
(10)

Filtering steps:

$$\text{Set the innovations }\;\;\;\;\;{v}_{t}={Y}_{t}-{A}_{t}{\kappa }_{t}^{t-1}$$
(11)
$${\text{Set the variance matrices of innovations }}\;\;\;\;\; \Sigma_{t}=Var\left({v}_{t}\right)={A}_{t}{P}_{t}^{t-1}{\text{A}}_{t}^{^{\prime}}+H$$
(12)
$$\text{Set the Kalman gain } \;\;\;\;\;{K}_{t}={P}_{t}^{t-1}{\text{A}}_{t}^{^{\prime}}{\Sigma }_{t}^{-1}$$
(13)
$$\text{Set the state filtering }\;\;\;\;\;{\kappa }_{t}^{t}={\kappa }_{t}^{t-1}+{K}_{t}{v}_{t}$$
(14)
$$\text{Set the state variance filtering }\;\;\;\;\;{P}_{t}^{t}={P}_{t}^{t-1}-{K}_{t}{A}_{t}{P}_{t}^{t-1}$$
(15)

Equations (9) to (15) should be cycled through for each time t. The forward recursions in Eqs. (9) through (15) constitute the Kalman filter.
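The forward recursion can be sketched for the simplest case of a scalar state and a scalar observation, where the matrix products in Eqs. (9) to (15) reduce to ordinary multiplication. The function name, argument names, and the toy inputs are hypothetical; the general model uses vectors and matrices.

```python
def kalman_filter(y, a, phi, q, h, k0=0.0, p0=1.0):
    """Scalar Kalman filter for y_t = a_t * k_t + eps_t, k_t = phi * k_{t-1} + w_t.

    q, h: state and observation variances; k0, p0: initial mean and variance.
    Returns a dict of per-step lists.
    """
    out = {"k_pred": [], "p_pred": [], "k_filt": [], "p_filt": [], "v": [], "s": []}
    k_filt, p_filt = k0, p0
    for y_t, a_t in zip(y, a):
        k_pred = phi * k_filt                    # Eq. (9)  state prediction
        p_pred = phi * p_filt * phi + q          # Eq. (10) variance prediction
        v = y_t - a_t * k_pred                   # Eq. (11) innovation
        s = a_t * p_pred * a_t + h               # Eq. (12) innovation variance
        gain = p_pred * a_t / s                  # Eq. (13) Kalman gain
        k_filt = k_pred + gain * v               # Eq. (14) filtered state
        p_filt = p_pred - gain * a_t * p_pred    # Eq. (15) filtered variance
        for key, val in zip(out, (k_pred, p_pred, k_filt, p_filt, v, s)):
            out[key].append(val)
    return out

# Toy example: a constant state (phi = 1, q = 0) observed three times with noise
res = kalman_filter(y=[2.0, 2.0, 2.0], a=[1.0, 1.0, 1.0], phi=1.0, q=0.0, h=1.0)
```

In this toy case the filtered state moves from the prior mean of 0 towards the observed value of 2, and the filtered variance shrinks with every new observation.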

The backward recursion steps of the Kalman filter and smoother algorithm with initial conditions \({\kappa }_{n}^{n}\) (Eq. 14) and \({P}_{n}^{n}\) (Eq. 15), which are obtained from the Kalman filter with \(t=n\), for \(t=n,n-1,\dots ,1\), can be used in order to mitigate the smoothing \(\left(t<n\right)\) problem. This is shown as follows.

Smoothing steps:

$$\text{Set the smoothed state }\;\;\;\;\;{\kappa }_{t-1}^{n}={\kappa }_{t-1}^{t-1}+{J}_{t-1}\left({\kappa }_{t}^{n}-{\kappa }_{t}^{t-1}\right)$$
(16)
$$\begin{aligned}&\text{Set the smoothed error variance}\quad{P}_{t-1}^{n}={P}_{t-1}^{t-1}+{J}_{t-1}({P}_{t}^{n}-{P}_{t}^{t-1}){J}_{t-1}^{\prime} \\ & {}\text{where}\quad{J}_{t-1}={P}_{t-1}^{t-1}{\Phi }^{{\prime}}{[{P}_{t}^{t-1}]}^{-1}\end{aligned}$$
(17)

Equations (16) to (17) should be cycled through for each time t. The backward recursion in Eqs. (16) through (17) is called the Kalman smoother. To estimate a state \({\kappa }_{t}\) given \({Y}_{n}\) with \(t<n\), the Kalman filter is first applied recursively until state \({\kappa }_{n}\) is reached. During this forward pass, the values \({\kappa }_{t}^{t-1}\), \({\kappa }_{t}^{t}\), \({P}_{t}^{t-1}\), and \({P}_{t}^{t}\) (\(t=1,\dots ,n\)) are stored. Next, the Kalman smoother is applied backwards until the time \(t\) of the state to be estimated is reached.
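The two passes can be sketched together for a scalar state: a forward pass that stores the predicted and filtered quantities, followed by the backward recursion of Eqs. (16) and (17). This is a simplified illustration with hypothetical names and inputs; it returns smoothed values for times 1 to n only and assumes the predicted variances are strictly positive.

```python
def filter_and_smooth(y, a, phi, q, h, k0=0.0, p0=1.0):
    """Scalar Kalman filter followed by the Kalman smoother."""
    n = len(y)
    k_pred, p_pred = [0.0] * n, [0.0] * n
    k_filt, p_filt = [0.0] * n, [0.0] * n
    k_prev, p_prev = k0, p0
    # Forward pass (Eqs. 9-15), storing everything the smoother needs
    for t in range(n):
        k_pred[t] = phi * k_prev
        p_pred[t] = phi * p_prev * phi + q
        v = y[t] - a[t] * k_pred[t]
        s = a[t] * p_pred[t] * a[t] + h
        gain = p_pred[t] * a[t] / s
        k_filt[t] = k_pred[t] + gain * v
        p_filt[t] = p_pred[t] - gain * a[t] * p_pred[t]
        k_prev, p_prev = k_filt[t], p_filt[t]
    # Backward pass: start from the last filtered values
    k_sm, p_sm = k_filt[:], p_filt[:]
    for t in range(n - 1, 0, -1):
        j = p_filt[t - 1] * phi / p_pred[t]                          # J_{t-1}
        k_sm[t - 1] = k_filt[t - 1] + j * (k_sm[t] - k_pred[t])      # Eq. (16)
        p_sm[t - 1] = p_filt[t - 1] + j * (p_sm[t] - p_pred[t]) * j  # Eq. (17)
    return k_filt, p_filt, k_sm, p_sm

# Constant state observed three times: smoothing pools all observations
k_filt, p_filt, k_sm, p_sm = filter_and_smooth(
    y=[2.0, 2.0, 2.0], a=[1.0, 1.0, 1.0], phi=1.0, q=0.0, h=1.0)
```

For a constant state, every smoothed estimate equals the final filtered estimate, because the smoother lets earlier time points borrow information from later observations.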

Throughout this process, it is assumed that the system matrices (\(H, \Phi\) and \(Q\)), the initial mean \({\mu }_{0}\), and the initial variance \({\Sigma }_{0}\) are known. More often, however, some elements of the system matrices depend on a vector of unknown parameters, Θ, which is estimated by maximum likelihood. The loglikelihood function of the linear state space model (Eqs. (7) and (8)), in the form that Harvey (1989) termed the “prediction error decomposition,” is as follows.

$$\text{log}{L}_{Y}\left(\Theta \right)= -\frac{nq}{2}\text{log}\left(2\pi \right)-\frac{1}{2}\sum_{t=1}^{n}\text{log}\left|{\Sigma }_{t}\left(\Theta \right)\right|-\frac{1}{2}\sum_{t=1}^{n}{{v}_{t}\left(\Theta \right)}^{^{\prime}}{\Sigma }_{t}{\left(\Theta \right)}^{-1}{v}_{t}\left(\Theta \right)$$
(18)

Here, \({v}_{t}\left(\Theta \right)\) and \({\Sigma }_{t}\left(\Theta \right)\) are calculated routinely by the Kalman filter (Eqs. 9–15), assuming that \({\Sigma }_{t}\left(\Theta \right)\) is nonsingular for t = 1,…,n. The Newton–Raphson algorithm can be applied iteratively to update the parameter values until the loglikelihood function (Eq. 18) is maximized. The standard deviation of the unknown parameters vector, \(\Theta\), is here calculated by \(\sqrt{diag\left(\Theta \left({H\left(\Theta \right)}^{-1}\right)\Theta \right)}\) where \(H(\Theta )=\frac{{\partial }^{2}log{L}_{Y}\left(\Theta \right)}{\partial\Theta \partial {\Theta }^{^{\prime}}}\) (known as the Hessian matrix), obtained in the Newton–Raphson algorithm. The optim function in the R software (R Core Team 2018) is used for the Newton–Raphson algorithm. Refer to Durbin and Koopman (2012) for an exhaustive review of the loglikelihood function.
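The prediction error decomposition in Eq. (18) can be evaluated in the same loop as the filter, since it only needs the innovations and their variances. The sketch below does this for a scalar state and scalar observation (q = 1 in the notation of Eq. 18) and then picks the best parameter triple from a coarse grid. The grid search is a deliberately simple stand-in for the Newton–Raphson updates described above, and the names, initial conditions, and data are hypothetical.

```python
import math

def neg_loglik(params, y, a, k0=0.0, p0=1.0):
    """Negative loglikelihood (Eq. 18) of a scalar state space model."""
    phi, q, h = params
    k, p = k0, p0          # assumed known initial mean and variance
    ll = 0.0
    for y_t, a_t in zip(y, a):
        k_pred = phi * k
        p_pred = phi * p * phi + q
        v = y_t - a_t * k_pred              # innovation
        s = a_t * p_pred * a_t + h          # innovation variance (must be > 0)
        ll += -0.5 * (math.log(2.0 * math.pi) + math.log(s) + v * v / s)
        gain = p_pred * a_t / s
        k = k_pred + gain * v
        p = p_pred - gain * a_t * p_pred
    return -ll

def grid_search(y, a, phis, qs, hs):
    """Return the (phi, q, h) on the grid with the highest loglikelihood."""
    candidates = [(phi, q, h) for phi in phis for q in qs for h in hs]
    return min(candidates, key=lambda th: neg_loglik(th, y, a))

y = [0.3, -0.1, 0.4, 0.2, -0.2]      # hypothetical observations
a = [1.0, 1.0, 1.0, 1.0, 1.0]
best = grid_search(y, a, phis=[0.0, 0.5, 0.9], qs=[0.01, 0.1], hs=[0.05, 0.5])
```

A gradient-based optimizer would replace `grid_search` in practice; the point here is only that each likelihood evaluation is one pass of the filter.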

The mechanism by which the Tv-LMM is applied, based on the Kalman filter and smoother algorithm, is described above. Throughout this procedure, the system matrices (\(H, \Phi\) and \(Q\)) and the initial mean \({\mu }_{0}\) and variance \({\Sigma }_{0}\) of the Tv-LMM (Eqs. (4) and (5)) are adapted as follows.

$$\begin{aligned} & \Phi =\left[\begin{array}{cc}0& 0\\ 0& {\phi }_{i}\end{array}\right]\quad Q=\left[\begin{array}{cc}0& 0\\ 0& {Q}_{i}\end{array}\right] H={H}_{i}\\ & {\kappa }_{0}^{0}={\left[\begin{array}{cc}{\mu }_{{{\upalpha }}_{{\rm i}0}} & {\mu }_{{\upbeta }_{{\rm im}0}}\end{array}\right]}^{^{\prime}} \quad{P}_{0}^{0}=\left[\begin{array}{cc}{\Sigma }_{{\alpha }_{{\rm i}0}}& 0\\ 0& {\Sigma }_{{\upbeta }_{{\rm im}0}}\end{array}\right]\end{aligned}$$
(19)

In addition, the unknown parameters vector, \(\Theta =\left\{{Q}_{i},{H}_{i},{\phi }_{i},{\alpha }_{i},{\overline{\beta }}_{im}\right\}\), is estimated by means of a consistent application of the Kalman filter. The parameter estimation procedure and the R software coding (R Core Team 2018) of the Kalman filter and smoother are briefly outlined in Shumway and Stoffer (2006).

The performance of the proposed models is compared by utilizing the MAE and the MSE criteria, represented as follows.

$$\text{MAE}=\frac{1}{T}\sum_{t=1}^{T}\left|\left(\widehat{{R}_{it}-{R}_{ft}}\right)-\left({R}_{it}-{R}_{ft}\right)\right|$$
(20)
$$\text{MSE}=\frac{1}{T}\sum_{t=1}^{T}{\left(\left(\widehat{{R}_{it}-{R}_{ft}}\right)-\left({R}_{it}-{R}_{ft}\right)\right)}^{2}$$
(21)

According to these criteria, the model with the lowest MSE and MAE values offers the best modeling (forecasting) performance.
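Both criteria in Eqs. (20) and (21) are simple averages over the prediction errors, as the following sketch with made-up numbers shows. The function names and the sample values are illustrative only.

```python
def mae(predicted, actual):
    """Mean absolute error, Eq. (20)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mse(predicted, actual):
    """Mean square error, Eq. (21)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

predicted = [1.0, 2.0, 3.0]    # hypothetical fitted excess returns
actual = [1.0, 4.0, 0.0]       # hypothetical observed excess returns
print(mae(predicted, actual))  # (0 + 2 + 3) / 3
print(mse(predicted, actual))  # (0 + 4 + 9) / 3
```

Because the MSE squares each error, it penalizes a few large misses more heavily than the MAE does, which is why the two criteria can rank models differently.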

The Diebold–Mariano (DM) test (Diebold and Mariano 1995) is here used for the robustness checking of the model fit (forecasting) accuracy in the two aforementioned models in terms of the MAE and MSE as the measures of the in-sample model fitting (out-of-sample forecasting) procedure, represented as follows.

$$\text{DM}=\frac{\overline{d }}{\sqrt{Var\left(\overline{d }\right)}}$$
(22)
$$d={\left|{\left(\left(\widehat{{R}_{it}-{R}_{ft}}\right)-\left({R}_{it}-{R}_{ft}\right)\right)}_{a}\right|}^{w}-{\left|{\left(\left(\widehat{{R}_{it}-{R}_{ft}}\right)-\left({R}_{it}-{R}_{ft}\right)\right)}_{b}\right|}^{w}$$

Here, \({\left(\left(\widehat{{R}_{it}-{R}_{ft}}\right)-\left({R}_{it}-{R}_{ft}\right)\right)}_{a}\) and \({\left(\left(\widehat{{R}_{it}-{R}_{ft}}\right)-\left({R}_{it}-{R}_{ft}\right)\right)}_{b}\) are the residuals at time t (t = 1,…,T) of the two models being compared, a and b. According to Choudhry and Wu (2009), w is set to 1 for MAE and 2 for MSE. Under the null hypothesis of no difference in model fit (forecasting) accuracy between the two models, the DM test statistic follows a \({t}_{n-1}\) distribution (Neslihanoglu et al. 2020). Note that these analyses were computed using the R software (R Core Team 2018).
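A bare-bones version of the DM statistic can be written as follows. Here the mean loss differential is scaled by its estimated standard error, the usual convention for this test, and the variance of the mean is estimated from the sample variance of d without any autocorrelation correction. That simplification, the function name, and the sample residuals are assumptions made for illustration.

```python
import math

def dm_statistic(resid_a, resid_b, w=2):
    """Diebold-Mariano statistic for two residual series.

    w = 1 compares absolute errors (MAE), w = 2 squared errors (MSE).
    Assumption: Var(d_bar) is estimated as sample variance of d divided
    by n, i.e. no correction for autocorrelation in d.
    """
    d = [abs(ea) ** w - abs(eb) ** w for ea, eb in zip(resid_a, resid_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((di - d_bar) ** 2 for di in d) / (n - 1)
    return d_bar / math.sqrt(var_d / n)

resid_a = [1.0, 2.0, 3.0, 4.0]   # hypothetical residuals of model a
resid_b = [2.0, 2.0, 2.0, 2.0]   # hypothetical residuals of model b
dm_abs = dm_statistic(resid_a, resid_b, w=1)
```

A positive value indicates larger losses for model a than for model b; swapping the two residual series flips the sign of the statistic.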

Results

Model comparison

In-sample model fit

The comparison of the proposed models’ performance for modeling each cryptocurrency return in the pre-COVID-19 and COVID-19 periods uses the in-sample model fitting procedure. The MSE and MAE results between the actual and the theoretical values of the model for each cryptocurrency return during both time periods are summarized, respectively, in Table 3.

Table 3 In-sample model fit comparison criteria

As shown in Table 3, on average, the Tv-LMM (with the lowest MAE and MSE) improves on the LMM (with the highest MAE and MSE) in terms of MAE (MSE) by 15.6% (22.2%) for cryptocurrencies in the pre-COVID-19 period, while it improves on the LMM by 21.9% (33.5%) for cryptocurrencies during the COVID-19 period. In addition, on average, the GAM improves on the LMM in terms of MAE (MSE) by 1.9% (6.4%) for cryptocurrencies in the pre-COVID-19 period and by 1.9% (2.7%) for cryptocurrencies during the COVID-19 period. Clearly, all models perform worse for cryptocurrencies in the pre-COVID-19 period than during the COVID-19 period. This may be because cryptocurrencies were more stable during the COVID-19 period than in the pre-COVID-19 period, as evidenced by Table 2. To sum up, the Tv-LMM appears to be preferable for modeling the daily cryptocurrency price indices in the pre-COVID-19 and COVID-19 periods, as it achieves the lowest average MAE and MSE.

Out-of-sample forecasting

The forecasting performance of the aforementioned models is assessed here using 1-day and 7-day ahead predictions with the rolling window technique for both the pre-COVID-19 and COVID-19 periods. In both periods, the length of the rolling window is 180 days (6 months) and the length of the prediction period is 30 days (1 month); \({\beta }_{it}\) is predicted by generating a 1-day ahead and a 7-day ahead forecast, respectively, for each cryptocurrency. The MSE and MAE between the actual and the predicted returns of each cryptocurrency over these 30 values during both time periods are summarized in Tables 4 and 5, respectively.
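The rolling window procedure can be sketched in Python as below. This is only an illustration under stated assumptions: the 30 forecast targets are taken to be the last 30 observations, each model is refitted on the 180 observations ending h days before its target, and the forecast uses the realized market return at the target date. Only the static LMM is shown, and all function names and the synthetic data are hypothetical.

```python
def rolling_splits(n_obs, window=180, n_forecasts=30, horizon=1):
    """Yield (train_start, train_end, target) index triples.

    The training slice is [train_start, train_end) and the target lies
    `horizon` steps after the last training observation.
    """
    first_target = n_obs - n_forecasts
    if first_target - horizon + 1 - window < 0:
        raise ValueError("not enough observations for this design")
    for target in range(first_target, n_obs):
        train_end = target - horizon + 1
        yield train_end - window, train_end, target


def fit_market_model(x, y):
    """OLS estimates (alpha, beta) of y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return y_bar - beta * x_bar, beta


def rolling_forecasts(market, asset, horizon=1, window=180, n_forecasts=30):
    """Forecast asset excess returns with a rolling static market model."""
    preds, actuals = [], []
    for start, end, target in rolling_splits(len(asset), window,
                                             n_forecasts, horizon):
        alpha, beta = fit_market_model(market[start:end], asset[start:end])
        # Assumption: the realized market return at the target is plugged in
        preds.append(alpha + beta * market[target])
        actuals.append(asset[target])
    return preds, actuals


# Synthetic excess returns, made up for illustration only
market = [0.05 * ((i * 37) % 11 - 5) / 5 for i in range(220)]
asset = [0.001 + 1.2 * m for m in market]
preds_1d, actuals_1d = rolling_forecasts(market, asset, horizon=1)
preds_7d, actuals_7d = rolling_forecasts(market, asset, horizon=7)
```

The predicted and actual lists can then be passed to MAE and MSE functions to obtain criteria of the kind reported in Tables 4 and 5.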

Table 4 Out-of-sample forecasting comparison criteria (1-day ahead forward prediction)
Table 5 Out-of-sample forecasting comparison criteria (7-day ahead forward prediction)

As shown in Tables 4 and 5, on average, the Tv-LMM (with the lowest MAE and MSE) improves on the LMM (with the highest MAE and MSE) in terms of MAE (MSE) by 30.9% (42.3%) for the 1-day ahead forecast and by 28.1% (36.9%) for the 7-day ahead forecast for cryptocurrencies in the pre-COVID-19 period. It improves on the LMM by 15.5% (23.2%) for the 1-day ahead forecast and by 16.4% (25.4%) for the 7-day ahead forecast for cryptocurrencies during the COVID-19 period. Moreover, on average, the GAM improves on the LMM in terms of MAE (MSE) by 3.4% (10.1%) for the 1-day ahead forecast and by 2.7% (7.8%) for the 7-day ahead forecast for cryptocurrencies in the pre-COVID-19 period. On average, it improves on the LMM by 0.4% (0.9%) for the 1-day ahead forecast and by 0.2% (0.8%) for the 7-day ahead forecast for cryptocurrencies during the COVID-19 period. It is also apparent that all three models’ 1-day ahead and 7-day ahead forecasts are more accurate during the COVID-19 period than in the pre-COVID-19 period. This may be because the cryptocurrencies were more subject to outliers in the pre-COVID-19 period than in the COVID-19 period. To summarize, the Tv-LMM seems preferable for predicting the daily cryptocurrency price indices in the pre-COVID-19 and COVID-19 periods, given that it generated the lowest average MSE and MAE.

Robustness check

This section checks the robustness of the model fitting and forecasting accuracy results by applying the DM test described in Sect. 3 to each pair of models, in terms of MAE and MSE, for both procedures and the 10 cryptocurrencies in the pre-COVID-19 and COVID-19 periods. According to the DM test, the null hypothesis states that no differences exist in levels of model fitting (forecasting) accuracy between the two models being compared in the in-sample (out-of-sample) procedure for each cryptocurrency in each time period. Tables 6 and 7 show the numbers of cryptocurrencies for which the DM test rejects the null hypothesis at the 5% significance level during the pre-COVID-19 and COVID-19 periods, for the in-sample and out-of-sample procedures, respectively.

Table 6 The numbers of cryptocurrencies that reject the null hypothesis in terms of the MAE and MSE criteria in-sample procedure during the pre-COVID-19 and COVID-19 periods
Table 7 The numbers of cryptocurrencies that reject the null hypothesis in terms of the MAE and MSE criteria out-of-sample procedure during the pre-COVID-19 and COVID-19 periods

As shown in Tables 6 and 7, the model fitting accuracy and the 1-day and 7-day ahead forecasting accuracy of the Tv-LMM (which exhibits the best model fitting and forecasting performance) differ at a statistically significant level from those of the LMM and the GAM in terms of MAE and MSE for both procedures during the pre-COVID-19 and COVID-19 periods. Moreover, the accuracy of the GAM is generally not significantly different from that of the LMM in each time period. In sum, the Tv-LMM seems to be the preferable model for cryptocurrencies in the pre-COVID-19 and COVID-19 periods.

Graphical summary

The aforementioned models’ modeling performance is also presented using scatter plots that show the relationship between the returns of each cryptocurrency and the CCI30 during the pre-COVID-19 and COVID-19 periods. For the sake of brevity, only the top four cryptocurrencies by adjusted market capitalization (BPI, ETH, XRP, and BCH) are shown; their fitted model plots for the pre-COVID-19 and COVID-19 periods are presented in Figs. 3 and 4, respectively.

Fig. 3 The scatter plots of the four cryptocurrencies’ daily excess returns in the pre-COVID-19 period

Fig. 4 The scatter plots of the four cryptocurrencies’ daily excess returns in the COVID-19 period

As shown in Figs. 3 and 4, the Tv-LMM provides a much closer fit with the cryptocurrency data (especially for the COVID-19 period) owing to the time-varying relationship estimated between the excess returns of the cryptocurrency and the CCI30, suggesting that the short-term volatility in this relationship is captured by the Tv-LMM. The estimated relationships between the excess returns of the cryptocurrencies and the CCI30 from the LMM and the GAM are generally close to each other, with the exception of extreme values, which are more commonly observed in the pre-COVID-19 period than in the COVID-19 period. To sum up, the Tv-LMM seems to be the most appropriate model for cryptocurrencies, especially for the COVID-19 period.

Best model fit and forecasting performance

The model with the best modeling and forecasting performance for the pre-COVID-19 and COVID-19 periods is the Tv-LMM estimated via KFMR, which is described in Sect. 2. Its parameter estimates for each cryptocurrency in the pre-COVID-19 and COVID-19 periods are summarized in Tables 8 and 9, respectively.

Table 8 Tv-LMM via KFMR parameter estimates (with standard errors) of cryptocurrencies during the pre-COVID-19 period
Table 9 Tv-LMM via KFMR parameter estimates (with standard errors) of cryptocurrencies during the COVID-19 period

Tables 8 and 9 detail some key points. The average estimated observation variance (\(\widehat{{H}_{i}}\)) and state variance (\(\widehat{{Q}_{i}}\)) for the cryptocurrencies in the pre-COVID-19 period are higher than those during the COVID-19 period. This may be because of higher cryptocurrency volatilities in the pre-COVID-19 period compared to those in the COVID-19 period, as illustrated by Table 2. According to the average adjusted \({R}^{2}\), the Tv-LMM performs better during the COVID-19 period than in the pre-COVID-19 period. Moreover, the average temporal autocorrelation (captured by \(\widehat{{\phi }_{i}}\)) for the time-varying beta risk of cryptocurrencies is slightly higher in the pre-COVID-19 period than in the COVID-19 period. This suggests that the time-varying beta risk parameter changed rapidly in the pre-COVID-19 period, as illustrated by Figs. 5 and 6.
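To make the roles of the observation variance, state variance, and autocorrelation parameters concrete, the following is a minimal Python sketch of a scalar Kalman filter for a time-varying beta. It assumes a mean-reverting state equation of the form β_t = β̄ + φ(β_{t-1} − β̄) + η_t, with Var(η_t) = Q, and an observation equation y_t = α + β_t x_t + ε_t, with Var(ε_t) = H. This form is an assumption made for illustration and should be checked against the specification in Sect. 2. The parameters are treated as given rather than estimated, and all names and toy values are hypothetical.

```python
def kalman_beta(y, x, alpha, beta_bar, phi, H, Q, b0=None, P0=1.0):
    """Filtered time-varying beta for y_t = alpha + beta_t * x_t + eps_t.

    Assumed state equation (illustrative only):
        beta_t = beta_bar + phi * (beta_{t-1} - beta_bar) + eta_t
    H is the observation variance and Q the state variance.
    """
    b = beta_bar if b0 is None else b0  # initial state mean
    P = P0                              # initial state variance
    filtered = []
    for yt, xt in zip(y, x):
        # Prediction step
        b_pred = beta_bar + phi * (b - beta_bar)
        P_pred = phi ** 2 * P + Q
        # Update step
        v = yt - alpha - b_pred * xt    # one-step prediction error
        F = xt ** 2 * P_pred + H        # prediction error variance
        K = P_pred * xt / F             # Kalman gain
        b = b_pred + K * v
        P = (1.0 - K * xt) * P_pred
        filtered.append(b)
    return filtered


# Toy example, made up for illustration: the true beta is constant at 2
x = [1.0] * 50
y = [2.0] * 50
betas = kalman_beta(y, x, alpha=0.0, beta_bar=0.0, phi=1.0,
                    H=0.01, Q=0.0001, b0=0.0)
```

A larger Q relative to H lets the filtered beta react more strongly to each new observation, while φ controls how persistently deviations from the long-run level β̄ carry over from one day to the next.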

Fig. 5 The estimated \(\left({\widehat{\beta }}_{imt}\right)\) plots of four cryptocurrencies in the pre-COVID-19 period

Fig. 6 The estimated \(\left({\widehat{\beta }}_{imt}\right)\) plots of four cryptocurrencies in the COVID-19 period

To provide additional information on the behavior of the beta risk estimates of the top four cryptocurrencies by adjusted market capitalization (BPI, ETH, XRP, and BCH), Figs. 5 and 6 present the cryptocurrencies’ beta risk estimates from the Tv-LMM (exhibiting the best performance) and LMM (exhibiting the worst performance) in both time periods, respectively. The time-varying beta risk \(\left({\widehat{\beta }}_{imt}\right)\) estimates of the cryptocurrencies from the Tv-LMM fluctuate around those from the LMM, as expected, during both the pre-COVID-19 and COVID-19 periods.

Tables 8, 9 and 10 provide a guideline on which active (alpha) and passive (beta) investors can base their portfolios and from which they can learn what risks and rewards each cryptocurrency may entail. The average alpha (\(\widehat{{\alpha }_{i}}\)) is close to 0 and positive for cryptocurrencies in the pre-COVID-19 period, while it is close to 0 and negative for cryptocurrencies during the COVID-19 period. This implies that, on average, the actual returns of cryptocurrency i are higher than the expected returns derived by the Tv-LMM in the pre-COVID-19 period, while they are lower than the expected returns derived by the Tv-LMM during the COVID-19 period. It is worth mentioning that active (alpha) investors prefer to invest in cryptocurrencies with a positive alpha (which are considered valuable investments). Thus, the pre-COVID-19 period would have been the more attractive period for such investors.

Table 10 Descriptive statistics of the time-varying beta risk \(\left({\widehat{\beta }}_{imt}\right)\) estimates of the cryptocurrencies from the Tv-LMM during the pre-COVID-19 and COVID-19 periods

The time-varying beta (\({\widehat{\beta }}_{imt}\)) is a statistical measure of a cryptocurrency’s volatility relative to that of the CCI30, and it can be interpreted as both a measure of systematic risk and a performance measure. Its average across cryptocurrencies is 1.0719 during the pre-COVID-19 period, meaning that cryptocurrencies were theoretically 7.19% more volatile than the CCI30, and 0.9878 during the COVID-19 period, meaning that they were theoretically 1.22% less volatile than the CCI30. In addition, the fluctuation in time-varying betas is an indicator of the level of exposure to systematic risk: higher betas imply higher risk, while lower betas imply lower risk. Thus, cryptos with higher betas may gain more in up-markets but may also lose more in down-markets. BSV is the riskiest cryptocurrency, with the highest range of the time-varying beta risk \({\widehat{\beta }}_{imt}\) estimate series during both time periods. In light of this discussion of the stochastic behavior of cryptos’ betas, it can be concluded that the preferences of passive (beta) investors when allocating cryptos in their portfolios may become more flexible in the future and more closely related to their risk-tolerance level.
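The percentages quoted for the average betas follow from the distance of beta from 1, which is the beta of the CCI30 with respect to itself. A minimal Python sketch of this arithmetic, with a hypothetical helper name, is:

```python
def relative_volatility_pct(beta):
    """Percentage by which an asset is theoretically more (+) or
    less (-) volatile than the index, given its beta."""
    return round((beta - 1.0) * 100.0, 2)


pre_covid = relative_volatility_pct(1.0719)  # average beta, pre-COVID-19
covid = relative_volatility_pct(0.9878)      # average beta, COVID-19
```

Here `pre_covid` evaluates to 7.19 and `covid` to -1.22, matching the 7.19% and 1.22% figures reported above.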

Conclusion

This research compared the performance of the LMM and its extensions (namely, the GAM and the Tv-LMM) in terms of model fitting and 1-day and 7-day ahead predictions for the daily prices of 10 cryptocurrencies in the pre-COVID-19 and COVID-19 periods. The empirical findings favor the Tv-LMM, which outperforms the other models in both modeling and predicting the daily prices of these 10 cryptocurrencies, especially during the COVID-19 period. This provides evidence of a local linear relationship between each cryptocurrency and the CCI30 index, as opposed to the global linearity assumed by the traditional LMM.

The comparative analysis in this research clarifies the linearity extensions of the traditional market model in explaining cryptocurrency price. It is apparent that the time-varying linearity extension of the traditional market model, allowing for the time-varying beta risk parameters, absorbed the structural changes of the cryptocurrency price series. In conclusion, this research emphasizes the importance of the time-varying market model when dealing with crypto market inefficiencies.

These research findings should motivate a more systematic investigation of potential risk premia in the cryptocurrency market, insofar as there exist specific types of cryptos with a higher exposure to systematic risk (beta risk). It would be interesting to investigate whether this exposure is reflected in the profitability of investors. The approach taken in this research suggests that a connection between the exposure to systematic risk and the gains accumulated by trading cryptos is likely to exist, which points to the existence of crypto risk premiums. More analysis, however, is needed in this area. Moreover, the time dynamics of risk pricing cast doubt on the persistence of risk premiums during specific sub-periods, which indicates that there is more work to be done by regulators to improve the efficiency of this newly established market. Further research should aim to improve the efficiency of this market, thereby attracting more investors and reducing the risk of their diversified portfolios through the inclusion of cryptos.