1 Introduction

The cryptocurrency market became among the most sought-after topics in finance as Bitcoin (BTC) and Ether (ETH) hit multiple all-time highs and the total cryptocurrency market capitalization exceeded 2 trillion USD (comparable to that of Apple) in this year’s bull run. Unlike late 2017, when the market turned bearish after reaching its all-time high, the market has sustained a remarkable bullish trend this year thanks to the increased institutional adoption of cryptocurrency. As major cryptocurrencies such as BTC and ETH are still in the price discovery phase after breaking their all-time highs and institutional adoption keeps surging, it is worth investigating the relationship between institutional activity and price discovery in the cryptocurrency market. Several prior studies have examined price discovery in the cryptocurrency market using various approaches. Brandvold et al. (2015), for instance, examine price discovery across different BTC exchanges from April 2013 to February 2014 and find that Mt.Gox and BTC-e were the leaders in price discovery during this period. Dimpfl and Peter (2021) study various exchanges including Bitfinex, Kraken and Poloniex from March 2017 to November 2017 and find that Bitfinex leads price discovery, followed by Poloniex and Kraken. In addition to price discovery across exchanges, other studies investigate price discovery between spot and futures prices, such as Baur and Dimpfl (2019), Corbet et al. (2018), and Kapar and Olmo (2019). The findings of these studies are, however, not unanimous: Baur and Dimpfl (2019) and Corbet et al. (2018) claim that the spot price leads price discovery, while Kapar and Olmo (2019) reach the opposite conclusion. Recently, Chang and Shi (2020) examine the dynamic information shares of BTC, ETH, Ripple (XRP) and Litecoin (LTC) using daily data and find that the information share of BTC is the largest. Their results also show that 20% of the variation in the BTC information share can be explained by market capitalization and trading volume.

Earlier studies such as Moosa (2020) claim that the steep decline in BTC over late 2017–2018 is evidence of a price bubble rather than a price discovery path. However, we have witnessed a significant increase in institutional activity in the cryptocurrency market, especially in BTC, from entities such as MicroStrategy, Tesla, Grayscale BTC Trust and Ruffer Investment Company toward the end of 2020 and in early 2021 (Footnote 1). As institutional investors continue to enter the BTC market as it matures, their trading activity is expected to play a vital role in the price discovery of BTC. The important role of institutional investors in asset markets is documented in numerous studies for equity (e.g., Chan & Lakonishok, 1995; Foster et al., 2011; Sias et al., 2001), and institutional trading is also reported to drive the stock market more than retail trading (e.g., Griffin et al., 2003; Koesrindartoto et al., 2020; Stoffman, 2014). Since the current literature has not clearly explained the significance of institutional trading activity in the price discovery of BTC, our study provides more empirical evidence on this branch of the literature. In particular, we analyze whether the information share of the futures price in BTC price discovery, measured in the spirit of Hasbrouck (1995, 2003), is positively correlated with BTC futures trading activity, which would imply that institutional trades matter in BTC price discovery.

Another goal of this study is to examine whether the relative importance of institutional trading activity in BTC price discovery contributes to the correlation of BTC with traditional assets such as equity, gold and bonds. Several studies have investigated the correlation between cryptocurrencies and traditional assets, such as Brière et al. (2015), Bouri et al. (2017a, 2017b), Corbet et al. (2018), Shahzad et al. (2019), and Charfeddine et al. (2020), and most document a weak correlation between BTC/cryptocurrencies and conventional financial assets, making BTC a potential candidate for diversification benefits in a multi-asset portfolio (Pham et al., 2021). However, the existing literature on the cross-asset correlation of BTC mainly covers periods when the BTC market was driven by retail investors only, and little is known about the impact of time-varying institutional trading activity on the correlation of BTC with other assets. While retail investors still play an important role in BTC price surges, institutional investors can provide more liquidity and potentially become more crucial in BTC price discovery as the market matures, similar to the stock market (e.g., Chan & Lakonishok, 1995; Foster et al., 2011; Sias et al., 2001). Therefore, the rebalancing of institutional multi-asset portfolios that include a BTC position is likely to affect the correlation of BTC with traditional assets. As such, it is important to empirically examine the extent to which the relative importance of institutional trading activity in the BTC market contributes to the cross-asset correlations of BTC. In particular, our study analyzes whether the BTC futures information shares, a proxy for institutional involvement in the BTC market, drive the correlation of BTC with traditional assets.

This study evaluates the impact of institutional trading activity on the price discovery of BTC and its correlations with other assets. In general, we find that the information shares of BTC futures contribute more (less) to BTC price discovery when the trading activity of institutional traders in BTC futures rises (falls). Our results also show that the correlation between BTC and US equity/gold decreases when the information shares of the futures price, and hence institutional involvement in the BTC market, rise, as institutions tend to buy (sell) more BTC by selling (buying) equity/gold. In addition, we find that institutional activity in the BTC market increases with the BTC-bond correlation, although the results are less significant. The remainder of this paper is structured as follows. Section 2 describes the data and methodology used in the study, Sect. 3 discusses the findings, and Sect. 4 concludes the paper.

2 Data and methodology

We obtain 1-s intraday data on the nearest-maturity futures prices of BTC, the 10-year T-Note and e-mini gold on the Chicago Mercantile Exchange (CME), the S&P 500 ETF (SPY) on the New York Stock Exchange, and the spot prices of BTC and gold from Thomson Reuters Tick History. The sample runs from 18th December 2017, the earliest date available for BTC futures contracts, to 26th March 2021. We note that while SPY trades Monday–Friday from 9:30 to 16:00 EST, the futures contracts of BTC, the 10-year T-Note and gold trade Sunday–Friday from 17:00 to 16:00 CT, and the spot prices of BTC and gold are available 24 h a day. In addition, we use SPY and 10-year T-Note futures to proxy for the S&P500 index and bond spot prices, respectively. Lastly, we collect the daily BTC spot market capitalization and the mean USD transaction fee of the BTC blockchain from coinmetrics.io, and Google Trends data for the keyword “Bitcoin” from trends.google.com, as control variables in the regression analysis.

We first test whether the information shares of the BTC futures market \(IS\) are linked to futures trading activity with the following regression on daily observations:

$${\text{log}}\left( {BTC \, futures \, trading \, activity} \right)_{t} = \beta_{0} + \beta_{1} {\text{log}}(\% \,spread_{t} ) + \beta_{2} {\text{log}}(realized \, volatility_{t} ) + \beta_{3} IS_{t} + \epsilon_{t} ,$$
(1)

where the \(realized \, volatility\) and \(\% \, spread\) are common factors to explain the trading activity, see Karpoff (1987) and Chordia et al. (2001), respectively. \(realized \, volatility\) is measured as the square root of the sum of squared 1- or 5-min midquote futures returns over a day, and \(\% \, spread\) is measured as the daily average of (ask price  –  bid price)/midquote price in futures market throughout the day. We also control for the weekday fixed effects and calculate the Newey and West (1987) standard errors adjusted for 5 lags. We next analyze the extent to which the BTC futures’ information shares can explain the BTC correlation with other assets from the following regression:

$$\begin{aligned} correlation_{t} & = \beta_{0} + \beta_{1} {\text{log}}(BTC \, futures \, trading \, activity_{t} ) \\ &\quad+ \beta_{2} {\text{log}}(the \,other \, asset \, trading \, activity_{t} ) + \beta_{3} IS_{t - i} + \epsilon_{t} , \end{aligned}$$
(2)

where the daily realized correlation is computed from 1-min spot returns, the daily BTC futures’ information share \(IS_{t - i}\) enters with a lag of \(i = 0\) to 5 days relative to the BTC correlation with other assets, and the daily \(trading \, activity_{t}\) is measured by trading volume or the number of trades on day \(t\) to control for the positive linkage between trading activity and asset price volatility, and therefore any potential relation between trading activity and cross-asset correlation. We also include the weekday fixed effects and calculate the Newey and West (1987) standard errors adjusted for 5 lags. As a robustness test, we add the BTC spot market capitalization, the daily mean USD transaction fee of the BTC blockchain, and the Google Trends index for the keyword “Bitcoin” to capture the rising interest in BTC, which in turn affects the BTC price and its correlation with other assets; see, e.g., Chuffart (2021) for the predictive power of cryptocurrency attention for the BTC price.
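The daily inputs to Eqs. (1) and (2) can be illustrated with a minimal sketch. It assumes that the realized correlation is the realized covariance scaled by the two realized volatilities, which is a standard form but is not spelled out in the text; the function names and the toy prices are hypothetical and are not taken from the paper's data.

```python
import math

def log_returns(prices):
    """Log returns from a sequence of positive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def realized_volatility(returns):
    """Square root of the sum of squared intraday returns over one day."""
    return math.sqrt(sum(r * r for r in returns))

def percent_spread(bids, asks):
    """Daily average of (ask - bid) / midquote over all quotes in the day."""
    ratios = [(a - b) / ((a + b) / 2.0) for b, a in zip(bids, asks)]
    return sum(ratios) / len(ratios)

def realized_correlation(returns_x, returns_y):
    """Realized covariance divided by both realized volatilities (assumed form)."""
    cov = sum(x * y for x, y in zip(returns_x, returns_y))
    return cov / (realized_volatility(returns_x) * realized_volatility(returns_y))

# Toy 1-min midquotes for a single day (made-up numbers for illustration only).
btc = [100.0, 101.0, 100.5, 102.0, 101.0]
spy = [50.0, 50.2, 50.1, 50.5, 50.3]
r_btc, r_spy = log_returns(btc), log_returns(spy)

rv = realized_volatility(r_btc)
rho = realized_correlation(r_btc, r_spy)
spread = percent_spread([99.9, 100.9], [100.1, 101.1])
```

Each function returns one number per day, so applying them day by day would yield the daily series that enter the regressions. The sketch uses uncentered intraday returns, a common convention at high frequency that is also an assumption here.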

Our sample period contains the COVID pandemic, which has been shown to adversely impact financial markets and therefore the connectedness and volatility transmission between different assets. For example, Fig. 1 illustrates a shift to the positive domain in the daily correlation between BTC and S&P500 as well as gold since 2020. We address this issue in two ways. We first include the time trend and its squared value in (Eq. 3) to capture any nonlinear trend in the correlation. Alternatively, we include a dummy variable COVID equal to 1 from the start of 2020 onward and its interaction with \(IS_{t - i}\) in (Eq. 4) to study any change in the main coefficient of interest during the COVID pandemic as follows:

$$\begin{aligned} correlation_{t} & = \beta_{0} + \beta_{1} {\text{log}}(BTC \, futures \, trading \, activity_{t} ) \\ & \quad + \beta_{2} {\text{log}}(the \, other \, asset \, trading \, activity_{t} ) + \beta_{3} IS_{t - i} + \beta_{4} {\text{log}}(BTC \, market \, cap_{t} ) \\ & \quad + \beta_{5} {\text{log}}(daily \, mean \, USD \, transaction \, fee_{t} ) + \beta_{6} {\text{log}}(Google \, trend_{t} ) \\ & \quad + \beta_{7} { }time \, trend_{t} + \beta_{8} { }time \, trend_{t}^{2} + \epsilon_{t} \\ \end{aligned}$$
(3)
$$\begin{aligned} correlation_{t} & = \beta_{0} + \beta_{1} {\text{log}}(BTC \, futures \, trading \, activity_{t} ) \\ &\quad+ \beta_{2} {\text{log}}(the \, other \, asset \, trading \, activity_{t} )\\ &\quad + \beta_{3} IS_{t - i} + \beta_{4} IS_{t - i} \times COVID + \beta_{5} COVID + \epsilon_{t} . \end{aligned}$$
(4)
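As a rough illustration of how the COVID terms in Eq. (4) could be constructed, the sketch below builds the lagged information share, the dummy, and their interaction for each day. The function name, the toy dates and the information share values are hypothetical, and the lag is counted in observation days, which is an assumption.

```python
from datetime import date

COVID_START = date(2020, 1, 1)  # dummy equals 1 from the start of 2020

def eq4_regressors(dates, info_share, lag):
    """Rows of (date, IS_{t-lag}, IS_{t-lag} * COVID, COVID) for days with a lagged value."""
    rows = []
    for t in range(lag, len(dates)):
        covid = 1 if dates[t] >= COVID_START else 0
        is_lag = info_share[t - lag]
        rows.append((dates[t], is_lag, is_lag * covid, covid))
    return rows

# Made-up daily information shares around the turn of the year.
dates = [date(2019, 12, 30), date(2019, 12, 31), date(2020, 1, 2), date(2020, 1, 3)]
info_share = [0.55, 0.60, 0.58, 0.62]
rows = eq4_regressors(dates, info_share, lag=1)
```

With this construction, the slope on the information share is \(\beta_{3}\) before 2020 and \(\beta_{3} + \beta_{4}\) from 2020 onward, so \(\beta_{4}\) measures the change in the main coefficient during the pandemic.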
Fig. 1 Daily realized correlation of different assets. This figure presents the daily realized correlation between different assets based on 1-min spot returns. The sample period starts from 18th December 2017 to 26th March 2021. SPY trades in Monday–Friday from 9:30 to 16:00 EST, the futures contracts of 10-year T-Note trade in Sunday–Friday 17:00 to 16:00 CT, and the BTC and gold spot prices are available throughout 24 h a day. SPY and 10-year T-Note futures are used as a proxy of S&P500 and bond spot prices, respectively.

To proxy for the information share of the futures price in BTC price discovery, we follow Hasbrouck (1995, 2003). Since the futures and spot markets share the same underlying cryptocurrency, the random walk component is the same for all prices. The random walk innovation variance is then decomposed into components attributed to innovations in each price, and the relative contribution of a price series to this variance is defined as its information share. Let \(p_{1t}\) and \(p_{2t}\) be the log spot and futures prices, respectively, so that the quantity \(p_{1t} - p_{2t}\) ex-ante does not diverge over time. To measure the information share of either the futures or the spot price, we estimate a reduced-form econometric specification, namely the vector error correction model (VECM) of order \(M\), as follows:

$${\Delta }{\varvec{p}}_{t} = {\varvec{A}}_{1} {\Delta }{\varvec{p}}_{t - 1} + {\varvec{A}}_{2} {\Delta }{\varvec{p}}_{t - 2} + \cdots + {\varvec{A}}_{M} {\Delta }{\varvec{p}}_{t - M} + \gamma \left( {p_{1,t - 1} - p_{2,t - 1} - \mu } \right) + {\varvec{u}}_{t} ,$$
(5)

where \({\varvec{p}}_{t}\) is the column vector of log prices, \({\varvec{A}}_{i}\) is the matrix of autoregressive coefficients, \(\mu = E\left( {p_{1,t} - p_{2,t} } \right)\) is the mean deviation, \(\gamma \left( {p_{1,t - 1} - p_{2,t - 1} - \mu } \right)\) is the error correction term, and \(\gamma\) is the adjustment coefficient. Each price in the VECM model contains a latent random walk component or the “efficient price”. This component is unobservable without further identification restrictions in the current reduced-form specification, but its innovations have the property that they are linear in the disturbances. In other words,

$${\varvec{w}}_{t} = \left( {w_{1,t} w_{2,t} } \right)^{\prime } = A{\varvec{u}}_{t} = \left[ {\begin{array}{*{20}c} {a_{11} } & {a_{12} } \\ {a_{21} } & {a_{22} } \\ \end{array} } \right]\left( {u_{1t} u_{2t} } \right)^{\prime } ,$$
(6)

where \(w_{it}\) is the random walk innovation in the ith price series, and the \(a_{ij}\) are determined from the VECM parameter set. Both futures and spot prices reflect the same efficient price, so the random walk innovations are identical. Therefore, the rows in the coefficient matrix are the same, and we focus on either one to obtain the random walk innovation variance as

$$Var\left( {w_{1t} } \right) = \left( {a_{11} a_{12} } \right) \left[ {\begin{array}{*{20}c} {\sigma_{1}^{2} } & {\sigma_{12} } \\ {\sigma_{12} } & {\sigma_{2}^{2} } \\ \end{array} } \right]\left( {a_{11} a_{12} } \right)^{\prime } .$$
(7)

If the price innovation covariance matrix is diagonal, \(Var\left( {w_{1t} } \right) = a_{11}^{2} \sigma_{1}^{2} + a_{12}^{2} \sigma_{2}^{2}\), and the information share of the ith price series is defined as \(I_{i} = a_{ii}^{2} \sigma_{i}^{2} /Var\left( {w_{1t} } \right)\). If the price innovation covariance matrix is not diagonal, the information share is not exactly identified, so we examine alternative factor rotations and take the simple average of the information shares across rotations. We measure the information share on a daily basis to allow for a time-varying mean deviation \(\mu\). We also note that if the price innovations are highly correlated, it is not possible to assign explanatory power with any precision, so we employ a time resolution of 1–5 s to avoid introducing correlation through time aggregation. To avoid a large number of coefficients when the interval width is small, we follow Hasbrouck (2003) and use polynomial distributed lags.
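The decomposition can be sketched for the two-price case as follows. This is a minimal illustration that assumes the "alternative factor rotations" are the two Cholesky orderings of the innovation covariance matrix, as is common for Hasbrouck-type information shares; the function names and input values are hypothetical, and the estimation of the common row \((a_{11}, a_{12})\) from the VECM is not shown.

```python
import math

def random_walk_variance(a, s11, s12, s22):
    """Quadratic form in Eq. (7) for the common row a = (a11, a12)."""
    a1, a2 = a
    return a1 * a1 * s11 + 2.0 * a1 * a2 * s12 + a2 * a2 * s22

def cholesky_shares(a, s11, s12, s22):
    """Information shares with price 1 ordered first in the Cholesky factor."""
    a1, a2 = a
    total = random_walk_variance(a, s11, s12, s22)
    f11 = math.sqrt(s11)              # lower-triangular factor of the covariance
    f21 = s12 / f11
    f22 = math.sqrt(s22 - f21 * f21)
    share1 = (a1 * f11 + a2 * f21) ** 2 / total
    share2 = (a2 * f22) ** 2 / total
    return share1, share2

def average_information_shares(a, s11, s12, s22):
    """Simple average of the shares over the two orderings (assumed rotations)."""
    first = cholesky_shares(a, s11, s12, s22)
    # Put price 2 first, then map the result back to the (price 1, price 2) order.
    swapped = cholesky_shares((a[1], a[0]), s22, s12, s11)
    second = (swapped[1], swapped[0])
    return tuple((x + y) / 2.0 for x, y in zip(first, second))

# Made-up inputs: price 1 is spot, price 2 is futures.
diag = average_information_shares((0.5, 0.5), 1.0, 0.0, 4.0)
corr = average_information_shares((0.5, 0.5), 1.0, 1.0, 4.0)
```

In the diagonal example both orderings coincide and the shares reduce to \(a_{1i}^{2} \sigma_{i}^{2} /Var\left( {w_{1t} } \right)\), roughly 0.2 for spot and 0.8 for futures. In the correlated example the two orderings give different splits, and the averaged shares still sum to one.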

We note that the BTC spot returns and futures returns Granger-cause each other in a vector autoregressive VAR(20) model of daily returns at the 1% significance level (Footnote 2). In other words, futures returns and their trading activity matter to the spot market and therefore to the underlying BTC price, even when institutional investors only trade futures contracts in the BTC market. This is because institutional investors (Footnote 3) are the main players in the CME futures market due to its restrictive contract specifications, e.g. 5 BTCs per contract vs. 0.001 BTC as the minimum trade amount per perpetual futures contract on Binance, and they had limited access to the spot market due to regulatory issues over the sample period (Footnote 4). This gives us further support for using the information shares of the futures market as a proxy for the relative importance of institutional trading in the BTC market. Alternatively, Gonzalo and Granger (1995) measure the information share of a market from the permanent (as opposed to transitory) shocks in that market to the underlying price that result in a disequilibrium, which is reflected through the error correction process of the CME futures price in our context. We argue that this error correction process is potentially inefficient when compared with other traditional assets because a majority of institutional investors are not able to exploit arbitrage opportunities in the BTC market by trading futures and spot contracts. Therefore, we focus on Hasbrouck’s information shares and leave other metrics for future research.

3 Findings and discussion

Figure 1 presents the daily realized correlation between the BTC spot market and S&P500, gold and bond returns over the sample period. Until the end of 2019, the BTC price did not exhibit a clear relationship with S&P500, gold and bonds, since the correlations fluctuated around zero over this period. Since the start of 2020, the daily correlations between BTC and S&P500 as well as gold have on average shifted to the positive domain and varied strongly in this region. Meanwhile, BTC and bonds demonstrated a weakly negative relationship, with correlation fluctuations less intense than those observed between BTC and S&P500 or gold. Over the full sample period, the averages of the daily correlation between BTC and the other asset classes are reported in Table 1 to be 0.05, 0.01, and -0.005 for S&P500, gold, and bond, respectively. In addition, the standard deviations of the daily correlations are at least twice as large as the corresponding averages, suggesting that these assets do not have a stable relationship. Figure 2 presents the daily trading activity in the futures markets of BTC, S&P500, gold, and bond, with trading activity measured by trading volume or the number of trades. Together with the summary statistics in Table 1, it suggests that trading activity is heaviest in the stock market, followed by bond, BTC and gold. On average, the daily volumes are 73,108 thousand shares, 1110 thousand contracts, 5 thousand contracts, and 0.65 thousand contracts for S&P500, bond, BTC, and gold, respectively. In addition, the standard deviations of daily trading activity are smaller than the corresponding averages for all assets except gold, implying that the S&P500, bond and BTC markets display more stable and intense trading activity than gold on average. The conclusion remains the same with the number of trades. To measure the extent of institutional activity in the BTC market, we utilize the information shares of the BTC futures price. The summary statistics in Table 1 document that the contribution of the futures price to BTC price discovery is considerable, with a daily average information share of 57–61%. In addition, the daily standard deviations of the information shares are smaller than the corresponding averages by 40–50%, indicating that the information shares do not fluctuate wildly over the sample period.

Table 1 Summary statistics
Fig. 2 Daily trading activity of different assets. This figure presents the daily trading activity measured in (log) volume or (log) number of trades (# trade) of different assets. The sample period starts from 18th December 2017 to 26th March 2021. SPY trades in Monday–Friday from 9:30 to 16:00 EST, and the futures contracts of BTC, 10-year T-Note and gold trade in Sunday–Friday 17:00 to 16:00 CT.

We first confirm that the information shares of the futures market are closely associated with its trading activity and present the results of Eq. (1) in Table 2. With either the number of trades or trading volume as the proxy for trading activity, the coefficients on the information shares are positive and statistically significant at the 1–5% level. In other words, the greater the contribution of the futures price to BTC price discovery, the more intense the trading activity in the futures market. Our findings are consistent with the previous literature in Chordia et al. (2001) and Karpoff (1987), whereby the coefficients on \(\% \, spread\) and \(realized \, volatility\) are negative and positive, respectively, and statistically significant at the 1% level. Intuitively, a more liquid or more volatile market is associated with a higher level of trading activity. The results are also robust to different data frequencies in the measurement of information shares. To conclude, the information shares of the BTC futures market meaningfully capture the trading activity of the futures market in the presence of standard controls. This suggests that the price discovery of the underlying BTC incorporates BTC futures market activity, which is commonly attributed to institutional investors in the context of CME futures contracts.

Table 2 Regression results for trading activity with information shares

To test the hypothesis that the relative importance of institutional investors in the BTC market contributes to the correlation movement between BTC and other assets, we run Eq. (2) and present the results in Tables 3, 4, 5 and 6. We first discuss the relationship between asset correlation and trading activity. With respect to the BTC-S&P500 correlation, the coefficients on trading activity in each market are positive and statistically significant at the 1% level, and the coefficients on BTC trading activity lose statistical significance in Eqs. (3) and (4) with more controls. Meanwhile, the trading activity in the futures market is positively (negatively) related to the BTC-gold correlation at the 1% (5%) significance level in Eq. (2) for BTC (Eq. (4) for gold), and it is positively (negatively) related to the BTC-bond correlation at conventional statistical significance levels in Eqs. (3)–(4) for BTC (across all equations for bond).

Table 3 Regression results for correlation with trading volume and one second-based information shares
Table 4 Regression results for correlation with trading volume and five second-based information shares
Table 5 Regression results for correlation with number of trades and one second-based information shares
Table 6 Regression results for correlation with number of trades and five second-based information shares

Regarding the main hypothesis, Tables 3, 4, 5 and 6 document a negative coefficient on the information shares of BTC futures in the regression of BTC-S&P500 correlations that is statistically significant at the 1% level in Eq. (2) and robust to different data frequencies in the information shares. This implies that when institutional activity in BTC futures is higher, and hence the contribution of the futures price to BTC price discovery is greater, BTC and S&P500 prices move in opposite directions. This finding is aligned with the portfolio rebalancing channel whereby investors facing lower yields on securities may turn to higher-yielding alternatives (Modigliani & Sutch, 1966; Paludkiewicz, 2021; Tobin, 1969). In other words, profit-seeking institutional investors who prefer high risk-adjusted asset returns (Footnote 5) potentially reduce their exposure to the equity market to allocate funds to the BTC market, and vice versa. This leads to equity selling (buying) pressure that decreases (increases) equity prices and BTC buying (selling) pressure that increases (decreases) the BTC price. The impact is not short-lived given that the coefficients on the lagged information shares are negative and statistically significant, though they decline in magnitude. These results suggest that BTC may act as a hedge against a stock market crash and should be included in a well-diversified portfolio. The regression of the BTC-gold correlation also reports negative coefficients on the information shares that are statistically significant at the 1% level in most cases of Eq. (2), suggesting that portfolio rebalancing also takes place between the BTC and gold markets. We also run several robustness tests that take into account other controls as well as the COVID pandemic (see Eqs. (3) and (4) in Tables 3, 4, 5 and 6). For the sake of brevity, we do not report the coefficients of the additional controls; they are available upon request (Footnote 6). In general, the negative coefficient of information shares on the BTC-S&P500 correlation remains statistically significant in Eq. (3), and the results are more striking from 2020, with a statistically significant coefficient on the interaction term in Eq. (4). The conclusions remain the same for the BTC-gold correlation regression, although the negative coefficient of information shares loses some statistical significance in Eq. (3). Our results show that the COVID-19 pandemic has a material impact on the relation between the information shares and cross-asset correlation, with a stronger and statistically significant coefficient over the period starting from 2020. This also suggests that the participation of institutional investors in the BTC market has been more widely recognized since 2020.

With respect to the BTC-bond correlation, it is interesting that the coefficients on the information shares flip to positive in Tables 3, 4, 5 and 6, and they decline in magnitude and statistical significance in Tables 4 and 6 and in the case of lagged information shares. Similar to the case of the BTC-S&P500 and BTC-gold correlations, the result is stronger, with statistical significance at the 1–10% level, over the period starting from 2020, as seen from the interaction term coefficient. In contrast to the S&P500 and gold markets, the bond market does not support the portfolio rebalancing channel, which can be explained by the fact that bondholders may not be driven by profit-seeking (Footnote 7).

4 Conclusion

We find that the information shares in the spirit of Hasbrouck (1995, 2003) are significantly and positively related to BTC futures trading activity and that the futures market contributes to BTC price discovery. This indicates that spot market participants incorporate information from BTC futures trading activity on the CME, which is typically dominated by institutional investors.

Moreover, the estimation results reveal that increases in the information shares negatively affect both the BTC-S&P500 and BTC-gold pairwise correlations, especially from the start of 2020. These negative relationships could indicate that as institutional investors allocate more (less) funds to BTC, they decrease (increase) their S&P500 and gold positions as part of their allocation strategies. In contrast, increases in the information shares raise the BTC-bond correlation. Since the BTC market is quite fragmented, with multiple venues and derivative contracts across the world such as the Grayscale BTC Trust, BTC ETFs and futures trading in different time zones, future studies could analyze in more detail how the price discovery of BTC takes place, especially with the upcoming BTC ETFs in the U.S. that help institutional investors gain more exposure to BTC.

One drawback of our study is that we focus only on the BTC market because it has the largest market capitalization in the cryptocurrency world. While BTC has been approved by regulators in some developed countries such as Canada and the U.S. for futures trading and exchange-traded funds, many others among the more than ten thousand cryptocurrencies are subject to regulation and are often considered securities. It would therefore be interesting for future studies to examine how institutional trading activity affects the pairwise correlations between other cryptocurrencies and traditional assets, where the price discovery of securities-like cryptocurrencies and institutional interest depend on technological capabilities (Philippi et al., 2021) and regulations (Allen et al., 2021).