Blockchain analytics for intraday financial risk modeling
Abstract
Blockchain offers the opportunity to use the transaction graph for financial governance, yet properties of this graph are understudied. One key question in this direction is the extent to which the transaction graph can serve as an early-warning indicator for large financial losses. In this article, we demonstrate the impact of extreme transaction graph activity on the intraday volatility of the Bitcoin price series. Specifically, we identify certain subgraphs (‘chainlets’) that exhibit predictive influence on Bitcoin price and volatility and characterize the types of chainlets that signify extreme losses. Using bars ranging from 15 min up to a day, we fit GARCH models with and without the extreme chainlets and show that the former exhibit superior value-at-risk backtesting performance.
Keywords
Blockchain, Cryptocurrencies, Graph analysis, GARCH, Intraday financial risk
JEL Classification
C58, C63, G18
1 Introduction
The global financial system collapsed in 2008 and created the most severe recession in the history of the United States since the Great Depression of the 1930s. The “Flash Crash” of May 6th, 2010, in which the Dow Jones Industrial Average plunged approximately 5% only to recover within minutes, highlighted both the continued fragility of our financial markets and the inability of regulators either to preempt this crash or to promptly release post-crisis results of investigative studies aimed at preventing a recurrence of similar events. The double auction system deployed by financial markets is largely inaccessible to researchers, and the identity of the market participants is hidden. It is, therefore, impossible to reliably trace the movement of money through a financial network and to attribute financial market characteristics to the actions of specific agents.
Since the seminal Bitcoin paper (Nakamoto 2008), cryptocurrencies (Tschorsch and Scheuermann 2016) have been the most prominent Blockchain application. Designed to facilitate a secure distributed platform without central regulation, Blockchain is heralded as a paradigm that will be as powerful as big data, cloud computing, and machine learning.
Blockchain provides the capability to use the transaction graph for financial governance, but properties of this graph are understudied. One key question in this direction is the extent to which the transaction graph can serve as an early-warning indicator for large financial losses.
The goal of this article is to develop a representation of the transaction graph which augments classical financial risk modeling and can be used to address such a question. This novel data representation permits a big data version of financial econometrics, with the emphasis on the topological network structures in addition to the covariance of historical time series of prices. By processing all financial interactions, we model the network with a high-fidelity graph so that it is possible to characterize how the flow of information in the network evolves over time. In this graph, each node (also called a vertex) is an address that is created from the cryptographic keys of a Bitcoin account. The graph contains edges that represent inputs (i.e., received Bitcoins) or outputs (i.e., sent Bitcoins) to this address. The owner of the account can be uniquely identified by its corresponding address, but address creation is free. Furthermore, the community discourages using the same address to receive/send money multiple times; for change coins a user usually creates a new address. As a result, there can be multiple addresses belonging to a single user. On this graph, we employ a novel graph data model, referred to as a ‘chainlet’ (Akcora et al. 2018a), that captures local transaction information patterns, to store the graph for a given time period (e.g., one day). In particular, we model the impact of ‘extreme chainlets’ on intraday prices and volatility in a GARCH framework.
1.1 Related research
Several studies have already partially addressed the capacity and limitations for cryptocurrencies to provide a robust and transparent economic system for all economic participants (Caporale et al. 2018; Shah and Zhang 2014; Corbet et al. 2017; Dyhrberg 2016a; Gomber et al. 2017; Sovbetov 2018). In contrast to existing financial networks, Blockchain-based cryptocurrencies expose the entire transaction graph to the public, where payment sender, receiver and amount are visible to the public. Although any node can join the network without identifying itself, Bitcoin transactions are listed for all participants and the most significant agents can be immediately located on the network.
Graph analysis Analyzing the relationship between transactions and addresses and Bitcoin price has emerged as a pivotal research direction (Tasca et al. 2018). In particular, there is a growing focus on building statistical models which can predict and attribute price movements to transactions and transaction graph properties. While simple Blockchain transactional features, such as average transaction amount, are shown to exhibit mixed performance for cryptocurrency price forecasting (Greaves and Au 2015), a number of recent studies complement this article in demonstrating the utility of global graph features to predict the price (Akcora et al. 2018b; Madan et al. 2015; Kondor et al. 2014; Greaves and Au 2015). For instance, Sorgente and Cibils (2014) analyzed the predictive effects of average balance, clustering coefficient, and number of new edges on the Bitcoin price and Akcora et al. (2018a) use Blockchain chainlets as predictors. Two network flow measures were recently proposed by Yang and Kim (2015) to quantify the dynamics of the Bitcoin transaction network and to assess the relationship between flow complexity and Bitcoin market variables.
Intraday volatility All of the aforementioned studies are performed on daily Bitcoin market data. However, not only are data available on a much more granular timescale, but many practical financial applications also rely on intraday data to make trading and risk management decisions. Recently, Guo and Antulov-Fantulin (2018) apply various time series and machine learning approaches to the short-term (i.e., intraday) prediction of the Bitcoin exchange price fluctuations. The novelty in their study is the use of historical intraday market data combined with limit order information. Their results showed that limit order book information holds predictive power at short time horizons, lagging up to 30 min, but they found little empirical evidence to support longer term price impact.
GARCH Generalized autoregressive conditional heteroskedasticity (GARCH) models have been popularized in the financial econometric literature for their capacity to model volatility with empirically supported properties. The GARCH model, for example, can capture volatility clustering, correlation between the returns and the volatility, and for certain classes of GARCH models, asymmetric effects between positive and negative returns. Asymmetric GARCH is particularly useful in risk management and ideal for risk-averse investors in anticipation of negative shocks to the market (Dyhrberg 2016b).
GARCH modeling has emerged as the primary approach for cryptocurrency volatility modeling from daily price history (Cermak 2017; Dyhrberg 2016b; Chu et al. 2017; Ardia et al. 2018). Yet the application of GARCH to intraday cryptocurrency volatility modeling has garnered little attention. Building on this literature, we demonstrate the effect of including extreme chainlet activity, \(x_{t}\), in an intraday eGARCH model. Note that our focus is primarily on longer intraday risk horizons (hours) than would be relevant to high-frequency GARCH modeling (Shephard and Sheppard 2010) (sub-minute) and, for this reason, we do not include the exchange’s limit order book in our study (see Guo and Antulov-Fantulin 2018), instead choosing to use the Blockchain graph.
Market sentiment indices Finding alternative data sources which are strongly linked with prices and volatility is a well-established practice in the conventional financial markets. The ability to exploit such data sources often requires more advanced methodology as such data are irregularly spaced and may depart from stylized econometric data modeling assumptions. For example, Borovkova and Mahakena (2015) study the impact of news on returns, prices and price jumps in natural gas futures, and deploy a local level state space model to construct a news sentiment time series from irregularly spaced news announcements. Other data sources include limit order books, fundamentals, SEC filings and equity analyst ratings. In this regard, our proposed use of the transaction graph to predict price and risk is a form of alternative data analysis for trading and investment management. The compelling aspect of the transaction graph is its objectivity, instantaneous access and transparency. On the other hand, news sentiment analysis can be subjective, and fundamental ratios are reported infrequently and with some degree of opacity and potential for manipulation. Thus, in principle, the transaction graph has the potential to deliver robust and transparent cryptocurrency market sentiment indicators.
Contribution Bitcoin requires new data science research and methodology to demonstrate how full disclosure of an agent’s actions in a cryptocurrency market informs price discovery and ultimately serves as an early-warning indicator for excess market volatility or even a crash.
This article contributes to the growing body of research on the role of users, entities and their interactions in formation and dynamics of cryptocurrency risk investment, financial predictive analytics and, more generally, in reshaping the modern financial world.
Specifically, we model the impact of extreme chainlets on intraday prices and volatility in a GARCH framework. We mention in passing that the application of GARCH models to forecast daily Bitcoin prices has already been extensively investigated in Chu et al. (2017), and Cermak (2017). We supplement these findings, by (i) demonstrating the importance of including extreme chainlet activity and (ii) modeling intraday price and volatility, extending our previous study on the effect of chainlets on daily volatility (Akcora et al. 2018b).
2 Chainlets and extreme chainlets
As shown in Fig. 1, a Bitcoin graph consists of three main components: addresses, transactions and blocks (see Akcora et al. 2017 for a primer on Blockchain graphs). In the rest of this paper, we will use address and account interchangeably.
The Bitcoin protocol allows multiple accounts to participate in a transaction by each party signing its own part of the transaction. Both inputs and outputs of a transaction can have multiple accounts. For instance, in Fig. 1 transaction \(t_2\) receives Bitcoins from addresses \(a_2\) and \(a_3\) and deposits the amount to addresses \(a_5\) and \(a_6\). A real-life analogy is a person using multiple bank accounts, merging funds in a single transaction and sending the amount to multiple accounts. However, there are coin mixing (Maxwell 2013) services that allow unacquainted people to create a transaction together, where the coins are mixed to hide their origins. As such, addresses that appear in the inputs or outputs of a transaction may not belong to the same person/entity.
Inputs and outputs can provide vital information about transaction purposes. For example, transactions that involve hundreds of inputs and very few outputs may imply large amounts of Bitcoin investments.
One approach to understanding how transactions relate to market price is to introduce the novel concept of k-chainlets (Akcora et al. 2018a). A k-chainlet is a Bitcoin subgraph of \(k \ge 1\) transactions and their corresponding input and output addresses, which correspond to different accounts that are not necessarily unique to a user. In the simplest case, a single transaction creates a 1-chainlet with one or more inputs and one or more outputs. For example, in Fig. 1, transaction \(t_1\) results in the transfer of Bitcoin from address \(a_1\) to addresses \(a_4\) and \(a_5\). Such a transaction creates a 1-chainlet that has one input and two outputs. We denote this subgraph as a chainlet of type \({\mathbb {C}}_{1 \rightarrow 2}\), where 1 and 2 are the number of input and output addresses, respectively.
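The chainlet type of a single transaction can be sketched in a few lines of Python. This is only an illustration: the `Transaction` record is a hypothetical container rather than a type from any Bitcoin library, and counting distinct addresses on each side is an assumption about how repeated addresses would be treated.

```python
from typing import NamedTuple, Sequence, Tuple


class Transaction(NamedTuple):
    """Hypothetical minimal transaction record (not a real library type)."""
    inputs: Sequence[str]   # addresses the transaction spends from
    outputs: Sequence[str]  # addresses the transaction pays to


def chainlet_type(tx: Transaction) -> Tuple[int, int]:
    """Return (i, j) for the 1-chainlet C_{i -> j} created by one transaction.

    Assumption: i and j count distinct addresses on each side.
    """
    return len(set(tx.inputs)), len(set(tx.outputs))


# The two transactions described in the text for Fig. 1.
t1 = Transaction(inputs=["a1"], outputs=["a4", "a5"])
t2 = Transaction(inputs=["a2", "a3"], outputs=["a5", "a6"])

print(chainlet_type(t1))  # (1, 2), i.e. C_{1 -> 2}
print(chainlet_type(t2))  # (2, 2), i.e. C_{2 -> 2}
```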
2.1 Extreme chainlets
Graph analysis allows us to evaluate the local topological structure of the Bitcoin graph over time and assess the role of chainlets on Bitcoin price formation and dynamics. Figure 2 illustrates how the activity of the network can be represented by a chainlet occurrence matrix. For a given time period, we may count the occurrences of each \({\mathbb {C}}_{i \rightarrow j}\) and store them in a matrix. However, the maximum number of inputs or outputs of a chainlet can be large, sometimes exceeding 1000. When the number of inputs and/or outputs equals or exceeds a threshold N, we refer to these chainlets as “extreme chainlets”. In our historical analysis of daily snapshots, we choose \(N=20\), which corresponds to the 97.5th percentile of all chainlet occurrences. In other words, if there are chainlets falling beyond the Nth row or column, their information is stored in the last row and/or column of the matrix. It is instructive to distinguish between ‘left extreme chainlets’ and ‘right extreme chainlets’:

Left extreme chainlets are the subset \({\mathcal {C}}^\mathrm{l}:=\{{\mathbb {C}}_{i \rightarrow j} \mid i=N,\,j\in \{1,\ldots ,N\}\}\) highlighted in the bottom row in the figure. They represent transactions from a large number of accounts to fewer addresses, that is, Bitcoin investment: the transfer of Bitcoin from a large number of wallets to a relatively small number of wallets represents the supply of liquidity at an exchange.
Right extreme chainlets are the subset \({\mathcal {C}}^\mathrm{r}:=\{{\mathbb {C}}_{i \rightarrow j} \mid i\in \{1,\ldots ,N-1\},\,j=N\}\) highlighted in the far right column in the figure. They represent the sale of a large sum of Bitcoins across the market: the seller divides the balance and sends it to potentially hundreds of Bitcoin addresses.
We denote the amount of Satoshis (one BTC is \(10^{8}\) Satoshis) transferred between dates \(t-1\) and \(t\) by left and right extreme chainlets as \(A^\mathrm{l}_{t}\) and \(A^\mathrm{r}_{t}\), and the total occurrences as \(O^\mathrm{l}_{t}\) and \(O^\mathrm{r}_{t}\), respectively. In Fig. 1, the three blocks and their occurrence and amount matrices are given. As all transactions have two or fewer inputs/outputs, we use \(N=2\) in this toy example. The sum of occurrence values equals the number of transactions (e.g., 2 in Block 1 and 1 in Block 3), whereas the sum of amount matrix cells gives the total transaction volume (e.g., 6.8 BTC in Block 2).
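The construction of the occurrence and amount matrices, and of the four extreme chainlet series, can be sketched as follows. The function names and the `(n_inputs, n_outputs, amount)` tuple layout are illustrative assumptions, and the toy transactions are made up for the example (they are not the values shown in the figure). Input and output counts at or above the threshold N are folded into the last row or column, as described in the text.

```python
import numpy as np


def chainlet_matrices(transactions, N=20):
    """Build N x N occurrence and amount matrices for one time period.

    transactions: iterable of (n_inputs, n_outputs, amount_in_satoshis).
    Counts >= N are stored in the last row/column (index N - 1).
    """
    occ = np.zeros((N, N), dtype=np.int64)
    amt = np.zeros((N, N), dtype=np.int64)
    for n_in, n_out, amount in transactions:
        i = min(n_in, N) - 1
        j = min(n_out, N) - 1
        occ[i, j] += 1
        amt[i, j] += amount
    return occ, amt


def extreme_activity(occ, amt):
    """Occurrences and amounts of left and right extreme chainlets."""
    N = occ.shape[0]
    return {
        # Left: i = N, j in {1, ..., N}  -> the whole last row.
        "O_l": int(occ[N - 1, :].sum()),
        "A_l": int(amt[N - 1, :].sum()),
        # Right: i in {1, ..., N-1}, j = N -> last column without the corner.
        "O_r": int(occ[: N - 1, N - 1].sum()),
        "A_r": int(amt[: N - 1, N - 1].sum()),
    }


# Hypothetical toy period with N = 2.
toy = [(1, 2, 3), (2, 1, 5), (3, 3, 7), (1, 1, 2)]
occ, amt = chainlet_matrices(toy, N=2)
print(occ)
print(extreme_activity(occ, amt))
```

Note that the corner cell \({\mathbb {C}}_{N \rightarrow N}\) is counted as a left extreme chainlet only, which follows the set definitions above.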
Discussion The economic rationale for using graph analysis for risk management and price forecasting is its ability to directly capture supply and demand dynamics. People negotiate trades to buy or sell goods, services, and fiat currencies in exchange for cryptocurrency and pay for these trades by transferring cryptocurrency between virtual wallets. Market participants, and particularly investors, move cryptocurrency between virtual wallets to buy, sell, and hold cryptocurrency as they attempt to generate capital gains or reduce exposure to cryptocurrencies. Transfer of Bitcoins from a large number of wallets to a relatively small number of wallets represents the supply of liquidity at an exchange. For example, Bitcoin holders seeking to reduce exposure would convert their Bitcoin to fiat currency. Conversely, Bitcoin transfer from a small number to a large number of wallets represents demand: the sale of Bitcoin to a large number of Bitcoin investors and hence increased exposure. The magnitude and direction of Bitcoin movement is a measure of price sentiment in Bitcoin, and the frequency of large transactions is an additional source of price volatility.
3 Intraday forecasting Bitcoin prices and volatility
The extent to which we can build predictive models from the chainlets has already led to some promising results (see Akcora et al. 2018a for specification of the types and groups of chainlets that exhibit predictive influence on Bitcoin price and volatility).
Data There are numerous sources of historical Bitcoin prices which are collected by various exchanges. In this work, we used the Bitcoin USD price information that is sourced from the Blockchain exchange Coinbase.com over the period from February 2015 to December 2018 (1,499,040 observations) and the corresponding chainlet occurrence and amount matrices.^{1} This price is estimated as an average over multiple exchanges worldwide.
On average, the Bitcoin network attempts to generate a block every 10 min, and the mining difficulty is adjusted to achieve this goal. However, in practice, there is a high degree of uncertainty in the time taken by the miner to return a valid blockhash; a new block may be found after only 2 s or as long as 20 min. Bitcoin limits the number of transactions in a block by limiting block size to 1 MB. Between 2015 and 2018, a Bitcoin block had 1432 transactions on average, with a minimum of 1 and a maximum of 12,239. Figure 3 shows how these chainlets are distributed for various inputs and outputs.
Every 2016 blocks, Bitcoin computes the time that it actually took to mine these blocks. If it took less than 14 days, mining is deemed too easy and the difficulty is increased. If it took more than 14 days, the difficulty is decreased. In practice, the difficulty decreases only rarely.
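The retargeting rule can be illustrated with a simplified sketch. The proportional update and the bound of a factor of 4 per adjustment are properties of the Bitcoin protocol that are not stated in the text; the real client operates on the compact proof-of-work target rather than on a floating-point difficulty, so the function below is an approximation for intuition only.

```python
TARGET_BLOCK_SECONDS = 10 * 60          # one block every 10 min on average
RETARGET_INTERVAL = 2016                # blocks between adjustments
EXPECTED_SECONDS = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # 14 days


def retarget_difficulty(old_difficulty: float, actual_seconds: float) -> float:
    """Simplified difficulty update after a 2016-block window.

    Faster-than-expected mining raises the difficulty, slower mining lowers it.
    The measured time is clamped so that one adjustment changes the difficulty
    by at most a factor of 4 in either direction.
    """
    clamped = min(max(actual_seconds, EXPECTED_SECONDS / 4), EXPECTED_SECONDS * 4)
    return old_difficulty * EXPECTED_SECONDS / clamped


DAY = 24 * 60 * 60
print(retarget_difficulty(100.0, 14 * DAY))  # unchanged
print(retarget_difficulty(100.0, 7 * DAY))   # blocks came twice as fast
print(retarget_difficulty(100.0, 28 * DAY))  # blocks came half as fast
```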
3.1 Risk and time series modeling
We characterize the uncertainty of a ‘loss’ and, in particular, estimate the probability of extreme losses occurring over a future horizon. The loss is defined as the negative of the discrete returns, \(L_{t}=-r_{t}\), where \(r_t:={(P_{t+1} - P_t)}/{P_{t}}\) and \(P_{t}\) is the Bitcoin price at time t.
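As a small worked example of this definition, the sketch below computes the loss series from a price series and standardizes it to zero mean and unit standard deviation. The function names and the three prices are illustrative assumptions.

```python
import numpy as np


def losses(prices):
    """L_t = -r_t with r_t = (P_{t+1} - P_t) / P_t."""
    p = np.asarray(prices, dtype=float)
    returns = (p[1:] - p[:-1]) / p[:-1]
    return -returns


def standardize(x):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


prices = [100.0, 110.0, 99.0]   # hypothetical bar closes
L = losses(prices)
print(L)                        # a 10% gain is a loss of -0.1, a 10% drop a loss of 0.1
print(standardize(L))
```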
Table 1 This table compares the empirical densities of the standardized daily losses conditioned on the lower and upper \(\alpha =0.05\) percentiles of extreme chainlet activity by amount (\(A^{x}\)) and occurrences (\(O^{x}\))
Interval (min)  Density  Mean  Std. dev.  Skewness  Kurtosis
15  \(\phi (L_t)\)  0  1  9.154  1056.313
15  \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\)  -0.013  0.907  -14.93  918.836
15  \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\)  0.008  1.181  9.706  839.024
15  \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\)  -0.01  0.931  -11.531  764.131
15  \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\)  0.041  1.641  22.525  1034.844
30  \(\phi (L_t)\)  0  1  7.181  498.262
30  \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\)  -0.039  0.971  -15.781  519.805
30  \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\)  0.018  1.124  13.496  526.045
30  \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\)  -0.014  0.787  -6.176  209.604
30  \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\)  0.028  1.127  14.669  528.621
60  \(\phi (L_t)\)  0  1  4.723  227.521
60  \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\)  -0.034  0.761  -6.031  107.801
60  \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\)  0.029  1.362  11.197  255.048
60  \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\)  -0.024  0.77  -4.014  99.398
60  \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\)  -0.006  1.182  11.231  297.34
120  \(\phi (L_t)\)  0  1  4.615  228.85
120  \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\)  -0.025  0.749  -5.148  112.134
120  \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\)  0.027  1.37  11.155  253.848
120  \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\)  -0.022  0.781  -3.575  98.738
120  \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\)  -0.019  0.906  3.781  148.533
240  \(\phi (L_t)\)  0  1  4.642  229.758
240  \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\)  -0.021  0.802  -2.709  106.828
240  \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\)  0.025  1.712  11.622  351.914
240  \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\)  -0.023  0.768  -3.164  98.804
240  \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\)  -0.016  0.957  4.071  130.399
Daily  \(\phi (L_t)\)  0  1  1.38  11.512
Daily  \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\)  -0.102  0.712  -0.693  5.881
Daily  \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\)  0.186  1.356  1.309  6.623
Daily  \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\)  -0.026  0.919  2.056  13.222
Daily  \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\)  -0.119  1.047  -1.072  5.619
Conditional loss distributions Table 1 shows the unconditional loss densities, \(\phi (L_t)\) (black), and the conditional densities of the standardized Bitcoin losses over the considered time period. At each time interval, the skewness and kurtosis of the conditional loss distributions are observed to differ significantly from the unconditional loss distribution. Since we seek early-warning indicators for extreme losses, we highlight (i.e., use a bold font) each result where the conditional loss distribution exhibits larger skewness and kurtosis.
This accentuated right skewness, combined with the higher kurtosis, indicates more extreme losses following abnormal extreme chainlet activity in the previous time period. The trend across most of the time scales is that large extreme chainlet amounts are followed by larger losses. The role of occurrences is also important—at time intervals of 30 and 60 min, high occurrences are followed by larger losses.
Note that at timescales of 15 min, conditioning on the extreme chainlets does not result in a more right-skewed or fatter-tailed distribution. Additional results, not shown in Table 1, show similar effects at shorter intervals.
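The conditioning used for Table 1 can be sketched as below. This is an illustration on synthetic data, not a reproduction of the reported figures: the function names are hypothetical, kurtosis is computed as the raw (non-excess) fourth standardized moment, which is an assumption about the convention used in the table, and the two input arrays are assumed to be aligned so that `loss[t]` is the loss following `activity[t]`.

```python
import numpy as np


def moments(x):
    """Mean, standard deviation, skewness and (non-excess) kurtosis of a sample."""
    x = np.asarray(x, dtype=float)
    m, s = x.mean(), x.std()
    z = (x - m) / s
    return {"mean": m, "std": s,
            "skewness": (z ** 3).mean(), "kurtosis": (z ** 4).mean()}


def conditional_moments(loss, activity, alpha=0.05):
    """Moments of standardized losses, overall and conditioned on the lower and
    upper alpha-quantiles of extreme chainlet activity."""
    loss = np.asarray(loss, dtype=float)
    activity = np.asarray(activity, dtype=float)
    z = (loss - loss.mean()) / loss.std()
    lo, hi = np.quantile(activity, [alpha, 1 - alpha])
    return {
        "unconditional": moments(z),
        "low": moments(z[activity < lo]),
        "high": moments(z[activity > hi]),
    }


# Synthetic data: losses are more dispersed when activity is in its top 5%.
rng = np.random.default_rng(0)
activity = rng.lognormal(size=5000)
scale = 1.0 + 0.5 * (activity > np.quantile(activity, 0.95))
loss = rng.standard_normal(5000) * scale

result = conditional_moments(loss, activity)
for name, stats in result.items():
    print(name, {k: round(float(v), 3) for k, v in stats.items()})
```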
3.2 GARCH
Generalized autoregressive conditional heteroskedasticity (GARCH) models are popular in the financial econometric literature for their capacity to model volatility with empirically supported properties.
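For orientation, one common way to write an exponential GARCH model with external regressors in the conditional variance is given below. This is a sketch under assumptions rather than the exact specification fitted here: it follows the general form of Nelson (1991), the symbols \(\omega\), \(\alpha_i\), \(\gamma_i\), \(\beta_j\) and \(\zeta\) are introduced only for illustration, \(\mu_t\) stands for the ARMA conditional mean, and the timing of the chainlet regressor \(x_t\) relative to the forecast period is assumed.

```latex
r_t = \mu_t + \varepsilon_t, \qquad
\varepsilon_t = \sigma_t z_t, \qquad
z_t \sim \text{i.i.d.}(0, 1),

\log \sigma_t^2 = \omega
  + \sum_{i=1}^{q} \left[ \alpha_i z_{t-i}
  + \gamma_i \left( |z_{t-i}| - \mathbb{E}|z_{t-i}| \right) \right]
  + \sum_{j=1}^{p} \beta_j \log \sigma_{t-j}^2
  + \zeta^{\top} x_t
```

Because the equation is written for \(\log \sigma_t^2\), the variance stays positive for any sign of the coefficients, and the \(\alpha_i\) terms allow negative and positive shocks to affect volatility asymmetrically.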
Model selection and diagnostics Additional diagnostics, provided in the Appendix, show that the returns time series are always stationary, irrespective of timescale (see Table 7). All ARMA models test positive for an ARCH effect in the residual. Table 6 shows that all models pass a Box–Ljung and Lagrange multiplier test at the 99% confidence level for the residuals and square of the residuals. Table 2 shows the specification of each optimal model, as determined by the AIC, for each timescale. Note that the optimal orders of p and q are determined separately for the mean equation and GARCH model using the returns and squared residuals, respectively.
Table 2 This table shows the fitted ARMA–eGARCH(X) models together with the training set size for each time interval
Interval (mins)  ARMA  eGARCH  # observations
15  (4, 5)  (3, 5)  99,795 
30  (5, 0)  (2, 2)  49,923 
60  (4, 3)  (1, 0)  24,967 
120  (5, 5)  (1, 0)  12,483 
240  (4, 4)  (2, 5)  5985 
Daily  (5, 4)  (4, 4)  1040 
Table 3 This table compares the VaR backtesting performance with and without the chainlet regressors
Interval (mins)  Backtest length  Expected breaches  Actual VaR breaches (w/o chainlets)  Actual VaR breaches (with chainlets)
15  99,545  995.5  1023  1014 
30  49,673  496.7  512  501 
60  24,717  247.2  261  249 
120  12,233  122.3  141  126 
240  5735  57.4  74  59 
Daily  790  7.9  19  12 
Table 4 This table compares Kupiec’s unconditional coverage test results for the VaR breaches, with and without the chainlet regressors. Unconditional coverage null hypothesis (Kupiec): correct breaches
Interval (mins)  Quantity  W/o chainlets  With chainlets
15  LR.uc statistic  11.306  7.291
15  LR.uc critical  3.841  3.841
15  LR.uc p value  0.001  0.005
15  Reject null  Yes  Yes
30  LR.uc statistic  8.060  1.854
30  LR.uc critical  3.841  3.841
30  LR.uc p value  0.003  0.210
30  Reject null  Yes  No
60  LR.uc statistic  4.671  1.755
60  LR.uc critical  3.841  3.841
60  LR.uc p value  0.035  0.223
60  Reject null  Yes  No
120  LR.uc statistic  3.771  1.317
120  LR.uc critical  3.841  3.841
120  LR.uc p value  0.056  0.259
120  Reject null  No  No
240  LR.uc statistic  28.515  1.516
240  LR.uc critical  3.841  3.841
240  LR.uc p value  0  0.242
240  Reject null  Yes  No
Daily  LR.uc statistic  11.306  1.955
Daily  LR.uc critical  3.841  3.841
Daily  LR.uc p value  0.001  0.195
Daily  Reject null  Yes  No
Table 5 This table compares Christoffersen’s conditional coverage test results for the VaR breaches, with and without the chainlet regressors. Conditional coverage null hypothesis (Christoffersen): correct breaches and independence of failures
Interval (mins)  Quantity  W/o chainlets  With chainlets
15  LR.cc statistic  9.124  6.899
15  LR.cc critical  5.991  5.991
15  LR.cc p value  0.007  0.029
15  Reject null  Yes  Yes
30  LR.cc statistic  14.025  2.105
30  LR.cc critical  5.991  5.991
30  LR.cc p value  0.001  0.157
30  Reject null  Yes  No
60  LR.cc statistic  6.982  1.592
60  LR.cc critical  5.991  5.991
60  LR.cc p value  0.03  0.213
60  Reject null  Yes  No
120  LR.cc statistic  9.773  2.547
120  LR.cc critical  5.991  5.991
120  LR.cc p value  0.008  0.141
120  Reject null  Yes  No
240  LR.cc statistic  32.487  1.855
240  LR.cc critical  5.991  5.991
240  LR.cc p value  0  0.173
240  Reject null  Yes  No
Daily  LR.cc statistic  14.389  2.738
Daily  LR.cc critical  5.991  5.991
Daily  LR.cc p value  0.001  0.154
Daily  Reject null  Yes  No
Table 4 compares the results of Kupiec’s unconditional coverage test applied to the ARMA–eGARCH and ARMA–eGARCHX models. This test assesses whether the observed number of VaR breaches is consistent with the number expected given the tail probability of the VaR. The results show that at almost all time intervals, the ARMA–eGARCH model fails the backtest (rejection of \(H_0\) at the 95% confidence level), whereas the ARMA–eGARCHX model always passes the backtest for timescales of 30 min or more.
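The unconditional coverage test can be sketched with the standard library alone. The function below implements the usual Kupiec likelihood ratio, which is asymptotically chi-squared with one degree of freedom (5% critical value 3.841); the function name and the example counts are hypothetical and are not taken from the tables.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_uc(n_obs, n_breaches, tail_prob=0.01):
    """Kupiec unconditional coverage test.

    Returns (LR statistic, p value). H0: the breach probability equals tail_prob.
    """
    T, x, p = n_obs, n_breaches, tail_prob
    pi_hat = x / T                                   # observed breach frequency
    log_null = _xlogy(T - x, 1 - p) + _xlogy(x, p)
    log_alt = _xlogy(T - x, 1 - pi_hat) + _xlogy(x, pi_hat)
    lr = -2.0 * (log_null - log_alt)
    # Survival function of a chi-squared(1) variable: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value


# Hypothetical backtests of a 99% VaR over 1000 periods.
lr_ok, p_ok = kupiec_uc(1000, 10)     # breaches match the expected 1%
lr_bad, p_bad = kupiec_uc(1000, 25)   # far too many breaches
print(lr_ok, p_ok)
print(lr_bad, p_bad, "reject H0" if lr_bad > 3.841 else "do not reject H0")
```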
Finally, Table 5 shows the results of Christoffersen’s conditional coverage test. This is a joint test of the unconditional coverage and the independence of the breaches. Both the joint and the separate unconditional tests are reported, since it is always possible that the joint test passes while either the independence or the unconditional coverage test fails. The results again show that the ARMA–eGARCH model always fails the backtest (rejection of \(H_0\) at the 95% confidence level), whereas the ARMA–eGARCHX model always passes the backtest at 30 min or more.
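A minimal sketch of the joint test is given below, assuming the common decomposition in which the conditional coverage statistic is the sum of the unconditional coverage statistic and a first-order Markov independence statistic, compared against a chi-squared distribution with two degrees of freedom (5% critical value 5.991). The function name and the two synthetic breach sequences are illustrative assumptions.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def christoffersen_cc(breaches, tail_prob=0.01):
    """Return (LR_uc, LR_ind, LR_cc, p value of LR_cc) for a 0/1 breach series."""
    b = [int(bool(v)) for v in breaches]
    T, x = len(b), sum(b)

    # Unconditional coverage part (Kupiec).
    pi_hat = x / T
    lr_uc = -2.0 * (_xlogy(T - x, 1 - tail_prob) + _xlogy(x, tail_prob)
                    - _xlogy(T - x, 1 - pi_hat) - _xlogy(x, pi_hat))

    # Independence part: count transitions n[previous][current].
    n = [[0, 0], [0, 0]]
    for prev, cur in zip(b[:-1], b[1:]):
        n[prev][cur] += 1
    n00, n01, n10, n11 = n[0][0], n[0][1], n[1][0], n[1][1]
    pi01 = n01 / (n00 + n01) if (n00 + n01) else 0.0
    pi11 = n11 / (n10 + n11) if (n10 + n11) else 0.0
    pi = (n01 + n11) / (T - 1)
    log_indep = _xlogy(n00 + n10, 1 - pi) + _xlogy(n01 + n11, pi)
    log_markov = (_xlogy(n00, 1 - pi01) + _xlogy(n01, pi01)
                  + _xlogy(n10, 1 - pi11) + _xlogy(n11, pi11))
    lr_ind = -2.0 * (log_indep - log_markov)

    lr_cc = lr_uc + lr_ind
    p_value = math.exp(-max(lr_cc, 0.0) / 2.0)   # chi-squared(2) survival function
    return lr_uc, lr_ind, lr_cc, p_value


# Ten evenly spaced breaches in 1000 periods versus fifty consecutive breaches.
spread = [1 if t % 100 == 50 else 0 for t in range(1000)]
clustered = [0] * 950 + [1] * 50
print(christoffersen_cc(spread))
print(christoffersen_cc(clustered))
```

The evenly spaced series has the correct breach frequency and no clustering, so the joint statistic stays below 5.991; the clustered series fails on both coverage and independence.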
In additional results under a quadratic loss function, not shown here, we reject \(H_0\) in the Diebold–Mariano test and conclude that the differences in the ARMA–eGARCH and ARMA–eGARCHX model residuals are always significant at the 95% level.
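For reference, a basic one-step-ahead Diebold–Mariano statistic under quadratic loss can be sketched as follows. This is a simplified version that assumes serially uncorrelated loss differentials and a normal limiting distribution; the function name and the synthetic forecast errors are assumptions for illustration, not the series used in the study.

```python
import math

import numpy as np


def diebold_mariano(errors_a, errors_b):
    """One-step-ahead Diebold-Mariano test under quadratic loss.

    H0: both models have equal expected squared forecast error.
    Returns (DM statistic, two-sided p value from the normal distribution).
    """
    d = np.asarray(errors_a, dtype=float) ** 2 - np.asarray(errors_b, dtype=float) ** 2
    T = d.size
    dm = d.mean() / math.sqrt(d.var(ddof=1) / T)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))
    return dm, p_value


# Synthetic forecast errors: model A is noisier than model B.
rng = np.random.default_rng(1)
errors_a = 1.3 * rng.standard_normal(2000)
errors_b = rng.standard_normal(2000)

dm, p = diebold_mariano(errors_a, errors_b)
print(dm, p, "reject H0" if p < 0.05 else "do not reject H0")
```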
4 Summary and outlook
In this article, we model the Blockchain transaction history of Bitcoin with high-fidelity graphs. Extreme chainlet activity, characterized by transaction amounts and occurrences, is shown empirically to result in significant changes in the intraday volatility. With the inclusion of these chainlet activities as external regressors in the conditional variance equation, we show a significant improvement in the GARCH model for predicting next period extreme losses at scales of 15, 30, 60, 120 and 240 min, and also for daily losses. Across all timescales at 30-min resolution or larger, the inclusion of extreme chainlet regressors results in a reduction of between 10% and 90% in the number of false next period 99% VaR breaches or underbreaches over an approximately 2-year backtesting horizon.
Our experiments show that extreme chainlets are well suited as a tool for risk-averse Bitcoin investors in anticipation of large market exposures. Broadly, extreme chainlets provide a more granular representation of the market than Bitcoin price information, enabling investors to make more informed portfolio allocation and hedging decisions. The ability to link extreme chainlets to price movement also supports their usage for speculation. For example, Bitcoin ‘whale’ activity might be linked to the extreme chainlets to trace the impact of whale wallets on prices and risk.
In future research, we seek to characterize the temporal evolution of volatility predictability (see Antulov-Fantulin et al. 2019 for details) using extreme chainlets. Such a study would provide further insight into the reliability of extreme chainlets as a short-term predictive indicator of risk. Furthermore, there is scope for building on the study by Akcora et al. (2018a), who find that combining transactional volume with extreme chainlets can yield stronger predictive performance over longer time horizons.
Additionally, a future direction for research would be to link the extreme chainlets with relevant news events, SEC announcements, social media accounts and other relevant macroeconomic data. Aggregating data across multiple sources, we shall evaluate the extent to which the extreme chainlets support geolocation-sensitive querying of events. For example, one approach could be to develop a time-based anomaly index for Bitcoin which measures geo-sensitive sentiment through abnormal extreme chainlet patterns.
Footnotes
1. Chainlet matrices are available from https://github.com/cakcora/CoinWorks.
Notes
Acknowledgements
The authors are grateful for useful comments by the reviewers. The work of Dixon is partially supported by NSF EEC 1840433 and Intel Corporation. The work of Gel is partially supported by NSF IIS 1633331, NSF DMS 1736368 and NSF ECCS 1824710. The work of Kantarcioglu was supported in part by NIH award 1R01HG006844, NSF awards CICI1547324, IIS1633331, CNS1837627, OAC1828467 and ARO award W911NF1710356.
References
Akcora, C. G., Gel, Y. R., & Kantarcioglu, M. (2017). Blockchain: A graph primer. arXiv:1708.08749.
Akcora, C. G., et al. (2018a). Forecasting Bitcoin price with graph chainlets. In The 22nd Pacific-Asia conference on knowledge discovery and data mining, PaKDD.
Akcora, C. G., et al. (2018b). Bitcoin risk modeling with blockchain graphs. Economics Letters, 173, 138–142.
Antulov-Fantulin, N., et al. (2019). Inferring short-term volatility indicators from the bitcoin blockchain. In L. M. Aiello, et al. (Eds.), Complex networks and their applications VII (pp. 508–520). Cham: Springer International Publishing. (ISBN: 978-3-030-05414-4).
Ardia, D., Bluteau, K., & Rüede, M. (2018). Regime changes in bitcoin GARCH volatility dynamics. Finance Research Letters, 29, 266–271. https://doi.org/10.1016/j.frl.2018.08.009. (ISSN: 1544-6123).
Borovkova, S. A., & Mahakena, D. (2015). News, volatility and jumps: The case of natural gas futures. Quantitative Finance, 15(7), 1217–1242. https://doi.org/10.1080/14697688.2014.986513. (ISSN: 1469-7688).
Caporale, G. M., Gil-Alana, L., & Plastun, A. (2018). Persistence in the cryptocurrency market. Research in International Business and Finance, 46, 141–148. https://doi.org/10.1016/j.ribaf.2018.01.002.
Cermak, V. (2017). Can bitcoin become a viable alternative to fiat currencies? An empirical analysis of bitcoin’s volatility based on a GARCH model (pp. 1–53).
Chu, J., et al. (2017). GARCH modelling of cryptocurrencies. Journal of Risk and Financial Management, 10, 17. https://doi.org/10.3390/jrfm10040017. http://www.mdpi.com/1911-8074/10/4/17 (ISSN: 1911-8074).
Corbet, S., et al. (2017). Exploring the dynamic relationships between cryptocurrencies and other financial assets. Economics Letters, 165, 28–34.
Dyhrberg, A. H. (2016a). Bitcoin, gold and the dollar – A GARCH volatility analysis. Finance Research Letters, 16, 85–92.
Dyhrberg, A. H. (2016b). Bitcoin, gold and the dollar – A GARCH volatility analysis. Finance Research Letters, 16, 85–92. https://doi.org/10.1016/j.frl.2015.10.008. (ISSN: 1544-6123).
Gomber, P., Koch, J.-A., & Siering, M. (2017). Digital Finance and FinTech: Current research and future research directions. Journal of Business Economics, 87(5), 537–580.
Greaves, A., & Au, B. (2015). Using the bitcoin transaction graph to predict the price of bitcoin. No data.
Guo, T., & Antulov-Fantulin, N. (2018). An experimental study of bitcoin fluctuation using machine learning methods. arXiv:1802.04065 [stat.ML].
Kondor, D., et al. (2014). Inferring the interplay between network structure and market effects in bitcoin. New Journal of Physics, 16(12), 125003.
Madan, S., Saluja, I., & Zhao, A. (2015). Automated bitcoin trading via machine learning algorithms. Technical report, Department of Computer Science, Stanford University.
Maxwell, G. (2013). CoinJoin: Bitcoin privacy for the real world. In Post on bitcoin Forum. https://bitcointalk.org/index.php?topic=279249.0.
Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf.
Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59(2), 347–370. http://www.jstor.org/stable/2938260 (ISSN: 0012-9682, 1468-0262).
Shah, D., & Zhang, K. (2014). Bayesian regression and bitcoin. In Communication, control, and computing (Allerton), 2014 52nd annual Allerton conference. IEEE (pp. 409–414).
Shephard, N., & Sheppard, K. (2010). Realising the future: Forecasting with high-frequency-based volatility (HEAVY) models. Journal of Applied Econometrics, 25(2), 197–231. https://doi.org/10.1002/jae.1158.
Sorgente, M., & Cibils, C. (2014). The reaction of a network: Exploring the relationship between the bitcoin network structure and the bitcoin price. Technical report, Department of Computer Science, Stanford University.
Sovbetov, Y. (2018). Factors influencing cryptocurrency prices: Evidence from bitcoin, Ethereum, Dash, Litcoin, and Monero. Journal of Economics and Financial Analysis, 2(2), 1–27.
Steinbach, M., Karypis, G., & Kumar, V., et al. (2000). A comparison of document clustering techniques. In KDD workshop on text mining, Boston (Vol. 400, no. 1, pp. 525–526).
Tasca, P., Hayes, A., & Liu, S. (2018). The evolution of the bitcoin economy: Extracting and analyzing the network of payment relationships. The Journal of Risk Finance, 19(2), 94–126.
Tschorsch, F., & Scheuermann, B. (2016). Bitcoin and beyond: A technical survey on decentralized digital currencies. IEEE Communications Surveys & Tutorials, 18(3), 2084–2123.
Yang, S. Y., & Kim, J. (2015). Bitcoin market return and volatility forecasting using transaction network flow properties. In IEEE SSCI (pp. 1778–1785).