Introduction

Since the creation of Bitcoin in 20081, a wide variety of cryptoassets have been developed and are now considered to be at the cutting edge of innovation in finance2. These digital financial assets are vastly diverse in design characteristics and intended purposes, ranging from peer-to-peer networks with underlying cash-like digital currencies (e.g. Bitcoin) to general-purpose blockchains transacting in commodity-like digital assets (e.g. Ethereum), and even to cryptoassets that intend to replicate the price of conventional assets such as the US dollar or gold (e.g. Tether and Tether Gold)3,4. With more than nine thousand cryptoassets as of 20225, the total market value of cryptocurrencies has grown to a peak of $2 trillion in 20216. Despite long-standing debates over the intrinsic value and legality of cryptoassets7, or perhaps even precisely due to such controversies, it is undeniable that cryptocurrencies are increasingly attracting the attention of academics, investors, and central banks around the world8,9.

Moreover, these digital assets have been at the forefront of sizable financial gains and losses in recent years10,11 and have been recognized as the main drivers of the brand-new phenomena of cryptoart and NFTs12,13, but also as facilitators of illegal activities, such as money laundering and dark trade14,15,16. Financial research dedicated to cryptoassets, on the other hand, has been mostly concerned with the extension of fairly traditional analyses17, including market efficiency18,19,20,21,22,23, distribution of price returns24,25,26, and volatility27,28. Researchers have also probed the hedging and safe haven capabilities of cryptoassets when combined with a portfolio of stocks29,30, their behavior in the scenario of generalized market turmoil caused by the COVID-19 pandemic30,31, and the formation of price bubbles32,33.

Among these subjects, the distribution of price returns, especially of large price variations, is considered fundamental for evaluating this new market’s intrinsic risks and modeling its dynamics34,35. Earlier analyses have consistently found price returns to follow heavy-tail distributions. Chu et al.36 have adjusted a large number of probability distributions to the log-returns of daily prices of Bitcoin from 2011 to 2014, finding the generalized hyperbolic distribution (a heavy-tailed distribution) to be the best description of the data. Using daily prices of eight cryptoassets (Bitcoin, Dash, Ethereum, Litecoin, NEM, Stellar, Monero, and Ripple) and the Jarque-Bera test, Zhang et al.28 have rejected the normality of their log-returns. Similarly, Osterrieder et al.25 have also found the normal distribution to be incompatible with price returns of six cryptocurrencies (Bitcoin, Dash, Litecoin, MaidSafeCoin, Monero, and Ripple) over a three-year period (2014–2016). Feng et al.29 have fitted a generalized Pareto distribution to two years of daily price returns of seven cryptocurrencies (Bitcoin, Dash, Ethereum, Litecoin, Monero, NEM, and Ripple) and observed an asymmetry between the left and right tails. Finally, using high-resolution data obtained from exchanges and referring to different semesters between 2010 and 2018, Begušić and coworkers26 have found power laws to be plausible fits to the empirical distributions of large price variations of Bitcoin. This latter approach belongs to the field of econophysics37, and has established intriguing regularities in the distributions of log-returns of traditional financial assets such as a power-law distribution [\(p(r) \sim r^{-\alpha }\)] with typical exponents \(\alpha \sim 4\)37,38,39.

The above short review of pertinent past research shows that, while the return distributions of cryptoassets have attracted considerable interest, most previous works have investigated these distributions using data spanning only a few years of price history from small sets of cryptocurrencies (usually Bitcoin and a handful of the biggest cryptocurrencies by market capitalization). Moreover, past research has not established whether return distributions change over time and whether they are dependent on market capitalization. The main goal of this work is therefore to fill these gaps by presenting a dynamic analysis of the return distributions of more than seven thousand cryptocurrencies.

Our results show that the vast majority of cryptocurrencies have return distributions with tails well described by power-law functions over their entire history. The typical values of the power-law exponents characterizing these distributions are smaller than those observed in traditional assets, showing that cryptoassets are more susceptible to large price variations, with about half of them not presenting a characteristic scale for price returns. Moreover, these tail exponents reveal an asymmetry in large price movements often characterized by smaller exponents for positive returns; that is, large positive price variations are expected to occur more frequently than negative ones in most cryptoassets, but this asymmetry is minimal for a few classes of cryptoassets such as stablecoins. Our research further demonstrates that changes in the tail exponents are often associated with the age of cryptocurrencies and their market capitalization, or only with age, with only a minority of cryptoassets affected by market capitalization alone or entirely unaffected by these two quantities. For digital assets affected by age or market capitalization, we find power-law exponents to have mixed directions, with about 28% of all cryptocurrencies, and 37% of the current top 200 cryptocurrencies, becoming less likely to exhibit large price variations as they age and grow in market capitalization. This in turn indicates that large price variations are expected to become less likely only for a small part of the cryptocurrency market.

Results

Our results are based on daily price time series of 7111 cryptocurrencies that comprise a significant part of all currently available cryptoassets (see “Methods” for details). From these price series, we have estimated their logarithmic returns

$$\begin{aligned} r_t = \ln (x_{t+1} / x_t), \end{aligned}$$
(1)

where \(x_t\) represents the price of a given cryptocurrency at day t. All return time series in our analysis have at least 200 observations (see Supplementary Fig. S1 for the length distribution). Figure 1a illustrates Bitcoin’s series of daily returns. To investigate whether and how returns have changed over the aging and growing processes of all cryptocurrencies, we sample all time series of log-returns using a time window that expands in weekly steps (seven time series observations), starting from the hundredth observation to the latest return observation. In each step, we separate the positive from the negative return values and estimate their power-law behavior using the Clauset-Shalizi-Newman method40. Figure 1a further illustrates this procedure, where the vertical dashed line represents a given position of the time window (\(t=2004\) days), the blue and red lines indicate positive and negative returns, respectively, and the gray lines show the return observations that will be included in the expanding time window in future steps. Moreover, Fig. 1b shows the corresponding survival functions (or complementary cumulative distributions) for the positive (blue) and negative (red) returns of Bitcoin within the time window highlighted in Fig. 1a. These survival functions correspond to return values above the lower bound of the power-law regime (\(r_{\text {min}}\)) and dashed lines in Fig. 1b show the power-law functions adjusted to data, that is,

$$\begin{aligned} p(r) \sim r^{-\alpha }\quad (\text {for }r>r_{\text {min}})\,, \end{aligned}$$
(2)

with \(\alpha =4.5\) for the positive returns and \(\alpha =3.0\) for the negative returns in this particular position of the time window (\(t=2004\) days).
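The sampling procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions, not the implementation used in this work: returns follow the usual convention \(r_t = \ln (x_{t+1}/x_t)\), the lower bound `r_min` is supplied by hand (the Clauset-Shalizi-Newman method also selects it by minimizing the Kolmogorov-Smirnov distance), and the function names `log_returns`, `tail_exponent`, and `expanding_exponents` are hypothetical.

```python
import math

def log_returns(prices):
    """Daily log-returns r_t = ln(x_{t+1} / x_t) from a list of positive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def tail_exponent(values, r_min):
    """Maximum-likelihood estimate of alpha for p(r) ~ r^(-alpha), r >= r_min.

    Assumption: r_min is given by the caller instead of being estimated.
    Returns None when the tail is too small to fit.
    """
    tail = [v for v in values if v >= r_min]
    if len(tail) < 2:
        return None
    log_sum = sum(math.log(v / r_min) for v in tail)
    if log_sum == 0.0:
        return None
    return 1.0 + len(tail) / log_sum

def expanding_exponents(returns, r_min, start=100, step=7):
    """Exponents of the positive and negative tails for each window position.

    The window always starts at the first observation and grows in weekly
    steps (seven observations), beginning at the hundredth observation.
    """
    series = []
    for t in range(start, len(returns) + 1, step):
        window = returns[:t]
        positive = [r for r in window if r > 0]
        negative = [-r for r in window if r < 0]  # magnitudes of the losses
        series.append((t, tail_exponent(positive, r_min),
                       tail_exponent(negative, r_min)))
    return series
```

Calling `expanding_exponents(log_returns(prices), r_min=0.05)` on a daily price list returns one `(t, alpha_plus, alpha_minus)` triple per window position; the value `0.05` is an arbitrary placeholder rather than an estimate from the data.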

Figure 1

Illustration of the approach used to probe patterns in price returns of digital currencies. (a) Bitcoin’s time series of daily returns (\(r_t\)) between 29 April 2013 (\(t=1\)) and 25 July 2022 (\(t=3375\)). The black horizontal arrow represents a given position of the expanding time window (at \(t=2004\) days) used to sample the return series over the entire history of Bitcoin. This time window expands in weekly steps (seven time series observations), and for each position, we separate the positive (blue) from the negative (red) price returns. The gray line illustrates observations that will be included in future positions of the expanding time window (\(t>2004\)). (b) Survival functions or the complementary cumulative distributions of positive (blue) and negative (red) price returns within the expanding time window for \(t=2004\) days and above the lower bound of the power-law regime estimated from the Clauset-Shalizi-Newman method40. The dashed lines show the adjusted power-law functions, \(p(r)\sim r^{-\alpha }\), with \(\alpha =4.5\) for positive returns and \(\alpha =3.0\) for negative returns. (c) Time series of the power-law exponents \(\alpha _t\) for the positive (blue) and negative (red) return distributions obtained by expanding the time window from the hundredth observation (\(t=100\)) to the latest available price return of Bitcoin. The circular markers represent the values for the window position at \(t=2004\) days and the dashed lines indicate the median of the power-law exponents (\({\tilde{\alpha }}_{\,+}=4.50\) for positive returns and \({\tilde{\alpha }}_{\,-}=2.99\) for negative returns). (d) Time series of the p-values related to the power-law hypothesis of positive (blue) and negative (red) price returns for every position of the expanding time window. The dashed line indicates the threshold (\(p=0.1\)) above which the power-law hypothesis cannot be rejected. 
For Bitcoin, the power-law hypothesis is never rejected for positive returns (fraction of rejection \(f_r=0\)) and rejected in only 4% of the expanding time window positions for negative returns (fraction of rejection \(f_r=0.04\)).

We have further verified the goodness of the power-law fits using the approach proposed by Clauset et al.40 (see also Preis et al.41). As detailed in the “Methods” section, this approach consists in generating several synthetic samples under the power-law hypothesis, adjusting these simulated samples, and estimating the fraction of times the Kolmogorov-Smirnov distance between the adjusted power-law and the synthetic samples is larger than the value calculated from the empirical data. This fraction defines a p-value and allows us to reject or not the power-law hypothesis of the return distributions under a given confidence level. Following Refs.40,41, we consider the more conservative 90% confidence level (instead of the more lenient and commonly used 95% confidence level), rejecting the power-law hypothesis when \(p\text {-value}\le 0.1\). For the particular examples in Fig. 1b, the p-values are respectively 1.00 and 0.17 for the positive and negative returns, and thus we cannot reject the power-law hypotheses.
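The logic of this goodness-of-fit test can be illustrated with the sketch below. It is a simplified version written under stated assumptions: the lower bound `r_min` is held fixed, only tail observations are resampled, and the names `fit_alpha`, `ks_distance`, `sample_power_law`, and `power_law_p_value` are hypothetical. The full procedure of Clauset et al. re-estimates the lower bound for every synthetic sample and also resamples the observations below it.

```python
import math
import random

def fit_alpha(tail, r_min):
    """Maximum-likelihood exponent of a continuous power law above r_min."""
    return 1.0 + len(tail) / sum(math.log(v / r_min) for v in tail)

def ks_distance(tail, r_min, alpha):
    """Largest gap between the empirical CDF and 1 - (r / r_min)^(1 - alpha)."""
    xs = sorted(tail)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - (x / r_min) ** (1.0 - alpha)
        distance = max(distance, abs(model - i / n), abs(model - (i + 1) / n))
    return distance

def sample_power_law(n, r_min, alpha, rng):
    """Inverse-transform sampling from p(r) ~ r^(-alpha) for r >= r_min."""
    return [r_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def power_law_p_value(tail, r_min, n_sim=200, seed=0):
    """Fraction of synthetic samples fitting worse than the empirical data."""
    rng = random.Random(seed)
    alpha = fit_alpha(tail, r_min)
    d_empirical = ks_distance(tail, r_min, alpha)
    worse = 0
    for _ in range(n_sim):
        synthetic = sample_power_law(len(tail), r_min, alpha, rng)
        d_synthetic = ks_distance(synthetic, r_min, fit_alpha(synthetic, r_min))
        if d_synthetic > d_empirical:
            worse += 1
    return worse / n_sim
```

With the threshold adopted here, the power-law hypothesis would be rejected whenever `power_law_p_value(...)` returns a value of 0.1 or less.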

After sampling the entire price return series, we obtain time series for the power-law exponents (\(\alpha _t\)) associated with positive and negative returns as well as the corresponding p-values time series for each step t of the expanding time window. These time series allow us to reconstruct the aging process of the return distributions over the entire history of each cryptoasset and probe possible time-dependent patterns. Figure 1c and 1d show the power-law exponents and p-values time series for the case of Bitcoin. The power-law hypothesis is never rejected for positive returns and rarely rejected for negative returns (about 4% of times). Moreover, the power-law exponents exhibit large fluctuations at the beginning of the time series and become more stable as Bitcoin matures as a financial asset (a similar tendency as reported by Begušić et al.26). The time evolution of these exponents further shows that the asymmetry between positive and negative returns observed in Fig. 1b is not an incidental feature of a particular moment in Bitcoin’s history. Indeed, the power-law exponent for positive returns is almost always larger than the exponent for negative returns, implying that large negative price returns have been more likely to occur than their positive counterparts over nearly the entire history of Bitcoin covered by our data. However, while the difference between positive and negative exponents has approached a constant value, both exponents exhibit an increasing trend, indicating that large price variations are becoming less frequent with the coming-of-age of Bitcoin.

The previous analysis motivates us to ask whether the entire cryptocurrency market behaves similarly to Bitcoin and what other common patterns digital currencies tend to follow. To start answering this question, we have considered the p-values series of all cryptocurrencies to verify if the power-law hypothesis holds in general. Figure 2a shows the percentage of cryptoassets rejecting the power-law hypothesis in at most a given fraction of the weekly positions of the expanding time window (\(f_r\)). Remarkably, the hypothesis that large price movements (positive or negative) follow a power-law distribution is never rejected over the entire history of about 70% of all digital currencies in our dataset. This analysis also shows that only \(\approx 2\)% of cryptocurrencies reject the power-law hypothesis in more than half of the positions of the expanding time window (\(f_r\ge 0.5\)). For instance, considering a 10% threshold as a criterion (\(f_r \le 0.1\)), we find that about 85% of cryptocurrencies have return distributions adequately modeled by power laws. Increasing this threshold to a more lenient 20% threshold (\(f_r \le 0.2\)), we find large price movements to be power-law distributed for about 91% of cryptocurrencies. These results thus provide strong evidence that cryptoassets, fairly generally, present large price movements quite well described by power-law distributions. Moreover, this conclusion is robust when starting the expanding window with a greater number of return observations (between 100 and 300 days) and filtering out cryptoassets with missing observations (Supplementary Figs. S2 and S3). Still, it is worth noticing the existence of a few cryptoassets (9 of them) with relatively small market capitalization (ranking below the top 1000) for which the power-law hypothesis is always rejected (Supplementary Table S1).
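To make the quantity \(f_r\) concrete, the following sketch computes the fraction of rejection for one asset and the percentage of assets staying below a given fraction. The function names and the numbers in the usage note are hypothetical and serve only as an illustration.

```python
def rejection_fraction(p_values, threshold=0.1):
    """Fraction of window positions where the power-law hypothesis is
    rejected, that is, where the p-value is at most `threshold`."""
    return sum(p <= threshold for p in p_values) / len(p_values)

def percent_within(fractions, max_fraction):
    """Percentage of assets whose fraction of rejection is at most
    `max_fraction` (one point of a curve such as the one in Fig. 2a)."""
    return 100.0 * sum(f <= max_fraction for f in fractions) / len(fractions)
```

For example, an asset with p-values `[1.0, 0.17, 0.05, 0.5]` has `rejection_fraction` equal to 0.25, and a market with fractions `[0.0, 0.04, 0.3, 0.6]` has `percent_within(..., 0.1)` equal to 50.0.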

Figure 2

Large price movements are power-law distributed over the entire history of most cryptocurrencies with median values typically smaller than those found for traditional assets. (a) Percentage of cryptoassets rejecting the power-law hypothesis for large positive (blue) or negative (red) price returns in at most a given fraction of the weekly positions of the expanding time window (\(f_r\)) used to sample the return series. Remarkably, 68% of all 7111 digital currencies are compatible with the power-law hypothesis over their entire history, and about 91% of them reject the power-law hypothesis in less than 20% of the positions of the expanding time window (\(f_r \le 0.2\)). (b) Probability distributions obtained via kernel density estimation of the median values of the power-law exponents along the history of each digital currency. The blue curve shows the distribution of the median exponents related to positive returns (\({\tilde{\alpha }}_{\,+}\)) and the red curve does the same for negative returns (\({\tilde{\alpha }}_{\,-}\)). The medians of \({\tilde{\alpha }}_{\,+}\) and \({\tilde{\alpha }}_{\,-}\) are indicated by vertical dashed lines. Panels (c) and (d) show the distributions of these median exponents when considering the top 2000 and the top 200 cryptocurrencies by market capitalization, respectively. We observe that the distributions of \({\tilde{\alpha }}_{\,+}\) and \({\tilde{\alpha }}_{\,-}\) tend to shift toward larger values when considering the largest cryptoassets.

Having verified that large price movements in the cryptocurrency market are generally well-described by power-law distributions, we now focus on the power-law exponents that typically characterize each cryptoasset. To do so, we select all exponent estimates over the entire history of each digital asset for which the power-law hypothesis is not rejected and calculate their median values for both the positive (\({\tilde{\alpha }}_{\,+}\)) and negative (\({\tilde{\alpha }}_{\,-}\)) returns. The dashed lines in Fig. 1c show these median values for Bitcoin where \({\tilde{\alpha }}_{\,+}=4.50\) and \({\tilde{\alpha }}_{\,-}=2.99\). It is worth noticing that the variance of large price movements \(\sigma ^2\) is finite only for \(\alpha >3\), as the integral \(\sigma ^2 \sim \int _{r_{\text {min}}}^\infty r^2 p(r) dr\) diverges outside this interval. Thus, while the typical variance of large positive returns is finite for Bitcoin, negative returns are at the limit of not having a typical scale and are thus susceptible to much larger variations. Figure 2b shows the probability distribution for the median power-law exponents of all cryptoassets grouped by large positive and negative returns. We note that the distribution of typical power-law exponents associated with large positive returns is shifted to smaller values when compared with the distribution of exponents related to large negative returns. The medians of these typical exponents are respectively 2.78 and 3.11 for positive and negative returns. This result suggests that the asymmetry in large price movements we have observed for Bitcoin is an overall feature of the cryptocurrency market. By calculating the difference between the typical exponents related to positive and negative large returns (\(\Delta \alpha = {\tilde{\alpha }}_{\,+} - {\tilde{\alpha }}_{\,-}\)) for each digital currency, we find that about 2/3 of cryptocurrencies have \({\tilde{\alpha }}_{\,+}<{\tilde{\alpha }}_{\,-}\) (see Supplementary Fig. 
S4 for the probability distribution of \(\Delta \alpha\)). Thus, unlike Bitcoin, most cryptocurrencies have been more susceptible to large positive price variations than negative ones. While this asymmetry in the return distributions indicates that extremely large price variations tend to be positive, it does not necessarily imply positive price variations are more common for any threshold in the return values. This happens because the fraction of events in each tail is also related to the lower bound of the power-law regime (\(r_{\text {min}}\)). However, we have found the distribution of \(r_{\text {min}}\) to be similar among the positive and negative returns [Supplementary Fig. S5a]. The distribution of high percentile scores (such as the 90th percentile) is also shifted to larger values for positive returns [Supplementary Fig. S5b]. Moreover, this asymmetry in high percentile scores related to positive and negative returns is systematic along the evolution of the power-law exponents [Supplementary Fig. S5c]. These results thus indicate that there is indeed more probability mass in the positive tails than in the negative ones, a feature that likely reflects the current expansion of the cryptocurrency market as a whole. The distributions in Fig. 2b also show that large price variations do not have a finite variance for a significant part of cryptoassets, that is, \({\tilde{\alpha }}_{\,+}\le 3\) for 62% of cryptocurrencies and \({\tilde{\alpha }}_{\,-}\le 3\) for 44% of cryptocurrencies. A significant part of the cryptocurrency market is thus prone to price variations with no typical scale. Intriguingly, we further note the existence of a minority group of cryptoassets with \({\tilde{\alpha }}_{\,+}\le 2\) (7%) or \({\tilde{\alpha }}_{\,-}\le 2\) (3%). 
These cryptocurrencies, whose representative members are Counos X (CCXX, rank 216) with \(\alpha _{\,-} = 1.96\) and \(\alpha _{\,+} = 1.84\) and Chainbing (CBG, rank 236) with \(\alpha _{\,+} = 1.87\), are even more susceptible to extreme price variations as one cannot even define the average value \(\mu\) for large price returns, as the integral \(\mu \sim \int _{r_{\text {min}}}^\infty r p(r) dr\) diverges for \(\alpha \le 2\).
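For reference, both thresholds follow from the moments of a normalized power law. Assuming \(p(r) = (\alpha -1)\, r_{\text {min}}^{\alpha -1}\, r^{-\alpha }\) for \(r>r_{\text {min}}\) (the normalization constant is our assumption here, since only the proportionality is stated above), the k-th moment of large returns is

$$\begin{aligned} \langle r^k \rangle = \int _{r_{\text {min}}}^\infty r^k\, p(r)\, dr = \frac{\alpha -1}{\alpha -1-k}\, r_{\text {min}}^k \quad (\text {for }\alpha >k+1)\,, \end{aligned}$$

and the integral diverges for \(\alpha \le k+1\). Setting \(k=1\) shows that the average is finite only for \(\alpha >2\), while \(k=2\) shows that the second moment, and hence the variance, is finite only for \(\alpha >3\). As an illustration with Bitcoin's median exponents, \({\tilde{\alpha }}_{\,+}=4.50\) yields \(\langle r^2 \rangle = (3.5/1.5)\, r_{\text {min}}^2 \approx 2.33\, r_{\text {min}}^2\), whereas \({\tilde{\alpha }}_{\,-}=2.99\) falls just below the threshold for a finite second moment.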

We have also replicated the previous analysis when considering cryptocurrencies in the top 2000 and top 200 rankings of market capitalization (as of July 2022). Figure 2c and 2d show the probability distribution for the median power-law exponents of these two groups. We observe that these distributions are more localized (particularly for the top 200) than the equivalent distributions for all cryptocurrencies. The fraction of cryptocurrencies with no typical scale for large price returns (\({\tilde{\alpha }}_{\,+}\le 3\) and \({\tilde{\alpha }}_{\,-}\le 3\)) is significantly lower in these two groups compared to all cryptocurrencies. In the top 2000 cryptocurrencies, 51% have \({\tilde{\alpha }}_{\,+}\le 3\) and 26% have \({\tilde{\alpha }}_{\,-}\le 3\). These fractions are even smaller among the top 200 cryptocurrencies, with only 44% and 15% not presenting a typical scale for large positive and negative price returns, respectively. We further observe a decrease in the fraction of cryptoassets for which the average value for large price returns is not even finite, as only 2% and 1% of top 2000 cryptoassets have \({\tilde{\alpha }}_{\,+}\le 2\) and \({\tilde{\alpha }}_{\,-}\le 2\). This reduction is more impressive among the top 200 cryptocurrencies as only the cryptoasset Fei USD (FEI, rank 78) has \({\tilde{\alpha }}_{\,+} = 1.97\) and none is characterized by \({\tilde{\alpha }}_{\,-}\le 2\). The medians of \({\tilde{\alpha }}_{\,+}\) and \({\tilde{\alpha }}_{\,-}\) also increase from 2.78 and 3.11 for all cryptocurrencies to 2.98 and 3.35 for the top 2000 and to 3.08 and 3.58 for the top 200 cryptocurrencies. Conversely, the asymmetry between positive and negative large price returns does not differ much among the three groups, with the condition \({\tilde{\alpha }}_{\,+}<{\tilde{\alpha }}_{\,-}\) holding only for a slightly larger fraction of top 2000 (69.1%) and top 200 (70.6%) cryptoassets compared to all cryptocurrencies (66.4%). 
Moreover, all these patterns are robust when filtering out time series with sampling issues or when considering only cryptoassets that stay compatible with the power-law hypothesis in more than 90% of the positions of the expanding time window (Supplementary Figs. S6 and S7).

We also investigate whether the patterns related to the median of the power-law exponents differ among groups of cryptocurrencies with different designs and purposes. To do so, we group digital assets using the 50 most common tags in our dataset (e.g. “bnb-chain”, “defi”, and “collectibles-nfts”) and estimate the probability distributions of the median exponents \({\tilde{\alpha }}_{\,+}\) and \({\tilde{\alpha }}_{\,-}\) (Supplementary Figs. S8 and S9). These results show that design and purpose affect the dynamics of large price variations in the cryptocurrency market as the medians of typical exponents range from 2.4 to 3.7 among the groups. The lowest values occur for cryptocurrencies tagged as “doggone-doggerel” (medians of \({\tilde{\alpha }}_{\,+}\) and \({\tilde{\alpha }}_{\,-}\) are 2.38 and 2.83), “memes” (2.41 and 2.87), and “stablecoin” (2.65 and 2.79). Digital currencies belonging to the first two tags overlap a lot and have Dogecoin (DOGE, rank 10) and Shiba Inu (SHIB, rank 13) as the most important representatives. Cryptoassets with these tags usually have humorous characteristics (such as an Internet meme) and several have been considered as a form of pump-and-dump scheme42,43,44, a type of financial fraud in which false statements artificially inflate asset prices so the scheme operators sell their overvalued cryptoassets. Conversely, cryptoassets tagged as “stablecoin” represent a class of cryptocurrencies designed to have a fixed exchange rate to a reference asset (such as a national currency or precious metal)3,4. While the price of stablecoins tends to stay around the target values, their price series are also marked by sharp variations, which in turn are responsible for their typically small power-law exponents. This type of cryptoasset has been shown to be prone to failures45,46,47, such as the recent examples of TerraUSD (UST) and Tron’s USDD (USDD) that lost their pegs to the US Dollar producing large variations in their price series. 
The asymmetry between positive and negative large returns also emerges when grouping the cryptocurrencies using their tags. All 50 tags have distributions of \({\tilde{\alpha }}_{\,+}\) shifted to smaller values when compared with the distributions of \({\tilde{\alpha }}_{\,-}\), with differences between their medians ranging from \(-0.74\) (“okex-blockdream-ventures-portfolio”) to \(-0.14\) (“stablecoin”). Indeed, only four (“stablecoin”, “scrypt”, “fantom-ecosystem” and “alameda-research-portfolio”) out of the fifty groupings have both distributions indistinguishable under a two-sample Kolmogorov-Smirnov test (p-value \(> 0.05\)).
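The two-sample Kolmogorov-Smirnov comparison mentioned above can be sketched as follows. This is a minimal standard-library illustration, not the routine used in this work: the p-value relies on the asymptotic Kolmogorov series with a common small-sample correction, and the function name `ks_two_sample` is hypothetical.

```python
import math

def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk through the pooled values and track the gap between the two
    # empirical cumulative distribution functions.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    effective = math.sqrt(n * m / (n + m))
    lam = (effective + 0.12 + 0.11 / effective) * d
    if lam < 0.3:
        # The Kolmogorov series equals 1 to within about 1e-5 in this range.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))
```

Under the criterion used in the text, two groups of median exponents would be considered indistinguishable when the returned p-value exceeds 0.05.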

Focusing now on the evolution of the power-law exponents quantified by the time series \(\alpha _t\) for positive and negative returns, we ask whether these exponents present particular time trends. For Bitcoin (Fig. 1c), \(\alpha _t\) seems to increase with time for both positive and negative returns. At the same time, the results of Fig. 2 also suggest that market capitalization affects these power-law exponents. To verify these possibilities, we assume the power-law exponents (\(\alpha _t\)) to be linearly associated with the cryptocurrency’s age (\(y_t\), measured in years) and the logarithm of market capitalization (\(\log c_t\)). As detailed in the “Methods” section, we frame this problem using a hierarchical Bayesian model. This approach assumes that the linear coefficients associated with the effects of age (A) and market capitalization (C) of each digital currency are drawn from distributions with means \(\mu _A\) and \(\mu _C\) and standard deviations \(\sigma _A\) and \(\sigma _C\), which are in turn distributed according to global distributions representing the overall impact of these quantities on the cryptocurrency market. The Bayesian inference process consists of estimating the posterior probability distributions of the linear coefficients for each cryptocurrency as well as the posterior distributions of \(\mu _A\), \(\mu _C\), \(\sigma _A\), and \(\sigma _C\), allowing us to simultaneously probe asset-specific tendencies and overall market characteristics. Moreover, we restrict this analysis to the 2140 digital currencies having more than 50 observations of market capitalization concomitantly to the time series of the power-law exponents in order to have enough data points for detecting possible trends.
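As a reading aid, one possible way of writing such a hierarchical linear model for the i-th cryptocurrency is given below. The Gaussian forms, the intercept \(K_i\), and the noise scale \(\varepsilon\) are assumptions made here for illustration only; the actual likelihood and prior distributions are those specified in the “Methods” section and may differ.

$$\begin{aligned} \alpha _t^{(i)}&\sim \mathcal {N}\big (K_i + A_i\, y_t^{(i)} + C_i \log c_t^{(i)},\ \varepsilon \big )\,,\\ A_i&\sim \mathcal {N}(\mu _A, \sigma _A)\,,\quad C_i \sim \mathcal {N}(\mu _C, \sigma _C)\,. \end{aligned}$$

In this form, \(A_i\) and \(C_i\) capture asset-specific tendencies, while \(\mu _A\), \(\mu _C\), \(\sigma _A\), and \(\sigma _C\) summarize the overall market.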

Figure 3

Illustration of different effects of age and market capitalization on power-law exponents of cryptocurrencies. (a) Posterior probability distributions of the linear coefficients associated with the effects of age [p(A)] and (b) the effects of market capitalization [p(C)] on power-law exponents related to large positive returns. Panels (c) and (d) show the analogous distributions for the association with power-law exponents related to large negative returns. In all panels, the different curves show the distributions for each of the top 20 cryptoassets by market capitalization. Cryptocurrencies significantly affected by age or market capitalization are highlighted in boldface, and the numbers between brackets show their positions in the market capitalization rank.

When considering the overall market characteristics, we find that the 94% highest density intervals for \(\mu _A\) ([− 0.01, 0.06] for positive and [− 0.02, 0.03] for negative returns) and \(\mu _C\) ([− 0.02, 0.03] for positive and [− 0.001, 0.04] for negative returns) include the zero (see Supplementary Fig. S10 for their distributions). Thus, there is no evidence of a unique overall pattern for the association between the power-law exponents and age or market capitalization followed by a significant part of the cryptocurrency market. Indeed, the 94% highest density intervals for \(\sigma _A\) ([0.87, 0.93] for positive and [0.63, 0.70] for negative returns) and \(\sigma _C\) ([0.57, 0.61] for positive and [0.49, 0.52] for negative returns) indicate that the cryptocurrency market is highly heterogeneous regarding the evolution of power-law exponents associated with large price variations (see Supplementary Fig. S10 for the distributions of \(\sigma _A\) and \(\sigma _C\)). Figure 3 illustrates these heterogeneous behaviors by plotting the posterior probability distributions for the linear coefficients associated with the effects of age (A) and market capitalization (C) for the top 20 digital assets, where cryptocurrencies which are significantly affected (that is, the 94% highest density intervals for A or C do not include the zero) by these quantities are highlighted in boldface. Even this small selection of digital currencies already presents a myriad of patterns. First, we observe that the power-law exponents of a few top 20 cryptocurrencies are neither correlated with age nor market capitalization. That is the case of Shiba Inu (SHIB, rank 13) and Dai (DAI, rank 11) for both positive and negative returns, UNUS SED LEO (LEO, rank 18) and Polkadot (DOT, rank 12) for the positive returns, and USDCoin (USDC, rank 4) and Solana (SOL, rank 9) for negative returns. 
There are also cryptocurrencies with exponents positively or negatively correlated only with market capitalization. Examples include Tether (USDT, rank 3) and Dogecoin (DOGE, rank 10), for which the power-law exponents associated with positive returns increase with market capitalization, and Binance USD (BUSD, rank 6), for which power-law exponents associated with positive and negative returns decrease with market capitalization. We also observe cryptocurrencies for which age and market capitalization simultaneously affect the power-law exponents. Polygon (MATIC, rank 14) is an example where the power-law exponents associated with positive returns tend to increase with age and decrease with market capitalization. Finally, there are also cryptocurrencies with power-law exponents only associated with age. That is the case of Bitcoin (BTC, rank 1), Ethereum (ETH, rank 2), and Cardano (ADA, rank 8), for which the power-law exponents related to positive and negative returns increase with age, but also the case of Uniswap (UNI, rank 19), for which the exponents decrease with age.
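The significance criterion used above (a 94% highest density interval that excludes zero) can be illustrated with a short sketch operating on posterior samples of a coefficient. The function names `hdi` and `is_significant` are hypothetical, and the interval is computed with a simple sorted-window search that assumes a unimodal posterior.

```python
import math

def hdi(samples, mass=0.94):
    """Narrowest interval containing a fraction `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

def is_significant(samples, mass=0.94):
    """True when the highest density interval does not include zero."""
    low, high = hdi(samples, mass)
    return not (low <= 0.0 <= high)
```

Applied to the posterior samples of A or C for a given asset, `is_significant` returning `True` corresponds to the cases highlighted in boldface in Fig. 3.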

Figure 4 systematically extends the observations made for the top 20 cryptoassets to all 2140 digital currencies for which we have modeled the changes in the power-law exponents as a function of age and market capitalization. First, we note that only 10% of cryptocurrencies have power-law exponents not significantly affected by age and market capitalization. The vast majority (90%) displays some relationship with these quantities. However, these associations are as varied as the ones we have observed for the top 20 cryptoassets. About 52% of cryptocurrencies have power-law exponents simultaneously affected by age and market capitalization. In this group, these quantities simultaneously impact the exponents related to positive and negative returns of 34% of cryptoassets, whereas the remainder is affected only in the positive tail (9%) or only in the negative tail (9%). Moving back in the hierarchy, we find that the power-law exponents of 32% of cryptocurrencies are affected only by age while a much smaller fraction (6%) is affected only by market capitalization. Within the group only affected by age, we observe that the effects are slightly more frequent only on the exponents related to negative returns (12%), compared to cases where effects are restricted only to positive returns (10%) or simultaneously affect both tails (10%). Finally, within the small group only affected by market capitalization, we note that associations more frequently involve only exponents related to negative returns (3%) compared to the other two cases (2% only positive returns and 1% for both positive and negative returns).

Beyond the previous discussion about whether positive or negative returns are simultaneously or individually affected by age and market capitalization, we have also categorized the direction of the trend imposed by these two quantities on the power-law exponents. Blue rectangles in Fig. 4 represent the fraction of relationships for which increasing age or market capitalization (or both) is associated with a rise in the power-law exponents. About 28% of all cryptocurrencies exhibit this pattern, in which large price variations are expected to occur less frequently as they grow and age. Conversely, the red rectangles in Fig. 4 depict the fraction of relationships for which increasing age or market capitalization (or both) is associated with a reduction in the power-law exponents. This case comprises about 26% of all cryptocurrencies, for which large price variations are likely to become more frequent as they grow in market capitalization and age. Still, the largest group of associations, represented by green rectangles, refers to the case where the effects of age and market capitalization point in different directions (e.g. exponents increasing with age while decreasing with market capitalization). About 36% of cryptocurrencies fit this condition, which in turn adds to the intricate hierarchical structure of patterns displayed by cryptocurrencies regarding the dynamics of large price variations. This complex picture is not much different when considering only cryptocurrencies in the top 200 market capitalization rank (Supplementary Fig. S11). However, we do observe an increased prevalence of patterns characterized by exponents that rise with age and market capitalization (37%), suggesting that large price variations are becoming less frequent among the top 200 cryptocurrencies than in the overall market.
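The three trend categories described above can be illustrated with a minimal sketch. It assumes that every fitted effect (age or market capitalization, for the positive or the negative tail) has already been reduced to a sign: +1 for an increasing exponent, -1 for a decreasing exponent, and 0 when no association is detected. The function name `classify_trend` and this sign encoding are hypothetical and only serve the illustration; they are not taken from the analysis itself.

```python
def classify_trend(effects):
    """Classify the overall trend of a cryptocurrency.

    `effects` is an iterable of signs (+1, -1 or 0), one per fitted
    coefficient (hypothetical encoding, see the lead-in).
    """
    signs = {e for e in effects if e != 0}
    if not signs:
        return "unaffected"      # no association with age or capitalization
    if signs == {1}:
        return "increasing"      # blue rectangles
    if signs == {-1}:
        return "decreasing"      # red rectangles
    return "mixed"               # green rectangles: effects point in different directions


# Examples following the qualitative descriptions given for the top 20 assets.
# Order of entries: (age, cap) for positive returns, then (age, cap) for negative returns.
bitcoin = [1, 0, 1, 0]     # exponents of both tails increase with age
polygon = [1, -1, 0, 0]    # positive tail: increases with age, decreases with capitalization
print(classify_trend(bitcoin), classify_trend(polygon))
```

Under this encoding, a cryptocurrency is labelled "mixed" as soon as one effect increases the exponents while another decreases them, regardless of which tail or predictor is involved.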

Figure 4

Summary of the effects of age and market capitalization on power-law exponents of the cryptocurrency market. Hierarchical visualization (tree map) of the possible effects of age and market capitalization on the power-law exponents. The first level (two outermost rectangles) separates cryptocurrencies that are affected by age or market capitalization (90%) from those unaffected by either of these quantities (10%). Cryptocurrencies affected by age or market capitalization are classified as those simultaneously affected by both quantities (52%), those affected only by age (32%), and those affected only by market capitalization (6%). Each of the previous three levels is further classified regarding whether both positive and negative returns are simultaneously affected or whether the effect involves only positive or only negative returns. Finally, the former levels are classified regarding whether the power-law exponents increase, decrease or have a mixed trend with the predictive variables. Overall, 36% of the associations are classified as mixed trends (green rectangles), 28% are increasing trends (blue rectangles), and 26% are decreasing trends (red rectangles).

Discussion

We have studied the distributions of large price variations of a significant part of the digital assets that currently comprise the entirety of the cryptocurrency market. Unlike previous work, we have estimated these distributions for entire historical price records of each digital currency, and we have identified the patterns under which the return distributions change as cryptoassets age and grow in market capitalization. Similarly to conventional financial assets37,38,39, our findings show that the return distributions of the vast majority of cryptoassets have tails that are described well by power-law functions along their entire history. The typical power-law exponents of cryptocurrencies (\(\alpha \sim 3\)) are, however, significantly smaller than those reported for conventional assets (\(\alpha \sim 4\))37,38,39. This feature corroborates the widespread belief that cryptoassets are indeed considerably more risky for investments than stocks or other more traditional financial assets. Indeed, we have found that about half of the cryptocurrencies in our analysis do not have a characteristic scale for price variations, and are thus prone to much higher price variations than those typically observed in stock markets. On the upside, we have also identified an asymmetry in the power-law exponents for positive and negative returns in about 2/3 of all considered cryptocurrencies, such that these exponents are smaller for positive than they are for negative returns. This means that sizable positive price variations have generally been more likely to occur than equally sizable negative price variations, which in turn may also reflect the recent overall expansion of the cryptocurrency market.

Using a hierarchical Bayesian linear model, we have also simultaneously investigated the overall market characteristics and asset-specific tendencies regarding the effects of age and market capitalization on the power-law exponents. We have found that the cryptocurrency market is highly heterogeneous regarding the trends exhibited by each cryptocurrency; however, only a small fraction of cryptocurrencies (10%) have power-law exponents correlated with neither age nor market capitalization. These associations have been mostly ignored by the current literature and are probably related to the still-early developmental stage of the cryptocurrency market as a whole. Overall, 36% of cryptocurrencies present trends that do not systematically contribute to increasing or decreasing their power-law exponents as they age and grow in market capitalization. On the other hand, for 26% of cryptocurrencies, aging and growing market capitalization are both associated with a reduction in their power-law exponents, thus contributing to the rise in the frequency of large price variations in their dynamics. Only about 28% of cryptocurrencies present trends in which the power-law exponents increase with age and market capitalization, thus making large price variations less likely. These results contrast with findings about the increasing informational efficiency of the cryptocurrency market22. In fact, while the cryptocurrency market is becoming more informationally efficient, our findings indicate that there is no clear trend toward decreasing the risks of sizable variations in the prices of most considered cryptoassets. In other words, risk and efficiency appear to be moving in different directions in the cryptocurrency market.

To conclude, we hope that our findings will contribute significantly to a better understanding of the dynamics of large price variations in the cryptocurrency market as a whole, and not just for a small subset of selected digital assets. This is especially relevant given the diminishing concentration of market capitalization among the top digital currencies and the considerable impact these new assets may have on our increasingly digital economy.

Methods

Data

Our results are based on time series of the daily closing prices (in USD) for all cryptoassets listed on CoinMarketCap (coinmarketcap.com) as of 25 July 2022 [see Supplementary Fig. S1a for a visualization of the increasing number of cryptoassets listed on CoinMarketCap since 2013]. These time series were automatically gathered using the cryptoCMD Python package48, and other information such as the tags associated with each cryptoasset was obtained via the CoinMarketCap API49. In addition, we have also obtained the daily market capitalization time series (in USD) for all cryptoassets that had this information available at the time. The earliest records available from CoinMarketCap date from 29 April 2013 and the latest records used in our analysis correspond to 25 July 2022. Out of 9943 cryptocurrencies, we have restricted our analysis to the 7111 with at least 200 price-return observations. The median length of these time series is 446 observations [see the distribution of series length in Supplementary Fig. S1b].

Estimating power-law exponents

We have estimated the power-law behavior of the return distributions by applying the Clauset-Shalizi-Newman method40 to the return time series \(r_t\). In particular, we have sampled each of these time series using an expanding time window that starts at the hundredth observation and grows in weekly steps (seven data points each step). For each position of the expanding time window, we have separated the positive returns from the negative ones and applied the Clauset-Shalizi-Newman method40 to each set. This approach consists of obtaining the maximum likelihood estimate for the power-law exponent, \(\alpha = 1 + n / \left( \sum _{t=1}^n \ln \left( r_t/r_{\text {min}}\right) \right) ,\) where \(r_{\text {min}}\) is the lower bound of the power-law regime and n is the number of (positive or negative) return observations in the power-law regime for a given position of the expanding time window. The value \(r_{\text {min}}\) is estimated from data by minimizing the Kolmogorov-Smirnov statistic between the empirical distribution and the power-law model. The Clauset-Shalizi-Newman method40 yields an unbiased and consistent estimator50, in the sense that as the sample increases indefinitely, the estimated power-law exponent converges in distribution to the actual value. Moreover, we have used the implementation available in the powerlaw Python package51.
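The estimation steps described above can be sketched in plain NumPy. This is a minimal illustration and not the implementation of the powerlaw package: the function names, the `min_tail` safeguard, and the handling of negative returns through their absolute values are assumptions made for the example.

```python
import numpy as np

def mle_alpha(tail, r_min):
    """Maximum likelihood exponent for observations tail >= r_min."""
    return 1.0 + tail.size / np.sum(np.log(tail / r_min))

def ks_distance(tail, alpha, r_min):
    """Kolmogorov-Smirnov distance between a sorted tail and the power-law CDF."""
    n = tail.size
    model_cdf = 1.0 - (tail / r_min) ** (1.0 - alpha)
    upper = np.arange(1, n + 1) / n - model_cdf
    lower = model_cdf - np.arange(n) / n
    return max(upper.max(), lower.max())

def fit_power_law(values, min_tail=10):
    """Return (alpha, r_min, ks), choosing r_min by minimizing the KS distance.

    `min_tail` (minimum number of points in the tail) is an assumed safeguard.
    """
    x = np.sort(np.asarray(values, dtype=float))
    x = x[x > 0]
    best = (np.nan, np.nan, np.inf)
    for r_min in np.unique(x):            # every observed value is a candidate
        tail = x[x >= r_min]
        if tail.size < min_tail:
            break                          # later candidates have even fewer points
        if tail[-1] == r_min:
            continue                       # all values tied: the MLE is undefined
        alpha = mle_alpha(tail, r_min)
        ks = ks_distance(tail, alpha, r_min)
        if ks < best[2]:
            best = (alpha, r_min, ks)
    return best

def expanding_window_exponents(returns, start=100, step=7):
    """Fit both tails on windows returns[:end], with end = start, start + step, ..."""
    returns = np.asarray(returns, dtype=float)
    results = []
    for end in range(start, returns.size + 1, step):
        window = returns[:end]
        alpha_pos = fit_power_law(window[window > 0])[0]
        alpha_neg = fit_power_law(-window[window < 0])[0]   # magnitudes of losses
        results.append((end, alpha_pos, alpha_neg))
    return results
```

Scanning every observed value as a candidate \(r_{\text {min}}\) is quadratic in the window length, which is acceptable for daily series of a few thousand points but would need a coarser grid for much longer records.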

In addition to obtaining the power-law exponents, we have also verified the adequacy of the power-law hypothesis using the procedure originally proposed by Clauset et al.40 as adapted by Preis et al.41. This procedure consists of generating synthetic samples under the power-law hypothesis with the same properties as the empirical data under analysis (that is, same length and parameters \(\alpha\) and \(r_{\text {min}}\)), fitting the power-law model to the simulated data via the Clauset-Shalizi-Newman method, and calculating the Kolmogorov-Smirnov statistic (\(\kappa _{\text {syn}}\)) between the distributions obtained from the simulated samples and the adjusted power-law model. Next, the values of \(\kappa _{\text {syn}}\) are compared to the Kolmogorov-Smirnov statistic calculated between empirical data and the power-law model (\(\kappa\)). Finally, a p-value is defined as the fraction of synthetic samples for which \(\kappa _{\text {syn}}>\kappa\). We have used one thousand synthetic samples for each position of the expanding time window and the more conservative 90% confidence level (instead of the more lenient and commonly used 95% confidence level), such that the power-law hypothesis is rejected whenever p-value \(\le 0.1\).
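The following sketch illustrates this goodness-of-fit test under a simplifying assumption: \(r_{\text {min}}\) is held fixed when refitting each synthetic sample, whereas the procedure described above re-applies the full Clauset-Shalizi-Newman method. The function names and the default seed are hypothetical.

```python
import numpy as np

def mle_alpha(tail, r_min):
    """Maximum likelihood exponent for observations tail >= r_min."""
    return 1.0 + tail.size / np.sum(np.log(tail / r_min))

def ks_distance(tail, alpha, r_min):
    """Kolmogorov-Smirnov distance between the tail and the power-law CDF."""
    tail = np.sort(tail)
    n = tail.size
    model_cdf = 1.0 - (tail / r_min) ** (1.0 - alpha)
    upper = np.arange(1, n + 1) / n - model_cdf
    lower = model_cdf - np.arange(n) / n
    return max(upper.max(), lower.max())

def power_law_p_value(tail, r_min, n_synthetic=1000, seed=0):
    """Fraction of synthetic power-law samples with kappa_syn > kappa."""
    rng = np.random.default_rng(seed)
    tail = np.asarray(tail, dtype=float)
    alpha = mle_alpha(tail, r_min)
    kappa = ks_distance(tail, alpha, r_min)
    exceed = 0
    for _ in range(n_synthetic):
        # Inverse-transform sampling from p(r) ~ r^(-alpha) for r >= r_min.
        u = 1.0 - rng.random(tail.size)               # uniform on (0, 1]
        synthetic = r_min * u ** (-1.0 / (alpha - 1.0))
        alpha_syn = mle_alpha(synthetic, r_min)       # simplification: r_min kept fixed
        kappa_syn = ks_distance(synthetic, alpha_syn, r_min)
        exceed += kappa_syn > kappa
    return exceed / n_synthetic

# The power-law hypothesis is rejected whenever the p-value is <= 0.1.
rng = np.random.default_rng(1)
not_power_law = rng.uniform(1.0, 2.0, size=500)       # bounded data, no heavy tail
print(power_law_p_value(not_power_law, r_min=1.0, n_synthetic=200))
```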

Modelling the effects of age and market capitalization on the power-law exponents

We have estimated the effects of age and market capitalization on the power-law exponents associated with positive or negative returns of a given cryptocurrency using the linear model

$$\begin{aligned} \alpha _t \sim {\mathscr {N}}(K + C\, \log c_t + A\, y_t, \varepsilon ), \end{aligned}$$
(3)

where \(\alpha _{t}\) represents the power-law exponent, \(\log c_t\) is the logarithm of the market capitalization, and \(y_t\) is the age (in years) of the cryptocurrency at the t-th observation. Moreover, K is the intercept of the association, while C and A are linear coefficients quantifying the effects of market capitalization and age, respectively. Finally, \({\mathscr {N}}(\mu , \sigma )\) stands for the normal distribution with mean \(\mu\) and standard deviation \(\sigma\), such that the parameter \(\varepsilon\) accounts for the unobserved determinants in the dynamics of the power-law exponents. We have framed this problem using the hierarchical Bayesian approach such that each power-law exponent \(\alpha _t\) is nested within a cryptocurrency, with model parameters considered as normally distributed random variables whose parameters are also random variables. Mathematically, for each cryptocurrency, we have

$$\begin{aligned} \begin{aligned} K \sim {\mathscr {N}}(\mu _K, \sigma _K), \quad C \sim {\mathscr {N}}(\mu _C, \sigma _C), \quad A \sim {\mathscr {N}}(\mu _A, \sigma _A), \end{aligned} \end{aligned}$$
(4)

where \(\mu _K\), \(\sigma _K\), \(\mu _C\), \(\sigma _C\), \(\mu _A\), and \(\sigma _A\) are hyperparameters. These hyperparameters are assumed to be distributed according to distributions that quantify the overall impact of age and market capitalization on the cryptocurrency market as a whole.
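To make the nesting in Eqs. (3) and (4) concrete, the sketch below simulates the generative side of the model with NumPy. It is not the inference procedure: the hyperparameter values, the synthetic age and market capitalization series, and the function name are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_exponents(n_coins, n_obs, hyper, eps, seed=0):
    """Draw power-law exponents from the hierarchical linear model.

    hyper: dict with keys mu_K, sigma_K, mu_C, sigma_C, mu_A, sigma_A
    eps:   standard deviation of the observation noise in Eq. (3)
    """
    rng = np.random.default_rng(seed)
    # Eq. (4): one (K, C, A) triple per cryptocurrency, drawn from market-level normals.
    K = rng.normal(hyper["mu_K"], hyper["sigma_K"], n_coins)
    C = rng.normal(hyper["mu_C"], hyper["sigma_C"], n_coins)
    A = rng.normal(hyper["mu_A"], hyper["sigma_A"], n_coins)
    # Hypothetical predictors: age in years sampled weekly, and a random-walk log-capitalization.
    age = np.tile(np.arange(n_obs) * 7.0 / 365.0, (n_coins, 1))
    log_cap = rng.normal(15.0, 1.0, (n_coins, 1)) + 0.1 * rng.normal(size=(n_coins, n_obs)).cumsum(axis=1)
    # Eq. (3): normal observation model around the linear predictor.
    mean = K[:, None] + C[:, None] * log_cap + A[:, None] * age
    alpha = rng.normal(mean, eps)
    return alpha, log_cap, age, (K, C, A)

# Illustrative (made-up) hyperparameter values.
hyper = {"mu_K": 3.0, "sigma_K": 0.5, "mu_C": 0.0, "sigma_C": 0.05, "mu_A": 0.1, "sigma_A": 0.1}
alpha, log_cap, age, (K, C, A) = simulate_exponents(n_coins=5, n_obs=50, hyper=hyper, eps=0.2)
print(alpha.shape)
```

In the Bayesian regression the direction is reversed: the exponents are observed, and the posterior distributions of K, C, and A for each cryptocurrency, together with the market-level hyperparameters, are inferred from them.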

We have performed this Bayesian regression for exponents related to positive and negative returns separately, and used noninformative prior and hyperprior distributions in order not to bias the posterior estimation52. Specifically, we have considered

$$\begin{aligned} \begin{aligned} \mu _K&\sim {\mathscr {N}}(0, 10^5), \quad \sigma _K \sim \mathrm{Inv{-}\Gamma }(1, 1),\\ \mu _C&\sim {\mathscr {N}}(0, 10^5), \quad \sigma _C \sim \mathrm{Inv{-}\Gamma }(1, 1),\\ \mu _A&\sim {\mathscr {N}}(0, 10^5), \quad \sigma _A \sim \mathrm{Inv{-}\Gamma }(1, 1), \end{aligned} \end{aligned}$$
(5)

and \(\varepsilon \sim {\mathscr {U}}(0,10^2),\) where \({\mathscr {U}}(a,b)\) stands for the uniform distribution in the interval [a, b] and \({\text {Inv}{-}\Gamma }(\theta , \gamma )\) represents the inverse gamma distribution with shape and scale parameters \(\theta\) and \(\gamma\), respectively. For the numerical implementation, we have relied on the PyMC53 Python package and sampled the posterior distributions via the gradient-based Hamiltonian Monte Carlo No-U-Turn Sampler (NUTS) method. We have run four parallel chains with 2500 iterations each (1000 burn-in samples) to allow good mixing and estimated the Gelman-Rubin convergence statistic (R-hat) to ensure the convergence of the sampling approach (R-hat was always close to one).
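For reference, the classic form of the Gelman-Rubin statistic can be computed from a set of chains as in the sketch below. This is the textbook definition, given here only as an illustration; sampling libraries often report refined variants (for instance split or rank-normalized versions), and the function name and chain sizes are assumptions.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic R-hat for one scalar parameter.

    chains: array of shape (n_chains, n_draws) with post-burn-in samples.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()          # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)       # B: between-chain variance
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return np.sqrt(pooled / within)

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1500))                      # four chains exploring the same target
stuck = mixed + 2.0 * np.arange(4)[:, None]             # chains centered at different values
print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to one indicate that the chains agree with each other, whereas values well above one signal that the chains have not converged to a common distribution.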

In addition, we have also verified that models describing the power-law exponents as a function of only age (\(C\rightarrow 0\) in Eq. 3) or only market capitalization (\(A\rightarrow 0\) in Eq. 3) yield significantly worse descriptions of our data as quantified by the Widely Applicable Information Criterion (WAIC) and the Pareto Smoothed Importance Sampling Leave-One-Out cross-validation (PSIS-LOO)54 (see Supplementary Table S2).