Theoretical and empirical analysis of trading activity

Understanding the structure of financial markets deals with suitably determining the functional relation between financial variables. In this respect, important variables are the trading activity, defined here as the number of trades N, the traded volume V, the asset price P, the squared volatility $\sigma^2$, the bid-ask spread S and the cost of trading C. Different reasonings result in simple proportionality relations (“scaling laws”) between these variables. A basic proportionality is established between the trading activity and the squared volatility, i.e., $N \sim \sigma^2$. More sophisticated relations are the so-called 3/2-law $N^{3/2} \sim \sigma P V / C$ and the intriguing scaling $N \sim (\sigma P/S)^2$. We prove that these “scaling laws” are the only possible relations for the considered sets of variables by means of a well-known argument from physics: dimensional analysis.
Moreover, we provide empirical evidence based on data from the NASDAQ stock exchange showing that the sophisticated relations hold with a certain degree of universality. Finally, we discuss the time scaling of the volatility $\sigma$, which turns out to be more subtle than one might naively expect.

Tauchen and Pitts [30] defined trading activity via trading volume and derived a proportionality relation between the trading volume and the price variability. The rationale behind this definition and the implied relation is the widely cited aphorism, "it takes volume to move prices". We refer to Karpoff [17] for a survey of these early works on the price-volume relation.
Owing to the limited empirical evidence for the hypotheses developed in these early approaches, the volume-based definition of trading activity has been replaced by the number of trades. This definition is motivated by a substantial link between the observed price variability and the number of trades (see Jones et al. [16], Ané and Geman [4] as well as Dufour and Engle [12]). For example, Jones et al. [16] find that the volume has no predictive power for the price variability, whereas the number of trades scales proportionally to the squared volatility. This scaling relation will be the starting point of our discussion. Building on the aforementioned ideas, numerous other studies followed, e.g. [2,20]. In particular, let us point out the contribution by Wyart et al. [31], who argue that the price volatility per trade, i.e., (price) × (volatility) × (number of trades)$^{-1/2}$, is proportional to the bid-ask spread. This connection can be seen as a somewhat refined version of the relation proposed by Jones et al. [16].
More recently, general relations between financial quantities have been derived based on the invariance of markets' microstructure, see Kyle and Obizhaeva [18]. In particular, the authors postulate a trading invariance principle which (in contrast to the above relations) is formulated on the latent level of meta-orders. Andersen et al. [3] and Benzaquen et al. [6] confirm empirically that an analogue of this invariance principle holds true for intradaily observable quantities. The fundamental relation may then be formulated as follows: the nominal value of the exchanged risk during a period of time, defined as the product (volatility) × (traded volume) × (price), is proportional to the number of trades to the power 3/2. This so-called intraday trading invariance principle and its connection to the relations proposed by Jones et al. [16] and Wyart et al. [31] is the focus of the present paper.
Our aim is to critically analyze these three relations as well as variants thereof by applying a method well known from physics: dimensional analysis. It is a tool which allows for the falsification of a proposed relation, e.g. of the above mentioned formulas for the number of trades, but not for its verification. This principle is similar in spirit to K. Popper's approach to epistemology which in turn is inspired by the classical theory of statistics: There one can possibly reject a null hypothesis, but never prove it. Similarly, dimensional analysis can only isolate those functional relations between variables involving certain "dimensions" which do not violate the obvious scaling invariance of these dimensions. Hence, it a priori rules out those functional relations which are in conflict with these scaling requirements. But this does not imply that the identified functional relations, which are in accordance with the scaling requirements, describe the reality in a reasonable way. This has to be confirmed by other methods. In the present setting the ultimate challenge is, of course, to fit to empirical data. To complete the picture, we perform an empirical analysis of the relations described above and show that the intraday trading invariance principle provides an appropriate fit to empirical data, but fails to be a "universal law".
In dimensional analysis one uses the rather obvious argument that a meaningful relation between quantities involving some "dimensions" should not be affected by the units in which these "dimensions" are measured. In the present context the relevant "dimensions" are time, shares, and money, denoted as T, S and U, respectively. We shall also use an additional argument, namely "leverage neutrality" as introduced by Kyle and Obizhaeva [19]. We emphasize that these authors were the first to combine the concepts of "leverage neutrality" and dimensional analysis. The assumption of leverage neutrality is based on the Modigliani-Miller theorem (see [24]) and leads to a scaling invariance principle which, mathematically speaking, is perfectly analogous to the dimensional scaling requirements mentioned above.
The remainder of the paper is structured as follows. In Sect. 2, we first deduce the proportionality between the number of trades and the price variability as proposed by Jones et al. [16] from dimensional arguments. Next, we derive the more involved scaling relations proposed by Benzaquen et al. [6] as well as Wyart et al. [31], again using dimensional analysis, and discuss the assumption of leverage neutrality in this context. Having a theoretical foundation for the discussed relations, we then turn to the empirical analysis in Sect. 3: Based on data from the NASDAQ stock market, we show that the relation proposed by Benzaquen et al. [6] fits the data rather well. In Sect. 4, we take a closer look at volatility and analyze implications of different time scalings thereof. We conclude with some empirical results in this respect. A reminder on the Pi-theorem from dimensional analysis as well as proofs for all considered relations can be found in the Appendix.

The trading invariance principle
We are interested in explaining the arrival rate of trades in a given stock, measured as

• $N = N_t^{t+T}$, the number of trades within a fixed time interval [t, t + T], so that N is measured per unit of time. Following the notation from [26], this link between the variable N and its dimensional unit is therefore given by $[N] = T^{-1}$.

• $\sigma^2 = (\sigma^2)_t^{t+T} = \mathrm{Var}(\log(P_{t+T}) - \log(P_t))$, the variance of the log-price over the time interval [t, t + T]. We assume $[\sigma^2] = T^{-1}$.

If the price process $(P_t)_{t \ge 0}$ follows, e.g., the Black-Scholes model, see (24), we clearly find the above scaling $[\sigma^2] = T^{-1}$ and shall retain this assumption in most of the paper. However, the scaling of $\sigma^2$ turns out to be more subtle than it seems at first glance. In Sect. 4 below, we shall investigate the implications of a scaling relation $[\sigma^2] = T^{-2H}$, where H ∈ (0, 1) may be different from 1/2. For instance, such a scaling may result from price processes based on a fractional Brownian motion $(B_t^H)_{t \ge 0}$ with Hurst parameter H ∈ (0, 1), see [23].
Based on these identified dimensions, let us turn to the basic idea of dimensional analysis: the validity of a considered relation should not depend on whether we measure time T in seconds or in minutes, shares S in single shares or in packages of hundred shares, and money U in Euros or in Euro-cents.

Definition 1 (Dimensional invariance).
A function $h : \mathbb{R}^n_+ \to \mathbb{R}_+$ relating the quantity of interest U to the explanatory variables $W_1, \ldots, W_n$, i.e., $U = h(W_1, \ldots, W_n)$, is called dimensionally invariant if it is invariant under rescaling the involved dimensions (in our case S, T and U).
As a first, and rather naive, approach we analyze the assumption that the three variables $\sigma^2$, P and V fully explain the number of trades N.

Proposition 1
Assume that the number of trades N depends only on the three quantities $\sigma^2$, P and V, i.e.,

$$N = g(\sigma^2, P, V), \tag{1}$$

where the function $g : \mathbb{R}^3_+ \to \mathbb{R}_+$ is dimensionally invariant. Then, there is a constant c > 0 such that the number of trades N obeys the relation

$$N = c \cdot \sigma^2. \tag{2}$$

The proof relies on elementary linear algebra and is given in Appendix B below (compare also the proof of Theorem 1 below, which is similar). Recall that relation (2) goes back to Jones et al. [16]. As mentioned in the introduction, one should read the present "dimensional" argument in favor of relation (2) as a pure "if ... then ..." assertion: if N really is fully explained by $\sigma^2$, P and V and the obvious scaling invariances of S, T and U are satisfied, then (2) is the only possible relation. As we shall see below, the empirical data does not confirm the validity of (2). In other words, we have to turn the above statement upside down: as (2) is not confirmed by empirical data, the variables $\sigma^2$, P and V cannot fully explain the quantity N. It is therefore natural to introduce more or other quantities in order to explain the number of trades N.
Regarding the uniqueness of the function g in (1), the mathematical reason for the unique choice of g given by (2) is that we have three scaling relations (pertaining to the invariance of the "dimensions" S, U and T) as well as the three explanatory variables $\sigma^2$, P and V. This leads to three linear equations in three unknowns, yielding a unique solution.
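The "three linear equations in three unknowns" can be written out and solved explicitly. The following is a minimal sketch, not the authors' Appendix B proof: it assumes the power-law ansatz $N = c\,(\sigma^2)^{y_1} P^{y_2} V^{y_3}$ and the dimensional exponents implied by the text (P in money per share, V in shares per unit of time, $\sigma^2$ and N per unit of time). The helper `solve_linear` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a square and invertible)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]


# Columns: exponents (y1, y2, y3) of sigma^2, P, V.
# Rows: shares S, money U, time T. Right-hand side: exponents of N, i.e. T^-1.
A = [
    [0, -1, 1],   # S: P is money per share, V is shares per unit of time
    [0, 1, 0],    # U: only P carries a money unit
    [-1, 0, -1],  # T: sigma^2 and V are measured per unit of time
]
b = [0, 0, -1]

exponents = solve_linear(A, b)
print(exponents)
```

Under these assumptions the only admissible exponents are $y_1 = 1$ and $y_2 = y_3 = 0$, which is the relation attributed to Jones et al. [16].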
Let us now try to go beyond the scope of relation (1) by considering further explanatory variables. Motivated by Wyart et al. [31], we consider the following quantity as relevant for the number of trades N in a given interval [t, t + T], in addition to $\sigma^2$, P and V:

• $S = S_t^{t+T}$, the average bid-ask spread in the interval [t, t + T], measured in units of money per share.

Following Benzaquen et al. [6], it is also convenient to alternatively consider the quantity

• $C = C_t^{t+T}$, the average cost per trade in the interval [t, t + T], measured in units of money.

To visualize things, suppose that for some stock we observe on average during the time interval [t, t + T] an ask price of EUR 12.30 and a bid price of EUR 12.20, so that the bid-ask spread S equals 10 cents. If the average trade size in the interval [t, t + T], denoted by $Q = Q_t^{t+T}$, is 500 shares, we obtain that the average cost per trade C = QS is EUR 50. A discussion of the difference between using S rather than C as an explanatory variable can be found at the end of this section. For now, let us follow Benzaquen et al. [6] for our derivation of the intraday trading invariance principle and pass to the set $\sigma^2$, P, V and C of explanatory variables, i.e.,

$$N = g(\sigma^2, P, V, C) \tag{3}$$

for some function $g : \mathbb{R}^4_+ \to \mathbb{R}_+$. As we now have four explanatory variables, the three equations yielded by the scale invariance of the dimensions S, U and T are no longer sufficient to imply an (essentially) unique solution for g. In fact, the four explanatory variables above combined with the three invariance relations pertaining to S, T and U only yield a general solution of (3) of the form

$$N = \sigma^2 f\!\left(\frac{P V}{\sigma^2 C}\right), \tag{4}$$

where $f : \mathbb{R}_+ \to \mathbb{R}_+$ is an arbitrary function whose generality cannot be restricted by only relying on arguments pertaining to dimensional analysis with respect to the three dimensions S, T and U (see Appendix B).
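To see why three unit invariances cannot pin down g once a fourth variable is added, one can check numerically that a whole family of candidate relations survives a change of units. The sketch below assumes one way of writing the general solution, $N = \sigma^2 f(PV/(\sigma^2 C))$, where $PV/(\sigma^2 C)$ is a dimensionless combination of the four explanatory variables. The function names and the sample numbers are hypothetical and purely illustrative.

```python
import math


def rescale(sigma2, P, V, C, N, shares=1.0, money=1.0, time=1.0):
    """Express the same market data in new units.

    shares/money/time give the size of the new unit in old units, e.g.
    shares=100 counts in lots of 100 shares, time=60 goes from seconds to minutes.
    """
    return (
        sigma2 * time,       # sigma^2 is per unit of time
        P * shares / money,  # P is money per share
        V * time / shares,   # V is shares per unit of time
        C / money,           # C is money
        N * time,            # N is per unit of time
    )


def general_solution(sigma2, P, V, C, f):
    """Candidate relation N = sigma^2 * f(P V / (sigma^2 C)) for an arbitrary f."""
    return sigma2 * f(P * V / (sigma2 * C))


# Made-up sample values (per second, single shares, euros).
data = dict(sigma2=4e-8, P=12.25, V=30.0, C=50.0)

checked = 0
for f in (math.sqrt, lambda x: x ** (2 / 3), math.log1p):
    n_old = general_solution(**data, f=f)
    s2, p, v, c, n_new = rescale(**data, N=n_old, shares=100, money=0.01, time=60)
    # The relation must give the rescaled N when fed the rescaled inputs.
    assert math.isclose(general_solution(s2, p, v, c, f), n_new, rel_tol=1e-9)
    checked += 1
```

All three choices of f pass the check, which illustrates that the dimensions S, T and U alone leave f undetermined.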

Hence, in order to obtain such a crisp result as in (2), an additional "dimensional invariance" is required. Kyle and Obizhaeva [19] found a remedy: a no-arbitrage type argument, referred to as "leverage neutrality". This concept is inspired by the findings of Modigliani and Miller [24] (compare [26]): Consider a stock of a company, and suppose that the company changes its capital structure by paying dividends or by raising new capital. The Modigliani-Miller theorem tells us precisely which features of the company are not affected by a change in the capital structure. This allows us to establish how certain quantities behave when varying the leverage in terms of the relation between debt and equity of a company.
From a conceptual point of view, the assumption of leverage neutrality gives a constraint on the behavior of the quantities N, $\sigma^2$, P, V, C (resp. S) in case of changing the firm's capital structure. This constraint can be understood as an additional though synthetic dimension in our analysis, which we refer to as the Modigliani-Miller "dimension" M. The Modigliani-Miller "dimension" M of a share of a company is measured in terms of the leverage L, i.e., the quantity

$$L = \frac{\text{total assets}}{\text{equity}}.$$

Multiplying L by a factor A > 1 is equivalent to paying out a fraction $(1 - A^{-1})$ of the equity as cash dividends. On the other hand, multiplying L by a factor 0 < A < 1 corresponds to raising new capital in order to increase the firm's equity by a factor $A^{-1}$. Following Kyle and Obizhaeva [19] as well as [26], we are led to the following assumption:

Leverage Neutrality Assumption ([19,26]). Scaling the Modigliani-Miller "dimension" M by a factor $A \in \mathbb{R}_+$ implies that

• N, V and C (as well as S) remain constant,
• P changes by a factor $A^{-1}$,
• $\sigma^2$ changes by a factor $A^2$.
To recapitulate: Setting A = 2 corresponds to paying out half of the equity as dividends so that each share yields a dividend of $(1 - A^{-1})P = P/2$. The stock price is, thus, multiplied by $A^{-1} = 1/2$ while the volatility $\sigma$ is multiplied by A = 2. The remaining quantities are not affected by changing the leverage, in accordance with the insight of Modigliani and Miller [24] and the recent work by Kyle and Obizhaeva [19]. The economic reason is that the value of the assets of the corresponding company and hence the associated risk does not change. We can now derive the following relation, which is the focus of the present paper. It relies on the basic fact that under the "Leverage Neutrality Assumption" we now find four linear equations in order to determine four unknowns. Note that Benzaquen et al. [6] coined this relation the "3/2-law".

Theorem 1 ((3/2)-law). Suppose the "Leverage Neutrality Assumption" holds and that the number of trades N depends only on the four quantities $\sigma^2$, P, V and C, i.e.,

$$N = g(\sigma^2, P, V, C), \tag{5}$$

where the function $g : \mathbb{R}^4_+ \to \mathbb{R}_+$ is dimensionally invariant and leverage neutral. Then, there is a constant c > 0 such that the number of trades N obeys the relation

$$N^{3/2} = c \cdot \frac{\sigma P V}{C}. \tag{6}$$

The proof follows from the general Pi-theorem reviewed in Appendix A. For the convenience of the reader, we also present a direct proof of Theorem 1. Although somewhat lengthy and repetitive, we hope that it helps the intuition.
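Table 1, which the proof refers to, is not reproduced in this text. The array below is a reconstruction and not the authors' original table: the entries for $\sigma^2$, P, V and C are read off from the row vectors $r_1, \ldots, r_4$ quoted in the proof, the M row agrees with the Leverage Neutrality Assumption, and the N column follows from N being measured per unit of time. Each entry is the exponent with which the row's dimension enters the column's variable.

```latex
\begin{array}{l|rrrr|r}
                    & \sigma^2 & P  & V  & C & N  \\ \hline
\text{S (shares)}   & 0        & -1 & 1  & 0 & 0  \\
\text{U (money)}    & 0        & 1  & 0  & 1 & 0  \\
\text{T (time)}     & -1       & 0  & -1 & 0 & -1 \\
\text{M (leverage)} & 2        & -1 & 0  & 0 & 0
\end{array}
```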
Proof of Theorem 1. First, we make the following ansatz for the function g in (5):

$$N = c \cdot (\sigma^2)^{y_1} P^{y_2} V^{y_3} C^{y_4}, \tag{7}$$

where c > 0 is a constant and $y_1, \ldots, y_4$ are unknown real numbers. Looking at the first row of Table 1 yields the relation

$$-y_2 + y_3 = 0. \tag{8}$$

Indeed, when passing to counting shares in packages of 100 units rather than in single units, the number P is replaced by 100P while the number V is replaced by V/100. Since the function g in (7) is assumed to be dimensionally invariant, g should remain unchanged by this passage, i.e.,

$$g(\sigma^2, 100P, V/100, C) = g(\sigma^2, P, V, C), \tag{9}$$

which is only possible if (8) holds true. Looking at the other rows of Table 1 we therefore get the system of linear equations

$$-y_2 + y_3 = 0, \qquad y_2 + y_4 = 0, \qquad -y_1 - y_3 = -1, \qquad 2y_1 - y_2 = 0, \tag{10}$$

with unique solution $y_1 = 1/3$, $y_2 = y_3 = 2/3$, $y_4 = -2/3$, which gives (6) as one possible solution of (5).
We still have to show the uniqueness of (6). To do so, it is convenient to pass to logarithmic coordinates: suppose that there is a function $G : \mathbb{R}^4 \to \mathbb{R}$ such that

$$\log(N) = G(\log(\sigma^2), \log(P), \log(V), \log(C)). \tag{11}$$

We have to show that $G(x_1, x_2, x_3, x_4) = y_1 x_1 + y_2 x_2 + y_3 x_3 + y_4 x_4 + \text{const}$, where $y_1, y_2, y_3, y_4$ are given by (10) and const is a real number. Denote by $r_1 := -e_2 + e_3$ the first row of Table 1, considered as a vector in $\mathbb{R}^4$, where $(e_i)_{i=1}^4$ is the canonical basis of $\mathbb{R}^4$. Similarly as in (9), the first row of Table 1 and dimensional invariance imply that

$$G(x_1, x_2 + \log(100), x_3 - \log(100), x_4) = G(x_1, x_2, x_3, x_4).$$

Clearly we can replace log(100) by any real number. Speaking abstractly, this means that $G : \mathbb{R}^4 \to \mathbb{R}$ must be constant on any straight line parallel to the vector $r_1$. A similar argument applies to $r_2 = e_2 + e_4$ and $r_4 = 2e_1 - e_2$. As regards $r_3 = -e_1 - e_3$, the situation is slightly different, as the third row of Table 1 also involves a non-zero entry of N.
The third row of Table 1 and (11) imply that for any $\lambda \in \mathbb{R}$ and any $x \in \mathbb{R}^4$,

$$G(x + \lambda r_3) = G(x) - \lambda.$$

Setting const := G(0, 0, 0, 0), we have

$$G(\lambda r_3) = \text{const} - \lambda,$$

which uniquely determines G on the one-dimensional space spanned by $r_3 = -e_1 - e_3$ in $\mathbb{R}^4$. As we have seen that G also must be constant along each line in $\mathbb{R}^4$ parallel to $r_1$, $r_2$ and $r_4$, and as $r_1, r_2, r_3, r_4$ span the entire space $\mathbb{R}^4$, we conclude that there is only one choice for the function G, up to the constant const = G(0, 0, 0, 0).
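Both halves of the proof can be checked with exact arithmetic: that the exponents of the 3/2-law satisfy the four scaling equations, and that the four row vectors span $\mathbb{R}^4$, which is what forces uniqueness. The sketch below is only a sanity check under the assumption that the rows of Table 1 are the vectors $r_1, \ldots, r_4$ quoted in the proof, with right-hand side $-1$ for the time row because N is measured per unit of time. The names `ROWS` and `det` are introduced for this illustration.

```python
from fractions import Fraction as F

# Coefficients of (y1, y2, y3, y4), the exponents of sigma^2, P, V, C,
# together with the right-hand side, i.e. the exponent of N in that dimension.
ROWS = {
    "S": ([0, -1, 1, 0], 0),    # r1 = -e2 + e3
    "U": ([0, 1, 0, 1], 0),     # r2 = e2 + e4
    "T": ([-1, 0, -1, 0], -1),  # r3 = -e1 - e3, N is per unit of time
    "M": ([2, -1, 0, 0], 0),    # r4 = 2 e1 - e2
}

# Exponents of the 3/2-law: N = c * (sigma * P * V / C)^(2/3).
y = [F(1, 3), F(2, 3), F(2, 3), F(-2, 3)]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


# Each scaling equation must hold exactly for the candidate exponents.
residuals = {
    name: sum(coeff * value for coeff, value in zip(coeffs, y)) - rhs
    for name, (coeffs, rhs) in ROWS.items()
}

# A non-zero determinant means r1, ..., r4 are linearly independent.
determinant = det([coeffs for coeffs, _ in ROWS.values()])
print(residuals, determinant)
```

A non-zero determinant corresponds to the statement that $r_1, r_2, r_3, r_4$ span the entire space, so no other exponent vector can satisfy all four equations.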
For an alternative derivation of relation (6), we pass from considering $\sigma^2$, the variability of the relative price changes, to considering $\sigma_B^2$, the variability of the absolute price changes. This will allow us to reduce the two explanatory variables $\sigma^2$ and P to one explanatory variable $\sigma_B^2 = \sigma^2 P^2$. We call $\sigma_B$ the Bachelier volatility as it corresponds to Bachelier's original model from 1900, see [5]. Recall that the dynamics of the price process $(P_t)_{t \ge 0}$ in the Black-Scholes versus the Bachelier model are

$$\frac{dP_t}{P_t} = \sigma\, dW_t \quad \text{(Black-Scholes model)}, \qquad dP_t = \sigma_B\, dW_t \quad \text{(Bachelier model)}, \tag{12}$$

where $W_t$ is a standard Brownian motion. Defining $\sigma_B = \sigma P$, the two models coincide remarkably well as long as $P_t$ does not move too much (compare e.g. [29]). We therefore define

$$\sigma_B^2 := \sigma^2 P^2.$$

A glance at Table 2 reveals that $\sigma_B^2$ has Modigliani-Miller dimension M equal to zero (just as the other variables V, C and N). This enables us to derive the assertion of Theorem 1 by using only the three obvious scaling invariances, but without imposing a priori the requirement of leverage neutrality.

Corollary 2 Suppose the number of trades N depends only on the three quantities $\sigma_B^2$, V and C, i.e.,

$$N = g(\sigma_B^2, V, C), \tag{13}$$

where the function $g : \mathbb{R}^3_+ \to \mathbb{R}_+$ is dimensionally invariant. Then, there is a constant c > 0 such that the number of trades N obeys the relation

$$N^{3/2} = c \cdot \frac{\sigma_B V}{C}. \tag{14}$$

The proof is analogous to (and even easier than) the above proof. Note that Proposition 1 and Corollary 2 both only rely on the very convincing invariance assumption with respect to S, T and U, but not on the "Leverage Neutrality Assumption".
Anticipating that relation (14) gives a superior fit to empirical data than relation (2), we can draw the following conclusion: the choice of $\sigma_B^2$, V, C as explanatory variables for the quantity N is superior to the choice $\sigma^2$, P, V made in Proposition 1 above.
Here is a "dimensional argument" why we should expect a better result from Corollary 2 as compared to Proposition 1. It follows from the very approach of dimensional analysis that everything hinges on the assumption that the chosen explanatory variables indeed "fully explain" the dependent variable. Of course, in reality such an assumption will, at best, only be approximately satisfied. The art of the game is to find a combination of explanatory variables which "best" explain the resulting variable. The choice of the variables $\sigma_B^2$, V, C as in Corollary 2 automatically implies that the "Leverage Neutrality Assumption" is satisfied as shown in Table 2. Indeed, the variables $\sigma_B^2$, V, C as well as N have a zero entry for the Modigliani-Miller dimension M. Therefore, any function relating these variables is automatically leverage neutral. This is in contrast to the choice of variables $\sigma^2$, P, V in Proposition 1, as Table 1 reveals that P and $\sigma^2$ have a non-trivial dependence on M. It follows that formula (2) does not satisfy the invariance relation dictated by the "Leverage Neutrality Assumption".
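The contrast between the two relations under a change of leverage can be made concrete with a short numerical sketch. It applies the scaling stated in the Leverage Neutrality Assumption (P divided by A, $\sigma^2$ multiplied by $A^2$, V and C unchanged) to made-up input values and compares the predicted N before and after. The function names and numbers are hypothetical, and the constant c is set to 1 for simplicity.

```python
import math


def leverage_shift(sigma2, P, V, C, A):
    """Scale the Modigliani-Miller dimension by A: P -> P / A, sigma^2 -> A^2 sigma^2."""
    return sigma2 * A ** 2, P / A, V, C


def n_proposition_1(sigma2, P, V, C, c=1.0):
    """N = c * sigma^2, the relation attributed to Jones et al. [16]."""
    return c * sigma2


def n_three_halves(sigma2, P, V, C, c=1.0):
    """N solving N^(3/2) = c * sigma * P * V / C, the 3/2-law."""
    return (c * math.sqrt(sigma2) * P * V / C) ** (2 / 3)


base = (4e-8, 12.25, 30.0, 50.0)          # made-up sigma^2, P, V, C
shifted = leverage_shift(*base, A=2.0)    # pay out half of the equity

# Under leverage neutrality N must stay constant, so the ratio should be 1.
ratio_prop1 = n_proposition_1(*shifted) / n_proposition_1(*base)
ratio_three_halves = n_three_halves(*shifted) / n_three_halves(*base)
print(ratio_prop1, ratio_three_halves)
```

With A = 2, the relation N = c σ² predicts a fourfold increase in N although N is supposed to remain constant, whereas the product σP entering the 3/2-law is unchanged.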
Finally, we examine the implications of substituting the cost per trade C by its more common counterpart, the bid-ask spread S, introduced above. In fact, in the present context it is equivalent to use either C or S as explanatory variables for the number of trades N, provided that the traded volume V is already one of the explanatory variables. Indeed, we have the relation C = SQ = SV/N since the average trade size Q in the interval [t, t + T] is given by the traded volume V divided by the number of trades N. Hence, if we know the functional relation between N and V, we also know the functional relation between N and Q and can therefore pass from S to C = SQ and vice versa. Thus, we may restate Theorem 1 (and, equivalently, Corollary 2) in terms of the bid-ask spread S rather than the cost per trade C in the following corollary.

Corollary 3
Suppose that the number of trades N depends only on the three quantities $\sigma_B^2$, V and S, i.e.,

$$N = g(\sigma_B^2, V, S), \tag{15}$$

where the function $g : \mathbb{R}^3_+ \to \mathbb{R}_+$ is dimensionally invariant. Then, there is a constant c > 0 such that the number of trades N obeys the relation

$$N = c \cdot \left(\frac{\sigma_B}{S}\right)^2. \tag{16}$$

We observe that the variables $\sigma_B^2$, V and S again have no Modigliani-Miller dimension M, i.e., they are invariant under changes of the leverage. Therefore, formula (16) satisfies the invariance principle given by the "Leverage Neutrality Assumption". We note again that, given the relations C = SQ = SV/N as well as $\sigma_B^2 = \sigma^2 P^2$, the two equations (6) and (16) are indeed equivalent.
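The claimed equivalence of the 3/2-law and the spread-based relation can be spelled out in one line. The derivation below is a sketch that only uses the identities C = SV/N and $\sigma_B = \sigma P$ stated in the text; the constant c is generic and changes from step to step only by being squared.

```latex
N^{3/2} = c\,\frac{\sigma P V}{C}
        = c\,\frac{\sigma_B V}{S V / N}
        = c\,\frac{\sigma_B N}{S}
\quad\Longrightarrow\quad
N^{1/2} = c\,\frac{\sigma_B}{S}
\quad\Longrightarrow\quad
N = c^2 \left(\frac{\sigma_B}{S}\right)^2 .
```

Reading the chain from right to left gives the converse direction, so each relation implies the other up to a redefinition of the constant.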
Relation (16) is precisely the one proposed by Wyart et al. [31]. By rearranging the terms, we find that

$$\frac{\sigma_B^2}{N} = c^{-1} S^2. \tag{17}$$

The interpretation is that the squared Bachelier volatility per trade is proportional to the square of the spread. If we elaborate further on (17), we find that

$$\frac{S}{P} = c^{1/2}\, \frac{\sigma}{\sqrt{N}}. \tag{18}$$

Without loss of generality, we can determine the price P on the left-hand side of (18) as the midquote price, i.e., the average of the best ask and bid prices. Then, S/P refers to the so-called proportional bid-ask spread, which can be used to approximate a dealer's "round trip" transaction costs. Clearly, the approximate round-trip costs increase in the volatility of a relative price change and decrease in the trading activity.

Summing up this section, we have seen that the relation $N \sim \sigma^2$ proposed by Jones et al. [16] follows from the restrictive assumption that the number of trades N only depends on the quantities $\sigma^2$, P and V as well as dimensional arguments (see Proposition 1). Going beyond the latter relation, it seems reasonable to include information concerning the bid-ask spread in our analysis. Depending on whether we choose the trading cost C or the bid-ask spread S directly, we are led to either the 3/2-law $N^{3/2} \sim \sigma P V / C$ proposed by Benzaquen et al. [6] (see Theorem 1) or to the relation $S \sim \sigma_B / \sqrt{N}$ proposed by Wyart et al. [31] (see Corollary 3). When proving the two latter relations we have seen that the assumption of leverage neutrality comes into play. Alternatively, we can also consider the product $\sigma^2 P^2$, rather than $\sigma^2$ and P separately. This consideration of the "Bachelier volatility" $\sigma_B = \sigma P$ reduces the complexity of the problem inasmuch as the assumption of leverage neutrality is no longer needed. Again, the actual validity of any of the above scaling laws should be confirmed by exhaustive empirical analysis.

Degrees of universality and relevant literature
We now turn to the empirical analysis of relation (2) as well as of the 3/2-law (6). When collecting data for the quantities N, $\sigma^2$, V, P and C, one has to specify the considered asset and the considered time period as well as the length T of the time interval over which the data is aggregated. We cannot expect that the constant c appearing in relations (2) resp. (6) is the same for each considered interval, each possible interval length and each considered asset in either one of the relations. We can only hope that a given relation holds on average. Based on the nomenclature introduced in Benzaquen et al. [6], we therefore distinguish three degrees of universality attached to the validity of relations (2) and (6). Note that this distinction does not allow for the possibility that the validity attached to a given relation changes over time, simply because we consider only one specific time period.

Let us briefly discuss the relevant empirical evidence which can be found in the literature before turning to our own empirical analysis. Andersen et al. [3] conducted an important empirical study in the present context. They test the relation

$$I = \frac{\sigma P V}{N^{3/2}}, \tag{19}$$

where I is independently and identically distributed across assets and time, for the E-mini S&P 500 futures contract. Neglecting the price P, they show that the relation $N^{3/2} \sim V\sigma$ holds when averaging within and across trading days for this particular asset. In fact, their data fits the latter relation nearly perfectly compared to the relations $V \sim \sigma^2$ resp. $N \sim \sigma^2$ proposed by Tauchen and Pitts [30] resp. Jones et al. [16].

Benzaquen et al. [6] address the same question by examining eleven additional futures contracts as well as 300 US stocks. Aiming to confirm that $\beta = 3/2$ in the relation $N^{\beta} \sim \sigma P V$, they estimate $\beta$ for each considered stock individually. They find that $\hat\beta = 1.54 \pm 0.11$, where the uncertainty here is the root mean square cross-sectional dispersion. Thus, these authors note that this provides evidence that the relation $N^{3/2} \sim \sigma P V$ holds also on the stock market and not only on the very liquid futures market. Moreover, they show that the distribution of I in (19) depends significantly on the studied asset and thus conclude that relation (19) holds only with weak universality. As an additional contribution, the authors reveal that the inclusion of the trading cost C is beneficial in the sense that their proposed invariant $I = \sigma P V C^{-1} N^{-3/2}$ is almost constant for different assets.

Finally, let us mention the evidence in the earlier work by Wyart et al. [31]. These authors show that relation (17) describes the data very well when the right level of aggregation is chosen. When examining the France Telecom stock, S and $\sigma_B / \sqrt{N}$ are averaged over two trading days, while in the case of NYSE stocks these quantities are averaged over an entire year. The constant c in relation (17) is found to lie between 1.2 and 1.6. Moreover, the authors note that the typical intraday pattern of the considered quantities is in line with (17): The U-shaped pattern of the volatility $\sigma_B$ is explained by the decline of the bid-ask spread S and an increase of the number of trades N within the trading day.

Description of data
Our empirical analysis is based on limit order book data provided by the LOBSTER database (https://lobsterdata.com). The considered sampling period begins on January 2, 2015 and ends on August 31, 2015, leaving 167 trading days. Among all NASDAQ stocks, d = 128 sufficiently liquid stocks with high market capitalizations are chosen. Stocks are considered to be "sufficiently liquid" as long as the aggregated variables (defined below) can be reasonably treated as continuously distributed, i.e., the empirical distributions of the aggregated variables do not have points with obviously concentrated mass. Observations made during the thirty minutes after the opening of the exchange as well as trading halts are removed.
Let us fix an interval length T ∈ {30, 60, 120, 180, 360} min for which a developed hypothesis is tested. For the sake of illustration, set the length of the considered time interval T to 60 min. This interval length balances the tradeoff between sufficient aggregation of the data on the one hand and some intraday variability on the other hand. As a result, we are left with n = 1002 non-overlapping time intervals of equal length T = 60 min. Let us concentrate on a specific asset i ∈ {1, ..., d} (omitting the index i for ease of notation in the remainder of Sect. 3.2) and let j ∈ {1, ..., n} refer to an arbitrary interval. Suppose the trades in the considered interval j arrive at irregularly spaced transaction times $t_1, t_2, \ldots, t_{N_j}$. Then,

• $N_j$ denotes the number of trades in the interval j,
• $Q_j = N_j^{-1} \sum_{k=1}^{N_j} Q_{t_k}$ denotes the average size of the trades in the interval j, where $Q_{t_k}$ denotes the number of shares traded at time $t_k$,
• $V_j = N_j \times Q_j$ is the traded volume in the interval j,
• $P_j = N_j^{-1} \sum_{k=1}^{N_j} P_{t_k}$ denotes the average midquote price in the interval j, where $P_{t_k} = (A_{t_k} + B_{t_k})/2$ and $A_{t_k}$ (resp. $B_{t_k}$) denotes the best ask (resp. bid) price after the transaction at time $t_k$,
• $\hat\sigma_j^2$ denotes the estimated squared volatility in the interval j,
• $S_j = N_j^{-1} \sum_{k=1}^{N_j} S_{t_k}$ denotes the average bid-ask spread in the interval j, where $S_{t_k} = A_{t_k} - B_{t_k}$ is the bid-ask spread after the transaction at time $t_k$, and
• $C_j = Q_j \times S_j$ is the cost per trade in the interval j.
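The interval-level quantities defined above can be assembled from the per-trade records in a few lines. The following is a minimal sketch using plain sample means, with a hypothetical `Trade` record and made-up numbers; it does not read the LOBSTER file format, omits the volatility estimate, and does not apply any robust averaging.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Trade:
    size: float  # shares traded at this transaction time
    ask: float   # best ask after the transaction
    bid: float   # best bid after the transaction


def aggregate(trades):
    """Compute N, Q, V, P, S and C for one interval from its trades."""
    n = len(trades)
    q = fmean(t.size for t in trades)               # average trade size
    p = fmean((t.ask + t.bid) / 2 for t in trades)  # average midquote price
    s = fmean(t.ask - t.bid for t in trades)        # average bid-ask spread
    return {"N": n, "Q": q, "V": n * q, "P": p, "S": s, "C": q * s}


# Two illustrative trades of 500 shares at an ask of 12.30 and a bid of 12.20.
example = aggregate([Trade(500, 12.30, 12.20), Trade(500, 12.30, 12.20)])
print(example)
```

With a spread of 10 cents and an average size of 500 shares, the cost per trade comes out at about 50 in money units, matching the illustrative example given earlier for C = QS.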
Note the following four details: Firstly, even though transaction times are recorded on a nanosecond level, a time-stamp $t_k$ is recorded L times $(t_k^1, \ldots, t_k^L)$ in the raw dataset when a market order is executed against L limit orders at time $t_k$. Such a multiple entry of the same time-stamp enters the number of trades $N_j$ only once (not L times). The size $Q_{t_k}$ of the trade at time $t_k$ is determined by summing the L records in the dataset, $Q_{t_k^\ell}$, $\ell = 1, \ldots, L$, i.e., $Q_{t_k} = \sum_{\ell=1}^{L} Q_{t_k^\ell}$. The midquote price $P_{t_k}$ and the bid-ask spread $S_{t_k}$ related to the merged market order of size $Q_{t_k}$ are computed as volume-weighted averages over these L records.

Secondly, the aggregated variables, i.e., the average market order size $Q_j$, the average midquote price $P_j$ and the average bid-ask spread $S_j$ of interval j, are in fact not computed by the sample averages as stated above. Since simple sample averages are sensitive with respect to outliers, e.g. huge market orders, $Q_j$, $P_j$ and $S_j$ are based on robust averages. In detail, we compute trimmed means of $Q_{t_1}, \ldots, Q_{t_{N_j}}$, $P_{t_1}, \ldots, P_{t_{N_j}}$ and $S_{t_1}, \ldots, S_{t_{N_j}}$ to obtain $Q_j$, $P_j$ and $S_j$, respectively. These trimmed means discard the upper 0.5% and the lower 0.5% of the corresponding ordered data and compute the average based on the remaining 99% of the data.
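The two preprocessing steps described here, merging records that share a time-stamp and taking trimmed means, can be sketched as follows. The tuple layout of the records and the rounding rule for how many observations the 0.5% cut removes (rounding down) are assumptions made for this illustration, since the text does not specify them.

```python
import math
from itertools import groupby


def merge_records(records):
    """Merge raw records with the same time-stamp into one trade.

    records: iterable of (timestamp, size, midquote, spread), sorted by timestamp.
    Sizes are summed; midquote and spread are volume-weighted averages.
    """
    merged = []
    for ts, group in groupby(records, key=lambda r: r[0]):
        rows = list(group)
        size = sum(r[1] for r in rows)
        mid = sum(r[1] * r[2] for r in rows) / size
        spread = sum(r[1] * r[3] for r in rows) / size
        merged.append((ts, size, mid, spread))
    return merged


def trimmed_mean(values, cut=0.005):
    """Mean after discarding the lowest and highest `cut` share of the data.

    Assumption: the number of discarded points per tail is floor(cut * n).
    """
    data = sorted(values)
    k = math.floor(cut * len(data))
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)


# One market order executed against two limit orders, then a separate trade.
raw = [(1, 100, 10.0, 0.1), (1, 300, 10.2, 0.1), (2, 50, 10.1, 0.2)]
trades = merge_records(raw)

# Two extreme observations among 1000 do not affect the trimmed mean.
robust = trimmed_mean([1.0] * 998 + [1e9, -1e9])
print(trades, robust)
```

The first two raw records collapse into a single trade of 400 shares, so the time-stamp counts once towards the number of trades.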
Thirdly, the estimated squared volatility $\hat\sigma_j^2$ is computed as realized variance in interval j,

$$\hat\sigma_j^2 = \sum_{k=2}^{N_j} \left(\log(P_{t_k}) - \log(P_{t_{k-1}})\right)^2. \tag{20}$$

The properties of the estimator $\hat\sigma_j^2$ are well understood for a variety of models for the efficient price process $(P_t)_{t \ge 0}$. For example, if the dynamics of the efficient price process follows the stochastic model $dP_t = \sigma P_t\, dW_t$, with $\sigma > 0$, the estimator $\hat\sigma_j^2$ converges weakly in probability to $\sigma^2 T$ (the quadratic variation of the increments of $(\log(P_t))_{t \ge 0}$) as the number of transactions within interval j becomes dense (as $N_j \to \infty$). The limit of $\hat\sigma_j^2$, however, does not coincide with the quadratic variation of the efficient price process if the observed midquote price is contaminated by market microstructure noise. This noise, for instance, arises from market imperfections such as price discreteness or informational content in price changes, see [7]. To check the robustness of our analysis with respect to the presence of market microstructure noise, several results below can likewise be confirmed by replacing the realized variance by the noise-robust estimator of the quadratic variation proposed in [15]. It should be noted that a distortion of the analysis by the bid-ask bounce is already avoided by considering midquote prices rather than transaction prices. The interested reader will find a gentle introduction explaining how noisy price observations erode the realized variance in [1].
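Realized variance is the sum of squared log-returns between consecutive observations, so it can be sketched in a few lines. The function below takes the midquote prices of one interval in time order; the indexing convention (returns between consecutive trades inside the interval only) is an assumption, and no noise correction is applied.

```python
import math


def realized_variance(midquotes):
    """Sum of squared log-returns of consecutive midquote prices."""
    logs = [math.log(p) for p in midquotes]
    return sum((later - earlier) ** 2 for earlier, later in zip(logs, logs[1:]))


# A price that moves up by 1% in log terms and then back down again.
prices = [100.0, 100.0 * math.exp(0.01), 100.0]
rv = realized_variance(prices)
print(rv)
```

The two log-returns of plus and minus 0.01 each contribute 0.0001, so offsetting moves add to the realized variance rather than cancelling.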
Last but not least, note that Benzaquen et al. [6] in fact define the cost per trade by $C_j = N_j^{-1} \sum_{k=1}^{N_j} Q_{t_k} S_{t_k}$. This slight difference in the definitions obviously becomes negligible if the bid-ask spread $S_{t_k}$ is constant over the entire interval $j$. As we shall see, the results presented below are robust with respect to the employed version of the cost per trade.
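The aggregation steps described in these remarks can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the number of observations dropped per tail is rounded down (the text does not say how ties or fractional counts are handled), and the realized variance is taken to be the usual sum of squared log-returns of consecutive midquotes.

```python
import math

def merge_fills(sizes, midquotes, spreads):
    """Merge the L records of one market order into a single trade.

    Returns the total size and the volume-weighted midquote and spread.
    """
    total = sum(sizes)
    vw_mid = sum(q * p for q, p in zip(sizes, midquotes)) / total
    vw_spread = sum(q * s for q, s in zip(sizes, spreads)) / total
    return total, vw_mid, vw_spread

def trimmed_mean(values, trim=0.005):
    """Average after discarding a fraction `trim` of the data in each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # dropped per tail; rounding down is an assumption
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def realized_variance(midquotes):
    """Sum of squared log-returns of consecutive midquotes (assumed definition)."""
    logs = [math.log(p) for p in midquotes]
    return sum((b - a) ** 2 for a, b in zip(logs, logs[1:]))

# One market order executed against two limit orders (made-up numbers).
size, mid, spread = merge_fills([100, 300], [10.00, 10.02], [0.01, 0.03])
```

With the default `trim=0.005`, an interval needs at least 200 trades before any observation is discarded; for shorter samples the function reduces to the plain sample mean.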

N ∼ σ² versus N^{3/2} ∼ σPV/C
To check which of the relations $N \sim \sigma^2$ and $N^{3/2} \sim \sigma PV/C$ is better supported by the data, we consider for each stock ($i = 1, \dots, d$) a multiplicative model of the form (21), where $\varepsilon_{ij}$, $j = 1, \dots, n$, is an error term that satisfies standard regularity conditions and $\alpha_i$, $\beta_i$ and $\gamma_i$ are unknown real-valued parameters. A logarithmic transformation of (21) yields the linear model (22). Since dimensional analysis imposes the restriction $\beta_i + \gamma_i = 1$ on the parameters $\beta_i$ and $\gamma_i$, the value $\gamma_i = 0$ would imply the relation $N \sim \sigma^2$, whereas $\gamma_i = 2/3$ would imply the relation $N^{3/2} \sim \sigma PV/C$ from Theorem 1. The estimation of the coefficients $\beta_i$ and $\gamma_i$ subject to the restriction $\beta_i + \gamma_i = 1$ therefore allows us to infer which of the two discussed relations is backed by stronger empirical evidence. Before turning to the constrained estimation of the parameters $\beta_i$ and $\gamma_i$, we emphasize that the functional relation between the logarithmic dependent variable $\log(N_{ij})$ and the logarithmic explanatory variable $\log(\hat\sigma_{ij} P_{ij} V_{ij} / C_{ij})$ can reasonably be assumed to be linear for all stocks $i = 1, \dots, d$. To conclude this, we have visually inspected the bivariate point clouds of the dependent and explanatory variables. Similarly, if the parameter $\gamma_i$ is equal to 2/3, then we can conclude that the 3/2-law from Theorem 1 holds. As seen in Fig. 2, the averages of the estimates $\hat\gamma_i$ (across $i$ for different $T$) are clearly much closer to 2/3 than to zero for all considered interval lengths $T$. This result supports the claim made in Sect. 2 that there is stronger empirical support for the 3/2-law (or equivalently for the relation $N \sim (\sigma P/S)^2$) than for the relation $N \sim \sigma^2$.
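The constrained estimation can be illustrated with a small sketch. The displayed model equations are not reproduced in this text, so the code assumes the log-linear form $\log N_j = \alpha + \beta \log \sigma_j^2 + \gamma \log(P_j V_j / C_j) + \varepsilon_j$, which is dimensionally consistent with the restriction $\beta + \gamma = 1$ and reproduces both limiting cases ($\gamma = 0$ and $\gamma = 2/3$). Substituting $\beta = 1 - \gamma$ reduces the problem to a simple regression of $\log N_j - \log \sigma_j^2$ on $\log(P_j V_j / C_j) - \log \sigma_j^2$. The function name and the synthetic data are hypothetical.

```python
import math

def constrained_gamma(N, sigma2, P, V, C):
    """OLS estimates (alpha, gamma) under the restriction beta + gamma = 1.

    Assumed model: log N = alpha + beta*log(sigma2) + gamma*log(P*V/C) + eps.
    """
    y = [math.log(n) - math.log(s2) for n, s2 in zip(N, sigma2)]
    x = [math.log(p * v / c) - math.log(s2)
         for p, v, c, s2 in zip(P, V, C, sigma2)]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    gamma = sxy / sxx
    alpha = y_bar - gamma * x_bar
    return alpha, gamma

# Synthetic intervals that satisfy N^(3/2) ~ sigma*P*V/C exactly (made-up numbers).
sigma2 = [1e-4, 2e-4, 4e-4, 3e-4]
P = [10.0, 20.0, 15.0, 30.0]
V = [1e5, 5e4, 2e5, 8e4]
C = [5.0, 8.0, 6.0, 10.0]
N = [2.0 * (math.sqrt(s2) * p * v / c) ** (2.0 / 3.0)
     for s2, p, v, c in zip(sigma2, P, V, C)]

alpha_hat, gamma_hat = constrained_gamma(N, sigma2, P, V, C)
```

On data generated from the 3/2-law the slope estimate recovers 2/3, whereas data generated from $N \sim \sigma^2$ would give a slope of zero.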
Regarding the robustness of this insight, we have repeated the above regression analysis for two slightly different scenarios. The first alternative replaces the realized variance in the linear model (22) by the market microstructure noise robust estimator of the quadratic variation of [15]. The dashed graphs in Fig. 2 show density estimates based on the corresponding parameter estimates $\hat\gamma_i$, $i = 1, \dots, d$. The second modification replaces the cost per trade $C_j$ in the linear model (22) by the variant of Benzaquen et al. [6]. The dotted graphs in Fig. 2 show the corresponding density estimates. Despite some deviation of the estimates $\hat\gamma_i$ in these two alternative settings from the initial one, the solid, dashed and dotted graphs document a rather similar pattern among the estimates of the parameters $\gamma_i$ for all interval lengths $T \in \{30, 60, 120, 180, 360\}$ min. These similarities lead to the conclusion that neither market microstructure noise nor the exact definition of the cost per trade erodes the overall relation between the dependent and explanatory variables. In the remaining part of the manuscript, we take a closer look at the 3/2-law and try to find reasonable explanations for the systematic deviations of the estimates $\hat\gamma_i$ from 2/3.

On the universality of the 3/2-law
In order to check the validity and universality of the 3/2-law, $N^{3/2} = c \cdot \sigma PV/C$ (or equivalently of the relation $N = c^2 \cdot (\sigma P/S)^2$), we examine the variation of the constant $c$ across assets and interval lengths. Hence, we do not rely on the estimators $\hat\gamma_i$ computed in Sect. 3.3. Instead, we compute for a fixed interval length $T$ the quantity $\hat c_i$, where $n$ is the number of non-overlapping time intervals of equal length $T$. The left panel of Fig. 3 shows the estimates $\hat c_i$ for different values of $T$. Note that the rainbow color code refers to the ordered values of $\hat c_i$ for $T = 120$ min. As we recover the same rainbow pattern for the other interval lengths $T \in \{30, 60, 180, 360\}$ min, we can conclude that there is little variation of the estimates $\hat c_i$ for a fixed stock $i$ across different interval lengths $T$. This small variation of $\hat c_i$ for fixed $i$ and varying $T \in \{30, 60, 120, 180, 360\}$ min endows the 3/2-law with a certain degree of universality. However, the cross-sectional dispersion in $\hat c_i$ across different assets $i$, i.e., the fact that, depending on the considered stock, the estimates $\hat c_i$ range from two to five, does not allow us to award the 3/2-law strong universality. Thus, we draw the same conclusion as Benzaquen et al. [6] that the 3/2-law holds with weak universality. For completeness, the kernel density estimate in the right panel of Fig. 3 illustrates the distribution of the estimates $\hat c_i$, $i = 1, \dots, d$, for $T = 120$ min.

4 A closer look at volatility
We have seen that the volatility $\sigma$ plays a dominant role in explaining the trading activity $N$. The squared volatility $\sigma^2$ of a given stock during a fixed interval $[t, t+T]$ was defined in (23) as the variance of the change of the log-price, $\log(P_{t+T}) - \log(P_t)$. When specifying the definition of $\sigma^2$ in this way we had in mind the Black-Scholes model (24), where, fixing the normalization $T = 1$, formula (23) indeed recovers the constant $\sigma$ in (24). Going beyond Black-Scholes, consider a price process of the form (25), where $(\sigma_t)_{t \ge 0}$ is an arbitrary stochastic process (satisfying suitable regularity conditions). In this case, formula (23) should, of course, be interpreted conditionally on the sigma-algebra $\mathcal{F}_t$, and we obtain a "Wald identity" for the conditional variance. This implies in particular that, as long as we are in the framework of processes of the form (25), the above chosen scaling is the only reasonable choice.
But let us have a closer look at what we are actually doing here. The above reasoning tacitly assumes that we are starting from a stochastic model of a price process. The present situation, however, dictates a different point of view: we start from empirical tick data observed during the interval $[t, t+T]$. Even when we make the heroic assumption that this data is accurately modeled, e.g. by the Black-Scholes model (24), the number $\sigma^2$ which we plug into the formula $N = g(\sigma^2, \dots)$ can only be an estimator $\hat\sigma^2$ of $\sigma^2$ obtained from the data at hand. This implies that, strictly speaking, we should write our formulas as $N = g(\hat\sigma^2, \dots)$, in dependence on the estimated squared volatility $\hat\sigma^2$. The gist of the argument is that for the purpose of dimensional analysis the relevant scaling is that of the estimator of the volatility rather than that of the true volatility (whatever this is). To be concrete, suppose that we are given price data $(P_{t_k})_{k=1,\dots,N}$ for a grid $t \le t_1 < \dots < t_N \le t + T$ in the interval $[t, t+T]$.
An obvious choice for the estimator of the squared volatility, which is also used in Sect. 3 above, is the realized-variance estimator $\hat\sigma^2$. Clearly, this estimator has the dimension $[\hat\sigma^2] = T^{-1}$ if we suppose that the typical distance $\Delta t_k = t_{k+1} - t_k$ (in absolute terms) does not depend on whether we measure time in seconds or in minutes. Hence, for the estimator $\hat\sigma^2$, the hypothesis $[\sigma^2] = T^{-1}$ underlying the dimensional analysis in Sect. 2 is satisfied. However, we can also think of other estimators. Fix $H \in (0, 1)$ and define the estimator $\hat\sigma^2(H)$ by (28). To motivate this estimator, consider the model (29), where $\sigma > 0$ is a fixed number and $(W_t^H)_{t \ge 0}$ is a fractional Brownian motion with Hurst parameter $H$, starting at $W_0^H = 0$. In this case, the estimator $\hat\sigma^2(H)$ in (28) is a consistent estimator for the parameter $\sigma^2$ in (29). But the estimator $\hat\sigma^2(H)$ now scales differently in time than the quadratic estimator $\hat\sigma^2$ (see [10,27]), namely $[\hat\sigma^2(H)] = T^{-2H}$. Models for the price process $(P_t)_{t \ge 0}$ involving fractional Brownian motion as in (29) were proposed, notably by B. Mandelbrot, more than 50 years ago [22,23], and there may be good reasons not to rule them out a priori.
Here is another example where a sub-diffusive behavior of the price process $(P_t)_{t \ge 0}$ occurs, due to a micro-structural effect: the discrete nature of the prices in the real world (compare Benzaquen et al. [6]; we thank Jean-Philippe Bouchaud for bringing this phenomenon to our attention). To present the idea in its simplest possible form, suppose that the price process $(P_t)_{t \ge 0}$ is given by $\log(P_t) = \operatorname{int}(W_t)$, where $(W_t)_{t \ge 0}$ is a standard Brownian motion and $\operatorname{int}(x)$ denotes the integer closest to the real number $x$, i.e., $\operatorname{int}(x) = \sup\{n \in \mathbb{Z} : n \le x + 0.5\}$. Fix again an interval $[t, t+T]$ and consider the quantity $\mathrm{Var}(\log(P_{t+T}) - \log(P_t))$. For small $T > 0$, we show in Appendix C that $\mathrm{Var}(\log(P_{t+T}) - \log(P_t)) \approx c \sqrt{T}$ for some constant $c > 0$. Hence, if the interval length $T$ is sufficiently small, we recover that $[\sigma^2] = T^{-1/2}$, rather than the usual scaling in the dimension time, i.e., $T^{-1}$. This observation indicates that if the interval length $T$ is small compared to the width of the price grid, i.e., the tick value, we observe a sub-diffusive behavior of the price process even if the "efficient", unobserved price process is assumed to be a diffusion. We refer to Robert and Rosenbaum [28] for a detailed discussion of how to account for the discrete nature of prices. For now, this rough argument should only serve as motivation that there might be plenty of reasons why the scaling $[\sigma^2] = T^{-1}$ is, in practical situations, not as clearly granted as it might seem at first glance.
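The square-root scaling of the rounded Brownian motion can be checked by a small Monte Carlo experiment. The sketch below is only an illustration: it assumes, as in Appendix C, that the Brownian motion starts from a uniform random variable on $[-1/2, 1/2]$, and the function names, sample size and seed are arbitrary choices. Quadrupling $T$ should roughly double the variance under $\sqrt{T}$ scaling, whereas diffusive scaling would quadruple it.

```python
import math
import random

def int_nearest(x):
    """int(x) = sup{n in Z : n <= x + 0.5}, i.e. floor(x + 0.5)."""
    return math.floor(x + 0.5)

def variance_of_rounded_increment(T, n_paths=200_000, seed=0):
    """Monte Carlo estimate of Var(int(W_T) - int(W_0)) with W_0 ~ Uni(-1/2, 1/2)."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        w0 = rng.uniform(-0.5, 0.5)
        wT = w0 + math.sqrt(T) * rng.gauss(0.0, 1.0)
        inc = int_nearest(wT) - int_nearest(w0)
        total += inc
        total_sq += inc * inc
    mean = total / n_paths
    return total_sq / n_paths - mean * mean

v1 = variance_of_rounded_increment(0.01)
v4 = variance_of_rounded_increment(0.04)
ratio = v4 / v1  # close to 2 under sqrt(T) scaling, close to 4 under linear scaling
```

The experiment only probes the regime in which $\sqrt{T}$ is small relative to the grid width of one; for large $T$ the rounding becomes negligible and the usual linear scaling reappears.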
For all these reasons we drop in this section the convenient dimensional assumption $[\sigma^2] = T^{-1}$ and assume instead that $[\sigma^2] = T^{-2H}$ for some $H \in (0, 1)$. Under this assumption, Proposition 2 yields relation (31), i.e., $N^{1+H} = c \cdot \sigma PV/C$ for some constant $c > 0$. The proof is analogous to the proof of Theorem 1 and is given in Appendix B. The hypothesis of the above proposition assumes that $H \in (0, 1)$ is known a priori. As $H$ is typically unknown in practical applications, we can ask the following question: for which $H$ does relation (31) fit the empirical data best? We address this question in the following subsection.

Empirical evidence under the H-Assumption
According to arguments from dimensional analysis, the constant $c$ and the parameter $H$ from Eq. (31) should ideally be identical for all stocks and all interval lengths $T$. The empirical results above, however, have revealed cross-sectional dispersion, which might be related to the restrictive assumption $[\sigma^2] = T^{-1}$. This motivates the empirical exercise of this section: can we determine an $H \in (0, 1)$ in (31) that minimizes the cross-sectional dispersion across the estimates of $c$?
Following Proposition 2, we therefore compute the estimates $\hat c_i(H)$ for different $H \in (0, 1)$, using the estimator $\hat\sigma_{ij}(H)$ from (28). Both variables $N_{ij}^H$ and $\hat\sigma_{ij}(H)$ increase as $H$ increases, so that it is not obvious how $\hat c_i(H)$ behaves when $H$ increases. We find empirically that the constant $\hat c_i(H)$ typically increases in $H$. Addressing the above question therefore requires a scale-invariant measure for the variation in $\hat c_i(H)$, such as the Gini coefficient, which is computed from the ordered data $x_{[1]} < x_{[2]} < \dots < x_{[n]}$.
For now, we can only speculate on reasons why the optimal $H$ is strikingly smaller than 1/2 for all interval lengths $T$. The quantity $\hat c_i(H)$ relies on tick-by-tick data, so that an obvious explanation for these unexpected optimal values of $H$ is market microstructure effects. To be more concrete, Benzaquen et al. [6] observe, similar to our results, a sub-diffusive behavior for so-called large tick futures contracts. Large tick assets are defined such that their bid-ask spread is almost always equal to one tick, see e.g. [13]. Most of the stocks in our sample can be categorized as large tick stocks based on this definition. When referring to market microstructure effects, however, it deserves to be stressed that the value $H = 1/2$ is implied by numerous models for the efficient price process $(P_t)_{t \ge 0}$ which are backed by empirical evidence and take market microstructure effects into account. Hence, the scaling of the squared volatility through time implied by $H = 1/2$ seems suitable in many applications. We also note that the Gini coefficient $G$ in Fig. 4 does not vary drastically when $H$ ranges between the optimal $H \approx 0.25$ and the traditional $H = 1/2$, namely roughly between $G = 0.12$ and $G = 0.15$. Hence, the value of $H$ does not seem to play a very significant role in explaining the heterogeneity of the values of $\hat c_i(H)$. Nevertheless, a better understanding of the behavior of $H$ seems to us a challenging topic for future research.
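The selection of $H$ by minimal dispersion can be sketched as follows. The formula for the Gini coefficient is not reproduced in this text, so the code assumes one common closed form for ordered data, $G = 2 \sum_i i\, x_{[i]} / (n \sum_i x_{[i]}) - (n+1)/n$; the function names and the numbers in the example are made up for illustration and are not estimates from the paper.

```python
def gini(values):
    """Gini coefficient of positive data (one common closed form, assumed here)."""
    x = sorted(values)
    n = len(x)
    total = sum(x)
    weighted = sum((i + 1) * xi for i, xi in enumerate(x))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def least_dispersed_H(c_hat_by_H):
    """Return the H whose cross-section of estimates has the smallest Gini coefficient."""
    return min(c_hat_by_H, key=lambda H: gini(c_hat_by_H[H]))

# Hypothetical cross-sections of c_hat_i(H) for three stocks and two values of H.
c_hat_by_H = {
    0.25: [1.0, 1.1, 1.2],
    0.50: [2.0, 3.0, 5.0],
}
best_H = least_dispersed_H(c_hat_by_H)
```

Because the Gini coefficient is invariant under rescaling of the data, the comparison across $H$ is not distorted by the fact that the level of $\hat c_i(H)$ itself changes with $H$.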

Conclusion
Finding laws relating the trading activity (defined here as the number of trades $N$ within a given time interval) to other relevant market quantities has been the subject of numerous investigations. The earliest contributions date as far back as the beginning of the 1970s. Two decades later, Jones et al. [16] suggested the relation $N \sim \sigma^2$ based on an extensive empirical study. Other landmark contributions include the relation $N \sim (\sigma P/S)^2$ of Madhavan et al. [21] and Wyart et al. [31], and the so-called 3/2-law $N^{3/2} \sim \sigma PV/C$ of Benzaquen et al. [6], which were obtained using market microstructure arguments and supported by empirical evidence. In the first part of the paper we show that all these scaling laws can be derived using arguments relying on dimensional analysis. The relation $N \sim \sigma^2$ follows from the assumption that $N$ is fully explained by the squared volatility $\sigma^2$, the asset price $P$ and the traded volume $V$, and the assumption that the relation between these quantities is invariant under changes of the dimensions shares S, time T and money U. The somewhat refined relation $N^{3/2} \sim \sigma PV/C$ is obtained when assuming that $N$ depends only on $\sigma^2$, $P$, $V$ and the cost of trading $C$, and assuming in addition that an invariance principle known as "Leverage Neutrality" holds true. This "Leverage Neutrality Assumption" can be seen as a no-arbitrage condition enabling us to obtain a unique functional relation from the assumption $N = g(\sigma^2, P, V, C)$. Substituting the quantity $C$ by the bid-ask spread $S$ in the latter assumption, we derive the relation $N \sim (\sigma P/S)^2$, which is shown to be equivalent to the 3/2-law. Alternatively, we can consider the volatility of the absolute price change instead of the relative price change, i.e., assume $N = g(\sigma^2 P^2, V, C)$ or $N = g(\sigma^2 P^2, V, S)$. This assumption simplifies the analysis in that a unique solution for $g(\cdot, \cdot, \cdot)$ can be obtained without recourse to the "Leverage Neutrality Assumption".
Since our theoretical analysis relies on a set of well-defined, but not necessarily realistic assumptions, the validity of any of the aforementioned scaling laws needs to be confirmed through an empirical analysis.
Based on data from the NASDAQ stock exchange, we provide empirical evidence that the 3/2-law N 3/2 = c · σ PV /C (or equivalently N = c 2 · (σ P/S) 2 ) fits the data clearly better than N ∼ σ 2 . In fact, the 3/2-law holds for a fixed asset and a fixed interval length. However, the estimated value of the constant c strongly depends on the considered asset. In the language of Benzaquen et al. [6], this means that the 3/2-law holds with weak universality.
Finally, we note that both our theoretical and empirical analysis relied on the assumption that the scaling of $\sigma^2$ is inversely proportional to time T. This hypothesis is clearly debatable as it tacitly assumes diffusive price behavior and ignores, e.g., the discrete nature of prices. A closer look at the scaling of $\sigma^2$ suggests the scaling $[\sigma^2] = T^{-2H}$ for some $H \in (0, 1)$ that can be seen, e.g., as the Hurst parameter of a fractional Brownian motion. Repeating our dimensional arguments, the latter scaling of $\sigma^2$ yields the relation $N^{1+H} \sim \sigma PV/C$, which reduces to the 3/2-law for $H = 1/2$. An essential drawback of this more general situation is that the parameter $H$ is unknown. We formulate an optimality criterion for the choice of $H$: it should yield the most homogeneous estimates for the proportionality coefficients $\hat c_i(H)$. A preliminary analysis suggests that, on average, the optimal $H$ is of the order 0.25, i.e., quite different from the assumption $H = 0.5$. Although the passage from $H = 0.5$ to $H \approx 0.25$ turns out to have only mild effects on the issue of universality of the corresponding laws, we believe that this phenomenon merits further investigation.

A Dimensional analysis and the Pi-Theorem
In order to formally prove the results of Sects. 2 and 4, which is done in Appendix B, we need the Pi-Theorem from dimensional analysis. For completeness, we therefore provide the following reminder of this important theorem, which can also be found in [26]. Additionally, the interested reader is referred to Chapter 1 of the book by Bluman and Kumei [8] as well as to Pobedrya and Georgievskii [25] for a historical perspective and to [11] for a purely mathematical treatment of dimensional analysis. We formalize the assumptions behind dimensional analysis in proper generality. However, for the purpose of the present paper we shall only need the degree of generality covered by Corollaries 5 and 6 below.

Assumption 1 (Dimensional analysis).
(i) Let the quantity of interest $U \in \mathbb{R}_+$ depend on $n$ quantities $W_1, \dots, W_n \in \mathbb{R}_+$, i.e., $U = h(W_1, \dots, W_n)$ for some function $h : \mathbb{R}_+^n \to \mathbb{R}_+$.
We can now state the main result from dimensional analysis (see [8]).
We shall only need the special cases k = 0 and k = 1, which are spelled out in the two subsequent corollaries.

Corollary 5
Under Assumption 1, suppose that rank(B) = n and let $y := (y_1, \dots, y_n)^\top$ be the unique solution to the linear system $By = a$. Then there is a constant const > 0 such that $U = \mathrm{const} \cdot W_1^{y_1} \cdots W_n^{y_n}$.

Corollary 6
Under Assumption 1, suppose that rank(B) = n − 1 and let $x := (x_1, \dots, x_n)^\top$ and $y := (y_1, \dots, y_n)^\top$ be non-trivial solutions to the homogeneous and inhomogeneous systems $Bx = 0$ and $By = a$, respectively. Then there is a function $f : \mathbb{R}_+ \to \mathbb{R}_+$ such that $U = f(W_1^{x_1} \cdots W_n^{x_n}) \cdot W_1^{y_1} \cdots W_n^{y_n}$.

B Proofs of Sects. 2 and 4
In this section, we provide formal arguments for the results presented in Sects. 2 and 4. The proofs are based on Corollaries 5 and 6 above.

Proof of Proposition 1
Combining relation (1) and the dimensions of the quantities $\sigma^2$, $P$, $V$ and $N$, we obtain the matrix $B$ and the vector $a$; Table 1 illustrates how $B$ and $a$ relate to the considered quantities and their dimensions. As the matrix $B$ has full rank, i.e., rank(B) = 3, applying Corollary 5 yields $N = c \cdot \sigma^{2y_1} P^{y_2} V^{y_3}$ for some constant $c > 0$, where $y = (y_1, y_2, y_3)^\top$ is the unique solution of the linear system $By = a$, which is given by $y = (1, 0, 0)^\top$.
The vector $x = (-1, 1, 1, -1)^\top$ is a solution of the homogeneous system $Bx = 0$, and the vector $y = (1, 0, 0, 0)^\top$ is a solution of the inhomogeneous system $By = a$. Thus, relation (4) follows from Corollary 6.

Proof of Theorem 1
Combining the dimensions of the quantities considered in relation (5) and the "Leverage Neutrality Assumption", we obtain the matrix $B$ and the vector $a$. As the matrix $B$ has full rank, i.e., rank(B) = 4, applying Corollary 5 yields $N = c \cdot \sigma^{2y_1} P^{y_2} V^{y_3} C^{y_4}$ for some constant $c > 0$, where $y = (y_1, y_2, y_3, y_4)^\top$ is the unique solution of the linear system $By = a$, which is given by $y = (1/3, 2/3, 2/3, -2/3)^\top$.
As the matrix $B$ has full rank, i.e., rank(B) = 3, applying Corollary 5 yields $N = c \cdot \sigma_B^{2y_1} V^{y_2} C^{y_3}$ for some constant $c > 0$, where $y = (y_1, y_2, y_3)^\top$ is the unique solution of the linear system $By = a$, which is given by $y = (1/3, 2/3, -2/3)^\top$. This shows (14).

Proof of Corollary 3
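As explained before the statement of Corollary 3, the conditions (5) and (15) are equivalent. Thus, the 3/2-law $N^{3/2} = c \cdot \sigma PV/C$ of Theorem 1 holds. Since $C = SV/N$, substituting gives $N^{1/2} = c \cdot \sigma P/S$, i.e., $N = c^2 \cdot (\sigma P/S)^2$, and the corollary follows.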

Proof of Proposition 2
The proof is the same as that of Theorem 1, except that in the present case the matrix $B$ and the vector $a$ are modified to account for the scaling of $\sigma^2$. The unique solution $y$ of the linear system $By = a$ is $y = \frac{1}{1+H} \cdot (1/2, 1, 1, -1)^\top$. Applying Corollary 5 gives the desired result.
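The stated solution can be verified with exact rational arithmetic. The matrix $B$ is not reproduced in this text, so the sketch below reconstructs a plausible version under explicit assumptions: the quantities are ordered as $(\sigma^2, P, V, C)$ with dimensions $[\sigma^2] = T^{-2H}$, $[P]$ = money per share, $[V]$ = shares per time, $[C]$ = money and $[N] = T^{-1}$, and leverage neutrality is encoded as the row $(-2, 1, 0, 0)$ with right-hand side zero. These rows are an assumption chosen to be consistent with the solutions quoted in the proofs, not a transcription of the paper's matrix.

```python
from fractions import Fraction

def exponents(H):
    """Exponents of (sigma^2, P, V, C) stated in the proof of Proposition 2."""
    scale = 1 / (1 + Fraction(H))
    return [scale * Fraction(1, 2), scale, scale, -scale]

def dimension_matrix(H):
    """Assumed dimension matrix; columns are sigma^2, P, V, C."""
    H = Fraction(H)
    return [
        [0, -1, 1, 0],       # shares
        [-2 * H, 0, -1, 0],  # time
        [0, 1, 0, 1],        # money
        [-2, 1, 0, 0],       # leverage neutrality (assumed form)
    ]

# Assumed dimensions of N: per unit time, no shares, no money, leverage neutral.
TARGET = [0, -1, 0, 0]

def solves_system(H):
    """Check B y = a for the stated exponents."""
    y = exponents(H)
    lhs = [sum(b * yi for b, yi in zip(row, y)) for row in dimension_matrix(H)]
    return lhs == TARGET

ok_half = solves_system(Fraction(1, 2))
ok_quarter = solves_system(Fraction(1, 4))
```

For $H = 1/2$ the exponents reduce to $(1/3, 2/3, 2/3, -2/3)$, the solution quoted in the proof of Theorem 1, so the 3/2-law is recovered as a special case.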

C Integer part of Brownian motion
With the notation from Sect. 4, we want to show that, as $T \searrow 0$, $\mathrm{Var}(\log(P_{t+T}) - \log(P_t)) \approx c \sqrt{T}$ for some constant $c > 0$. Recall that $(\log(P_t))_{t \ge 0}$ is given by $\log(P_t) = \operatorname{int}(W_t)$, where $(W_t)_{t \ge 0}$ is a standard Brownian motion and $\operatorname{int}(x)$ denotes the integer closest to the real number $x$, i.e., $\operatorname{int}(x) = \sup\{n \in \mathbb{Z} : n \le x + 0.5\}$. To present the idea in its simplest possible form, note that for fixed $t > 0$, say $t = 1$, and $T$ small, it is straightforward to verify that $|\log(P_{t+T}) - \log(P_t)|$ equals 0 with probability of order 1, equals 1 with probability of order $T^{1/2}$, and exceeds 1 with probability smaller than $T$, so that $\mathrm{Var}(\log(P_{t+T}) - \log(P_t))$ is of order $T^{1/2}$, as $T \searrow 0$, rather than of the usual order $T$. In the above sketchy argument we used the fact that, for every t > 0, for some constant c > 0.
To furnish a more precise result, we assume (contrary to our usual assumption $W_0 = 0$) that the Brownian motion starts from a random variable $W_0$ which is uniformly distributed on $[-1/2, 1/2]$. Then, we can formulate the following more quantitative result for fixed $t = 0$.

Proof Note that
$\mathrm{Var}(\log(P_T) - \log(P_0)) = E[(\log(P_T) - \log(P_0))^2]$, where $\log(P_0)$ is in fact zero as we assumed that $W_0 \sim \mathrm{Uni}(-1/2, 1/2)$. In the following, $(B_t)_{t \ge 0}$ denotes a standard Brownian motion starting at $B_0 = 0$ such that $W_T = B_T + W_0$. Then, $E[(\log(P_T) - \log(P_0))^2]$ can be expressed in terms of the standard normal distribution function $\Phi$. We now use the fact that, for $x \to \infty$, $\Phi(-x) \approx \varphi(x)/x$, where $\varphi(x) = \exp(-x^2/2)/\sqrt{2\pi}$ is the probability density function of the standard normal distribution (we thank Friedrich Hubalek for pointing this out to us). It follows that for small $T$ the expectation is of order $\sqrt{T}$, which concludes the proof.
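The tail approximation $\Phi(-x) \approx \varphi(x)/x$ used in the proof can be checked numerically. The following sketch relies only on the Python standard library; the function names are illustrative. It computes the ratio of the exact tail probability to the approximation, which approaches one from below as $x$ grows.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def tail_ratio(x):
    """Ratio of the exact tail Phi(-x) to the approximation phi(x)/x."""
    return normal_cdf(-x) / (normal_pdf(x) / x)

ratios = [tail_ratio(x) for x in (2.0, 5.0, 10.0)]
```

Using `math.erfc` rather than `1 - erf` avoids the loss of precision that would otherwise occur for the very small tail probabilities involved.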