Exact Solution to a Generalised Lillo–Mike–Farmer Model with Heterogeneous Order-Splitting Strategies

Sato, Yuki; Kanazawa, Kiyoshi

doi:10.1007/s10955-024-03264-1

Open access
Published: 07 May 2024

Volume 191, article number 58, (2024)

Journal of Statistical Physics

Abstract

The Lillo–Mike–Farmer (LMF) model is an established econophysics model describing the order-splitting behaviour of institutional investors in financial markets. In the original article (Lillo et al. in Phys Rev E 71:066122, 2005), LMF assumed the homogeneity of the traders’ order-splitting strategy and derived a power-law asymptotic solution to the order-sign autocorrelation function (ACF) based on several heuristic reasonings. This report proposes a generalised LMF model by incorporating the heterogeneity of traders’ order-splitting behaviour that is exactly solved without heuristics. We find that the power-law exponent in the order-sign ACF is robust for arbitrary heterogeneous order-submission probability distributions. On the other hand, the prefactor in the ACF is very sensitive to heterogeneity in trading strategies and is shown to be systematically underestimated in the original homogeneous LMF model. Our work highlights that predicting the ACF prefactor is more challenging than the ACF exponent because many microscopic details (complex ingredients in actual data analyses) start to matter.

1 Introduction

Market microstructure of financial markets has been studied quantitatively and empirically in econophysics [1,2,3,4]. Econophysicists propose various dynamical models, such as at the limit order book level (e.g., the Santa Fe model [5,6,7], the $\epsilon $-intelligence model [8], and the latent order book model [9]) and the individual-traders level (e.g., the dealer model [10,11,12,13,14]) with the hope that the statistical physics program is useful even for financial modelling. This paper focuses on a microscopic model of market order submissions proposed by Lillo, Mike, and Farmer (LMF) in 2005 [15], which was hypothetically based on the order-splitting behaviour of individual traders.

The LMF model is a stylised dynamical model to explain the persistence of market-order flows. The market order is a trading option to immediately buy or sell the stock at the best prices. The market order sign $\epsilon _t$ is defined as $\epsilon _t=+1$ ($\epsilon _t=-1$) for the buy (sell) market order at time t in the following. In financial data analyses, the binary order-sign sequence of market-order flows is known to be predictable for a long time: the autocorrelation function (ACF) of the order-sign sequence obeys the slow decay characterised by the power law, such that

$$\begin{aligned} C_\tau := \langle \epsilon _t\epsilon _{t+\tau }\rangle \simeq c_0 \tau ^{-\gamma } \>\>\> \text{ for } \text{ large } \tau , \>\>\> \gamma \in (0,1). \end{aligned}$$

(1)

Here the empirical average of any stochastic variable A is denoted by $\langle A\rangle $, $c_0$ is the ACF prefactor, and $\gamma $ is the ACF power-law exponent. This slow decay is called the long-range correlation (LRC) of the order flows, and its origin has been under debate in econophysics and market microstructure for a long time [3]. For example, some researchers state that the LRC is a consequence of herding among traders [16,17,18]. However, from the viewpoint of empirical support, the current most promising microscopic hypothesis is the order-splitting hypothesis stating that the LRC originates from the order-splitting behaviour of institutional investors. The LMF model is based on this order-splitting hypothesis in describing the LRC from the microscopic dynamics in the spirit of the statistical-physics programs.
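The empirical average in Eq. (1) can be estimated directly from a recorded order-sign sequence. The following is a minimal sketch, assuming the signs are stored as a list of +1/-1 integers; the function name `sample_acf` is a hypothetical choice for illustration and does not come from the original article.

```python
from typing import Sequence


def sample_acf(signs: Sequence[int], tau: int) -> float:
    """Estimate C_tau = <eps_t * eps_{t+tau}> by averaging over all valid t.

    As in Eq. (1), the raw product is averaged (no mean subtraction).
    """
    if tau < 0 or tau >= len(signs):
        raise ValueError("tau must satisfy 0 <= tau < len(signs)")
    n_pairs = len(signs) - tau
    return sum(signs[t] * signs[t + tau] for t in range(n_pairs)) / n_pairs
```

For a strictly alternating sequence the estimator returns a negative value at odd lags and a positive value at even lags, which is a quick sanity check before applying it to real data.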

The order-splitting hypothesis states that there are traders who split large metaorders into a long sequence of child orders. Because all the child orders share the same sign for a while, the LRC naturally appears in this scenario. The LMF model is a simple stochastic model implementing this order-splitting picture. In the original article [15], they made the following assumptions:

There are M traders in the financial markets. M is a time constant (i.e., a closed system).
All traders are order-splitting traders characterised by the identical microscopic parameters: i.e., the homogeneity in the trading strategy is assumed across all the agents.
The distribution of metaorder length L is given by the discrete Pareto distribution ($L=1,2,...$), such that
$$\begin{aligned} \rho (L) \simeq \alpha L^{-\alpha -1} \>\>\> \text{ with } \>\>\alpha \in (1,2). \end{aligned}$$
(2)
They randomly submit market orders with the same order-submission probability.

While this microscopic dynamics is described as a Markovian stochastic process (whose dimension is $2M+1$; see Sec 2.2), LMF solved this model to study the LRC in the ACF as its macroscopic dynamical behaviour with heuristic but reasonable approximations. They finally showed that the ACF asymptotically obeys the LRC asymptotics (1), and the power-law exponent $\gamma $ and the prefactor $c_0$ are given by

$$\begin{aligned} \gamma&= \alpha -1, \end{aligned}$$

(3a)

$$\begin{aligned} c_0&= \frac{1}{\alpha M^{2-\alpha }}. \end{aligned}$$

(3b)

They also showed numerically that the power-law exponent formula (3a) holds robustly even for an open-system version, where the total number of traders M fluctuates in time. Since the predictive formula (3) quantitatively connects the macroscopic LRC phenomenon to the microscopic parameters, the LMF theory is a typical statistical-physics program and is theoretically very appealing to econophysicists.

Several empirical studies support both the order-splitting hypothesis and the LMF model. While the original LMF paper could not establish their prediction (3a) at a quantitative level^{Footnote 1} due to the unavailability of high-quality microscopic datasets in 2005, Refs. [19, 20] showed with real datasets that the assumption of a power-law metaorder size distribution is plausible. In addition, Tóth et al. showed very convincing qualitative evidence in 2015 that order splitting is the main cause of the LRC by decomposing the total ACF [21]. Furthermore, Sato and Kanazawa showed crucial evidence in 2023 that the LMF prediction (3a) works precisely even at a quantitative level [22, 23] using a large microscopic dataset of the Tokyo Stock Exchange (TSE) market.

While the LMF model describes the power-law decay of the ACF well, its predictive power is expected to be limited regarding the prefactor $c_0$, because the LMF model was historically proposed to characterise the power-law exponent $\gamma $ but not the prefactor $c_0$. Indeed, during the data analyses for Refs. [22, 23], we noticed that heterogeneity of order-splitting strategies is present and that such heterogeneity can theoretically impact the prefactor $c_0$, while the power-law formula (3a) robustly holds. Given the recent breakthrough in data analyses, we believe the classical LMF theory can be updated to describe the prefactor $c_0$ better by taking into account the heterogeneity in trading strategies toward precise data calibration.

In this report, we propose a generalised LMF model by incorporating heterogeneity of order-splitting strategies. In addition, we solve the generalised LMF model exactly to show the following three properties: (i) The power-law exponent formula (3a) robustly holds true even in the presence of heterogeneous order-submission probability distributions. (ii) The prefactor formula (3b) is replaced with a new formula that is sensitive to the order-submission probability distribution. (iii) Furthermore, the classical prefactor formula (3b) systematically underestimates the actual prefactor in the presence of heterogeneity in agents. Our results imply that while the interpretation of the ACF power-law exponent is robust and straightforward, the interpretation of the ACF prefactor needs more careful investigation for data calibrations.

This report is organised as follows. Section 2 describes our model and mathematical notation with the assumption of the closed system. We show the exact solution for the generalised LMF model in Sect. 3. In Sect. 4, we study several specific but important cases with numerical verifications. Sects. 5 and 6 discuss the implication of our heterogeneous LMF formulas for realistic data calibration. We conclude our paper with some remarks in Sect. 7. At the end of this report, ten appendices follow the main text for its supplements.

2 Model

In this section, let us define the stochastic dynamics of our generalised LMF model.

2.1 Mathematical Notation

In this report, the probability density function (PDF) of a stochastic variable A is written as P(A). If the stochastic variable explicitly depends on time t, such that $A_t$, the PDF of $A_t$ is denoted by $P_t(A)$. We note that any PDF must satisfy the normalisation condition $\sum _A P(A) = 1$. We also define the cumulative distribution function (CDF) and the complementary cumulative distribution function (CCDF) by

$$\begin{aligned} P_<(A) := \sum _{A'< A} P(A'), \>\>\> P_{\ge }(A) := \sum _{A'\ge A} P(A') = 1 - \sum _{A' < A}P(A'), \end{aligned}$$

(4)

respectively. The stationary PDF and the stationary ensemble average are respectively defined by

$$\begin{aligned} P_{\textrm{st}} (A) := \lim _{t\rightarrow \infty }P_t(A), \>\>\> \langle A\rangle _{\textrm{st}} := \lim _{t\rightarrow \infty }\langle A_t\rangle = \sum _A AP_{\textrm{st}}(A), \end{aligned}$$

(5)

if $P_{\textrm{st}} (A)$ and $\langle A\rangle _{\textrm{st}}$ exist. Also, under the condition B, the conditional PDF and conditional average of A are respectively defined by

$$\begin{aligned} P(A|B) := \frac{P(A,B)}{P(B)}, \>\>\> \langle A | B\rangle = \sum _{A'} A' P(A'|B). \end{aligned}$$

(6)

2.2 Model Parameters and Variables

${\Omega }$ denotes the set of all the traders, and the system is assumed to be closed, such that $M:= |{\Omega }|=\textrm{const}$ (see Fig. 1). $|{\Omega }|$ is a positive integer, and the traders set ${\Omega }$ can be written as

$$\begin{aligned} {\Omega } = \{1,2,...,M\} \end{aligned}$$

(7)

without loss of generality. We incorporate the heterogeneity of trading strategies into our model, and the characteristic parameters of the ith trader are given by the order-submission probability $\lambda ^{(i)}$ and metaorder-length (or run-length) distribution $\rho ^{(i)}(L)$. For simplicity, we assume that the executed volume size is always the minimum unit of transactions. In other words, our model is completely characterised by the following parameter set

$$\begin{aligned} \mathcal {P} := \left( M, \{\lambda ^{(i)}\}_{i\in {\Omega }}, \{\rho ^{(i)}(L)\}_{i\in {\Omega }}\right) , \end{aligned}$$

(8)

where the submission probability and the metaorder-length distribution satisfy the normalisation of the probability

$$\begin{aligned} \sum _{i'\in {\Omega }} \lambda ^{(i')}=1, \>\>\> \sum _{L=1}^{\infty } \rho ^{(i)}(L)=1 \end{aligned}$$

(9)

for any $i \in {\Omega }$.

We next define the state variable of the ith trader. The trader i has two state variables $\epsilon ^{(i)}_t$ and $R^{(i)}_t$, representing the order-sign of the metaorder ($\epsilon ^{(i)}_t=+1$ denotes buy and $\epsilon ^{(i)}_t=-1$ denotes sell) and the remaining metaorder length, respectively. The order sign of the whole market $\epsilon _t$ is defined by $\epsilon _t:=\epsilon _t^{(\mathfrak {i}_t)}$, where $\mathfrak {i}_t$ is the trader identifier (ID) who submits the market order at time t. Thus, this system is specified by the point in the phase space

$$\begin{aligned} X_t: = \left( \epsilon _t; \epsilon _t^{(1)}, R_t^{(1)}; \dots ; \epsilon _t^{(M)}, R_t^{(M)}\right) \end{aligned}$$

(10)

and is designed as a Markovian stochastic process with dimension $2M+1$.

2.3 Stochastic Dynamics

We next proceed with the definition of the stochastic dynamics. Let $\mathfrak {i}_t$ be the stochastic variable representing the trader identifier who submits the market order at time t, such that $\mathfrak {i}_{t} \in {\Omega }$. We assume that $\mathfrak {i}_{t+1}$ obeys the PDF $\{\lambda ^{(\mathfrak {i})}\}_{\mathfrak {i}\in {\Omega }}$. In other words, the probability of $\mathfrak {i}_{t+1}$ is given by

$$\begin{aligned} P_{t+1}(\mathfrak {i}) = \lambda ^{(\mathfrak {i})} \end{aligned}$$

(11a)

as an independent and identically distributed (IID) sequence $\{\mathfrak {i}_t\}_{t}$. After the execution by the trader $\mathfrak {i}_{t+1}$, the remaining volume $R^{(\mathfrak {i}_{t+1})}_{t+1}$ decreases by one if $R^{(\mathfrak {i}_{t+1})}_{t} > 1$. If all the metaorder is executed at time $t+1$ (i.e., $R^{(\mathfrak {i}_{t+1})}_t=1$), the metaorder length and its sign are randomly reset for the trader $\mathfrak {i}_{t+1}$. In summary, the dynamics of $X_t$ is given as follows for all $i \in {\Omega }$ (see Fig. 1 for a schematic):

$$\begin{aligned} R^{(i)}_{t+1}&= {\left\{ \begin{array}{ll} R^{(i)}_t &{} \text{ if } i \ne \mathfrak {i}_{t+1} \\ R^{(i)}_t - 1 &{} \text{ if } i = \mathfrak {i}_{t+1} \text{ and } R^{(i)}_{t}>1 \\ L &{} \text{ if } i= \mathfrak {i}_{t+1} \text{ and } R^{(i)}_t=1;\,L \text{ obeys } \rho ^{(i)}(L) \end{array}\right. }, \end{aligned}$$

(11b)

$$\begin{aligned} \epsilon ^{(i)}_{t+1}&= {\left\{ \begin{array}{ll} \epsilon ^{(i)}_t &{} \text{ if } i \ne \mathfrak {i}_{t+1} \text{ or } R^{(i)}_{t}>1 \\ +1 &{} \text{ with } \text{ prob. } 1/2, \text{ if } i = \mathfrak {i}_{t+1} \text{ and } R^{(i)}_{t}=1 \\ -1 &{} \text{ with } \text{ prob. } 1/2, \text{ if } i = \mathfrak {i}_{t+1} \text{ and } R^{(i)}_{t}=1 \end{array}\right. }, \end{aligned}$$

(11c)

$$\begin{aligned} \epsilon _{t+1}&= \epsilon _{t}^{(\mathfrak {i}_{t+1})}. \end{aligned}$$

(11d)

Here the metaorder length is replenished according to the PDF $\{\rho ^{(i)}(L)\}_L$ when the previous metaorder is terminated (i.e., if $R_t^{(\mathfrak {i}_{t+1})}=1$).
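The update rules (11a)-(11d) translate almost line by line into a simulation loop. The following is a minimal sketch under stated assumptions: the function name `simulate_order_signs` and the callback `draw_length` (which should return a sample from $\rho ^{(i)}(L)$ for trader i) are hypothetical names introduced here, and every trader is initialised with a fresh metaorder, which is not the stationary state, so an initial transient should be discarded before measuring the ACF.

```python
import random
from typing import Callable, List, Sequence


def simulate_order_signs(
    lams: Sequence[float],
    draw_length: Callable[[int, random.Random], int],
    n_steps: int,
    seed: int = 0,
) -> List[int]:
    """Generate market-order signs following Eqs. (11a)-(11d).

    lams[i]     : order-submission probability of trader i (should sum to 1)
    draw_length : hypothetical helper returning a new metaorder length L >= 1
                  for trader i, i.e. a sample from rho^(i)(L)
    """
    rng = random.Random(seed)
    traders = list(range(len(lams)))
    # Assumed initial condition: each trader starts a fresh metaorder.
    sign = [rng.choice((+1, -1)) for _ in traders]
    remaining = [draw_length(i, rng) for i in traders]
    signs_out = []
    for _ in range(n_steps):
        i = rng.choices(traders, weights=lams)[0]  # Eq. (11a): IID trader choice
        signs_out.append(sign[i])                  # Eq. (11d): sign before any reset
        if remaining[i] > 1:
            remaining[i] -= 1                      # Eq. (11b): one child order executed
        else:
            remaining[i] = draw_length(i, rng)     # Eq. (11b): new metaorder length
            sign[i] = rng.choice((+1, -1))         # Eq. (11c): new random sign
    return signs_out
```

With a single trader and a constant metaorder length of three, the output consists of runs of exactly three identical signs, which gives a simple way to check the bookkeeping.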

2.4 Relationship with the Original LMF Model

Our model is a natural generalisation of the original LMF model to include the heterogeneity of the order-splitting behaviour. Indeed, our model reduces to the original LMF model by setting the parameter $\mathcal {P}$ as

$$\begin{aligned} \lambda ^{(i)} = \frac{1}{M}, \>\>\> \rho ^{(i)}(L) = \rho (L) \>\>\> \text{ for } \text{ all } i \in {\Omega } \end{aligned}$$

(12)

by removing the heterogeneity in the order-splitting strategies.

3 Exact Solutions

In this section, we derive the exact solutions to our generalised LMF model. Particularly, we are interested in the order-sign autocorrelation function (ACF) in the stationary state:

$$\begin{aligned} C_\tau := \lim _{t\rightarrow \infty }\langle \epsilon _t\epsilon _{t+\tau }\rangle = \langle \epsilon _1\epsilon _{\tau +1}\rangle _{\textrm{st}}. \end{aligned}$$

(13)

3.1 Preliminary Calculation

Before deriving the explicit formula of the exact ACF, we make a transformation of the definition of the ACF. Let us introduce a flag variable $u_{t_s,t_e}$ satisfying $u_{t_s,t_e}=1$ if the order executed at time $t=t_e$ belongs to the same metaorder as the order executed at time $t=t_s$, and $u_{t_s,t_e}=0$ otherwise. Let us introduce the conditioning on $u_{1,\tau +1}$, $\mathfrak {i}_{\tau +1}$, and $\mathfrak {i}_1$, to decompose the ACF as

$$\begin{aligned} C_\tau = \sum _{u'\in \{0,1\}}\sum _{i\in {\Omega }}\sum _{j\in {\Omega }}\langle \epsilon _1\epsilon _{\tau +1} | {u_{1,\tau +1}=u'}, \mathfrak {i}_{\tau +1}=i, \mathfrak {i}_1=j \rangle _{\textrm{st}}P({u_{1,\tau +1}=u'}, \mathfrak {i}_{\tau +1}=i, \mathfrak {i}_1=j). \end{aligned}$$

(14)

See Fig. 2 for a schematic of this decomposition. By construction, there is no correlation between the order signs belonging to two different metaorders. On the other hand, the order signs within the same metaorder are perfectly correlated. We thus obtain

$$\begin{aligned} \langle \epsilon _1\epsilon _{\tau +1} | {u_{1,\tau +1}=u'}, \mathfrak {i}_{\tau +1}=i, \mathfrak {i}_1=j \rangle _{\textrm{st}}=\delta _{u',1}. \end{aligned}$$

(15)

In addition, $\mathfrak {i}_{\tau +1}$, and $\mathfrak {i}_1$ are independently generated, and

$$\begin{aligned} P({u_{1,\tau +1}}=1, \mathfrak {i}_{\tau +1}=i, \mathfrak {i}_1=j)&= P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1}=i, \mathfrak {i}_1=j)P(\mathfrak {i}_{\tau +1}=i)P(\mathfrak {i}_1=j) \nonumber \\ {}&= P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1}=\mathfrak {i}_1=i)\left( \lambda ^{(i)}\right) ^2\delta _{i,j}. \end{aligned}$$

(16)

We obtain

$$\begin{aligned} C_\tau = \sum _{i\in {\Omega }} \left( \lambda ^{(i)}\right) ^2P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1} = \mathfrak {i}_1=i). \end{aligned}$$

(17)

We next introduce the conditioning on $R_{t=0}^{(i)}$ as

$$\begin{aligned} P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1} = \mathfrak {i}_1=i) = \sum _{R^{(i)}_0=2}^\infty P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1} = \mathfrak {i}_1=i, R_0^{(i)})P_{\textrm{st}}(R_0^{(i)}), \end{aligned}$$

(18)

where we use the identity^{Footnote 2}$P(A|B)=\sum _C P(A|B,C)P(C|B)$, and the relationships $P_\textrm{st}(R_0^{(i)} | \mathfrak {i}_{\tau +1}=\mathfrak {i}_1=i)=P_{\textrm{st}}(R_0^{(i)})$, and $P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1} = \mathfrak {i}_1=i, R_0^{(i)}=1)=0$.

The term $P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1} = \mathfrak {i}_1=i, R_0^{(i)})$ is directly related to the survival probability of a metaorder whose initial volume is $R_0^{(i)}$. Indeed, by defining $N^{(i)}_{\tau }$ as the total number of the metaorder executions by the trader i during $[1,\tau ]$, the condition of ${u_{1,\tau +1}}=1$ is equal to $R_0^{(i)}-N_{\tau }^{(i)}\ge 1$ (see Fig. 2), or equivalently,

$$\begin{aligned} P({u_{1,\tau +1}}=1 | \mathfrak {i}_{\tau +1} = \mathfrak {i}_1=i, R_0^{(i)}) = P(N^{(i)}_{\tau } \le R_0^{(i)}-1). \end{aligned}$$

(22)

In other words, this is the probability that the discrete-time Poisson counting process $N^{(i)}_{\tau }$ remains within the range $[1,R_0^{(i)}-1]$ during the time interval $[1,\tau ]$ with the initial condition $N_{t=1}^{(i)}=1$.

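Thus, by combining Eqs. (17), (18), and (22), we can exactly decompose the total ACF as

$$\begin{aligned} C_\tau = \sum _{i\in {\Omega }} \left( \lambda ^{(i)}\right) ^2\sum _{R^{(i)}_0=2}^\infty P_{\textrm{st}}(R_0^{(i)})P(N^{(i)}_{\tau } \le R_0^{(i)}-1). \end{aligned}$$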

The explicit formulas for $P_{\textrm{st}}(R_0^{(i)})$ and $P(N^{(i)}_{\tau } \le R_0^{(i)}-1)$ can be derived using the master equation approach. Regarding $P_{\textrm{st}}(R_0^{(i)})$, we obtain the following formulas (see Appendix A):

$$\begin{aligned} P_{\textrm{st}}(R_0^{(i)}) = c_R^{(i)} \rho _{\ge }^{(i)}(R_0^{(i)}),\>\>\> \rho ^{(i)}_{\ge }(L) := \sum _{L'=L}^\infty \rho ^{(i)}(L'), \>\>\> c_R^{(i)} = P_{\textrm{st}}(1):=\frac{1}{\sum _{L=1}^\infty \rho _{\ge }^{(i)}(L)}=\frac{1}{L_{\textrm{avg}}} \end{aligned}$$

(23)

with $L_{\textrm{avg}}:=\sum _{L'=1}^\infty L'\rho ^{(i)}(L')$. The metaorder survival probability $P(N^{(i)}_{\tau } \le R_0^{(i)}-1)$ is given by the sum of the binomial distribution (see Appendix B):

$$\begin{aligned} P_t(N^{(i)}) = {\left\{ \begin{array}{ll} \mathcal {B}_{t-1,\lambda ^{(i)}}(N^{(i)}-1) &{} (\text{ for } N^{(i)} \in [1, t]) \\ 0 &{} (\text{ for } N^{(i)} \not \in [1, t]) \end{array}\right. }, \>\>\> \mathcal {B}_{t,\lambda }(x) := \frac{t!}{x!(t-x)!} \lambda ^{x} \left( 1-\lambda \right) ^{t-x}. \end{aligned}$$

(19)
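The survival probability $P(N^{(i)}_{\tau } \le R_0^{(i)}-1)$ is a finite sum of the shifted binomial weights in Eq. (19), so it can be evaluated directly. The following is a small sketch for moderate $\tau $; the function names `count_pmf` and `survival_prob` are hypothetical and chosen here only for illustration.

```python
from math import comb


def count_pmf(tau: int, lam: float, n: int) -> float:
    """P_tau(N = n) from Eq. (19): N - 1 is Binomial(tau - 1, lam) on n in [1, tau]."""
    if n < 1 or n > tau:
        return 0.0
    return comb(tau - 1, n - 1) * lam ** (n - 1) * (1.0 - lam) ** (tau - n)


def survival_prob(tau: int, lam: float, r0: int) -> float:
    """P(N_tau <= r0 - 1): a metaorder with r0 remaining child orders is still alive."""
    upper = min(r0 - 1, tau)
    return sum(count_pmf(tau, lam, n) for n in range(1, upper + 1))
```

For $R_0=1$ the sum is empty and the function returns zero, in line with the relation $P({u_{1,\tau +1}}=1 | \dots , R_0^{(i)}=1)=0$ used in Eq. (18). For very large $\tau $ the binomial coefficient becomes huge, so a log-space evaluation would be preferable there.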

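In summary, by substituting Eqs. (23) and (19) into the above decomposition, we obtain the exact order-sign ACF formula in an explicit form as

$$\begin{aligned} C_\tau = \sum _{i\in {\Omega }} C_\tau ^{(i)}, \>\>\> C_\tau ^{(i)} := \left( \lambda ^{(i)}\right) ^2 c_R^{(i)}\sum _{R^{(i)}_0=2}^\infty \rho _{\ge }^{(i)}(R_0^{(i)})\sum _{N^{(i)}=1}^{R_0^{(i)}-1}P_{\tau }(N^{(i)}). \end{aligned}$$

(24)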

3.2 Remark on the Original Derivation

Let us focus on the homogeneous case $\lambda ^{(i)}=\lambda =1/M$ for all $i\in {\Omega }$. In the original LMF argument, they heuristically estimated that the original metaorder length at $\tau =1$ should obey the PDF

$$\begin{aligned} Q(L) = \frac{L\rho (L)}{\sum _{L=1}^\infty L\rho (L)} \end{aligned}$$

(25)

because a longer metaorder is likely to be observed with a higher probability. Furthermore, they assumed that the remaining metaorder length $R_{\tau =1}^{(i)}$ is uniformly distributed within [1, L]. On these heuristic but reasonable assumptions, they estimated the order-sign ACF as

$$\begin{aligned} C_{\tau }^\textrm{LMF} = \frac{1}{L_{\textrm{avg}}} \sum _{L=1}^\infty \sum _{j=1}^{L-2}\sum _{h=0}^j \rho (L) \frac{(\tau -1)!}{h!(\tau -1-h)!}\lambda ^{h+1}(1-\lambda )^{\tau -1-h}, \>\>\> L_{\textrm{avg}}:= \sum _{L=1}^\infty L\rho (L). \end{aligned}$$

(26)
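The two heuristic assumptions above (the size-biased length distribution (25) and a uniformly distributed remaining length) can be checked against the stationary distribution (23) obtained from the master equation. The sketch below assumes a metaorder-length distribution with finite support, given as a dictionary, so that all sums are finite; the function names are hypothetical. Taken literally, the two heuristics give $\sum _{L\ge R} Q(L)/L = \rho _{\ge }(R)/L_{\textrm{avg}}$, which coincides with Eq. (23).

```python
from fractions import Fraction
from typing import Dict


def stationary_from_heuristics(rho: Dict[int, Fraction]) -> Dict[int, Fraction]:
    """Size-biased length Q(L) of Eq. (25) plus a uniform residual on [1, L]."""
    l_avg = sum(length * p for length, p in rho.items())
    p_r: Dict[int, Fraction] = {}
    for length, p in rho.items():
        q = length * p / l_avg                 # Q(L), Eq. (25)
        for r in range(1, length + 1):         # residual uniform on [1, L]
            p_r[r] = p_r.get(r, 0) + q / length
    return p_r


def stationary_from_master_equation(rho: Dict[int, Fraction]) -> Dict[int, Fraction]:
    """P_st(R) = rho_>=(R) / L_avg, Eq. (23)."""
    l_avg = sum(length * p for length, p in rho.items())
    return {
        r: sum(p for length, p in rho.items() if length >= r) / l_avg
        for r in range(1, max(rho) + 1)
    }
```

Exact rational arithmetic is used so that the two dictionaries can be compared with `==` rather than with a floating-point tolerance.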

Our derivation is essentially similar to the original LMF argument. However, our derivation is more systematic and rigorous than the original LMF derivation in the sense that ours is based on the master equations without heuristic arguments.

Indeed, their heuristic formula is equivalent to ours except for a minor typo as follows: By switching the order of the sums over L and j, we obtain

$$\begin{aligned} C_{\tau }^\textrm{LMF}&= \frac{1}{L_{\textrm{avg}}} \sum _{j=1}^{\infty }\sum _{h=0}^j\sum _{L=j+2}^\infty \rho (L) \frac{(\tau -1)!}{h!(\tau -1-h)!}\lambda ^{h+1}(1-\lambda )^{\tau -1-h} \nonumber \\&= \frac{1}{L_{\textrm{avg}}} \sum _{j=1}^{\infty }\sum _{h=0}^j \rho _{\ge }(j+2) \frac{(\tau -1)!}{h!(\tau -1-h)!}\lambda ^{h+1}(1-\lambda )^{\tau -1-h} \nonumber \\&= \frac{1}{L_{\textrm{avg}}} \sum _{R_0=3}^{\infty }\sum _{N=1}^{R_0-1} \rho _{\ge }(R_0) \frac{(\tau -1)!}{(N-1)!(\tau -N)!}\lambda ^{N}(1-\lambda )^{\tau -N} \nonumber \\&= \frac{\lambda }{L_{\textrm{avg}}} \sum _{R_0=3}^{\infty }\rho _{\ge }(R_0) P(N_{\tau }\le R_0-1) \end{aligned}$$

(27)

with formal replacements of the dummy variables between the second and third lines as $j=R_0-2$ and $h=N-1$.

By the way, for the homogeneous case, our exact formula (21) reduces to

$$\begin{aligned} C^\textrm{SK}_{\tau } = \frac{\lambda }{L_{\textrm{avg}}}\sum _{R_0=2}^\infty \rho _{\ge }(R_0)P(N_{\tau }\le R_0-1). \end{aligned}$$

(28)

Therefore, the LMF estimation $C^\textrm{LMF}_{\tau }$ in Ref. [15] is consistent with our exact formula $C^\textrm{SK}_{\tau }$ for the homogeneous case except for a very minor contribution from $R_0=2$. We think this minor contribution is just a typo without significant meanings, and our formula is a natural and rigorous extension of the original LMF theory.
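The size of the $R_0=2$ contribution can be made explicit numerically. The sketch below evaluates the homogeneous sum with a selectable lower limit, assuming a metaorder-length CCDF with finite support supplied as a list; the names `acf_homogeneous` and `rho_ge` are hypothetical. With lower limit 2 it corresponds to Eq. (28), and with lower limit 3 to the last line of Eq. (27).

```python
from math import comb
from typing import Sequence


def survival_prob(tau: int, lam: float, r0: int) -> float:
    """P(N_tau <= r0 - 1) with N_tau - 1 ~ Binomial(tau - 1, lam)."""
    upper = min(r0 - 1, tau)
    return sum(
        comb(tau - 1, n - 1) * lam ** (n - 1) * (1.0 - lam) ** (tau - n)
        for n in range(1, upper + 1)
    )


def acf_homogeneous(tau: int, lam: float, rho_ge: Sequence[float], r0_min: int) -> float:
    """(lam / L_avg) * sum_{R0 >= r0_min} rho_>=(R0) * P(N_tau <= R0 - 1).

    rho_ge[k] holds rho_>=(k + 1); a finite support is an assumption of this sketch.
    """
    l_avg = sum(rho_ge)  # L_avg = sum_L rho_>=(L), cf. Eq. (23)
    total = sum(
        rho_ge[r0 - 1] * survival_prob(tau, lam, r0)
        for r0 in range(r0_min, len(rho_ge) + 1)
    )
    return lam * total / l_avg
```

The difference between the two lower limits is $(\lambda /L_{\textrm{avg}})\rho _{\ge }(2)(1-\lambda )^{\tau -1}$, which decays exponentially in $\tau $ and therefore does not affect the power-law tail.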

4 Examples and Numerical Verification

Let us derive the asymptotic behaviour of the order-sign ACF for several important cases.

4.1 Case 0: Random Traders

Let us consider the most trivial case where the trader i submits her orders as independent random variables. This case corresponds to the setting

$$\begin{aligned} \rho ^{(i)}(L) = \delta _{L,1} \>\>\> \Longleftrightarrow \>\>\> \rho _{\ge }^{(i)}(L) = \delta _{L,1} \>\>\> \text{ for } \text{ all } L\ge 1. \end{aligned}$$

(29)

From Eq. (24), since $\rho _{\ge }^{(i)}(R_0^{(i)})=0$ for all $R_0^{(i)}\ge 2$, we obtain the order-sign ACF without any correlation as

$$\begin{aligned} C_\tau ^{(i)} = 0 \>\>\> \text{ for } \tau \ge 1. \end{aligned}$$

(30)

4.2 Case 1: Exponential Metaorder Length Distribution

Let us consider the case where the metaorder length obeys the exponential law:

$$\begin{aligned} \rho _{\ge }^{(i)}(L) = e^{-(L-1)/L^{*(i)}}, \>\>\> {L^{*(i)}}>0 \>\>\> \Longleftrightarrow \>\>\> \rho ^{(i)}(L) = \left( e^{1/L^{*(i)}}-1\right) e^{-L/L^{*(i)}}. \end{aligned}$$

(31)

Note that $c_{R}^{(i)}:= 1/\sum _{L=1}^\infty \rho _{\ge }^{(i)}(L) = 1-e^{-1/L^{*(i)}}$. For this case, we obtain an exact ACF formula, such that

$$\begin{aligned} C_\tau ^{(i)} = \left( \lambda ^{(i)}\right) ^2e^{-1/L^{*(i)}}\left( 1-\lambda ^{(i)}+\lambda ^{(i)}e^{-1/L^{*(i)}}\right) ^{\tau -1} \>\>\>\text{ for } \tau \ge 1. \end{aligned}$$

(32)

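See Appendix C for the detailed derivation. This equation can be rewritten as

$$\begin{aligned} C_\tau ^{(i)} = \left( \lambda ^{(i)}\right) ^2e^{-1/L^{*(i)}}e^{-(\tau -1)/\tau ^{*(i)}}, \>\>\> \tau ^{*(i)} := -\frac{1}{\ln \left( 1-\lambda ^{(i)}+\lambda ^{(i)}e^{-1/L^{*(i)}}\right) }. \end{aligned}$$

(33)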

This implies that the exponential decay appears in the order-sign ACF as a fast-decaying tail, which is consistent with empirical observations. We numerically checked the validity of this formula as shown in Fig. 3.
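The closed form (32) can be cross-checked against a direct summation. The sketch below assumes that the contribution of trader i to the ACF is $(\lambda ^{(i)})^2 c_R^{(i)}\sum _{R_0\ge 2}\rho _{\ge }^{(i)}(R_0)P(N_\tau \le R_0-1)$, which follows from Eqs. (17), (18), (22), and (23), and it truncates the infinite sum at a hypothetical cutoff `r_max`; the function names are illustrative only.

```python
from math import comb, exp


def acf_exponential_closed_form(tau: int, lam: float, l_star: float) -> float:
    """Right-hand side of Eq. (32)."""
    q = exp(-1.0 / l_star)
    return lam ** 2 * q * (1.0 - lam + lam * q) ** (tau - 1)


def acf_exponential_by_sum(tau: int, lam: float, l_star: float, r_max: int = 2000) -> float:
    """Direct summation over R0 with rho_>=(L) = exp(-(L - 1) / l_star), Eq. (31)."""
    q = exp(-1.0 / l_star)
    c_r = 1.0 - q
    # Probability mass of N_tau on n = 1..tau, Eq. (19).
    pmf = [
        comb(tau - 1, n - 1) * lam ** (n - 1) * (1.0 - lam) ** (tau - n)
        for n in range(1, tau + 1)
    ]
    total, cdf = 0.0, 0.0
    for r0 in range(2, r_max + 1):
        n = r0 - 1
        if n <= tau:
            cdf += pmf[n - 1]          # running value of P(N_tau <= r0 - 1)
        total += q ** (r0 - 1) * cdf   # rho_>=(r0) * survival probability
    return lam ** 2 * c_r * total
```

The cutoff `r_max` only needs to be large compared with $L^{*(i)}$, because the neglected terms are bounded by a geometric tail.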

4.3 Case 2: Power-Law Metaorder Length Distribution

We next study the case where the metaorder length obeys the power law:

$$\begin{aligned} \rho ^{(i)}_{\ge }(L) = L^{-\alpha ^{(i)}} \end{aligned}$$

(34)

with a positive constant $\alpha ^{(i)}>1$. This means that the density profile is approximately given by

$$\begin{aligned} \rho ^{(i)}(L) \simeq -{\frac{d}{dL}}\rho ^{(i)}_{\ge }(L) = \alpha ^{(i)} L^{-\alpha ^{(i)}-1}. \end{aligned}$$

(35)

For this case, by using an integral approximation of the sum, we obtain

$$\begin{aligned} \frac{1}{c_R^{(i)}} = L_{\textrm{avg}} := \sum _{L=1}^\infty L \rho ^{(i)}(L) \simeq \int _1^\infty \alpha ^{(i)}L^{-\alpha ^{(i)}}dL = \frac{\alpha ^{(i)}}{\alpha ^{(i)}-1}. \end{aligned}$$

(36)
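Metaorder lengths with the CCDF (34) can be drawn by inverse-transform sampling: if U is uniform on (0, 1], then $L=\lfloor U^{-1/\alpha }\rfloor $ satisfies $P(L\ge n)=n^{-\alpha }$ for every integer $n\ge 1$. The following is a minimal sketch; the function names are hypothetical and not taken from the original article.

```python
import math
import random


def pareto_length_from_uniform(u: float, alpha: float) -> int:
    """Map u in (0, 1] to an integer L >= 1 with P(L >= n) = n ** (-alpha)."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    return math.floor(u ** (-1.0 / alpha))


def sample_pareto_length(alpha: float, rng: random.Random) -> int:
    """Draw one metaorder length; 1 - rng.random() lies in (0, 1]."""
    return pareto_length_from_uniform(1.0 - rng.random(), alpha)
```

Separating the deterministic map from the random draw makes the sampler easy to test at chosen values of u.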

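For sufficiently large $\tau \gg 1$, we asymptotically obtain

$$\begin{aligned} C_\tau ^{(i)} \simeq \frac{\left( \lambda ^{(i)}\right) ^{3-\alpha ^{(i)}}}{\alpha ^{(i)}}\tau ^{-\alpha ^{(i)}+1}. \end{aligned}$$

(37)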

For the detailed derivation, see Appendix D.

4.4 ACF Formula with Heterogeneous Strategies

Let us summarise the above formulas regarding the heterogeneity of the order-splitting strategies. Let us consider a market where the following types of traders coexist:

random traders (whose set is denoted by $\Omega _{\textrm{RT}}$),
exponentially-splitting traders (whose set is denoted by $\Omega _{\textrm{ET}}$), and
power-law splitting traders (whose set is denoted by $\Omega _{\textrm{PT}}$).

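The total ACF asymptotically obeys

$$\begin{aligned} C_\tau \simeq \sum _{i\in \Omega _{\textrm{ET}}} \left( \lambda ^{(i)}\right) ^2e^{-1/L^{*(i)}}\left( 1-\lambda ^{(i)}+\lambda ^{(i)}e^{-1/L^{*(i)}}\right) ^{\tau -1} + \sum _{i\in \Omega _{\textrm{PT}}} \frac{\left( \lambda ^{(i)}\right) ^{3-\alpha ^{(i)}}}{\alpha ^{(i)}}\tau ^{-\alpha ^{(i)}+1}. \end{aligned}$$

(38)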

Thus, while we observe the fast decay characterised by the exponential law for relatively small $\tau $, the slow decay is dominant for large $\tau $. Such characters are consistent with the empirical observations.

4.4.1 Remark 1: Consistency with the Original LMF Formula for the Homogeneous Case

Let us assume that all traders are power-law splitting traders with homogeneous order-submission probability, such that

$$\begin{aligned} \alpha ^{(i)} = \alpha , \>\>\> \lambda ^{(i)} = \frac{1}{M} \>\>\> \text{ for } \text{ all } i\in \Omega _{\textrm{PT}} = {\Omega }. \end{aligned}$$

(39)

For this case, we obtain

$$\begin{aligned} C_\tau \simeq \frac{\tau ^{-\gamma }}{\alpha M^{2-\alpha }}, \>\>\> \gamma := \alpha -1, \end{aligned}$$

(40)

which is equivalent to the original LMF formula (3).

4.4.2 Remark 2: The Importance of the Minimum Power-Law Exponent $\alpha _{\min }$

The ACF is finally characterised by the power law

$$\begin{aligned} C_\tau \propto \tau ^{-\alpha _{\min }+1}\>\>\> \text{ for } \text{ large } \tau , \>\>\> \alpha _{\min } := \min _{i\in {\Omega _{\textrm{PT}}}} \alpha ^{(i)} \end{aligned}$$

(41)

Thus, $\alpha _{\min }$ is the most important parameter characterising the final asymptotic behaviour of the ACF.

This character is relevant to the data calibration. Indeed, a typical quantity that is empirically-available is the aggregated metaorder-length distribution among all the splitting traders $\Omega _{\textrm{ST}}:=\Omega _{\textrm{ET}} + \Omega _{\textrm{PT}}$, such that

$$\begin{aligned} \rho _{\textrm{ST}}^\textrm{empirical}(L) := \frac{1}{N_\textrm{tot}}\sum _{k=1}^{N_{\textrm{tot}}}\delta (L-L_k), \end{aligned}$$

(42)

where $N_{\textrm{tot}}$ is the total number of metaorder lengths and $L_k$ is the kth metaorder length among all the splitting traders. Let us decompose this aggregated empirical distribution as

$$\begin{aligned} \rho _{\textrm{ST}}^\textrm{empirical}(L) = \frac{1}{N_{\textrm{tot}}}\sum _{i\in \Omega _{\textrm{ST}}}\sum _{k=1}^{N_{\textrm{tot}}^{(i)}}\delta (L-L_k^{(i)}) = \sum _{i\in \Omega _{\textrm{ST}}}\frac{N_{\textrm{tot}}^{(i)}}{N_\textrm{tot}}\left\{ \frac{1}{N_{\textrm{tot}}^{(i)}}\sum _{k=1}^{N_\textrm{tot}^{(i)}}\delta (L-L_k^{(i)})\right\} \end{aligned}$$

(43)

with $L_{k}^{(i)}$ being the kth metaorder length of the trader i and $N_{\textrm{tot}}^{(i)}$ being the total number of metaorder lengths of the trader i. Here we use the ergodicity regarding the empirical distributions

$$\begin{aligned} \rho ^{(i)}(L) = \lim _{N_{\textrm{tot}}^{(i)} \rightarrow \infty }\frac{1}{N_\textrm{tot}^{(i)}}\sum _{k=1}^{N_{\textrm{tot}}^{(i)}}\delta (L-L_k^{(i)}). \end{aligned}$$

(44)

Also, we can evaluate the following quantities for a long-time simulation with the simulation time t as

$$\begin{aligned} N_{\textrm{tot}}^{(i)} \simeq \frac{\lambda ^{(i)}t}{\langle L_i\rangle }, \>\>\> N_{\textrm{tot}} \simeq \frac{\lambda _{\textrm{ST}} t}{\langle L\rangle }, \>\>\> \lambda _{\textrm{ST}} := \sum _{i\in \Omega _{\textrm{ST}}}\lambda ^{(i)}. \end{aligned}$$

(45)

We thus obtain

$$\begin{aligned} \rho _{\textrm{ST}}^\textrm{empirical}(L) \simeq \sum _{i\in \Omega _{\textrm{ST}}} w^{(i)}\rho ^{(i)}(L) \propto L^{-{\alpha _{\min }}-1} \>\> \text{ for } \text{ a } \text{ large } \text{ L } \text{ with } \>\> w^{(i)} := \frac{\lambda ^{(i)}}{\lambda _\textrm{ST}}\frac{\langle L\rangle }{\langle L_i\rangle }. \end{aligned}$$

(46)

This relation implies that it is acceptable to use the aggregated metaorder-length distributions among all the splitting traders in determining $\alpha _{\min }$.
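The mixture weights in Eq. (46) can be computed from the intensities and the mean metaorder lengths alone. The sketch below assumes $N_{\textrm{tot}}=\sum _{i}N_{\textrm{tot}}^{(i)}$ as in Eq. (43); combined with Eq. (45), this fixes $\langle L\rangle $ and gives $w^{(i)}=(\lambda ^{(i)}/\langle L_i\rangle )/\sum _j(\lambda ^{(j)}/\langle L_j\rangle )$. The function name `mixture_weights` is hypothetical.

```python
from typing import List, Sequence


def mixture_weights(lams: Sequence[float], mean_lengths: Sequence[float]) -> List[float]:
    """Weights w_i of Eq. (46), normalised so that they sum to one.

    lams[i]         : order-submission probability of splitting trader i
    mean_lengths[i] : mean metaorder length <L_i> of trader i (must be finite)
    """
    # lam_i / <L_i> is the rate at which trader i completes metaorders.
    rates = [lam / mean_len for lam, mean_len in zip(lams, mean_lengths)]
    total = sum(rates)
    return [rate / total for rate in rates]
```

Traders with short metaorders are over-represented in the aggregated distribution relative to their share of the order flow, because they complete more metaorders per unit time.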

We note that these formulas are derived by assuming that the sample size is infinity and the metaorder-length distributions obey the true power-law without cutoffs. Technically, the straightforward applicability of these formulas is rather limited for real data analyses, where the sample size is finite and the metaorder-length PDFs obey truncated power laws. Indeed, the numerical convergence speed of the asymptotic relation (46) was slow regarding the sample size (see Appendix E for the numerical results). However, the above analysis highlights the conceptual relation between the metaorder-length PDF for individual traders and the aggregated PDF that is empirically accessible.

5 Theoretical Discussion 1: Data Calibration Based on the Power-Law Splitting Assumption

Here we discuss the implication of our heterogeneous-LMF formula (38) for the data calibration. Particularly, in this section, we only make a simple assumption

$$\begin{aligned} \alpha ^{(i)} = \alpha \>\> \text{ for } \text{ all } i \in \Omega _{\textrm{PT}} \end{aligned}$$

(47)

with the heterogeneity included in the intensities $\{\lambda ^{(i)}\}_{i \in \Omega _{\textrm{PT}}}$ among the power-law splitting traders. Also, the total order-submission probability $\mu $ and the total number $M_{\textrm{PT}}$ of the power-law splitting traders are denoted by

$$\begin{aligned} \mu := \sum _{i \in \Omega _{\textrm{PT}}} \lambda ^{(i)}, \>\>\> M_{\textrm{PT}} := |\Omega _{\textrm{PT}}|, \end{aligned}$$

(48)

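respectively. For this case, the asymptotic behaviour is described by

$$\begin{aligned} C_{\tau } \simeq c_0^\textrm{SK} \tau ^{-\gamma }, \>\>\> \gamma =\alpha -1, \>\>\> c_0^\textrm{SK} := \frac{1}{\alpha }\sum _{i\in \Omega _{\textrm{PT}}} \left( \lambda ^{(i)}\right) ^{3-\alpha }. \end{aligned}$$

(49)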

5.1 Robust Power-Law Exponent Formula

What is the implication of the heterogeneous LMF formula (49) for the data calibration? One of the most important implications is that the power-law exponent $\gamma $ is insensitive to the heterogeneity of the order-submission probability distribution $\{\lambda ^{(i)}\}_{i \in \Omega _{\textrm{PT}}}$. This is a very important character of our heterogeneous LMF model because it implies that the power-law exponent is a very robust measurable quantity: even if the heterogeneity of the order-submission probability distribution is present, the LMF prediction $\gamma = \alpha -1$ is a reliable relationship. In real datasets, the average waiting time $\tau ^{(i)}:= 1/\lambda ^{(i)}$ is expected to be widely distributed, for example according to the power-law distribution $P(\tau ):=(1/M)\sum _{i\in \Omega _{\textrm{PT}}}\delta (\tau -\tau ^{(i)})\propto \tau ^{-\chi -1}$ for large $\tau $ with $\chi > 0$. This assumption is equivalent to the power-law peak asymptotics in the order-submission probability distribution, such that $P(\lambda )\propto \lambda ^{\chi -1}$ for small $\lambda $. The relation (49) states that such widely-distributed waiting times (or intensities) have no impact on the macroscopic power-law exponent $\gamma $ in the ACF, which is non-trivial. From this viewpoint, the heterogeneity in the LMF model is not essential in understanding the LRC; the original LMF is a sufficient model.

In addition, since the formula $\gamma =\alpha -1$ does not depend on the intensities $\{\lambda ^{(i)}\}_i$, the power-law ACF formula will hold even when the intensities have slow time dependence if $\alpha $ is time independent. Indeed, in the presence of the time inhomogeneity of $\{\lambda ^{(i)}(t)\}_i$, the ACF formula will be replaced by

$$\begin{aligned} C_{\tau } \simeq \bar{c}_0^\textrm{SK} \tau ^{-\gamma }, \>\>\> \gamma =\alpha -1, \>\>\> \bar{c}_0^\textrm{SK} := \frac{1}{\alpha T_\textrm{fin}}\sum _{i\in \Omega _{\textrm{PT}}}\int _0^{T_{\textrm{fin}}} \left( \lambda ^{(i)}(t)\right) ^{3-\alpha }dt \end{aligned}$$

(50)

with the final observation time $T_{\textrm{fin}}$, if the time dependence of $\{\lambda ^{(i)}(t)\}_i$ is sufficiently slow. Given that the intensities $\{\lambda ^{(i)}(t)\}_i$ will change day by day, it is reassuring that the power-law ACF formula holds independently of the intensities, at least for slow time variation.
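As a rough illustration of Eq. (50), the following Python sketch evaluates the time-averaged prefactor on a uniform time grid. The function name `time_averaged_prefactor`, the input layout, and the example intensities are illustrative assumptions and are not taken from the supplementary code.

```python
def time_averaged_prefactor(lambda_paths, alpha):
    """Approximate the time-averaged prefactor of Eq. (50).

    lambda_paths[i][k] is the intensity of trader i in time bin k
    (a hypothetical input format; all bins have equal width).
    """
    n_bins = len(lambda_paths[0])
    total = 0.0
    for path in lambda_paths:
        # time average of (lambda^{(i)}(t))^{3 - alpha} for one trader
        total += sum(lam ** (3.0 - alpha) for lam in path) / n_bins
    return total / alpha


# Example: two power-law splitters whose intensities drift slowly over five bins.
paths = [
    [0.18, 0.19, 0.20, 0.21, 0.22],
    [0.62, 0.61, 0.60, 0.59, 0.58],
]
c0_bar = time_averaged_prefactor(paths, alpha=1.5)
```

For intensities that are constant in time, the time average drops out and the result reduces to the static sum over $(\lambda ^{(i)})^{3-\alpha }/\alpha $.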

5.2 Non-robust Prefactor Formula

On the other hand, the prefactor $c_0^\textrm{SK}$ is very sensitive to the heterogeneity of the order-submission probability distribution $\{\lambda ^{(i)}\}_{i \in \Omega _{\textrm{PT}}}$. Indeed, the prefactor $c_0^\textrm{SK}$ is different from the homogeneous LMF formula:

$$\begin{aligned} c_0^\textrm{SK} \ne \frac{1}{\alpha M_{\textrm{PT}}^{2-\alpha }}, \end{aligned}$$

(51)

if $\lambda ^{(i)}\ne 1/M_{\textrm{PT}}$ for some $i \in \Omega _\textrm{PT}$. This implies that the interpretation of the prefactor is not straightforward because it depends sensitively on the underlying microscopic assumptions. We are sure that the homogeneous assumption on the order-submission probability distribution is unrealistic for real datasets.

Furthermore, we have assumed that the metaorder length PDF exactly obeys the Pareto distribution over the whole range. This assumption is also unrealistic. It is more realistic to assume that the power law holds only asymptotically:

$$\begin{aligned} \rho _{\ge }(L) \simeq c_{\rho } L^{-\alpha } \>\>\> \text{ for } \text{ large } L. \end{aligned}$$

(52)

Indeed, we validated this weaker assumption in our microscopic datasets of the TSE market (see Refs. [22, 23]). Under this assumption, the prefactor is slightly modified. In any case, the prefactor is sensitive to the model-specific assumptions.

5.3 Systematic Underestimation of the Prefactor by the Homogeneous LMF Model

Furthermore, our heterogeneous LMF formula (49) implies that the prefactor is systematically biased in the presence of heterogeneous intensities. To clarify this point, let us consider the homogeneous assumption on the intensities among the power-law splitting traders, such that

$$\begin{aligned} \lambda ^{(i)}_{\textrm{LMF}} = \frac{\mu }{M_{\textrm{PT}}}, \end{aligned}$$

(53)

while we allow random and exponentially-splitting traders to be present (i.e., $\mu $ can differ from unity). For this case, the prefactor is given by

$$\begin{aligned} c_0^\textrm{LMF} = \frac{\mu ^{3-\alpha }}{\alpha M_{\textrm{PT}}^{2-\alpha }}, \end{aligned}$$

(54)

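which reduces to Eq. (3b) for $\mu =1$. Here we can prove that the prefactor is systematically underestimated by the homogeneous LMF model with $\alpha \in (1,2)$:

$$\begin{aligned} c_0^\textrm{LMF} \le c_0^\textrm{SK} \le \frac{\mu ^{3-\alpha }}{\alpha }. \end{aligned}$$

(55)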

The lower-bound inequality is derived by applying Hölder’s inequality, and the upper-bound inequality is derived by mathematical induction. See Appendix F for the detailed proof. The lower-bound equality holds when the intensities are homogeneous, such that $\lambda ^{(i)}=\mu /M_\textrm{PT}$ for all $i\in \Omega _{\textrm{PT}}$. In addition, the upper-bound equality holds when there is only a single power-law splitter, $M_{\textrm{PT}}=1$.

These inequalities highlight the impact of the heterogeneous trading strategies on the prefactor estimation, and are the final main result of this report. The inequality (55) is practically relevant to the evaluation of the ACF prefactors by data calibration.
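The two bounds can be probed numerically. The sketch below is a sanity check, not a proof, and it rests on two assumptions: the heterogeneous prefactor is taken as $(1/\alpha )\sum _i (\lambda ^{(i)})^{3-\alpha }$, the form consistent with Eqs. (50) and (54), and the upper bound is taken as $\mu ^{3-\alpha }/\alpha $, the value attained for a single splitter. All function names and the random test intensities are hypothetical.

```python
import random


def prefactor_sk(lams, alpha):
    # heterogeneous prefactor, assumed form (1/alpha) * sum_i lambda_i^(3 - alpha)
    return sum(lam ** (3.0 - alpha) for lam in lams) / alpha


def prefactor_lmf(mu, m_pt, alpha):
    # homogeneous prefactor of Eq. (54)
    return mu ** (3.0 - alpha) / (alpha * m_pt ** (2.0 - alpha))


def random_intensities(m_pt, mu, rng):
    # hypothetical heterogeneous intensities that sum to mu
    weights = [rng.random() + 1e-9 for _ in range(m_pt)]
    norm = sum(weights)
    return [mu * w / norm for w in weights]


def check_bounds(n_trials=1000, seed=0):
    """Return True if both bounds hold in every random trial."""
    rng = random.Random(seed)
    tol = 1.0 + 1e-12  # guard against floating-point rounding at equality
    for _ in range(n_trials):
        alpha = rng.uniform(1.01, 1.99)
        m_pt = rng.randint(1, 20)
        mu = rng.uniform(0.1, 1.0)
        c_sk = prefactor_sk(random_intensities(m_pt, mu, rng), alpha)
        lower = prefactor_lmf(mu, m_pt, alpha)
        upper = mu ** (3.0 - alpha) / alpha
        if not (lower <= c_sk * tol and c_sk <= upper * tol):
            return False
    return True
```

Calling `check_bounds()` draws random values of $\alpha \in (1,2)$, $M_{\textrm{PT}}$, and $\mu $, and reports whether any draw violates either bound.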

5.3.1 Estimation of the Lower Bound of the Total Number of Order-Splitting Traders

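The inequality (55) is useful for the estimation of the total number of order-splitting traders. Indeed, we can estimate the lower bound of the total number of the power-law order-splitting traders as

$$\begin{aligned} M_{\textrm{PT}} \ge M_{\textrm{PT}}^\textrm{LB} := \left( \frac{\mu ^{3-\alpha }}{\alpha c_0^\textrm{SK}}\right) ^{\frac{1}{2-\alpha }} \end{aligned}$$

(56)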

by assuming $c_{0}^\textrm{SK}\simeq c_0^\textrm{dat}$, where $c_0^\textrm{dat}$ is the empirically accessible quantity. Since $\gamma $ is directly measurable from the ACF, $\alpha $ is also indirectly measurable by the relationship $\alpha =\gamma +1$. While $\mu $ is not easily accessible from public data, we assume $\mu =0.8$ because this value was typical in the TSE market from 2012 to 2020. Thus, we have approximate access to $M_{\textrm{PT}}^\textrm{LB}$.

5.3.2 How to Use the Inequality (56)

Let us discuss how to use the inequality (56) for the evaluation of the total number of traders $M_{\textrm{PT}}$ from the empirical ACF. Our question is whether $M^\textrm{LB}_{\textrm{PT}}$ carries useful information on the true value of $M_{\textrm{PT}}$. For simplicity, let us consider the case where one has to decide whether there are one or two order splitters. We also assume that the true values are given by

$$\begin{aligned} M_{\textrm{PT}}=2, \>\>\> \mu = 0.8, \>\>\> \lambda ^{(1)} = 0.2,\>\>\> \lambda ^{(2)} = 0.6,\>\>\> \alpha ^{(1)} = \alpha ^{(2)} = 1.5. \end{aligned}$$

(57)

Under these assumptions, the theoretical value of the prefactor is given by $c_0\simeq 0.37$. Using inequality (56), we can estimate the lower bound of the number of power-law splitting traders $M_{\textrm{PT}}$ as

$$\begin{aligned} M_{\textrm{PT}} \gtrsim M_{\textrm{PT}}^\textrm{LB} \approx 1.66. \end{aligned}$$

(58)

This result rejects the case $M_{\textrm{PT}}=1$ and provides a value close to $M_{\textrm{PT}}=2$. By the same reasoning, the lower bound $M^\textrm{LB}_{\textrm{PT}}$ generally carries useful information on the true value of $M_{\textrm{PT}}$ because it rejects all cases with $M_{\textrm{PT}}< M^\textrm{LB}_{\textrm{PT}}$.
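The numbers in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the prefactor is taken as $(1/\alpha )\sum _i(\lambda ^{(i)})^{3-\alpha }$, and the lower bound is taken as the value of $M_{\textrm{PT}}$ at which the homogeneous prefactor (54) equals the observed prefactor. The function names are hypothetical.

```python
def prefactor_sk(lams, alpha):
    # assumed heterogeneous prefactor: (1/alpha) * sum_i lambda_i^(3 - alpha)
    return sum(lam ** (3.0 - alpha) for lam in lams) / alpha


def lower_bound_m_pt(c0, mu, alpha):
    # solve mu^(3 - alpha) / (alpha * M^(2 - alpha)) = c0 for M
    return (mu ** (3.0 - alpha) / (alpha * c0)) ** (1.0 / (2.0 - alpha))


# Parameter values of Eq. (57)
lams = [0.2, 0.6]
alpha = 1.5
mu = sum(lams)

c0 = prefactor_sk(lams, alpha)                  # close to 0.37
m_lb = lower_bound_m_pt(c0, mu, alpha)          # bound from the unrounded prefactor
m_lb_rounded = lower_bound_m_pt(0.37, mu, alpha)  # bound from the rounded value 0.37
```

Both variants of the bound lie strictly between 1 and 2, so the conclusion of the example (rejecting $M_{\textrm{PT}}=1$) does not depend on whether the prefactor is rounded before the bound is evaluated.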

5.3.3 Remark on the Practical Interpretation of $M_{\textrm{PT}}$

In practice, it is not easy to define the total number of traders $M_{\textrm{PT}}$. We here remark on this technical issue. For example, let us consider the case where one mutual fund joins a financial market with one trading account. If we regard a trading account as the unit of the trader, the contribution of the mutual fund to $M_{\textrm{PT}}$ is one. However, a mutual fund has many clients behind it and aggregates their orders, so it might be plausible to count these "hidden" clients in the contribution to $M_{\textrm{PT}}$. We are unsure which interpretation is appropriate for the LMF calibration. This example illustrates the difficulty in defining the total number of traders in practice from the viewpoint of data analysts. In other words, if we attempt to correctly predict $M_{\textrm{PT}}$, many microscopic details start to matter. Accordingly, $M_{\textrm{PT}}^\textrm{LB}$ provides an approximate lower bound but not the exact lower bound, in light of the difficulty in defining the true $M_{\textrm{PT}}$.

6 Theoretical Discussion 2: Superposition of the Exponential Splitting Traders

In Sect. 5, we discussed the theoretical scenario in the overwhelming presence of power-law splitting traders to understand the origin of the LRC in the market-order flow. While this scenario is the most promising and plausible, here we discuss another theoretical possibility: that the LRC appears as the superposition of exponential splitting traders. In other words, we assume the absence of power-law splitting traders, $M_{\textrm{PT}}=0$, but the dominant presence of exponential splitting traders, $M_\textrm{ET}\ne 0$. Interestingly, the LMF prediction $\gamma = \alpha -1$ still holds even for this alternative scenario, suggesting the robustness of the LMF prediction.

Let us define the empirical distribution function of $(L^{*(i)}, \lambda ^{(i)})$, which characterises the exponential splitting traders:

$$\begin{aligned} P_{\textrm{ET}}(L^*,\lambda ) := \frac{1}{M_{\textrm{ET}}}\sum _{i\in \Omega _{\textrm{ET}}}\delta (L^*-L^{*(i)})\delta (\lambda -\lambda ^{(i)}), \>\>\> M_{\textrm{ET}} := |\Omega _{\textrm{ET}}|. \end{aligned}$$

(59)

For simplicity, we assume that $M_{\textrm{ET}}$ is large enough for $P_{\textrm{ET}}(L^*,\lambda )$ to be approximated as a continuous function and that $L^*$ and $\lambda $ are statistically independent, such that

$$\begin{aligned} P_{\textrm{ET}}(L^*,\lambda ) = P_{\textrm{ET}}(L^*)P_{\textrm{ET}}(\lambda ) \end{aligned}$$

(60)

with $P_{\textrm{ET}}(L^*):=(1/M_{\textrm{ET}})\sum _{i\in \Omega _{\textrm{ET}}} \delta (L^*-L^{*(i)})$ and $P_{\textrm{ET}}(\lambda ):=(1/M_\textrm{ET})\sum _{i\in \Omega _{\textrm{ET}}} \delta (\lambda -\lambda ^{(i)})$. In addition, we assume the total number of exponential splitting traders is sufficiently large $M_{\textrm{ET}}\gg 1$ and there is no single trader overwhelmingly contributing to the total market orders, such that

$$\begin{aligned} \lambda ^{(i)} \ll 1 \>\> \text{ for } \text{ all } i\in \Omega _{\textrm{ET}}. \end{aligned}$$

(61)

6.1 Scenario Based on the Fat-Tailed Decay Length Distribution

Let us focus on the strong inhomogeneity in the decay length $L^{*(i)}$, such that

$$\begin{aligned} P(L^*)\simeq ( \vartheta -1)\left( L^{*}\right) ^{-\vartheta } \end{aligned}$$

(62)

with $\vartheta \in (1,2)$. This scenario can be interpreted from financial viewpoints as follows: the typical metaorder length is assumed to be correlated with the size of the trading institutions, such that large (small) metaorders are likely to be associated with large (small) institutions. The typical lengths of metaorders are assumed to be homogeneous within the same institution (e.g., with exponential distributions). Finally, the heterogeneity in the institution sizes obeys power laws.

On this assumption, we find that both the market order ACF $C_{\tau }$ and the aggregated empirical metaorder-length PDF $\rho _{\textrm{ST}}^\textrm{empirical}(L)$, defined by Eq. (42), obey the power law (see Fig. 5; see Appendices 1 and 1 for the derivation and the technical details of numerical simulations):

6.1.1 Relationship to the Previous BBDG Model

Let us discuss the relationship to the previous model in the textbook [3] by Bouchaud, Bonart, Donier, and Gould (BBDG). BBDG proposed a variant of the LMF model (which we call the BBDG model in this article to distinguish the two models; see Appendix I) to simplify the algebraic calculations for the ACF formulas. The BBDG model is based on the assumption that the stopping probability of order splitting obeys a power law. Our scenario based on superposition of exponential splitters is essentially similar to the BBDG model. Indeed, by setting $\lambda ^{(i)}=\mu /M_{\textrm{ET}}$, we obtain

$$\begin{aligned} q_0^\textrm{BBDG} := \Gamma (\alpha )\frac{\mu ^{3-\alpha }}{M_\textrm{ET}^{2-\alpha }}, \end{aligned}$$

(64)

which is equivalent to the formula in Ref. [3] when $\mu =1$.
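The homogeneous prefactors (54) and (64) can be compared directly. The sketch below assumes, purely for illustration, that both scenarios share the same $\mu $, the same number of traders, and the same $\alpha $; under that assumption the ratio $q_0^\textrm{BBDG}/c_0^\textrm{LMF}$ equals $\alpha \Gamma (\alpha )=\Gamma (\alpha +1)$, which lies between 1 and 2 for $\alpha \in (1,2)$. The function names are hypothetical.

```python
import math


def c0_lmf(mu, m, alpha):
    # homogeneous LMF prefactor, Eq. (54)
    return mu ** (3.0 - alpha) / (alpha * m ** (2.0 - alpha))


def q0_bbdg(mu, m, alpha):
    # BBDG-type prefactor, Eq. (64)
    return math.gamma(alpha) * mu ** (3.0 - alpha) / m ** (2.0 - alpha)


def prefactor_ratio(alpha, mu=0.8, m=10):
    # mu and m cancel in the ratio; the defaults are arbitrary example values
    return q0_bbdg(mu, m, alpha) / c0_lmf(mu, m, alpha)


ratios = {alpha: prefactor_ratio(alpha) for alpha in (1.1, 1.3, 1.5, 1.7, 1.9)}
```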

6.1.2 Robustness of the Power-Law Formula

The results (63) highlight the robustness of the LMF power-law prediction $\gamma =\alpha -1$: if the empirical aggregated metaorder-length PDF $\rho _{\textrm{ST}}^\textrm{empirical}(L)$ obeys the power law with exponent $\alpha $, we can expect the ACF power-law decay with exponent $\gamma =\alpha -1$ whether $\rho _{\textrm{ST}}^\textrm{empirical}(L)$ is composed of power-law splitters or of a superposition of exponential-law splitters. This prediction is robust and insensitive to the details of the underlying microscopic dynamics, including the type of splitters. This property makes the prediction convenient and reliable for data analyses.

6.1.3 Robustness of the Prefactor Formulas

The prefactor formula (63) is very similar to the prefactor formula (49) in Sect. 5. Indeed, we have

In other words, the prefactor is smallest if and only if the submission intensities are homogeneous, such that $\lambda ^{(i)}=\mu /M_{\textrm{ET}}$ for all $i\in \Omega _{\textrm{ET}}$.

6.1.4 Remark on the Essential Similarity Between the LMF and BBDG Models

In addition, we find that the prefactor formulas between $c_0$ and $q_0$ are essentially similar in the sense that

In other words, the prefactors of the two scenarios are almost the same, differing by a factor of at most 2.

6.2 Open Question: The Power-Law Splitter Scenario vs. the Superposed Exponential-Law Splitter Scenario

We presented various theoretical scenarios to derive the market-order ACF from the order-splitting hypothesis. For example, the power-law exponent formula $\gamma =\alpha -1$ and the prefactor inequality (55) hold robustly for both scenarios: the presence of power-law splitters, and the superposition of exponential splitters with various decay lengths. Note that the power-law ACF can be derived even for the scenario of superposition of exponential splitters with various intensities (see Appendix J).

The natural question is which scenario is more plausible in reality: the power-law splitter scenario or the superposition of exponential-law splitters. While the robustness of the LMF predictions is a welcome property when applying the LMF theory, this very robustness means that verifying the LMF predictions does not by itself reject either scenario. In other words, while our previous reports [22, 23] establish the relationship $\gamma =\alpha -1$ and the inequality (55), technically, they do not distinguish between the two scenarios of power-law splitting and the superposition of exponential splitting. It is a crucial issue to reject either scenario in real data analyses. Testing these scenarios is feasible by studying the metaorder-length distributions of individual traders, which we plan as a subsequent study using the TSE microscopic dataset.

7 Conclusion

We have proposed a generalised Lillo–Mike–Farmer model by incorporating the heterogeneity of order-splitting strategies. This model is exactly solved to evaluate the impact of the heterogeneous strategies regarding both the power-law exponent and the prefactor in the order-sign autocorrelation function. Our theoretical formulas imply that (i) the power-law exponent formula $\gamma =\alpha -1$ robustly holds even in the presence of the heterogeneous intensity distributions. On the other hand, (ii) the prefactor formula is sensitive to the underlying microscopic assumptions. Indeed, the formula explicitly depends on the intensity distributions among the power-law splitting traders. Furthermore, we find that (iii) the prefactor formula for the homogeneous LMF model systematically underestimates the actual prefactor in the presence of the heterogeneous order-submission probability distributions. We believe that points (i)–(iii) are essential in examining the LMF model for data calibration.

These days, the availability of high-quality microscopic datasets has been significantly enhanced, and our recent articles [22, 23] have verified the LMF prediction quantitatively. Considering such updates from the data-analytic side, we believe that the classical LMF theory should be updated for precise empirical validation.

We must admit that our generalisation is just a first step forward for data calibration, and there is plenty of room to improve the trader model for market order submissions. While only the heterogeneity of the order-splitting strategies is included in our generalised LMF model, other features, such as the trend-following (herding) behaviour among traders, are not included. Trend-following behaviour is empirically observed at the level of individual traders [24] and can be included in the market order submission models for a more precise market description.

Data Availability

The main results in this manuscript are mathematically derived and are supplemented by numerical simulations. The original codes to reproduce the numerical results are available as supplementary information.

Notes

They showed that the theoretical line (3a) passed through the centre of the scatterplot between $\alpha $ and $\gamma $, but the regression coefficient did not agree with their prediction at all. In our view, this might be partially due to their imperfect dataset and statistical analyses at that time.
See the following derivation:
$$\begin{aligned} P(A|B) = \frac{P(A,B)}{P(B)} = \sum _{C} \frac{P(A,B,C)}{P(B)} = \sum _{C} \frac{P(A,B,C)}{P(B,C)}\frac{P(B,C)}{P(B)} = \sum _{C}P(A|B,C)P(C|B). \end{aligned}$$
(20)

References

1. Mantegna, R.N., Stanley, H.E.: Introduction to Econophysics. Cambridge University Press, Cambridge (1999)
2. Slanina, F.: Essentials of Econophysics Modelling. Oxford University Press, Oxford (2014)
3. Bouchaud, J.-P., Bonart, J., Donier, J., Gould, M.: Trades, Quotes and Prices. Cambridge University Press, Cambridge (2018)
4. Jusup, M., et al.: Social physics. Phys. Rep. 948, 1 (2022)
5. Daniels, M.G., Farmer, J.D., Gillemot, L., Iori, G., Smith, E.: Quantitative model of price diffusion and market friction based on trading as a mechanistic random process. Phys. Rev. Lett. 90, 108102 (2003)
6. Smith, E., Farmer, J.D., Gillemot, L., Krishnamurthy, S.: Statistical theory of the continuous double auction. Quant. Finance 3, 481 (2003)
7. Bouchaud, J.-P., Mézard, M., Potters, M.: Statistical properties of stock order books: empirical results and models. Quant. Finance 2, 251 (2002)
8. Tóth, B., et al.: Anomalous price impact and the critical nature of liquidity in financial markets. Phys. Rev. X 1, 021006 (2011)
9. Donier, J., Bonart, J., Mastromatteo, I., Bouchaud, J.-P.: A fully consistent, minimal model for non-linear market impact. Quant. Finance 15, 1109 (2015)
10. Takayasu, H., Miura, H., Hirabayashi, T., Hamada, K.: Statistical properties of deterministic threshold elements—the case of market price. Physica A 184, 127 (1992)
11. Sato, A.-H., Takayasu, H.: Dynamic numerical models of stock market price: from microscopic determinism to macroscopic randomness. Physica A 250, 231 (1998)
12. Kanazawa, K., Sueshige, T., Takayasu, H., Takayasu, M.: Derivation of the Boltzmann equation for financial Brownian motion: direct observation of the collective motion of high-frequency traders. Phys. Rev. Lett. 120, 138301 (2018)
13. Kanazawa, K., Sueshige, T., Takayasu, H., Takayasu, M.: Kinetic theory for financial Brownian motion from microscopic dynamics. Phys. Rev. E 98, 052317 (2018)
14. Kanazawa, K., Takayasu, H., Takayasu, M.: Exact solution to two-body financial dealer model: revisited from the viewpoint of kinetic theory. J. Stat. Phys. 190, 8 (2023)
15. Lillo, F., Mike, S., Farmer, J.D.: Theory for long memory in supply and demand. Phys. Rev. E 71, 066122 (2005)
16. LeBaron, B., Yamamoto, R.: Long-memory in an order-driven market. Physica A 383, 85 (2007)
17. LeBaron, B., Yamamoto, R.: The impact of imitation on long memory in an order-driven market. East. Econ. J. 34, 504 (2008)
18. Yamamoto, R.: Order aggressiveness, pre-trade transparency, and long memory in an order-driven market. J. Econ. Dyn. Control 35, 1938 (2011)
19. Vaglica, G., Lillo, F., Moro, E., Mantegna, R.N.: Scaling laws of strategic behavior and size heterogeneity in agent dynamics. Phys. Rev. E 77, 036110 (2008)
20. Bershova, N., Rakhlin, D.: The non-linear market impact of large trades: evidence from buy-side order flow. Quant. Finance 13, 1759 (2013)
21. Tóth, B., Palit, I., Lillo, F., Farmer, J.D.: Why is equity order flow so persistent? J. Econ. Dyn. Control 51, 218 (2015)
22. Sato, Y., Kanazawa, K.: Inferring microscopic financial information from the long memory in market-order flow: a quantitative test of the Lillo-Mike-Farmer model. Phys. Rev. Lett. 131, 197401 (2023)
23. Sato, Y., Kanazawa, K.: Quantitative statistical analysis of order-splitting behaviour of individual trading accounts in the Japanese stock market over nine years. Phys. Rev. Res. 5, 043131 (2023)
24. Sueshige, T., Kanazawa, K., Takayasu, H., Takayasu, M.: Ecology of trading strategies in a forex market for limit and market orders. PLoS ONE 13, e0208332 (2018)

Acknowledgements

YS was supported by JST SPRING (Grant Number JPMJSP2110). KK was supported by JST PRESTO (Grant Number JPMJPR20M2), JSPS KAKENHI (Grant Numbers 21H01560, 22H01141, and 23H00467), and JSPS Core-to-Core Program (Grant Number JPJSCCA20200001).

We thank Hideki Takayasu for his fruitful comment regarding the robustness of our power-law ACF formulas, particularly for the time inhomogeneity of the intensities.

We also thank Fabrizio Lillo for his suggestion to study an alternative scenario to explain the long-range correlation, which is based on the superposition of exponential splitters. Finally, we declare the author contributions. Both YS and KK contributed to all the analytical calculations. YS initially derived the asymptotic solutions of the heterogeneous LMF model and wrote the programming code. KK followed, confirmed, and fixed all the analytical calculations regarding the mathematical exactness and contributed to the ACF coefficient inequality proofs. Both YS and KK wrote the manuscript and agreed on all its contents.

Author information

Authors and Affiliations

Department of Physics, Graduate School of Science, Kyoto University, Kyoto, 606-8502, Japan
Yuki Sato & Kiyoshi Kanazawa


Corresponding author

Correspondence to Yuki Sato.

Ethics declarations

Conflict of interest

The authors do not have any potential conflicts of interest to disclose.

Additional information

Communicated by Francesco Zamponi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Derivation of the Stationary PDF for the Remaining Metaorder Length

Let us derive the stationary PDF $P_{\textrm{st}}(R^{(i)})$ for the remaining metaorder length $R^{(i)}$ via the master equation approach. The dynamics of remaining metaorder length $R^{(i)}$ is represented as

$$\begin{aligned} f(R^{(i)}_{t+1})-f(R^{(i)}_t)&= {\left\{ \begin{array}{ll} 0 &{} \text{ with } \text{ prob. } 1-\lambda ^{(i)}\\ f(R^{(i)}_t-1) - f(R^{(i)}_t) &{} \text{ with } \text{ prob. } \lambda ^{(i)} \text{ when } R^{(i)}_t>1 \\ f(L) - f(1) &{} \text{ with } \text{ prob. } \lambda ^{(i)} \text{ when } R^{(i)}_t=1 \end{array}\right. }, \end{aligned}$$

(67)

where $f(R^{(i)}_t)$ is an arbitrary function and the renewed metaorder length L obeys $\rho ^{(i)}(L)$. By taking the ensemble average of both sides of (67), we obtain an identity

$$\begin{aligned}&\sum _{R^{(i)}}\left( P_{t+1}(R^{(i)})-P_{t}(R^{(i)})\right) f(R^{(i)}) \nonumber \\ {}&\quad =\sum _{R^{(i)}}P_t(R^{(i)}) \left[ \lambda ^{(i)}\left\{ \left( f(R^{(i)}-1) - f(R^{(i)})\right) \Theta (R^{(i)}-1) + \sum _{L=1}^{\infty }\rho ^{(i)}(L)\delta _{R^{(i)},1}\left( f(L) - f(1) \right) \right\} \right] \end{aligned}$$

(68)

with the Heaviside function $\Theta (x)$ defined by

$$\begin{aligned} \Theta (x) = {\left\{ \begin{array}{ll} 1 &{} (x > 0) \\ 0 &{} (x \le 0) \end{array}\right. }. \end{aligned}$$

(69)

By setting $f(R^{(i)}_t) = \delta _{R^{(i)}_t,R}$ with a real number R, we obtain the master equation for $R>0$ as

$$\begin{aligned} \Delta _{t}P_t(R) = \lambda ^{(i)}\left\{ P_t(R+1)-P_t(R) + P_t(1)\rho ^{(i)}(R) \right\} , \end{aligned}$$

(70)

where $\Delta _{t}P_t(R):= P_{t+1}(R)-P_t(R)$.

We next use the master equation (70) to derive the stationary PDF of the remaining metaorder length. In the stationary state $\Delta _{t}P_t\left( R^{(i)}\right) = 0$, we obtain the stationary distribution $P_{\textrm{st}}(R^{(i)})$ in an exact form as

$$\begin{aligned} P_{\textrm{st}}(R^{(i)})&= c_R^{(i)} \rho _{\ge }^{(i)}(R^{(i)}) \end{aligned}$$

(71)

with the CCDF $\rho ^{(i)}_{\ge }(L)$ and the normalisation coefficient $c_R^{(i)}$ defined by

$$\begin{aligned} \rho ^{(i)}_{\ge }(L) := 1 - \sum ^{L-1}_{L'=1}\rho ^{(i)}(L') = \sum _{L'=L}^\infty \rho ^{(i)}(L'), \>\>\> c_R^{(i)} = P_{\textrm{st}}(1):=\frac{1}{\sum _{L=1}^\infty \rho _{\ge }^{(i)}(L)}. \end{aligned}$$

(72)

We note that $c_R^{(i)}$ can be transformed as

$$\begin{aligned} \frac{1}{c_R^{(i)}} = \sum _{L=1}^\infty \rho _{\ge }^{(i)}(L) = \sum _{L=1}^\infty \sum _{L'=L}^\infty \rho ^{(i)}(L') = \sum _{L'=1}^\infty \sum _{L=1}^{L'}\rho ^{(i)}(L') = \sum _{L'=1}^\infty L'\rho ^{(i)}(L') = L_{\textrm{avg}}^{(i)}.\qquad \end{aligned}$$

(73)
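The stationary solution can be checked numerically for any metaorder-length PDF with finite support. The sketch below builds $P_{\textrm{st}}$ from Eqs. (71) and (72), evaluates the residual of the stationary master equation (70), and compares $1/c_R^{(i)}$ with the mean length as in Eq. (73). The truncated power-law PDF and the function name are illustrative assumptions.

```python
def stationary_check(rho):
    """rho[k] is the probability of metaorder length L = k + 1 (finite support)."""
    n = len(rho)
    # CCDF rho_>=(L) = sum_{L' >= L} rho(L') for L = 1, ..., n
    ccdf = [sum(rho[k:]) for k in range(n)]
    l_avg = sum((k + 1) * p for k, p in enumerate(rho))
    c_r = 1.0 / sum(ccdf)              # normalisation, Eq. (72)
    p_st = [c_r * g for g in ccdf]     # stationary PDF, Eq. (71)
    # residual of Eq. (70) in the stationary state, for R = 1, ..., n
    p_ext = p_st + [0.0]               # P_st(n + 1) = 0 beyond the support
    resid = [p_ext[r + 1] - p_ext[r] + p_ext[0] * rho[r] for r in range(n)]
    return c_r, l_avg, max(abs(x) for x in resid)


# Example: power-law PDF truncated at L = 200 (arbitrary illustrative choice)
weights = [length ** -2.5 for length in range(1, 201)]
norm = sum(weights)
rho = [w / norm for w in weights]
c_r, l_avg, max_resid = stationary_check(rho)
```

The residual vanishes up to rounding, and `c_r * l_avg` is equal to one within floating-point accuracy.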

Appendix B: Derivation of the Survival Probability of the Metaorder Length

Let us derive the survival probability of the metaorder length, which is formulated as the CDF $P(N_{\tau }^{(i)}\le R_0^{(i)}-1)$ of the discrete-time Poisson counting process $N_{t}^{(i)}$ with order-submission probability $\lambda ^{(i)}$ and initial condition $N_{t=1}^{(i)}=1$.

First of all, let us derive the PDF for $N_{t}^{(i)}$ via the master equation approach. The dynamics of the Poisson counting process $N_{t}^{(i)}$ is represented as

$$\begin{aligned} \Delta _t f(N_{t}^{(i)}) = {\left\{ \begin{array}{ll} 0 &{} \text{ with } \text{ prob. } 1-\lambda ^{(i)} \\ f(N_{t}^{(i)}+1) - f(N_{t}^{(i)}) &{} \text{ with } \text{ prob. } \lambda ^{(i)}, \end{array}\right. } \end{aligned}$$

(74)

where $f(N_{t}^{(i)})$ is an arbitrary function, and $\Delta _t f(N_{t}^{(i)}):= f(N_{t+1}^{(i)})-f(N_{t}^{(i)})$. We take the ensemble average of both sides to obtain

$$\begin{aligned} \sum _{N^{(i)}} (P_{t+1}(N^{(i)})-P_{t}(N^{(i)}))f(N^{(i)}) = \sum _{N^{(i)}} \lambda ^{(i)} P_{t}(N^{(i)}) \left( f(N^{(i)}+1) - f(N^{(i)})\right) . \end{aligned}$$

(75)

By setting $f(N^{(i)})=\delta _{N^{(i)},N^*}$ with a real number $N^*$ and replacing the dummy variable $N^*$ with $N^{(i)}$, we obtain the master equation

$$\begin{aligned} \Delta _t P_t(N^{(i)}) = \lambda ^{(i)}\left( P_t(N^{(i)}-1)-P_t(N^{(i)})\right) \end{aligned}$$

(76)

with $\Delta _t P_t(N^{(i)}):=P_{t+1}(N^{(i)})-P_t(N^{(i)})$. Finally, we find that Eq. (23) satisfies the master equation (76) under the initial condition $P_{t=1}(N^{(i)}) = \delta _{N^{(i)},1}$.
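This last step can be verified numerically. The sketch below uses the shifted binomial form that appears inside Eq. (79), which we take here to be the content of Eq. (23), and evaluates the residual of the master equation (76). The function names are hypothetical.

```python
from math import comb


def pmf(t, n, lam):
    """P_t(N = n) for the counting process with initial condition N_{t=1} = 1."""
    if n < 1 or n > t:
        return 0.0
    return comb(t - 1, n - 1) * lam ** (n - 1) * (1.0 - lam) ** (t - n)


def master_residual(t, n, lam):
    """Left-hand side minus right-hand side of Eq. (76)."""
    lhs = pmf(t + 1, n, lam) - pmf(t, n, lam)
    rhs = lam * (pmf(t, n - 1, lam) - pmf(t, n, lam))
    return lhs - rhs


lam = 0.3  # arbitrary example intensity
worst = max(
    abs(master_residual(t, n, lam))
    for t in range(1, 31)
    for n in range(0, t + 3)
)
```

The residual is zero up to rounding for every tested pair of time and count, and `pmf(1, n, lam)` reproduces the initial condition $\delta _{N^{(i)},1}$.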

We note that $P(N^{(i)}_{\tau } \le R_0^{(i)}-1)$ can be simplified as

$$\begin{aligned} P(N^{(i)}_{\tau } \le R_0^{(i)}-1) = \sum _{N^{(i)}=1}^{R_0^{(i)}-1} P_{\tau }(N^{(i)}) = {\left\{ \begin{array}{ll} I_{1-\lambda ^{(i)}}(\tau -R_{0}^{(i)}+1, R_{0}^{(i)}-1) &{} (\text{ for } \tau > R_0^{(i)}-1) \\ 1 &{} (\text{ for } \tau \le R_0^{(i)}-1) \end{array}\right. }, \end{aligned}$$

(77)

where $I_{x}(a,b)$ is the regularised incomplete Beta function defined by

$$\begin{aligned} I_{x}(a,b) := \frac{B(x;a,b)}{B(a,b)}, \>\>\> B(x;a,b):= \int _0^x t^{a-1}(1-t)^{b-1}dt, \>\>\> B(a,b) = B(1;a,b)\qquad \end{aligned}$$

(78)

for real numbers x, a, and b.

Appendix C: Derivation of the Exact ACF Formula (33) for the Exponential-Splitting Traders

In this Appendix, we exactly derive the ACF formula (33) under the assumption (31). Using Eq. (24), we obtain

$$\begin{aligned} \frac{C_\tau ^{(i)}}{c_R^{(i)}\left( \lambda ^{(i)}\right) ^2}= \sum _{R=2}^{\tau }\sum _{N=1}^{R-1} \frac{(\tau -1)!}{(N - 1)!(\tau -N)!} \left( \lambda ^{(i)}\right) ^{N - 1} \left( 1-\lambda ^{(i)}\right) ^{\tau -N} e^{-(R-1)/L^{*(i)}} + \sum _{R=\tau +1}^\infty \rho _{\ge }^{(i)}(R)\qquad \end{aligned}$$

(79)

by replacing the dummy variables $R_0^{(i)}$ and $N^{(i)}$ with R and N, respectively. By exchanging the sums $\sum _{R=2}^{\tau }$ and $\sum _{N=1}^{R-1}$, the first term of the right-hand side can be evaluated as

$$\begin{aligned}&\sum _{R=2}^{\tau }\sum _{N=1}^{R-1} \frac{(\tau -1)!}{(N - 1)!(\tau -N)!} \left( \lambda ^{(i)}\right) ^{N - 1} \left( 1-\lambda ^{(i)}\right) ^{\tau -N} e^{-(R-1)/L^{*(i)}} \nonumber \\&\quad = \sum _{N=1}^{\tau -1}\frac{(\tau -1)!}{(N - 1)!(\tau -N)!} \left( \lambda ^{(i)}\right) ^{N - 1} \left( 1-\lambda ^{(i)}\right) ^{\tau -N}\sum _{R=N+1}^{\tau } e^{-(R-1)/L^{*(i)}} \nonumber \\&\quad = \frac{1}{1-e^{-1/L^{*(i)}}}\sum _{N=1}^{\tau -1}\frac{(\tau -1)!}{(N - 1)!(\tau -N)!} \left( \lambda ^{(i)}\right) ^{N - 1} \left( 1-\lambda ^{(i)}\right) ^{\tau -N}\left( e^{-N/L^{*(i)}}-e^{-\tau /L^{*(i)}}\right) \nonumber \\&\quad = \frac{1}{1-e^{-1/L^{*(i)}}}\left[ \left( 1-\lambda ^{(i)}+\lambda ^{(i)}e^{-1/L^{*(i)}}\right) ^{\tau -1}e^{-1/L^{*(i)}} -e^{-\tau /L^{*(i)}}\right] . \end{aligned}$$

(80)

The second term is evaluated as

$$\begin{aligned} \sum _{R=\tau +1}^\infty \rho _{\ge }^{(i)}(R) = \frac{e^{-\tau /L^{*(i)}}}{1-e^{-1/L^{*(i)}}}. \end{aligned}$$

(81)

We thus obtain Eq. (33) by using $c_R^{(i)}=1-e^{-1/L^{*(i)}}$.
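The closed forms (80) and (81) can be compared against a brute-force evaluation of the right-hand side of Eq. (79). In the sketch below, the function names, the truncation point `r_max` of the infinite tail sum, and the parameter values are illustrative assumptions.

```python
from math import comb, exp


def acf_ratio_direct(tau, lam, l_star, r_max=2000):
    """Right-hand side of Eq. (79) by direct summation."""
    first = 0.0
    for r in range(2, tau + 1):
        for n in range(1, r):
            weight = comb(tau - 1, n - 1) * lam ** (n - 1) * (1 - lam) ** (tau - n)
            first += weight * exp(-(r - 1) / l_star)
    # tail sum of rho_>=(R) = exp(-(R - 1) / L*), truncated at r_max
    second = sum(exp(-(r - 1) / l_star) for r in range(tau + 1, r_max + 1))
    return first + second


def acf_ratio_closed(tau, lam, l_star):
    """Sum of the closed forms (80) and (81); the exp(-tau / L*) terms cancel."""
    x = exp(-1.0 / l_star)
    return (1 - lam + lam * x) ** (tau - 1) * x / (1 - x)


direct = acf_ratio_direct(12, 0.3, 5.0)
closed = acf_ratio_closed(12, 0.3, 5.0)
```

The two values agree to within rounding error as long as `r_max` is much larger than the decay length.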

Appendix D: Derivation of the Asymptotic ACF Formula (37) for the Power-Law-Splitting Traders

We derive the asymptotic ACF formula (37) under the assumption (34). For large $\tau \gg 1$, the binomial distribution (23) can be asymptotically approximated as the normal distribution due to the central limit theorem:

$$\begin{aligned} \lim _{\tau \rightarrow \infty } P\left( \frac{N_{\tau }^{(i)}-1-\lambda ^{(i)}(\tau -1)}{\sqrt{\lambda ^{(i)}(1-\lambda ^{(i)})(\tau -1)}}\le a\right) = \int _{-\infty }^a\frac{dx }{\sqrt{2\pi }}e^{-x^2/2} = \frac{1}{2}\left[ \textrm{erf}\left( \frac{a}{\sqrt{2}}\right) +1\right] \end{aligned}$$

(82)

with the error function defined by

$$\begin{aligned} \textrm{erf}(x) := \frac{2}{\sqrt{\pi }}\int _0^x e^{-z^2}dz. \end{aligned}$$

(83)

For large $\tau \gg 1/\lambda ^{(i)}$, we approximate the sum in the ACF formula (24) by an integration, such that

$$\begin{aligned} C_\tau ^{(i)}&\approx \frac{c_R^{(i)}\left( \lambda ^{(i)}\right) ^2}{2} \int _2^{\infty } R^{-\alpha ^{(i)}}\left[ \textrm{erf}\left( \frac{R-\lambda ^{(i)}\tau }{\sqrt{2\tau \sigma ^2}}\right) +1\right] dR \end{aligned}$$

(84)

with $\sigma ^2:=\lambda ^{(i)}(1-\lambda ^{(i)})$ and $c_R^{(i)}=(\alpha ^{(i)}-1)/\alpha ^{(i)}$, where we use Eq. (82). Integrating by parts, we obtain

$$\begin{aligned}&\alpha ^{(i)}\frac{C_\tau ^{(i)}}{\left( \lambda ^{(i)}\right) ^2} \simeq \frac{(\alpha ^{(i)}-1)}{2}\int _2^{\infty } R^{-\alpha ^{(i)}}\left[ \textrm{erf}\left( \frac{R-\lambda ^{(i)}\tau }{\sqrt{2\tau \sigma ^2}}\right) + 1\right] dR \nonumber \\&\quad = -\left[ \frac{1}{2}R^{1-\alpha ^{(i)}}\left\{ \textrm{erf}\left( \frac{R-\lambda ^{(i)}\tau }{\sqrt{2\tau \sigma ^2}}\right) + 1\right\} \right] _2^\infty \nonumber \\&\quad + \int _2^{\infty } \frac{dR}{\sqrt{2\pi \tau \sigma ^2}}R^{1-\alpha ^{(i)}}\exp \left( -\frac{(R-\lambda ^{(i)}\tau )^2}{2\tau \sigma ^2}\right) \nonumber \\&\quad = 2^{-\alpha ^{(i)}}\left\{ \textrm{erf}\left( \frac{2-\lambda ^{(i)}\tau }{\sqrt{2\tau \sigma ^2}}\right) +1\right\} + \int _{2-\lambda ^{(i)}\tau }^\infty \frac{dz}{\sqrt{2\pi \tau \sigma ^2}}(\lambda ^{(i)}\tau +z)^{1-\alpha ^{(i)}}e^{-z^2/(2\tau \sigma ^2)} \nonumber \\&\quad \simeq \int _{-\infty }^\infty \frac{dz}{\sqrt{2\pi \tau \sigma ^2}}(\lambda ^{(i)}\tau +z)^{1-\alpha ^{(i)}}e^{-z^2/(2\tau \sigma ^2)} \simeq \left( \lambda ^{(i)}\tau \right) ^{1-\alpha ^{(i)}}, \end{aligned}$$

(85)

which leads to Eq. (37).
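The final approximation in Eq. (85) replaces a Gaussian average of $R^{1-\alpha ^{(i)}}$ by its value at the mean $\lambda ^{(i)}\tau $. A minimal numerical sketch of this step is given below; the midpoint quadrature, the integration window of eight standard deviations, and the parameter values are illustrative assumptions.

```python
import math


def gaussian_average(tau, lam, alpha, n_grid=20001, width=8.0):
    """Midpoint-rule estimate of the integral of R^(1 - alpha) against a
    Gaussian density with mean lam * tau and variance tau * lam * (1 - lam),
    restricted to R >= 2 as in Eq. (85)."""
    mean = lam * tau
    sd = math.sqrt(tau * lam * (1.0 - lam))
    lo = max(2.0, mean - width * sd)
    hi = mean + width * sd
    h = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        r = lo + (k + 0.5) * h
        dens = math.exp(-0.5 * ((r - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += r ** (1.0 - alpha) * dens * h
    return total


tau, lam, alpha = 10_000, 0.3, 1.5
ratio = gaussian_average(tau, lam, alpha) / (lam * tau) ** (1.0 - alpha)
```

For $\tau \gg 1/\lambda ^{(i)}$ the ratio is close to one, because the relative width of the Gaussian, $\sqrt{\tau \sigma ^2}/(\lambda ^{(i)}\tau )$, shrinks as $\tau ^{-1/2}$.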

Appendix E: Convergence Speed of the Aggregated Metaorder-Length Distribution for Heterogeneous Power-Law Splitters

We numerically examine the convergence speed of the asymptotic relationship (46), $\lim _{N_{\textrm{tot}}\rightarrow \infty }\alpha ^{*}=\alpha _{\min }$, where $\alpha ^{*}$ is the exponent of the aggregated metaorder-length distribution $\rho _{\textrm{ST}}^\textrm{empirical}(L)\propto L^{-\alpha ^{*}-1}$, $\alpha _{\min }:=\min _{i\in \Omega _\textrm{PT}}\alpha ^{(i)}$, and $N_\textrm{tot}$ is the total number of transactions.

Our numerical setup is as follows: we assume that all traders are power-law splitters ($\mu =1$) and that the total number of traders is fixed at $M_{\textrm{tot}}=100$. The order-submission probabilities are assumed to be uniform, such that $\lambda ^{(i)}=1/M_{\textrm{tot}}$ for all $i \in \Omega _{\textrm{PT}}$. We also assume that there are two types of power-law splitters with different exponents, such that $M_1$ splitters have the power-law exponent $\alpha _1$ and $M_{2}:=M_{\textrm{tot}} - M_1$ splitters have $\alpha _2$. Fixing the total number of transactions $N_\textrm{tot}$, we repeatedly generate the numerical CCDF $\rho ^\textrm{empirical}_{\ge ;\textrm{ST}}(L):= \sum _{L'\ge L}\rho _{\textrm{ST}}^\textrm{empirical}(L')\propto L^{-\alpha ^*}$. The exponent $\alpha ^*$ is measured by the least squares method applied to the run-length CCDF on the log-log scale, with the ten points in the right tail excluded.

In Fig. 6, we plot $|\alpha ^*-\alpha _{\min }|$, the absolute error between the empirical exponent $\alpha ^*$ and the minimum exponent $\alpha _{\min }:=\min \{\alpha _1,\alpha _2\}$, for many patterns of the parameter set $(M_{1},\alpha _1,\alpha _2)$. This figure shows that $|\alpha ^*-\alpha _{\min }|$ gradually decreases to zero as $N_\textrm{tot}$ increases, as expected. However, the convergence is slow: to keep the convergence error within 0.1 (0.05), such that $|\alpha ^*-\alpha _{\textrm{min}}|\le 0.1$ ($|\alpha ^*-\alpha _\textrm{min}|\le 0.05$), the required sample size is roughly $N_{\textrm{tot}} \approx 10^{5}$ to $10^{6}$ ($N_{\textrm{tot}} \approx 10^{7}$ to $10^{8}$), although it depends strongly on the parameter selection.
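One way to build intuition for this slow convergence is to look at a noise-free idealisation. The sketch below does not reproduce the simulation behind Fig. 6: it assumes that the aggregated CCDF is exactly a two-component mixture $w_1L^{-\alpha _1}+(1-w_1)L^{-\alpha _2}$ (the weight `w1` is a hypothetical parameter) and fits a least-squares line on the log-log scale over the range $[1, L_{\max }]$. The fitted exponent approaches $\alpha _{\min }$ only as the fitting range extends over many decades, which in a real sample requires many transactions.

```python
import math


def mixture_ccdf(length: float, w1: float, alpha1: float, alpha2: float) -> float:
    """Idealised aggregated CCDF: w1 * L^(-alpha1) + (1 - w1) * L^(-alpha2)."""
    return w1 * length ** (-alpha1) + (1.0 - w1) * length ** (-alpha2)


def fitted_exponent(l_max_log10: int, w1: float, alpha1: float, alpha2: float,
                    points_per_decade: int = 4) -> float:
    """Least-squares slope of log10 CCDF versus log10 L on [1, 10^l_max_log10], sign-flipped."""
    n = l_max_log10 * points_per_decade + 1
    xs = [k / points_per_decade for k in range(n)]  # log10 of the length L
    ys = [math.log10(mixture_ccdf(10.0 ** x, w1, alpha1, alpha2)) for x in xs]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return -sxy / sxx
```

With `alpha1=1.2`, `alpha2=1.8`, and `w1=0.5`, comparing `fitted_exponent(2, ...)` with `fitted_exponent(8, ...)` shows that the fitted exponent stays above $\alpha _{\min }=1.2$ and moves towards it as the fitting range grows.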

Appendix F: Proof of the Inequality for the Prefactor (55)

In this section, we provide the proof of the inequality (55).

F.1 On the Lower Bound

Let us focus on the lower bound of the inequality (55). We use Hölder’s inequality:

$$\begin{aligned} \left( \sum _{i} |a_i|^{p}\right) ^{1/p} \left( \sum _{i} |b_i|^{q}\right) ^{1/q} \ge \sum _{i} |a_ib_i| \end{aligned}$$

(86)

for any sequences $\{a_i\}_i, \{b_i\}_i$ and any real numbers $p$, $q$ satisfying $1/p+1/q=1$, $p \ge 1$, and $q \ge 1$. By setting $a_i = \lambda ^{(i)}$, $b_i = \lambda ^{(i)}_{\textrm{LMF}}=\mu /M_{\textrm{PT}}$, and $p = 3-\alpha \in (1,2)$, so that $q=(3-\alpha )/(2-\alpha )$, we obtain

$$\begin{aligned} \left\{ \sum _{i\in \Omega _{\textrm{PT}}} \left( \lambda ^{(i)}\right) ^{p}\right\} ^{1/p} \left\{ \sum _{i\in \Omega _{\textrm{PT}}} \left( \frac{\mu }{M_\textrm{PT}}\right) ^{q}\right\} ^{1/q} \ge \sum _{i\in \Omega _{\textrm{PT}}} \lambda ^{(i)}\frac{\mu }{M_{\textrm{PT}}}, \end{aligned}$$

(87)

which, using $\sum _{i\in \Omega _{\textrm{PT}}}\lambda ^{(i)}=\mu $ and $p/q=p-1=2-\alpha $, is equivalent to

$$\begin{aligned} \sum _{i\in \Omega _{\textrm{PT}}} \left( \lambda ^{(i)}\right) ^{3-\alpha } \ge \frac{\mu ^{3-\alpha }}{M_{\textrm{PT}}^{2-\alpha }}. \end{aligned}$$

(88)

We thus obtain the inequality (55) for the lower bound.

F.2 On the Upper Bound

For the proof of the upper-bound inequality, we prove the inequality

$$\begin{aligned} \left( \sum _{i=1}^M x_i\right) ^a \ge \sum _{i=1}^M x_i^a \end{aligned}$$

(89)

for any nonnegative sequence $\{x_i\}_i$ with $x_i \ge 0$, any positive integer $M \ge 1$, and any real number $a>1$.

We prove the inequality (89) by mathematical induction. The inequality trivially holds for $M=1$. For $M=2$, we define the function $f(x)=(1+x)^a-x^a$. The derivative of $f(x)$ is positive for $x\ge 0$, such that

$$\begin{aligned} \frac{df(x)}{dx} = a\left\{ (1+x)^{a-1}-x^{a-1}\right\} > 0 \end{aligned}$$

(90)

since $a-1>0$. We thus find that $f(x)\ge f(0)=1$, or equivalently $(1+x)^a \ge 1 + x^a$. Setting $x=x_2/x_1$ under the assumption $x_1 \ne 0$ and multiplying both sides by $x_1^a$, we obtain $(x_1+x_2)^a \ge x_1^a + x_2^a$. When $x_1=0$, we trivially obtain $(x_1+x_2)^a = x_1^a + x_2^a$. We therefore find

$$\begin{aligned} (x_1+x_2)^a \ge x_1^a + x_2^a, \end{aligned}$$

(91)

which is the special case of the inequality (89) for $M=2$.

We next assume that the inequality (89) holds up to $M=k$ with an integer $k\ge 2$. We then have

$$\begin{aligned} \left( \sum _{i=1}^{k+1}x_i\right) ^a = \left( x_{k+1}+\sum _{i=1}^{k}x_i\right) ^a \ge \left( \sum _{i=1}^{k}x_i\right) ^a + x_{k+1}^a \end{aligned}$$

(92)

by applying the inequality (91). Since the inequality (89) holds for $M=k$, we obtain

$$\begin{aligned} \left( \sum _{i=1}^{k}x_i\right) ^a + x_{k+1}^a \ge \sum _{i=1}^{k} x_i^a + x_{k+1}^a, \end{aligned}$$

(93)

which implies that the inequality (89) holds for $M=k+1$. Thus, the inequality (89) holds for any positive integer $M$.

We then prove the upper bound in the inequality (55). Given that $\lambda ^{(i)} \ge 0$ and $3-\alpha \in (1,2)$, we apply the inequality (89) by setting $x_i = \lambda ^{(i)}$. We then obtain

$$\begin{aligned} \mu ^{3-\alpha } = \left( \sum _{i \in \Omega _{\textrm{PT}}} \lambda ^{(i)}\right) ^{3-\alpha } \ge \sum _{i \in \Omega _{\textrm{PT}}} \left( \lambda ^{(i)} \right) ^{3-\alpha }, \end{aligned}$$

(94)

which is equivalent to the upper-bound inequality for (55).
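The two bounds (88) and (94) are easy to check numerically. The following minimal sketch computes $\sum _{i}(\lambda ^{(i)})^{3-\alpha }$ together with both bounds for a given list of order-submission probabilities; the function names and the example values in the usage note are illustrative and not taken from the article.

```python
def prefactor_sum(lams: list[float], alpha: float) -> float:
    """Sum over power-law splitters of (lambda^(i))^(3 - alpha)."""
    return sum(lam ** (3.0 - alpha) for lam in lams)


def prefactor_bounds(lams: list[float], alpha: float) -> tuple[float, float]:
    """Lower bound of Eq. (88) and upper bound of Eq. (94) for prefactor_sum."""
    mu = sum(lams)    # total order-submission probability of the splitters
    m_pt = len(lams)  # number of power-law splitters, M_PT
    lower = mu ** (3.0 - alpha) / m_pt ** (2.0 - alpha)
    upper = mu ** (3.0 - alpha)
    return lower, upper
```

For the homogeneous choice `[0.25] * 4` the sum coincides with the lower bound, whereas a heterogeneous choice such as `[0.7, 0.1, 0.1, 0.1]` lies strictly between the two bounds, and concentrating all the probability on one trader saturates the upper bound.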

Appendix G: Derivation of the Power-Law ACF Formula (63) as the Superposition of Exponential Splitters with Power-Law Metaorder Decay Lengths

We derive the ACF formula (63) as the superposition of exponential splitting traders. By assuming the approximation (61), the formula (33) is approximately given by

$$\begin{aligned} C_{\tau }^{(i)} \simeq \left( \lambda ^{(i)}\right) ^2 e^{- \lambda ^{(i)}\tau /L^{*(i)}} \end{aligned}$$

(95)

with large $L^{*(i)} \gg 1$ at leading order. Using this approximate formula, we obtain

$$\begin{aligned} C_{\tau }&\simeq \sum _{i\in \Omega _{\textrm{ET}}} \left( \lambda ^{(i)}\right) ^2 e^{- \lambda ^{(i)}\tau /L^{*(i)}} \nonumber \\&= M_{\textrm{ET}}\int _0^1 d\lambda \int _1^\infty dL^* \left( \frac{1}{M_{\textrm{ET}}}\sum _{i\in \Omega _{\textrm{ET}}} \delta (L^*-L^{*(i)})\delta (\lambda -\lambda ^{(i)})\right) \lambda ^2 e^{- \lambda \tau /L^{*}} \nonumber \\&= M_{\textrm{ET}} \int _0^1 d\lambda \int _1^\infty dL^* P_\textrm{ET}(L^*,\lambda )\lambda ^2 e^{-\lambda \tau /L^{*}}, \end{aligned}$$

(96)

where we used the definition (59) for $P_{\textrm{ET}}(L^*,\lambda )$. By assuming the factorised PDF $P_{\textrm{ET}}(L^*,\lambda )=P_\textrm{ET}(L^*)P_{\textrm{ET}}(\lambda )$, we consider the case with a power-law PDF regarding the decay-length $L^{*}$, such that

$$\begin{aligned} P_{\textrm{ET}}(L^*) \simeq (\vartheta -1) \left( L^{*}\right) ^{-\vartheta }\>\>\text{ for } L^{*}\in [1,\infty ) \text{ with } \vartheta \in (1,2). \end{aligned}$$

(97)

We thus obtain

$$\begin{aligned} C_{\tau }&\simeq (\vartheta -1)M_{\textrm{ET}} \int _0^1 d\lambda P_\textrm{ET}(\lambda )\lambda ^2 \int _1^\infty dL^* \left( L^*\right) ^{-\vartheta } e^{-\lambda \tau /L^{*}}\nonumber \\&= M_{\textrm{ET}}\frac{\vartheta -1}{\tau ^{\vartheta -1}}\int _0^1 d\lambda P_{\textrm{ET}}(\lambda )\lambda ^{3-\vartheta } \int _{0}^{\lambda \tau } y^{\vartheta -2}e^{-y}dy \end{aligned}$$

(98)

with the dummy-variable transformation $y:= \lambda \tau / L^*$. For large $\tau \gg 1$, we asymptotically obtain

$$\begin{aligned} C_{\tau } \simeq \frac{M_\textrm{ET}\Gamma (\vartheta )}{\tau ^{\vartheta -1}} \int _0^1 d\lambda P_\textrm{ET}(\lambda )\lambda ^{3-\vartheta } = \frac{\Gamma (\vartheta )}{\tau ^{\vartheta -1}} \sum _{i\in \Omega _{\textrm{ET}}}\left( \lambda ^{(i)}\right) ^{3-\vartheta } \end{aligned}$$

(99)

with $P_{\textrm{ET}}(\lambda )=(1/M_{\textrm{ET}})\sum _{i\in \Omega _\textrm{ET}}\delta (\lambda -\lambda ^{(i)})$. This is equivalent to Eq. (63a).
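The asymptotic formula (99) can be compared with the direct sum in the first line of Eq. (96). The sketch below is only an illustration under two assumptions that are not part of the derivation: all traders share the same $\lambda ^{(i)}=1/M_{\textrm{ET}}$, and the decay lengths $L^{*(i)}$ follow the deterministic allocation of Eq. (105). Function names are illustrative.

```python
import math


def allocate_decay_lengths(m_et: int, theta: float) -> list[float]:
    """Deterministic allocation of L^{*(i)} following Eq. (105)."""
    return [(1.0 / (1.0 - (i - 1) / m_et)) ** (1.0 / (theta - 1.0))
            for i in range(1, m_et + 1)]


def acf_superposition(tau: float, lam: float, l_stars: list[float]) -> float:
    """First line of Eq. (96) with a common lambda: sum_i lam^2 exp(-lam tau / L^{*(i)})."""
    return sum(lam ** 2 * math.exp(-lam * tau / l_star) for l_star in l_stars)


def acf_power_law(tau: float, lam: float, m_et: int, theta: float) -> float:
    """Right-hand side of Eq. (99) with a common lambda (assumption of this sketch)."""
    return math.gamma(theta) * m_et * lam ** (3.0 - theta) / tau ** (theta - 1.0)
```

With `m_et = 10_000`, `theta = 1.5`, `lam = 1 / m_et`, and `tau = 1e6` (so that $\lambda \tau =100$), the ratio of `acf_superposition` to `acf_power_law` should be close to one. For finite $M_{\textrm{ET}}$ the agreement is expected to break down once $\lambda \tau $ becomes comparable to the largest allocated $L^{*(i)}$.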

We next evaluate the aggregated metaorder-length PDF. Let us use the approximation $\rho ^{(i)}(L)\simeq (1/L^{*(i)})e^{-L/L^{*(i)}}$ to obtain

$$\begin{aligned} \langle L_i\rangle \simeq \int _{0}^\infty \frac{L}{L^{*(i)}}e^{-L/L^{*(i)}}dL = L^{*(i)}. \end{aligned}$$

(100)

Using Eq. (46), we obtain

$$\begin{aligned} \rho _{\textrm{ST}}^\textrm{empirical}(L)&\simeq \frac{\langle L\rangle }{\mu }\sum _{i\in \Omega _{\textrm{ET}}}\frac{\lambda ^{(i)}}{\langle L_i\rangle }\rho ^{(i)}(L) \nonumber \\&\propto \sum _{i\in \Omega _\textrm{ET}}\lambda ^{(i)}\frac{e^{-L/L^{*(i)}}}{\left( L^{*(i)}\right) ^2} \nonumber \\&= \int _0^1 \lambda P(\lambda ) d\lambda \int _1^\infty dL^* P(L^*) \frac{e^{-L/L^*}}{L^{*2}} \nonumber \\&\propto L^{-\vartheta -1} \>\>\> \text{ for } \text{ large } L \gg 1, \end{aligned}$$

(101)

implying Eq. (63b).
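The last step of Eq. (101) can be made explicit. As a worked sketch, assuming that the pure power law (97) holds over the whole range $L^*\in [1,\infty )$, the substitution $u:=L/L^*$ gives

$$\begin{aligned} \int _1^\infty dL^* (\vartheta -1)\left( L^*\right) ^{-\vartheta }\frac{e^{-L/L^*}}{L^{*2}} = (\vartheta -1)L^{-\vartheta -1}\int _0^{L} u^{\vartheta }e^{-u}du \simeq (\vartheta -1)\Gamma (\vartheta +1)L^{-\vartheta -1} \>\>\> \text{ for } L \gg 1, \end{aligned}$$

so that the aggregated metaorder-length PDF inherits its tail exponent from the decay-length distribution.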

Appendix H: Numerical LMF Simulation Method of the Heterogeneous Exponential Splitters

This Appendix describes the numerical simulation method for the heterogeneous exponential splitting traders, particularly for the deterministic allocation of the parameter $L^*$ based on the inverse transform method (see Fig. 7 for its schematic).

Let us assume that the distribution of $L^*$ is approximately given by

$$\begin{aligned} P(L^*):=\frac{1}{M_{\textrm{ET}}}\sum _{i\in \Omega _{\textrm{ET}}} \delta (L^*-L^{*(i)}) \simeq (\vartheta -1) \left( L^{*}\right) ^{-\vartheta }, \>\>\> M_{\textrm{ET}}:=|\Omega _\textrm{ET}| \end{aligned}$$

(102)

with $\vartheta \in (1,2)$. The aim of this Appendix is to develop a systematic method to deterministically allocate $\{L^{*(i)}\}_{i \in \Omega _{\textrm{ET}}}$ even for finite $M_{\textrm{ET}}$ to approximately satisfy the relation (102). Since its CCDF should obey

$$\begin{aligned} P_{\ge }(L^*) \simeq \left( L^{*}\right) ^{-\vartheta +1}, \end{aligned}$$

(103)

the value of $L^*$ corresponding to the lower $100p$-th percentile is given by $1-p=P_{\ge }(L^*)\Longleftrightarrow L^*=P^{-1}_{\ge }\left( 1-p\right) $, where the inverse function is given by

$$\begin{aligned} P^{-1}_{\ge }\left( 1-p\right) := \left( \frac{1}{1-p}\right) ^{\frac{1}{\vartheta -1}}. \end{aligned}$$

(104)

On the basis of this relationship, we allocate the parameter $L^{*(\mathfrak {i})}$ for the trader $\mathfrak {i}$ by

$$\begin{aligned} L^{*(\mathfrak {i})}= \left\{ \frac{1}{1-\frac{(\mathfrak {i}-1)}{M_{\textrm{ET}}}} \right\} ^{\frac{1}{\vartheta -1}}. \end{aligned}$$

(105)
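The allocation rule (105) can be sketched in a few lines. The code below assumes that the trader index runs over $\mathfrak {i}=1,\dots ,M_{\textrm{ET}}$, which is not stated explicitly above, and adds a hypothetical helper `empirical_ccdf` to compare the allocated values with the target CCDF (103).

```python
def allocate_decay_lengths(m_et: int, theta: float) -> list[float]:
    """Deterministic inverse-transform allocation of L^{*(i)} following Eq. (105).

    Trader i (assumed to run from 1 to m_et) is placed at the lower
    percentile p = (i - 1) / m_et of the target power-law distribution.
    """
    lengths = []
    for i in range(1, m_et + 1):
        p = (i - 1) / m_et
        lengths.append((1.0 / (1.0 - p)) ** (1.0 / (theta - 1.0)))
    return lengths


def empirical_ccdf(values: list[float], x: float) -> float:
    """Fraction of allocated values that are greater than or equal to x."""
    return sum(1 for v in values if v >= x) / len(values)
```

For instance, with `allocate_decay_lengths(100, 1.5)` the smallest allocated value is exactly one, the values increase with the trader index, and `empirical_ccdf(values, x)` should stay close to `x ** -(theta - 1)` for moderate `x`.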

Appendix I: Review of the Bouchaud–Bonart–Donier–Gould Model

This Appendix reviews the BBDG model [3], a variant of the original LMF model. One of the advantages of the BBDG model is that its theoretical calculation can be pedagogically simplified in deriving the asymptotic behaviour of the order-sign ACF while the essential features of the LMF model are kept.

In the BBDG model, metaorder execution of OSTs is assumed to be independent of remaining volume; instead, metaorder execution randomly stops with probability $\kappa $, where $\kappa $ is reset according to the PDF $P(\kappa )= \alpha \kappa ^{\alpha -1}, 0<\kappa <1$ when the metaorder is terminated.

We next formulate the stochastic dynamics of the BBDG model by defining its state variables. The BBDG model is composed of two types of state variables. First, $E^{(i)}_t:=\left( \epsilon ^{(i)}_t,\kappa ^{(i)}_t\right) $ is the set of two state variables characterising metaorder execution behaviour of the ith trader:

$\epsilon ^{(i)}_t$: the order-sign of the metaorder.
$\kappa ^{(i)}_t$: the stopping probability of metaorder execution.

Second, $\epsilon _t$ is the state variable characterising the order-sign of the whole market. In summary, the BBDG system is specified as the point in the phase space

$$\begin{aligned} X_t := \left( \epsilon _t, E^{(1)}_t, \dots , E^{(M)}_t\right) , \>\>\> E^{(i)}_t=\left( \epsilon _t^{(i)}, \kappa _t^{(i)}\right) , \end{aligned}$$

(106)

and is designed as a $(2M+1)$-dimensional Markovian stochastic process.

The BBDG model assumes the homogeneity of the order-splitting strategy among all the traders. Let $\mathfrak {i}_{t+1}$ denote the ID of the trader who submits the market order at time $t+1$. The trader $\mathfrak {i}_{t+1}$ is drawn from the uniform distribution,

$$\begin{aligned} P_{t+1}(\mathfrak {i}) = \frac{1}{M} \>\> \text{ for } \text{ all } \mathfrak {i}\in \{1,2,\dots ,M\} \end{aligned}$$

(107a)

with $\{\mathfrak {i}_t\}_{t}$ being an IID sequence. After her execution, the trader $\mathfrak {i}_{t+1}$ terminates her metaorder execution with probability $\kappa ^{(i)}_t$, where $i=\mathfrak {i}_{t+1}$. If the metaorder execution is terminated at time $t+1$, the termination probability and the order sign are randomly reset for the trader $\mathfrak {i}_{t+1}$:

$$\begin{aligned} E^{(i)}_{t+1}&= {\left\{ \begin{array}{ll} \left( +1, \kappa \right) &{} \text{ with } \text{ prob. } \kappa _t^{(i)}/2 \text{ if } i = \mathfrak {i}_{t+1};\,{\kappa \text{ obeys } P(\kappa )= \alpha \kappa ^{\alpha -1}} \\ \left( -1, \kappa \right) &{} \text{ with } \text{ prob. } \kappa _t^{(i)}/2 \text{ if } i = \mathfrak {i}_{t+1};\,{\kappa \text{ obeys } P(\kappa )= \alpha \kappa ^{\alpha -1}} \\ \left( \epsilon ^{(i)}_t, \kappa ^{(i)}_t\right) &{} \text{ otherwise } \\ \end{array}\right. }\end{aligned}$$

(107b)

$$\begin{aligned} \epsilon _{t+1}&= \epsilon _{t}^{(\mathfrak {i}_{t+1})}. \end{aligned}$$

(107c)

Finally, the order-sign ACF is shown to exhibit the power-law decay (see Chapter 11 in [3] for its detailed derivation):

$$\begin{aligned} C(\tau )&\simeq \frac{\Gamma (\alpha )}{M^{2-\alpha }} \tau ^{-(\alpha -1)} \>\>\>\text{ for } \tau \gg 1\text{. } \end{aligned}$$

(108)
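The update rules (107a)-(107c) translate directly into a short simulation. The following is a minimal sketch, not the authors' code: the initial condition (random signs and $\kappa $ drawn from $P(\kappa )$) is an assumption of this sketch, and $\kappa $ is sampled by inverse transform using the CDF $\kappa ^{\alpha }$ implied by $P(\kappa )=\alpha \kappa ^{\alpha -1}$. The helper `sample_acf` is illustrative.

```python
import random


def simulate_bbdg(m: int, alpha: float, n_steps: int, seed: int = 0) -> list[int]:
    """Simulate the market order-sign sequence of the BBDG dynamics (107a)-(107c)."""
    rng = random.Random(seed)

    def draw_kappa() -> float:
        # Inverse transform: the CDF of P(kappa) = alpha * kappa^(alpha - 1)
        # is kappa^alpha, so kappa = U^(1/alpha) with U uniform on (0, 1].
        return (1.0 - rng.random()) ** (1.0 / alpha)

    # Assumed initial condition (not specified in the text).
    signs = [rng.choice((+1, -1)) for _ in range(m)]
    kappas = [draw_kappa() for _ in range(m)]

    market_signs = []
    for _ in range(n_steps):
        i = rng.randrange(m)                 # Eq. (107a): uniform trader selection
        market_signs.append(signs[i])        # Eq. (107c): market sign is the trader's sign
        if rng.random() < kappas[i]:         # Eq. (107b): terminate with probability kappa
            signs[i] = rng.choice((+1, -1))  # new sign, +1 or -1 with probability 1/2 each
            kappas[i] = draw_kappa()         # new stopping probability
    return market_signs


def sample_acf(signs: list[int], lag: int) -> float:
    """Sample average of eps_t * eps_{t+lag}; the mean sign is zero by symmetry."""
    n = len(signs) - lag
    return sum(signs[t] * signs[t + lag] for t in range(n)) / n
```

A long run, for example `simulate_bbdg(100, 1.5, 10**6)`, can then be passed to `sample_acf` at several lags and compared with Eq. (108) on a log-log plot; the sample size needed for a clean comparison is not addressed by this sketch.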

Appendix J: The Power-Law ACF Formula as the Superposition of Exponential Splitters with Power-Law Order-Submission Probability Distribution

In this Appendix, we study an alternative theoretical scenario for the origin of the LRC as the superposition of exponential splitters with various trading speeds.

Let us assume that there is no dominantly frequent order-splitting trader, such that $\lambda ^{(i)} \ll 1$ for all $i\in \Omega _{\textrm{ET}}$, and that the empirical PDF for the characteristic constants $P_\textrm{ET}(L^*,\lambda )$ is factorised such that $P_\textrm{ET}(L^*,\lambda )=P_{\textrm{ET}}(L^*)P_{\textrm{ET}}(\lambda )$. In particular, we focus on the case with a truncated power-law order-submission probability PDF, such that

$$\begin{aligned} P_{\textrm{ET}}(\lambda ) \simeq \frac{\beta }{\lambda _{\textrm{cut}}^{-\beta }-1} \lambda ^{-\beta -1} \>\>\>\text{ for } \lambda \in [\lambda _{\textrm{cut}},1], \end{aligned}$$

(109)

where $\lambda _{\textrm{cut}}$ is a nonzero small parameter representing the lower cutoff of the intensities. We thus obtain

$$\begin{aligned} C_{\tau }&\simeq \frac{M_{\textrm{ET}}\beta }{\lambda _{\textrm{cut}}^{-\beta }-1} \int _1^\infty dL^* P(L^{*}) \int _{\lambda _{\textrm{cut}}}^1 d\lambda \lambda ^{-\beta -1}\lambda ^2 e^{-\lambda \tau /L^{*}} \nonumber \\&= \frac{M_{\textrm{ET}}\beta }{\lambda _{\textrm{cut}}^{-\beta }-1} \tau ^{\beta -2} \int _1^\infty \left( L^*\right) ^{2-\beta }P(L^{*})dL^* \int _{\lambda _{\textrm{cut}}\tau / L^*}^{\tau /L^*} x^{1-\beta } e^{-x}dx \nonumber \\&= \frac{\beta }{\lambda _{\textrm{cut}}^{-\beta }-1} M_{\textrm{ET}} \tau ^{-(2-\beta )} \int _1^\infty \left( L^{*}\right) ^{2-\beta } P(L^{*}) dL^* \left[ \Gamma \left( 2-\beta ,\frac{\lambda _{\textrm{cut}}\tau }{L^*}\right) -\Gamma \left( 2-\beta ,\frac{\tau }{L^*}\right) \right] , \end{aligned}$$

(110)

where we apply the variable transformation $x:= \lambda \tau / L^*$ on the second line. Here we focus on the intermediate asymptotic regime $1\ll \tau \ll \lambda ^{-1}_{\textrm{cut}}$. Since $L^*$ is not smaller than one, we obtain asymptotic relations for the incomplete Gamma function

$$\begin{aligned} \Gamma \left( 2-\beta ,\frac{\lambda _{\textrm{cut}}\tau }{L^*}\right) \simeq \Gamma (2-\beta ), \>\>\> \Gamma \left( 2-\beta ,\frac{\tau }{L^*}\right) \simeq \left( \frac{\tau }{L^*}\right) ^{1-\beta }e^{-\tau /L^*} \end{aligned}$$

(111)

for $L^*\ll \tau \ll \lambda ^{-1}_{\textrm{cut}}$. Substituting these relations into Eq. (110), we thus obtain the power-law ACF decay $C_{\tau }\propto \tau ^{-(2-\beta )}$, whose exponent is independent of the metaorder-length distribution, until the cutoff time $\lambda ^{-1}_{\textrm{cut}}$ (see Fig. 8 for numerical comparison).

Beyond the cutoff time $\tau \gg \lambda ^{-1}_{\textrm{cut}}$, the ACF decay should depend on the details of the metaorder-length distribution.
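The role of the intermediate regime can be illustrated by evaluating the bracket in Eq. (110) numerically. The sketch below computes $\Gamma (2-\beta ,\lambda _{\textrm{cut}}\tau /L^*)-\Gamma (2-\beta ,\tau /L^*)=\int x^{1-\beta }e^{-x}dx$ with a midpoint rule after the substitution $u=x^{2-\beta }$, which is a numerical convenience of this sketch. The value $\beta =1.5$ used in the usage note is an illustrative assumption, as are the function name and grid size.

```python
import math


def gamma_bracket(beta: float, lam_cut: float, tau: float, l_star: float,
                  n: int = 100000) -> float:
    """Integral of x^(1-beta) e^(-x) from lam_cut*tau/L* to tau/L*.

    With s = 2 - beta and u = x^s, the integrand becomes exp(-u^(1/s)) / s,
    which is smooth and convenient for a plain midpoint rule.
    """
    s = 2.0 - beta
    lo = (lam_cut * tau / l_star) ** s
    hi = (tau / l_star) ** s
    h = (hi - lo) / n
    total = sum(math.exp(-((lo + (k + 0.5) * h) ** (1.0 / s))) for k in range(n))
    return total * h / s
```

For `beta = 1.5`, `lam_cut = 1e-6`, and `l_star = 1`, the ratio `gamma_bracket(...) / math.gamma(2 - beta)` should be close to one at `tau = 1e3`, inside the regime $1\ll \tau \ll \lambda ^{-1}_{\textrm{cut}}$, and should collapse towards zero at `tau = 1e7`, beyond the cutoff time.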

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sato, Y., Kanazawa, K. Exact Solution to a Generalised Lillo–Mike–Farmer Model with Heterogeneous Order-Splitting Strategies. J Stat Phys 191, 58 (2024). https://doi.org/10.1007/s10955-024-03264-1

Received: 10 November 2023
Accepted: 01 April 2024
Published: 07 May 2024
DOI: https://doi.org/10.1007/s10955-024-03264-1
