Continuous-Time Random Walk with multi-step memory: An application to market dynamics

A novel version of the Continuous-Time Random Walk (CTRW) model with memory is developed. Here memory means dependence between an arbitrary number of successive jumps of the process, while the waiting times between jumps are considered i.i.d. random variables. The dependence was found by analyzing empirical histograms for the stochastic process of a single share price on a market at the high-frequency time scale, and it is justified theoretically by a bid-ask bounce mechanism with a delay, characteristic of any double-auction market. Our model turns out to be exactly analytically solvable, which enables a direct comparison of its predictions with their empirical counterparts, for instance with the empirical velocity autocorrelation function. Thus this paper significantly extends the capabilities of the CTRW formalism.


I. INTRODUCTION
The dynamics of many complex systems, not only in natural but also in socio-economical sciences, is usually represented by stochastic time series. These series are often composed of elementary random spatio-temporal events, which may show some dependences and correlations as well as apparent universal structures [1][2][3][4][5][6]. By the elementary event we understand a "spatial" jump, r, of a stochastic process preceded by its waiting (or pausing) time, t, both being stochastic variables.
Such a stochastic process, named the Continuous-Time Random Walk (CTRW), was introduced in the physical context by Montroll and Weiss [7] and applied successfully to the description of photocurrent relaxation in amorphous films [8] (and refs. therein). Nearly three and a half decades ago, versions of the CTRW formalism containing backward or forward correlations were developed [9] (and refs. therein). Soon afterwards, the former version of the formalism was first applied to a concentrated lattice gas in order to study the tracer diffusion coefficient [10]. That study was directly inspired by hydrogen diffusion in transition metals [11,12] and ionic conductivity in superionic conductors [13].
Next, it was proved [14] that for lattices of low coordination number the description of tracer diffusion in a concentrated lattice gas requires an extension of the CTRW formalism to correlations over several subsequent jumps. This is because the vacancy left behind the tracer particle after its jump quickly ends up ahead of the particle. This happens due to small loops present in the particle trajectory, which lead the particle back to the origin.
The CTRW formalism with memory has also appeared in other contexts [15,16], but up to now it has been limited to the dependence over two subsequent jumps, as its extension to memory (i.e. dependence) over three or more subsequent jumps was too complicated.
This work extends the field of applications of the CTRW formalism by including memory ranging over two jumps behind the current jump. That is, the dependence between three successive jumps is considered. Such an approach can be useful not only for the study of one-dimensional random walks but also in many dimensions, on different kinds of lattices and graphs.
Furthermore, we applied our CTRW formalism to a detailed description of a microscopic mechanism in finance, in particular the bid-ask bounce phenomenon. One of the solid reasons in favor of CTRW formalisms is that they provide generic formulas for the first- and second-order time-dependent statistics in terms of two auxiliary distributions, the spatial one, h(r), and the temporal one, ψ(t), which can be obtained directly from empirical histograms.
The paper is organized as follows. In Sec. II we present the motivation of our work. In Sec. III we define the stochastic process, which is solved in Sec. IV. In Sec. V the novel model is compared with our previous model [17], and in Sec. VI it is compared with empirical data. Section VII contains our concluding remarks.

II. MOTIVATION
There are two main generic reasons for including two-step memory in the Continuous-Time Random Walk formalism:
• Firstly, to describe joint two-point histograms of the current share price jump and the second jump before it. These histograms (constructed separately for many different shares) clearly show significant dependence between these jumps, even though their mutual two-point correlation almost vanishes.
• Secondly, to significantly improve the agreement of the theoretical time-dependent velocity autocorrelation function of successive share price jumps (cf. [17] and refs. therein) with its empirical counterpart.
Although the need for models of this type is commonly known, it is surprising how poorly developed and exploited they still are. The present work fills this gap.
If we record only successive share price jumps and not the time intervals (waiting times) between them, we obtain the so-called "event time" series. The event-time autocorrelation functions of price changes obtained on this basis have already been widely considered [18,19]. The shape of these autocorrelation functions, that is, their dependence on event time, is universal in the sense that it is independent of the market and stock analyzed.
More precisely, for each considered event-time series we get a distinctly negative value of the lag-1 autocorrelation function, and almost vanishing values for lag-2, lag-3, and so on. For this reason, the shape of this autocorrelation function is considered a stylized fact.
The significant correlation between two successive price jumps stimulated Montero and Masoliver [15], as well as the authors of the present work [17], to describe the stochastic process of a single stock price as a CTRW with one-step backward memory. That is, it motivated the community to consider a CTRW model in which the current value of the increment depends only on the previous increment. Such a dependence is caused by the bid-ask bounce phenomenon [18,20]. Previously [17] we assumed, for simplicity, that the dependence between the current price jump and the second one before it can be neglected. In the present work, however, this dependence is taken into account, as we observed that a vanishing correlation does not imply a lack of dependence.
Based on the empirical histogram of two consecutive price jumps (Fig. 1 in [17] and Fig. 1a in the present work), we proposed [17] a formula describing the dependence between two consecutive price jumps, $r_n$ and $r_{n-1}$, in the form of the two-variable pdf
$$h(r_n, r_{n-1}) = (1-\epsilon)\, h(r_n)\, h(r_{n-1}) + \epsilon\, \delta(r_n + r_{n-1})\, h(r_{n-1}), \qquad (1)$$
or, equivalently, of the conditional pdf
$$h(r_n \mid r_{n-1}) = (1-\epsilon)\, h(r_n) + \epsilon\, \delta(r_n + r_{n-1}), \qquad (2)$$
where $h(x) = h(-x)$ and $\epsilon$ is a constant weight, which can be estimated either from the histogram or from the lag-1 autocorrelation function of consecutive price jumps. Apparently, only the second term in Eq. (1) describes the dependence between the variables $r_n$ and $r_{n-1}$.
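Furthermore, the above formulas imply a dependence between the jumps $r_n$ and $r_{n-2}$, expressed by the two-variable pdf
$$h_2(r_n, r_{n-2}) = (1-\epsilon^2)\, h(r_n)\, h(r_{n-2}) + \epsilon^2\, \delta(r_n - r_{n-2})\, h(r_{n-2}), \qquad (3)$$
which gives a significant, positive correlation between $r_n$ and $r_{n-2}$ equal to $\epsilon^2$.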
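The generalization of Eq. (3) to the dependence between any two price jumps is straightforward:
$$h_k(r_n, r_{n-k}) = (1-\epsilon^k)\, h(r_n)\, h(r_{n-k}) + \epsilon^k\, \delta\big(r_n - (-1)^k r_{n-k}\big)\, h(r_{n-k}). \qquad (4)$$
Hence, the autocorrelation function of price jumps in event time is simply
$$c(k) = \langle r_n r_{n-k} \rangle = (-\epsilon)^k\, \langle r^2 \rangle, \qquad (5)$$
where $\langle r^2 \rangle = \int r^2 h(r)\, dr$ is the second moment and $k$ is the number of steps in event time.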
Unfortunately, such behavior is not observed in empirical data, as the empirical autocorrelation function decays to zero much more quickly.
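The decay implied by the one-step memory of Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch, not taken from the paper: it simulates jumps that reverse the previous jump with probability eps and are otherwise drawn independently. The Gaussian choice for h(r), the sample size, the seed and the function names are illustrative assumptions. Iterating the conditional pdf suggests a normalized lag-k autocorrelation of (-eps)^k, which the estimates should approach.

```python
import random


def simulate_one_step_memory(n_jumps, eps, seed=0):
    """Jumps that reverse the previous jump with probability eps (assumed rule)."""
    rng = random.Random(seed)
    jumps = [rng.gauss(0.0, 1.0)]              # h(r) assumed Gaussian for illustration
    for _ in range(n_jumps - 1):
        if rng.random() < eps:
            jumps.append(-jumps[-1])           # delta(r_n + r_{n-1}) term
        else:
            jumps.append(rng.gauss(0.0, 1.0))  # independent draw from h
    return jumps


def event_time_autocorrelation(jumps, lag):
    """Normalized estimate of <r_n r_{n-lag}> / <r^2> for zero-mean jumps."""
    n = len(jumps)
    second_moment = sum(r * r for r in jumps) / n
    cross = sum(jumps[i] * jumps[i - lag] for i in range(lag, n)) / (n - lag)
    return cross / second_moment


eps = 0.3
jumps = simulate_one_step_memory(200_000, eps)
acf = {k: event_time_autocorrelation(jumps, k) for k in (1, 2, 3)}
for k, value in acf.items():
    # estimated value next to the theoretical (-eps)^k
    print(k, round(value, 3), round((-eps) ** k, 3))
```

For eps = 0.3 the theoretical values are -0.3, 0.09 and -0.027, i.e. an alternating decay that persists beyond lag 1, unlike the empirical autocorrelation described in the text.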
In principle, the empirical autocorrelation function of price jumps cannot be reproduced if one assumes that only two successive jumps are dependent and that the dependence is described by a symmetric distribution function, $h(r_n, r_{n-1}) = h(r_{n-1}, r_n)$, although the latter is justified by the empirical representation of $h(r_n, r_{n-1})$ shown in Fig. 1a. Unfortunately, in such a case the correlation between $r_n$ and $r_{n-2}$ is never negative, and it is greater than zero whenever consecutive jumps are correlated, which can be proven as follows:
$$\langle r_n r_{n-2} \rangle = \int dr_{n-1}\, h(r_{n-1}) \int dr_n\, r_n\, h(r_n \mid r_{n-1}) \int dr_{n-2}\, r_{n-2}\, h(r_{n-2} \mid r_{n-1}) = \int dr_{n-1}\, h(r_{n-1}) \left[ \int dr\, r\, h(r \mid r_{n-1}) \right]^2 \ge 0. \qquad (6)$$
This fundamental disagreement with empirical observation was one of the main motivations for considering a CTRW model with longer memory, in which each price jump depends on the two previous jumps. The conclusion may be surprising: to reproduce the lack of correlation, we actually need to assume a dependence between more than two consecutive jumps.

III. DEFINITION OF THE MODEL
Let us begin with the analysis of the empirical histogram presenting the dependence between the current price jump and the second one before it. This histogram, which is a statistical realization of the function $h_2(r_n, r_{n-2})$, is shown in Fig. 1b. The observed symmetric dependence between $r_n$ and $r_{n-2}$ can be considered a prominent empirical example of two random variables that are significantly dependent but uncorrelated.
The plot in Fig. 1a represents the dependence between $r_n$ and $r_{n-1}$. Besides the sharp central cross, corresponding to the case where at least one of the two consecutive price jumps has zero length, the plot contains an "anti-diagonal", corresponding to the case where two subsequent price jumps have the same length but opposite signs. The dependence presented in Fig. 1a was already discussed in Ref. [17] and, as shown there, can be satisfactorily described by Eq. (1). The plot in Fig. 1b has an essentially different structure. Besides the sharp central cross, it contains both a "diagonal" and an "anti-diagonal". These correspond to the case where the current price jump and the second one before it have the same length but may have either the same or opposite signs.
Apparently, Eq. (3) is able to reproduce only the diagonal of the histogram. To reproduce both the diagonal and the anti-diagonal we have to extend Eq. (3) into the form
$$h_2(r_n, r_{n-2}) = (1-2\zeta)\, h(r_n)\, h(r_{n-2}) + \zeta\, \delta(r_n - r_{n-2})\, h(r_{n-2}) + \zeta\, \delta(r_n + r_{n-2})\, h(r_{n-2}), \qquad (7)$$
essential for further considerations, where the second and third terms represent the diagonal and the anti-diagonal, respectively. These terms, together with the first term, make the distribution $h_2(r_n, r_{n-2})$ a properly normalized quantity. To obtain a vanishing correlation between $r_n$ and $r_{n-2}$ we assumed that the weights of the diagonal and the anti-diagonal are equal, and denoted them by $\zeta$. Now we can construct the three-variable pdf of three consecutive price jumps. For simplicity, instead of the notation $(r_n, r_{n-1}, r_{n-2})$ we use $(r_3, r_2, r_1)$.
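The statement that two jumps can be strongly dependent yet uncorrelated can be illustrated numerically. The sketch below is an illustration only: it assumes a mixture in which, with probability zeta each, the later jump copies the earlier one (diagonal) or its negative (anti-diagonal), and is otherwise independent. The Gaussian h(r), the value of zeta and all names are assumptions made for this example. Under these assumptions the plain correlation should vanish, while the correlation of absolute values should be close to 2*zeta.

```python
import random


def sample_pairs(n_pairs, zeta, seed=1):
    """Draw (r_n, r_{n-2}) pairs with equal diagonal and anti-diagonal weights."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        r_old = rng.gauss(0.0, 1.0)        # h(r) assumed Gaussian for illustration
        u = rng.random()
        if u < zeta:
            r_new = r_old                  # diagonal: same length, same sign
        elif u < 2.0 * zeta:
            r_new = -r_old                 # anti-diagonal: same length, opposite sign
        else:
            r_new = rng.gauss(0.0, 1.0)    # independent part
        pairs.append((r_new, r_old))
    return pairs


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5


zeta = 0.09
pairs = sample_pairs(200_000, zeta)
new = [p[0] for p in pairs]
old = [p[1] for p in pairs]
corr_plain = correlation(new, old)
corr_abs = correlation([abs(x) for x in new], [abs(x) for x in old])
print(round(corr_plain, 3), round(corr_abs, 3))
```

The diagonal and anti-diagonal contributions cancel in the plain correlation but add up in the correlation of jump lengths, which is why the histogram shows structure that the autocorrelation function does not.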
The three-variable pdf, $h(r_3, r_2, r_1)$, should obey the following constraints concerning the marginal distributions:
(a) Firstly, the distribution $h(r_3, r_2, r_1)$ integrated over any two of the three variables should reproduce, for the third variable, the single price jump distribution, the same in all three cases. The analogous constraint for the two-variable pdf is already satisfied by Eq. (1).
(b) Secondly, the distribution $h(r_3, r_2, r_1)$ integrated over the variable $r_1$ should reproduce the two-variable pdf $h(r_3, r_2)$ in the form of Eq. (1). The same pdf integrated over the variable $r_3$ should also reproduce the two-variable pdf $h(r_2, r_1)$, again in the form of Eq. (1).
Hence, we propose the key formula for $h(r_3, r_2, r_1)$, Eq. (8), which satisfies all the constraints mentioned above. Obviously, it is not the unique pdf with this property, but it seems to be as simple as possible.
It is worth mentioning that all terms on the right-hand side of Eq. (8), except the last one, are in fact present, with slightly different prefactors, in the simple product of the distributions $h(r_3 \mid r_2)$ and $h(r_2, r_1)$ defined by Eqs. (2) and (1), respectively. The only new term is the last one, proportional to $\delta(r_3 + r_1)\, h(r_2)$. This term describes the situation where the price jump $r_1$ is followed by a second, independent price jump $r_2$ and then by a third price jump $r_3 = -r_1$, which has the same length as $r_1$ but the opposite sign. Adding such a term can be justified by a model of the bid-ask bounce phenomenon with delay. We explain what we mean by "bid-ask bounce with delay" using the characteristic scenario presented below.
Let us consider a continuous-time double-auction market organized by an order book system [18,21-23]. Let buy and sell orders be sorted according to their price limits. The gap between the buy order with the highest price limit and the sell order with the lowest price limit is called the bid-ask spread [18,21-23]. In our previous paper [17] we analyzed, as a typical example, a series of orders that leads to the price bouncing between the lower and upper borders of the bid-ask spread. To justify the form of Eq. (1), we argued that if the price increases from the lower border of the bid-ask spread to some, possibly new, value of the upper border, two cases are possible.
In the first case, an appropriate sell order occurs, with probability $\epsilon$, and the price goes back to the vicinity of the previous price. This results in two consecutive price jumps of approximately the same length but opposite signs. In the second case, an order of another type arrives, which eliminates the system memory present in the bid-ask spread.
As a result, the subsequent price jump can be considered in this case as independent of the previous jump; it appears with probability $1-\epsilon$. These two cases can be formally expressed by the two-variable pdf in the form given by Eq. (1). However, as we argued in the previous section, the one-step memory CTRW formalism is not able to properly describe high-frequency stock market dynamics.
Fortunately, from the second case considered above we are able to extract a further case, leading eventually to the two-step memory. That is, if after the first price jump an executable buy order of small volume appears, the price jump initiated by this buy order will also be small or even equal to zero. In such a case the memory of the system is still present in the bid-ask spread, because its lower border has still not moved. Hence, a backward jump to the lower border is still possible, with a price jump of approximately the same length as the second-to-last price jump but with the opposite sign. We emphasize that we do not assume that subsequent orders are independent, so our model describes even a situation where memory is present in the order flow [24]. In terms of the pdf, such a case (of two-step memory) can be approximated by a term proportional to $\delta(r_3 + r_1)\, \delta(r_2)\, h(r_1)$. The first Dirac delta is responsible for the situation where the current jump $r_3$ repeats the second jump before it, $r_1$, but with the opposite sign (i.e. $r_3 = -r_1$). The second Dirac delta gives the zero-length middle price jump $r_2$. However, to obey all three constraints (a)-(c) on the marginal distributions of $h(r_3, r_2, r_1)$, we were forced to use, instead of the two deltas, the last term based on the product $\delta(r_3 + r_1)\, h(r_2)\, h(r_1)$. Let us recall that the single-jump distribution $h(r_2)$ is strongly concentrated in the vicinity of $r_2 = 0$. Taking this term into account with an appropriate weight, we completed our basic Eq. (8).
In our model the jumps of the process are not independent, as the current jump depends on the two preceding jumps. Hence, the conditional pdf of the jump length $r_3$, under the condition of the previous jumps $r_2$ and $r_1$, can be obtained from Eq. (8) by dividing both of its sides by $h(r_2, r_1)$ given by Eq. (1). This leads to the useful conditional pdf of Eq. (9), in which the relation between the Dirac delta and the Kronecker delta was used. Having precisely defined the dependence between consecutive jumps, we can introduce the stochastic process and derive the analytical forms of the propagator and the velocity autocorrelation function of the process.
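The bounce-with-delay mechanism can be mimicked by a deliberately simplified toy rule, which is not the conditional pdf of Eq. (9): with probability eps the current jump reverses the previous jump (immediate bounce), with probability q it reverses the jump two steps back (delayed bounce), and otherwise it is drawn independently. The weight q, the Gaussian h(r) and the function names are hypothetical choices for this sketch. For this toy rule a short calculation gives a lag-1 autocorrelation of -eps/(1+q) and a lag-2 autocorrelation of eps^2/(1+q) - q, so choosing q such that q(1+q) = eps^2 removes the lag-2 correlation.

```python
import math
import random


def simulate_two_step_memory(n_jumps, eps, q, seed=2):
    """Toy rule: immediate bounce (eps), delayed bounce (q), else independent."""
    rng = random.Random(seed)
    jumps = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]  # two starting jumps
    for _ in range(n_jumps - 2):
        u = rng.random()
        if u < eps:
            jumps.append(-jumps[-1])           # reverse the previous jump
        elif u < eps + q:
            jumps.append(-jumps[-2])           # reverse the jump two steps back
        else:
            jumps.append(rng.gauss(0.0, 1.0))  # memory lost: fresh draw from h
    return jumps


def autocorrelation(jumps, lag):
    """Normalized estimate of <r_n r_{n-lag}> / <r^2> for zero-mean jumps."""
    n = len(jumps)
    second_moment = sum(r * r for r in jumps) / n
    cross = sum(jumps[i] * jumps[i - lag] for i in range(lag, n)) / (n - lag)
    return cross / second_moment


eps = 0.3
q = (math.sqrt(1.0 + 4.0 * eps ** 2) - 1.0) / 2.0  # solves q * (1 + q) = eps**2
jumps = simulate_two_step_memory(200_000, eps, q)
c1 = autocorrelation(jumps, 1)
c2 = autocorrelation(jumps, 2)
print(round(q, 4), round(c1, 3), round(c2, 3))
```

In this toy rule the required weight behaves as q ≈ eps^2 for small eps, so a weak delayed bounce is enough to cancel the positive lag-2 correlation produced by two immediate bounces in a row.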

IV. SOLUTION
The high-frequency share price time series can be considered a single realization, or trajectory, of a jump process. The trajectory of such a stochastic process is a stepwise function consisting of waiting times $t_n$, each preceding a price jump $r_n$. Hence, a single trajectory can be defined in time and space by the series of subsequent points $(t_1, r_1; t_2, r_2; \ldots; t_n, r_n)$, and the process can be described by the conditional probability density $\rho(r_n, t_n \mid r_{n-1}, t_{n-1}; r_{n-2}, t_{n-2}; \ldots; r_2, t_2; r_1, t_1)$. This is the probability density of the jump $r_n$ after the waiting time $t_n$, conditioned on the whole history $(t_1, r_1; t_2, r_2; \ldots; t_{n-1}, r_{n-1})$. Now we make simplifying assumptions to construct a theoretical model describing the real share price process:
• the process is stationary, ergodic and homogeneous in the price variable. We neglect the influence of the so-called lunch effect, which is the result of the daily pattern of investors' activity;
• all waiting times between successive price changes, $t_n$, are i.i.d. random variables with distribution $\psi(t_n)$ [29] with finite average. In the case of an infinite average the process is not ergodic [25,26];
• each price jump $r_n$ depends only on the two previous price jumps, $r_{n-1}$ and $r_{n-2}$, in the form given by Eq. (9).
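The stepwise trajectory described above can be made concrete with a small helper. This is a minimal sketch with hypothetical names: given the waiting times and the jumps, it accumulates the clock times of the jumps and the price levels after each jump, and returns the price X at an arbitrary clock time, starting from X = 0 at t = 0.

```python
import bisect
import itertools


def price_at(time, waiting_times, jumps):
    """Value X(time) of the stepwise trajectory started at X = 0 at t = 0."""
    epochs = list(itertools.accumulate(waiting_times))  # clock times of the jumps
    levels = list(itertools.accumulate(jumps))          # price right after each jump
    n_done = bisect.bisect_right(epochs, time)          # number of jumps made by 'time'
    return levels[n_done - 1] if n_done else 0.0


# toy trajectory (t_1, r_1; t_2, r_2; t_3, r_3)
waiting_times = [0.5, 1.0, 0.25]
jumps = [1.0, -1.0, 2.0]
for t in (0.4, 0.5, 1.6, 2.0):
    print(t, price_at(t, waiting_times, jumps))
```

Note the distinction between the waiting times, which are intervals, and the clock time passed to `price_at`; the same distinction is drawn in the text between the waiting time and the current time $t$ of the propagator.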
Equation (10) gives the recipe for an infinitely long trajectory but, as the process is homogeneous and stationary, we can arbitrarily choose the origin of the time and price axes. Since we analyze trajectories starting at some arbitrary time $t = 0$ and price $X = 0$, we have to take into account that the first price jump after $t = 0$ depends on the two previous price jumps, which we call $r_0$ and $r_{-1}$. This can be handled by weighting the trajectories by $h(r_0, r_{-1})$, where $h$ is given by Eq. (1).
Furthermore, we cannot use the same waiting-time distribution for the first jump as for the other jumps. This is because the jump $r_0$ might have occurred at any time before $t = 0$. Therefore, we average over all possible time intervals $t'$ between the jump $r_0$ and the time origin $t = 0$.
Such an averaging was proposed in [9] and leads to the distribution
$$\psi_1(t) = \frac{1}{\langle t \rangle} \int_t^{\infty} \psi(t')\, dt', \qquad (11)$$
where the expected (mean) waiting time is $\langle t \rangle = \int_0^{\infty} t\, \psi(t)\, dt$.
The aim of this section is to derive the conditional probability density, $P(X, t)$, of finding the share price value $X$ at time $t$, under the condition that the initial share price value is taken as the origin. Further in the text we call this probability density the soft stochastic propagator, in contrast to the sharp one, which we define below. Note that $t$ denotes here the clock (current) time and not a waiting time or time interval. The derivation of the propagator consists of a few steps described in the following paragraphs.
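The waiting-time distribution of the first jump can be sketched for a concrete case. The example below assumes the standard result of such averaging, psi_1(t) = (1/<t>) * integral from t to infinity of psi(t') dt', and a two-exponential waiting-time distribution psi(t) = (w/tau1) exp(-t/tau1) + ((1-w)/tau2) exp(-t/tau2); both the closed forms and the function names are assumptions of this sketch. For w = 1 the distribution is a single exponential and psi_1 coincides with psi.

```python
import math


def psi(t, w, tau1, tau2):
    """Two-exponential waiting-time density (assumed form)."""
    return (w / tau1) * math.exp(-t / tau1) + ((1.0 - w) / tau2) * math.exp(-t / tau2)


def survival(t, w, tau1, tau2):
    """Probability that the waiting time exceeds t (integral of psi from t to infinity)."""
    return w * math.exp(-t / tau1) + (1.0 - w) * math.exp(-t / tau2)


def psi_first(t, w, tau1, tau2):
    """Waiting-time density of the first jump after an arbitrary time origin."""
    mean_waiting_time = w * tau1 + (1.0 - w) * tau2
    return survival(t, w, tau1, tau2) / mean_waiting_time


# single exponential (w = 1): first-jump density equals psi itself
print(psi(0.7, 1.0, 2.0, 5.0), psi_first(0.7, 1.0, 2.0, 5.0))
# genuine mixture: the two densities differ
print(psi(0.0, 0.5, 1.0, 10.0), psi_first(0.0, 0.5, 1.0, 10.0))
```

For the mixture, the first-jump density gives more weight to long waiting times than psi does, because an arbitrary time origin is more likely to fall inside a long interval.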
The intermediate quantity describing the stochastic process is the sharp $n$-step propagator $Q_n(X, r_n, r_{n-1}; t)$, $n = 1, 2, \ldots$. This propagator is defined as the probability density that the process, which initially (at $t = 0$) had the original value ($X = 0$), makes its $(n-1)$-th jump by $r_{n-1}$ from $X - r_n - r_{n-1}$ to $X - r_n$ (at any time) and makes its $n$-th jump by $r_n$ from $X - r_n$ to $X$ exactly at time $t$. The key relation needed for the exact solution of the process is the recursion relation, Eq. (12), which relates two successive sharp propagators by a spatio-temporal convolution.
This equation is valid only for $n \ge 3$; the required propagators $Q_1$ and $Q_2$ are calculated directly from their definitions by using the distributions $\psi$ and $h$. Now we can define the summed sharp propagator $Q(X, t)$ through Eq. (13). Finally, to obtain the soft stochastic propagator, $P(X, t)$, we use the relation between the soft and sharp propagators, Eq. (14), which is much easier to consider in the Fourier-Laplace domain. Here $\tilde{O}$ means the Fourier, Laplace, or Fourier-Laplace transform of $O$, while the sojourn probabilities (in the time and Laplace domains) are defined by the corresponding waiting-time distributions. In our previous paper [17] we presented the whole procedure for obtaining the explicit form of Eq. (14) for the one-step memory model. Unfortunately, for the recurrence given by Eq. (12) and the dependence given by Eq. (9), the calculations are too tedious to present here (they can be found in the Appendix to the PhD thesis of one of us [27] [30]). We use herein only the final form of the sharp propagator, Eq. (17), in which $\tilde{\psi} \equiv \tilde{\psi}(s)$, $\tilde{h} \equiv \tilde{h}(k)$ is the Fourier transform of the single-jump distribution, and the parameters $\epsilon$ and $\zeta$ do not depend on $s$ and $k$. Let us notice that the numerator and the denominator of the right-hand side of Eq. (17) are sixth-order polynomials in $\tilde{\psi}(s)$.
We calculate the variance of the soft propagator in the Laplace domain by using Eqs. (14)-(17). Notice that in the resulting expression both the numerator and the denominator are already reduced to third-order polynomials in $\tilde{\psi}$. Hence, the Laplace transform of the velocity autocorrelation function (VAF) is given by Eq. (21). As we are interested in a closed form of the VAF in the time domain, we find Eq. (21) too complex to perform the inverse Laplace transformation, even for the simplest $\tilde{\psi}(s)$. To simplify Eq. (21) and reduce the number of free parameters of the model, we can assume that the parameter $\zeta$, responsible for the two-step memory, is of the second order in $\epsilon$, as specified by Eq. (22).
Hence, we have a direct correspondence between Eqs. (3) and (7), as it should be. Such a simplification gives the VAF in a form where the root $j = -\frac{1}{2} + i\frac{\sqrt{3}}{2}$ appears and $\bar{O}$ means the complex conjugate of $O$. Apparently, the numerator and the denominator are now reduced to second-order polynomials in $\tilde{\psi}$. It is worth mentioning that the power spectrum of our process can be obtained directly from this equation by using the Wiener-Khinchin theorem [28]. The normalized VAF in the time domain is given by Eq. (24), where $L_t^{-1}\{\ldots\}$ is the inverse Laplace transform and λ = j 6 . An advantage of our approach is that a much more complicated model with longer memory, compared to the previous one (cf. Eq. (19) in [17]), leads only to a small, although significant, modification of the result shown in [17]. Furthermore, the VAF depends on the same quantities as in our previous model, that is, the waiting-time distribution $\tilde{\psi}(s)$ and the parameter $\epsilon$.
Nevertheless, we should discuss in detail the difference between these two models.

V. DIFFERENCE BETWEEN ONE- AND TWO-STEP MEMORY MODELS
In Section II we discussed selected properties of the one-step memory model and compared them with well-known properties of empirical data. The disagreement observed there motivated the development of the two-step memory model, solved in the previous section.
We are now ready to study the difference between these models at the level of observed characteristics, e.g. the autocorrelation functions, which are of particular significance.
Let us begin with the analysis of the autocorrelation function in event time obtained within both models. The dependence between any two jumps of the process within the one-step memory model is given by Eq. (4). This dependence results in the autocorrelation given by Eq. (5). However, in the case of the two-step memory model, the analogous dependence for $k \ge 1$ takes a more general but more complicated form, in which the conditional distributions under the integral are defined by Eqs. (8) and (9). The remaining unknown quantities $\mu_k$ and $\nu_k$ are functions of the model parameters $\epsilon$ and $\zeta$ and are discussed below.
The autocorrelation function in event time can be calculated analogously to Eq. (5). Directly from the construction procedure of our model we obtain $\mu_1 = \epsilon$, $\nu_1 = 0$ and $\mu_2 = \nu_2 = \zeta$. Fortunately, we are also able to obtain the results for $k = 3$ and $k = 4$ in explicit form; they simplify under approximation (22) and (cf. [17] or Sec. VI) are, in practice, negligibly small quantities.
A particularly useful way to visualize the role of the two-step memory is to analyze the velocity autocorrelation functions in real time. To highlight the generic difference between the models we perform the inverse Laplace transformation in Eq. (24) for the simplest possible, exponential waiting-time distribution (WTD). In such a case the observed difference between the results for the two models is not influenced by the complexity of the waiting-time distribution. We assume that
$$\psi(t) = \frac{1}{\langle t \rangle} \exp\left(-\frac{t}{\langle t \rangle}\right), \qquad (35)$$
where $\langle t \rangle$ is the average inter-event time. In the Laplace domain Eq. (35) takes the form
$$\tilde{\psi}(s) = \frac{1}{1 + s \langle t \rangle}. \qquad (36)$$
Note that it is the only continuous pdf for which $\psi_1(t) = \psi(t)$. Within our previous, simpler model this WTD leads to the normalized VAF of Eq. (37) (cf. Eq. (23) in [17]). Analogously, by substituting Eq. (35) into Eq. (24) we obtain the VAF of the two-step memory model, Eq. (38). It can easily be proved that for $\epsilon \ll 1$ the VAF given by Eq. (38) reduces to Eq. (37). This reduction was expected within the approximation adopted above. For intermediate times the two-step memory model gives a VAF closer to zero than that of the one-step memory model, and the higher the parameter $\epsilon$, the larger the difference between the corresponding VAFs. The main qualitative difference between the models is that, for the two-step memory model, the VAF changes its sign, as the cosine in Eq. (38) becomes negative for long enough times.

VI. COMPARISON WITH EMPIRICAL DATA
The comparison of the VAF predicted by both our models with the empirical VAF requires: (i) a realistic waiting-time distribution $\tilde{\psi}(s)$, (ii) the estimated value of the parameter $\epsilon$, and (iii) a method of VAF estimation for unevenly spaced time series.
These generic requirements were already discussed in detail in our previous paper [17].
Notably, we use herein the same empirical data sets and the same methodology as we used there, which makes it possible to compare the predictions of our current theoretical model with their empirical counterparts.
In Ref. [17] we proposed a simple but quite realistic form of the waiting-time distribution, which is a superposition (or weighted sum) of two exponential distributions,
$$\psi(t) = \frac{w}{\tau_1} \exp\left(-\frac{t}{\tau_1}\right) + \frac{1-w}{\tau_2} \exp\left(-\frac{t}{\tau_2}\right), \qquad (39)$$
where $0 \le w \le 1$ is the weight, while $\tau_1$ and $\tau_2$ are the corresponding (partial) relaxation times. The resulting VAF, Eq. (40), involves the quantities
$$\nu_{1,2}(j) = \frac{1}{2j} \left[ j(\omega_1 + \omega_2) + \epsilon \upsilon \pm \sqrt{\big(j(\omega_1 + \omega_2) + \epsilon \upsilon\big)^2 - 4 j (j + \epsilon)\, \omega_1 \omega_2} \right].$$
Apparently, the formal similarity of the one- and two-step models is emphasized by the similar structures of both resulting VAFs. However, in the current case the parameters of the exponents in Eq. (40) are complex numbers (and not real numbers as in our previous model, cf. Eqs. (25) and (26) in Ref. [17]), which makes their interpretation much more complicated.
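The complex character of the exponents can be seen by evaluating $\nu_{1,2}(j)$ numerically. The sketch below reads the expression above as the two roots of the quadratic $j x^2 - (j(\omega_1 + \omega_2) + \epsilon\upsilon)\, x + (j + \epsilon)\,\omega_1 \omega_2 = 0$. The numerical values are arbitrary, and $\omega_1$, $\omega_2$ and $\upsilon$ are treated as given inputs, since their definitions are not reproduced in this text; the function name is hypothetical.

```python
import cmath


def nu_roots(j, omega1, omega2, eps, upsilon):
    """Both values nu_{1,2}(j), read as roots of a quadratic with complex j."""
    b = j * (omega1 + omega2) + eps * upsilon
    disc = cmath.sqrt(b * b - 4.0 * j * (j + eps) * omega1 * omega2)
    return (b + disc) / (2.0 * j), (b - disc) / (2.0 * j)


j_complex = complex(-0.5, 3 ** 0.5 / 2)  # the root j = -1/2 + i*sqrt(3)/2
# arbitrary illustrative parameter values (not estimated from data)
omega1, omega2, eps, upsilon = 1.0, 0.1, 0.3, 0.5

roots_two_step = nu_roots(j_complex, omega1, omega2, eps, upsilon)
roots_real_j = nu_roots(1.0, omega1, omega2, eps, upsilon)
print(roots_two_step)
print(roots_real_j)
```

With j = 1 and these inputs the discriminant is positive and both values are real, whereas the complex root j generally produces complex exponents, i.e. damped oscillations in the time domain.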
Furthermore, we should emphasize that all required parameters, $\tau_1$, $\tau_2$, $w$ and $\epsilon$, can be estimated on separate empirical data sets without using the empirical VAF. That is, jump lengths alone are sufficient to find the parameter $\epsilon$, while only waiting times are required to estimate the remaining parameters. As a result, the comparison of our theoretical VAF with its empirical counterpart is not a fit, as no free parameters are left.
The comparison of the predictions of our present model with the empirical VAF, together with the corresponding predictions of our previous model, is shown in Fig. 3. The improved agreement provided by the present model is clearly visible. That is, for intermediate times ($t \in [5, 40]$) the prediction of the current model is closer to the empirical data than that of the previous one, while the VAF for shorter times ($t < 5$) and longer times ($t > 40$) remains practically unchanged.
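Estimating $\epsilon$ from jump lengths alone can be sketched as follows. Since Eq. (1) implies $\langle r_n r_{n-1} \rangle = -\epsilon \langle r^2 \rangle$ for a symmetric $h$, a simple estimator is minus the normalized lag-1 autocorrelation of the jumps. This is an illustrative sketch with a hypothetical function name, assuming zero-mean jumps; it is not necessarily the estimation procedure used in [17].

```python
def estimate_epsilon(jumps):
    """Estimate eps as minus the normalized lag-1 autocorrelation of the jumps."""
    n = len(jumps)
    second_moment = sum(r * r for r in jumps) / n
    lag1 = sum(jumps[i] * jumps[i - 1] for i in range(1, n)) / (n - 1)
    return -lag1 / second_moment


# perfectly bouncing series: every jump reverses the previous one
print(estimate_epsilon([1.0, -1.0, 1.0, -1.0]))
# series with runs of equal signs: the estimate becomes negative
print(estimate_epsilon([1.0, 1.0, -1.0, -1.0]))
```

A negative estimate signals that the bounce picture does not apply to the series; waiting times do not enter the estimator at all, in line with the statement that the jump lengths alone suffice.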

VII. CONCLUDING REMARKS
In the present paper we developed a version of the CTRW formalism that contains the dependence between three consecutive jumps of the process, i.e. memory over two steps backward instead of one step backward, as considered earlier. This dependence was studied independently of whether correlations exist in the system or not. Such an approach was directly inspired by empirical histograms (cf. the plots in Fig. 1).
Although our approach is sufficiently generic, we specified it, guided by empirical data, only for the case of a vanishing event-time autocorrelation function c(2) (cf. Tab. I). Moreover, our present model significantly improves the agreement of the time-dependent VAF with its empirical counterpart (cf. the plot in Fig. 3) and considerably extends the field of application of the CTRW formalism.