Continuous-Time Random Walk with multi-step memory: an application to market dynamics

An extended version of the Continuous-Time Random Walk (CTRW) model with memory is herein developed. This memory involves a dependence between an arbitrary number of successive jumps of the process, while the waiting times between jumps are considered i.i.d. random variables. The dependence was established by analyzing empirical histograms of the stochastic process of a single share price on a market at the high-frequency time scale. It was then justified theoretically by considering the bid-ask bounce mechanism, containing some delay, characteristic of any double-auction market. Our model turns out to be exactly analytically solvable, which enables a direct comparison of its predictions with their empirical counterparts, for instance, with the empirical velocity autocorrelation function. Thus, the present research significantly extends the capabilities of the CTRW formalism.


Introduction
The dynamics of many complex systems, not only in the natural but also in the socio-economic sciences, is usually represented by stochastic time series. These series are often composed of elementary random spatio-temporal events, which may show some dependencies and correlations as well as apparently universal structures [1][2][3][4][5][6]. By such an elementary event we understand a "spatial" jump, r, of a stochastic process preceded by a waiting (interevent or pausing) time, τ, both being stochastic variables.
Such a two-phase stochastic process, named Continuous-Time Random Walk (CTRW), was introduced in the physical context of dispersive transport and diffusion by Montroll and Weiss [7] and applied successfully to description of a photocurrent relaxation in amorphous films [8][9][10][11][12] (and references therein) and in OLED ones [13,14].
The CTRW formalism was applied, for example, to diffusion in probabilistic fractal structures such as percolation clusters [15] and to fractional diffusion [16]. The CTRW with a broad waiting-time distribution was applied, e.g., to diffusion in chaotic systems [17]. The CTRW formalism containing a broad spatial jump distribution explained the superdiffusion (Lévy flights or walks) [18] observed in domains of rotating flows or weakly turbulent flows [19,20]. The CTRW has found innumerable applications in many other fields: hydrogen diffusion in nanostructure compounds [21], nearly constant dielectric loss in disordered ionic conductors [22], subsurface tracer diffusion [23], electron transfer [24], aging of glasses [25,26], transport in porous media [27], diffusion of epicenters of earthquake aftershocks [28], cardiological rhythms [29], search models [30], human travel [31] and even financial markets [32][33][34][35][36]. Today, the CTRW provides a unified description of both enhanced and dispersive diffusion [37][38][39][40], and the list of its applications is still growing (cf. [41]). Nearly three and a half decades ago, versions of the CTRW formalism containing backward or forward correlations between jump directions were developed [42] (and references therein). Soon after, the first application of the former version of the formalism, in the case of a concentrated lattice gas, was performed for the study of the tracer diffusion coefficient [43]. That study was directly inspired by hydrogen diffusion in transition metals [44,45] and ionic conductivity in super-ionic conductors [46].

(Contribution to the Topical Issue "Continuous Time Random Walk Still Trendy: Fifty-year History, Current State and Outlook", edited by Ryszard Kutner and Jaume Masoliver. E-mail: tomasz.gubiec@fuw.edu.pl)
As a result, for lattices of low coordination numbers or networks with low average node degrees, the description of tracer diffusion in a concentrated lattice gas requires an extension of the CTRW formalism that takes into account dependencies over several subsequent jumps [47]. These dependencies occur because the vacancy left behind the tracer particle after its jump favors the return of the tracer to its original location, even after several jumps. The CTRW formalism with memory has also appeared in other contexts, including correlations over waiting times [48][49][50][51] (or the alternative approach in [52,53]), but up to now it has been limited to the dependence between two subsequent jumps, as its extension to the case of memory (or dependence) over three or more subsequent jumps was too complicated for theoretical derivation.
This work extends the range of applications of the CTRW formalism by including memory reaching two jumps behind the current jump. In other words, in this work the dependence between three subsequent jumps is considered, resulting in an exact analytical solution. Such an approach is useful not only for the study of one-dimensional random walks but can also be useful in higher dimensions for different kinds of lattices and networks.
Furthermore, we applied our CTRW formalism to a subtle description of high-frequency price dynamics driven by the microscopic mechanism of the bid-ask bounce phenomenon. One argument in favor of CTRW formalisms is that they provide a generic formula for the first- and second-order time-dependent statistics in terms of two auxiliary distributions, a spatial one, h(r), and a temporal one, ψ(τ), that can be obtained directly from empirical histograms.
The paper is organized as follows: in Section 2 we present the motivation of our work. In Section 3 we define the proper stochastic process, which is solved in Section 4. In Section 5 the novel model is compared with our previous model [54], and in Section 6 the comparison with empirical data is made. Section 7 contains our concluding remarks.

Direct motivation
There are a few direct reasons (considered below), supplied, e.g., by financial markets, that prompted us to include two-step memory in the Continuous-Time Random Walk formalism in a generic way.
If, for simplicity, we record only successive share price jumps and not the time intervals (waiting times) between them, we obtain the so-called "event-time" series. The event-time autocorrelation functions of price changes obtained on this basis have already been widely considered [55,56]. The shape of these autocorrelation functions, i.e., their dependence on event time, is universal in the sense that it is independent of the market and stock analyzed. Moreover, for each considered event-time series we obtain a distinctly negative value of the lag-1 autocorrelation function and almost vanishing values for lags 2, 3, . . .. Therefore, the shape of this autocorrelation function can be considered a stylized fact.
The significant correlation between two successive price jumps stimulated Montero and Masoliver [48] as well as the authors of the present work [54] to describe the stochastic process of a single stock price as a CTRW with one-step backward memory, in which the current value of the increment depends only on the previous one. Such a dependence is caused in finance by the bid-ask bounce phenomenon [55,57]. Previously, we assumed that the dependence between the current price jump and the second jump before it can be neglected, as the corresponding correlation vanishes [54]. However, in the present work this dependence is taken into account, as we observe herein that even a vanishing correlation does not imply a lack of dependence.

Fig. 1. Empirical normalized histograms of two kinds of dependence between price jumps: (a) between the current price jump and the preceding price jump; (b) between the current price jump and the second one before it, the next-to-last jump. A larger logarithm of the joint probability is visualized by more intense grayness. To avoid the singularity of the logarithmic scale, all probabilities are increased by the small, insignificant number 10^{-5}. The histogram is based on the dataset used in [54] for PEKAO stock.

This is a key observation which initialized the present work. We recall that, basing on the empirical histogram of two consecutive price jumps (compare Fig. 1 in Ref. [54] and the analogous Fig. 1a in the present work), we proposed a formula which describes the dependence (herein of the backward form) between two consecutive (lag-1) jumps, r_n and r_{n-1}, by the joint two-variable pdf

h(r_n, r_{n-1}) = (1 - ε)h(r_n)h(r_{n-1}) + ε δ(r_n + r_{n-1})h(r_{n-1}),   (1)

or equivalently by the conditional pdf

h(r_n | r_{n-1}) = (1 - ε)h(r_n) + ε δ(r_n + r_{n-1}),   (2)

where h(x) is an even function, as no drift is present herein, and 0 ≤ ε ≤ 1 is a constant weight, which can be estimated either from the histogram or from the lag-1 autocorrelation function of consecutive jumps of the process. Apparently, only the second term in equations (1) and (2) describes the dependence (herein of the backward type) between the variables r_n and r_{n-1}. Furthermore, the above formulas imply a dependence between the jumps r_n and r_{n-2}, expressed by the two-variable pdf

h_2(r_n, r_{n-2}) = (1 - ε²)h(r_n)h(r_{n-2}) + ε² δ(r_n - r_{n-2})h(r_{n-2}),   (3)

which gives a significant, positive correlation between r_n and r_{n-2} equal to ε². The generalization of equation (3) to the dependence between any two jumps is straightforward,

h_k(r_n, r_{n-k}) = (1 - ε^k)h(r_n)h(r_{n-k}) + ε^k δ(r_n - (-1)^k r_{n-k})h(r_{n-k}),   (4)

where k is the number of steps in the event time. Hence, the autocorrelation function of jumps in the event time is simply

c(k) = (-ε)^k ⟨r²⟩,   (5)

where ⟨r²⟩ is the second moment of h. Such a slowly decaying autocorrelation (5) is not observed in empirical data, as the empirical autocorrelation function decreases to zero much more quickly. In principle, the empirical autocorrelation function between jumps cannot be reproduced if one assumes that (i) only two successive jumps are dependent and (ii) this dependence is described by the symmetric distribution function h(r_n, r_{n-1}) = h(r_{n-1}, r_n), although the latter is justified by the empirical representation of h(r_n, r_{n-1}) shown in Figure 1a.
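The one-step bounce rule of equations (1) and (2) is easy to probe numerically. The following minimal sketch (assuming, purely for illustration, a Gaussian single-jump pdf h, which is not dictated by the model) samples a jump sequence in event time and checks that the lag-1 autocorrelation is close to -ε while the lag-2 one is close to +ε²:

```python
import numpy as np

rng = np.random.default_rng(0)
eps, n = 0.4, 200_000            # illustrative memory weight and series length
r = np.empty(n)
r[0] = rng.standard_normal()     # h taken as a standard Gaussian (illustrative)
for i in range(1, n):
    # eq. (2): bounce back with probability eps, otherwise an independent jump
    r[i] = -r[i - 1] if rng.random() < eps else rng.standard_normal()

def acf(x, k):
    # normalized event-time autocorrelation at lag k
    return np.mean(x[k:] * x[:len(x) - k]) / np.mean(x * x)

print(acf(r, 1), acf(r, 2))      # ≈ -eps and ≈ +eps**2
```

The estimated values converge to -ε and +ε² as the series length grows, reproducing the slowly decaying autocorrelation (5) that the empirical data contradict.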
As a consequence of assumption (i), the correlation between r_n and r_{n-2} is always greater than zero (see Appendix A for a detailed derivation), which essentially disagrees with the empirical data shown in Figure 1b. Indeed, this disagreement is one of the main inspirations to consider the CTRW model with longer memory, where each current jump of the process depends on the two previous jumps.

Definition of the model
Let us begin with the analysis of the empirical histogram presenting the dependence between the current price jump and the second one before it. This histogram, which is a statistical realization of the function h_2(r_n, r_{n-2}), is shown in Figure 1b. The observed antisymmetric dependence between r_n and r_{n-2} can be considered a generic empirical example of two random variables which are dependent but uncorrelated. Besides a sharp central cross, the histogram contains both a "diagonal" and an "anti-diagonal". These correspond to the cases where the current price jump and the second one before it have the same length but either the same or the opposite sign, respectively.
Apparently, equation (3) is able to reproduce only the diagonal of the histogram. To reproduce both the diagonal and the anti-diagonal we have to extend equation (3) into the form, essential for further considerations,

h_2(r_n, r_{n-2}) = (1 - 2ζ)h(r_n)h(r_{n-2}) + ζ δ(r_n - r_{n-2})h(r_{n-2}) + ζ δ(r_n + r_{n-2})h(r_{n-2}),   (6)

where the second and third terms represent the diagonal and anti-diagonal, respectively. These terms, together with the first term, make the distribution h_2(r_n, r_{n-2}) a well-normalized quantity.
To obtain a vanishing correlation between r_n and r_{n-2} we assumed the weights of the diagonal and the anti-diagonal equal, both denoted by ζ. Now, we can construct the three-variable pdf of three consecutive price jumps. For simplicity, instead of the notation (r_n, r_{n-1}, r_{n-2}) we use (r_3, r_2, r_1). The three-variable pdf, h(r_3, r_2, r_1), should obey the following constraints concerning the marginal distributions:
(a) the distribution h(r_3, r_2, r_1) integrated over any two of the three variables should reproduce, for the third variable, the single price jump distribution, the same for all three cases; the analogous constraint for the two-variable pdf is already satisfied by equation (1);
(b) the distribution h(r_3, r_2, r_1) integrated over variable r_1 should reproduce the two-variable pdf, h(r_3, r_2), in the form of equation (1); the same pdf h(r_3, r_2, r_1) integrated over variable r_3 should also reproduce the two-variable pdf, h(r_2, r_1), again in the form of equation (1);
(c) however, the pdf h(r_3, r_2, r_1) integrated over variable r_2 should give the pdf h_2(r_3, r_1) in the form of equation (6).
Hence, we propose a key formula for h(r_3, r_2, r_1) in the form

h(r_3, r_2, r_1) = (1 - 2ε)h(r_3)h(r_2)h(r_1) + (ε - ζ)δ(r_3 + r_2)h(r_2)h(r_1) + (ε - ζ)δ(r_2 + r_1)h(r_3)h(r_1) + ζ δ(r_3 + r_2)δ(r_2 + r_1)h(r_1) + ζ δ(r_3 + r_1)h(r_2)h(r_1),   (7)

which satisfies all the constraints mentioned above. Obviously, this form is not a unique pdf, but it is the simplest one which uses only two parameters (ε and ζ), and additionally each term has a clear interpretation.
It is worth mentioning that all terms on the right-hand side of equation (7), except the last one, are present, with slightly modified pre-factors, in the simple product of the distributions h(r_3 | r_2) and h(r_2, r_1) defined by equations (2) and (1), respectively. The only new term is the last one, proportional to δ(r_3 + r_1)h(r_2). This term describes the case when the price jump r_1 is followed by a second, independent price jump r_2 and a third price jump r_3 = -r_1, which has the same length as jump r_1 but the opposite sign. Adding such a term is due to the bid-ask bounce phenomenon with delay, present herein. We explain what is meant by the name "bid-ask bounce with delay" by using the characteristic scenario given below.
Let us consider a continuous-time double-auction market organized by the order book system [55,58-60]. Let buy and sell orders be sorted according to their price limits. The gap between the buy order with the highest price limit and the sell order with the lowest price limit is called the bid-ask spread [55,58-60]. In our previous paper [54] we analyzed, as a typical example, a series of orders which leads to the bouncing of the price between the lower and higher borders of the bid-ask spread. To justify the form of equation (1), we argued that if the price increases from the lower border of the bid-ask spread to some possibly new value of the higher border, two cases are possible.
In the first case, an appropriate sell order occurs, with probability ε, and the price goes back to the vicinity of the previous price. This results in two consecutive price jumps of approximately the same length but opposite signs. In the second case, another type of order arrives, which leads to the elimination of the system memory present in the bid-ask spread. As a result, the subsequent price jump can in this case be considered independent of the previous jump; it appears with probability 1 - ε. These two cases can be formally expressed by the two-variable pdf just in the form given by equation (1). However, as we argued in the previous section, the one-step memory CTRW formalism is not able to properly describe high-frequency stock market dynamics.
Fortunately, from the second case considered above we are able to extract a subsequent case, leading eventually to the two-step memory. That is, if after the first price jump a sufficiently small-volume executable buy order appears, the price jump initiated by this buy order will be equal to zero. In such a case, the memory of the system is still present in the bid-ask spread, because its lower border has in fact not yet moved. Hence, a backward jump to the lower border is still possible, with a price jump of approximately the same length as the second-to-last price jump but with the opposite sign. An analogous dependence can be present for longer series of consecutive jumps, but with systematically decreasing weight. We emphasize that we do not assume that subsequent orders are independent, so our model even describes a situation where memory is present in the order flow [61].
By means of the pdf, the term describing such a case (of the two-step memory) can be approximated by a term proportional to δ(r_3 + r_1)δ(r_2)h(r_1). The first Dirac delta is responsible for the situation where the current jump r_3 repeats the second jump before it, r_1, but with the opposite sign (i.e., r_3 = -r_1). The second Dirac delta gives the zero-length mid-price jump r_2. However, to obey all three constraints (a)-(c) on the marginal distributions of h(r_3, r_2, r_1), we were forced to use, instead of the two deltas, a last term based on the product δ(r_3 + r_1)h(r_2)h(r_1). Let us recall that the single jump distribution h(r_2) is strongly concentrated in the vicinity of r_2 = 0. Taking this term into account with the appropriate weight, we thus completed our basic equation (7).
In our model the jumps of the process are not independent, as the current jump depends on the two preceding jumps. Hence, the conditional pdf of the jump length r_3, under the condition of the previous jumps r_2 and r_1, can be obtained from equation (7) by dividing both its sides by h(r_2, r_1) given by equation (1). This leads to the useful conditional pdf

h(r_3 | r_2, r_1) = (1 - δ_{r_2,-r_1}) [(1 - 2ε)h(r_3) + (ε - ζ)δ(r_3 + r_2) + ζ δ(r_3 + r_1)] / (1 - ε) + δ_{r_2,-r_1} [(ε - ζ)h(r_3) + ζ δ(r_3 - r_1)] / ε,   (8)

where the following relations between the Dirac delta and the Kronecker delta were used: δ(r_2 + r_1)f(r_2) = δ(r_2 + r_1)f(-r_1) and δ(r_3 + r_2)δ(r_2 + r_1) = δ(r_3 - r_1)δ(r_2 + r_1). As we have precisely defined the dependences between consecutive jumps, we can introduce a stochastic process and derive the analytical forms of its most significant characteristics, such as the propagator and the velocity autocorrelation function of the process. Notably, equations (7) and (8) have a generic character, which does not limit them to the local dynamics of a share price only.
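The conditional structure just described can be sampled directly. The sketch below is a minimal illustration: the Gaussian h, the value of ε and the choice ζ = ε² are assumptions made only for the demonstration, and the branch weights are read off from dividing equation (7) by equation (1). An exact bounce-back pair keeps the memory alive; otherwise the next jump is fresh, an ordinary bounce of the last jump, or a delayed bounce of the next-to-last jump. The resulting series shows c(1) ≈ -ε but c(2) ≈ 0, i.e., dependence without correlation:

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 0.4
zeta = eps ** 2                 # self-consistency choice, cf. eq. (17)
n = 300_000
r = np.empty(n)
r[0], r[1] = rng.standard_normal(2)
bounced = False                 # True iff r[i-1] == -r[i-2] exactly
for i in range(2, n):
    u = rng.random()
    if bounced:
        if u < zeta / eps:
            r[i] = r[i - 2]                  # zig-zag: r_n = -r_{n-1} = r_{n-2}
        else:
            r[i] = rng.standard_normal()     # memory erased
            bounced = False
    elif u < (1 - 2 * eps) / (1 - eps):
        r[i] = rng.standard_normal()         # independent jump
    elif u < (1 - eps - zeta) / (1 - eps):
        r[i] = -r[i - 1]                     # ordinary bounce-back
        bounced = True
    else:
        r[i] = -r[i - 2]                     # delayed bounce (the new term)
        bounced = False

def acf(x, k):
    return np.mean(x[k:] * x[:len(x) - k]) / np.mean(x * x)

print(acf(r, 1), acf(r, 2))     # ≈ -eps and ≈ 0: dependent but uncorrelated
```

The vanishing lag-2 value arises because the diagonal and anti-diagonal contributions of equation (6) cancel exactly, while the histogram of (r_n, r_{n-2}) would still display both diagonals.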

Solution
The high-frequency share price time series can be considered as a single realization, or trajectory, of a jump stochastic process. The trajectory of such a process is a step-wise function consisting of waiting times τ_n prior to sudden jump increments r_n of the price. Hence, a single trajectory can be defined in time and space by the series of subsequent points τ_1, r_1; τ_2, r_2; . . . ; τ_n, r_n, and the process can be described by the conditional probability density ρ(r_n, τ_n | r_{n-1}, τ_{n-1}; r_{n-2}, τ_{n-2}; . . . ; r_1, τ_1). This is the probability density of the jump increment r_n after waiting time τ_n, conditioned by the whole history (τ_1, r_1; . . . ; τ_{n-1}, r_{n-1}). To construct the theoretical model, we make the following simplifying assumptions:
- the process is stationary, ergodic and homogeneous in the space (price) variable; in the case of the financial market, we neglect the influence of the so-called lunch effect, i.e., the non-stationarity resulting from the daily stable pattern of investors' activity;
- all waiting times between successive changes of the process, τ_n, are i.i.d. random variables with a distribution ψ(τ_n) having a finite average (in the case of an infinite average the process is non-ergodic [62,63]);
- each jump increment r_n of the process depends only on the two previous jump increments, r_{n-1} and r_{n-2}, in the form given by equation (8).
The assumptions given above can be summarized in the form of a factorized distribution,

ρ(r_n, τ_n | r_{n-1}, τ_{n-1}; . . . ; r_1, τ_1) = ψ(τ_n) h(r_n | r_{n-1}, r_{n-2}).   (9)

Equation (9) gives the recipe for the infinitely long trajectory but, as the process is homogeneous and stationary, we can arbitrarily choose the origin of the time and space axes. Since we analyze trajectories starting at some arbitrary time t = 0 at the origin, we have to take into account that the first jump of the process after t = 0 depends on the two previous jumps, which we call r_0 and r_{-1}. This can be solved by weighting the trajectories by h(r_0, r_{-1}), where h is given by equation (1) even for n = 0.
Furthermore, we cannot use the same waiting-time distribution for the first jump as for the other jumps, because the jump increment r_0 might have occurred at any time before t = 0. Therefore, we average over all possible time intervals τ between the instant of the jump increment r_0 and the time origin t = 0. Such an averaging was proposed in [42] and leads to the distribution

ψ_1(τ) = (1/⟨τ⟩) ∫_τ^∞ ψ(τ') dτ',   (10)

where the expected (mean) waiting time is ⟨τ⟩ = ∫_0^∞ τ ψ(τ) dτ. The stationary process is thus obtained by using this modified distribution for the first jump, as we consider further in the text.
The denominator ⟨τ⟩ in equation (10) is required for the normalization. The only continuous waiting-time distribution for which ψ_1(τ) = ψ(τ) is the exponential one, characteristic of a Poisson process.
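The averaging (10) is easy to verify on a numerical grid. The sketch below (with an illustrative two-exponential mixture as the non-Poissonian example) confirms that the exponential density is reproduced by the averaging while a non-exponential one is not:

```python
import numpy as np

tau = np.linspace(0.0, 60.0, 6001)
dt = np.diff(tau)

def first_waiting_pdf(psi):
    # eq. (10): psi_1(t) = S(t)/<tau>, with survival S(t) = 1 - int_0^t psi(t')dt'
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (psi[1:] + psi[:-1]) * dt)))
    mean = np.sum(0.5 * (tau[1:] * psi[1:] + tau[:-1] * psi[:-1]) * dt)
    return (1.0 - cdf) / mean

# exponential WTD: psi_1 coincides with psi (the Poisson case)
psi_exp = np.exp(-tau)
err_exp = np.max(np.abs(first_waiting_pdf(psi_exp) - psi_exp))

# illustrative two-exponential mixture: psi_1 clearly differs from psi
psi_mix = (0.5 / 0.5) * np.exp(-tau / 0.5) + (0.5 / 5.0) * np.exp(-tau / 5.0)
err_mix = np.max(np.abs(first_waiting_pdf(psi_mix) - psi_mix))
print(err_exp, err_mix)   # err_exp ~ 0 (discretization only), err_mix clearly nonzero
```

For the mixture, the deviation is largest at τ = 0, where ψ(0) exceeds ψ_1(0) = 1/⟨τ⟩, which is exactly the inspection-paradox effect that motivates equation (10).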
The aim of this section is to derive the conditional probability density, P(X, t), of finding the value X of the process at time t, under the condition that the initial value of the process was taken as the origin. Further in the text we call this probability the soft stochastic propagator, in contrast to the sharp one, which we define below. The derivation of the propagator consists of a few steps, described in the following paragraphs, which extend the corresponding derivation of the canonical CTRW formalism.
A very useful intermediate quantity describing the stochastic process is the sharp, n-step propagator Q_n(X, r_n, r_{n-1}; t), n = 1, 2, . . .. This propagator is defined as the probability density that the process, which initially (at t = 0) had the origin value (X = 0), makes its (n-1)th jump, by r_{n-1}, from X - r_n - r_{n-1} to X - r_n (at any time) and makes its nth jump, by increment r_n, from X - r_n to X exactly at time t. The key expression needed for the exact solution of the process is the recursion relation

Q_n(X, r_n, r_{n-1}; t) = ∫_0^t dτ ψ(τ) ∫ dr_{n-2} h(r_n | r_{n-1}, r_{n-2}) Q_{n-1}(X - r_n, r_{n-1}, r_{n-2}; t - τ).   (11)

Equation (11) relates two successive sharp propagators by a spatio-temporal convolution. This equation is valid only for n ≥ 3 and should be completed by the propagators Q_1 and Q_2, calculated directly from their definitions (cf. Ref. [54]).
We define the sharp summarized propagator Q(X, t) as follows:

Q(X, t) = Σ_{n=1}^∞ ∫ dr_n ∫ dr_{n-1} Q_n(X, r_n, r_{n-1}; t).   (12)

Finally, to obtain the soft stochastic propagator, P(X, t), we use the relation between the soft and sharp propagators, which is much easier to consider in the Fourier-Laplace domain:

P̃(k, s) = Ψ̃_1(s) + Ψ̃(s) Q̃(k, s),   (13)

where Ψ(τ) = 1 - ∫_0^τ ψ(τ') dτ' is the sojourn (survival) probability and Ψ_1(τ) is defined analogously.
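Although the analytical route proceeds through the Fourier-Laplace domain, the soft propagator can always be estimated by direct Monte Carlo simulation of the process defined above; this is not the paper's method, only an independent numerical check. The sketch below assumes, for illustration, exponential waiting times with ⟨τ⟩ = 1 (so that ψ_1 = ψ), a Gaussian h, ε = 0.4 and ζ = ε², and compares the spread of X(t) with and without the two-step memory; the negative lag-1 dependence visibly suppresses diffusion:

```python
import numpy as np

rng = np.random.default_rng(3)
eps, zeta = 0.4, 0.4 ** 2        # illustrative, with zeta = eps**2 as in eq. (17)
T, m = 100.0, 3000               # time horizon and number of trajectories

def x_at_T(with_memory):
    # one CTRW trajectory up to time T, jumps drawn from the conditional pdf (8)
    x, prev2, prev = 0.0, rng.standard_normal(), rng.standard_normal()
    bounced = False              # True iff prev == -prev2 exactly
    t = rng.exponential()
    while t < T:
        if not with_memory:
            r, nb = rng.standard_normal(), False
        elif bounced:
            if rng.random() < zeta / eps:
                r, nb = prev2, True                    # zig-zag continues
            else:
                r, nb = rng.standard_normal(), False   # memory erased
        else:
            u = rng.random()
            if u < (1 - 2 * eps) / (1 - eps):
                r, nb = rng.standard_normal(), False   # independent jump
            elif u < (1 - eps - zeta) / (1 - eps):
                r, nb = -prev, True                    # ordinary bounce
            else:
                r, nb = -prev2, False                  # delayed bounce
        x += r
        prev2, prev, bounced = prev, r, nb
        t += rng.exponential()
    return x

var_mem = np.var([x_at_T(True) for _ in range(m)])
var_ind = np.var([x_at_T(False) for _ in range(m)])
print(var_ind, var_mem)          # memory suppresses diffusion: var_mem < var_ind
```

For the memoryless walk the variance of X(T) is simply T (rate times single-jump variance), while the anticorrelated jumps of the memory model reduce the effective diffusion coefficient.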
To find an explicit form of equation (13), a procedure analogous to that for the one-step memory model [54] was developed, although it was much more tedious (see Appendix B for details). Hence, the Laplace transform of the velocity autocorrelation function (VAF, also known as the velocity autocovariance function) is given by

C̃(s) = Ñ(ψ̃(s), ζ, ε) / D̃(ψ̃(s), ζ, ε),   (15)

where the numerator, Ñ(ψ̃, ζ, ε), and the denominator, D̃(ψ̃, ζ, ε), are lengthy expressions in ψ̃(s), ζ and ε, given in equation (16); here the relation between the stochastic propagator in the Fourier and Laplace domains and the corresponding mean-square displacement was used. As we are interested in a closed form of the VAF in the time domain, we find both expressions in equation (16) too complicated to perform the inverse Laplace transformation of equation (15), even for simple forms of ψ̃(s). To keep our model self-consistent (that is, Eq. (6) being an extension of Eq. (3)), we assume

ζ = ε².   (17)

Our estimation of the parameter ζ, based on the empirical data, gives a value almost equal to ε². Hence, relation (17) simplifies both expressions in equation (16), eliminating the residual fluctuations of the order of ζ - ε². Equation (15) is now simplified into the compact form (18), whose denominator reduces to the three terms 1 + εψ̃(s) + [εψ̃(s)]² (with the numerator structured accordingly); in equation (18), Ō means the complex conjugate of O.
Noticeably, we can obtain the power spectrum of our process directly from equation (18) by using the Wiener-Khinchin theorem [64].
The normalized VAF (marked with the superscript n) is then given, in the time domain, by expression (19), where L^{-1}_t(. . .) denotes the inverse Laplace transform and λ = j/(j - 1) = 1/2 - i√3/6, with j = exp(i2π/3). Apparently, the VAF obtained above uses solely quantities (ψ̃(s) and the parameter ε) analogous to those of the one-step memory model [54].
Moreover, the very regular form of equation (18) and the corresponding result for the model containing one-step backward memory enable us to formulate a conjecture concerning memory over an arbitrary number of steps, equation (20). Herein, the three terms in the denominator of the first equality in equation (18) are treated as the initial terms of a geometric series (and the numerator is treated accordingly). Apparently, for infinitely many steps backward (n → ∞), this equation gives the result (21), very useful for our further considerations. A significant issue is that the evolution of C̃(s) in equations (20) and (21) is governed solely by ψ̃(s). For instance, multifractality could be considered directly using equation (21) if ψ(t) were constructed in the form of properly suited superstatistics [35,36]. However, the analysis of anomalous diffusion requires, herein, a resignation from stationarity. The lack of stationarity which would appear here results from the initial condition and not from how the process itself evolves. Therefore, there are no major obstacles to building a non-stationary CTRW formalism containing memory over infinitely many steps.

Comparison of the models
In Section 2, we discussed selected properties of the one-step memory model and compared them with the well-known properties of empirical data. The observed disagreement inspired us to develop the two- and infinite-step memory models, solved in the previous section, where the latter model is based on our conjecture.

Table 1. Comparison of the two-step memory model with the one-step memory model. The first five values of the event-time autocorrelation function are presented. Apparently, the two-step memory model gives c(2) = 0, as required by the empirical data.

Model             c(0)  c(1)  c(2)  c(3)    c(4)
One-step memory    1     -ε    ε²    -ε³     ε⁴
Two-step memory    1     -ε    0     O(ε³)   O(ε⁴)

In the present section we examine the differences among the one-, two-, and infinite-step memory models by using the autocorrelation function. Let us begin with the analysis of the autocorrelation function in the event time. The dependence between any two jumps of the process within the one-step memory model is given by equation (4), resulting in the autocorrelation expressed by equation (5). However, in the case of the two-step memory model the analogous dependence for k ≥ 1 is somewhat more complicated. Fortunately, we obtained the autocorrelation functions in the event time up to k = 4. These functions are compared in Table 1 with the corresponding results for the one-step memory model calculated from equation (5). It is worth recalling that the empirical lack of correlation for k ≥ 2 is considered a stylized fact in high-frequency financial data. As we assumed during the construction of the model, the current version gives c(2) = 0. Apparently, for k ≥ 3 both models give autocorrelation functions of the same orders, which for empirical values of ε are negligibly small (see also Sect. 6).
A particularly useful way to visualize the role of the two- and infinite-step memory models is to compare the corresponding velocity autocorrelation functions in real time. We perform the inverse Laplace transformations in equations (19) and (21) for the simplest possible exponential waiting-time distribution, in order to highlight the generic difference between the models. We assume

ψ(τ) = (1/⟨τ⟩) exp(-τ/⟨τ⟩),   (22)

where ⟨τ⟩ is the average interevent time. Notably, this is the sole waiting-time distribution obeying the equality ψ_1(t) = ψ(t) for the first step. In the frame of the one-step memory model this WTD leads to the normalized VAF in the form of equation (23) (cf. Eq. (23) in [54]). Analogously, by substituting equation (22) into equation (19), we obtain equation (24) for the two-step memory model. It can easily be proved that for ε ≪ 1 the VAF given by equation (24) reduces to equation (23). This reduction was expected, as within the approximation given by equation (17) the difference between the two models is of the order of ε². Furthermore, by combining the conjecture given by equation (21) with the exponential WTD defined by equation (22), we easily obtain the corresponding result (25) for the infinite-step memory model. The predictions given by equations (23)-(25) are compared in Figure 2.

Fig. 2. Normalized velocity autocorrelation functions versus time t/⟨τ⟩ for the one-, two-, and infinite-step memory models: ε = 0.05 (top curves, shifted by 1), ε = 0.5 (middle curves, shifted by 0.5) and ε = 0.95 (bottom curves).

Comparison with empirical data
A satisfactory comparison of the predicted autocorrelation functions with the empirical ones requires [54]:
- the use of a sufficiently realistic waiting-time distribution ψ̃(s);
- the determination of the value of our basic parameter ε from a separate fit to the corresponding empirical histograms; and
- the use of a sufficiently effective method of VAF estimation for unevenly spaced elements of the time series (as interevent time intervals have random lengths).
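Concerning the last requirement, one standard technique for unevenly spaced series (not necessarily the estimator used in Ref. [54]) is the slotted autocorrelation: products v_i v_j are averaged in bins of the lag t_j - t_i. A minimal sketch, exercised on a synthetic exponentially correlated signal observed at Poisson event times:

```python
import numpy as np

def slotted_acf(t, v, max_lag, bin_width):
    # average v_i * v_j over pairs whose lag t_j - t_i falls into each slot
    nbins = int(round(max_lag / bin_width))
    sums, counts = np.zeros(nbins), np.zeros(nbins)
    for i in range(len(t)):
        j = i
        while j < len(t) and t[j] - t[i] < max_lag:
            b = int((t[j] - t[i]) / bin_width)
            sums[b] += v[i] * v[j]
            counts[b] += 1
            j += 1
    return sums / np.maximum(counts, 1)

# synthetic test signal: correlation time 1, mean observation rate 1
rng = np.random.default_rng(5)
n = 20_000
dt = rng.exponential(size=n)
t = np.cumsum(dt)
v = np.empty(n)
v[0] = rng.standard_normal()
for i in range(1, n):
    a = np.exp(-dt[i])                      # OU-type update between events
    v[i] = a * v[i - 1] + np.sqrt(1 - a * a) * rng.standard_normal()

acf = slotted_acf(t, v, max_lag=5.0, bin_width=0.5)
print(acf[0], acf[-1])   # large at zero lag, near zero at lag ~ 5
```

The zero-lag slot includes the i = j products and is therefore close to the signal variance, while slots at lags well beyond the correlation time fluctuate around zero.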
A useful form of the waiting-time distribution is a superposition (weighted sum) of two exponential distributions,

ψ(τ) = (w/τ₁) exp(-τ/τ₁) + ((1 - w)/τ₂) exp(-τ/τ₂),   (26)

where 0 ≤ w ≤ 1 is the weight, while τ₁ and τ₂ are the corresponding (partial) relaxation times. Apparently, this waiting-time pdf has a sufficiently simple (for the analytical calculations) closed form after the Laplace transformation. Such a form can be easily and satisfactorily fitted to the empirical histogram of waiting times (cf. Fig. 4 in Ref. [54]). Besides, this WTD makes the inverse Laplace transformation present in equation (19) analytically solvable. Finally, we obtain the VAF in the useful closed form of equations (27) and (28). Apparently, this VAF is nontrivial, as it contains complex prefactors and exponents, making its interpretation more complicated.
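Fitting the mixture (26) to empirical waiting times can be done, e.g., by maximum likelihood; a compact route is the EM algorithm for a two-exponential mixture. The sketch below is an illustration on synthetic data with assumed parameter values (the paper itself fits histograms, cf. Fig. 4 in Ref. [54]):

```python
import numpy as np

rng = np.random.default_rng(7)
# synthetic waiting times drawn from eq. (26) with illustrative parameters
w0, t1_0, t2_0 = 0.3, 0.5, 4.0
n = 100_000
fast = rng.random(n) < w0
t = np.where(fast, rng.exponential(t1_0, n), rng.exponential(t2_0, n))

# EM iterations for the two-exponential mixture (w, tau1, tau2)
w, t1, t2 = 0.5, 1.0, 5.0
for _ in range(300):
    p1 = w / t1 * np.exp(-t / t1)
    p2 = (1 - w) / t2 * np.exp(-t / t2)
    g = p1 / (p1 + p2)                   # responsibility of the fast component
    w = g.mean()
    t1 = np.sum(g * t) / np.sum(g)
    t2 = np.sum((1 - g) * t) / np.sum(1 - g)
print(w, t1, t2)                         # estimates of w0, t1_0, t2_0
```

With well-separated relaxation times, as here, the iteration converges to estimates close to the generating parameters; the fitted triple (w, τ₁, τ₂) can then be inserted into ψ̃(s) and hence into the VAF.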
Notably, all the required parameters, τ₁, τ₂, w and ε, are estimated from separate empirical data, without exploiting the empirical VAF. This is a basic result for further considerations. That is, to find the parameter ε only the set of empirical jump increments is needed, while for the estimation of the remaining parameters only the set of empirical waiting times is required. We obtained a very promising agreement of our theoretical VAF with its empirical counterpart, and this is not a fit, as no free parameters were left to make it one.
The comparison of our theoretical predictions with the corresponding empirical VAFs is shown in Figure 3. The improved agreement provided by the two- and infinite-step memory models in comparison with the one-step memory model is clearly visible. However, small but distinct systematic deviations persist even for the infinite-step memory model (especially at the initial time).

Fig. 3. Comparison of normalized autocorrelation functions of the price velocity for the PGNIG company (from the fuel and energy sector) quoted on the Warsaw Stock Exchange: (i) the empirical one (small black squares); (ii) the theoretical VAF (dotted curve) derived in reference [54] for the one-step memory model; (iii) the theoretical VAF (dashed curve) for the two-step memory model; and (iv) the theoretical VAF (solid curve) for the infinite-step memory model. All models use the waiting-time distribution in the form of the weighted sum of two exponential distributions given by equation (26). All parameters were fitted to separate data records, so the theoretical curves have no free parameters to fit to the empirical VAF (notably, the required method of determination of the empirical VAF was thoroughly considered in Ref. [54]).

Concluding remarks
Hitherto, only one-step backward memory has been considered analytically. In the present paper, we developed a version of the CTRW formalism containing memory over two steps backward, that is, a dependence between three consecutive jumps of the process. Herein, this extended dependence was studied independently of whether second-order correlations exist in the system or not. This significantly extends the capability of the CTRW formalism and suggests that several already existing results could be improved if the one-step backward memory model used there were replaced by the two-step backward memory model.
The two results that can be considered the main achievements of our work are as follows.
- The derivation of an analytical closed formula for the propagator containing the two-step memory (cf. Eq. (13)). This is a significant extension of the corresponding formula derived in our previous paper (cf. Eq. (17) in Ref. [54]), obtained without increasing the number of free parameters and functions.
- The combination of the obtained propagator with a geometric series, which enabled the derivation of the velocity autocorrelation function containing infinitely many-step memory while keeping the model analytically solvable. This solution is even simpler than those for models with finite-step memory.
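The geometric-series mechanism behind the infinitely many-step result can be illustrated schematically. Assuming, purely for illustration, that the memory contribution of a jump n steps back decays as eps**n (the value of eps below is made up), the total contribution sums to the closed form eps / (1 - eps), to which the partial sums converge rapidly:

```python
# Schematic check: memory contributions assumed to decay geometrically
# as eps**n sum to the closed form eps / (1 - eps), which is the kind
# of resummation that keeps the infinitely many-step model solvable.
eps = -0.6  # illustrative per-step weight; a bid-ask bounce gives eps < 0

closed_form = eps / (1.0 - eps)

partial = 0.0
for n in range(1, 60):
    partial += eps ** n

print(partial, closed_form)
```

This is why the infinite-step solution can come out simpler than the finite-step ones: the open-ended sum collapses into a single rational expression.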
By using propagator (13) we obtained the velocity autocorrelations of the process. These autocorrelations describe their empirical counterparts better than the model with one-step memory does [54].
The strategy presented in this work brings a new approach to random walks with memory, showing that even in the case of a finite diffusion coefficient the dependence over infinitely many steps plays an important role in the CTRW formalism, making the VAF much more realistic. However, the remaining distinct deficit of correlations suggests that, perhaps, the dependence between interevent times should somehow be coupled with the multi-step memory; this remains a challenge.
Author contribution statement
T.G. performed the numerical simulations and data analysis. Both authors wrote the manuscript and did the analytical calculations.
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Appendix A: Proof
Below, we prove that c(2) is always greater than zero, provided that only two successive jumps are dependent.

Appendix B: Detailed derivation
In the present Appendix, we present the detailed procedure of obtaining the propagator from the two-step memory CTRW model. Let us begin with the generalization of the pdf of three consecutive jumps, h(r_3, r_2, r_1), given by equation (7). Our two-step memory model is consistent with the one-step memory model presented in [54] only at the level of the two-variable pdfs h(r_3, r_2) and h(r_2, r_1). The dependence h_2(r_3, r_1), present in the two-step memory model given by (6), is irreducible to the form obtained within the one-step memory model (3). To obtain such a reducibility, we propose to add another formal parameter θ to the form of h(r_3, r_2, r_1). As a result, for θ = 0 equation (B.1) reduces to the form of (7), and for θ = ζ = 2 we recover the one-step memory model.
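The irreducibility of the three-jump pdf to a one-step model can be seen in a small simulation sketch. The transition probabilities below are invented for illustration only (the paper's h(r_3 | r_2, r_1) is parametrized by A, B, θ, ζ fitted to data): we build a jump-sign chain in which the conditional law of r_3 depends not only on r_2 but also on r_1, and then recover that dependence from sample frequencies.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)

# Hypothetical two-step memory: probability that the next jump repeats
# r2 depends also on r1, two steps back.  All values are made up.
p_repeat = {(+1, +1): 0.30,   # P(r3 = r2 | r2 = +1, r1 = +1)
            (+1, -1): 0.55,
            (-1, +1): 0.55,
            (-1, -1): 0.30}

n = 300_000
r = [+1, -1]
for _ in range(n):
    key = (r[-1], r[-2])  # (r2, r1)
    r.append(r[-1] if rng.random() < p_repeat[key] else -r[-1])

# Estimate P(r3 = r2 | r2, r1) from the sample
counts, hits = Counter(), Counter()
for r1, r2, r3 in zip(r[:-2], r[1:-1], r[2:]):
    counts[(r2, r1)] += 1
    hits[(r2, r1)] += (r3 == r2)

est = {(int(a), int(b)): hits[(a, b)] / counts[(a, b)] for (a, b) in counts}
for key in sorted(est):
    print(key, round(est[key], 3))
```

The estimated conditionals differ with r_1 at fixed r_2, which is exactly the dependence that no one-step (order-1) model can reproduce.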
For simpler notation, let us introduce the following auxiliary variables. The conditional pdf h(r_3 | r_2, r_1) can now be expressed in the form

h(r_3 | r_2, r_1) = (1 − δ_{r_2,−r_1}) A h(r_3) + B δ(r_3 + r_1) + … (B.7)

The key relation needed for the exact solution of the propagator is given by equation (11), which we recall below:

Q_n(X, r_n, r_{n−1}; t) = … × Q_{n−1}(X − r_n, r_{n−1}, r_{n−2}; t − t′). (B.8)

For simplicity, let us introduce the notation Q̃_n for the Fourier-Laplace transform of Q_n, where we omit the explicit dependence on the variables k and s. The key relation (B.8), in the Fourier-Laplace space and with the notation defined above, takes the form

Q̃_n(r_n, r_{n−1}) = … × Q̃_{n−1}(r_{n−1}, r_{n−2}). (B.10)

As previously, in order to obtain a more intuitive notation, we change the variables (r_n, r_{n−1}, r_{n−2}) to (r_3, r_2, r_1), which gives equation (B.11). The method which allows us to solve equation (B.11), for the given form of h(r_3 | r_2, r_1) in (B.7), is to separate the sharp propagator Q̃_n(r_3, r_2) into two parts, i.e. the regular one and the singular one, the former given by

Q̃^R_n(r_2, r_1) = (1 − δ_{r_2,−r_1}) Q̃_n(r_2, r_1), (B.12)

and the latter, Q̃^S_n, defined by (B.13). With the quantities above, we can restore Q̃_n(r_3, r_2) using the following relation:

Q̃_n(r_2, r_1) = Q̃^R_n(r_2, r_1) + δ(r_2 + r_1) Q̃^S_n(r_2). (B.14)

Next, we transform equation (B.11) by substituting h with its exact form (B.7) and using the definitions (B.12) and (B.13); the result is equation (B.15). The RHS of (B.15) contains only the regular and the singular propagators, while on the LHS we still have the full propagator Q̃_n(r_3, r_2). Our aim is to obtain a recurrence relation between the regular and the singular propagators. Let us multiply both sides of equation (B.15) by (1 − δ_{r_3,−r_2}); as a result, the term containing δ(r_3 + r_2) disappears. Afterwards, we multiply both sides of equation (B.15) by δ_{r_3,−r_2}. In effect, we obtain a system of two recurrence equations for the regular and the singular sharp propagators.
In order to solve these equations, let us introduce the additional functions R_n and S_n. Apparently, the sharp propagator Q̃_n(k, s) can be expressed by the functions R_n and S_n with all arguments equal to zero. Hence, our aim is to obtain R_n and S_n, with all arguments equal to zero, from the recurrence relations (B.24). Let us begin with the analysis of equation (B.24) for four cases, using the properties (B.23). At this point, it is reasonable to redefine the factors A, B, C, D, E so as to remove ψ̃(s) from the relation, which leads to expressions in which the function R occurs with both arguments equal to zero. We still have the function S with the argument equal to one; we can express it differently using relation (B.24) for b = 0. Hence, by using ĥ = ĥ(k) = H(1), we find the above expressions to be equivalent to

0 = S_1 + S_2 + (E^2 + CD − 1) S + C(A + B + E) R,

together with a second equation of the same type. If we obtain the explicit forms of R_1, R_2, R_3, R_4, S_1, S_2, S_3, we will be able to reduce the problem of finding the sharp propagator to the problem of solving the system of two equations (B.37) with the two variables R and S.
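Once the recurrences collapse to two linear equations in R and S, the last step is routine. A generic sketch follows; the numeric coefficients are placeholders for the combinations of A, B, C, D, E and R_1, ..., R_4, S_1, ..., S_3 appearing in (B.37), not the paper's actual values:

```python
# Generic closed-form solution of a 2x2 linear system by Cramer's rule,
# standing in for the final step of solving (B.37) for R and S.
# Coefficients here are placeholders, not the paper's expressions.
def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve  a11*R + a12*S = b1,  a21*R + a22*S = b2."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("degenerate system")
    R = (b1 * a22 - b2 * a12) / det
    S = (a11 * b2 - a21 * b1) / det
    return R, S

R, S = solve_2x2(2.0, 1.0, 5.0, 1.0, 3.0, 10.0)
print(R, S)  # solves 2R + S = 5 and R + 3S = 10
```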
In order to obtain the explicit forms of R_1, R_2, R_3, R_4, S_1, S_2, S_3, we start with the explicit forms of the first four propagators Q, calculated directly from the definition and presented in the Fourier-Laplace space (cf. equation (B.38)). Notably, the RHS of equation (B.38) depends on s only through ψ̃ = ψ̃(s) and ψ̃_1 = ψ̃_1(s), while it depends on k only through ĥ = ĥ(k). We thus obtained the explicit form of the sharp propagator of the process containing the two-step memory. The soft propagator can simply be obtained using (13).

Appendix C: Variance and velocity autocorrelation function
The results obtained in the present Appendix are based on the explicit form of the process propagator derived in Appendix B for the two-step memory.
This propagator was derived in Appendix B from the recursive equation (B.8), according to the conditional dependence of three consecutive changes in the price given by equation (B.7). An exact closed form of the propagator Q̃(k, s) was given there by equation (B.38). Using equation (13), the soft propagator P̃(k, s) was derived. Next, the Laplace transform of the time-dependent process variance was derived, in which ψ̃ = ψ̃(s), while the remaining quantities are parameters independent of s. Finally, we obtained the Laplace transform of the velocity autocorrelation function of the process, in which the explicit dependence on the parameters ζ and θ is clearly seen.
We remind the reader that for ζ = θ = 2 the model with the two-step memory reproduces the results of that with the one-step memory; the formula then reduces to the result derived already in [54]. For the autocorrelation function of the process including the two-step memory, one just sets the parameter θ = 0, which leads to equation (15).