Joint invariance principles for random walks with positively and negatively reinforced steps

Given a random walk $(S_n)$ with typical step distributed according to some fixed law and a fixed parameter $p \in (0,1)$, the associated positively step-reinforced random walk is a discrete-time process which performs at each step, with probability $1-p$, the same step as $(S_n)$, while with probability $p$ it repeats one of the steps it performed previously, chosen uniformly at random. The negatively step-reinforced random walk follows the same dynamics, but when a step is repeated its sign is also changed. In this work, we shall prove functional limit theorems for the triplet consisting of a random walk coupled with its positively and negatively reinforced versions, when $p<1/2$ and the typical step is centred. As our work will show, the limiting process is Gaussian and admits a simple representation in terms of stochastic integrals. Our method exploits a martingale approach in conjunction with the martingale functional CLT.


Introduction
In short, the purpose of this work is to establish invariance principles for random walks with step reinforcement, a particular class of random walks with memory that has been of increasing interest in recent years. Historically, the so-called elephant random walk (ERW) has been an important and fundamental example of a step-reinforced random walk that was originally introduced in the physics literature by Schütz and Trimper [24] more than 15 years ago. We shall first recall the setting of the ERW in order to motivate the two types of reinforcement that we will work with.
The ERW is a one-dimensional discrete-time nearest-neighbour random walk with infinite memory, in allusion to the traditional saying that an elephant never forgets. It can be depicted as follows: fix some $q \in (0,1)$, commonly referred to as the memory parameter, and suppose that an elephant makes an initial step in $\{-1,1\}$ at time $1$. Then, at each time $n \ge 2$, the elephant selects uniformly at random a step from its past; with probability $q$ the elephant repeats the remembered step, whereas with complementary probability $1-q$ it makes a step in the opposite direction. In the case $q = 1/2$, the elephant merely follows the path of a simple symmetric random walk. Notably, the ERW is a time-inhomogeneous Markov chain (although some works in the literature improperly assert its non-Markovian character). The ERW has generated a lot of interest in recent years; a non-exhaustive list of references (with further references therein) is [3], [4], [5], [6], [13], [14], [15], [19], [22], [23], see also [2], [18] for variations. A striking feature pointed out in those works is that the long-time behaviour of the ERW exhibits a phase transition at a critical memory parameter. Functional limit theorems for the ERW were already proved by Baur and Bertoin in [3] by means of limit theorems for random urns. Indeed, the key observation is that the dynamics of the ERW can be expressed in terms of Pólya-type urn experiments and fall in the framework of the work of Janson [21]. For a strong invariance principle for the ERW, we refer to Coletti, Gava and Schütz [14].
The framework of the ERW is however limited to the case of Rademacher-distributed steps, and it is natural to look for generalisations of its dynamics that allow the typical step to have an arbitrary distribution on $\mathbb{R}$. In this work, we aim to study the more general framework of step-reinforced random walks. We shall discuss two such generalisations, called positive and negative step-reinforced random walks; the former generalises the ERW when $q \in (1/2,1)$, while the latter covers the range $q \in [0,1/2]$, in both cases when the typical step is Rademacher distributed. We start by introducing the former. For the rest of the work, $X$ stands for a random variable that we assume belongs to $L^2(\mathbb{P})$; we denote by $\sigma^2$ its variance and by $\mu$ its law. Moreover, unless specified otherwise, $(S_n)$ will always denote a random walk with typical step distributed as $\mu$.
The noise reinforced random walk: A (positive) step-reinforced random walk, or noise reinforced random walk, is a generalisation of the ERW in which the typical step of the walk is allowed to have an arbitrary distribution on $\mathbb{R}$, rather than just Rademacher. The impact of the reinforcement is still described in terms of a fixed parameter $p \in (0,1)$, which we also refer to as the memory parameter or the reinforcement parameter. We will work with different values of $p$, but for readability purposes $p$ does not explicitly appear in the notation or terminology used in this work.
Vaguely speaking, the dynamics are as follows: at each discrete time, with probability $p$ a step-reinforced random walk repeats one of its preceding steps chosen uniformly at random, while with complementary probability $1-p$ it makes an independent step with a fixed but arbitrary distribution. More precisely, given an underlying probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a sequence $X_1, X_2, \dots$ of i.i.d. copies of the random variable $X$ with law $\mu$, we define $\hat X_1, \hat X_2, \dots$ recursively as follows. First, let $(\varepsilon_i : i \ge 2)$ be an independent sequence of Bernoulli random variables with parameter $p \in (0,1)$ and consider also an independent sequence $(U[i] : i \ge 1)$, where each $U[i]$ is uniformly distributed on $\{1, \dots, i\}$. We set first $\hat X_1 = X_1$, and next for $i \ge 2$ we let
$$\hat X_i = \begin{cases} \hat X_{U[i-1]} & \text{if } \varepsilon_i = 1,\\ X_i & \text{if } \varepsilon_i = 0.\end{cases}$$
Finally, the sequence of partial sums $\hat S_n := \hat X_1 + \dots + \hat X_n$, $n \in \mathbb{N}$, is referred to as a positive step-reinforced random walk. From the algorithm, we have for any bounded measurable $f : \mathbb{R} \to \mathbb{R}_+$,
$$\mathbb{E}\big(f(\hat X_{n+1})\big) = (1-p)\,\mathbb{E}\big(f(X_{n+1})\big) + \frac{p}{n}\sum_{j=1}^n \mathbb{E}\big(f(\hat X_j)\big),$$
and it follows by induction that each $\hat X_n$ has law $\mu$. Beware however that the sequence $(\hat X_i)$ is not stationary. Notice that if $(\hat S_n)$ is not centred, it is often fruitful to reduce our analysis to the centred case by considering $(\hat S_n - n\,\mathbb{E}(X))$, which is a centred noise reinforced random walk with typical step distributed as $X - \mathbb{E}(X)$. Observe that in the degenerate case $p = 1$ the dynamics of the positive step-reinforced random walk become essentially deterministic: when $p = 1$ we have $\hat S_n = n X_1$ for all $n \ge 1$, so the only remaining randomness for this process stems from the random variable $X_1$. In this setting, when $\mu$ is the Rademacher distribution, Kürsten [23] (see also [17]) pointed out that $\hat S = (\hat S_n)_{n \ge 1}$ is a version of the elephant random walk with memory parameter $q = (p+1)/2 \in (1/2,1)$ in the present notation.
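The recursive construction above is straightforward to simulate. The following is a minimal sketch (our own illustration, not code from the paper; function and variable names are ours) of the positive step-reinforced dynamics:

```python
import random

def positive_reinforced_walk(steps, p, rng=random):
    """Partial sums of the positive step-reinforced walk built from the
    i.i.d. sample `steps` = (X_1, ..., X_n) and memory parameter p."""
    reinforced = [steps[0]]  # X-hat_1 = X_1
    for i in range(1, len(steps)):
        if rng.random() < p:
            # repetition event: repeat a past reinforced step, chosen uniformly
            reinforced.append(reinforced[rng.randrange(i)])
        else:
            # innovation event: use the fresh i.i.d. step X_i
            reinforced.append(steps[i])
    sums, total = [], 0.0
    for x in reinforced:
        total += x
        sums.append(total)
    return sums  # (S-hat_1, ..., S-hat_n)
```

The two degenerate cases mentioned in the text serve as a check of the dynamics: $p = 0$ recovers the plain random walk, while $p = 1$ gives $\hat S_n = nX_1$.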
The remaining range of the memory parameter can be obtained by a simple modification that we will address when we introduce random walks with negatively reinforced steps. When $\mu$ has a symmetric stable distribution, $\hat S$ is the so-called shark random swim, which has been studied in depth by Businger [12]. More general versions, where the distribution $\mu$ is infinitely divisible, have been considered by Bertoin in [9], and we will briefly comment on this setting in a moment. Finally, when the sequence of Bernoulli random variables $(\varepsilon_n)$ is replaced by a deterministic sequence $(r_n)$ with $r_n \in \{0,1\}$, the scaling exponents of the corresponding step-reinforced random walks have been studied by Bertoin in [10].
In stark contrast to the ERW, the literature available on general step-reinforced random walks remains rather sparse. Quite recently, Bertoin [11] established an invariance principle for the step-reinforced random walk in the diffusive regime $p \in (0,1/2)$. Bertoin's work concerned a rather simple real-valued centred Gaussian process $\hat B = (\hat B(t))_{t \ge 0}$ with covariance function given by
$$\mathbb{E}\big(\hat B(t)\hat B(s)\big) = \frac{t^{p}\, s^{1-p}}{1-2p} \qquad \text{for } 0 \le s \le t \text{ and } p \in (0,1/2). \tag{1.1}$$
This process has notably appeared as the scaling limit for diffusive regimes of the ERW and other Pólya-urn-related processes, see [3,13], [6] for higher-dimensional generalisations, and [1]. In [11] the process displayed in (1.1) is referred to as a noise reinforced Brownian motion and belongs to a larger class of reinforced processes recently introduced by Bertoin in [9], called noise reinforced Lévy processes. The noise reinforced Brownian motion plays, in the framework of noise reinforced Lévy processes, the same role as the standard Brownian motion in the context of Lévy processes. Moreover, just as the standard Brownian motion $B$ corresponds to the integral of a white noise, $\hat B$ can be thought of as the integral of a reinforced version of the white noise, hence the name. More precisely, from (1.1) it readily follows that the law of $\hat B$ admits the integral representation
$$\hat B(t) = t^{p} \int_0^t s^{-p}\, \mathrm{d}\beta^r_s, \qquad t \ge 0,$$
where $\beta^r = (\beta^r_s)_{s \ge 0}$ is a standard Brownian motion; equivalently, $\hat B = (\hat B(t))_{t \ge 0}$ has the same law as
$$\left(\frac{t^{p}}{\sqrt{1-2p}}\,\beta^r_{t^{1-2p}}\right)_{t \ge 0}.$$
Some further properties of the noise reinforced Brownian motion can be found in [11], where the following functional limit theorem [11, Theorem 3.3] has been established: let $p \in (0,1/2)$ and suppose that $X \in L^2(\mathbb{P})$. Then we have the weak convergence of the scaled sequence, in the sense of Skorokhod as $n$ tends to infinity,
$$\left(\frac{\hat S_{\lfloor nt \rfloor}}{\sigma\sqrt{n}}\right)_{t \ge 0} \Longrightarrow (\hat B_t)_{t \ge 0},$$
where $(\hat B_t)_{t \ge 0}$ is a noise reinforced Brownian motion. Our work generalises this result, but our approach differs from [11]: we work with a discrete martingale introduced by Bercu [4] for the ERW and later generalised in [7] for step-reinforced random walks. The martingale we work with is a discrete-time stochastic process of the form $\hat a_n \hat S_n$, where $(\hat a_n)_{n \ge 0}$ is a suitably defined sequence of positive real numbers of order $n^{-p}$. As we shall see, the investigation of this martingale, and in particular of its quadratic variation process, in conjunction with the functional martingale CLT [25], yields an alternative proof of Theorem 3.3 in [11].
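As an illustrative sanity check (ours, not from [11]; function names are our own), one can compare the covariance (1.1) with the Itô isometry applied to the integral representation, evaluating the deterministic integral $\int_0^{s\wedge t} r^{-2p}\,\mathrm{d}r$ by a simple midpoint rule:

```python
def cov_formula(s, t, p):
    """Covariance (1.1): E[Bhat(t) Bhat(s)] = t^p s^(1-p) / (1 - 2p), s <= t."""
    s, t = min(s, t), max(s, t)
    return t**p * s**(1 - p) / (1 - 2 * p)

def cov_integral(s, t, p, steps=200000):
    """Ito isometry for Bhat(t) = t^p int_0^t r^{-p} dbeta_r gives
    E[Bhat(s) Bhat(t)] = (s t)^p * int_0^{min(s,t)} r^{-2p} dr
    (the integral is approximated by a midpoint rule)."""
    u = min(s, t)
    h = u / steps
    integral = sum(((k + 0.5) * h) ** (-2 * p) * h for k in range(steps))
    return (s * t) ** p * integral

print(cov_formula(0.7, 1.5, 0.3), cov_integral(0.7, 1.5, 0.3))
```

The two values agree up to the quadrature error, which is concentrated near the integrable singularity at the origin.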
The counterbalanced random walk: Next we turn our attention to the second process of interest, called the counterbalanced random walk or negative step-reinforced random walk, introduced recently by Bertoin in [8]. Beware that $p$ in our work always corresponds to the probability of a repetition event, while in [8] this happens with probability $1-p$. Similarly, we consider a sequence of i.i.d. random variables $(X_n)_{n \in \mathbb{N}}$ with distribution $\mu$ on $\mathbb{R}$; at each time step, the step performed by the walker is, with probability $1-p \in (0,1)$, a step $X_n$ independent from the previous ones, while with complementary probability $p$ the new step is one of the previously performed steps, chosen uniformly at random, with its sign changed. This last action will be referred to as a counterbalance of the uniformly chosen step. In particular, when $\mu$ is the Rademacher distribution, we obtain an ERW with memory parameter $(1-p)/2 \in [0,1/2]$.
Formally, recall that $X_1, X_2, \dots$ is a sequence of i.i.d. copies of $X$ and $(\varepsilon_i : i \ge 2)$ is an independent sequence of Bernoulli random variables with parameter $p \in (0,1)$. We define the sequence of increments $\check X_1, \check X_2, \dots$ recursively as follows (beware of the difference of notation between $\hat X$ and $\check X$): we set first $\check X_1 = X_1$, and next for $i \ge 2$ we let
$$\check X_i = \begin{cases} -\check X_{U[i-1]} & \text{if } \varepsilon_i = 1,\\ X_i & \text{if } \varepsilon_i = 0,\end{cases}$$
where $U[i-1]$ denotes an independent uniform random variable on $\{1, \dots, i-1\}$. Finally, the sequence of partial sums $\check S_n := \check X_1 + \dots + \check X_n$, $n \in \mathbb{N}$, is referred to as a counterbalanced random walk (or random walk with negatively reinforced steps). Notice also that, in contrast with the positive step-reinforced random walk, when $p = 1$ we still obtain a genuinely stochastic process, consisting of consecutive counterbalances of the initial step $X_1$, while for $p = 0$ we just get the dynamics of an ordinary random walk. For the positive step-reinforced random walk we already pointed out that the steps are identically distributed and hence centred as soon as $X$ is centred. On the other hand, in the negatively step-reinforced case the recursive equation on page 3 of [8],
$$\mathbb{E}\big(\check S_{n+1}\big) = (1-p)m + \Big(1 - \frac{p}{n}\Big)\mathbb{E}\big(\check S_n\big), \qquad n \ge 1,$$
with initial condition $\mathbb{E}(\check S_1) = \mathbb{E}(X_1) = m$, yields that the process $(\check S_n)$ is also centred if $X$ is centred. Observe however that, in stark contrast to the positive step-reinforced random walk, we cannot assume that the typical step is centred without loss of generality: since $n \mapsto \mathbb{E}(\check X_n)$ is no longer constant as soon as $m \ne 0$, due to the random swaps of sign in the negative reinforcement algorithm, the centred process $(\check S_n - \mathbb{E}(\check S_n))$ is no longer a counterbalanced random walk. Turning our attention to its asymptotic behaviour, Proposition 1.1 in [8] shows that the behaviour of the counterbalanced random walk $\check S_n$ is ballistic. More precisely, denoting by $m = \mathbb{E}(X)$ the mean of the typical step $X$, for all $p \in [0,1]$ the process $(\check S_n)$ satisfies a law of large numbers:
$$\lim_{n \to \infty} \frac{\check S_n}{n} = \frac{(1-p)m}{1+p} \quad \text{in probability.}$$
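In the same illustrative spirit (again a sketch with our own naming, not code from the paper), the counterbalanced dynamics differ from the positive ones only in the sign flip on repetition events:

```python
import random

def counterbalanced_walk(steps, p, rng=random):
    """Partial sums of the counterbalanced (negatively step-reinforced)
    walk built from the i.i.d. sample `steps` and parameter p."""
    increments = [steps[0]]  # X-check_1 = X_1
    for i in range(1, len(steps)):
        if rng.random() < p:
            # counterbalance: a uniformly chosen past step with its sign flipped
            increments.append(-increments[rng.randrange(i)])
        else:
            increments.append(steps[i])
    sums, total = [], 0.0
    for x in increments:
        total += x
        sums.append(total)
    return sums
```

For $p = 0$ this reduces to the plain random walk, while for Rademacher steps every increment remains in $\{-1, 1\}$ for any $p$; averaging $\check S_n / n$ over many runs approximates the ballistic limit $(1-p)m/(1+p)$.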
Moreover, Theorem 1.2 in [8] shows that if we also assume that the second moment $m_2 = \mathbb{E}(X^2)$ is finite, then the fluctuations are Gaussian for all choices $p \in [0,1)$. In particular, when $X$ is centred, as will be the case here, we simply get
$$\frac{\check S_n}{\sigma\sqrt{n}} \Longrightarrow \mathcal{N}\Big(0, \frac{1}{1+2p}\Big) \qquad \text{as } n \to \infty.$$
On the other hand, when $p = 1$, which corresponds to the purely counterbalanced case, and under the additional assumption that $X$ follows the Rademacher distribution, then
$$\frac{\check S_n}{\sqrt{n}} \Longrightarrow \mathcal{N}(0, 1/3).$$
The proofs of these results rely on remarkable connections with random recursive trees, and even though these will not be needed in the present work, we encourage the interested reader to consult [8] for more details. In this article, we will establish a functional version of the asymptotic normality mentioned above under the additional assumption that $m = 0$, i.e. the typical step is centred. We recall that this assumption cannot be made without loss of generality.
In the same spirit as in the noise-reinforced setting, we will call a noise counterbalanced Brownian motion of parameter $p \in [0,1)$ a Gaussian process $\check B$ with covariance given by
$$\mathbb{E}\big(\check B(t)\check B(s)\big) = \frac{1}{2p+1}\, s^{1+p}\, t^{-p} \qquad \text{for } 0 \le s \le t, \tag{1.3}$$
and it follows that the law of $\check B$ admits the integral representation
$$\check B(t) = t^{-p} \int_0^t s^{p}\, \mathrm{d}\beta^c_s, \qquad t \ge 0,$$
in terms of a standard Brownian motion $\beta^c$.

The invariance principles: Before stating the functional versions of the results we just mentioned, notice that given a sample of i.i.d. random variables $(X_n)$ with law $\mu$, and an additional independent collection $(\varepsilon_i)$, $(U[i])$ of Bernoulli random variables and uniform random variables respectively, as before, we can construct from the same sample, simultaneously with the associated random walk $(S_n)$, the processes $(\hat S_n)$ and $(\check S_n)$, which we refer to respectively as the positive step-reinforced version and the negative step-reinforced version of $(S_n)$. It is then natural to compare the dynamics of the triplet $(S_n, \hat S_n, \check S_n)$, instead of working individually with $(\hat S_n)$ and $(\check S_n)$. When considering such a triplet, it will always be implicitly assumed that $(\hat S_n)$, $(\check S_n)$ have been constructed in this particular way from $(S_n)$; in particular, the same sequences of uniform and Bernoulli random variables are used to define both reinforced versions. Now we have all the ingredients to state our first main result:

Theorem 1.1. Fix $p \in [0,1/2)$ and consider the triplet $(S_n, \hat S_n, \check S_n)$ consisting of the random walk $(S_n)$ together with its reinforced version and its counterbalanced version of parameter $p$. Assume further that $X$ is centred. Then the following weak convergence holds in the sense of Skorokhod as $n$ tends to infinity:
$$\left(\frac{S_{\lfloor nt \rfloor}}{\sigma\sqrt{n}},\ \frac{\hat S_{\lfloor nt \rfloor}}{\sigma\sqrt{n}},\ \frac{\check S_{\lfloor nt \rfloor}}{\sigma\sqrt{n}}\right)_{t \ge 0} \Longrightarrow \big(B_t, \hat B_t, \check B_t\big)_{t \ge 0}, \tag{1.5}$$
where $B$, $\hat B$, $\check B$ denote respectively a standard BM, a noise reinforced BM and a noise counterbalanced BM, with cross-covariances
$$\mathbb{E}(B_s \check B_t) = t^{-p}(t \wedge s)^{p+1}\,\frac{1-p}{1+p}, \qquad \mathbb{E}(B_s \hat B_t) = t^{p}(t \wedge s)^{1-p}, \qquad \mathbb{E}(\hat B_t \check B_s) = t^{p} s^{-p}(t \wedge s)\,\frac{1-p}{1+p}.$$
Notice that in the case $p = 0$, i.e. when no reinforcement events occur, this is just Donsker's invariance principle, since $(\check S_n)$, $(\hat S_n)$ are then just the random walk $(S_n)$ and $\hat B$, $\check B$ are just $B$; hence, from now on we assume $p > 0$. The process in the limit admits the following simple representation in terms of stochastic integrals:
$$\hat B_t = t^{p}\int_0^t s^{-p}\,\mathrm{d}\beta^r_s, \qquad \check B_t = t^{-p}\int_0^t s^{p}\,\mathrm{d}\beta^c_s,$$
where $B = (B_t)_{t \ge 0}$, $\beta^r = (\beta^r_t)_{t \ge 0}$, $\beta^c = (\beta^c_t)_{t \ge 0}$ denote three standard Brownian motions with covariance structure
$$\mathbb{E}(B_s \beta^r_t) = (1-p)(t \wedge s), \qquad \mathbb{E}(B_s \beta^c_t) = (1-p)(t \wedge s), \qquad \mathbb{E}(\beta^r_s \beta^c_t) = \frac{1-p}{1+p}(t \wedge s).$$
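As a consistency check (not part of the original argument), the cross-covariance of $B$ and $\hat B$ stated in Theorem 1.1 can be recovered from the integral representation together with the correlation $\mathbb{E}(B_s\beta^r_t) = (1-p)(t\wedge s)$:

```latex
% Using \hat B_t = t^p \int_0^t r^{-p}\,\mathrm{d}\beta^r_r and
% \mathbb{E}(\mathrm{d}B_r\,\mathrm{d}\beta^r_r) = (1-p)\,\mathrm{d}r:
\begin{aligned}
\mathbb{E}\big(B_s \hat B_t\big)
  &= t^p \int_0^{s \wedge t} r^{-p}\,(1-p)\,\mathrm{d}r \\
  &= t^p\,(1-p)\,\frac{(s \wedge t)^{1-p}}{1-p}
   = t^p\,(s \wedge t)^{1-p},
\end{aligned}
```

in agreement with the covariance $\mathbb{E}(B_s\hat B_t) = t^p(t\wedge s)^{1-p}$ displayed in Theorem 1.1; the factor $1-p$ cancels against the integral of $r^{-p}$.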
The restriction $p \in (0,1/2)$ comes from the fact that, as we will see, only for such parameters does the functional limit theorem for the noise reinforced random walk hold with this scaling, while the centredness hypothesis is a restriction coming from the counterbalanced random walk. We now point at some variants with less restrictive hypotheses, which hold as long as we no longer consider the whole triplet. This allows us to drop some of the conditions we just mentioned, and the proofs will be embedded in the proof of Theorem 1.1. We start by removing the centredness hypothesis when working only with the pair $(S_n, \hat S_n)$ in the diffusive regime $p \in [0,1/2)$.

Theorem 1.2. Let $p \in [0,1/2)$ and suppose that $X \in L^2(\mathbb{P})$. Let $(S_n)$ be a random walk with typical step distributed as $X$ and denote by $(\hat S_n)$ its positive step-reinforced version. Then we have the joint weak convergence of the scaled sequence, in the sense of Skorokhod as $n$ tends to infinity,
$$\left(\frac{S_{\lfloor nt \rfloor} - \lfloor nt \rfloor\,\mathbb{E}(X)}{\sigma\sqrt{n}},\ \frac{\hat S_{\lfloor nt \rfloor} - \lfloor nt \rfloor\,\mathbb{E}(X)}{\sigma\sqrt{n}}\right)_{t \ge 0} \Longrightarrow \big(B_t, \hat B_t\big)_{t \ge 0}, \tag{1.7}$$
towards a Gaussian process, where $B$ is a Brownian motion and $\hat B$ is a noise reinforced Brownian motion with covariance $\mathbb{E}[B_s \hat B_t] = t^{p}(t \wedge s)^{1-p}$.
It follows that the limit process in (1.7) admits the integral representation
$$\hat B_t = t^{p}\int_0^t s^{-p}\,\mathrm{d}\beta^r_s,$$
where $B = (B_t)_{t \ge 0}$ and $\beta^r = (\beta^r_t)_{t \ge 0}$ denote two standard Brownian motions with $\mathbb{E}(B_t \beta^r_s) = (1-p)(t \wedge s)$. This result extends Theorem 3.3 in [11] to the pair $(S, \hat S)$. Notice that the factor $1-p$ in the correlation can be interpreted in terms of the definition of the noise reinforced random walk: at each discrete time step, with probability $1-p$ the processes $\hat S$ and $S$ share the same step $X_n$.
Turning our attention to the counterbalanced random walk, when working only with the pair $(S_n, \check S_n)$ we can extend the convergence to $p \in [0,1)$; this is the content of the following result:

Theorem 1.3. Let $p \in [0,1)$ and suppose that $X \in L^2(\mathbb{P})$ is centred. If $(S_n)$ is a random walk with typical step distributed as $X$ and $(\check S_n)$ is its counterbalanced version of parameter $p$, then we have the weak convergence of the sequence of processes, in the sense of Skorokhod as $n$ tends to infinity,
$$\left(\frac{S_{\lfloor nt \rfloor}}{\sigma\sqrt{n}},\ \frac{\check S_{\lfloor nt \rfloor}}{\sigma\sqrt{n}}\right)_{t \ge 0} \Longrightarrow \big(B_t, \check B_t\big)_{t \ge 0}, \tag{1.8}$$
where $B$ is a Brownian motion, $\check B$ is a noise counterbalanced Brownian motion with covariance $\mathbb{E}[B_s \check B_t] = t^{-p}(t \wedge s)^{p+1}(1-p)/(1+p)$, and $\sigma^2 = \mathbb{E}[X^2]$. If $p = 1$ and $X$ follows the Rademacher distribution, the result still holds, and in particular $B$ and $\check B$ are independent.
Moreover, the limit process in (1.8) admits the simple integral representation
$$\check B_t = t^{-p}\int_0^t s^{p}\,\mathrm{d}\beta^c_s,$$
where $B = (B_t)_{t \ge 0}$ and $\beta^c = (\beta^c_t)_{t \ge 0}$ denote two standard Brownian motions with $\mathbb{E}(B_s \beta^c_t) = (1-p)(t \wedge s)$.
Finally, we turn back our attention to the noise reinforced setting when the parameter is $p = 1/2$. Our method allows us to establish an invariance principle for the step-reinforced random walk at criticality, $p = 1/2$, but notice that in this case we do not establish a joint convergence, as the required scalings are no longer compatible.

Theorem 1.4. Let $p = 1/2$ and suppose that $X \in L^2(\mathbb{P})$. Then we have the weak convergence of the sequence of processes, in the sense of Skorokhod as $n$ tends to infinity,
$$\left(\frac{\hat S_{\lfloor n^t \rfloor}}{\sigma\sqrt{n^t \log n}}\right)_{t \ge 0} \Longrightarrow (B_t)_{t \ge 0},$$
where $B = (B_t)_{t \ge 0}$ denotes a standard Brownian motion.
Our proofs rely on a version of the martingale functional central limit theorem (abbreviated MFCLT), which we state for the reader's convenience; for more general versions, we refer to Chapter VIII in [20]. If $M = (M^1, \dots, M^d)$ is an rcll $d$-dimensional process, we denote by $\Delta M$ its jump process, the $d$-dimensional process null at $0$ defined by $\Delta M_t = M_t - M_{t-}$ for $t > 0$.

Theorem 1.5 (MFCLT). For each $n \in \mathbb{N}$, let $M^n = (M^{n,1}, \dots, M^{n,d})$ be an rcll, square integrable $d$-dimensional martingale null at $0$, and let $M = (M^1, \dots, M^d)$ be a continuous, centred $d$-dimensional Gaussian martingale. Suppose that there exists some dense set $D \subset \mathbb{R}_+$ such that for each $t \in D$ and $i, j \in \{1, \dots, d\}$, as $n \uparrow \infty$,
$$\langle M^{n,i}, M^{n,j}\rangle_t \to \langle M^i, M^j\rangle_t \quad \text{in probability,} \tag{1.10}$$
and
$$\sup_{s \le t}|\Delta M^n_s| \to 0 \quad \text{in probability.} \tag{1.11}$$
Then $M^n$ converges weakly towards $M$, in the sense of Skorokhod, as $n \to \infty$.

The rest of this paper is organised as follows: In Section 2 we introduce a martingale, crucial for our reasoning, associated with step-reinforced random walks and investigate its properties. We derive maximal inequalities and asymptotic results for the noise reinforced random walk that will be needed in the sequel. Section 3 is devoted to the proof of Theorem 1.1 under the additional assumption that the typical step $X$ is bounded, and in Section 4 we discuss how to relax this assumption to the general case of unbounded steps by a truncation argument. In the process, we will also deduce the proofs of Theorem 1.2 and Theorem 1.3. Finally, in Section 5 we address the proof of Theorem 1.4, and we shall again proceed in two stages. Since many arguments can be carried over from the previous sections, some details are skipped.

The martingales associated to a reinforced random walk
In this section we work under the additional assumption that the typical step $X \in L^2(\mathbb{P})$ is centred, and we recall that we denote by $\sigma^2 = \mathbb{E}(X^2)$ its variance. The centredness hypothesis is maintained throughout Sections 3 and 4, but dropped in Section 5.
Recall that if $M = (M_n)_{n \ge 0}$ is a discrete-time, real-valued and square integrable martingale, then its predictable variation process $\langle M\rangle$ is defined by $\langle M\rangle_0 = 0$ and, for $n \ge 1$,
$$\langle M\rangle_n = \sum_{k=1}^{n} \mathbb{E}\big((M_k - M_{k-1})^2 \mid \mathcal{F}_{k-1}\big),$$
while if $(Z_n)$ is another martingale, the predictable covariation $\langle M, Z\rangle$ of the pair is defined by $\langle M, Z\rangle_0 = 0$ and, for $n \ge 1$,
$$\langle M, Z\rangle_n = \sum_{k=1}^{n} \mathbb{E}\big((M_k - M_{k-1})(Z_k - Z_{k-1}) \mid \mathcal{F}_{k-1}\big).$$
We define two sequences $(\hat a_n, n \ge 1)$, $(\check a_n, n \ge 1)$ as follows: let $\hat a_1 = \check a_1 = 1$ and, for each $n \in \{2, 3, \dots\}$, set
$$\hat a_n = \prod_{k=1}^{n-1} \hat\gamma_k^{-1}, \qquad \check a_n = \prod_{k=1}^{n-1} \check\gamma_k^{-1}, \qquad \text{where respectively} \quad \hat\gamma_n = \frac{n+p}{n}, \quad \check\gamma_n = \frac{n-p}{n}.$$

Proposition 2.1. The processes $\hat M = (\hat M_n)_{n \ge 0}$, $\check M = (\check M_n)_{n \ge 0}$ defined by $\hat M_0 = \check M_0 = 0$ and $\hat M_n = \hat a_n \hat S_n$, $\check M_n = \check a_n \check S_n$ for $n \ge 1$ are centred square integrable martingales, and we denote by $(\mathcal{F}_n)$ the natural filtration generated by the pair, where $\mathcal{F}_0$ is the trivial sigma-field. Further, their respective predictable quadratic variation processes are given by $\langle \hat M\rangle_0 = \langle \check M\rangle_0 = 0$ and, for all $n \ge 1$,
$$\langle \hat M\rangle_n = \sigma^2 + \sum_{k=2}^{n} \hat a_k^2\left((1-p)\sigma^2 + p\,\frac{\hat V_{k-1}}{k-1} - p^2\Big(\frac{\hat S_{k-1}}{k-1}\Big)^2\right) \tag{2.2}$$
and
$$\langle \check M\rangle_n = \sigma^2 + \sum_{k=2}^{n} \check a_k^2\left((1-p)\sigma^2 + p\,\frac{\hat V_{k-1}}{k-1} - p^2\Big(\frac{\check S_{k-1}}{k-1}\Big)^2\right), \tag{2.3}$$
where $(\hat V_n)_{n \ge 1}$ is the step-reinforced process given by $\hat V_n = \hat X_1^2 + \dots + \hat X_n^2$ and the sums should be considered identically zero for $n = 1$.
Proof. Starting with the positive-reinforced case, notice that for any $n \ge 1$ we have
$$\mathbb{E}\big(\hat X_{n+1} \mid \mathcal{F}_n\big) = (1-p)\,\mathbb{E}(X_{n+1}) + \frac{p}{n}\,\hat S_n = \frac{p}{n}\,\hat S_n.$$
Hence, since $\hat S_{n+1} = \hat S_n + \hat X_{n+1}$ and $\hat\gamma_n = (n+p)/n$, we get $\mathbb{E}(\hat S_{n+1}\mid\mathcal{F}_n) = \hat\gamma_n \hat S_n$, and therefore we obtain
$$\mathbb{E}\big(\hat M_{n+1} \mid \mathcal{F}_n\big) = \hat a_{n+1}\,\mathbb{E}\big(\hat S_{n+1} \mid \mathcal{F}_n\big) = \hat a_{n+1}\hat\gamma_n \hat S_n = \hat a_n \hat S_n = \hat M_n.$$
Moreover, as $X$ is centred and the steps $(\hat X_k)$ are identically distributed by what was discussed in the introduction, we have $\mathbb{E}(\hat X_n) = \mathbb{E}(X_1) = \mathbb{E}(X) = 0$, and we conclude that $(\hat M_n)_{n \ge 0}$ is a centred martingale. Turning our attention to its quadratic variation, we have $\mathbb{E}(\hat S_n^2) \le n^2\,\mathbb{E}(X^2) = n^2\sigma^2$, and hence $\hat M_n$ is indeed square integrable and its predictable quadratic variation exists. Next, we observe that for $n \ge 1$ we have
$$\langle \hat M\rangle_n = \sigma^2 + \sum_{k=2}^{n} \hat a_k^2\left(\mathbb{E}\big(\hat X_k^2 \mid \mathcal{F}_{k-1}\big) - \Big(p\,\frac{\hat S_{k-1}}{k-1}\Big)^2\right). \tag{2.6}$$
Finally, as was pointed out in the proof of Lemma 3 in [9], and as can be verified from the definition of the $\hat X_n$, it holds that
$$\mathbb{E}\big(\hat X_k^2 \mid \mathcal{F}_{k-1}\big) = (1-p)\sigma^2 + p\,\frac{\hat V_{k-1}}{k-1},$$
and hence we arrive at formula (2.2). For the negative-reinforced case, the proof follows very similar steps after minor modifications. Since $\mathbb{E}(\check X_{n+1}\mid\mathcal{F}_n) = -\frac{p}{n}\check S_n$, we have $\mathbb{E}(\check S_{n+1}\mid\mathcal{F}_n) = (1 - p/n)\check S_n = \check\gamma_n \check S_n$, and the martingale property for $(\check M_n)_{n \ge 0}$ follows. For the quadratic variation, the proof is the same after noticing that, since clearly $\check X_k^2 = \hat X_k^2$, we can also write $\hat V_n = \check X_1^2 + \dots + \check X_n^2$.
We record for further use the following asymptotic behaviours. The first ones concern the positive-reinforced case and hold for $p \in (0,1/2)$:
$$\lim_{n\to\infty} \frac{1}{n^{1-2p}}\sum_{k=1}^{n} \hat a_k^2 = \frac{1}{1-2p}, \qquad \hat a_n \sim \frac{\Gamma(n)}{\Gamma(n+p)} \sim n^{-p} \quad \text{as } n \uparrow \infty, \tag{2.10}$$
while for $p = 1/2$ the asymptotic behaviour of the series changes:
$$\lim_{n\to\infty} \frac{1}{\log n}\sum_{k=1}^{n} \hat a_k^2 = 1, \qquad \hat a_n \sim \frac{\Gamma(n)}{\Gamma(n+1/2)} \sim n^{-1/2} \quad \text{as } n \uparrow \infty, \tag{2.11}$$
which is the reason behind the different scaling appearing in Theorem 1.4. On the other hand, for the negatively-reinforced case we have, for $p \in (0,1)$,
$$\lim_{n\to\infty} \frac{1}{n^{1+2p}}\sum_{k=1}^{n} \check a_k^2 = \frac{1}{1+2p}, \qquad \check a_n \sim \frac{\Gamma(n)}{\Gamma(n-p)} \sim n^{p} \quad \text{as } n \uparrow \infty. \tag{2.12}$$
The limits are derived from standard asymptotics of the Gamma function and were already pointed out in Bercu [4]. We first focus our attention on a law of large numbers that will be needed in the proof of Theorem 1.2.

Theorem 2.2. Let $p \in (0,1/2)$ and suppose that the typical step $X$ is centred and bounded. Then
$$\lim_{n\to\infty} \frac{\hat S_n}{n} = 0 \quad \text{a.s.}$$
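The Gamma-ratio asymptotics above are easy to check numerically. The following sketch (our own illustration, not code from the paper; the function name is ours) builds the product $\prod_{k=1}^{n-1} k/(k+p)$ coming from the recursion $\hat a_{n+1} = \hat a_n\, n/(n+p)$ with $\hat a_1 = 1$, compares it with the exact Gamma-ratio identity, and confirms the polynomial order $n^{-p}$ up to the constant $\Gamma(1+p)$:

```python
import math

def a_hat(n, p):
    """a_hat_n = prod_{k=1}^{n-1} k / (k + p), i.e. the recursion
    a_hat_{n+1} = a_hat_n * n / (n + p) started from a_hat_1 = 1."""
    a = 1.0
    for k in range(1, n):
        a *= k / (k + p)
    return a

p, n = 0.3, 5000
a = a_hat(n, p)
# Exact identity: prod_{k=1}^{n-1} k/(k+p) = Gamma(n) Gamma(1+p) / Gamma(n+p)
ident = math.exp(math.lgamma(n) + math.lgamma(1 + p) - math.lgamma(n + p))
# Since Gamma(n)/Gamma(n+p) ~ n^{-p}, we get n^p * a_hat_n -> Gamma(1+p)
print(abs(a - ident), n**p * a, math.gamma(1 + p))
```

The same check applies, mutatis mutandis, to $\check a_n$ with $k/(k-p)$ in place of $k/(k+p)$.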
Proof. The proof of Theorem 2.2 is adapted from [4] and outlined here for the sake of completeness. Under the standing assumption that the typical step $X$ is bounded, we gather from (2.6) that
$$\langle \hat M\rangle_n \le \nu_n := \|X\|_\infty^2 \sum_{k=1}^{n} \hat a_k^2;$$
more precisely, as $p < 1/2$, the sequence $\nu_n$ increases to infinity at the polynomial rate $n^{1-2p}$. We then obtain from the strong law of large numbers for martingales, see for instance Theorem 1.3.24 in [16], that
$$\lim_{n\to\infty} \frac{\hat M_n}{\nu_n} = 0 \quad \text{a.s.},$$
and our claim follows, since $\hat S_n/n = \hat M_n/(\hat a_n n)$ and $\nu_n/(\hat a_n n) \sim n^{-p} \to 0$.
We continue by investigating bounds for the second moments of the supremum process of the step-reinforced random walk $\hat S$ in all regimes, and then deduce related LLN-type results that will also be needed afterwards.

Lemma 2.3. For every $n \ge 2$, the following bounds hold for some numerical constant $c > 0$:
$$\mathbb{E}\Big(\sup_{k \le n}|\hat S_k|^2\Big) \le \begin{cases} c\,\sigma^2\, n & \text{if } p \in (0,1/2),\\[2pt] c\,\sigma^2\, n \log n & \text{if } p = 1/2,\\[2pt] c\,\sigma^2\, n^{2p} & \text{if } p \in (1/2,1). \end{cases}$$

Proof. We tackle each of the three cases $p \in (0,1/2)$, $p = 1/2$ and $p \in (1/2,1)$ individually. (i) Let us first consider the case $p \in (0,1/2)$. We observe by (2.6) and (2.10) that
$$\mathbb{E}(\hat M_n^2) = \mathbb{E}\big(\langle \hat M\rangle_n\big) \le \sigma^2 \sum_{k=1}^{n} \hat a_k^2 \sim \frac{\sigma^2}{1-2p}\, n^{1-2p} \quad \text{as } n \to \infty.$$
Hence we obtain by Doob's inequality that
$$\mathbb{E}\Big(\sup_{k \le n}|\hat M_k|^2\Big) \le c_1\sigma^2\, n^{1-2p},$$
where $c_1 > 0$ is some constant. Since $\hat a_k$ is decreasing, it evidently holds that $\sup_{k \le n}|\hat S_k| \le \hat a_n^{-1}\sup_{k \le n}|\hat M_k|$, and it follows readily that
$$\mathbb{E}\Big(\sup_{k \le n}|\hat S_k|^2\Big) \le c_1\sigma^2\, n^{1-2p}\,\hat a_n^{-2} \sim c_1\sigma^2\, n \quad \text{as } n \to \infty.$$
By monotonicity, we conclude the proof for this case.
(ii) Let us now assume that $p = 1/2$; we then obtain by (2.11) and monotonicity that for all $n \ge 2$ we have $\mathbb{E}(\langle \hat M\rangle_n) \le c\,\sigma^2 \log n$.
We conclude as in the previous case that this implies
$$\mathbb{E}\Big(\sup_{k \le n}|\hat S_k|^2\Big) \le c_2\sigma^2\, n \log n,$$
where $c_2 > 0$ is some constant.
(iii) Finally, let us consider the case $p > 1/2$. Here, since $2p > 1$, the series $\sum_k \hat a_k^2$ converges to some finite constant $\tilde c$. This entails that $\mathbb{E}(\langle \hat M\rangle_n) \le \sigma^2\tilde c$, and we deduce as before the bound
$$\mathbb{E}\Big(\sup_{k \le n}|\hat S_k|^2\Big) \le c_3\sigma^2\, n^{2p},$$
where $c_3 > 0$ is some constant.
Thus we have established the desired bounds.
As an application of the maximal inequalities displayed in Lemma 2.3 for the positive step-reinforced random walk, we establish $L^2$-convergence results for all regimes $p \in (0,1)$ that will be needed in our proofs: in each regime there is a natural normalisation $f(n)$ for which $\hat S_n/f(n) \to 0$ in $L^2(\mathbb{P})$, with an additional logarithmic factor at criticality $p = 1/2$.
Proof. Let $(f(n))$ be a sequence of positive numbers and notice that, by Lemma 2.3, if $\mathbb{E}(\sup_{k \le n}|\hat S_k|^2)/f(n)^2 \to 0$ as $n \to \infty$, then the sequence $(\hat S_n/f(n))$ converges to $0$ in the $L^2$-sense. Now, respectively for each one of the three cases: (i) we take $f(n) := n^{1-p}$ and observe that $n\cdot n^{2p-2} = n^{2p-1} \to 0$ as $n \to \infty$, since $p \in (0,1/2)$.
This concludes the proof.
We wrap up our discussion by mentioning that in the superdiffusive regime $p \in (1/2,1)$ the convergence displayed in Corollary 2.4 can be improved. The following proposition has already been observed in [7] using a different technique; we present here a more elementary approach.
Proposition 2.5. For every fixed $p \in (1/2,1)$, we have
$$\lim_{n\to\infty} \frac{\hat S_n}{n^{p}} = \hat W \quad \text{a.s. and in } L^2(\mathbb{P}),$$
where $\hat W \in L^2(\mathbb{P})$ is a non-degenerate random variable.
Proof. Thanks to Proposition 2.1 we know that $\hat M_n = \hat a_n \hat S_n$ is a martingale. Further, we obtain from (2.6) and the asymptotics $\hat a_n \sim n^{-p}$ that, for some constant $C$ large enough,
$$\mathbb{E}\big(|\hat M_n|^2\big) \le C\sum_{k=1}^{\infty} k^{-2p} \quad \text{for all } n \in \mathbb{N}.$$
Since $p > 1/2$, the latter series is summable and we conclude that $\sup_{n \in \mathbb{N}} \mathbb{E}(|\hat M_n|^2) < \infty$.
By Doob's martingale convergence theorem there exists a non-degenerate random variable $\hat W \in L^2(\mathbb{P})$ such that $\hat M_n \to \hat W$ a.s. and in $L^2(\mathbb{P})$ as $n \to \infty$. Using the asymptotics $\hat a_n \sim n^{-p}$ we conclude the proof.

Proof of Theorem 1.1 when X is bounded.
Recall that in this section and in Section 4 we work under the additional assumption that $X$ is centred. As was discussed in the introduction, for positive step-reinforced random walks the centredness hypothesis can be assumed without loss of generality, but this is no longer the case for negative step-reinforced random walks. We are now in a position to prove Theorem 1.1 when $X$ is bounded, and in the process we will also establish Theorem 1.2 and Theorem 1.3. For that reason, in several statements we also consider $p \in [1/2,1]$ when working with the counterbalanced random walk. Additionally, when we work with the counterbalanced random walk for $p = 1$, we assume, as in Theorem 1.3, that $X$ is Rademacher distributed; this will be recalled when necessary. Our approach relies on the martingales introduced in Proposition 2.1 and on the MFCLT, Theorem 1.5. We will establish the general case $X \in L^2(\mathbb{P})$ by a truncation argument, detailed in Section 4. Now, the key point is that since, by (2.10) resp. (2.12), we have for any $t \ge 0$
$$\hat a_{\lfloor nt\rfloor}\, n^{p} \to t^{-p} \qquad \text{and} \qquad \check a_{\lfloor nt\rfloor}\, n^{-p} \to t^{p} \qquad \text{as } n \uparrow \infty,$$
in order to obtain the convergence (1.5) it is enough to prove (except for a technical detail at the origin in the third coordinate, which will be properly addressed) the convergence
$$\big(N^{(n)}_t,\ \hat N^{(n)}_t,\ \check N^{(n)}_t\big)_{t \ge 0} \Longrightarrow \big(B_t,\ t^{-p}\hat B_t,\ t^{p}\check B_t\big)_{t \ge 0}, \tag{3.1}$$
where
$$\hat N^{(n)}_t := \frac{n^{p-1/2}}{\sigma}\,\hat M_{\lfloor nt\rfloor} \qquad \text{and} \qquad \check N^{(n)}_t := \frac{n^{-1/2-p}}{\sigma}\,\check M_{\lfloor nt\rfloor} \tag{3.2}$$
are just rescaled, continuous-time versions of the martingales we introduced in Proposition 2.1, multiplied by the respective factors $n^{p-1/2}$ and $n^{-1/2-p}$. We will also denote by $N^{(n)}$ the scaled random walk $N^{(n)}_t := S_{\lfloor nt\rfloor}/(\sigma\sqrt{n})$ in the first coordinate, and we proceed to establish (3.1) by verifying that the conditions of the MFCLT are satisfied. In that direction, and recalling condition (1.11), we start by investigating the asymptotic negligibility of the jumps:

Lemma 3.1 (Asymptotic negligibility of jumps). For each $T \ge 0$ we have, as $n \to \infty$,
$$\sup_{t \le T}\big|\Delta \hat N^{(n)}_t\big| \to 0 \qquad \text{and} \qquad \sup_{t \le T}\big|\Delta \check N^{(n)}_t\big| \to 0.$$

Proof. (i) We will show that $\sup_t |\Delta \hat N^{(n)}_t|$ can be bounded uniformly by a function decreasing to $0$ as $n \to \infty$. In that direction, notice that $\Delta \hat M_{k+1} = \hat a_{k+1}(\hat S_{k+1} - \hat\gamma_k \hat S_k)$. By hypothesis we have $\|X\|_\infty < \infty$, and since $\hat a_k$ is decreasing, it is enough to show that $\sup_{k \in \mathbb{N}} k(\hat a_k - \hat a_{k+1}) < \infty$; this follows from
$$\hat a_k - \hat a_{k+1} = \hat a_k\Big(1 - \frac{k}{k+p}\Big) = \hat a_k\,\frac{p}{k+p} \sim p\,k^{-(p+1)}. \tag{3.4}$$
(ii) Since we also have $\Delta \check M_{k+1} = \check a_{k+1}(\check S_{k+1} - \check\gamma_k \check S_k)$, where $\check\gamma_n = (n-p)/n$, we deduce, recalling the asymptotic behaviour (2.12) of $\check a_n$, that the supremum in question can be uniformly bounded by $C\,(nT)^{p}\log(nT)\, n^{-1/2-p}$ for a constant $C$ large enough, entailing that $\sup_{t \le T}|\Delta \check N^{(n)}_t| \to 0$ pointwise for each $T$.

Now we turn our attention to the joint convergence of the quadratic variation processes; this is the content of the following lemma:

Lemma 3.2 (Convergence of quadratic variations). For each fixed $t \in \mathbb{R}_+$, the following convergences hold in probability, for $p \in (0,1/2)$ unless specified otherwise:
(i) $\langle \hat N^{(n)}, \hat N^{(n)}\rangle_t \to \dfrac{t^{1-2p}}{1-2p}$;
(ii) $\langle \check N^{(n)}, \check N^{(n)}\rangle_t \to \dfrac{t^{1+2p}}{1+2p}$, for $p \in (0,1]$;
(iii) $\langle \hat N^{(n)}, N^{(n)}\rangle_t \to t^{1-p}$;
(iv) $\langle \check N^{(n)}, N^{(n)}\rangle_t \to \dfrac{(1-p)\,t^{1+p}}{1+p}$, for $p \in (0,1]$;
(v) $\langle \hat N^{(n)}, \check N^{(n)}\rangle_t \to \dfrac{(1-p)\,t}{1+p}$;
where for the case $p = 1$ in (ii) and (iv) we assume that $X$ is Rademacher distributed.
Lemma 3.2 provides the key asymptotic behaviour for the sequence of quadratic variations and its proof is rather long.
Proof. We tackle each item (i)-(v) individually, item (v) being the most arduous.
(i) For each $n \in \mathbb{N}$, we gather from (2.2) that the predictable quadratic variation of this martingale is given by
$$\langle \hat N^{(n)}, \hat N^{(n)}\rangle_t = \frac{n^{2p-1}}{\sigma^2}\left(\sigma^2 + \sum_{k=2}^{\lfloor nt\rfloor} \hat a_k^2\Big((1-p)\sigma^2 + p\,\frac{\hat V_{k-1}}{k-1} - p^2\Big(\frac{\hat S_{k-1}}{k-1}\Big)^2\Big)\right).$$
We will study separately the limit as $n \to \infty$ of the three nontrivial terms, as the first one evidently vanishes. To start with, it follows readily from (2.10) that
$$\lim_{n\to\infty} n^{2p-1}(1-p)\sum_{k=1}^{\lfloor nt\rfloor} \hat a_k^2 = \frac{(1-p)\,t^{1-2p}}{1-2p}. \tag{3.5}$$
Now, we claim that the second term converges to zero:
$$\lim_{n\to\infty} n^{2p-1}\sum_{k=2}^{\lfloor nt\rfloor} \hat a_k^2\, p^2\Big(\frac{\hat S_{k-1}}{k-1}\Big)^2 = 0 \quad \text{a.s.} \tag{3.6}$$
Indeed, by (2.10) it suffices to notice that, by Theorem 2.2, we have $\lim_{k\to\infty}\hat S_k/k = 0$ a.s., since we recall that by our standing assumptions $X$ is both bounded and centred. Finally, we claim that for the last term the following limit holds:
$$\lim_{n\to\infty} n^{2p-1}\sum_{k=2}^{\lfloor nt\rfloor} \hat a_k^2\, p\,\frac{\hat V_{k-1}}{k-1} = \frac{p\,\sigma^2\, t^{1-2p}}{1-2p}. \tag{3.7}$$
In that direction, notice that $(\hat V_n)_{n \in \mathbb{N}}$ is the reinforced version of the (non-centred) random walk $V_n = X_1^2 + \dots + X_n^2$, $n \in \mathbb{N}$, whose typical step has mean $\mathbb{E}(X_i^2) = \sigma^2$. In order to work with a centred reinforced random walk, we introduce
$$\hat W_n = \big(\hat X_1^2 - \mathbb{E}(X^2)\big) + \dots + \big(\hat X_n^2 - \mathbb{E}(X^2)\big) = \hat Y_1 + \dots + \hat Y_n \tag{3.8}$$
with an obvious notation. This is the step-reinforced version of the random walk with typical step distributed as $X^2 - \mathbb{E}(X^2)$, which is centred and bounded. This allows us to write, for each $k \in \mathbb{N}$, $\hat V_k = \hat W_k + k\,\mathbb{E}(X^2) = \hat W_k + k\sigma^2$; replacing in (3.7) and applying the law of large numbers (Theorem 2.2) to the centred reinforced random walk $(\hat W_n)_{n \in \mathbb{N}}$, we obtain (3.7). Now, combining (3.5), (3.6) and (3.7), we conclude that
$$\lim_{n\to\infty}\langle \hat N^{(n)}, \hat N^{(n)}\rangle_t = \frac{t^{1-2p}}{1-2p} \quad \text{a.s.}$$
(ii) By (2.3),
$$\langle \check N^{(n)}, \check N^{(n)}\rangle_t = \frac{n^{-1-2p}}{\sigma^2}\left(\sigma^2 + \sum_{k=2}^{\lfloor nt\rfloor} \check a_k^2\Big((1-p)\sigma^2 + p\,\frac{\hat V_{k-1}}{k-1} - p^2\Big(\frac{\check S_{k-1}}{k-1}\Big)^2\Big)\right),$$
and we now study the convergence of the normalised series in the previous expression. By (2.12), the first term converges towards
$$\lim_{n\to\infty} n^{-1-2p}(1-p)\sum_{k=1}^{\lfloor nt\rfloor} \check a_k^2 = \frac{(1-p)\,t^{1+2p}}{1+2p}.$$
Turning our attention to the second term, we recall from Theorem 1.1 in [8] that $(\check S_n)$ satisfies a law of large numbers:
$$\lim_{n\to\infty} \frac{\check S_n}{n} = \frac{(1-p)\,m}{1+p} = 0 \quad \text{in probability.}$$
Since $\|X\|_\infty < \infty$, we have $n^{-1}|\check S_n| \le \|X\|_\infty$, and hence the convergence also holds in $L^2(\mathbb{P})$. This remark, paired with the asymptotic behaviour of the series (2.12), yields
$$\lim_{n\to\infty} n^{-1-2p}\sum_{k=2}^{\lfloor nt\rfloor} \check a_k^2\, p^2\Big(\frac{\check S_{k-1}}{k-1}\Big)^2 = 0 \quad \text{in } L^1(\mathbb{P})$$
for all $t \ge 0$, and a fortiori in probability. Finally, we claim that
$$\lim_{n\to\infty} n^{-1-2p}\sum_{k=2}^{\lfloor nt\rfloor} \check a_k^2\, p\,\frac{\hat V_{k-1}}{k-1} = \frac{p\,\sigma^2\, t^{1+2p}}{1+2p}. \tag{3.10}$$
We start by assuming that $p < 1$ and, proceeding as in (3.8), we set $\hat W_n := \hat V_n - n\sigma^2$. It follows that $\hat V_n = \hat W_n + n\sigma^2$, where $\hat W_n$ is a centred noise reinforced random walk whose steps have the law of $X^2 - \mathbb{E}(X^2)$, with memory parameter $p$. Since $p \in (0,1)$, we recall from Corollary 2.4 that in all regimes $n^{-1}\hat W_n \to 0$ in $L^1(\mathbb{P})$. As a consequence,
$$\lim_{n\to\infty} n^{-1-2p}\sum_{k=2}^{\lfloor nt\rfloor} \check a_k^2\, p\,\frac{\hat W_{k-1} + (k-1)\sigma^2}{k-1} = \frac{p\,\sigma^2\, t^{1+2p}}{1+2p} \tag{3.14}$$
in probability, which proves (3.10). If $p = 1$, by hypothesis $X$ takes its values in $\{-1,1\}$ and $\hat V_{k-1} = k-1$, yielding that the previously established limit (3.14) still holds. Notice however that if we allowed $X$ to take arbitrary values we could no longer proceed as we just did, since in that case $\hat V_n$ is a straight line with random slope: $\hat V_n = nX_1^2$. Putting all the pieces together, we obtain (ii).
(iii) Recalling that $\hat{X}_k = X_k\mathbf{1}_{\{\varepsilon_k = 0\}} + \hat{X}_{U[k-1]}\mathbf{1}_{\{\varepsilon_k = 1\}}$, and using the independence of $X_k$, $\varepsilon_k$ and $U[k-1]$ from $\mathcal{F}_{k-1}$, we get for $k \ge 2$
$$E(\Delta\hat{M}_k X_k \mid \mathcal{F}_{k-1}) = \hat{a}_k(1-p)\sigma^2,$$
since the steps are centred, while for $k = 1$ we simply get $E(\hat{M}_1 X_1) = \sigma^2$. From here, we deduce
$$\langle \hat{N}^{(n)}, N^{(n)}\rangle_t = n^{p-1}\sum_{k=1}^{\lfloor nt\rfloor} E(\Delta\hat{M}_k X_k \mid \mathcal{F}_{k-1}) = \sigma^2(1-p)\,n^{p-1}\sum_{k=1}^{\lfloor nt\rfloor}\hat{a}_k + o(1),$$
and, since $\hat{a}_k \sim k^{-p}$, the right-hand side converges towards $\sigma^2 t^{1-p}$ as $n\to\infty$.

(iv) Recalling that in the counterbalanced case $\check{X}_k = X_k\mathbf{1}_{\{\varepsilon_k = 0\}} - \check{X}_{U[k-1]}\mathbf{1}_{\{\varepsilon_k = 1\}}$, we deduce from arguments similar to those of the reinforced case that
$$E(\Delta\check{M}_k X_k \mid \mathcal{F}_{k-1}) = \check{a}_k(1-p)\sigma^2.$$
Notice that if $p = 1$ the argument still holds, and hence the above quantity is null for $k \ge 1$. It follows that
$$\langle \check{N}^{(n)}, N^{(n)}\rangle_t = n^{-(1+p)}\sum_{k=1}^{\lfloor nt\rfloor} E(\Delta\check{M}_k X_k \mid \mathcal{F}_{k-1}) = \sigma^2(1-p)\,n^{-(1+p)}\sum_{k=1}^{\lfloor nt\rfloor}\check{a}_k,$$
and from the convergence $n^{-(1+p)}\sum_{k=1}^{\lfloor nt\rfloor}\check{a}_k \to t^{1+p}/(1+p)$, which follows from $\check{a}_k \sim k^{p}$, we obtain the desired limit. Finally, if $p = 1$, we clearly have $\lim_{n\to\infty}\langle\check{N}^{(n)}, N^{(n)}\rangle_t = 0$.

(v) Writing $\Delta\hat{M}_k = \hat{S}_{k-1}(\hat{a}_k - \hat{a}_{k-1}) + \hat{X}_k\hat{a}_k$ and similarly for $\Delta\check{M}_k$, notice that
$$E(\Delta\hat{M}_k\,\Delta\check{M}_k \mid \mathcal{F}_{k-1}) = E\Big(\big(\hat{S}_{k-1}(\hat{a}_k - \hat{a}_{k-1}) + \hat{X}_k\hat{a}_k\big)\big(\check{S}_{k-1}(\check{a}_k - \check{a}_{k-1}) + \check{X}_k\check{a}_k\big)\,\Big|\,\mathcal{F}_{k-1}\Big)$$
$$= \hat{S}_{k-1}(\hat{a}_k - \hat{a}_{k-1})\check{S}_{k-1}(\check{a}_k - \check{a}_{k-1}) + \hat{S}_{k-1}(\hat{a}_k - \hat{a}_{k-1})E(\check{X}_k \mid \mathcal{F}_{k-1})\check{a}_k + \check{S}_{k-1}(\check{a}_k - \check{a}_{k-1})E(\hat{X}_k \mid \mathcal{F}_{k-1})\hat{a}_k + E(\hat{X}_k\check{X}_k \mid \mathcal{F}_{k-1})\hat{a}_k\check{a}_k$$
$$=: P^{(a)}_k + P^{(b)}_k + P^{(c)}_k + P^{(d)}_k,$$
where the notation was assigned in order of appearance. We now study the asymptotic behaviour of these four terms individually. In that direction, we recall from (2.4) and (2.8) the identities
$$E(\hat{X}_k \mid \mathcal{F}_{k-1}) = p\,\frac{\hat{S}_{k-1}}{k-1}, \qquad E(\check{X}_k \mid \mathcal{F}_{k-1}) = -p\,\frac{\check{S}_{k-1}}{k-1},$$
as well as from (3.4) the asymptotic behaviour $(\hat{a}_k - \hat{a}_{k-1}) \sim p\,k^{-(p+1)}$, while a similar computation yields $(\check{a}_k - \check{a}_{k-1}) \sim p\,k^{p-1}$.
• From the identities and asymptotic estimates we just recalled, we have
$$P^{(c)}_k = \check{S}_{k-1}(\check{a}_k - \check{a}_{k-1})E(\hat{X}_k \mid \mathcal{F}_{k-1})\hat{a}_k = \check{S}_{k-1}(\check{a}_k - \check{a}_{k-1})\,\frac{p\hat{S}_{k-1}}{k-1}\,\hat{a}_k \sim \frac{\check{S}_{k-1}}{k}\,k^{p}\,p^2\,\frac{\hat{S}_{k-1}}{k-1}\,\hat{a}_k,$$
and since $\hat{a}_k \sim k^{-p}$, we have for some constant $C$ large enough
$$|P^{(c)}_k| \le C\,\frac{|\check{S}_{k-1}|}{k}\,\frac{|\hat{S}_{k-1}|}{k-1},$$
so that $n^{-1}\sum_{k=1}^{\lfloor nt\rfloor} P^{(c)}_k$ converges a.s. towards $0$ as $n \uparrow \infty$ by Lemma 2.2.
• For the term $P^{(b)}_k$, we can follow exactly the same line of reasoning in order to establish
$$\lim_{n\to\infty} n^{-1}\sum_{k=1}^{\lfloor nt\rfloor} P^{(b)}_k = 0 \quad\text{a.s.}$$
• Since $(\hat{a}_k - \hat{a}_{k-1})(\check{a}_k - \check{a}_{k-1}) \sim p^2 k^{-2}$, we deduce that
$$P^{(a)}_k = \hat{S}_{k-1}(\hat{a}_k - \hat{a}_{k-1})\,\check{S}_{k-1}(\check{a}_k - \check{a}_{k-1}) \sim p^2\,\frac{\hat{S}_{k-1}\check{S}_{k-1}}{k^2},$$
and we conclude as before that $\lim_{n\to\infty} n^{-1}\sum_{k=1}^{\lfloor nt\rfloor} P^{(a)}_k = 0$ a.s.
• Finally, since by definition
$$\hat{X}_k = X_k\mathbf{1}_{\{\varepsilon_k = 0\}} + \hat{X}_{U[k-1]}\mathbf{1}_{\{\varepsilon_k = 1\}}, \qquad \check{X}_k = X_k\mathbf{1}_{\{\varepsilon_k = 0\}} - \check{X}_{U[k-1]}\mathbf{1}_{\{\varepsilon_k = 1\}},$$
we have
$$P^{(d)}_k = \hat{a}_k\check{a}_k\,E(\hat{X}_k\check{X}_k \mid \mathcal{F}_{k-1}) = \hat{a}_k\check{a}_k\,E\Big(X_k^2\mathbf{1}_{\{\varepsilon_k = 0\}} - \sum_{j=1}^{k-1}\hat{X}_j\check{X}_j\mathbf{1}_{\{\varepsilon_k = 1,\,U[k-1] = j\}}\;\Big|\;\mathcal{F}_{k-1}\Big).$$
Since, on the one hand, $\hat{X}_j, \check{X}_j$ for $j < k$ are $\mathcal{F}_{k-1}$-measurable, while $\varepsilon_k$, $U[k-1]$ are independent of $\mathcal{F}_{k-1}$, denoting by $\check{G}$ the counterbalanced random walk made from the i.i.d. sequence $X_1^2, X_2^2, \ldots$ by the same instance of the reinforcement algorithm, we deduce
$$P^{(d)}_k = \hat{a}_k\check{a}_k\Big((1-p)\sigma^2 - p\,\frac{\check{G}_{k-1}}{k-1}\Big),$$
and since $\hat{a}_k\check{a}_k \to 1$ as $k \to \infty$, the problem boils down to studying the convergence, as $n \uparrow \infty$, of
$$\frac{1}{n}\sum_{k=1}^{\lfloor nt\rfloor}\Big((1-p)\sigma^2 - p\,\frac{\check{G}_{k-1}}{k-1}\Big).$$
The first term obviously converges towards $t(1-p)\sigma^2$, and we turn our attention to the second one. In that direction, at each $k$ we decompose $\check{G}(k) = \check{G}_1(k) + \sum_{j=2}^{k}\check{G}_j(k)$, where $\check{G}_j(k)$ consists exclusively of the sum of the steps that have been repeated $j$ times at step $k$. Since the steps have mean $m = \sigma^2$, we get from Lemma 4.1 in [8] (beware that $p$ here is $1-p$ in [8]) that:
(a) $\lim_{k\to\infty} k^{-1}\check{G}_1(k) = \sigma^2(1-p)/(1+p)$ a.s.
(b) $\lim_{k\to\infty} k^{-1}\sum_{j=2}^{k}|\check{G}_j(k)| = 0$ in probability.
Notice that (b) holds in $L^1(\mathbb{P})$ too, since $k^{-1}\sum_{j=2}^{k}|\check{G}_j(k)| \le \|X\|_\infty^2$, as there are at most $k-1$ repeated steps at time $k$. Hence,
$$\lim_{n\to\infty}\frac{p}{n}\sum_{k=1}^{\lfloor nt\rfloor}\frac{\sum_{j=2}^{k-1}\check{G}_j(k-1)}{k-1} = 0 \quad\text{in } L^1(\mathbb{P}),$$
and, by the almost sure convergence in (a) and convergence of Cesàro means, we deduce that
$$\lim_{n\to\infty}\frac{p}{n}\sum_{k=1}^{\lfloor nt\rfloor}\frac{\check{G}_1(k-1)}{k-1} = tp\sigma^2\,\frac{1-p}{1+p} \quad\text{in probability.}$$
Putting all the pieces together, we conclude that the following convergence holds in probability:
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{\lfloor nt\rfloor} P^{(d)}_k = \sigma^2(1-p)t - p\sigma^2\frac{1-p}{1+p}\,t.$$
Bringing all our calculations together, we conclude the convergence in probability
$$\lim_{n\to\infty}\langle \hat{N}^{(n)}, \check{N}^{(n)}\rangle_t = \sigma^2(1-p)t - p\sigma^2(1-p)(1+p)^{-1}t.$$
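For the reader's convenience, the limiting bracket just obtained simplifies by elementary algebra:

```latex
\lim_{n\to\infty}\langle \hat N^{(n)}, \check N^{(n)}\rangle_t
  = \sigma^2(1-p)\,t - p\,\sigma^2(1-p)(1+p)^{-1}\,t
  = \sigma^2(1-p)\,\frac{(1+p)-p}{1+p}\,t
  = \frac{\sigma^2(1-p)}{1+p}\,t .
```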
This concludes the proof of the lemma.
With this, we conclude the proof of Theorem 1.1 when X is bounded with an appeal to Lemma 3.1, Lemma 3.2 and the MFCLT (Theorem 1.5).

4 Reduction to the case of bounded steps
In this section, we shall only assume that the typical step $X \in L^2(\mathbb{P})$ of the step-reinforced random walk $\hat{S}$ is centred, and no longer that it is bounded. We shall complete the proof of Theorem 1.1 by means of a truncation argument reminiscent of the one in Section 4.3 of [11].

Preliminaries
The reduction argument relies on the following lemma taken from [20], which we state for the reader's convenience:

Lemma 4.1 (Lemma 3.31 in Chapter VI of [20]). Let $(Z_n)$ be a sequence of $d$-dimensional rcll (càdlàg) processes such that $\sup_{s \le t}|Z_n(s)| \to 0$ in probability for every $t \ge 0$. If $(Y_n)$ is another sequence of $d$-dimensional rcll processes with $Y_n \Rightarrow Y$ in the sense of Skorokhod, then $Y_n + Z_n \Rightarrow Y$ in the sense of Skorokhod.
Finally, we will need the following elementary lemma concerning convergence in metric spaces.

Lemma 4.2. Let $(E,d)$ be a metric space and let $(a^{(m)}_n)_{n,m\in\mathbb{N}}$ be a doubly indexed family in $E$ such that, for each fixed $m$, $\lim_{n\to\infty} a^{(m)}_n = a^{(m)}_\infty$, and $\lim_{m\to\infty} a^{(m)}_\infty = a_\infty$. Then there exists a nondecreasing sequence $(b(n))_{n\ge 1}$ converging to infinity such that $\lim_{n\to\infty} a^{(b(n))}_n = a_\infty$.

Proof. Since $\lim_{m\to\infty} a^{(m)}_\infty = a_\infty$, we may pick a strictly increasing sequence $(m_k)_k$ such that $d(a^{(m_k)}_\infty, a_\infty) \le 2^{-k}$ for each $k$. Moreover, since for each fixed $m_k$ the corresponding sequence $(a^{(m_k)}_n)_n$ converges, there exists a strictly increasing sequence $(n_k)_k$ satisfying that, for each $k$, $d(a^{(m_k)}_i, a^{(m_k)}_\infty) \le 2^{-k}$ for all $i \ge n_k$. Now, we set $b(n) := m_1$ for $n < n_1$ and, for $k \ge 1$, $b(n) := m_k$ if $n_k \le n < n_{k+1}$, and we claim that $(a^{(b(n))}_n)_n$ is the desired sequence. Indeed, it suffices to observe that for $n_k \le n < n_{k+1}$,
$$d(a^{(b(n))}_n, a_\infty) \le d(a^{(m_k)}_n, a^{(m_k)}_\infty) + d(a^{(m_k)}_\infty, a_\infty) \le 2^{-k+1}.$$
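The diagonal construction above can be made concrete on a toy doubly indexed family, say $a^{(m)}_n = 1/m + 1/n$ in $(\mathbb{R}, |\cdot|)$, where $a^{(m)}_\infty = 1/m$ and $a_\infty = 0$ (a minimal sketch; the choice $m_k = n_k = 2^k$ mirrors the $2^{-k}$ thresholds of the proof and is ours):

```python
import math

def a(m, n):
    # toy doubly indexed family: a^(m)_n = 1/m + 1/n,
    # so a^(m)_inf = 1/m and a_inf = 0
    return 1.0 / m + 1.0 / n

def b(n):
    """Diagonal index: b(n) = m_k = 2^k for n_k = 2^k <= n < 2^(k+1),
    so that both error terms are at most 2^-k."""
    if n < 2:
        return 1
    return 2 ** int(math.log2(n))

# along the diagonal sequence, a^(b(n))_n -> a_inf = 0
vals = [a(b(n), n) for n in range(1, 5000)]
assert vals[-1] < 0.01
assert max(vals[1000:]) < 0.01
```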

Reduction argument
Recall that we are assuming that the typical step is centred. In the course of this section we will use the fact that the truncated versions of the counterbalanced and noise reinforced random walks are still counterbalanced (resp. noise reinforced) random walks. Indeed, notice that if $(\check{S}_n)$ and $(\hat{S}_n)$ have been built from the i.i.d. sequence $(X_n)_{n\ge 1}$ by means of the negative-reinforcement and positive-reinforcement algorithms described in the introduction, splitting each $X_i$ for $i\in\mathbb{N}$ as
$$X_i = X_i\mathbf{1}_{\{|X_i|\le K\}} + X_i\mathbf{1}_{\{|X_i|>K\}}$$
yields natural decompositions of $(\check{S}_n)$ and $(\hat{S}_n)$ in terms of two counterbalanced (resp. noise reinforced) random walks,
$$\check{S}_n = \check{S}^{\le K}_n + \check{S}^{>K}_n, \qquad \hat{S}_n = \hat{S}^{\le K}_n + \hat{S}^{>K}_n,$$
where now $(\check{S}^{\le K}_n)$, $(\check{S}^{>K}_n)$ are counterbalanced versions with typical step centred and distributed respectively as
$$X^{\le K} := X\mathbf{1}_{\{|X|\le K\}} - E\big(X\mathbf{1}_{\{|X|\le K\}}\big) \tag{4.1}$$
and
$$X^{>K} := X\mathbf{1}_{\{|X|>K\}} - E\big(X\mathbf{1}_{\{|X|>K\}}\big), \tag{4.2}$$
an analogous statement holding in the reinforced case for $(\hat{S}^{\le K}_n)$, $(\hat{S}^{>K}_n)$. Moreover, $X^{\le K}$ is centred with variance $\sigma_K^2$, and $\sigma_K^2 \to \sigma^2$ as $K \uparrow \infty$, while the variance of $X^{>K}$, which we denote by $\eta_K^2$, converges towards zero as $K \uparrow \infty$. We use the analogous notation $(S^{\le K}_n)$ for the truncated random walk. Notice that $(S^{\le K}_n)$, $(\hat{S}^{\le K}_n)$ and $(\check{S}^{\le K}_n)$ now have bounded steps, allowing us to apply the result established in Section 3 to this triplet.

Remark 4.3. We point out that while $(\hat{S}^{\le K}_n)$ can be obtained simply by considering the NRRW made from the steps $X_i\mathbf{1}_{\{|X_i|\le K\}}$, $i \ge 1$, and subtracting $nE(X\mathbf{1}_{\{|X|\le K\}})$ at the $n$-th step for each $n \ge 1$, hence yielding a NRRW with steps distributed as $X^{\le K}$, in the counterbalanced case we need to subtract the counterbalanced random walk issued from the constants $E(X_i\mathbf{1}_{\{|X_i|\le K\}})$, $i \ge 1$, which, in contrast with the reinforced case, is a process in its own right because of the sign swap.
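The variance bookkeeping behind the truncation, namely that $X^{\le K}$ and $X^{>K}$ are centred and uncorrelated with $\sigma_K^2 + \eta_K^2 = \sigma^2$, can be checked exactly on a finite step law (a sketch with $X$ uniform on $\{-3,-1,1,3\}$ and $K = 1$; these choices are ours, made so that both pieces of the split are nontrivial):

```python
# exact computation over a finite step law: X uniform on {-3, -1, 1, 3}
support = [-3.0, -1.0, 1.0, 3.0]
probs = [0.25] * 4

def mean(f):
    # expectation E[f(X)] under the step law
    return sum(q * f(x) for x, q in zip(support, probs))

K = 1.0
sigma2 = mean(lambda x: x * x)                         # sigma^2 = 5
m_low = mean(lambda x: x if abs(x) <= K else 0.0)      # E[X 1_{|X|<=K}] = 0 here
m_high = mean(lambda x: x if abs(x) > K else 0.0)      # E[X 1_{|X|>K}]  = 0 here
# centred truncated steps X^{<=K} and X^{>K} as in (4.1)-(4.2)
low = lambda x: (x if abs(x) <= K else 0.0) - m_low
high = lambda x: (x if abs(x) > K else 0.0) - m_high
sigma2_K = mean(lambda x: low(x) ** 2)                 # variance of X^{<=K}
eta2_K = mean(lambda x: high(x) ** 2)                  # variance of X^{>K}
cross = mean(lambda x: low(x) * high(x))               # 0: disjoint supports
assert abs(cross) < 1e-12
assert abs(sigma2_K + eta2_K + 2 * cross - sigma2) < 1e-12
```

As $K$ grows, all the mass moves into the first piece, so $\sigma_K^2 \uparrow \sigma^2$ and $\eta_K^2 \downarrow 0$, which is exactly what the reduction argument exploits.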
For each $K$, write $N^{n,K}$, $\hat{N}^{n,K}$ and $\check{N}^{n,K}$ for the corresponding martingales, as defined in (3.1), relative to $S^{\le K}$, $\hat{S}^{\le K}$ and $\check{S}^{\le K}$ respectively. An application of Theorem 1.1 in the bounded case yields the joint convergence of this triplet for every $K$. Recalling the asymptotic behaviour $n^p\hat{a}_{\lfloor nt\rfloor} \to t^{-p}$ as $n \to \infty$ and the definition of $N^{n,K}$, we deduce the corresponding convergence for the rescaled walks. Since, as $K \uparrow \infty$, the right-hand side converges weakly towards $\big(\sigma B_t,\ \sigma t^{p}\int_0^t s^{-p}\,d\beta^r_s,\ \sigma\int_0^t s^{p}\,d\beta^c_s\big)$ and convergence in distribution is metrisable, by Lemma 4.2 there exists a slowly increasing sequence converging towards infinity, denoted $(K(n) : n \ge 1)$, such that, as $n \uparrow \infty$, the convergence holds away from the origin (this restriction is due to the fact that $t^{-p}$ is unbounded on any neighbourhood of $0$). In order to get the convergence on $\mathbb{R}_+$ and finally prove the claimed convergence in Theorem 1.1, we proceed as follows. We will only work with the third coordinate, as it is the only one presenting the difficulty; the argument is readily adapted to the triplet. Assume without loss of generality that $\sigma^2 = 1$, fix $\delta > 0$ and consider the partition of $[0,\delta]$ with points $\{\delta 2^{-i} : i = 0, 1, 2, \ldots\}$, using that the sequence $(\check{a}_k)$ is increasing. Denoting as usual by $(\check{M}_n)$ the martingale $(\check{a}_n\check{S}_n)_{n\ge 0}$, notice that by (2.3), the remark that follows it, and (2.12),
$$E(\check{M}_n^2) = E(\langle\check{M},\check{M}\rangle_n) \le c\sum_{k=1}^{n}\check{a}_k^2 \le c\,n^{1+2p}$$
for some constant $c$ that might change from one inequality to the other. By Doob's inequality, and recalling the asymptotic behaviour $\check{a}_n \sim n^p$, we deduce the required maximal estimate on each dyadic block, for some constant $c$ that might differ from one line to the other. Finally, write $X^{(n)} = \big(\frac{1}{\sqrt{n}}\check{S}_{\lfloor nt\rfloor}\big)_{t\in\mathbb{R}_+}$. Since for any $\delta > 0$ we have $(X^{(n)}_t)_{t\ge\delta} \Rightarrow (B_t)_{t\ge\delta}$ as $n \uparrow \infty$, and of course $(B_{t+\delta})_{t\in\mathbb{R}_+} \Rightarrow (B_t)_{t\in\mathbb{R}_+}$ as $\delta \downarrow 0$, we deduce that there exists some decreasing sequence $(\delta(n)) \downarrow 0$ along which the convergence holds. This establishes that the convergence $\big(\frac{1}{\sqrt{n}}\check{S}_{\lfloor nt\rfloor}\big)_{t\in\mathbb{R}_+} \Rightarrow \check{B}$ holds on $\mathbb{R}_+$ and, with this, we conclude our proof of Theorem 1.1.
Remark 4.5. In the process of proving Theorem 1.1 in Sections 3 and 4, we also showed that, if we no longer consider the noise-reinforced random walk, the convergence of the pair can be extended to all $p \in (0,1)$:

$$\left(\frac{S_{\lfloor nt\rfloor}}{\sqrt{n}},\ \frac{\check{S}_{\lfloor nt\rfloor}}{\sqrt{n}}\right)_{t\in\mathbb{R}_+} \Longrightarrow \left(\sigma B_t,\ \sigma t^{-p}\int_0^t s^{p}\,d\beta^c_s\right)_{t\in\mathbb{R}_+} \tag{4.5}$$
where, as usual, $\beta^c$ and $B$ are two Brownian motions with $\langle B, \beta^c\rangle_t = (1-p)t$, and that the result still holds for $p = 1$ if we assume that $X$ follows the Rademacher distribution, in which case the two limit processes are independent. This is precisely the content of Theorem 1.3. Finally, Theorem 1.2 also follows, by recalling that $\hat{S}_n - nE(X)$ is a centred positive step-reinforced random walk and hence falls within our framework.
5 The critical regime for the positive-reinforced case: proof of Theorem 1.4

In this last section we turn our attention to the critical regime $p = 1/2$ for the noise reinforced case and prove the invariance principle with our martingale approach. The arguments are very similar, and rely on exploiting the martingale defined in Proposition 2.1, the MFCLT and a truncation argument. The main difference comes from the fact that, for $p = 1/2$, the asymptotic behaviour of $\sum_{k=1}^{n}\hat{a}_k^2$ is no longer the one claimed in (2.10). Namely, as we pointed out previously,
$$\lim_{n\to\infty}\frac{1}{\log(n)}\sum_{k=1}^{n}\hat{a}_k^2 = 1,$$
and the different scaling that we will use makes it impossible to couple the convergence with the random walk or the counterbalanced random walk. Once again, we start with a law of large numbers-type result:

Lemma 5.1. Suppose $\|X\|_\infty < \infty$. We have the almost sure convergence
$$\lim_{n\to\infty}\frac{\hat{S}_n}{\sqrt{n}\,\log n} = 0 \quad\text{a.s.},$$
and a fortiori $\lim_{n\to\infty} n^{-1}\hat{S}_n = 0$ a.s.
Proof. The proof of this statement follows along the same lines as the proof of Lemma 2.2.
Since p " 1{2 we have now, with the notation introduced in (2.13), that as n Ñ 8, where K 1 is a positive constant. That is, ν n increases slowly to infinity with a logarithmic speed. We obtain again from Theorem 1.3.24 in [16] that M 2 n log n " Oplog log nq a.s.
Hence, as $\hat{M}_n = \hat{a}_n\hat{S}_n$, the above readily implies that
$$\frac{\hat{a}_n^2\hat{S}_n^2}{\log n} = O(\log\log n) \quad\text{a.s.}$$
Further, we deduce from (2.11) that for $p = 1/2$ we have $\lim_{n\to\infty}\hat{a}_n^2\, n = 1$, and hence
$$\frac{\hat{S}_n^2}{n\log n} = O(\log\log n) \quad\text{a.s.},$$
which proves the statement, since $\log\log n/\log n \to 0$.

Now we establish the general case by means of the usual reduction argument. We will not be as detailed as before, since the ideas are exactly the same. We still assume, without loss of generality, that the steps are centred.
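The logarithmic normalisation specific to the critical regime comes from $\hat{a}_k^2 \sim k^{-1}$ when $p = 1/2$, so that $\sum_{k \le n}\hat{a}_k^2 \sim \log n$; this harmonic-sum asymptotic is easy to check numerically (a sketch using the asymptotic $\hat{a}_k \approx k^{-1/2}$ in place of the exact weights, which we do not reproduce here):

```python
import math

# at criticality p = 1/2, hat a_k ~ k^{-1/2}, hence hat a_k^2 ~ 1/k and
# (1/log n) * sum_{k<=n} hat a_k^2 -> 1 (harmonic sum = log n + gamma + o(1))
n = 10_000_000
partial = sum(1.0 / k for k in range(1, n + 1))
ratio = partial / math.log(n)
assert abs(ratio - 1.0) < 0.05
```

The convergence is slow, at speed $\gamma/\log n$ with $\gamma$ the Euler–Mascheroni constant, consistent with the "slowly increasing" behaviour of $\nu_n$ noted above.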
Proof of Theorem 1.4, general case. Maintaining the notation introduced in Section 4 for the truncated reinforced random walks, as well as for the respective variances $\eta_K$ and $\sigma_K$ for $K > 0$, Theorem 1.4 in the bounded-step case shows, for each $K > 0$, the convergence in distribution as $n$ tends to infinity, in the sense of Skorokhod,
$$\left(\frac{\hat{S}^{\le K}(\lfloor n^t\rfloor)}{\sqrt{\log(n)\,n^t}}\right)_{t\in\mathbb{R}_+} \Longrightarrow \big(\sigma_K B(t)\big)_{t\in\mathbb{R}_+}, \tag{5.2}$$
and from $\lim_{K\to\infty}\sigma_K = \sigma$ it follows readily from (5.2), by the same arguments as before, that as $n$ tends to infinity,
$$\left(\frac{\hat{S}^{\le K(n)}(\lfloor n^t\rfloor)}{\sqrt{\log(n)\,n^t}}\right)_{t\in\mathbb{R}_+} \Longrightarrow \big(\sigma B(t)\big)_{t\in\mathbb{R}_+}$$
for some increasing sequence $(K(n))_{n\ge 0}$ of positive real numbers converging towards infinity.
On the other hand, from Lemma 2.3 for $p = 1/2$ we deduce that
$$\lim_{n\to\infty}\frac{1}{n^t\log(n)}\,E\Big(\sup_{k\le n^t}\big|\hat{S}^{>K(n)}(k)\big|^2\Big) \le c^2\lim_{n\to\infty}\eta^2_{K(n)}\,t = 0,$$
and from here we can proceed as we did in the previous section. With this, we conclude the proof of Theorem 1.4.