Mapping TASEP back in time

We obtain a new relation between the distributions $\mu_t$ at different times $t\ge 0$ of the continuous-time TASEP (Totally Asymmetric Simple Exclusion Process) started from the step initial configuration. Namely, we present a continuous-time Markov process with local interactions and particle-dependent rates which maps the TASEP distributions $\mu_t$ backwards in time. Under the backwards process, particles jump to the left, and the dynamics can be viewed as a version of the discrete-space Hammersley process. Combined with the forward TASEP evolution, this leads to a stationary Markov dynamics preserving $\mu_t$ which in turn brings new identities for expectations with respect to $\mu_t$. The construction of the backwards dynamics is based on Markov maps interchanging parameters of Schur processes, and is motivated by bijectivizations of the Yang-Baxter equation. We also present a number of corollaries, extensions, and open questions arising from our constructions.

1. Introduction 1.1. TASEP. The Totally Asymmetric Simple Exclusion Process (TASEP) is a prototypical stochastic model of transport in one dimension. Introduced around 50 years ago in parallel in biology [MGP68], [MG69] and probability theory [Spi70], it has been extensively studied by a variety of methods.
TASEP is a continuous-time Markov process on the space of particle configurations in Z in which at most one particle per site is allowed. Each particle has an independent exponential clock of rate 1 (that is, the random time T after which the clock rings is distributed as Prob(T > s) = e −λs , s > 0, where λ = 1 is the rate). When the clock rings, the particle jumps to the right by one if the destination is free of a particle. Otherwise, the jump is blocked and nothing happens. See Figure 1 for an illustration. In this work we focus on the process with the most well-studied initial condition -the step initial condition. Under it, the particles initially occupy Z <0 , while Z ≥0 is free of particles. Denote by h(t, x) the TASEP interface (where t ∈ R ≥0 , x ∈ Z), which is obtained by placing a slope +1 or a slope −1 segment over a hole or particle, respectively, with the agreement that the step initial configuration corresponds to h(0, x) = |x|. See Figure 3 for an illustration. We also denote the TASEP distribution at time t (with step initial condition) by µ t .
It was shown by [Ros81] (see also, e.g., [Joh00], [Rom15,Chapter 4] for an alternative approach based on symmetric functions) that the interface grows linearly with time, and under the linear space and time scaling tends to the limit shape which is a parabola: where κ and τ are scaled space and time, and |κ| ≤ τ . In the past 20 years, starting with [Joh00], much finer results about asymptotic behavior of TASEP have become available through the tools of Integrable Probability (cf. [BG16], [BP14]). This asymptotic analysis revealed that TASEP belongs to the (one-dimensional) Kardar-Parisi-Zhang (KPZ) universality class [Cor12], [QS15]. In particular, the TASEP interface at time L, on the horizontal L 2/3 and vertical L 1/3 scales, converges to the top line of the Airy 2 line ensemble (about the latter see, e.g., [CH14]). Furthermore, computations with TASEP allow to formulate general predictions for all one-dimensional systems in the KPZ class (e.g., see [Fer08]). The progress in understanding multitime asymptotics of the TASEP interfaces is rapidly advancing at present (see Remark 7.5 for references to recent results).
1.2. The backwards dynamics. The goal of our work is to present a new surprising property of the family of TASEP distributions {µ t } t≥0 . We show that the distributions µ t are coupled in the reverse time direction by a time-homogeneous Markov process with local interactions (the interaction strength depends on the location in the system). Let us now describe this backwards dynamics.
Denote by C the (countable) space of configurations on Z which differ from the step configuration by finitely many TASEP jumps. 1 Consider the continuous-time Markov chain on C which evolves as follows. At each hole there is an independent exponential clock whose rate is equal to the number of particles to the right of this hole. When the clock at a hole rings, the leftmost of the particles that are to the right of the hole instantaneously jumps into this hole (in particular, the particles almost surely jump to the left). See Figure 2 for an illustration. Note that for configurations in C, almost surely at any time moment at most one particle can move because there are only finitely many holes with nonzero rate.
x 2 (t) x 1 (t) The jumping mechanism described above has the following features: • gaps attract neighboring particles from the right; • the rate of attraction is proportional to the size of the gap; • the jumping particle lands inside the gap uniformly at random.
The same features of the jumping mechanism appear in the well-known continuous-space Hammersley process [Ham72], [AD95]. For this reason we call our Markov process (which evolves in the discrete space) the backwards Hammersley-type process, or BHP, for short. Note that compared to the well-known continuous-space Hammersley process, our BHP is space-inhomogeneous: the jump rate at a hole also depends on the number of particles to the right of it. The evolutions of the interface under TASEP and the BHP are given in Figure 3.
Let {L τ } τ ∈R ≥0 be the Markov semigroup of the BHP defined in Section 1.2. That is, L τ (x, y), x, y ∈ C, is the probability that the particle configuration is y at time τ given that it started at x at time 0 (here we use the fact that BHP is time-homogeneous).
Remark 1.1. The backwards process is well-defined. Indeed, for each initial condition x ∈ C of the backwards process, the set of its possible further states is finite. Therefore, the probability L τ (x, y) for any x, y ∈ C is well-defined (and can be obtained by exponentiating the corresponding finite-size piece of the BHP jump matrix).
1.3. Main result. Recall that µ t is the distribution of the TASEP configuration at time t (with the step initial condition). The measure µ t is supported on the space C for all t ≥ 0.
Theorem 1. The BHP maps the TASEP distributions backwards in time. That is, for any t, τ ∈ R ≥0 , we have µ t L τ = µ e −τ t . (1.2) In detail, this identity means that for any x ∈ C we have y∈C µ t (y) L τ (y, x) = µ e −τ t (x).
As τ → +∞, the right-hand side of (1.2) becomes µ 0 , which is the delta measure on the step configuration. This agrees with the observation that for any x ∈ C we have 2 lim τ →+∞ L τ (x, y) = 1 y=step configuration .
Theorem 1 leads to a stationary Markov dynamics on the TASEP measure µ t (it is discussed in Section 1.7 below). In particular, this stationary dynamics brings new identities for expectations with respect to µ t . One of these identities is given in Corollary 7.4.
The simulation depicting the TASEP evolution from the step initial configuration to t = 350, and then the action of the BHP on this interface is available online [LP19]. The interfaces in Figure 3 are essentially snapshots of this simulation.
1.4. Remark. Reversal of Markov processes. Before discussing the strategy of the proof of Theorem 1 let us mention that TASEP, like any Markov chain (under certain technical assumptions), can be reversed in time, and its reversal is again a Markov chain -but usually time-inhomogeneous and quite complicated.
For TASEP, let {T t } t∈R ≥0 be its Markov semigroup. Defining T rev t,s (x, y) = µ s (y) µ t (x) T t−s (y, x), t > s, we see that T rev also maps the TASEP distributions back in time: µ t T rev t,s = µ s , s < t. In other words, the probabilities T rev come from the time-reversal of the TASEP conditional distributions. The Markov process corresponding to {T rev t,s } is time-inhomogeneous, and its interactions are substantially nonlocal. Theorem 1 implies that the BHP {L τ } is a different, much more natural, Markov process which maps the TASEP distributions back in time.
By a different mapping of the distributions we mean the following. One can check that the joint distribution of the TASEP configuration at two times e −τ t and t differs from the joint distribution of (x, y), where y is distributed as µ t , and x is obtained from y by running the BHP process L τ . Figure 3. An illustration of the TASEP interface growth (top) and the interface decay under the backwards dynamics (bottom). In both pictures, lighter curves are the interfaces at later times. One can see that the TASEP evolution is symmetric about the vertical axis, while the backwards dynamics is not symmetric. Because of this asymmetry, there are in fact two backwards processes -one focusing on holes and the other focusing on particles. We only consider one of them in the present work.
1.5. Idea of proof of Theorem 1. We prove Theorem 1 in Sections 4 to 6. Here let us outline the main steps.
First, we modify the problem by introducing an extra parameter q ∈ (0, 1), and consider the TASEP in which the k-th particle from the right, k ∈ Z ≥1 , has the jump rate q k−1 . 3 Let the distribution at time t of this TASEP (with step initial configuration) be denoted by µ Second, we use the well-known mapping of the TASEP to Schur processes. Schur processes [OR03] (and their various generalizations including the Macdonald processes [BC14]) are one of the central tools in Integrable Probability. The particular Schur processes we employ are probability distributions on particle configurations {x j i } 1≤i≤j in Z × Z ≥1 which satisfy an interlacing condition, see Figure 4.
There exists a Schur process (depending on q and the time parameter t ∈ R ≥0 ) under which the joint distribution of the leftmost particles {x N N } N ∈Z ≥1 in each horizontal row is the same as of the q-dependent TASEP particles x 1 (t) > x 2 (t) > . . . (i.e., this is µ (q) t ). This mapping between TASEP and Schur processes is described in [BF14], but also follows from earlier constructions involving the Robinson-Schensted-Knuth correspondence. We recall the details in Section 3. This Schur process corresponding to µ (q) t depends on q via the spectral parameters 1, q, q 2 , . . . attached to the horizontal lines (as indicated in Figure 4). The new ingredients we bring to Schur processes are Markov maps interchanging two neighboring spectral parameters (say, the j-th and the (j + 1)-th). By a Markov map we mean a way to randomly modify the interlacing particle configuration in Z × Z ≥1 such that: • At the j-th horizontal level the particles almost surely jump to the left; • All other levels are untouched; • The interlacing conditions are preserved; • If the starting configuration was distributed as a Schur process, then the resulting configuration is distributed as a modified Schur process with the j-th and the (j + 1)-th spectral parameters interchanged.
We refer to this as the "L Markov map" since it moves particles to the left (it has a counterpart, the "R Markov map", but we do not need it for the main result). The L Markov map at each j-th level depends only on the ratio of the spectral parameters being interchanged. Combining the L Markov maps in such a way that they interchange the bottommost spectral parameter 1 with q, then with q 2 , then with q 3 , and so on, we can move this parameter 1 to infinity, where it "disappears" (see Figure 9 below for an illustration). The resulting distribution of the configuration will again be a Schur process with the same spectral parameters (1, q, q 2 , . . .), but with the modified time parameter, t → qt. (Here we use the fact that the measure does not change under the simultaneous rescaling of the spectral parameters.) Considering the action of this combination of the L Markov maps on the leftmost particles {x N N }, we arrive at an explicit Markov transition kernel on C, denoted by L (q) , with the property that (this is Theorem 5.7 below) Finally, iterating the action of L (q) and taking the limit as q → 1, we arrive at Theorem 1.
1.6. "Toy" example. Coupling of Bernoulli random walks. The Schur process computations leading to Theorem 1 have an elementary consequence which we now describe. This statement may be proven independently by an interested reader using only basic probability theory. Its connection to Schur processes is detailed in Section 8.7. Fix β ∈ (0, 1), and let b β be the distribution of the simple random walk in the quadrant Z 2 ≥0 , under which the walker starts at (0, 0) and goes up with probability β and to the right with probability 1 − β, independently at each step.
Consider the continuous-time Markov process on the space of random walk trajectories under which each (up, right) local piece is independently replaced by the (right, up) piece at rate n, where n ∈ Z ≥0 is the vertical coordinate of the local piece. See Figure 5 for an illustration. Clearly, in each initial segment {0, 1, . . . , m} × Z, almost surely at each time moment there is at most one change of the trajectory. Moreover, for different m these processes are compatible, so by the Kolmogorov extension theorem they indeed define a continuous-time Markov process on the full space of random walk trajectories. Denote the resulting Markov semigroup by {D τ } τ ∈R ≥0 .
Proposition 2. For any β ∈ (0, 1) and τ ≥ 0 we have The action of D τ decreases the parameter β and almost surely moves the trajectory closer to the m (horizontal) axis. By symmetry, one can also define a continuous-time Markov chain which moves the vertical pieces of the trajectory to the left, and increases the parameter β. It could be interesting to look at the stationary dynamics -a combination of the two processes running in parallel which does not change β -and understand its large-scale asymptotic behavior. We do not focus on this question in the present work. 1.7. Stationary dynamics on the TASEP measure. Fix t ∈ R >0 . The backwards Hammersley-type process slowed down by a factor of t compensates the time change of the forward TASEP evolution. Running these two processes in parallel thus amounts to a continuous-time Markov process which preserves the TASEP distribution µ t .
One can say that the TASEP distributions µ t are the "blocking measures" for the stationary dynamics [Lig05] (see also [BB18]).
The presence of the stationary dynamics on µ t allows to obtain new properties of the TASEP measure. In particular, we write down an exact evolution equation for E G(N 0 t ), where N 0 t is the number of particles to the right of zero at time t, and G is an arbitrary function. This equation contains one more random quantity -the number of holes immediately to the left of zero. See Corollary 7.4 for details.
Moreover, in Section 7 we rederive the limit shape parabola for the TASEP by looking at the hydrodynamics of the process preserving µ t . Indeed, recall that the TASEP local equilibria -the ergodic translation invariant measures on configurations on the full line Z which are also invariant under the TASEP evolution -are precisely the product Bernoulli measures [Lig05]. In the bulk of the BHP, the difference between jump rates of consecutive particles is inessential. Thus, the product Bernoulli measures also serve as local equilibria for the BHP. 4 By looking at the local equilibria, one can write down two hydrodynamic PDEs for the TASEP limit shape: first is the well-known Burgers' equation, and the second is a PDE coming from the BHP, which is specific to the step initial condition. After simplifications, these PDEs lead to the parabola (1.1).
Beyond hydrodynamics, the asymptotic fluctuation behavior of the TASEP measures µ t as t → +∞ is understood very well by now, starting from [Joh00]. It would be very interesting to extend these results to the combination TASEP + t −1 BHP which preserves µ t .
1.8. Further extensions. The Markov maps on Schur processes we introduce to prove our main result, Theorem 1, offer a variety of other applications and open problems. We discuss them in more detail Section 8. Here let us briefly outline the main directions: • The one-dimensional statement (mapping the TASEP distributions back in time) has an extension to two dimensions. Namely, there is a continuous-time Markov process on interlacing particle configurations (as in Figure 4) which maps back in time the distributions of the anisotropic KPZ growth process on interlacing arrays studied in [BF14]. 4 In fact, they are the only (extreme) local equilibria because the particle-hole involution turns the homogeneous BHP into the PushTASEP (= long-range TASEP), and local equilibria for the latter are classified [Gui97], [AG05].
• Instead of Schur processes, one can consider interlacing configurations of finite depth. This includes probability distributions on boxed plane partitions with weight proportional to q vol (where vol is the volume under the boxed plane partition). In this setting our constructions produce Markov chains mapping the measure q vol to the measure q −vol , and vice versa. (A simulation is available online [PZ19].) Applying this procedure twice leads to a new sampling algorithm for the measures q ±vol . • A certain bulk limit of our two-dimensional Markov maps essentially leads to the growth processes preserving ergodic Gibbs measures on two-dimensional interlacing configurations introduced and studied in [Ton17]. Thus, one can view our Markov maps as exact "prebulk" stationary dynamics on two-dimensional interlacing configurations. • Theorem 1 may be interpreted as the statement that the family of measures {µ t } is coherent with respect to a projective system determined by the process {L τ }. Projective systems [BO13] generalize the notion of branching graphs, and the latter play a fundamental role in Asymptotic Representation Theory [VK81], [BO16]. (Even further, the distributions of the anisotropic KPZ growth are also coherent, on a projective system whose "levels" are spaces of two-dimensional interlacing configurations.) The framework of projective systems / branching graphs provides many natural questions in this setting. • Structurally, our Markov maps are inspired by the study of stochastic vertex models and bijectivization of the Yang-Baxter equation [BP17], [BMP19]. Compared with the Schur case, the full Yang-Baxter equation for the quantum sl 2 contains more parameters. In this setting, Schur polynomials should be replaced by the spin Hall-Littlewood or spin q-Whittaker symmetric functions [Bor17], [BW17]. It is interesting to see how far Theorem 1 can be generalized to other particle systems arising in this framework, including ASEP, various stochastic six vertex models, and random matrix models. • There exists a backwards dynamics for the ASEP started from a family of shock measures [BS18]. This ASEP backwards dynamics is obtained via a duality. While the shock measures are very different from the step initial configuration, it would be interesting to find connections of Theorem 1 to Markov duality.
Concrete open questions along these directions are formulated and discussed in Section 8.
Outline. In Sections 2 and 3 we recall the necessary facts about Schur processes, TASEP, and their connection. In Section 4 we introduce the L and R Markov maps at the level of interlacing arrays. The action of each such map swaps two neighboring spectral parameters. In Section 5 we combine the L Markov maps in such a way that their combination L (q) preserves the class of q-Gibbs measures on interlacing arrays (which includes the Schur processes related to the qdependent TASEP). We compute the action of L (q) on q-Gibbs measures and the corresponding Schur processes. In Section 6 we take a limit q → 1, which leads to our main result, Theorem 1. In Section 7 we illustrate the relation between the TASEP and the backwards evolutions at the hydrodynamic level by looking at the stationary dynamics on the TASEP distribution µ t . Finally, in Section 8 we discuss possible extensions of our constructions indicated in Section 1.8 above, and formulate a number of open questions.
a part of this work was done. Both authors were partially supported by the National Science Foundation grant DMS-1664617.

Ascending Schur processes
This section is a brief review of ascending Schur processes introduced in [OR03] and their relation to TASEP. More details may be found in, e.g., [BG16].
is a weakly decreasing sequence of nonnegative integers. We denote |λ| := N i=1 λ i . We call (λ) the length of a partition. By convention we do not distinguish partitions if they differ by trailing zeroes. In this way (λ) always denotes the number of strictly positive parts in λ. Denote by Y the set of all partitions including the empty one ∅ (by convention, (∅) = |∅| = 0).
2.2. Schur polynomials. Fix N ∈ Z ≥1 . The Schur symmetric polynomials in N variables are indexed λ ∈ Y and are defined as If N < (λ), we set s λ (x 1 , . . . , x N ) = 0, by definition. The Schur polynomials s λ indexed by all λ ∈ Y with (λ) ≤ N form a linear basis in the space C[x 1 , . . . , x N ] S N of symmetric polynomials in N variables. Each s λ is a homogeneous polynomial of degree |λ|.
The Schur polynomials are stable in the following sense: This stability allows to define Schur symmetric functions s λ , λ ∈ Y, in infinitely many variables. These objects form a linear basis of the algebra of symmetric functions Λ. We refer to [Mac95, Ch. I.2] for the precise definition and details on the algebra Λ.
Iterating (2.2) and breaking down all skew Schur polynomials into single-variable ones, we see that each Schur polynomial has the following form: where the sum is taken over all interlacing arrays of partitions of depth N in which the top row coincides with λ (see Figure 6 for an illustration). In combinatorial language, (2.5) is the representation of a Schur polynomial as a generating function of semistandard Young tableaux, cf. [Ful97].
Remark 2.1. If N < (λ), then there are no interlacing arrays of depth N whose top row is λ because at each level one can add at most one nonzero component. Thus, the right-hand side of (2.5) automatically vanishes if N < (λ). This agrees with the convention that s λ (x 1 , . . . , The following two identities for skew Schur polynomials play a fundamental role in our work. The first identity is a straightforward consequence of the symmetry of the Schur polynomials. (2.6) This is an identity of generating series in x i , y j under the standard geometric series expansion 1 1−x i y j = 1 + x i y j + (x i y j ) 2 + . . .. Moreover, (2.6) holds as a numerical identity if x i , y j ∈ C are such that |x i y j | < 1 for all i, j.
Remark 2.4. If we set λ = µ = ∅ in (2.6), the sum in the right-hand side disappears (because s ∅/κ = 1 κ=∅ ), and we obtain ν∈Y Again, this is a numerical identity provided that |x i y j | < 1 for all i, j.
2.4. Specializations. When x ≥ 0, we have s λ/κ (x) ≥ 0 from (2.4). More generally, the Schur polynomials s λ (x 1 , . . . , x N ) are nonnegative for real nonnegative x 1 , . . . , x N . We will also need the Plancherel specializations of Schur functions s λ . These specializations, indexed by t ≥ 0, may be defined through the limit Remark 2.5. We also have s λ (ρ t ) = t |λ| dim λ |λ|! , where dim λ is the dimension of the irreducible representation of the symmetric group S |λ| , or, equivalently, the number of standard Young tableaux of shape λ.
Generic nonnegative specializations will be denoted as ρ : Λ → R, and we will also use the notation s λ (ρ) for ρ(s λ ). For the purposes of the present paper, ρ would be either a Plancherel specialization, or a substitution of a finitely many nonnegative variables into the symmetric function.
Remark 2.6. A classification of Schur-positive specializations (that is, algebra homomorphisms Λ → R which are nonnegative on Schur functions) is known and is equivalent to the celebrated Edrei-Thoma theorem. See, for example, [BO16] for a modern account discussing various equivalent formulations.
2.5. Schur processes. Schur measures and processes are probability distributions on partitions or sequences of partitions whose probability weights are expressed through Schur polynomials in a certain way. They were introduced in [Oko01], [OR03].
A Schur measure is a probability measure on Y with probability weights depending on two nonnegative specializations ρ 1 , ρ 2 : The normalizing constant Z can be computed using the Cauchy identity (2.7) (provided that the infinite sum converges). Schur processes are probability measures on sequences of partitions generalizing the Schur measures. We will only need the particular case of ascending Schur processes. These are probability measures on interlacing arrays (for some fixed N ) depending on a nonnegative specialization ρ and c 1 , . . . , c N ≥ 0: (2.10) The normalizing constant has the form (this follows from (2.2) and (2.7)): (provided that the series converges). We call N the depth of a Schur process. We will sometimes call the c i 's the spectral parameters of Schur process P[ c | ρ]. The next statement immediately follows from (2.2) and the skew Cauchy identity: Proposition 2.7. Under the Schur process (2.10), the marginal distribution of each λ (K) , 1 ≤ K ≤ N , is given by the Schur measure P[(c 1 , . . . , c K ) | ρ].
Remark 2.8. The interlacing array in Figure 6 and the one in Figure 4 in the Introduction are related by We work with the {λ (N ) k } notation throughout the paper. By the Kolmogorov extension theorem, a measure on S is uniquely determined by a collection of compatible joint distributions of {λ (1) ≺ . . . ≺ λ (N ) } N ≥1 . If these joint distributions satisfy the c-Gibbs property, then the resulting measure on S is c-Gibbs.
Thus, Proposition 2.7 implies the following extension of the definition of a Schur process. Given an infinite sequence c 1 , c 2 , . . . , of nonnegative reals such that the sums like (2.11) converge for all N , one can define the Schur process P[ c | ρ] of infinite depth, i.e., a probability measure on S. Indeed, this is because the distributions (2.10) for different N are compatible with each other by Proposition 2.7, so the measure on S with the desired finite-dimensional distributions exists.
(2.12) The expression in the denominator is simply the normalizing constant. One can say that each interlacing array in (2.12) is weighted proportional to the corresponding term in the expansion (2.5). Note that the c-Gibbs property depends on the order of the c i 's, but the normalizing constant in (2.12) does not.
The next lemma is straightforward.
Lemma 2.9. Fix any j ≥ 2. Under a c-Gibbs measure, the conditional probability weight of λ (j) Denote the space of all c-Gibbs measures on S by G c . Note that this space does not change if we multiply all the parameters by the same positive number: G c = G a· c , a > 0. Indeed, this follows from (2.12) and the homogeneity of the Schur polynomials.
Remark 2.10. When all c i ≡ 1, the conditional distribution (2.12) becomes uniform (on the set of all interlacing arrays of depth N with top row λ). This uniform Gibbs case justifies the name c-Gibbs in the general situation.
The Schur process P[ c | ρ] is a particular example of a c-Gibbs measure. The full classification of c-Gibbs measures is known only in several particular cases. In the uniform case c i ≡ 1 this is the celebrated Edrei-Voiculescu theorem (see Section 8.1 below and also, e.g., [BO12] for a modern account discussing various equivalent formulations). When the c i 's form a geometric sequence, the classification was obtained much more recently in [Gor12] (see also [GO16] for a generalization).

Schur processes and TASEP
In this section we recall a coupling between TASEP (with step initial configuration and particledependent speeds) and a marginal of an ascending Schur process. This mapping can be seen as a consequence of the column Robinson-Schensted-Knuth insertion [VK86], One can also define a continuous-time Markov dynamics on interlacing arrays whose marginal is TASEP [BF14] (see also [BP14]).
3.1. TASEP. Let c 1 , . . . , c N , . . . be positive reals. The continuous-time TASEP (Totally Asymmetric Simple Exclusion Process) with step initial condition and speeds c is defined as follows. It is a Markov process on particle configurations . . (this is the step initial configuration); • The configuration has the rightmost particle x 1 ; • The configuration is densely packed far to the left, that is, for all large enough M (where the bound on M depends on t) we have x M (t) = −M ; • There is at most one particle per site.
Denote the space of such left-packed and right-finite particle configurations on Z by C.
The continuous-time Markov evolution of TASEP proceeds as follows. Each particle x i has an independent exponential clock with rate c i . That is, the time before x i attempts to jump is an exponential random variable: Prob(time > t) = e −c i t , t ≥ 0. (We will refer to c i 's as to the particle speeds.) When the clock of x i rings, the particle jumps to the right by one if the destination is not occupied. If the destination of the jumping particle is occupied, the jump is forbidden and the particle configuration does not change. Because the process starts from the step initial configuration, only finitely many particles are free to jump at any particular time. Therefore at any time almost surely at most one jump happens. See Figure 7 for an illustration.
rate c 1 Figure 7. An example of a jump and a forbidden jump in TASEP.

3.2.
Coupling to a Schur process. Fix N ∈ Z ≥1 , positive reals c 1 , . . . , c N , and t ≥ 0. Consider the Schur process P[ c | ρ t ] defined by (2.10), where ρ t is the Plancherel specialization. Note that the series for the normalizing constant (2.11) always converges because and the last expression is an entire function in t and c i . Since this procedure works for all N , we can view P[ c | ρ t ] as a Schur process of infinite depth, i.e., a probability measure on S.
The next result is present in [BF14], but alternatively follows from much earlier constructions involving Robinson-Schensted-Knuth correspondences [VK86], Theorem 3.1. Fix t ≥ 0 and particle speeds c 1 , c 2 , . . ., and consider the TASEP as in Section 3.1 at time t. Then we have equality of joint distributions at the fixed time t: where λ (i) are the random partitions coming from the Schur process P[ c | ρ t ] described above.
Remark 3.2. A dynamical version of this result is also proven in [BF14]: there exists a continuous-time Markov chain on interlacing arrays (even a whole family of them, cf. [BP16], [BP14]) whose action on a Schur process P[ c | ρ t ] continuously increases the parameter t. We will refer to the dynamics from [BF14] as the push-block process (see Definition 8.8 below for details).
For the push-block process on interlacing arrays, (3.1) holds as equality of joint distributions of Markov processes. In other words, (3.1) is also true for multitime joint distributions of these processes. However, we do not need this dynamical result for most of our constructions.

Markov maps
This section introduces our main objects -the Markov maps L (j) α and R (j) α which randomly change the j-th row λ (j) in an interlacing array while keeping all other rows intact. These maps act on c-Gibbs measures by permuting spectral parameters. 4.1. First level. Let us first describe the maps for j = 1 (the simplest nontrivial case) to illustrate their structure and properties. We use the shorthand notation λ (2) = (λ 1 , λ 2 ) and λ (1) = κ 1 . The interlacing means that λ 2 ≤ κ 1 ≤ λ 1 .
Definition 4.2 (The L and R maps, first level). For α ∈ [0, 1], let L (1) α be the Markov map 5 whose action on the pair κ ≺ λ does not change λ, and replaces κ 1 as follows: The notation for the L and R operators is suggested by the directions in which they move κ 1 . See Figure 8 for an illustration. (1) 0 lead to the maximal possible displacement of κ 1 , respectively, to the left or to the right.  The next lemma plays a key role and will later generalize to other rows of the interlacing array. Denote by s i , i = 1, 2, . . . the i-th elementary permutation of the spectral parameters, (4.1) Lemma 4.4. If c 1 ≥ c 2 and c 1 = 0, then the Markov operator L (1) Proof. Let us consider only L 1 is the identity. But in this case s 1 c = c, so there is nothing to prove. 5 A Markov map is the same as a stochastic matrix or a one-step transition operator of a Markov chain (it is also sometimes called "link" in the literature). An application of a Markov map is a random update of the underlying configuration. At the same time, each Markov map is a deterministic linear operator in the space of probability distributions on configurations. When applying a map M to a probability measure π, we write this as π → πM .
We can assume that c 1 > c 2 . Denote α = c 2 /c 1 . Using the c-Gibbs property, we see that given λ = (λ 1 , λ 2 ), the conditional probability weight of κ = (κ 1 ) is proportional to s κ (c 1 )s λ/κ (c 2 ), which by (2.4) leads to The action of the operator L (1) α on this distribution is readily computed: The final expression is the conditional probability weight of κ 1 given λ under the s 1 c-Gibbs property. This completes the proof. (1) c 2 /c 1 , then the algebraic computations in the proof of Lemma 4.4 are still valid. But the operator itself loses probabilistic meaning as some of its matrix elements become negative.

4.2.
Remark. Relation to bijectivization. The Markov maps of Definition 4.2 which interchange the spectral parameters were suggested by the idea of bijectivization of the Yang-Baxter equation first employed in [BP17] (see also [ABB18], [BMP19]).
First, note that one can deduce the symmetry of the skew Schur polynomials (Proposition 2.2) from the Yang-Baxter equation. This argument is present, for example, in [Bor17, Theorem 3.5] in a U q ( sl 2 ) setting with additional parameters q, s (the Schur case corresponds to q = s = 0).
Next, bijectivization refines the Yang-Baxter equation into a pair of forward and backward local Markov moves which randomly update the configuration. Here the locality means the following. Encode κ 1 using the occupation variables {η x } x∈Z , where η κ 1 = 1 and all other η x ≡ 0. The application of a single local Markov move (forward or backward) would change one of the occupation variables.
Then, considering a sequence of forward or backward moves leads, respectively, to the L and R operators. This can be seen by setting t = s = 0 in [BP17, Figure 4], taking a sequence of these moves, and passing from the occupation variables (equivalently, vertical arrows in the notation of that paper) to the elements of the interlacing array. For brevity, we do not explain the details of derivation of the L and R Markov operators from the bijectivization as an independent proof of the key Lemma 4.4 is rather straightforward.   for i = j, and replaces λ (j) as follows: k , k = 1, . . . , j, is randomly independently moved to the left (resp., to the right) within the segment to which λ Proof. Let us consider L (j) α only; the case of R (j) α is analogous. Denote α = c j+1 /c j . We may assume that α = 1 as otherwise there is nothing to prove. Let us also take j ≥ 2 as the case j = 1 is Lemma 4.4.

Action on q-Gibbs measures
This section shows that suitably composed L maps preserve the class of q-Gibbs measures on interlacing arrays, and describes how a q-Gibbs measure changes under this action. 5.1. q-Gibbs property. Fix q ∈ (0, 1]. A c-Gibbs measure on the set S of infinite interlacing arrays is called q-Gibbs if c i = q i−1 for all i ∈ Z ≥1 . We denote the set of q-Gibbs measures by G q .
Remark 5.1. One can define the volume of an interlacing array of finite depth N by (5.1) Then the q-Gibbs property is equivalent to saying that conditioned on λ (N ) , the probability weight of the interlacing array λ (1) ≺ . . . ≺ λ (N ) is proportional to q −vol(λ (1) ≺...≺λ (N ) ) (e.g., see [MP17]). Note that sometimes (in particular, in [Gor12]) the term "q-Gibbs measures" refers to the elements of G q −1 in our notation.
When q = 1, the q-Gibbs measures correspond to the uniform conditioning property (cf. Remark 2.10). Throughout this section we work under the assumption 0 < q < 1. q j (see Figure 9 for an illustration). Let L (q) be the Markov map which acts on probability measures on S by Let us explain why L (q) is well-defined. Recall that a probability measure on S is uniquely determined by a family of compatible joint distributions of (λ (1) , . . . , λ (N ) ) (cf. Section 2.6). Next, for all K > N we have M (K) (λ (1) , . . . , λ (N ) ) = M (N +1) (λ (1) , . . . , λ (N ) ). This guarantees that the collection of measures {M (N +1) (λ (1) , . . . , λ (N ) )} N ≥1 is indeed compatible, and thus defines a measure on S which we denote by M L (q) . Figure 9. Construction of the map L (q) . The spectral parameters q j correspond to the action on q-Gibbs measures considered in Section 5.4, and the lines indicate the swapping of the spectral parameters after each j-th map L (j) q j .
5.3. q-Gibbs harmonic families. Let M be a q-Gibbs measure on S. By the q-Gibbs property, for each N ≥ 1 the probability weight of λ (1) , . . . , λ (N ) is represented as a product of the marginal probability weight of λ (N ) and a q-Gibbs factor corresponding to the conditional distribution of λ (1) ≺ . . . ≺ λ (N −1) given λ (N ) . This allows to write where ϕ N is a function on the N -th level of the array defined as (1, q, . . . , q N −1 ) .
Because the functions ϕ N for different N come from the same q-Gibbs measure M, they must be compatible. This compatibility relation reads for all N ≥ 1 and all µ = (µ 1 , . . . , µ N −1 ) on the (N −1)-st level of the array (at the zeroth level we set ϕ 0 (∅) = 1, by agreement). We call a family of functions {ϕ N } satisfying (5.3) and ϕ 0 (∅) = 1 a q-Gibbs harmonic family. The term "harmonic" comes from the Vershik-Kerov theory of the boundary of branching graphs (e.g., see [KOO98]). Clearly, a q-Gibbs measure on S is uniquely determined by its associated q-Gibbs harmonic family {ϕ N }.

5.4.
Action of the iterated L map on q-Gibbs measures. If M ∈ G q , then the action of L (q) (that is, the sequence of the Markov maps L (j) q j ) on M swaps the spectral parameters as in Figure 9, moving c 1 = 1 all the way up to infinity where it "disappears". The resulting spectral parameters (q, q 2 , q 3 , . . .) are proportional to the original ones. This suggests that L (q) should preserve the class of q-Gibbs measures. The next result shows that this is indeed the case, and also describes the action of L (q) on G q in the language of harmonic families.
Theorem 5.3. The Markov map L (q) maps preserves G q , the set of q-Gibbs measures on S. More precisely, L (q) maps each q-Gibbs harmonic family {ϕ N } N ∈Z ≥1 to a new q-Gibbs harmonic family {φ N } N ∈Z ≥1 as follows: Proof. The second equality in (5.4) immediately follows from (2.4). Let us first explain why the sum in (5.4) is finite. We have by the definition (5.2) of ϕ N : q N (in this order). Because the last of these operators depends on λ (N +1) , we see that the distribution of θ (N ) is not determined only by the joint distribution of λ (1) , . . . , λ (N ) . In other words, to computeφ N we need to first extend ϕ N to ϕ N +1 , and utilize the q-Gibbs property.
Remark 5.4. Note that Theorem 5.3 fundamentally relies on the fact that the q-Gibbs measure lives on an infinite array. Indeed, for an array of finite depth it is not possible to move the spectral parameter 1 all the way up to infinity. In the proof of Theorem 5.3 we use the fact that the array has infinite depth when we extend ϕ N to ϕ N +1 . The case of arrays of finite depth is discussed in Section 8 below.
5.5. Application to Schur processes and TASEP with geometric speeds. Schur processes P[ c | ρ t ] with c i = q i−1 are particular cases of q-Gibbs measures with where we took into account the normalization of the Schur measures. The Markov map L (q) acts on these Schur processes as follows: where we used the skew Cauchy identity ((2.6) with κ = ∅) and the homogeneity of the Schur polynomials (both properties clearly survive the Plancherel limit (2.8)). Therefore, we have Definition 5.5. Let 0 < q < 1. We aim to define a Markov map L (q) on C. Fix a configuration x 1 > x 2 > . . . in C. By definition, its random imagex 1 >x 2 > . . . under the action of L (q) iŝ where the Y q i 's are independent truncated geometric random variables (see Definition 4.1).
Theorem 5.3, identity (5.7), and the fact that L (q) is a projection of L (q) immediately imply the following result: Theorem 5.7. For any t ≥ 0, we have is the distribution of the TASEP with geometric rates (with ratio q) and step initial configuration, and L (q) is the Markov map from Definition 5.5.

Limit q → 1 and proof of the main result
Here we take the limit as q → 1 of the results of the previous section, and arrive at a continuoustime Markov chain mapping the TASEP distributions backwards in time. This proves our main result, Theorem 1.
Iterate Theorem 5.7 to observe that for any T ∈ Z ≥1 : where (L (q) ) T simply denotes the T -th power. Next, introduce the scaling: where ε > 0 will go to zero, and τ ∈ R ≥0 is the scaled continuous time. Clearly, we have q T = e −τ (1 + O(ε)). We aim to take the limit as q → 1 in (6.1).
Recall that by µ t , t ∈ R ≥0 , we denote the distribution of the TASEP with constant speeds c i ≡ 1 at time t, started from the step initial configuration. Also recall that C is the space of left-packed, right-finite particle configurations on Z. The space C has a natural partial order: x precedes y if x i ≤ y i for all i.
Lemma 6.1. For any fixed τ, t ∈ R ≥0 and any δ > 0 there exists a finite set C δ = C δ (t, τ ) ⊂ C such that Proof. Take finite C δ ⊂ C such that µ t (C δ ) > 1 − δ, and, moreover, C δ is closed with respect to the partial order (i.e., if x precedes y and y ∈ C δ , then x ∈ C δ ). This is possible because µ t is a probability measure on C, and closing a finite set with respect to our partial order keeps it finite. (One can even estimate the size of C δ because the first particle x 1 (t) performs speed 1 directed a random walk.) Next, µ e −τ t (C δ ) > 1−δ because the TASEP dynamics almost surely increases the configuration with respect to the order. The rest of the claim follows by monotonically coupling the TASEP µ • with constant speeds to the TASEP µ (q) • with the q-geometric speeds. Here monotonicity means that the TASEP with the q-geometric speeds is always behind (in our partial order) the q = 1 TASEP; this monotone coupling exists since q < 1. By Lemma 6.1, it suffices to consider the limit of identity (6.1) as q → 1 on finite subsets of C. In the right-hand side we immediately get µ (q) q T t → µ e −τ t . In the left-hand side we have µ (q) t → µ t . It remains to take the limit of the T -th power of the Markov map L (q) .
The limit transition in (L (q) ) T is in the spirit of the classical Poisson approximation to the binomial distribution -the probability of jumps gets smaller, but the number of trials (i.e., the discrete time) scales accordingly. More precisely, we have for the random variables Y q k in Definition 5.5: This leads to the following definition of the continuous-time backwards dynamics: Definition 6.2 (Backwards Hammersley-type process L τ ). Consider the continuous-time dynamics on C defined as follows. Each particle x k , k = 1, 2, . . . independently jumps to the left to one of the holes {x k+1 + 1, x k+1 + 2, . . . , x k − 1} at rate k per hole. Equivalently, each particle x k has an independent exponential clock of rate k(x k − x k+1 − 1); when the clock rings, x k selects a hole between x k+1 and x k uniformly at random and instantaneously moves there. 6 Note that for configurations in C, the total jump rate of all particles is always finite. Therefore, the dynamics on C is well-defined. Denote by L τ , τ ∈ R ≥0 , the Markov transition operator of this dynamics from time 0 to time τ (note that the dynamics is time-homogeneous). Observe that the step configuration (x i = −i for all i = 1, 2, . . .) is absorbing for the backwards dynamics L τ .
Thanks to Lemma 6.1 and (6.3), we have the convergence (L (q) ) T → L τ . This completes the proof of the main theorem µ t L τ = µ e −τ t .

Stationary dynamics on the TASEP measure
Here we illustrate the relation between the TASEP and the backwards Hammersley-type process by constructing a Markov dynamics preserving the TASEP measure µ t . We also discuss hydrodynamics of these two processes.
In this section we denote particle configurations by occupation variables η : Z → {0, 1}, with η(x) = 1 if there is a particle at location x ∈ Z, and η(x) = 0 otherwise. The step initial configuration is η(x) = 1 iff x < 0. Recall that by C we denote the space of left-packed, rightfinite configurations. Denote by C = {0, 1} Z the space of all particle configurations in Z.
7.1. Definition of the stationary dynamics. Let A T be the infinitesimal generator for the TASEP with homogeneous particle speeds c i = 1 (Section 3.1), and {T t } t≥0 be the corresponding Markov semigroup. Let A L the infinitesimal generator of the backwards Hammersley-type process (BHP), see Definition 6.2, and {L τ } τ ≥0 denote the BHP semigroup. For a fixed configuration 6 The mechanism of jumping into a hole selected uniformly at random is similar to the Hammersley process [Ham72], [AD95]. Therefore, we will sometimes refer to Lτ (as well as its two-dimensional version Lτ discussed in Section 8.1 below) as Hammersley-type process (BHP, for short).
η ∈ C, we denote by η x,y , x = y, the configuration In words, η x,y corresponds to a particle jumping from location x ∈ Z to location y ∈ Z. Note that η x,y may not be in C even if η ∈ C.
The infinitesimal generator for the TASEP acts as follows: The factor η(x)(1 − η(x + 1)) takes care of the TASEP exclusion rule. The infinitesimal generator of the BHP acts as follows: Note that summations in the action of A L are well defined since for η ∈ C we have η(x) = 0 for x 0 and η(x) = 1 for x 0.
Remark 7.1. Here we do not discuss the full domains of definition of the generators A T and A L given by (7.1)-(7.2), and assume that they act on cylinder functions (i.e., the ones depending on finitely many coordinates in η).
Recall that µ t is the distribution of the TASEP configuration at time t started from the step initial configuration. Denote the corresponding random particle configuration by η t . We have η t ∈ C almost surely.
For any t ∈ R >0 , define the operator This is the generator of the continuous-time Markov process which is a combination of the BHP and the TASEP sped up by the factor of t. By a "combination" we mean that both processes run in parallel.
Proposition 7.2. The TASEP distribution {η t } is invariant under the continuous-time Markov process with generator A, that is, for all cylinder functions f .
for any t, τ ≥ 0. Fixing t ≥ 0, differentiating the above identity in τ , and sending τ to zero, we get µ t (tA T + A L ) = 0. This establishes the result.
Remark 7.3. It should be possible to show that the process with the generator (7.3), started from any configuration x ∈ C, converges (as time goes to infinity) to its stationary distribution µ t . However, we do not focus on this question in the present paper.
A local version of Proposition 7.2 holds, too. That is, the Bernoulli measures of any given density ρ ∈ [0, 1] on particle configurations on Z are invariant under both the TASEP and the homogeneous version of the BHP. (Locally the rates under BHP are constant, so the invariance should be considered under the homogeneous BHP.) The remarkable content of Proposition 7.2 is that the invariance is global on "out-of-equilibrium" random configurations with the distribution µ t , if the speeds of the TASEP and the inhomogeneous BHP are related in as in (7.3).
As a consequence of Proposition 7.2, let us take a specific function of the configuration: where G(·) is a function Z ≥0 → R. Note that N 0 is essentially the height function at zero. Let η t be the random configuration of the TASEP at time t with the step initial configuration, and N 0 t := η t (0) + η t (1) + . . .. Corollary 7.4. With the above notation, we have In the sum over x in the right-hand side almost surely only one term is nonzero, and the whole sum is equal to the distance of the rightmost particle in Z <0 to zero.
Proof. The right-hand side is equal to E A T f (η t ) , which by Proposition 7.2 is the same as −t −1 E(A L f (η t )). The rest follows from the computation of A L f (η t ) for the particular function (7.5), which is straightforward.
7.2. Hydrodynamics. The hydrodynamic limit for the TASEP is well known, with early results by [Lig75] on the convergence to a local equilibrium and by [Ros81] on the connection of the density function to the Burgers' equation. The latter means that under linear space and time scaling, the limiting density density function ρ(t, z) of the TASEP is the entropic solution of the following initial-value problem for the one-dimensional Burgers' equation: ρ(0, z) = 1, z ≤ 0 0; z > 0. (7.6) We refer to [BF87] for further details, see also [Fer18] for a recent review. The solution to (7.6) is given by (7.7) The limiting density ρ(t, z) describes the law of large numbers type behavior of the TASEP. and references therein for details. The progress in understanding the TASEP asymptotics with general initial data, and also the asymptotics of the space-time structure in TASEP is currently ongoing [FS16], [CFS18], [MQR17], [Joh18], [BL19], [FO19], [BG18], [DOV18], [Joh19], [BGH19], [JR19]. While we expect the BHP and the stationary dynamics from Section 7.1 to have applications for all these types of scaling limits, we begin by considering the hydrodynamic limit of the BHP in this section.
Fix T ∈ R >0 . For any > 0, let η t ∈ C be the random configuration at time t ∈ [0, T ] of the TASEP with particle speeds c k = −1 for all k ∈ Z ≥1 , and step initial conditions. That is, η t is the sped-up TASEP. The random empirical measure on R associated to η t ∈ C is given as follows: (7.8) Denote the set of compactly supported continuous functions on the line by C 0 (R). The integral of a function f ∈ C 0 (R) against the measure π is denoted by π , f . Clearly, The next statement can be found in, e.g., [Sep99]. The sequence of measures {π } ∈R >0 converges as → 0 in probability to ρ(t, z)dz so that the density function ρ(t, z) is the entropic solution of the initial value problem for the Burgers equation (7.6). That is, for each t ≥ 0, given any δ > 0, for any f ∈ C 0 (R). This result for TASEP generalizes to a large class of initial conditions. For instance, given a continuous density profile ρ 0 : R → [0, 1], a sequence {ν } ∈R >0 of probability measures on C = {0, 1} Z is said to be associated to the profile ρ 0 if for every f ∈ C 0 (R) and every δ > 0, we have Then, the empirical measure π t for the TASEP, with initial conditions now given by ν converges in probability to an absolutely continuous measure ρ(t, z)dz so that the density function is the entropic solution to the Burgers' equation with the initial value given by the density profile ρ(t, z), see [Sep99]. We expect a similar hydrodynamic result to hold for the BHP but with a different PDE arising from the infinitesimal generator of the BHP.
Conjecture 1. Let ρ 0 : R → [0, 1] be an initial density profile and let {ν } ∈R >0 be a sequence of probability measures on C associated to ρ 0 . 7 Also, for a fixed > 0, take η t ∈ C to be the random configuration at time t > 0 of the BHP sped up by the factor of −1 , with the initial configuration η 0 determined by the measure ν . Then, for every t > 0, the sequence of random empirical measures π t defined as in (7.8) converges in probability to the absolutely continuous measure π t (dz) = ρ(z, t)dz in the sense of (7.9). The density ρ(t, z) is a solution of the initial value problem ρ(0, z) = ρ 0 (z).
(7.10) 7 We need to make sure that the BHP evolution is well-defined, so the initial configuration must be in C ⊂ C.
Remark 7.6. In Conjecture 1, it is unclear to the authors if there is a unique solution to the initial value problem (7.10). In particular, it is unclear what type of solution the limiting density profile ρ(t, z) should be.
Remark 7.7. The differential equation (7.10) can be informally obtained by looking at the local version of the BHP. That is, locally we expect the configuration to be close to the independent Bernoulli random configuration on the whole line Z with the density ρ(t, z). Then the expression under ∂/∂z in the right-hand side of (7.10) is the (negative) flux. Indeed, ∞ z ρ(t, w)dw means the inhomogeneous rate in the BHP, while −(1 − ρ(t, z))/ρ(t, z) is the local flux of the homogeneous BHP with left jumps and speed 1. See Proposition 7.9 below for more discussion.
Let us check that Conjecture 1 holds for the initial data associated with the TASEP distributions µ t .
Proposition 7.8. Fix some t 0 ∈ R and let η ∼ µ −1 e t 0 be the TASEP random configuration at time −1 e t 0 . Then, the sequence {η } ∈R >0 is associated to the density profile and Conjecture 1 is true for the measures ν = µ −1 e t 0 .
Proof. By results for the TASEP, we know that the sequence η is associated to the density profile ρ 0 given in the statement. Also, by Theorem 1, we know that the random configuration η t obtained from ν = µ −1 e t 0 by the BHP evolution as in Conjecture 1, is distributed according to µ −1 e t 0 −t . So, again by results for the TASEP, we know that the sequence of random measures π t converges to an absolutely continuous measure π t (dz) = ρ(z, t)dz with the density given by z > e t 0 −t .
One can then check directly that the above ρ(t, z) solves the initial value problem (7.10). This completes the proof.
We base Conjecture 1 on the random evolution of the empirical measure π t given by the infinitesimal generator for the BHP.
Proposition 7.9. Let f : R → R be a twice differentiable compactly supported function and let η t ∈ C be the random configuration given by the BHP sped up by −1 . Here the time t ≥ 0 and the initial configuration η 0 ∈ C are fixed. Then, there are martingales M ,f t with respect to the natural filtration σ(η s , s ≤ t) so that where we regard π t , f as a function of the configuration η t . We can compute With the help of the approximation the statement follows from standard results on Markov chains.
7.3. Limit shape for TASEP with step initial condition. Let us present an alternate derivation for the limit shape of the TASEP with the step initial configuration assuming Conjecture 1 but independent of the similar result for the TASEP. We only assume that the TASEP empirical measure converges to ρ satisfying the following system of equations: (7.11) In particular, we show that this system of partial differential equations determines a unique solution under some general assumptions. First, eliminate the time derivative so that Note that, for all t ∈ R ≥0 , there is a z ∈ Z small enough so that ρ(t, z) = 1. This implies that the constant c(t) is in fact zero. Thus, we have Taking the space derivative, we have (7.12) Revisiting the system of equations (7.11), we may now write the second equation as follows 2ρ(t, z)).
By separation of variables, we may solve the equation above up to a constant of integration, but this constant of integration may be determined by (7.12). Thus, we get the well-known hydrodynamic density function We have thus shown that the comparability of the TASEP and the BHP uniquely picks out the entropic solution to the Burgers' equation for the limiting density function with the step initial condition.

Extensions and open questions
In this section we describe a number of modifications and extensions of the constructions presented earlier, and outline a number of open questions. 8.1. More general q-Gibbs measures. The Markov map L (q) from Definition 5.2 acts nicely on Schur processes P[(1, q, q 2 , . . .) | ρ] with general specializations ρ. Even more generally, we can consider two-sided Schur processes which live on interlacing arrays of signatures. Signatures are analogues of partitions in which parts are allowed to be negative. Interlacing arrays of signatures are simply the collections {λ (k) j } 1≤j≤k , satisfying the interlacing inequalities as in Figure 6, and with λ (k) j ∈ Z. (Note that we consider arrays of infinite depth.) For a specialization ρ parametrized as where ψ n (ρ), n ∈ Z, are the coefficients of the expansion n∈Z ψ n (ρ)u n = e γ + (u−1)+γ − (u −1 −1) and |u| = 1. One of the equivalent forms of the Edrei-Voiculescu theorem (e.g., see [BO12]) states that (8.3) parametrizes the space of all totally nonnegative two-sided sequences.
Remark 8.1. In particular, taking α ± i = β ± i = 0 for all i, γ − = 0, and γ + = t turns the just defined specialization ρ into ρ t defined in Section 2.4 Define the two-sided ascending Schur process P[(1, q, q 2 , . . .) | ρ] as the unique q-Gibbs measure on interlacing arrays of signatures such that for any N , where the skew Schur functions for signatures can be defined by (2.4). Define the Schur process P[ 1 | ρ] as the q → 1 degeneration of (8.4) (here and below we denote by 1 the sequence of spectral parameters which are all equal to 1). Another equivalent form of the Edrei-Voiculescu theorem states that P[ 1 | ρ] are all possible extreme Gibbs measures on interlacing arrays of signatures (a Gibbs measure is called extreme if it cannot be represented as a convex combination of other Gibbs measures). We refer to [Bor11] for further details on the definition of the two-sided Schur processes.
Theorem 8.2. Let ρ be a specialization with parameters (8.1) such that α − i = 0 for all i. Then we have for all 0 < q < 1: where ρ (q) is the specialization corresponding to the parameterŝ Proof of Theorem 8.2. This follows from Theorem 5.3 similarly to the computation in the beginning of Section 5.5. Namely, denote (8.3) by Ψ(u; ρ). The q-Gibbs measure (8.4) corresponds to the q-Gibbs harmonic family .
Note that Ψ(1; ρ) = 1 but it is convenient to include this factor here. Note also that the condition α − i ≡ 0 ensures that the series Ψ(q m ; ρ) converge for all m ∈ Z ≥1 . The action of L (q) turns the q-Gibbs harmonic family {ϕ N } intô In particular, for N = 1 we have ψ n (ρ (q) ) = ψ n (ρ) q n Ψ(q; ρ) , n ∈ Z, which readily translates into the modification of the parameters (8.5) in the claim.
Measures on interlacing arrays of the form P[(1, q, q 2 , . . .) | ρ] are not extreme q-Gibbs. A classification of extreme q-Gibbs measures is obtained in [Gor12] (note that our q corresponds to 1/q in that paper, so the description of the boundary needs to be reversed). Extreme q-Gibbs measures P In [BG13] a decomposition of the non-extreme q-Gibbs measures P[(1, q, q 2 , . . .) | ρ] onto the extreme ones P (q) n is given in terms of a determinantal point process on the set of shifted labels. The shifted labels in our notation are n 1 − 1 > n 2 − 2 > . . ., and they form a random point configuration on Z whose correlation functions have a determinantal form. The action of L (q) on n from Proposition 8.3 removes the largest point in this determinantal process on Z, and shifts all its other points by one to the right.
Question 2. How to explicitly link the action of L (q) on n with the modification of the parameters (8.5) of the determinantal point process describing P[(1, q, q 2 , . . .) | ρ]? Does this correspondence (between the action on the parameters of the kernel and the action on the underlying random point configuration) survive any limit transition to more familiar determinantal point processes (e.g., random matrix spectra or Airy 2 )?
8.2. Limit q → 1 and action on Gibbs measures. The q → 1 limit of Theorem 8.2 can be obtained similarly to the argument in Section 6. Define by L τ , τ ∈ R ≥0 , the continuoustime Markov semigroup under which each particle λ However, this definition presents an issue since in a generic interlacing array, under L τ infinitely many particles jump in finite time. Moreover, because for any k ∈ Z ≥1 jumps of λ (k) j depend on the (k + 1)-st level, one cannot simply restrict L τ to the first several levels. Therefore, we have to consider a smaller space of interlacing arrays: : N ∈ Z ≥1 , N − K + 1 ≤ j ≤ N } (that is, to the K leftmost diagonals) is a Markov process, in which only finitely many particles jump in finite time. For different K, these Markov processes are compatible. Therefore, L τ makes sense on the state space S c . Below we denote by L τ the Markov semigroup constructed in this manner.
Idea of proof. One can check that the Schur process P[ 1 | ρ] with α − i = β − i = 0 for all i, γ − = 0, and β + 1 < 1 is supported on the subset S c described in Definition 8.4. Similarly to Section 6, we see that under the scaling q = e −ε , T = τ /ε , ε → 0, we have (L (q) ) T → L τ . Next, the modification of the parameters (8.5) is a one-parameter semigroup. That is, applying L (q) one more time replaces q everywhere in (8.5) by q 2 . Because q T ∼ e −τ , we get the result.
In particular, L τ maps the push-block process of [BF14] (see Definition 8.8 below) backwards in time in the same sense as Theorem 1.
8.3. Iterated R maps. Consider the maps R (j) α defined in Section 4. Similarly to Section 5.2, we can define the iterated R map R (q) by (this definition has the same formal meaning as for the map L (q) , see Section 5.2). The map R (q) acts nicely on q −1 -Gibbs measures (i.e., corresponding to c = (1, q −1 , q −2 , . . .)). Namely, one can check that an analogue of Theorem 5.3 holds, with q replaced by q −1 in the definition of the harmonic functions and in (5.4). The q → 1 continuous-time limit R τ of R (q) is also readily defined with the help of Definition 8.4 -this is just the mirroring of L τ from Section 8.1, in which all particles jump to the right. One can obtain the following analogue of Theorems 8.2 and 8.5 for the action of R (q) and R τ on q −1 -Gibbs Schur processes: Theorem 8.6. Let ρ be a specialization as in (8.1)-(8.3) such that α + i = 0 for all i. We have for all 0 < q < 1: where ρ (1/q) has modified parameters as in (8.5), but with q replaced by 1/q. Moreover, if α + i = β + i = 0 for all i, γ + = 0, and β + 1 < 1, then , where ρ (e τ ) is defined in a similar way. Question 3. Is it possible to extend the definition of R τ to Schur processes with γ + > 0? (This is equivalent to extending L τ to the case γ − > 0.) If such an extension is possible, then R τ would turn the time t in the Schur process P[ 1 | ρ t ] (with the Plancherel specialization γ + = t and all other parameters zero) into e τ t, that is, forward. Note that this process would move infinitely many particles in finite time and move individual particles very far, too.
Recall that P[ 1 | ρ t ] can be generated by the push-block dynamics (Definition 8.8 below). Under this dynamics, the rightmost components {λ (N ) 1 } of the interlacing array evolve as a PushTASEP, a close relative of TASEP, but with a pushing mechanism [BF08], [BF14]. Therefore, a positive answer to Question 3 would lead to a continuous-time semigroup which maps PushTASEP forward in time.
(8.6) Note that T (s i ) do not satisfy the symmetric group relations when acting on S λ,N (in particular, T (s i ) 2 is not identity).
Let M λ c denote the c-Gibbs measure on S λ,N : Note that in contrast with arrays of infinite depth (cf. Section 2.7), here the c-Gibbs property determines the measure M λ c uniquely.
Let w N = (N, N − 1, . . . , 2, 1) be the longest element in the symmetric group S N , and w N = s i 1 s i 2 . . . s i N (N −1)/2 , 1 ≤ i k ≤ N − 1, be its reduced word decomposition which is also assumed fixed. Define T := T (s i 1 )T (s i 2 ) . . . T (s i N (N −1)/2 ) (8.7) (in this notation we do not indicate the dependence on the choice of a particular reduced word). Clearly, T 2 acts as the identity on the S N part of S λ,N . Moreover, T 2 preserves the measure M λ c viewed as the measure on S λ,N × {e}. Indeed, this is because by Proposition 4.7 each T (s i ) maps M λ σ c to M λ s i σ c . The map T 2 can be viewed as a sampling algorithm for the measure M λ c : Proposition 8.7. Start with any (nonrandom) interlacing array (λ (1) ≺ . . . ≺ λ (N −1) ≺ λ) and apply the Markov map T 2k to it. The distribution of the resulting random interlacing array converges, as k → +∞, to M λ c in the total variation norm. Proof. This follows from the standard convergence theorem for Markov chains on finite spaces. Indeed, the Markov chain corresponding to T 2k is • aperiodic since T 2 assigns positive probability to the trivial move; • irreducible because T 2 assigns positive probability to changing only one entry λ (k) j , 1 ≤ j ≤ k ≤ N − 1 in the interlacing array (the set S λ,N is connected by such individual changes). This completes the proof.
Question 4. How fast is the convergence in Proposition 8.7, depending on the system size (which is ∼ N λ 1 )? What is the mixing time of T? 8.5. q-distributed lozenge tilings. Let us now consider a concrete case of the setup outlined in the previous Section 8.4. Fix N and the top row λ = (b, b, . . . , b, 0, 0 . . . , 0), where b repeats a times and 0 repeats c times, with a + c = N . Then interlacing arrays of depth N and top row λ are in bijection with lozenge tilings of a hexagon with sides a, b, c, a, b, c, or, equivalently, with boxed plane partitions (see Figure 10 for an illustration and, e.g., [BP14] for more details).
Let M q −1 and M q denote the measures under which the probability weight of a lozenge tiling is proportional to q −vol or q vol , respectively, where the volume is defined in (5.1). These two measures are c-Gibbs with c = (1, q, q 2 , . . . , q N −1 ) and c = (q N −1 , . . . , q, 1), respectively (recall that multiplying c by a scalar does not change the c-Gibbs property).
Take the reduced word w N = (s 1 s 2 . . . s N −1 )(s 1 s 2 . . . s N −2 ) . . . (s 1 s 2 )(s 1 ), and let T be the corresponding Markov map (8.7). One readily sees that the action of T on M q −1 : • Almost surely moves vertical lozenges (see Figure 10) to the left because in (8.6) we always choose the option L; • Changes the (N − 1)-st row of the tiling only once, the (N − 2)-nd only twice, and so on. An exact sampling algorithm for M q −1 was presented in [BGR10]. Starting with the exact sample of M q −1 (Figure 10, left) and applying T, we obtain an exact sample of M q (Figure 10, right), while randomly moving the vertical lozenges to the left. An implementation of this mapping M q −1 T = M q with all the intermediate steps can be found online [PZ19].
The map T works in the same way for an arbitrary top row λ (when the polygon being tiled is not necessarily a hexagon, but can be a general sawtooth domain as in, e.g., [Pet14]). The advantage of the hexagon case is the presence of the exact sampling algorithm [BGR10].  [BGR10], and the picture on the right is the result of applying the map T.
Question 5. Consider lozenge tilings of growing sawtooth domains with top rows λ = λ(N ) which depend on N in some way. Can the symmetry of the q ±vol measures manifested by the map T be utilized to obtain the limit shape and fluctuations of the leftmost piece of the frozen boundary as N → +∞?
Here by the leftmost piece we mean the part of the curve separating the leftmost region occupied by only vertical lozenges, and the liquid region. Existence and characterization of limit shapes for q ±vol is due to [CKP01], [KO07], and some explicit formulas were obtained recently in [DFG19].
8.6. Dynamics in the bulk. Consider the Schur process P[ 1 | ρ t ] (also sometimes known as the Plancherel measure for the infinite-dimensional unitary group). It is convenient to use lozenge tiling interpretation of interlacing arrays as in the previous Section 8.5. From [BK08], [BF14] it is known that as N , k, and t go to infinity proportionally to each other, the local lattice configuration of lozenges around each λ (N ) k converges to the ergodic translation invariant Gibbs measure on lozenges tilings of the whole plane (see Figure 11 for an illustration). Such ergodic measures form a two-parameter family [She05]. As parameters one can take the densities of two of the three types of lozenges. We remark that the ergodic Gibbs measures are far from being independent Bernoulli ones. In particular, the joint correlations of lozenges possess a determinantal structure [OR03].
We say that (N, k, t) correspond to the bulk of the system if the limiting density of each of the types of lozenges around λ (N ) k (t) is positive. One can also consider the bulk limit of the dynamics L τ . Because L τ maps the Schur process P[ 1 | ρ t ] to P[ 1 | ρ e −τ t ] and t → +∞, we need to scale τ as τ = τ/t (here τ ∈ R >0 is the new scaled time which stays fixed). Then e −τ/t t ∼ 1 − τ t t = t − τ. Considering L τ/t is equivalent to slowing down all the jump rates in L by the factor of t. Since we are looking around level N and N grows proportionally to t, the slowed down dynamics in the bulk will have equal jump rates on all levels at finite distance from the N -th one. Therefore, under the bulk limit of L τ , each vertical lozenge can move into one of the holes to the left of it (with the requirement that the interlacing is preserved), at a constant rate per hole (for simplicity, we can assume that this rate is equal to 1). See Figure 11 for an illustration. Consider the combination of the dynamics L τl/t and R τr/t running in parallel, 8 where l, r > 0 are parameters. In the bulk limit of this combination, we readily obtain the Hammersley-type process in the bulk with two-sided jumps. This two-sided dynamics was introduced and studied in [Ton17], where it was shown that this dynamics preserves the ergodic Gibbs measures on tilings of the whole plane. We see that our Markov maps L τ and R τ can be viewed as the pre bulk limit versions of the two-sided Hammersley-type processes of [Ton17].
Let us now discuss connections to the push-block dynamics of [BF14]. For completeness, let us recall its definition: Definition 8.8 (Push-block dynamics). Each vertical lozenge has an independent exponential clock of rate 1. When the clock rings, the lozenge tries to move to the right by one. If it is blocked by a vertical lozenge from below (see the square mark in Figure 11), then the jump is suppressed. If there are vertical lozenges above the one moving, then they also get pushed to the right by one.
The one-sided particular case of the Hammersley-type processes is essentially the push-block dynamics, up to rotating the picture by π/3 and focusing on the yellow lozenges in Figure 11 instead of the vertical (gray) ones. Thus, in the bulk limit Theorem 8.5 informally turns into the statement that one can run the one-sided Hammersley-type process and the push-block dynamics (both in terms of the vertical lozenges), and the resulting process preserves ergodic Gibbs measures. This statement essentially follows from [Ton17], as well as its rather straightforward generalization given next: Proposition 8.9. Running six one-sided Hammersley-type processes in parallel, where each individual process moves one type of lozenges in one of the directions e iπk/3 , 0 ≤ k ≤ 5, at a specified rate α k ≥ 0, preserves ergodic Gibbs measures on tilings of the whole plane. 8.7. Branching graph perspective. Recall that by S c we denote the set of all interlacing arrays of infinite depth which have many zeroes along the left border (Definition 8.4). Let us explain how the Markov maps L τ can be utilized to equip S c × R with a structure of an R-graded projective system in the sense of [BO13]. Projective systems generalize branching graphs such as the Young graph, and the latter play a fundamental role in Asymptotic Representation Theory [VK81], [BO16]. The definitions and questions in this subsection are motivated by the connection to branching graphs.
Remark 8.10. The set S c × R is "larger" than the more well-studied branching graphs. Namely, in the Young and Gelfand-Tsetlin graphs the vertices are indexed by Young diagrams and signatures, respectively (a signature is a tuple (ν 1 ≥ . . . ≥ ν N ), ν i ∈ Z), while in S c × R the vertices are whole infinite collections of interlacing diagrams λ (1) ≺ λ (2) ≺ . . .. This makes it hard to predict which properties of the Young and Gelfand-Tsetlin graphs could translate to S c × R.
Let M s , s ∈ R, be probability measures on S supported by S c (examples include the one-sided Schur measures as in Theorem 8.5). We call the family {M s } s∈R coherent if for any τ ≥ 0 and s ∈ R we have M s L τ = M s−τ . Coherent families are sometimes known as entrance laws, cf. [Dyn78]. Clearly, coherent families form a convex set. Its extreme elements are, by definition, those which cannot be represented as nontrivial convex combinations of other coherent families. • When β + 1 = β ∈ (0, 1) and all other parameters are zero, the random interlacing array corresponding to M Schur s has the form λ (N ) = (1 X N 0 N −X N ), where (X 1 , X 2 , . . .) is the trajectory of the simple random walk with steps 0, 1 taken with probabilities 1 − β + 1 (s) and β + 1 (s), respectively. The parameter β + 1 (s) interpolates between 0 and 1 at s = −∞ and s = +∞, respectively. The map L τ thus provides a coupling between these simple random walk trajectories with varying probability of up step. The concrete action of L τ in this example leads to Proposition 2 formulated in the Introduction. Let us focus on the case {M Schur s } with γ + = 1 and all other parameters zero. The structure of a projective family allows to define for each s ∈ R the up-down Markov process on S c which preserves each M Schur s (see [BO09]). In more detail, the forward Markov generator is defined as L up s,s+ds (µ → λ) = M Schur s+ds (λ) M Schur s (µ) L ds (λ → µ), λ, µ ∈ S c .
One can check that this is not the same forward evolution as the push-block generator (under any time change). In particular, L up s,s+ds is time-inhomogeneous. Therefore, the up-down Markov process arising from the branching graphs formalism does not reduce (in restriction to the leftmost particles λ (N ) N ) to the stationary dynamics from Section 7. Question 8. The up-down Markov chains associated with distinguished non-extreme coherent families on well-studied branching graphs converge to infinite-dimensional diffusions on the boundary (e.g., [BO09], [Pet09]). Is there such a limit procedure for the up-down processes associated with {M Schur s } or other coherent families on S c × R?
Viewing C as a subset of S c , one can similarly define the projective system structure on C × R associated with the Markov maps L τ (Definition 6.2). The restrictions of {M Schur s } form coherent families on C × R, and all the problems formulated in this subsection also make sense for the smaller object C × R. Note that the up-down Markov chain on each floor C × {s} with γ + = 1 (and all other parameters zero) preserves the TASEP distribution µ e s , but is it not the same as the stationary dynamics discussed in Section 7.
8.8. Lifting to additional parameters. The definition of the local Markov maps L (j) α and R (j) α which randomly change a single level of an interlacing array is inspired by the bijectivization of a degenerate case of the Yang-Baxter equation. Beyond this degenerate case associated with the Schur symmetric polynomials, the bijectivization can be developed to include models associated with spin Hall-Littlewood or spin q-Whittaker symmetric functions [BP17], [BMP19]. A scheme of symmetric functions is given in Figure 12 (see Section 4.2 for more details). Therefore, one can potentially define Markov maps preserving the class of probability measures on interlacing arrays satisfying a version of the Gibbs property associated with the spin Hall-Littlewood functions. These Gibbs measures include the subclass of spin Hall-Littlewood processes. The Markov maps on the spin Hall-Littlewood processes could project (in a way similar to how L τ leads to L τ ) into maps acting nicely on distributions of the stochastic six-vertex model and the ASEP with step initial data.
Question 9. Do there exist Markov maps on (spin) q-Whittaker processes mapping the time parameter in the q-TASEP or the (continuous-time) q-Hahn TASEP backwards?
Finally, let us discuss a setting which does not immediately fit into the scheme of Figure 12 but is also of interest. Configurations of the (not necessarily stochastic) six-vertex model with the domain wall boundary conditions (e.g., see [Res10]) can be encoded as finite depth interlacing arrays of strict partitions with fixed top row. The Yang-Baxter equation swapping spectral parameters in this model can potentially be bijectivised in the same way as in [BP17], which should lead to Markov maps acting nicely on the distribution of the six-vertex model. (In the Schur case this is described in Section 8.4.) Question 10. Can these Markov maps be taken to the continuous-time limit similarly to the q → 1 limit described in Section 6? If this is possible, this would lead to a new non-local sampling algorithm for the distribution of the homogeneous (i.e., with equal spectral parameters) six-vertex model with domain wall boundary conditions. The bulk limit of this latter algorithm should presumably coincide with the Markov process from [BB17] preserving the distribution of the six-vertex model on a torus.