1 Introduction

If a quantum computer is subjected to excessive random noise, it will cease to provide an advantage over classical devices. The resulting phase transition between “quantum” and “classical” computational regimes was first clearly identified by Aharonov [1]. Recently, this idea has enjoyed a resurgence in the context of the “measurement-induced phase transition” (MIPT) arising in quantum circuits subjected to random measurements [2,3,4,5,6]. A well-studied model that realizes this transition is a chain of L qubits acted upon by alternating unitary and measurement layers, consisting of local unitary gates and a spatial density \(p>0\) of single-qubit projective measurements respectively. Under such dynamics, mixed initial states will undergo a process of “dynamical purification” whereby they tend to pure states at long times, with the Rényi entropy of the system density matrix decaying to zero along generic quantum trajectories [7]. The characteristic timescale for this process to occur is called the purification time, which we denote \(\tau _{\textrm{P}}\). For sufficiently small p, such systems realize an “entangling” phase, for which \(\tau _{\textrm{P}}\) is exponentially long in the system size, while for larger p, they realize a “disentangling” phase, for which \(\tau _{\textrm{P}}\) is independent of the system size. The two phases are separated by a critical point \(p=p_c\) at which \(\tau _{\textrm{P}} \sim L\) [7, 8].

A particularly simple model for such dynamical purification is given by the “monitored Haar-random quantum dot”, whose time evolution consists of Haar-random unitaries on the full L-qubit Hilbert space, i.e. unitary matrices drawn from the Haar measure on U(N) where \(N=2^L\), alternating with layers of independent single-qubit projective measurements on pL of the qubits. Variants of this model were studied in Refs. [9,10,11]. These models realize only the entangling phase of the conventional MIPT, with a “trivial” critical point at \(p_c = 1\). Nevertheless, we show below that these models yield a plethora of new and exact results, some of which should capture universal features of the entangling phase in spatially local systems, though we will not explicitly consider spatially local systems in this paper.

We achieve this by applying a combination of ideas from random-matrix theory and disorder physics, which differ from the field-theoretic replica methods that have largely been the analytical tools of choice for analyzing monitored dynamical phases in previous work [12]. This approach allows us to determine analytically the behaviour of various physical quantities that were previously mostly accessible by numerical simulations or by heuristic arguments, including the purification time, the dynamics of Rényi entropies and the distribution of Born probabilities. We also introduce an unstructured model with near-identity weak measurements that can be seen as a monitored analogue of Dyson Brownian motion (see Refs. [9, 11] for related constructions). The Fokker–Planck equation describing this model in the continuous time limit turns out to coincide with that of a known solvable model of Calogero–Sutherland type [13]. This exact solution grants us a thorough understanding of the dynamics of the full set of singular values of Kraus operators along quantum trajectories, for all times.

Our perspective in this paper differs from most previous work on monitored quantum circuits in another important respect. Monitored dynamical phases and the transitions between them are usually diagnosed by averaging physical quantities over ensembles of quantum trajectories weighted by their Born probabilities, and the main objects of study are the numerical values of these averages [12]. The quantities being averaged are furthermore usually nonlinear functions of the density matrix on each quantum trajectory, leading to a “post-selection barrier” that hinders direct comparison with experiments [14,15,16]. Post-selection and averaging tend to be advocated on the grounds that they are indispensable for probing “typical” quantum trajectories, which determine the monitored dynamical phase but are invisible at the level of the conventional density matrix. In this paper, we instead study time evolution along typical quantum trajectories directly (see also Refs. [8, 17]). Considering time evolution along single trajectories in lieu of Born-rule averaging can be motivated by analogy with the ergodic hypothesis in classical statistical physics, which similarly permits replacing ensemble averaging in chaotic systems with time-averaging along individual phase-space trajectories. However, in order to apply such reasoning to monitored quantum systems, we must first understand the extent to which distinct quantum trajectories are statistically alike. We must also understand the shape of the probability distributions that capture this statistical similarity. The latter will depend on both time and on the specific system under consideration, raising the question of how far such probability distributions reflect universal features of the underlying dynamical phase. To our knowledge, these probability distributions have not been studied in detail. Here, we derive them for monitored Haar-random quantum dots and return to questions of universality in the conclusion.

The paper is structured as follows. We first consider projective measurements and introduce the statistical ensembles of Kraus operators that will be the central objects of study. We present exact expressions for the Lyapunov spectrum of these Kraus operators, based on a mapping to so-called “truncated unitary ensembles” in random matrix theory [18, 19], from which we determine the purification time analytically. We then derive the exact distribution of Born probabilities, and explain how this generalizes the so-called Porter–Thomas distribution [20,21,22] for random unitary circuits to a two-parameter family of distributions determined by both the measurement density p and the circuit depth t. Finally, we consider time evolution starting from maximally mixed initial states, which clarifies earlier results [7, 10, 23] on the time evolution of Rényi entropies as reflecting a crossover in time from a narrow Wigner-semicircle-like to a broad (and to a first approximation log-normal) distribution of singular values of Kraus operators.

We next turn to the weakly measured case, for which we derive a Dorokhov–Mello–Pereyra–Kumar (DMPK)-like equation [24, 25] that captures the time evolution of the joint distribution function of singular values of Kraus operators. The resulting Fokker–Planck equation is exactly solvable [13] in a manner analogous to the time-reversal-symmetry-breaking case of the DMPK equation [26]. Using this exact solution, we demonstrate explicitly that the joint distribution function of singular values exhibits a “semicircle-to-square” crossover between log-GUE statistics and log-normal statistics as a function of time. This represents a remarkably complete understanding of this model’s dynamics that corroborates our results on projective measurements. We conjecture that this behaviour is universal for entangling phases of monitored quantum systems, in the same manner that random matrix theory captures the spectral properties of generic closed quantum systems.

2 Projective Measurements

2.1 Kraus Operator Ensembles

Consider a quantum dot of L qubits, acted upon by alternating unitary and measurement layers, with each pair of layers constituting one time step. Each unitary layer consists of a unitary operator acting on all L qubits and drawn from the Haar measure on U(N), where \(N=2^L\) throughout this paper. Each measurement layer comprises independent, projective, single-qubit \(\hat{Z}\) measurements acting on a mean number of qubits pL per layer, with \(0 \le p \le 1\). The spatial probability distribution of these measurements depends on the model of interest, to be fixed below. The difference between the models that we consider here and the more standard [12] spatially local Haar-random quantum circuits is depicted schematically in Fig. 1. We denote the string of measurement outcomes in the jth measurement layer by \(\textbf{m}_j\), and denote the measurement history of the whole circuit along a given quantum trajectory by the tuple \(\textbf{m}=(\textbf{m}_1,\textbf{m}_2,\ldots ,\textbf{m}_t)\). We denote the projection operator onto the measurement outcome \(\textbf{m}_j\) by \(\hat{P}_{\textbf{m}_j}\). In general, the rank of \(\hat{P}_{\textbf{m}_j}\) equals \(2^{L-|\textbf{m}_{j}|}\), where \(|\textbf{m}_j|\) denotes the number of qubits measured at time j.

Fig. 1

A schematic illustration of how the Haar-random monitored quantum circuits that we consider in this work (right) differ from more standard spatially local examples (left). The legs at the bottom of each picture correspond to individual qubits, blue rectangles depict Haar random unitary gates acting on the Hilbert space of the incoming legs, and red dots indicate single-qubit projective measurements.

For a pure initial state \(|\psi (0)\rangle \), the time-evolved state along a given quantum trajectory \(\textbf{m}\) can be written as [12]

$$\begin{aligned} |\psi (t)\rangle = \hat{K}_{\textbf{m}}(t) |\psi (0)\rangle /\sqrt{p(\textbf{m})}, \end{aligned}$$
(1)

where the Kraus operator

$$\begin{aligned} \hat{K}_{\textbf{m}}(t) = \hat{P}_{\textbf{m}_t} \hat{U}_{t} \ldots \hat{P}_{\textbf{m}_1} \hat{U}_1 \end{aligned}$$
(2)

and the Born probability of the string of measurement outcomes \(\textbf{m}\) is given by

$$\begin{aligned} p(\textbf{m}) = \langle \psi (0) | \hat{K}_{\textbf{m}}(t)^\dagger \hat{K}_{\textbf{m}}(t) | \psi (0) \rangle . \end{aligned}$$
(3)

This can be generalized to arbitrary initial density matrices \(\hat{\rho }(0)\), whose time evolution under this dynamics is given by

$$\begin{aligned} \hat{\rho }(t) = \sum _{\textbf{m}} \hat{K}_{\textbf{m}}(t) \hat{\rho }(0) \hat{K}_{\textbf{m}}^\dagger (t). \end{aligned}$$
(4)

Where applicable, we make the choice [7] to unravel this evolution as an ensemble of single-trajectory density matrices

$$\begin{aligned} \hat{\rho }_{\textbf{m}}(t) = \frac{\hat{K}_{\textbf{m}}(t)\hat{\rho }(0)\hat{K}_{\textbf{m}}^\dagger (t)}{p(\textbf{m})} \end{aligned}$$
(5)

drawn with probabilities \(p(\textbf{m}) = \textrm{Tr}[\hat{K}_{\textbf{m}}(t)\hat{\rho }(0)\hat{K}_{\textbf{m}}^\dagger (t)]\). We refer to the review article Ref. [12] for a more detailed discussion of Kraus operators for monitored quantum circuits. When considering mixed initial states, for simplicity we will always restrict our attention to the maximally mixed initial state

$$\begin{aligned} \hat{\rho }(0) = \frac{1}{2^L}\mathbb {1}, \end{aligned}$$
(6)

for which

$$\begin{aligned} \hat{\rho }_{\textbf{m}}(t) = \frac{\hat{K}_{\textbf{m}}(t)\hat{K}_{\textbf{m}}^\dagger (t)}{\textrm{Tr}[\hat{K}_{\textbf{m}}(t)\hat{K}_{\textbf{m}}^\dagger (t)]} \end{aligned}$$
(7)

along each quantum trajectory.

The basic quantity of interest in this work is the statistical ensemble of Kraus operators \(\{\hat{K}_{\textbf{m}}\}\), defined in general by first sampling over the Haar measure, then sampling over measurement locations and finally sampling over measurement outcomes. Since we will be focusing on Haar random unstructured systems, the latter average will mostly be redundant, and expectation values \(\mathbb {E}\) will always denote expectation values along a fixed quantum trajectory with respect to the Haar measure on each unitary layer unless specified otherwise. Thus we do not explicitly consider Born-rule averaged quantities in this work, although we address the distribution of Born probabilities with respect to the Haar measure in Sect. 2.5.

The main tool at our disposal for studying the ensemble of Kraus operators \(\hat{K}_{\textbf{m}}\) will be their singular value decomposition,

$$\begin{aligned} \hat{K}_{\textbf{m}}(t) = \hat{V}_t\hat{D}_t \hat{W}_t^\dagger \end{aligned}$$
(8)

where we write \(\hat{D}_t = \textrm{diag}(\sigma _1(t),\ldots ,\sigma _N(t))\), with the convention that \(\sigma _1(t) \ge \cdots \ge \sigma _N(t) \ge 0\). We emphasize that the matrices of singular vectors \(\hat{V}_t,\, \hat{W}_t\) and the singular values \(\sigma _n(t)\) are random variables that depend sensitively on the circuit realization, and suppress the measurement history in Eq. (8) for notational convenience.

2.2 Rank Collapse and Dynamical Purification

Because the Kraus operators above generically include projective measurements, their rank \(r(t) = \textrm{rk}[\hat{K}_{\textbf{m}}(t)]\) is a random variable that does not increase in time and \(\sigma _n(t) = 0\) for \(n > r(t)\). The decay of the rank defines a “rank-collapse time”, given by the mean stopping time

$$\begin{aligned} \tau _{\mathrm {R.C.}} = \mathbb {E}[\min {\{t:r(t)=1\}}], \end{aligned}$$
(9)

which is infinite if the rank does not decay.

In general, the singular values \(\sigma _n(t)\) will decay in an average sense as \(t \rightarrow \infty \); in particular, the Oseledets ergodic theorem [27] guarantees the existence of Lyapunov exponents

$$\begin{aligned} \lambda _n = \lim _{t \rightarrow \infty } \frac{\log {\sigma _n(t)}}{t} \end{aligned}$$
(10)

with probability one, provided that \(n \le r(t)\) with probability one for all time. To see the physical meaning of the first few singular values and their Lyapunov exponents, suppose that \(\lambda _2\) is defined and consider time evolution from a maximally mixed state, as in Eq. (7). Then

$$\begin{aligned} \hat{K}_{\textbf{m}}(t) \hat{K}_{\textbf{m}}(t)^\dagger \sim \sigma _1^2(t) \textbf{v}_1 \textbf{v}_1^\dagger + \sigma _2^2(t) \textbf{v}_2 \textbf{v}_2^\dagger , \quad t \rightarrow \infty , \end{aligned}$$
(11)

where \(\textbf{v}_n\) denotes the nth column of \(\hat{V}_t\) in Eq. (8). Thus the Born probability

$$\begin{aligned} p(\textbf{m}) \sim \frac{1}{N}\sigma _1^2(t), \quad t \rightarrow \infty , \end{aligned}$$
(12)

decays at a rate \(\tau ^{-1} = -2\lambda _1 \ge 0\) [8], while the trajectory density matrix

$$\begin{aligned} \hat{\rho }_{\textbf{m}}(t) \sim \textbf{v}_1 \textbf{v}_1^\dagger + \left( \frac{\sigma _2^2(t)}{\sigma _1^2(t)}\right) \textbf{v}_2 \textbf{v}_2^\dagger , \quad t \rightarrow \infty . \end{aligned}$$
(13)

It is clear that the decay of the ratio

$$\begin{aligned} \nu (t) = \frac{\sigma _2^2(t)}{\sigma _1^2(t)} \end{aligned}$$
(14)

determines the rate of convergence of \(\hat{\rho }_{\textbf{m}}(t)\) to a pure state. We thus define the “purification time” \(\tau _{\textrm{P}}\) by

$$\begin{aligned} \tau _{\textrm{P}}^{-1} = \lim _{t\rightarrow \infty } \frac{-\mathbb {E}[\log {\nu (t)}]}{t} = 2(\lambda _1-\lambda _2). \end{aligned}$$
(15)

For more general initial states, such as pure states, \(\tau _{\textrm{P}}^{-1}\) is properly thought of as the expected rate of convergence of the trajectory density matrix to a rank-one projection along the random singular vector \(\textbf{v}_1\). The specific distribution of \(\textbf{v}_1\) will depend on the specific model and monitored dynamical phase under consideration. Thus a more general perspective on the purification time \(\tau _{\textrm{P}}\) is that it defines the timescale on which a monitored quantum system “forgets” its initial state [9] and begins to reveal universal properties of its monitored dynamical phase. We return to this point in the conclusion.

The above discussion reveals that there are two important timescales that determine the fate of the monitored quantum dot (and indeed arbitrary projectively measured quantum circuits) at asymptotically long times, namely the rank-collapse time and the purification time. However, these two timescales are in tension, because standard results [27, 28] on the existence of Lyapunov exponents \(\lambda _n\) for \(n>1\) require that \(\sigma _n(t)\) is generically non-zero for all time. Thus in order to define the purification time or higher Lyapunov exponents rigorously, we require that \(\tau _{\mathrm {R.C.}}= \infty \). This is not the case for the standard formulation of monitored random circuits, according to which measurements are performed randomly and independently at every site [2] leading to rank collapse in finite time. Previous work on models with rank collapse implicitly assumes a parametric separation of scales

$$\begin{aligned} \tau _{\textrm{P}} \ll \tau _{\mathrm {R.C.}} \end{aligned}$$
(16)

in L, and then estimates \(\tau _{\textrm{P}}\) numerically by simulating the system for times much shorter than \(\tau _{\mathrm {R.C.}}\) (see [8]). The downside of this approach is that there is an inherent “fuzziness” in the definition of \(\tau _{\textrm{P}}\) on the order of inevitable statistical fluctuations \(1/\sqrt{\tau _{\mathrm {R.C.}}}\) induced by rank collapse. While this is mostly a mathematical subtlety when it comes to estimating the purification time numerically (since for the canonical spatially local model \(\tau _{\mathrm {R.C.}} \sim 1/(2p)^L \gg 1\) by a mapping to bond percolation [4]), it becomes a serious obstacle even for numerical estimates of higher Lyapunov exponents \(\lambda _n\) with \(n > 2\), which succumb to rank reduction at much earlier times than \(\tau _{\mathrm {R.C.}}\).

We resolve this subtlety for monitored quantum dots by noting that the timescales \(\tau _{\mathrm {R.C.}}\) and \(\tau _{\textrm{P}}\) naturally pertain to two distinct microscopic models, with and without rank collapse respectively, for which the separation of scales Eq. (16) can be proved analytically as \(L \rightarrow \infty \). This then justifies attempting to estimate \(\tau _{\textrm{P}}\) numerically in the model with rank collapse. These models, which we refer to as “Model I” and “Model II” respectively, differ only in the spatial distribution of measurements in each measurement layer and are defined as follows.

For Model I, the measurements in each measurement layer are performed randomly and independently at each qubit with probability p, as for the standard formulation of the MIPT [2]. Then the number of measurements in each layer fluctuates and rank collapse occurs when all the qubits in a given layer are measured, which occurs with probability \(p^L\). For Model II, we perform a fixed number of measurements \(pL \in \{0,1,\ldots ,L\}\) per layer. Note that for the monitored quantum dots under consideration in this paper, the specific location of these measurements does not matter, by Haar randomness of the unitary layers.

For both models, we can determine the rank-collapse time in finite systems analytically, in contrast to the situation for spatially local models, for which closed forms are not easily obtained (even if the asymptotic behaviour is understood [4]). First consider Model I. From Eq. (9), the rank-collapse time is the expected value of the stopping time T such that \(r(T)=1\) and \(r(t) > 1\) for \(t < T\). Note that \(\mathbb {P}(T=t) = (1-p^L)^{t-1}p^L\). Thus

$$\begin{aligned} \tau _{\mathrm {R.C.,I}} = \mathbb {E}[T] = \sum _{t=1}^{\infty } t(1-p^L)^{t-1}p^L = \frac{1}{p^L}. \end{aligned}$$
(17)

Next consider Model II. In this case \(r(t) = 2^{(1-p)L} > 1\) almost surely for \(p <1\) and it follows that

$$\begin{aligned} \tau _{\mathrm {R.C.,II}} = \mathbb {E}[T] = \infty , \quad p <1. \end{aligned}$$
(18)

Below, we will restrict our attention to Model II with \(0<p<1\) and write \(M = 2^{(1-p)L}\) for the rank of its Kraus operators.
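The geometric stopping-time argument behind Eq. (17) can be checked with a short Monte Carlo sketch. The Python snippet below is illustrative only: the function names, the parameter values \(L=3\) and \(p=0.6\), the sample size and the seed are assumptions chosen so that the simulation runs quickly. Only the measurement pattern of Model I is simulated, since no unitaries are needed to decide when every qubit in a layer is measured.

```python
import random


def rank_collapse_time_model_I(L, p, rng):
    """Sample the first layer t in which all L qubits are measured (Model I)."""
    t = 0
    while True:
        t += 1
        # Each qubit is measured independently with probability p.
        if all(rng.random() < p for _ in range(L)):
            return t


def mean_rank_collapse_time(L, p, samples=20000, seed=0):
    """Monte Carlo estimate of the mean stopping time in Eq. (9) for Model I."""
    rng = random.Random(seed)
    total = sum(rank_collapse_time_model_I(L, p, rng) for _ in range(samples))
    return total / samples


L, p = 3, 0.6
estimate = mean_rank_collapse_time(L, p)
exact = 1.0 / p**L  # closed form of Eq. (17)
print(f"Monte Carlo: {estimate:.3f}, Eq. (17): {exact:.3f}")
```

With these settings the sample mean should agree with \(1/p^L\) up to statistical error. For Model II the analogous loop would never terminate when \(p<1\), consistent with Eq. (18).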

2.3 Mapping to Truncated Unitary and Ginibre Ensembles

The “truncated unitary ensembles” of random matrices [18] consist of square submatrices of Haar random matrices drawn from U(N). We can relate the Kraus operators Eq. (2) for Model II to products of \(M \times M\) truncated unitary matrices drawn from U(N) as follows. We first note that by Haar randomness and transitivity of U(N) on measurement outcomes, the statistics of singular values of \(\{\hat{K}_{\textbf{m}}(t)\}\) is unchanged if we fix the measurement outcomes \(\hat{P}_{\textbf{m}_j} = \hat{P}\) in each layer to be the same. Writing \(\overset{\mathrm {s.v.}}{\sim }\) for equality of singular value statistics, we have

$$\begin{aligned} \hat{K}_{\textbf{m}}(t) = \hat{P}_{\textbf{m}_t} \hat{U}_{t} \ldots \hat{P}_{\textbf{m}_1} \hat{U}_1 \overset{\mathrm {s.v.}}{\sim } \hat{P} \hat{U}_t \ldots \hat{P} \hat{U}_1 \end{aligned}$$
(19)

at each time step t. To proceed further, we use the fact that \(\hat{P}^2=\hat{P}\) to write

$$\begin{aligned} \hat{P} \hat{U}_t \ldots \hat{P} \hat{U}_1 = (\hat{P} \hat{U}_t \hat{P})(\hat{P}\hat{U}_{t-1}\hat{P}) \ldots (\hat{P} \hat{U}_2\hat{P}) \hat{P}\hat{U}_1, \end{aligned}$$
(20)

implying that

$$\begin{aligned} \hat{K}_{\textbf{m}}(t) \overset{\mathrm {s.v.}}{\sim } \hat{R}_{t} \hat{R}_{t-1} \ldots \hat{R}_2 \hat{S}_1, \end{aligned}$$
(21)

where in the computational basis, \(\hat{R}_j = \hat{P} \hat{U}_j \hat{P}\) is an M-by-M truncated unitary matrix padded by \(N-M\) rows and columns of zeros, with the same configuration of nonzero entries for each j, and \(\hat{S}_1 = \hat{P}\hat{U}_1\) has M unit singular values and \(N-M\) zero singular values, since

$$\begin{aligned} \hat{S}_1\hat{S}_1^\dagger = \hat{P}. \end{aligned}$$
(22)

It follows from this analysis that the singular values of \(\hat{K}_\textbf{m}(t)\) are distributed as the singular values of products of \(t-1\) truncated unitary matrices. While various properties of products of truncated unitary matrices have been derived analytically in the literature [19, 29,30,31], including explicit expressions for their Lyapunov spectrum that we will discuss further below, the simplest results are obtained in the limit \(M/N \rightarrow 0\), in which \(\hat{K}\) is distributed as a product of Ginibre random matrices [29]. In the physical context of monitored random circuits introduced above, this regime is realized in the thermodynamic limit \(L \rightarrow \infty \) in which the number of qubits tends to infinity at any fixed \(p>0\).

Intuitively, the emergence of Ginibre ensembles corresponds to the fact that as \(M/N \rightarrow 0\), the non-zero elements of \(\hat{R}_j\) look “random”, in the sense that they are asymptotically distributed as the elements of a Ginibre matrix, up to rescaling [32, 33]. To be precise, let \(\hat{A}_j\) denote an M-by-M complex Ginibre matrix whose elements are i.i.d. with probability distribution function \(f(v) = \frac{1}{\pi } e^{-|v|^2}\) and define

$$\begin{aligned} \hat{B}_j = \frac{1}{\sqrt{N}} \hat{A}_j. \end{aligned}$$
(23)

This choice of normalization ensures that in the absence of truncation, \(M=N\), the expected squared norm of each row and each column of \(\hat{B}_j\) coincides with the expected squared norm of each row and each column of an N-by-N unitary matrix. It can then be shown [29] that as \(M/N \rightarrow 0\), the product of Ginibre random matrices

$$\begin{aligned} \hat{G}(t) = \hat{B}_{t-1} \hat{B}_{t-2} \ldots \hat{B}_1, \quad \hat{G}(1) = \mathbb {1}_M, \end{aligned}$$
(24)

has the same probability distribution as the non-zero elements of \(\hat{R}_t \hat{R}_{t-1} \ldots \hat{R}_2\), and therefore that the non-zero singular values of \(\hat{K}_{\textbf{m}}(t)\) and \(\hat{G}(t)\) are identically distributed in the thermodynamic limit. We note that the spectral density of the product \(\hat{G}(t)\) has been studied in some detail [34, 35], and will make use of some of these results below.

2.4 Computation of Lyapunov Exponents and the Purification Time

We now compute the entire Lyapunov spectrum for Model II. This follows immediately from the mapping Eq. (21) from Model II to the products of truncated unitary matrices, combined with an expression for the Lyapunov exponents of such products derived recently [19, 30, 31]. Together, these results imply that the Lyapunov exponents for Model II are given by

$$\begin{aligned} \lambda _n = -\frac{1}{2}(\psi (N-n+1) - \psi (M-n+1)), \quad n=1,2,\ldots ,M, \end{aligned}$$
(25)

where \(\psi (x) = \Gamma '(x)/\Gamma (x)\) denotes the digamma function.

It follows from Eq. (25) that the inverse purification time for Model II

$$\begin{aligned} \tau _{\textrm{P,II}}^{-1} = 2(\lambda _1-\lambda _2) = \frac{1}{M-1}-\frac{1}{N-1} \end{aligned}$$
(26)

by the standard digamma function identity [36] \(\psi (n+1) = \psi (n) + \frac{1}{n}\). We deduce that

$$\begin{aligned} \lim _{L\rightarrow \infty } \frac{\log {\tau _{\textrm{P,II}}}}{L} = (1-p)\log {2} < - \log {p} = \lim _{L\rightarrow \infty } \frac{\log {\tau _{\mathrm {R.C.,I}}}}{L} \end{aligned}$$
(27)

for \(p < 1\), implying that

$$\begin{aligned} \tau _{\textrm{P,II}} \ll \tau _{\mathrm {R.C.,I}} \end{aligned}$$
(28)

for \(L \gg 1\) and \(p<1\), which is one way to make Eq. (16) precise. The functions of p appearing on either side of Eq. (27) are plotted in Fig. 2; it is clear that they coincide only in the approach to the “transition” as \(p \rightarrow 1^-\). In particular, we have proved that the purification time

$$\begin{aligned} \tau _{\textrm{P}} \sim M \sim 2^{(1-p)L}, \quad L \rightarrow \infty , \end{aligned}$$
(29)

grows exponentially in the system size, and having established Eq. (16) we henceforth suppress the subscript specifying Model II.

Fig. 2

The rank-collapse time for Model I (dashed red line) versus the purification time for Model II (solid blue line). It is clear from this plot that these timescales are exponentially well separated as \(L \rightarrow \infty \), except in the immediate vicinity of \(p=1\).
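Equations (25) and (26) are straightforward to evaluate numerically. The following Python sketch is an illustration under stated assumptions: it uses the identity \(\psi (a)-\psi (b) = H_{a-1}-H_{b-1}\) for positive integers a and b (with \(H_n\) the nth harmonic number) instead of a library digamma function, and the helper names and the choice \(L=10\) with five measured qubits per layer are hypothetical.

```python
import math


def harmonic(n):
    """Harmonic number H_n = sum_{k=1}^n 1/k, with H_0 = 0."""
    return sum(1.0 / k for k in range(1, n + 1))


def lyapunov_exponent(n, N, M):
    """Eq. (25), using psi(N-n+1) - psi(M-n+1) = H_{N-n} - H_{M-n}."""
    return -0.5 * (harmonic(N - n) - harmonic(M - n))


def inverse_purification_time(N, M):
    """Eq. (26): the gap 2 * (lambda_1 - lambda_2)."""
    return 2.0 * (lyapunov_exponent(1, N, M) - lyapunov_exponent(2, N, M))


L, measured = 10, 5                 # assumed example: p = 1/2
N, M = 2**L, 2**(L - measured)
tau_P_inv = inverse_purification_time(N, M)
closed_form = 1.0 / (M - 1) - 1.0 / (N - 1)

print("lambda_1 =", lyapunov_exponent(1, N, M))
print("1/tau_P from Eq. (25):", tau_P_inv, " closed form:", closed_form)
print("log(tau_P)/L =", math.log(1.0 / tau_P_inv) / L,
      " (1-p) log 2 =", (1 - measured / L) * math.log(2))
```

The two evaluations of \(\tau _{\textrm{P}}^{-1}\) should agree to floating-point precision, and \(\log (\tau _{\textrm{P}})/L\) should already be close to \((1-p)\log 2\) at this modest system size.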

2.5 The Distribution of Born Probabilities

Now consider applying Model II dynamics to an arbitrary initial pure state \(|\psi \rangle \), with a view to determining the distribution of Born probabilities as a function of both the density of measurements p and the circuit depth t. The Born probability of measuring a string of measurement outcomes \(\textbf{m}\) after time t can be written explicitly as \( p(\textbf{m}) = \Vert \hat{P}_{\textbf{m}_t} \hat{U}_{t} \ldots \hat{P}_{\textbf{m}_1} \hat{U}_1 |\psi \rangle \Vert ^2\). Introducing rotated projection operators \(\hat{P}'_{\textbf{m}_j} = \hat{U}_1^\dagger \ldots \hat{U}_{j}^\dagger \hat{P}_{\textbf{m}_j} \hat{U}_j \ldots \hat{U}_1\), we can write this as

$$\begin{aligned} p(\textbf{m}) = \Vert \hat{P}'_{\textbf{m}_t}\hat{P}'_{\textbf{m}_{t-1}}\ldots \hat{P}'_{\textbf{m}_1} |\psi \rangle \Vert ^2. \end{aligned}$$
(30)

This implies that

$$\begin{aligned} p(\textbf{m}) = p(\textbf{m}_t|\textbf{m}_{t-1}\ldots \textbf{m}_{1})p(\textbf{m}_{t-1}|\textbf{m}_{t-2}\ldots \textbf{m}_{1})\ldots p(\textbf{m}_{2}| \textbf{m}_{1})p(\textbf{m}_1), \end{aligned}$$
(31)

where

$$\begin{aligned} p(\textbf{m}_j|\textbf{m}_{j-1}\ldots \textbf{m}_{1}) = \frac{\Vert \hat{P}'_{\textbf{m}_j}\hat{P}'_{\textbf{m}_{j-1}}\ldots \hat{P}'_{\textbf{m}_1} |\psi \rangle \Vert ^2}{\Vert \hat{P}'_{\textbf{m}_{j-1}}\hat{P}'_{\textbf{m}_{j-2}}\ldots \hat{P}'_{\textbf{m}_1} |\psi \rangle \Vert ^2}. \end{aligned}$$
(32)

Note that so far we have assumed nothing about the distribution of unitary layers \(\hat{U}_j\). If we now assume that \(\hat{U}_j\) are Haar-random matrices on the full Hilbert space, a drastic simplification occurs and the measurement outcomes at distinct time steps become pairwise uncorrelated. Explicitly, we have

$$\begin{aligned} p(\textbf{m}_j|\textbf{m}_{j-1}\ldots \textbf{m}_1) \sim \frac{\sum _{k=1}^{M} |v_{j,k}|^2}{\sum _{k=1}^{N} |v_{j,k}|^2} \end{aligned}$$
(33)

for all j, where the \(v_{j,k}\) are i.i.d. complex Gaussian random variables with probability density function (p.d.f.) \(f(v) = \frac{1}{\pi }e^{-|v|^2}\). Note that sums of K such variables \(\sum _{k=1}^K |v_{j,k}|^2\) are gamma distributed, with p.d.f. \(f_K(x) = \frac{1}{\Gamma (K)}x^{K-1}e^{-x}\). Thus let \(X_{K,j}\) denote a set of independently distributed gamma random variables, each with p.d.f. \(f_K(x)\). Then

$$\begin{aligned} p(\textbf{m}_j|\textbf{m}_{j-1}\ldots \textbf{m}_1) \sim \frac{X_{M,j}}{X_{M,j} + X_{N-M,j}} \sim \textrm{Beta}(M,N-M) \end{aligned}$$
(34)

follows a beta distribution with parameters M and \(N-M\) and p.d.f.

$$\begin{aligned} f_{M,N-M}(x) = \frac{\Gamma (N)}{\Gamma (M)\Gamma (N-M)}x^{M-1}(1-x)^{N-M-1}. \end{aligned}$$
(35)

The usual Porter–Thomas distribution for random unitary circuits [21] corresponds to measuring every qubit after each unitary layer, and is recovered from Eqs. (34) and (35) in the limit that \(p=1\) and \(M=1\), for which

$$\begin{aligned} p_{\mathrm {Porter{-}Thomas}}(\textbf{m}_j|\textbf{m}_{j-1}\ldots \textbf{m}_1) \sim \textrm{Beta}(1,N-1) \end{aligned}$$
(36)

is asymptotically exponentially distributed with mean 1/N for large N.
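The exponential limit of \(\textrm{Beta}(1,N-1)\) can be made concrete by comparing survival functions. The sketch below is illustrative: it uses the fact that Eq. (35) with \(M=1\) reduces to the p.d.f. \((N-1)(1-x)^{N-2}\), whose survival function is \((1-x)^{N-1}\), and compares this with \(e^{-Nx}\) on a grid \(x=c/N\). The value \(N=2^{10}\), the grid and the function name are assumptions made for the example.

```python
import math


def beta_1_survival(x, N):
    """P(Y > x) for Y ~ Beta(1, N-1), obtained by integrating (N-1)(1-x)^(N-2)."""
    return (1.0 - x) ** (N - 1)


N = 2**10
grid = [0.1 * k for k in range(0, 101)]  # c = N x between 0 and 10

# Largest deviation from the exponential survival function exp(-N x).
max_gap = max(abs(beta_1_survival(c / N, N) - math.exp(-c)) for c in grid)
print("largest deviation from exp(-N x):", max_gap)
```

The deviation is of order 1/N, so the exponential (Porter–Thomas) form becomes accurate rapidly as the number of qubits grows.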

Let us now return to the original problem of determining the probability distribution function of \(p(\textbf{m})\) for Model II. We showed above that

$$\begin{aligned} p(\textbf{m}) \sim \prod _{j=1}^t Y_j, \end{aligned}$$
(37)

for i.i.d. beta random variables \(Y_j \sim \textrm{Beta}(M,N-M)\). This implies in particular that

$$\begin{aligned} \log p(\textbf{m}) \sim \sum _{j=1}^t \log {Y_j}, \end{aligned}$$
(38)

from which log-normality of \(p(\textbf{m})\) at long times is immediate. It follows by the central limit theorem that

$$\begin{aligned} \log p(\textbf{m}) \rightarrow \mu t + \varsigma t^{1/2} Z_t,\quad t \rightarrow \infty , \end{aligned}$$
(39)

in distribution, where the mean

$$\begin{aligned} \mu = \psi (M)-\psi (N) < 0, \end{aligned}$$
(40)

the variance

$$\begin{aligned} \varsigma ^2 = \psi '(M)-\psi '(N) > 0, \end{aligned}$$
(41)

and \(Z_t \sim \mathcal {N}(0,1)\) is a unit normal random variable. Thus we have shown that the Born probabilities \(p({\textbf{m}})\) for Model II are asymptotically log-normally distributed, with a shape that is determined by the parameters N, M and t.
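Equations (37) to (41) lend themselves to a direct sampling check. The following Python sketch is a minimal illustration, not the procedure used in this paper: it draws products of beta variates with the standard-library generator `random.betavariate` and compares the sample mean and variance of \(\log p(\textbf{m})\) with \(\mu t\) and \(\varsigma ^2 t\). The values \(N=16\), \(M=4\), \(t=5\), the sample size and the seed are assumptions, and the digamma and trigamma differences are evaluated at integers through finite sums.

```python
import math
import random


def harmonic(n, power=1):
    """Generalized harmonic number sum_{k=1}^n 1/k^power."""
    return sum(1.0 / k**power for k in range(1, n + 1))


def sample_log_born(N, M, t, rng):
    """One draw of log p(m) = sum_j log Y_j with Y_j ~ Beta(M, N-M), Eq. (38)."""
    return sum(math.log(rng.betavariate(M, N - M)) for _ in range(t))


N, M, t = 16, 4, 5   # assumed example: L = 4 qubits, 2 measured per layer
rng = random.Random(1)
draws = [sample_log_born(N, M, t, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / (len(draws) - 1)

mu = harmonic(M - 1) - harmonic(N - 1)            # psi(M) - psi(N), Eq. (40)
sigma2 = harmonic(N - 1, 2) - harmonic(M - 1, 2)  # psi'(M) - psi'(N), Eq. (41)

print("mean of log p:", mean, " mu * t:", mu * t)
print("variance of log p:", var, " sigma^2 * t:", sigma2 * t)
```

Because the \(Y_j\) are independent, the mean and variance of \(\log p(\textbf{m})\) equal \(\mu t\) and \(\varsigma ^2 t\) at every t, not only asymptotically; the central limit theorem is needed only for the Gaussian shape in Eq. (39).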

Exact expressions for the distributions of Born probabilities can be obtained as follows. Denoting the characteristic function of a given random variable W by \(\varphi _W(\theta ) = \mathbb {E}[e^{i\theta W}]\), it follows by Eq. (37) that

$$\begin{aligned} \varphi _{\log {p(\textbf{m})}}(\theta ) = \mathbb {E}[e^{i\theta \sum _{j=1}^t \log {Y_j}}] = \prod _{j=1}^t \mathbb {E}[e^{i\theta \log {Y}_j}] = [\varphi _{\log {Y_1}}(\theta )]^t. \end{aligned}$$
(42)

The characteristic function of the logarithm of a beta random variable with p.d.f. Eq. (35) is given by

$$\begin{aligned} \varphi _{\log {Y_1}}(\theta ) = \frac{\Gamma (N)}{\Gamma (M)}\frac{\Gamma (M+i\theta )}{\Gamma (N+i\theta )}, \end{aligned}$$
(43)

implying that the p.d.f. for the random variable \(\log {p(\textbf{m})}\) after t time steps is given by

$$\begin{aligned} f_{\log {p(\textbf{m})}}(x) = \left( \frac{\Gamma (N)}{\Gamma (M)}\right) ^t\int _{-\infty }^{\infty } \frac{d\theta }{2\pi } e^{- i\theta x} \left( \frac{\Gamma (M+i\theta )}{\Gamma (N+i\theta )}\right) ^t, \quad x < 0. \end{aligned}$$
(44)

Note that the constraint \(x < 0\) closes the contour in the upper half-plane. This contour encloses all the poles of the integrand, since by the recurrence relation for Gamma functions the integrand has \(N-M\) distinct poles of order t,

$$\begin{aligned} g(\theta ) = \left( \frac{\Gamma (M+i\theta )}{\Gamma (N+i\theta )}\right) ^t = i^{-(N-M)t}\prod _{j=0}^{N-M-1} \frac{1}{(\theta - i(M+j))^t}, \end{aligned}$$
(45)

which are evenly spaced along the positive imaginary axis, \( \theta _j = i(M+j)\) for \(j=0,1,\ldots ,N-M-1\). Thus

$$\begin{aligned} f_{\log {p(\textbf{m})}}(x) = \left( \frac{\Gamma (N)}{\Gamma (M)}\right) ^t \sum _{j=0}^{N-M-1} i \textrm{Res}[e^{-i\theta x}g(\theta );\theta _j] \end{aligned}$$
(46)

with \(g(\theta )\) given by Eq. (45).

In principle, this calculation yields an exact expression for the p.d.f. \(f_{\log {p(\textbf{m})}}(x)\) of log Born probabilities for all times. In practice, these expressions rapidly become cumbersome. At \(t=1\) we recover Eq. (35) up to the necessary change of variables, while for \(t=2\) we find that

$$\begin{aligned} f_{\log {p(\textbf{m})}}(x)= & {} \left( \frac{\Gamma (N)}{\Gamma (M)}\right) ^2 e^{Mx} \sum _{j=0}^{N-M-1} \frac{e^{jx}}{\Gamma (j+1)^2 \Gamma (N-M-j)^2}\nonumber \\{} & {} \times \left[ 2(\psi (j+1) - \psi (N-M-j)) -x\right] . \end{aligned}$$
(47)
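As a sanity check on Eq. (47), the following minimal sketch evaluates the \(t=2\) density numerically and confirms that it is normalized on \(x<0\) and has mean \(2(\psi (M)-\psi (N))\), as expected for the logarithm of a product of two independent beta variables. The function names, the choice \(N=8\), \(M=4\), and the Simpson quadrature are illustrative assumptions rather than part of the derivation; digamma differences at integer arguments are written as harmonic-number differences.

```python
import math

def harmonic(n):
    """H_n = sum_{k=1}^n 1/k, so psi(a + 1) - psi(b + 1) = H_a - H_b for integers."""
    return sum(1.0 / k for k in range(1, n + 1))

def log_born_pdf_t2(x, N, M):
    """PDF of log p(m) after t = 2 steps, transcribed from Eq. (47); valid for x <= 0."""
    n = N - M  # number of second-order poles
    prefactor = (math.factorial(N - 1) / math.factorial(M - 1)) ** 2
    total = 0.0
    for j in range(n):
        weight = 1.0 / (math.factorial(j) * math.factorial(n - 1 - j)) ** 2
        bracket = 2.0 * (harmonic(j) - harmonic(n - 1 - j)) - x
        total += weight * math.exp(j * x) * bracket
    return prefactor * math.exp(M * x) * total

def simpson(f, a, b, steps):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    acc = f(a) + f(b)
    for k in range(1, steps):
        acc += (4 if k % 2 else 2) * f(a + k * h)
    return acc * h / 3.0

# Illustrative (hypothetical) parameters: N = 8, M = 4.
N, M = 8, 4
norm = simpson(lambda x: log_born_pdf_t2(x, N, M), -40.0, 0.0, 8000)
mean = simpson(lambda x: x * log_born_pdf_t2(x, N, M), -40.0, 0.0, 8000)
expected_mean = -2.0 * sum(1.0 / k for k in range(M, N))  # 2 * (psi(M) - psi(N))
```

The lower cutoff \(x=-40\) is harmless here because the density decays as \(e^{Mx}\).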

For all times, Eq. (37) implies that

$$\begin{aligned} \mathbb {E}[\log {p(\textbf{m}})] = \mu t \sim \log {\left( \frac{1}{2^{pLt}}\right) }, \quad L \rightarrow \infty , \end{aligned}$$
(48)

in the large system limit, and similarly that

$$\begin{aligned} \textrm{Var}[\log {p(\textbf{m}})] = \varsigma ^2 t \sim \frac{t}{\tau _P}, \quad L \rightarrow \infty . \end{aligned}$$
(49)

Thus we have proved that for large systems, the “typical” Born probability, i.e. the mean of \(\log {p(\textbf{m})}\), coincides with the uniform distribution over all \(2^{pLt}\) possible measurement outcomes up to time t, while the distribution of \(p(\textbf{m})\) remains narrow on a log scale until times comparable to the purification time.

2.6 Dynamics of Rényi Entropies

We now illustrate how the above results provide an analytical handle on time evolution along quantum trajectories, starting from a maximally mixed initial state as in Eq. (7). We will focus on the behaviour of Rényi entropies with \(\alpha >0\) (including the von Neumann entropy defined as the limit \(\alpha \rightarrow 1\)) along a given quantum trajectory, which can be expressed in terms of the corresponding Kraus operator’s singular values as [10]

$$\begin{aligned} S^{(\alpha )}_{\textbf{m}}(t) = \frac{1}{1-\alpha } \log {\textrm{Tr}[\rho ^\alpha _{\textbf{m}}(t)]} = \frac{1}{1-\alpha } \log {\sum _{n=1}^M\left( \frac{\sigma _n^{2}(t)}{\sum _{n'=1}^M\sigma _{n'}^{2}(t)}\right) ^{\alpha }}. \end{aligned}$$
(50)

We discuss the limits of short and long times separately. Our analysis is similar in spirit to that of Ref. [10], with a greater degree of analytical control owing to the results of previous sections.

2.6.1 Short Times

We assume throughout this section that \(L \gg 1\), so that the approximation of Kraus operators by a product \(\hat{G}(t)\) of \(t-1\) Ginibre matrices as in Eq. (24) holds, and let \(\rho (\sigma ,t) = \sum _{n=1}^M \delta (\sigma -\sigma _n(t))\) denote the density of singular values of \(\hat{G}(t)\). Then the Rényi entropies are determined by the ratios

$$\begin{aligned} \sum _{n=1}^M\left( \frac{\sigma _n^{2}(t)}{\sum _{n'=1}^M\sigma _{n'}^{2}(t)}\right) ^{\alpha } = \frac{\int _0^\infty d\sigma \, \rho (\sigma ,t)\sigma ^{2\alpha }}{\left( \int _0^\infty d\sigma \, \rho (\sigma ,t)\sigma ^2\right) ^\alpha }. \end{aligned}$$
(51)

Let us write

$$\begin{aligned} \rho (\sigma ,t) = \bar{\rho }(\sigma ,t) + \delta \rho (\sigma ,t), \end{aligned}$$
(52)

where \(\bar{\rho }(\sigma ,t) = \mathbb {E}[\rho (\sigma ,t)]\) denotes the mean of \(\rho (\sigma ,t)\) and \(\delta \rho (\sigma ,t)\) its fluctuations about the mean. Similarly, we write \(m^{(\alpha )} = \int _0^\infty d\sigma \, \bar{\rho }(\sigma ,t) \sigma ^{2\alpha }\) and \(\delta m^{(\alpha )} = \int _0^\infty d\sigma \, \delta \rho (\sigma ,t) \sigma ^{2\alpha }\). Then the logarithm of the ratio

$$\begin{aligned} \log {\frac{\int _0^\infty d\sigma \, \rho (\sigma ,t)\sigma ^{2\alpha }}{\left( \int _0^\infty d\sigma \, \rho (\sigma ,t)\sigma ^2\right) ^\alpha }} = \log {\left( \frac{m^{(\alpha )}}{(m^{(1)})^{\alpha }}\right) } + \left( \frac{\delta m^{(\alpha )}}{m^{(\alpha )}} - \alpha \frac{\delta m^{(1)}}{m^{(1)}}\right) + \mathcal {O}(\delta \rho ^2) \end{aligned}$$
(53)

is dominated by the moments of the mean spectral density, provided the leading fluctuation correction \(\left( \frac{\delta m^{(\alpha )}}{m^{(\alpha )}} - \alpha \frac{\delta m^{(1)}}{m^{(1)}}\right) \) is small. Let us assume this at times that are short compared to the purification time (for an analytical argument that such short-time fluctuations are small in the context of weak measurements, see Sect. 3.4). Then the Rényi entropies are dominated by their non-fluctuating part

$$\begin{aligned} \bar{S}^{(\alpha )}(t) = \frac{1}{1-\alpha } \log {\left( \frac{m^{(\alpha )}}{(m^{(1)})^{\alpha }}\right) }. \end{aligned}$$
(54)

In the regime of times \(1 \ll t \ll M\), we have [35]

$$\begin{aligned} \bar{\rho }(\sigma ,t) \approx {\left\{ \begin{array}{ll} \frac{2N^{1-\frac{1}{t}}\sigma ^{\frac{2}{t}-1}}{t}, &{} \sigma ^2 < M \left( \frac{M}{N}\right) ^{t-1}, \\ 0, &{} \sigma ^2 > M \left( \frac{M}{N}\right) ^{t-1}, \end{array}\right. } \end{aligned}$$
(55)

yielding

$$\begin{aligned} m^{(\alpha )} \approx \frac{1}{\alpha t + 1}\frac{M^{\alpha t+1}}{N^{\alpha (t-1)}}, \end{aligned}$$
(56)

which predicts that

$$\begin{aligned} \bar{S}^{(\alpha )}(t) \approx \log M - \log t + \mathcal {O}(t^0), \quad 1 \ll t \ll M, \end{aligned}$$
(57)

for all \(\alpha > 0\), recovering a result that had been previously derived in various specific limits using various distinct physical arguments [7, 10, 23, 37] and is consistent with our precise definition of the purification time in Eqs. (15) and (29).
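For the reader's convenience, the step from Eq. (56) to Eq. (57) can be spelled out as follows. This is a sketch that only substitutes Eq. (56) into Eq. (54); the \(N\)-dependence cancels in the ratio:

$$\begin{aligned} \frac{m^{(\alpha )}}{(m^{(1)})^{\alpha }} \approx \frac{(t+1)^{\alpha }}{\alpha t+1}\, M^{1-\alpha }, \qquad \bar{S}^{(\alpha )}(t) \approx \log M + \frac{\alpha \log (t+1) - \log (\alpha t+1)}{1-\alpha } \approx \log M - \log t - \frac{\log \alpha }{1-\alpha }, \end{aligned}$$

where the last form holds for \(t \gg 1\) and \(\alpha \ne 1\), and identifies the \(\mathcal {O}(t^0)\) term in Eq. (57) within this approximation.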

For \(\alpha = 2,3, \ldots ,M\) (but not the von Neumann entropy) a more refined estimate is available, using the exact result [34]

$$\begin{aligned} m^{(\alpha )} = \frac{1}{\alpha !}\sum _{r=0}^{\alpha -1}(-1)^r \begin{pmatrix} \alpha -1 \\ r \end{pmatrix} \left[ \frac{(M-r-1+\alpha )!}{(M-r-1)!} \right] ^{t} \end{aligned}$$
(58)

for such integer moments. This yields

$$\begin{aligned} \bar{S}^{(\alpha )}(t) = \frac{1}{1-\alpha } \log {\left( \frac{1}{\alpha ! M^{\alpha t}} \sum _{r=0}^{\alpha -1}(-1)^r \begin{pmatrix} \alpha -1 \\ r \end{pmatrix} \left[ \frac{(M-r-1+\alpha )!}{(M-r-1)!} \right] ^{t}\right) } \end{aligned}$$
(59)

for the non-fluctuating contribution to the Rényi entropies. The simplest non-trivial case of this formula is the second Rényi entropy for which

$$\begin{aligned} \bar{S}^{(2)}(t) = - \log \left( \frac{1}{2}\left( (1+1/M)^t - (1-1/M)^t\right) \right) , \end{aligned}$$
(60)

which is easily verified to recover Eq. (57) in the limit \(t \ll M\).
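The statements around Eqs. (59) and (60) are easy to check numerically. The sketch below evaluates Eq. (59) with exact integer arithmetic and compares it with the closed form Eq. (60) and with the leading behaviour \(\log M - \log t\); the function names and the parameter values \(M=10^6\), \(t=100\) are illustrative assumptions.

```python
import math

def renyi_bar(alpha, t, M):
    """Non-fluctuating Renyi entropy of Eq. (59) for integer alpha >= 2."""
    total = 0
    for r in range(alpha):
        # (M - r - 1 + alpha)! / (M - r - 1)! as an exact integer product
        ratio = math.prod(range(M - r, M - r + alpha))
        total += (-1) ** r * math.comb(alpha - 1, r) * ratio ** t
    # math.log accepts arbitrarily large Python integers
    log_arg = (math.log(total) - math.log(math.factorial(alpha))
               - alpha * t * math.log(M))
    return log_arg / (1 - alpha)

def renyi2_closed(t, M):
    """Closed form of Eq. (60) for the second Renyi entropy."""
    return -math.log(0.5 * ((1 + 1 / M) ** t - (1 - 1 / M) ** t))

M, t = 10 ** 6, 100  # hypothetical values with t << M
s2_exact = renyi_bar(2, t, M)
s2_closed = renyi2_closed(t, M)
s2_leading = math.log(M) - math.log(t)
```

At \(t=1\) the same function returns \(\log M\), the entropy of a maximally mixed state on an \(M\)-dimensional space.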

2.6.2 Long Times

We next consider the dynamics of Rényi entropies at times that are long compared to the purification time, for which the leading singular values of the Kraus operators are well separated from one another with high probability, so that the smooth mean density Eq. (55) no longer provides a good approximation to their distribution. First note that as \(t \rightarrow \infty \) the squared singular values are distributed as [19, 30]

$$\begin{aligned} \sigma _n^2(t) \sim e^{2Y_nt}, \end{aligned}$$
(61)

where the \(Y_n\) are independent normal variables with mean \(\lambda _n\) and variance

$$\begin{aligned} \varsigma _n^2 = \frac{1}{4t}(\psi '(M-n+1)-\psi '(N-n+1)). \end{aligned}$$
(62)

In particular, the Rényi entropies and von Neumann entropies are dominated by the largest two singular values with high probability as \(t \rightarrow \infty \), implying that

$$\begin{aligned} S_{\textbf{m}}^{(\alpha )}(t) \sim {\left\{ \begin{array}{ll} \frac{\alpha }{\alpha -1} \nu (t), &{} \alpha > 1, \\ -\nu (t) \log {\nu (t)}, &{} \alpha = 1, \\ \frac{1}{1-\alpha } \nu (t)^{\alpha }, &{} 0<\alpha < 1, \end{array}\right. } \end{aligned}$$
(63)

in terms of the ratio \(\nu (t)\) defined in Eq. (14) (similar expressions were obtained in Appendix E of Ref. [10]). By our definition of the purification time Eq. (15), it follows that

$$\begin{aligned} \mathbb {E}[\log {S_{\textbf{m}}^{(\alpha )}(t)}] \sim {\left\{ \begin{array}{ll} -\frac{t}{\tau _\textrm{P}}, &{} \alpha \ge 1, \\ -\frac{\alpha t}{\tau _\textrm{P}}, &{} 0<\alpha < 1, \end{array}\right. } \end{aligned}$$
(64)

as \(t \rightarrow \infty \). Thus we have constructed a solvable model that confirms earlier proposals for the late-time dynamics of entropy in entangling phases of spatially local monitored systems [7, 10, 23], together with the idea that at long times \(t \gg \tau _{\textrm{P}}\), the purification time \(\tau _{\textrm{P}}\) captures the typical [10] behaviour of a broad (and to a first approximation log-normal [30, 38]) distribution of \(S_{\textbf{m}}^{(\alpha )}(t)\).
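The late-time asymptotics can be illustrated with a two-level toy spectrum. The sketch below assumes that only the two largest squared singular values matter, with ratio \(\nu \ll 1\), and compares the resulting entropies with the leading behaviours \(\frac{\alpha }{\alpha -1}\nu \), \(\nu \log (1/\nu )\) and \(\frac{1}{1-\alpha }\nu ^\alpha \); the function names and the value \(\nu =10^{-6}\) are illustrative assumptions.

```python
import math

def two_level_renyi(alpha, nu):
    """Renyi entropy of the normalized two-level spectrum (1, nu) / (1 + nu)."""
    p1, p2 = 1.0 / (1.0 + nu), nu / (1.0 + nu)
    if alpha == 1:
        return -(p1 * math.log(p1) + p2 * math.log(p2))
    return math.log(p1 ** alpha + p2 ** alpha) / (1.0 - alpha)

def leading_order(alpha, nu):
    """Leading small-nu behaviour in the three regimes of alpha."""
    if alpha > 1:
        return alpha / (alpha - 1.0) * nu
    if alpha == 1:
        return nu * math.log(1.0 / nu)
    return nu ** alpha / (1.0 - alpha)

nu = 1e-6  # hypothetical ratio of the two largest squared singular values
ratios = {a: two_level_renyi(a, nu) / leading_order(a, nu) for a in (2, 1, 0.5)}
```

The ratio for \(\alpha =1\) approaches unity only logarithmically, since the first correction to \(\nu \log (1/\nu )\) is of order \(\nu \).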

Unfortunately, a more detailed characterization of the late-time distributions of \(\nu (t)\) and \(S_{\textbf{m}}^{(\alpha )}(t)\), including their mean (rather than typical) values, is beyond the scope of the long-time prediction Eq. (61) and related expressions [38], because these results do not contain information about the joint distribution of \(\sigma _1^2(t)\) and \(\sigma ^2_2(t)\) that would be necessary to accurately model \(\nu (t)\). We therefore turn to a model with weak measurements, for which the joint distribution function of singular values can be obtained analytically.

3 Weak Measurements

3.1 Model and Fokker–Planck Equation

We now consider a generalization of Model II defined in Sect. 2.2 above, whose only distinction from Model II is that its measurement layers consist of independent weak measurements on pL qubits per layer, which act on a single qubit as

$$\begin{aligned} \hat{P}_{\uparrow } = \frac{1}{2} \begin{pmatrix} 1+ \epsilon &{} 0 \\ 0 &{} 1-\epsilon \end{pmatrix}, \quad \hat{P}_{\downarrow } = \frac{1}{2} \begin{pmatrix} 1-\epsilon &{} 0 \\ 0 &{} 1+\epsilon \end{pmatrix}, \end{aligned}$$
(65)

where we fix \(0\le \epsilon \le 1\). We again write the Kraus operators along a single quantum trajectory at time t as \( \hat{K}_{\textbf{m}}(t) = \hat{P}_{\textbf{m}_t} \hat{U}_t \ldots \hat{P}_{\textbf{m}_1}\hat{U}_1\), where \(\hat{P}_{\textbf{m}_j}\) now denotes the tensor product of pL weak measurement operators as in Eq. (65), and the identity on the remaining \((1-p)L\) qubits. Regardless of the specific set of measurement outcomes, which we mostly suppress for economy of notation, we can write

$$\begin{aligned} \hat{P}_{\textbf{m}_{j}} = \frac{1}{2^{pL}} (\hat{\mathbb {1}}+\hat{\Lambda }_{j}), \end{aligned}$$
(66)

where \(\hat{\Lambda }_{j}\) has eigenvalues

$$\begin{aligned} l_n = (1+\epsilon )^n(1-\epsilon )^{pL-n}-1, \quad n=0,1,\ldots ,pL, \end{aligned}$$
(67)

with respective multiplicities

$$\begin{aligned} d_n = \begin{pmatrix} pL \\ n \end{pmatrix} 2^{(1-p)L}. \end{aligned}$$
(68)

It follows from these expressions that

$$\begin{aligned} \textrm{tr}[\hat{\Lambda }_{j}] = 0, \quad \textrm{tr}[\hat{\Lambda }_{j}^2] = 2^L\left[ (1+\epsilon ^2)^{pL}-1\right] . \end{aligned}$$
(69)
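The trace identities in Eq. (69) follow from the binomial theorem, and can be confirmed directly from the spectrum in Eqs. (67) and (68). The following is a minimal sketch with hypothetical parameter values (\(L=10\), \(pL=4\), \(\epsilon =0.1\)); the function name is our own.

```python
import math

def lambda_spectrum(L, n_meas, eps):
    """Eigenvalues l_n and multiplicities d_n of Lambda, Eqs. (67)-(68); n_meas = pL."""
    spectrum = []
    for n in range(n_meas + 1):
        l_n = (1 + eps) ** n * (1 - eps) ** (n_meas - n) - 1
        d_n = math.comb(n_meas, n) * 2 ** (L - n_meas)
        spectrum.append((l_n, d_n))
    return spectrum

L, n_meas, eps = 10, 4, 0.1  # illustrative values only
spec = lambda_spectrum(L, n_meas, eps)
dim = sum(d for _, d in spec)                 # should equal 2**L
trace = sum(d * l for l, d in spec)           # tr[Lambda]
trace_sq = sum(d * l * l for l, d in spec)    # tr[Lambda^2]
predicted = 2 ** L * ((1 + eps ** 2) ** n_meas - 1)
```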

We would like to understand the evolution of the singular values \(\sigma _1(t) \ge \cdots \ge \sigma _N(t) \ge 0\) of \(\hat{K}_{\textbf{m}}(t)\). Thus consider the singular value decomposition \(\hat{K}_{\textbf{m}}(t) = \hat{V}_t\hat{D}_t \hat{W}^\dagger _t\) as in Eq. (8). Then

$$\begin{aligned} \hat{K}_{\textbf{m}}(t+1)\hat{K}_{\textbf{m}}(t+1)^\dagger = \hat{P}_{t+1} \hat{U}_{t+1} \hat{V}_{t} \hat{D}^2_t \hat{V}_t^\dagger \hat{U}_{t+1}^\dagger \hat{P}_{t+1}. \end{aligned}$$
(70)

Letting \(\hat{U} = \hat{U}_{t+1} \hat{V}_{t}\), which inherits the Haar randomness of \(\hat{U}_{t+1}\), we can write

$$\begin{aligned} \hat{U}^\dagger \hat{K}_{\textbf{m}}(t+1) \hat{K}_{\textbf{m}}(t+1)^\dagger \hat{U} = (\hat{U}^\dagger \hat{P}_{t+1} \hat{U})\hat{D}^2_t (\hat{U}^\dagger \hat{P}_{t+1} \hat{U}). \end{aligned}$$
(71)

Introducing the operators \(\hat{B} = \hat{U}^\dagger \hat{\Lambda }_{t+1} \hat{U}\) and \(\hat{X}_t = 2^{2pLt}\hat{K}_{\textbf{m}}(t)\hat{K}_{\textbf{m}}(t)^\dagger \), we have

$$\begin{aligned} \hat{X}_{t+1} \overset{\mathrm {s.v.}}{\sim } (\hat{\mathbb {1}}+\hat{B}) \hat{X}_t (\hat{\mathbb {1}}+\hat{B}) = \hat{X}_t + \hat{B}\hat{X}_t + \hat{X}_t\hat{B} + \hat{B} \hat{X}_t \hat{B}. \end{aligned}$$
(72)

where as above \(\overset{\mathrm {s.v.}}{\sim }\) denotes a common distribution of singular values (which are eigenvalues in this case). Let \(x_1(t) \ge \cdots \ge x_N(t) \ge 0\) denote the eigenvalues of \(\hat{X}_t\), which are given by \(x_n(t) = 2^{2pLt}\sigma _n^2(t)\) in terms of the singular values \(\sigma _n(t)\) of the Kraus operators. Then, if \(pL\epsilon ^2 \ll 1\) so that typical eigenvalues of \(\Lambda \) are small assuming large pL, Eq. (72) can be expanded perturbatively in \(\epsilon \) to yield

$$\begin{aligned} x_{n}(t+1)= & {} x_n(t) + B_{nn} x_n(t) + x_n(t) B_{nn} + \sum _{m} |B_{nm}|^2 x_m(t) \nonumber \\{} & {} + \sum _{m \ne n} |B_{nm}|^2 \frac{(x_n(t)+x_m(t))^2}{x_n(t)-x_m(t)} + \mathcal {O}(\epsilon ^3) \end{aligned}$$
(73)
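The way in which the last two sums of Eq. (73) combine rests on a single algebraic identity, which we record here as a sketch:

$$\begin{aligned} \frac{(x_n+x_m)^2}{x_n-x_m} = (x_n - x_m) + \frac{4x_nx_m}{x_n-x_m}. \end{aligned}$$

For \(m \ne n\), the \(-x_m\) piece cancels against the corresponding term of \(\sum _m |B_{nm}|^2 x_m\), whose \(m=n\) term supplies \(|B_{nn}|^2 x_n\), so that every surviving term is proportional to \(x_n(t)\).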

Collecting terms yields a perturbative model for the time evolution of the eigenvalues \(x_n(t)\), namely

$$\begin{aligned} x_n(t+1)-x_n(t) = \left( 2B_{nn}+ \sum _{m} |B_{nm}|^2 + 4 \sum _{m \ne n} |B_{nm}|^2 \frac{x_m(t)}{x_n(t) - x_m(t)}\right) x_n(t). \end{aligned}$$
(74)

This equation splits naturally into multiplicative “noise” and “drift” contributions, given by

$$\begin{aligned} \eta _n(t+1) = 2 B_{nn}, \quad \Delta _n(t+1) = \sum _{m} |B_{nm}|^2 + 4 \sum _{m \ne n} |B_{nm}|^2 \frac{x_m(t)}{x_n(t) - x_m(t)}. \end{aligned}$$
(75)

Thus

$$\begin{aligned} x_n(t+1)-x_n(t) = \left( \eta _n(t+1) + \Delta _n(t+1)\right) x_n(t). \end{aligned}$$
(76)

To eliminate multiplicative noise, we write \(x_n(t) = e^{y_n(t)}\) and, again assuming that \(\epsilon \) is small, find that

$$\begin{aligned} y_n(t+1)-y_n(t) = \log {\left( 1 + \eta _n(t+1) + \Delta _n(t+1)\right) } = \eta _n(t+1) + \mathcal {D}_n(t+1)+ \mathcal {O}(\epsilon ^3), \end{aligned}$$
(77)

where the new drift term

$$\begin{aligned} \mathcal {D}_n(t+1) = \Delta _n(t+1) - \frac{1}{2}\eta _n(t+1)^2. \end{aligned}$$
(78)

To proceed further, we note that the matrix elements

$$\begin{aligned} B_{nn} =\sum _{a} U_{an}U^*_{an}(\Lambda (t+1))_{aa} \end{aligned}$$
(79)

and

$$\begin{aligned} |B_{mn}|^2 = \sum _{a,b} U_{an}U_{bm}U^*_{am}U^*_{bn} (\Lambda (t+1))_{aa}(\Lambda (t+1))_{bb} \end{aligned}$$
(80)

are amenable to Haar averaging. In particular, the two-point function

$$\begin{aligned} \langle U_{an} U^*_{an} \rangle = \frac{1}{N}, \end{aligned}$$
(81)

and the four-point functions

$$\begin{aligned} \langle U_{an} U_{bm} U^*_{an} U^*_{bm} \rangle&= \frac{1}{N^2-1}(1+\delta _{ab}\delta _{mn}) - \frac{1}{N(N^2-1)}(\delta _{mn}+\delta _{ab}), \end{aligned}$$
(82)
$$\begin{aligned} \langle U_{an}U_{bm}U^*_{am}U^*_{bn} \rangle&= \frac{1}{N^2-1}(\delta _{mn}+\delta _{ab}) -\frac{1}{N(N^2-1)} (1+\delta _{ab}\delta _{mn}), \end{aligned}$$
(83)

by standard results [39], where angle brackets \(\langle \ldots \rangle \) denote the Haar average over \(\hat{U} \in U(N)\), implying that

$$\begin{aligned} \langle \eta _n(t) \rangle = \frac{2}{N}\textrm{tr}[\hat{\Lambda }] =0 \end{aligned}$$
(84)

and

$$\begin{aligned} \langle \eta _m(t_1)\eta _n(t_2) \rangle = \Gamma \delta _{t_1t_2} \delta _{mn} \end{aligned}$$
(85)

where the noise strength

$$\begin{aligned} \Gamma = \frac{4}{N^2-1}\left( 1-\frac{1}{N}\right) \textrm{tr}[\hat{\Lambda }^2] \approx \frac{4}{N^2} \textrm{tr}[\hat{\Lambda }^2] \end{aligned}$$
(86)

for \(L \gg 1\). Similarly \( \langle |B_{mn}|^2 \rangle = \frac{1}{N^2-1} \left( 1-\frac{\delta _{mn}}{N}\right) \textrm{tr}[\hat{\Lambda }^2] \approx \frac{\Gamma }{4}\), implying that the average value of the drift term (conditioned on the circuit realization at time t) is given by

$$\begin{aligned} \langle \mathcal {D}_n(t+1) \rangle \approx \frac{\Gamma (N-2)}{4} + \Gamma \sum _{m\ne n} \frac{e^{y_m(t)}}{e^{y_n(t)}- e^{y_m(t)}}. \end{aligned}$$
(87)
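Explicitly, Eq. (87) follows from averaging Eq. (78) term by term, using \(\langle |B_{nm}|^2 \rangle \approx \Gamma /4\) and \(\langle \eta _n^2 \rangle = \Gamma \). As a sketch of the bookkeeping:

$$\begin{aligned} \langle \Delta _n(t+1) \rangle \approx \frac{\Gamma N}{4} + \Gamma \sum _{m \ne n} \frac{x_m(t)}{x_n(t)-x_m(t)}, \qquad \frac{1}{2}\langle \eta _n(t+1)^2 \rangle = \frac{\Gamma }{2}, \end{aligned}$$

whose difference gives the constant \(\Gamma (N-2)/4\) together with the interaction term, with \(x_n(t) = e^{y_n(t)}\).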

Thus, assuming that fluctuations of \(\mathcal {D}_{n}(t+1)\) about its Haar-averaged value are negligible, we obtain the Langevin equation

$$\begin{aligned} y_{n}(t+1) - y_n(t) = \eta _n(t+1) + \frac{\Gamma (N-2)}{4} + \Gamma \sum _{m\ne n} \frac{e^{y_m(t)}}{e^{y_n(t)}-e^{y_m(t)}}. \end{aligned}$$
(88)

This can be written in a more symmetric form by making the change of variables \(2z_n(t) = y_n(t) + \frac{\Gamma N}{4}t\), which yields

$$\begin{aligned} z_n(t+1)-z_n(t) = \frac{1}{2}\eta _n(t+1) + \frac{\Gamma }{4} \sum _{m\ne n} \coth {(z_n(t)-z_m(t))}. \end{aligned}$$
(89)
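The change of variables can be checked with the elementary identity below, which we include as a sketch for the reader:

$$\begin{aligned} \frac{e^{y_m}}{e^{y_n}-e^{y_m}} = \frac{1}{2}\left[ \coth {\left( \frac{y_n-y_m}{2}\right) } - 1\right] . \end{aligned}$$

Summing over \(m \ne n\), the drift in Eq. (88) becomes \(-\frac{\Gamma N}{4} + \frac{\Gamma }{2}\sum _{m \ne n}\coth {\left( \frac{y_n-y_m}{2}\right) }\). The constant is absorbed by the shift \(\frac{\Gamma N}{4}t\) in the definition of \(z_n(t)\), while \(y_n - y_m = 2(z_n-z_m)\).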

Finally, letting \(\Gamma \rightarrow 0\) yields the continuous-time Fokker–Planck equation

$$\begin{aligned} \partial _s P = \sum _{n=1}^N\left( -\partial _{z_n}(D_n P) + \partial _{z_n}^2 P\right) , \end{aligned}$$
(90)

where \(P(\vec {z},s)\) denotes the joint PDF of \(\vec {z}(t) = (z_1(t),z_2(t),\ldots ,z_N(t))\) at time

$$\begin{aligned} s = \Gamma t/8, \end{aligned}$$
(91)

and

$$\begin{aligned} D_n(\vec {z}) = 2 \sum _{m \ne n} \coth {(z_n-z_m)}. \end{aligned}$$
(92)

In particular, the timescale \(\Gamma ^{-1}\) determines the dynamics of singular values, and should be thought of as the purification time for these models, an interpretation that will be confirmed by our results for the short and long time dynamics of Rényi entropies below.

The continuous-time Fokker–Planck equation Eq. (90) was first derived [13] in the context of a stochastic matrix model known as “isotropic Brownian motion” [40]. In that work, it was also pointed out that Eq. (90) is exactly solvable via a connection to Calogero–Sutherland models. In order to be self-contained, we now derive this exact solution.

3.2 Exact Solution of the Fokker–Planck Equation

To solve Eq. (90), we first note that \(D_n\) can be written in gradient form

$$\begin{aligned} D_n(\vec {z}) = -\partial _{z_n}\Phi (\vec {z}) \end{aligned}$$
(93)

where the “prepotential”

$$\begin{aligned} \Phi (\vec {z}) = -2 \sum _{j<k} \log {\sinh {(z_j-z_k)}} \end{aligned}$$
(94)

arises naturally in the theory of the classical hyperbolic Calogero–Sutherland model [41]. This implies that Eq. (90) can be written as [42]

$$\begin{aligned} \partial _s P = \sum _{n=1}^N \partial _{z_n} \left( e^{-\Phi } \partial _{z_n} \left( e^{\Phi }P\right) \right) . \end{aligned}$$
(95)

Letting \(\psi = e^{\Phi /2} P\) then yields an imaginary-time Schrödinger equation for \(\psi \), namely [42, 43]

$$\begin{aligned} \partial _s \psi = \sum _{n=1}^N \partial _{z_n}^2 \psi - V \psi , \end{aligned}$$
(96)

with an effective potential

$$\begin{aligned} V(\vec {z}) = \frac{1}{4} \sum _{n=1}^N D_n^2(\vec {z}) + \frac{1}{2} \sum _{n=1}^N \partial _{z_n}D_n(\vec {z}). \end{aligned}$$
(97)

To proceed further write \(z_{nm} = z_n-z_m\) and note that

$$\begin{aligned} \frac{1}{4} \sum _{n=1}^N D_n^2(\vec {z}) = \sum _{\begin{array}{c} n \ne m \\ n \ne l \end{array}} \coth {z_{nm}}\coth {z_{nl}} = \sum _{n \ne m} \coth ^2{z_{nm}} + \sum _{\begin{array}{c} l \ne m \\ m \ne n \\ n \ne l \end{array}} \coth {z_{nm}}\coth {z_{nl}}. \end{aligned}$$
(98)

By an identity apparently first published by Calogero and Perelomov [44], we have

$$\begin{aligned} \sum _{\begin{array}{c} l \ne m \\ m \ne n \\ n \ne l \end{array}} \coth {z_{nm}}\coth {z_{nl}} = \frac{1}{3}N(N-1)(N-2) \end{aligned}$$
(99)

while

$$\begin{aligned} \frac{1}{2} \sum _{n=1}^N \partial _{z_n}D_n(\vec {z}) = N(N-1) - \sum _{n \ne m} \coth ^2{z_{nm}}, \end{aligned}$$
(100)

yielding a complete cancellation of interactions in the effective potential

$$\begin{aligned} V(\vec {z}) = \frac{1}{3}N(N^2-1), \end{aligned}$$
(101)

which is reminiscent of (but simpler than) the cancellation of interactions that occurs for the \(\beta =2\) case of the DMPK equation [26]. Making the change of variables \(\psi = e^{-\frac{N(N^2-1)}{3}s}\tilde{\psi }\) finally reduces the Fokker–Planck evolution Eq. (90) to a free diffusion equation

$$\begin{aligned} \partial _s \tilde{\psi } = \sum _{n=1}^N \partial _{z_n}^2 \tilde{\psi }, \end{aligned}$$
(102)

(albeit with non-trivial boundary conditions to be discussed shortly).
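The cancellation of interactions in Eq. (101) can be verified numerically at an arbitrary configuration. The sketch below evaluates Eq. (97) directly from \(D_n = 2\sum _{m \ne n}\coth {(z_n-z_m)}\) and its derivative, and compares the result with \(N(N^2-1)/3\); the function name and the test configuration are illustrative assumptions.

```python
import math

def effective_potential(z):
    """V(z) of Eq. (97), with D_n = 2 * sum_{m != n} coth(z_n - z_m)."""
    N = len(z)
    V = 0.0
    for n in range(N):
        D_n = 0.0   # drift coefficient D_n
        dD_n = 0.0  # its derivative with respect to z_n
        for m in range(N):
            if m == n:
                continue
            d = z[n] - z[m]
            D_n += 2.0 / math.tanh(d)
            dD_n -= 2.0 / math.sinh(d) ** 2
        V += 0.25 * D_n ** 2 + 0.5 * dD_n
    return V

z = [1.3, 0.7, 0.1, -0.4, -1.1, -1.9]  # arbitrary distinct positions
N = len(z)
V = effective_potential(z)
V_predicted = N * (N * N - 1) / 3.0
```

Any other set of distinct positions should give the same constant, which is the content of the Calogero–Perelomov identity Eq. (99).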

It remains to solve Eq. (102) in the “ordered sector” \(z_1 \ge z_2 \ge \cdots \ge z_N\), subject to the initial condition

$$\begin{aligned} P(\vec {z},0) = \prod _{j=1}^N \delta (z_j-\varepsilon _j), \end{aligned}$$
(103)

where the regulators \(\varepsilon _1> \varepsilon _2> \cdots> \varepsilon _N > 0\) will be taken to zero at the end of the calculation and are needed to avoid the singularities in Eq. (90) at collision planes \(z_j = z_k\). We further impose boundary conditions

$$\begin{aligned} e^{-\Phi (\vec {z})} \left( \partial _{z_j}-\partial _{z_k}\right) \left( e^{\Phi (\vec {z})} P(\vec {z},s) \right) = 0, \quad z_j \rightarrow z_k^+, \quad j < k, \end{aligned}$$
(104)

for all \(s \ge 0\) that are sufficient for the vanishing of probability flux at collision planes, and therefore guarantee conservation of probability within the ordered sector.

Our method of solution follows that of Refs. [26, 45] for the \(\beta =2\) DMPK equation but differs in its details. It is first useful to note that for every solution to Eq. (102), there exists a solution to Eq. (90), given by

$$\begin{aligned} P(\vec {z},s) = e^{-\frac{N(N^2-1)}{3}s} \left( \frac{\prod _{j<k} \sinh (z_j-z_k)}{\prod _{j<k} \sinh (\varepsilon _j-\varepsilon _k)}\right) \tilde{\psi }(\vec {z},s). \end{aligned}$$
(105)

In terms of the “wavefunction” \(\tilde{\psi }\) in Eq. (105), we find that the initial condition

$$\begin{aligned} \tilde{\psi }(\vec {z},0) = \prod _{j=1}^N \delta (z_j-\varepsilon _j) \end{aligned}$$
(106)

and the vanishing of the wavefunction at collision planes,

$$\begin{aligned} \tilde{\psi }(\vec {z},s) = 0, \quad z_j \rightarrow z_k^+, \quad j < k, \end{aligned}$$
(107)

for \(s \ge 0\), are sufficient for the initial and boundary conditions Eqs. (103) and (104) on P to hold. To satisfy the boundary conditions in the ordered sector, we extend \(\tilde{\psi }\) to the whole space and consider a “fermionic” initial condition

$$\begin{aligned} \tilde{\psi }(\vec {z},0) = \sum _{\kappa \in S_N} \textrm{sgn}(\kappa ) \prod _{j=1}^N \delta (z_j-\varepsilon _{\kappa (j)}), \end{aligned}$$
(108)

where \(\textrm{sgn}: S_N \rightarrow \{\pm 1\}\) denotes the sign of the permutation \(\kappa \in S_N\). We can write this more suggestively as a Slater determinant of single-particle factors

$$\begin{aligned} \tilde{\psi }(\vec {z},0) = \textrm{det}\left[ \delta (z_j-\varepsilon _k)\right] , \end{aligned}$$
(109)

where \(\textrm{det}[A_{jk}]\) denotes the determinant of the matrix A with elements \(A_{jk}\). Then the Slater determinant of heat kernels

$$\begin{aligned} \tilde{\psi }(\vec {z},s) = \textrm{det}\left[ \frac{1}{\sqrt{4 \pi s}}e^{-\frac{(z_j-\varepsilon _k)^2}{4s}}\right] \end{aligned}$$
(110)

satisfies both the diffusion equation Eq. (102) for \(s \ge 0\) and the initial and boundary conditions Eqs. (103) and (104) in the ordered sector. The final nontrivial step is to set the regulators \(\varepsilon _j \rightarrow 0\).

To this end, we first note that the leading behaviour of the denominator of Eq. (105)

$$\begin{aligned} \prod _{j<k} \sinh {(\varepsilon _j - \varepsilon _k)} \sim \prod _{j<k} (\varepsilon _j - \varepsilon _k), \quad \varepsilon _1 \ll 1, \end{aligned}$$
(111)

is a Vandermonde determinant of order \(N-1\) in each of the \(\varepsilon _j\). We next expand the heat kernels in terms of Hermite polynomials [36]

$$\begin{aligned} e^{-\frac{(z_j-\varepsilon _k)^2}{4s}} \sim e^{-z_j^2/4s} \sum _{n=0}^{N-1} H_n(\tilde{z}_j)\frac{\tilde{\varepsilon }_k^n}{n!}, \quad \varepsilon _1 \ll 1, \end{aligned}$$
(112)

to the same order in \(\varepsilon \), where \(\tilde{z}_j = z_j/\sqrt{4s}, \, \tilde{\varepsilon }_k = \varepsilon _k/\sqrt{4s}\). However, at this order, the sum in Eq. (112) is nothing but a product of square matrices and so

$$\begin{aligned} \textrm{det}\left[ e^{-z_j^2/4s} \sum _{n=0}^{N-1} H_n(\tilde{z}_j)\frac{\tilde{\varepsilon }_k^n}{n!}\right]&= \frac{e^{-|\vec {z}|^2/4s}}{\prod _{n=1}^{N-1}n!} \begin{vmatrix} H_0(\tilde{z}_1)&H_1(\tilde{z}_1)&\ldots&H_{N-1}(\tilde{z}_1) \\ H_0(\tilde{z}_2)&H_1(\tilde{z}_2)&\ldots&H_{N-1}(\tilde{z}_2) \\ \vdots&\vdots&\ddots&\vdots \\ H_0(\tilde{z}_N)&H_1(\tilde{z}_N)&\ldots&H_{N-1}(\tilde{z}_N) \\ \end{vmatrix}\nonumber \\&\quad \times \begin{vmatrix} 1&1&\ldots&1 \\ \tilde{\varepsilon }_1&\tilde{\varepsilon }_2&\ldots&\tilde{\varepsilon }_N \\ \vdots&\vdots&\ddots&\vdots \\ \tilde{\varepsilon }_1^{N-1}&\tilde{\varepsilon }_2^{N-1}&\ldots&\tilde{\varepsilon }_N^{N-1} \end{vmatrix}\nonumber \\&= \frac{1}{(2s)^{N(N-1)/2}\prod _{n=1}^{N-1}n!} e^{-|\vec {z}|^2/4s} \prod _{j<k} (\varepsilon _j - \varepsilon _k)(z_j - z_k), \end{aligned}$$
(113)

where in the second line we applied Gaussian elimination to the matrix of Hermite polynomials to obtain another Vandermonde determinant. Combining the above expressions, we deduce that

$$\begin{aligned} P(\vec {z},s) = \frac{1}{(4\pi s)^{N/2}(2s)^{N(N-1)/2}\prod _{n=1}^{N-1}n!} e^{-\frac{N(N^2-1)}{3}s} \left( \prod _{j<k}(z_j-z_k)\sinh (z_j-z_k) \right) e^{-|\vec {z}|^2/4s}\nonumber \\ \end{aligned}$$
(114)

solves the Fokker–Planck equation for all \(s \ge 0\), satisfies the initial condition \( P(\vec {z},0) = \prod _{j=1}^N \delta (z_j)\), and conserves probability in the ordered sector. This recovers the solution to Eq. (90) presented in Ref. [13]. We note for future reference that the random variables \(z_n(t)\) modelled by Eq. (114) are related to the singular values \(\sigma _n(t)\) of Kraus operators at time t by the change of variables

$$\begin{aligned} e^{z_n(t)} = e^{\left( pL\log {2} + \frac{\Gamma N}{8}\right) t} \sigma _n(t) \end{aligned}$$
(115)

and recall that s is related to t by Eq. (91).
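As a check on the normalization of Eq. (114), the following sketch specializes to \(N=2\) and integrates \(P(\vec {z},s)\) over the ordered sector on a uniform grid. Since the \(N=2\) expression is symmetric under \(z_1 \leftrightarrow z_2\), we integrate over the whole plane and halve the result. The function name, the value \(s=0.3\) and the grid parameters are illustrative assumptions.

```python
import math

def exact_pdf_n2(z1, z2, s):
    """Eq. (114) specialised to N = 2."""
    d = z1 - z2
    # (4 pi s)^{N/2} (2s)^{N(N-1)/2} = (4 pi s)(2s); exp(-N(N^2-1)s/3) = exp(-2s)
    prefactor = math.exp(-2.0 * s) / ((4.0 * math.pi * s) * (2.0 * s))
    return prefactor * d * math.sinh(d) * math.exp(-(z1 * z1 + z2 * z2) / (4.0 * s))

s = 0.3   # hypothetical rescaled time
h = 0.05  # grid spacing
grid = [-10.0 + k * h for k in range(401)]

# Riemann sum over the full plane; the integrand is smooth and decays rapidly.
plane = sum(exact_pdf_n2(z1, z2, s) for z1 in grid for z2 in grid) * h * h
total = 0.5 * plane  # restrict to the ordered sector z1 >= z2
```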

3.3 Very Short Times: Emergence of Log-GUE

Let us first consider the probability distribution Eq. (114) as \(s \rightarrow 0\) (note that \(s \ll 1\) corresponds to \(t \ll \Gamma ^{-1}\), i.e. times that are short compared to the purification time). We have

$$\begin{aligned} P(\vec {z},s) = \frac{1}{(4\pi s)^{N/2}(2s)^{N(N-1)/2}\prod _{n=1}^{N-1}n!} \left( \prod _{j<k}(z_j-z_k)^2 + \mathcal {O}(s^2)\right) e^{-|\vec {z}|^2/4s}.\qquad \end{aligned}$$
(116)

Thus the leading contribution to \(P(\vec {z},s)\) is from the distribution function \(P_{\mathrm {v.s.t.}}(\vec {z},s)\), given by

$$\begin{aligned} P(\vec {z},s) \sim P_{\mathrm {v.s.t.}}(\vec {z},s) = \frac{2^{N(N-1)/2}}{(4\pi s)^{N/2}\prod _{n=1}^{N-1}n!} \left( \prod _{j<k}\frac{(z_j-z_k)^2}{4s}\right) e^{-|\vec {z}|^2/4s}, \quad s \rightarrow 0^+. \end{aligned}$$
(117)

This is strongly reminiscent of the distribution function of the Gaussian Unitary Ensemble (GUE) on the full space [46],

$$\begin{aligned} P_{\textrm{GUE}}(\vec {z}) = \frac{1}{N!} \frac{2^{N(N-1)/2}}{\pi ^{N/2} \prod _{n=1}^{N-1}n!} \left( \prod _{j<k}(z_j-z_k)^2\right) e^{-|\vec {z}|^2} \end{aligned}$$
(118)

and indeed the two distributions are related by a simple change of variables

$$\begin{aligned} P_{\mathrm {v.s.t.}}(\vec {z},s) = \frac{N!}{(4s)^{N/2}} P_{\textrm{GUE}}(\vec {z}/\sqrt{4s}), \end{aligned}$$
(119)

where the prefactor of N! arises because we restricted the domain of \(P(\vec {z},s)\) to the ordered sector. We will call the regime of validity of the approximation Eq. (117), to be determined below, the “very-short-time regime”. In the very-short-time regime, the singular values of Kraus operators follow a “log-GUE” distribution, which is related to the conventional GUE in the same sense that the log-normal distribution is related to the normal distribution. Defining the random variable \(\rho (z,t) = \sum _{n=1}^N \delta (z-z_n(t))\), it follows by the Wigner semicircle law for the conventional GUE [46] that the non-fluctuating part of the level density is

$$\begin{aligned} \bar{\rho }_{\mathrm {v.s.t.}}(z,t) = {\left\{ \begin{array}{ll} \frac{2}{\pi } \frac{1}{\Gamma t} \sqrt{ N \Gamma t - z^2}, &{} |z| \le \sqrt{N \Gamma t}, \\ 0, &{} |z| > \sqrt{N \Gamma t}, \end{array}\right. } \end{aligned}$$
(120)

for \(N \gg 1\) and very short times. The boundary of the very-short-time regime is set by the requirement in Eq. (119) that \(|z_j-z_k| \ll 1\) for all \(j, k\): by Eq. (120), the support of the level density has half-width \(\sqrt{N\Gamma t}\), so this is the case for \(N\Gamma t \ll 1\).
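The emergence of GUE statistics at very short times can be illustrated by comparing the exact distribution Eq. (114) with the rescaled GUE distribution on the right-hand side of Eq. (119). In the sketch below, the ratio of the two tends to unity as \(s \rightarrow 0\) at fixed \(\vec {z}/\sqrt{4s}\); the function names and the sample point are illustrative assumptions.

```python
import math

def pair_product(z, f):
    """Product of f(z_j - z_k) over all pairs j < k."""
    N = len(z)
    return math.prod(f(z[j] - z[k]) for j in range(N) for k in range(j + 1, N))

def superfactorial(N):
    """prod_{n=1}^{N-1} n!"""
    return math.prod(math.factorial(n) for n in range(1, N))

def exact_pdf(z, s):
    """Eq. (114) in the ordered sector."""
    N = len(z)
    norm = (4 * math.pi * s) ** (N / 2) * (2 * s) ** (N * (N - 1) / 2) * superfactorial(N)
    weight = pair_product(z, lambda d: d * math.sinh(d))
    gauss = math.exp(-sum(x * x for x in z) / (4 * s))
    return math.exp(-N * (N * N - 1) * s / 3) * weight * gauss / norm

def gue_pdf(x):
    """Eq. (118) on the full space."""
    N = len(x)
    norm = math.factorial(N) * math.pi ** (N / 2) * superfactorial(N) / 2 ** (N * (N - 1) / 2)
    return pair_product(x, lambda d: d * d) * math.exp(-sum(v * v for v in x)) / norm

def rescaled_gue(z, s):
    """Right-hand side of Eq. (119)."""
    N = len(z)
    x = [v / math.sqrt(4 * s) for v in z]
    return math.factorial(N) / (4 * s) ** (N / 2) * gue_pdf(x)

def ratio(s, x=(1.0, 0.2, -0.7)):
    """Exact over rescaled-GUE density at z = sqrt(4s) * x (x is an arbitrary point)."""
    z = [math.sqrt(4 * s) * v for v in x]
    return exact_pdf(z, s) / rescaled_gue(z, s)

ratio_small, ratio_large = ratio(1e-4), ratio(0.5)
```

Analytically this ratio is \(e^{-N(N^2-1)s/3}\prod _{j<k}\sinh (z_j-z_k)/(z_j-z_k)\), which makes the role of the condition \(|z_j-z_k| \ll 1\) explicit.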

We can use this observation to extend the analysis of Rényi entropies in Sect. 2.6.1 to very short times as follows. First note that in terms of \(\rho (z,t)\), the Rényi entropies along a given quantum trajectory can be written as

$$\begin{aligned} S_{\textbf{m}}^{(\alpha )}(t) = \frac{1}{1-\alpha } \log {\left( \frac{\int _{-\infty }^\infty dz \, \rho (z,t) e^{2\alpha z}}{\left( \int _{-\infty }^\infty dz \, \rho (z,t)e^{2z}\right) ^\alpha }\right) }. \end{aligned}$$
(121)

In particular, we have

$$\begin{aligned} \int _{-\infty }^{\infty } dz \, \bar{\rho }_{\mathrm {v.s.t.}}(z,t) e^{2\alpha z} = \frac{2N}{\pi } \int _0^\pi d\theta \, \sin ^2{\theta } e^{2\alpha \sqrt{N \Gamma t} \cos {\theta }} = \frac{1}{\alpha } \sqrt{\frac{N}{\Gamma t}} I_1(2\alpha \sqrt{N \Gamma t}) ,\qquad \quad \end{aligned}$$
(122)

where \(I_1(w)\) denotes a modified Bessel function of the first kind [36], so that the non-fluctuating part of each Rényi entropy is given by

$$\begin{aligned} \bar{S}^{(\alpha )}(t) = \frac{1}{1-\alpha } \log { \left( \frac{1}{\alpha } \left( \frac{N}{\Gamma t}\right) ^{\frac{1-\alpha }{2}} \frac{I_1(2\alpha \sqrt{N \Gamma t})}{I_1(2\sqrt{N \Gamma t})^\alpha }\right) }, \quad \Gamma t \ll 1. \end{aligned}$$
(123)

The small-argument asymptotic behaviour

$$\begin{aligned} I_1(w) \sim \frac{w}{2}(1+\frac{w^2}{8}), \quad w \rightarrow 0, \end{aligned}$$
(124)

implies perturbative time dependence

$$\begin{aligned} \bar{S}^{(\alpha )}(t) \approx \log {N} - \frac{\alpha N \Gamma t}{2}, \quad \Gamma t \ll \frac{1}{N}, \end{aligned}$$
(125)

of the Rényi entropies at very short times. Meanwhile, log-GUE statistics and the semicircle law Eq. (120) break down at times \(\frac{1}{N} \ll \Gamma t \ll 1\) that are short but not very short, as we now discuss.
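The perturbative result Eq. (125) can be compared with the full expression Eq. (123) numerically. The sketch below implements \(I_1(w)\) through its power series, to avoid any dependence on external libraries, and evaluates both expressions; the function names and the values \(N=50\), \(\Gamma t = 10^{-4}\), \(\alpha =2\) are illustrative assumptions.

```python
import math

def bessel_i1(w, terms=40):
    """Modified Bessel function I_1(w) from its power series."""
    return sum((w / 2.0) ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
               for k in range(terms))

def renyi_vst(alpha, N, gamma_t):
    """Eq. (123) with gamma_t = Gamma * t; requires alpha != 1."""
    R = math.sqrt(N * gamma_t)
    arg = ((N / gamma_t) ** ((1 - alpha) / 2)
           * bessel_i1(2 * alpha * R) / (alpha * bessel_i1(2 * R) ** alpha))
    return math.log(arg) / (1 - alpha)

def renyi_perturbative(alpha, N, gamma_t):
    """Eq. (125), valid for N * gamma_t << 1."""
    return math.log(N) - alpha * N * gamma_t / 2.0

alpha, N, gamma_t = 2, 50, 1e-4  # hypothetical values with N * gamma_t = 0.005
full = renyi_vst(alpha, N, gamma_t)
approx = renyi_perturbative(alpha, N, gamma_t)
```

The residual difference between the two is of order \((N\Gamma t)^2\), consistent with the truncation in Eq. (124).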

3.4 Short Times: Semicircle-to-Square Crossover

Let us now consider how the mean level density corresponding to the exact distribution function \(P(\vec {z},s)\) differs from Eq. (120) beyond the regime of very short times. We first note that \(P(\vec {z},s)\) can be written as a Boltzmann weight for a harmonically trapped gas of particles on a line at positions \(z_1> z_2> \cdots > z_N\) in the standard fashion,

$$\begin{aligned} P(\vec {z},s) \propto e^{-W(\vec {z},s)}, \end{aligned}$$
(126)

where the effective potential

$$\begin{aligned} W(\vec {z},s) = \sum _{i=1}^N \frac{z_i^2}{4s} -\frac{1}{2}\sum _{j \ne k} \log {\left( (z_j-z_k)\sinh {(z_j-z_k)}\right) } \end{aligned}$$
(127)

interpolates between Dyson’s “log-gas” [46] in the limit of small interparticle separations [13] \(|z_j-z_k| \rightarrow 0\) that was discussed in the previous section, and a one-dimensional Coulomb gas at large interparticle separations [47] \(|z_j-z_k| \rightarrow \infty \). This follows from the asymptotic behaviours \(\log {(z\sinh {z})} \sim 2\log {|z|}\) as \(|z| \rightarrow 0\) and \(\log {(z\sinh {z})} \sim |z|\) as \(|z| \rightarrow \infty \). The latter model is also known in the literature as the “one-dimensional jellium model”, various properties of which can be derived analytically [48,49,50], such as its uniform density of states [48].

It will be instructive to understand this crossover to uniformity in terms of the density of states \(\rho (z)\). Suppressing explicit time dependence, we have

$$\begin{aligned} W[\rho ] \!= \!\int _{-\infty }^\infty \!dz \, \rho (z) \frac{z^2}{4s}\! -\! \frac{1}{2}\int _{-\infty }^\infty \int _{-\infty }^\infty dz \, dw \, \rho (z) \rho (w) \log {\left( (z-w)\sinh {(z-w)}\right) }.\qquad \end{aligned}$$
(128)

Note also that \(\rho (z)\) satisfies the constraint \(N[\rho ] = \int _{-\infty }^{\infty } dz \, \rho (z) = N\). Thus for large \(N \gg 1\), the non-fluctuating part of the density of states \(\bar{\rho }(z)\) minimizes the functional \(W[\bar{\rho }] - \mu N[\bar{\rho }]\), implying that

$$\begin{aligned} \frac{z^2}{4s} - \int _{-\infty }^{\infty } dw \, \bar{\rho }(w) \log {\left( (z-w)\sinh {(z-w)}\right) } = \mu . \end{aligned}$$
(129)

Finally differentiating with respect to z yields the singular integral equation [51, 52]

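$$\begin{aligned} \frac{z}{2s} = \mathrm {P}\!\!\int _{-\infty }^{\infty } dw \, \bar{\rho }(w) \left( \frac{1}{z-w} + \coth {(z-w)}\right) , \end{aligned}$$
(130)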

which is distinguished from the usual integral equation for the semicircle law by the presence of the \(\coth {(z-w)}\) term in the kernel, which breaks the spatial rescaling symmetry of the former. This breaking of scaling symmetry allows \(\bar{\rho }\) to flow from a semicircle law to a uniform distribution as s increases. We note that an exact but implicit solution to the integral equation Eq. (130) was proposed recently in the literature [52].

We instead proceed numerically and seek solutions to Eq. (130) supported on a finite interval \([-a,a]\) with \(a>0\) for \(s>0\) (note that the microscopic distribution function Eq. (114) is inversion symmetric in z). Numerical results, obtained by discretizing the principal value integral in Eq. (130) and inverting the resulting matrix to obtain \(\rho (z,s)\), are shown in Fig. 3. They confirm that the semicircle law discussed in the previous section quickly converges to a uniform distribution beyond the regime of very short times. The resulting late-time profile is sometimes called a “square law” [13], and in this sense Eq. (130) captures a “semicircle-to-square” crossover with increasing s.

Fig. 3

Numerical solutions to the singular integral equation Eq. (130) on intervals \([-a(s),a(s)]\), with the endpoint position a(s) determined implicitly by number conservation. We set \(N=50\) and discretize the integral over one thousand points. The accuracy of our scheme is confirmed by its recovery of the semicircle law discussed in the previous section as \(s \rightarrow 0\), and we observe a clear crossover from a semicircle law to uniform behaviour between the very-short-time (\(s \ll 0.01\)) and short-time (\(0.01 \ll s \ll 1\)) regimes defined by Eq. (137).

Let us now attempt to understand this crossover to uniformity analytically. Thus consider the uniform ansatz

$$\begin{aligned} \bar{\rho }_{\mathrm {s.t.}}(w) = {\left\{ \begin{array}{ll} \frac{N}{2a}, &{} |w| \le a, \\ 0, &{} |w| > a. \end{array}\right. } \end{aligned}$$
(131)

This yields

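$$\begin{aligned} \mathcal {P}\!\!\int _{-a}^{a} dw \, \bar{\rho }_{\mathrm {s.t.}}(w) \left( \frac{1}{z-w} + \coth {(z-w)}\right) = \frac{N}{2a} \log {\left( \frac{(a+z)\sinh {(a+z)}}{(a-z)\sinh {(a-z)}}\right) }, \quad |z| < a. \end{aligned}$$
(132)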

Expanding the right-hand side of Eq. (132) perturbatively in z yields a linear estimate Nz/a. We will show that Eq. (132) is close to this linear estimate over a large “bulk region” \(|z \pm a| > \varepsilon \), where \(\varepsilon > 0\) is an order one constant. To this end, we define the function

$$\begin{aligned} \eta (z) = \frac{N}{2a}\left( \log {\left( \frac{(a+z)\sinh {(a+z)}}{(a-z)\sinh {(a-z)}}\right) } - 2z\right) , \end{aligned}$$
(133)

which quantifies the deviation of the right-hand side of Eq. (132) from linearity. The function \(\eta (z)\) is odd and strictly increasing in z on the interval \((-a,a)\). It is also convex on the interval \([0,a-\varepsilon ]\). Together these observations imply an upper bound

$$\begin{aligned} \frac{|\eta (z)|}{N|z|/a} \le \frac{a}{N}\frac{\eta (a-\varepsilon )}{a-\varepsilon }, \quad 0< |z| < a-\varepsilon , \end{aligned}$$
(134)

valid over the bulk region. Finally, we note that to leading order in a and for \(\varepsilon \) of order one,

$$\begin{aligned} \frac{a}{N}\frac{\eta (a-\varepsilon )}{a-\varepsilon } = \frac{\log {a}}{2a} + \mathcal {O}\left( \frac{1}{a}\right) , \end{aligned}$$
(135)

which establishes that

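$$\begin{aligned} \frac{N}{2a} \log {\left( \frac{(a+z)\sinh {(a+z)}}{(a-z)\sinh {(a-z)}}\right) } = \frac{Nz}{a}\left( 1 + \mathcal {O}\left( \frac{\log {a}}{a}\right) \right) \end{aligned}$$
(136)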

in the entire bulk region. Comparison with the original integral equation (130) reveals that the uniform ansatz Eq. (131) yields an accurate bulk solution provided we make the identification

$$\begin{aligned} a(s) = 2sN \gg 1, \end{aligned}$$
(137)

corresponding to the regime of times \(\Gamma t \gg \frac{1}{N}\), which strictly excludes the very-short-time regime identified in the previous section. Thus we have shown that from the short-time regime onwards, the uniform ansatz Eq. (131) yields a good approximation to the solution of Eq. (130), excluding a boundary region of width \(2\varepsilon \) that has negligible measure compared to the bulk region. The latter point in particular means that the uniform ansatz Eq. (131) can reliably be used to estimate integrals over singular values when the separation between neighbouring singular values is small.
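The bound in Eqs. (134) and (135) is straightforward to check numerically. The sketch below is illustrative only: the values \(a=50\) and \(\varepsilon =1\) and the helper names are our own choices. It evaluates \(\eta (z)/N\) from Eq. (133) on a grid in the bulk region and compares the worst relative deviation from the linear estimate with the right-hand side of Eq. (134) and with the leading term \(\log {a}/2a\) of Eq. (135).

```python
import math

def log_sinh(x: float) -> float:
    """log(sinh x) for x > 0, written so that it does not overflow at large x."""
    return x + math.log1p(-math.exp(-2.0 * x)) - math.log(2.0)

def eta_over_N(z: float, a: float) -> float:
    """eta(z) / N as defined in Eq. (133), for |z| < a."""
    log_ratio = (math.log((a + z) / (a - z))
                 + log_sinh(a + z) - log_sinh(a - z))
    return (log_ratio - 2.0 * z) / (2.0 * a)

a, eps = 50.0, 1.0   # illustrative values with a >> 1 and eps of order one

# Right-hand side of Eq. (134): (a/N) * eta(a - eps) / (a - eps)
bound = a * eta_over_N(a - eps, a) / (a - eps)

# Relative deviation |eta(z)| / (N|z|/a) on a grid with 0 < z < a - eps
grid = [k * (a - eps) / 200 for k in range(1, 200)]
worst = max(a * eta_over_N(z, a) / z for z in grid)

print(f"worst bulk deviation  = {worst:.4f}")
print(f"bound, Eq. (134)      = {bound:.4f}")
print(f"log(a)/(2a), Eq. (135) = {math.log(a) / (2.0 * a):.4f}")
```

Because \(\eta \) is odd, it suffices to scan positive z. The difference between the bound and \(\log {a}/2a\) is the \(\mathcal {O}(1/a)\) remainder in Eq. (135).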

For example, at short times the uniform ansatz Eqs. (131) and (137) predicts that

$$\begin{aligned} \bar{S}^{(\alpha )}(t) = \frac{1}{1-\alpha } \log {\left( \frac{1}{\alpha } \left( \frac{2}{\Gamma t}\right) ^{1-\alpha } \frac{\sinh (\alpha N \Gamma t /2)}{\sinh (N \Gamma t / 2)^\alpha }\right) } \approx - \log {\Gamma t}, \quad \frac{1}{N} \ll \Gamma t \ll 1,\nonumber \\ \end{aligned}$$
(138)

whose leading time dependence perfectly matches Eq. (57) for projective measurements, provided that \(\Gamma ^{-1}\) is interpreted as the purification time. That \(\Gamma ^{-1}\) indeed defines the purification time in the sense of Eq. (15) will be checked carefully below, but can be seen here by noting that the timescale for the characteristic spacing between adjacent \(z_n\) to become comparable to their long-time standard deviation, invalidating the treatment of \(\rho (z,t)\) as a uniform distribution, is defined by the condition \(2a(s)/N = \mathcal {O}(s^{1/2})\), i.e. \(\Gamma t = \mathcal {O}(1)\).
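The first equality in Eq. (138) can be evaluated directly to see how well the logarithmic approximation holds across the short-time window. This is a minimal sketch with illustrative parameter values (\(N=2^{20}\), \(\alpha =2\)); the function names are ours. For \(N\Gamma t \gg 1\) the hyperbolic sines reduce to exponentials, so the exact expression differs from \(-\log {\Gamma t}\) only by the time-independent constant \(\log {\alpha }/(\alpha -1)\), which the code prints alongside.

```python
import math

def log_sinh(x: float) -> float:
    """log(sinh x) for x > 0, stable at large x."""
    return x + math.log1p(-math.exp(-2.0 * x)) - math.log(2.0)

def mean_renyi_entropy(alpha: float, gamma_t: float, N: int) -> float:
    """First equality of Eq. (138); requires alpha > 0 and alpha != 1."""
    x = N * gamma_t / 2.0
    log_arg = (-math.log(alpha)
               + (1.0 - alpha) * math.log(2.0 / gamma_t)
               + log_sinh(alpha * x) - alpha * log_sinh(x))
    return log_arg / (1.0 - alpha)

N, alpha = 2**20, 2.0   # illustrative values
offset = math.log(alpha) / (alpha - 1.0)

for gamma_t in (1e-4, 1e-3, 1e-2, 1e-1):
    exact = mean_renyi_entropy(alpha, gamma_t, N)
    print(f"Gamma*t = {gamma_t:g}: S = {exact:.6f}, "
          f"-log(Gamma*t) + offset = {-math.log(gamma_t) + offset:.6f}")
```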

Finally, we note that the energy functional Eq. (128) appearing in the Dyson gas formulation of our problem can be used to estimate the strength of fluctuations of the Rényi entropies, thereby validating our estimation of short-time Rényi entropies by their non-fluctuating values as in Eq. (138). We first expand the functional W perturbatively about the mean field \(\bar{\rho }(w)\) to yield a quadratic effective action

$$\begin{aligned} \mathcal {S}[\delta \rho ] = W[\bar{\rho }+\delta \rho ] - W[\bar{\rho }] = \frac{1}{2}\int _{-a}^{a}\int _{-a}^{a} dz\, dw \, K(z-w) \delta \rho (z) \delta \rho (w), \end{aligned}$$
(139)

whose kernel

$$\begin{aligned} K(z-w) = -\log {\left( (z-w)\sinh {(z-w)}\right) }. \end{aligned}$$
(140)

It follows that the expected value of the correlation function \(\delta \rho (z) \delta \rho (w)\) with respect to the Boltzmann weight \(\propto e^{-\mathcal {S}}\) is given by

$$\begin{aligned} \langle \delta \rho (z) \delta \rho (w) \rangle = G(z-w), \end{aligned}$$
(141)

where the Green’s function satisfies

$$\begin{aligned} \int _{-a}^{a} dw \, K(x-w)G(w-z) = \delta (x-z). \end{aligned}$$
(142)

We refer to Refs. [53, 54] for a careful justification of analogous arguments for the DMPK equation. To proceed further, we note that the bulk short-time behaviour described above is consistent with the linear approximation

$$\begin{aligned} K(z-w) \approx -|z-w| \end{aligned}$$
(143)

to the kernel K, for which the Green’s function

$$\begin{aligned} G(z-w) = -\frac{1}{2}\delta ''(z-w) \end{aligned}$$
(144)

defines an inverse in the bulk region. We note that Eq. (144) implies the results of Ref. [50] on the variance of linear statistics for the one-dimensional Coulomb gas, which is a nontrivial test of its validity. Let us define the mean moments \(n^{(\alpha )} = \int _{-a}^{a} dz \, \bar{\rho }(z) \, e^{2\alpha z}\) and their fluctuations \(\delta n^{(\alpha )} = \int _{-a}^{a} dz \, \delta \rho (z) \, e^{2\alpha z}\). Combining the short-time approximations Eq. (131) and Eq. (144) and neglecting boundary effects yields the estimates

$$\begin{aligned} n^{(\alpha )}(t) \approx \frac{1}{\alpha \Gamma t}\exp {\left( \frac{\alpha N \Gamma t}{2}\right) }, \quad \langle \delta n^{(\alpha )}(t) \delta n^{(\beta )}(t) \rangle \approx -\frac{(\alpha -\beta )^2}{4(\alpha +\beta )}\exp {\left( \frac{(\alpha +\beta )N\Gamma t}{2}\right) }\nonumber \\ \end{aligned}$$
(145)

for \(N \Gamma t \gg 1\). From these expressions, we deduce that the leading contribution to the covariance of Rényi entropies grows quadratically in time

$$\begin{aligned} \langle \delta S^{(\alpha )}(t) \delta S^{(\beta )}(t)\rangle \approx \frac{\alpha \beta (\Gamma t)^2}{4(\alpha -1)(\beta -1)} \left( \frac{(\beta -1)^2}{\beta +1} + \frac{(\alpha -1)^2}{\alpha +1}-\frac{(\alpha -\beta )^2}{\alpha +\beta }\right) \end{aligned}$$
(146)

for \(N \Gamma t \gg 1\). In particular, since \((\Gamma t)^2 \ll |\log {\Gamma t}|\) for \(\Gamma t \ll 1\), this guarantees that fluctuations in the Rényi entropies will be small compared to the mean Rényi entropies \(\bar{S}^{(\alpha )}\) until the end of the short-time regime \(\Gamma t \approx 1\), at which the \(\bar{S}^{(\alpha )}\) are order one by Eq. (138) and thus have magnitude comparable to their fluctuations Eq. (146).
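The step from Eq. (145) to Eq. (146) is a linear propagation of errors, which can be verified numerically. The sketch below assumes that the Rényi entropies are related to the moments by \(S^{(\alpha )} = (1-\alpha )^{-1}\log {\left( n^{(\alpha )}/(n^{(1)})^{\alpha }\right) }\), which is the normalization that reproduces Eq. (138) from the uniform ansatz. The function names and parameter values are illustrative.

```python
import math

def mean_moment(alpha, gamma_t, N):
    """n^(alpha) from Eq. (145)."""
    return math.exp(alpha * N * gamma_t / 2.0) / (alpha * gamma_t)

def moment_cov(alpha, beta, gamma_t, N):
    """<dn^(alpha) dn^(beta)> from Eq. (145)."""
    return (-(alpha - beta) ** 2 / (4.0 * (alpha + beta))
            * math.exp((alpha + beta) * N * gamma_t / 2.0))

def entropy_cov_propagated(alpha, beta, gamma_t, N):
    """Covariance of dS^(a) = [dn^(a)/n^(a) - a dn^(1)/n^(1)] / (1 - a)."""
    def rel(a, b):
        # relative covariance <dn^(a) dn^(b)> / (n^(a) n^(b))
        return moment_cov(a, b, gamma_t, N) / (
            mean_moment(a, gamma_t, N) * mean_moment(b, gamma_t, N))
    total = (rel(alpha, beta) - beta * rel(alpha, 1.0)
             - alpha * rel(1.0, beta) + alpha * beta * rel(1.0, 1.0))
    return total / ((1.0 - alpha) * (1.0 - beta))

def entropy_cov_eq146(alpha, beta, gamma_t):
    """Right-hand side of Eq. (146)."""
    bracket = ((beta - 1.0) ** 2 / (beta + 1.0)
               + (alpha - 1.0) ** 2 / (alpha + 1.0)
               - (alpha - beta) ** 2 / (alpha + beta))
    return alpha * beta * gamma_t ** 2 / (4.0 * (alpha - 1.0) * (beta - 1.0)) * bracket

gamma_t, N = 0.01, 4000   # illustrative values with N * gamma_t >> 1
for alpha, beta in ((2.0, 2.0), (2.0, 3.0), (0.5, 3.0)):
    print(alpha, beta,
          entropy_cov_propagated(alpha, beta, gamma_t, N),
          entropy_cov_eq146(alpha, beta, gamma_t))
```

For \(\alpha =\beta =2\) both expressions reduce to \(2(\Gamma t)^2/3\), so the standard deviation of \(S^{(2)}\) is of order \(\Gamma t\) and remains small compared with \(|\log {\Gamma t}|\) throughout the short-time regime.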

3.5 Long Times: Log-Normality, Lyapunov Exponents and Level Repulsion

Anticipating that the \(z_n\) are well-separated for \(s \gg 1\), we define the asymptotic drift velocities

$$\begin{aligned} c_n = \lim _{z_{j}-z_k \rightarrow \infty , \, j < k} D_n(\vec {z}) = 2(N+1-2n). \end{aligned}$$
(147)

Noting that \(\sum _{n=1}^N c_n^2 = 4N(N^2-1)/3\) and \(\sum _{j<k} (z_j-z_k) = \frac{1}{2}\sum _{n=1}^N c_n z_n\), we can rewrite Eq. (114) as

$$\begin{aligned} P(\vec {z},s) = \frac{\prod _{j<k} (z_j - z_k)}{(4s)^{N(N-1)/2}\prod _{n=1}^{N-1}n!} \prod _{j<k} \left( 1-e^{-2(z_j-z_k)}\right) \prod _{n=1}^N \frac{1}{\sqrt{4\pi s}}e^{-\frac{(z_n-c_ns)^2}{4s}}.\qquad \end{aligned}$$
(148)

Thus as \(s \rightarrow \infty \),

$$\begin{aligned} P(\vec {z},s) = P_{\mathrm {l.t.}}(\vec {z},s) + \mathcal {O}(s^{-1/2}), \end{aligned}$$
(149)

where the dominant contribution at asymptotically long times is given by

$$\begin{aligned} P_{\mathrm {l.t.}}(\vec {z},s) = \frac{\prod _{j<k} (c_j - c_k)s}{(4s)^{N(N-1)/2}\prod _{n=1}^{N-1}n!} \prod _{n=1}^N \frac{1}{\sqrt{4\pi s}}e^{-\frac{(z_n-c_ns)^2}{4s}} = \prod _{n=1}^N \frac{1}{\sqrt{4\pi s}}e^{-\frac{(z_n-c_ns)^2}{4s}},\qquad \end{aligned}$$
(150)

since the Vandermonde determinant

$$\begin{aligned} \prod _{j<k} (c_j - c_k)s = \prod _{j<k} 4(k-j)s = (4s)^{N(N-1)/2} \prod _{n=1}^{N-1}n!. \end{aligned}$$
(151)

We deduce that the long-time asymptotic behaviour of \(P(\vec {z},s)\) is log-normal, i.e. Gaussian in the \(z_n\) and hence log-normal in the \(e^{z_n}\).
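The combinatorial identities used in this argument can be confirmed with exact integer arithmetic. The following sketch (helper names are ours; we set \(s=1\), since s enters Eq. (151) only through the overall factor \(s^{N(N-1)/2}\)) checks the Vandermonde identity Eq. (151) and the sum rule \(\sum _n c_n^2 = 4N(N^2-1)/3\) for a few small values of N.

```python
from math import factorial, prod

def drift_velocities(N: int) -> list:
    """c_n = 2(N + 1 - 2n) for n = 1, ..., N, as in Eq. (147)."""
    return [2 * (N + 1 - 2 * n) for n in range(1, N + 1)]

def vandermonde(c: list) -> int:
    """Product over j < k of (c_j - c_k)."""
    return prod(c[j] - c[k]
                for j in range(len(c)) for k in range(j + 1, len(c)))

for N in (2, 3, 5, 8):
    c = drift_velocities(N)
    lhs = vandermonde(c)
    rhs = 4 ** (N * (N - 1) // 2) * prod(factorial(n) for n in range(1, N))
    sum_sq = sum(cn * cn for cn in c)
    print(N, lhs == rhs, sum_sq == 4 * N * (N * N - 1) // 3)
```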

We note that the drift velocities Eq. (147) correspond to Lyapunov exponents of \(e^{z_n}\) as a function of the rescaled time s [13]. Combining Eqs. (150), (115) and (91) then implies that \( \sigma _n(t) \sim e^{\lambda _n t}\) as \(t \rightarrow \infty \), where the Lyapunov exponents for the singular values as a function of time are given by

$$\begin{aligned} \lambda _n = -pL \log 2 + \frac{\Gamma }{8}(N+2-4n), \quad n =1,2,\ldots ,N. \end{aligned}$$
(152)

Thus the Lyapunov spectrum for weak measurements is linear in n for all n (in contrast to the Lyapunov spectrum for projective measurements, Eq. (25), which is only linear for \(n \ll M\)). In particular, we can read off the purification time defined by Eq. (15), which is given by

$$\begin{aligned} \tau _{\textrm{P}} = \Gamma ^{-1} \end{aligned}$$
(153)

for this model as expected.

Despite accurately capturing the behaviour of widely separated singular values, the asymptotic expression Eq. (150) leads to unphysical predictions for the joint distribution function of \(z_1\) and \(z_2\), predicting for example that the mean ratio of the two largest singular values \(\nu (t)=e^{2(z_2-z_1)}\) does not decay in time. This is because Eq. (150) omits the level repulsion implied by our ordering of the \(z_n\). To correct this, we make a more accurate long-time approximation to the exact distribution function Eq. (114) that is again expected to hold at asymptotically long times, whereby subleading singular values \(z_n\) with \(n \ge 3\) are still treated as log-normal but the leading two singular values \(z_1\) and \(z_2\) are treated exactly. Integrating over \(z_n\) for \(n \ge 3\) then yields the “improved” joint distribution function

$$\begin{aligned} \tilde{P}_{\mathrm {l.t.}}(z_1,z_2,s) = \frac{(z_1-z_2)}{4s}\left( 1-e^{-2(z_1-z_2)}\right) \frac{1}{4\pi s}e^{-\frac{(z_1-c_1s)^2+(z_2-c_2s)^2}{4s}} \end{aligned}$$
(154)

for the two leading singular values. Let us now confirm that this yields physically reasonable predictions for \(\nu (t)\).

It will be helpful to introduce centre-of-mass coordinates \(\bar{z} = \frac{z_1+z_2}{2}\) and \(\zeta = z_1 - z_2\), in which the improved joint distribution function

$$\begin{aligned} \tilde{P}_{\mathrm {l.t.}}(z_1,z_2,s) = A(\bar{z})B(\zeta ) \end{aligned}$$
(155)

is separable, with

$$\begin{aligned} A(\bar{z}) = \frac{1}{\sqrt{2\pi s}} e^{-\frac{1}{2s}(\bar{z}-(2N-4)s)^2} \end{aligned}$$
(156)

and

$$\begin{aligned} B(\zeta ) = \frac{1}{\sqrt{8\pi s}} \frac{\zeta }{4s} \left( e^{-\frac{1}{8s}(\zeta -4s)^2} - e^{-\frac{1}{8s}(\zeta +4s)^2} \right) . \end{aligned}$$
(157)

The latter manifestly exhibits level repulsion as \(\zeta \rightarrow 0^+\). With respect to the improved long-time measure Eq. (157), typical values of \(\nu (t)\) behave as

$$\begin{aligned} \mathbb {E}[\log {\nu }(t)] \sim -\Gamma t, \quad \Gamma t \gg 1, \end{aligned}$$
(158)

implying that

$$\begin{aligned} \mathbb {E}[\log {S_{\textbf{m}}^{(\alpha )}(t)}] \sim {\left\{ \begin{array}{ll} -\Gamma t, &{} \alpha \ge 1, \\ -\alpha \Gamma t, &{} 0<\alpha < 1, \end{array}\right. } \end{aligned}$$
(159)

which recovers Eqs. (158) and (159) at times \(t \gg \Gamma ^{-1}\), while the mean value of \(\nu (t)\) also decays exponentially,

$$\begin{aligned} \mathbb {E}[\nu (t)] \sim \frac{16}{9} \frac{1}{\sqrt{\pi (\Gamma t)^3}} e^{-\Gamma t/4}, \quad \Gamma t \gg 1, \end{aligned}$$
(160)

(albeit at the slower rate \(\Gamma /4\), compared with the rate \(\Gamma \) of the typical value), implying for example that the mean \(\alpha >1\) Rényi entropies

$$\begin{aligned} \mathbb {E}[S_{\textbf{m}}^{(\alpha )}(t)] \sim \frac{\alpha }{\alpha -1} \frac{16}{9} \frac{1}{\sqrt{\pi (\Gamma t)^3}} e^{-\Gamma t/4}, \quad \alpha > 1, \end{aligned}$$
(161)

decay exponentially for \(t \gg \Gamma ^{-1}\).
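The asymptotic statements in Eqs. (158) and (160) can be tested by direct quadrature of the relative-coordinate density \(B(\zeta )\) in Eq. (157), using \(\nu = e^{-2\zeta }\). The sketch below is a numerical illustration rather than part of the derivation. It assumes the time rescaling \(s = \Gamma t/8\), which we infer by comparing Eq. (137) with Eqs. (138) and (145); the integration cutoff, the Simpson-rule helper and the value \(\Gamma t = 200\) are our own choices. For comparison it also evaluates the prediction of the uncorrected log-normal form Eq. (150), under which \(z_2-z_1\) is Gaussian with mean \(-4s\) and variance \(4s\).

```python
import math

def B(zeta: float, s: float) -> float:
    """Relative-coordinate density of Eq. (157), defined for zeta >= 0."""
    pref = zeta / (4.0 * s) / math.sqrt(8.0 * math.pi * s)
    return pref * (math.exp(-(zeta - 4.0 * s) ** 2 / (8.0 * s))
                   - math.exp(-(zeta + 4.0 * s) ** 2 / (8.0 * s)))

def simpson(f, lo: float, hi: float, n: int = 20000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0

gamma_t = 200.0                  # illustrative long time, Gamma * t >> 1
s = gamma_t / 8.0                # ASSUMED rescaling, inferred from Eqs. (137), (138), (145)
zeta_max = 4.0 * s + 12.0 * math.sqrt(4.0 * s)   # cutoff 12 standard deviations past the peak

norm = simpson(lambda x: B(x, s), 0.0, zeta_max)
mean_log_nu = simpson(lambda x: -2.0 * x * B(x, s), 0.0, zeta_max)
mean_nu = simpson(lambda x: math.exp(-2.0 * x) * B(x, s), 0.0, zeta_max)

# Leading asymptotics quoted in Eq. (160)
asymptotic = 16.0 / 9.0 / math.sqrt(math.pi * gamma_t ** 3) * math.exp(-gamma_t / 4.0)

# Uncorrected log-normal form: E[exp(2x)] = exp(2*mean + 2*var) for Gaussian x
naive_mean_nu = math.exp(2.0 * (-4.0 * s) + 2.0 * (4.0 * s))

print("normalization        :", norm)
print("E[log nu] / (-Gamma t):", mean_log_nu / (-gamma_t))
print("E[nu] / Eq. (160)     :", mean_nu / asymptotic)
print("E[nu] without repulsion:", naive_mean_nu)
```

The ratios approach one only as inverse powers of \(\Gamma t\), so agreement at the level of a few percent is what one should expect at this value of \(\Gamma t\).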

4 Conclusion

We have presented various solvable models of monitored quantum circuits (“monitored Haar-random quantum dots”) that realize the entangling phase of monitored quantum dynamics. While these models have antecedents in the literature [9,10,11], the full extent of their analytical tractability does not appear to have been exploited until now. This analytical tractability has allowed us to derive the first exact expressions that we are aware of for quantities such as the purification time, the Lyapunov spectrum and the distribution of Born probabilities in the entangling phase of a monitored quantum system. By constructing explicit mappings from monitored Haar-random quantum dots to well-studied models in random matrix theory, such as products of truncated unitary matrices [18, 29] and isotropic Brownian motion [13, 40], we have further provided a template for realizing random-matrix universality in monitored quantum systems.

Our proposed notion of universality for such systems is analogous to the use of random matrix theory (RMT) as a description of the spectra of closed quantum systems; in the same way that RMT captures spectral correlations of generic chaotic quantum systems, we conjecture that the models discussed in this paper capture certain features of the entangling phase of generic monitored quantum systems, including the spatially local quantum circuits in which the entangling phase was first identified [2, 4].

Before describing our proposal in detail, let us explain how universality for monitored quantum systems must differ from random-matrix universality and the related eigenstate thermalization hypothesis (ETH) [55] in closed quantum systems. Most obviously, the qualitative behaviour of the unstructured models considered in this paper depends on three parameters: the Hilbert space dimension N, the purification time \(\tau _{\textrm{P}}\) and the time t under consideration. The latter two parameters have no natural counterpart in closed quantum systems, whose dynamics is Hamiltonian and therefore time-translation invariant. In particular, we expect that monitored quantum systems will exhibit different facets of universal behaviour depending on the dimensionless parameter \(t/\tau _{\textrm{P}}\), with qualitative differences between the short-time (\(t / \tau _{\textrm{P}} \ll 1\)) and long-time (\(t / \tau _{\textrm{P}} \gg 1\)) regimes. Second, instead of the matrix elements of local observables and energy-level-spacing statistics that are usually of primary interest in formulating and testing ETH [55], the monitored setting suggests new arenas for universality, such as the statistics of single-trajectory Born probabilities and Rényi entropies discussed in this paper, both of which reflect the singular-value statistics of the underlying Kraus operators.

First consider the short-time regime, \(t/\tau _{\textrm{P}} \ll 1\). Our results on the distribution of Born probabilities in Sect. 2.5 and on the sample-to-sample fluctuations of Rényi entropies in Sect. 3.4 suggest that for the models studied in this paper, both the Born probabilities and the Rényi entropies will be narrowly distributed about their means (possibly on a log scale) in the short-time regime. We expect this conclusion to hold more generally for the entangling phase in arbitrary spatially local models. This means that on timescales that are short compared to the purification time (excluding initial transients as in Sect. 3.3), the behaviour of the system is to a first approximation decoupled from its measurement history, and to this extent, all quantum trajectories are statistically alike.

By contrast, our results on both Born probabilities and Rényi entropies at long times (Sects. 2.5, 2.6.2 and 3.5) indicate that these quantities exhibit a broad and normal (or normal-derivative-like) distribution on a log scale at long times \(t/\tau _{\textrm{P}} \gg 1\). Thus the observed behaviour in the long-time regime will depend strongly on the measurement history, and distinct quantum trajectories will exhibit very different properties. We expect similar behaviour for the entangling phase of spatially local systems at long times, even at the level of the shapes of these probability distributions, which should reflect universal properties of large products of identically distributed random matrices [35].

Finally, we note that the state-vector of the system at long times should exhibit some degree of universality by Eq. (13), which predicts that the long-time density matrix along any given trajectory is a random pure state. This is reminiscent of Berry’s conjecture [56], which posits that the eigenvectors of a chaotic Hamiltonian are distributed as Gaussian random vectors. For the projectively-measured Haar-random quantum dot, the long-time state-vector \(\textbf{v}_1\) in Eq. (15) is distributed as a Gaussian random vector in the M-dimensional image of the most recent measurement layer. More generally, for spatially local systems the long-time state-vector \(\textbf{v}_1\) is no longer expected to be perfectly Gaussian in the entangling phase, as can be seen from the presence of logarithmic-in-L corrections to the bipartite entanglement entropy of the long-time state in spatially local systems [57, 58], which would otherwise exhibit purely volume-law dependence on L as predicted by the Page curve [59].

Note that in this paper, we do not consider the Born-rule weighted averages of quantities such as Rényi entropies. We emphasise that the possibility of Born-rule weighting does not arise for the Lyapunov exponents, which by definition characterize a product of independently and identically distributed random matrices. On the other hand, for Rényi entropies we expect that introducing Born-rule averaging will leave our results essentially unchanged at times \(t \ll \tau _{\textrm{P}}\) but will have an effect at times \(t \gg \tau _{\textrm{P}}\). Accounting for Born-rule averaging within the approach in Sect. 3 would lead to a distinct Fokker–Planck equation from the one studied above, as discussed in Ref. [9].

Important goals for future work include directly testing the above predictions of universality in spatially local realizations of the entangling dynamical phase (for which there is currently a dearth of analytical understanding) and developing a similarly universal characterization of disentangling phases. Another interesting question is whether the kinds of quantities that we have computed analytically in this paper, such as Lyapunov spectra and distributions of Born probabilities, can shed further light on monitored phases of “non-interacting” or Gaussian quantum circuits [9, 10, 60,61,62,63], for which it is known that the purification time scales quadratically with the system size [9, 10]. In the Haar-random setting, it seems worth understanding how far the generalizations of the Porter–Thomas distribution that we identify in Sect. 2.5 imply hardness-of-sampling results analogous to what is known for random unitary circuits [22]. Such results would both complement the prediction of a computational-complexity transition in monitored random circuits [64] and lend theoretical support to proposals [65, 66] for diagnosing monitored dynamical phases that avoid post-selection by computing cross-entropies instead.

Shortly after this work was completed, related results appeared in two papers [67, 68].