Scattering in quantum dots via noncommutative rational functions

In the customary random matrix model for transport in quantum dots with $M$ internal degrees of freedom coupled to a chaotic environment via $N\ll M$ channels, the density $\rho$ of transmission eigenvalues is computed from a specific invariant ensemble for which an explicit formula for the joint probability density of all eigenvalues is available. We revisit this problem in the large $N$ regime allowing for (i) an arbitrary ratio $\phi := N/M \le 1$; and (ii) general distributions for the matrix elements of the Hamiltonian of the quantum dot. In the limit $\phi \to 0$ we recover the formula for the density $\rho$ that Beenakker (Rev. Mod. Phys., 69:731-808, 1997) derived for a special matrix ensemble. We also prove that the inverse square root singularity of the density at zero and full transmission in Beenakker's formula persists for any $\phi<1$, but in the borderline case $\phi=1$ an anomalous $\lambda^{-2/3}$ singularity arises at zero. To access this level of generality we develop the theory of global and local laws on the spectral density of a large class of noncommutative rational expressions in large random matrices with i.i.d. entries.


Introduction and results
Since the pioneering discovery of E. Wigner on the universality of eigenvalue statistics of large random matrices [38], random matrix theory has become one of the most successful phenomenological theories to study disordered quantum systems, see [2] for a broad overview. Among many other applications, it has been used for open quantum systems and quantum transport, in particular to predict the distribution of transmission eigenvalues of scattering in quantum dots and wires. The theory has been developed over many excellent works starting with the ground-breaking papers by Mello, Pereyra, Seligman [30] and by Verbaarschot, Weidenmüller and Zirnbauer [37]; for a complete overview with extensive references see reviews by Beenakker [7,8], Fyodorov and Savin [21] and Schomerus [35].
We will focus on quantum dots, i.e. systems without internal structure, coupled to an environment (electron reservoir) via scattering channels. Quantum wires, with a typically quasi one-dimensional internal structure, will be left for future work. In the simplest setup the quantum dot is described by a self-adjoint Hamiltonian (a complex Hermitian matrix) H ∈ C^{M×M} acting on an M-dimensional state space C^M. It is coupled to an environment with N_0 effective degrees of freedom via an M × N_0 complex coupling matrix W. Following Wigner's paradigm, both the Hamiltonian H and the coupling matrix W are drawn from random matrix ensembles respecting the basic symmetries of the model. Typically the entries of W are independent, identically distributed (i.i.d.), while H is a Wigner matrix, i.e. it has i.i.d. entries on and above the diagonal. We allow for general distributions, in contrast to most existing works in the literature that assume H has a Gaussian or Lorentzian distribution.
One common assumption is that the interaction W is independent of the Fermi energy E. At any fixed energy E ∈ R we define the scattering matrix (see [7, Eq. (81)]). This is the finite dimensional analogue of the Mahaux-Weidenmüller formula in nuclear physics [28] that can be derived from H in the N_0 → ∞ limit. The definition (1.1) will be the starting point of our mathematical analysis. Since H = H^*, one can easily check that S(E) is unitary.
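Since the displayed equation (1.1) is not reproduced in this excerpt, the following numerical sketch assumes the standard Mahaux-Weidenmüller form S(E) = I − 2iγ W^*(E − H + iγWW^*)^{-1}W with γ = π (the function name `scattering_matrix` and all dimensions are our illustration, not the paper's); it checks the unitarity of S(E) that follows from H = H^*.

```python
import numpy as np

rng = np.random.default_rng(0)

def scattering_matrix(H, W, E, gamma=np.pi):
    """Assumed Mahaux-Weidenmueller form of Eq. (1.1):
    S(E) = I - 2i*gamma * W^* (E - H + i*gamma*W W^*)^{-1} W."""
    M, N0 = W.shape
    G = np.linalg.inv(E * np.eye(M) - H + 1j * gamma * (W @ W.conj().T))
    return np.eye(N0) - 2j * gamma * (W.conj().T @ G @ W)

M, N0 = 200, 40
# Wigner matrix H: i.i.d. complex Gaussian entries, variance 1/M
A = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2 * M)
H = (A + A.conj().T) / np.sqrt(2)
# coupling W: i.i.d. complex Gaussian entries of variance 1/M
W = (rng.standard_normal((M, N0)) + 1j * rng.standard_normal((M, N0))) / np.sqrt(2 * M)

S = scattering_matrix(H, W, E=0.0)
unitarity_error = np.linalg.norm(S.conj().T @ S - np.eye(N0))
print(unitarity_error)  # numerical zero: S(E) is unitary whenever H = H^*
```

The unitarity is an exact algebraic consequence of H = H^*: the anti-Hermitian part iγWW^* of the effective Hamiltonian cancels in S^*S, for any distribution of H and W.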
To distinguish between transmission and reflection, we assume that the scattering channels are split into two groups, left and right channels, with dimensions N_1 + N_2 = N_0, and the interaction Hamiltonian is also split accordingly; W = (W_1, W_2) ∈ C^{M×(N_1+N_2)}. Therefore S(E) has a natural 2 × 2 block structure and we can write it as (see [7, Eq. (23)]) where R ∈ C^{N_1×N_1}, R′ ∈ C^{N_2×N_2} are the reflection matrices and T ∈ C^{N_2×N_1}, T′ ∈ C^{N_1×N_2} are the transmission matrices.
As a consequence of unitarity, one finds that TT^*, T′(T′)^*, I − RR^* and I − R′(R′)^* have the same set of nonzero eigenvalues. For simplicity, we assume that N_1 = N_2 = N, i.e. generically these four matrices have no zero eigenvalues; the general N_1 ≠ N_2 case has been considered in [11]. We denote these transmission eigenvalues by λ_1, λ_2, ..., λ_N. They express the rate of the transmission through each channel. By unitarity of S, clearly λ_i ∈ [0, 1] for all i; λ_i = 0 means the channel is closed, while λ_i = 1 corresponds to a fully open channel. The transmission eigenvalues carry important physical properties of the system. For example, Tr TT^* = Σ_i λ_i gives the zero temperature conductance, while Σ_i λ_i(1 − λ_i) is the shot noise power, giving the zero temperature fluctuation of the current (Büttiker's formula, [12], [7, Eq. (35)]). The dimensionless ratio of the shot noise power and the conductance, F = Σ_i λ_i(1 − λ_i) / Σ_i λ_i, is called the Fano factor (see [8, Eq. (2.15)]). The current fluctuation is therefore given by a certain linear statistics of the transmission eigenvalues and thus it can be computed from the density ρ of these eigenvalues. Therefore, determining ρ is a main task in the theory of quantum dots. In many physical situations it is found that ρ has a bimodal structure with a peak at zero and a peak at unit transmission rates. Furthermore, ρ exhibits a power law singularity at the edges of its support [0, 1]. One main result of the theory in [7] is that in the M ≫ N ≫ 1 regime at energy E = 0 the density of transmission eigenvalues for a quantum dot is given by

ρ_Bee(λ) = 1 / (π √(λ(1 − λ))),    (1.5)

(the answer is different for quantum wires), i.e. it has an inverse square root singularity at both edges, see [8, Eq. (3.12)]. In this case, the Fano factor is F = 1/4, which fits the experimental data well. The goal of this paper is to revisit and substantially generalize the problem of transmission eigenvalues with very different methods than those used by Beenakker and collaborators.
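The value F = 1/4 follows from ρ_Bee by direct integration: ⟨λ⟩ = 1/2 and ⟨λ(1 − λ)⟩ = 1/8 for the density (1.5). A short numerical check (the helper `average` is our illustration; the substitution λ = sin²θ removes both edge singularities):

```python
import numpy as np
from scipy.integrate import quad

# Beenakker density rho_Bee(lambda) = 1/(pi*sqrt(lambda(1-lambda))) on [0,1].
# With lambda = sin^2(theta) one has rho_Bee(lambda) d(lambda) = (2/pi) d(theta).
def average(f):
    """Average of f against rho_Bee, computed on the regularized theta variable."""
    val, _ = quad(lambda th: f(np.sin(th) ** 2) * 2 / np.pi, 0.0, np.pi / 2)
    return val

conductance = average(lambda lam: lam)              # -> 1/2  up to quadrature error
shot_noise  = average(lambda lam: lam * (1 - lam))  # -> 1/8  up to quadrature error
fano = shot_noise / conductance                     # -> 1/4
print(conductance, shot_noise, fano)
```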
While those works used invariant matrix ensembles for H and relied on explicit computations for the circular ensemble, we consider very general distributions for the matrix elements of H. In particular, we show that in the regime φ := N/M ∈ (0, 1], M → ∞, the empirical density of transmission eigenvalues has a deterministic limit ρ = ρ_φ and we give a simple algebraic equation to compute it. The solution can be continuously extended as φ → 0; the equation simplifies for φ = 0, the density becomes explicitly computable, and it coincides with (1.5): lim_{φ→0} ρ_φ = ρ_{φ=0} = ρ_Bee for E = 0, and our formula holds for any E in the bulk spectrum of H. While no short explicit formula is available for the general case φ ∈ (0, 1], we can analyse the singularities of ρ_φ for any fixed φ ∈ (0, 1] in detail. More precisely, we rigorously prove that for any fixed φ ∈ (0, 1) the density ρ_φ has an inverse square root singularity both at 0 and at 1,

ρ_φ(λ) ∼ λ^{−1/2}, for 0 < λ ≪ 1,  and  ρ_φ(λ) ∼ (1 − λ)^{−1/2}, for 0 < 1 − λ ≪ 1,   (1.6)

qualitatively in line with ρ_Bee from (1.5). However, ρ_φ is not symmetric around λ = 1/2 and the Fano factor at E = 0 slightly differs from 1/4 when φ ≠ 0. Fig. 1.1 shows the Fano factor for different values of φ numerically computed from our theory. We mention that the deviation from 1/4 is within 2% for the entire range of φ ∈ [0, 1), which is well below the error bar of the experimental results presented in Fig. 6 of [7] and adapted from [31]. The value φ = 1 is special, since the singularity of ρ at λ ≈ 0 changes from λ^{−1/2} in (1.6) to

ρ_{φ=1}(λ) ∼ λ^{−2/3}, for 0 < λ ≪ 1,   (1.7)

while the inverse square root singularity at 1 persists. This enhancement of the singularity signals the emergence of a δ-function component at 0 in ρ_φ as φ becomes larger than 1, which is a direct consequence of TT^* not having full rank when N > M.
We remark that this regime is quite unphysical since it corresponds to more scattering channels than the total number of internal states of the quantum dot. In realistic quantum dots N is smaller than M since not every mode of the dot may participate in scattering. We therefore do not pursue the detailed analysis of ρ φ for φ > 1, although our method can easily be extended to these φ's as well.
There are two main differences between our model and that of Beenakker et al. First, the distribution imposed on the random matrix H is different and we consider random W. Second, our current method works in the entire regime φ = N/M ∈ (0, 1], while Beenakker assumes M ≫ N, i.e. φ ≪ 1. We now explain both differences. Following Wigner's original vision, any relevant distribution must respect the basic symmetry of the model; in our case this demands that H be complex Hermitian, while no symmetry constraint is imposed on W. Respecting basic symmetries, one may define ensembles essentially in two ways. Invariant ensembles are defined by imposing that the entire distribution be invariant; this is typically achieved via a global Gibbs factor times the natural flat measure on the space of matrices satisfying the basic symmetry. Wigner ensembles and their generalizations impose distributions directly on the matrix elements and often demand independence (up to the basic symmetry constraint). These two procedures typically yield different ensembles.
While in the simplest case of random Hermitian matrices both types of ensembles have been actively investigated, for ensembles with a more complicated structure, like our S from (1.1) that is a rational function of the basic ingredients H and W, until recently only the invariant approach was available. Sophisticated explicit formulas have been developed to find the joint distribution of eigenvalues for more and more complicated structured ensembles (see [18]), which could then be combined with orthogonal polynomial methods to obtain local correlation functions. The heavy reliance on explicit formulas, however, imposes a serious limitation on how complicated the functions of random matrices, and how general the distributions of their entries, can be. For example, the Gibbs factor is often restricted to Gaussian or closely related ensembles to remain in the realm of explicitly computable orthogonal polynomials.
There have been considerable developments for the other type of ensembles in recent years. Departing from the invariant world, about ten years ago the Wigner-Dyson-Mehta universality of local eigenvalue statistics was proven for Wigner matrices with i.i.d. entries, see [16] and references therein. Later the i.i.d. condition was relaxed, and even matrices with quite general correlation structures among their entries can now be handled [14]. One of the key ingredients was a better understanding of the Matrix Dyson equation (MDE), the basic equation governing the density of states [1]. Together with the linearization trick, this allows us to handle arbitrary polynomials in i.i.d. random matrices [13], and in the current work we extend our method to a large class of rational functions. Note that even if the building block matrices have independent entries, the linearization of their rational expressions will have dependence, but the general MDE can handle it (see (2.13)). In this work we deal only with bounded rational functions and relatively straightforward regularizations of unbounded rational functions; the general theory of unbounded rational functions is still in development, see [29] and references therein. We stress that the distribution of the matrix elements of H and W can be practically arbitrary. In particular, our result is not restricted to the Gaussian world.
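The linearization trick mentioned above can be seen in the smallest possible example; the following 2×2 sketch is our illustration only (the paper's linearization H_w is an 8N×8N self-adjoint analogue of this construction):

```latex
% Toy linearization of the quadratic expression q = a b^*:
\[
  L \;=\; \begin{pmatrix} 0 & a \\ b^* & -1 \end{pmatrix},
  \qquad
  J \;=\; \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
\]
% and by the Schur complement formula the (1,1) entry of the generalized
% resolvent of the pencil is
\[
  \bigl[(L - zJ)^{-1}\bigr]_{11}
  \;=\; \bigl(-z - a\,(-1)^{-1} b^*\bigr)^{-1}
  \;=\; \bigl(a b^* - z\bigr)^{-1},
\]
```

so the resolvent of the quadratic expression ab^* is read off from a pencil that is linear in a and b^*; denominators of a rational expression are handled by enlarging the pencil further in the same way.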
In comparison, in the result of Beenakker et al. the coupling matrix is deterministic, W = Q ∈ C^{M×2N} with Q_ij = δ_ij. In fact, for the sake of explicit calculations, it is necessary to replace the GUE by a Lorentzian distribution (irrelevant constants ignored), and then the density of transmission eigenvalues can be easily computed, yielding (1.5) in the N → ∞ limit. While Beenakker's result relies on an impressive identity, it allows little flexibility in the inputs: H needs to be Lorentzian with very large dimension; moreover M ≫ N and E = 0 are required. In contrast, our setup allows for a large freedom in choosing the distribution of H, it covers the entire range M ≥ N and also applies to any E in the bulk; however, for computational simplicity we model the contacts W by a random matrix as well. While the most relevant regime for scattering on quantum dots is still M ≫ N, as scattering involves surface states only, a very recent work [20] introduces absorbing channels well inside the quantum dot which leads to physical models with M ∼ N.
The flexibility of our result stems from the fact that our method directly targets the density of states via an extension of the MDE theory to linearizations of rational functions of random matrices. Attempting to find the joint distribution of all eigenvalues would be unnecessarily ambitious and would require too restrictive conditions; even for Wigner matrices this is a hopeless task beyond the Gaussian regime. We remark that the present analysis of the density of transmission eigenvalues for the quantum dot is only one convenient application of our approach. The method is powerful enough to answer many related questions concerning the density of states, such as the analysis of scattering poles [19], as well as extensions from quantum dots to quantum wires that we will address in future work.

Model and main theorem
In order to accommodate all parameters that will appear in our analysis, we introduce a generalized version of the scattering matrix (1.1) with γ > 0 and w ∈ C^+ for φ ∈ (0, 1/2] and w ∈ C^+ ∪ R for φ ∈ (1/2, 1], where C^+ := {z ∈ C : Im z > 0} denotes the complex upper half plane. The constant γ > 0 quantifies the coupling between the internal quantum dot Hamiltonian H and the external leads W_1, W_2, i.e. the effective Hamiltonian of the open quantum system becomes H − iγWW^*. The spectral parameter w is used to regularize the potentially unstable inverse in (1.9). Note that for φ ∈ (0, 1/2) the matrix WW^* has a nontrivial kernel (and even for φ = 1/2 it has a very small eigenvalue), hence initially a regularization with a small positive imaginary part is necessary, which will later be carefully removed. For the technically easier φ ∈ (1/2, 1] regime we may directly choose w to be real, since the eigenvalues of WW^* are bounded away from zero with very high probability. The general formula (1.9) recovers the scattering matrix (1.1) by setting γ = π, Re w = E and Im w = 0. The regularized scattering matrix still has the 2 × 2 block structure from (1.2) with T ∈ C^{N×N}. We consider the following random matrix model (see (1.1) and (1.2)) where the triple of parameters (w, φ, γ) belongs to the same set as for (1.9). Furthermore, for M, N ∈ N, φ := N/M, the matrices H ∈ C^{M×M} and W_1, W_2 ∈ C^{M×N} are three independent random matrix ensembles satisfying the following assumptions: (H1qd) H is a Hermitian random matrix having independent (up to the symmetry constraint) centered entries of variance 1/M; (H2qd) W_1 and W_2 are (non-Hermitian) random matrices having independent centered entries of variance 1/M; (H3qd) the entries of H, W_1 and W_2, denoted by H(i, j), W_1(i, j) and W_2(i, j) correspondingly, have finite moments of all orders, i.e., there exist ϕ_n > 0, n ∈ N, such that Remark 1.1 (Constant matrices).
In (1.10) and later in the paper, for B ∈ C, n ∈ N and I_n ∈ C^{n×n} the identity matrix of size n, we use the shorthand notation B for the matrix B · I_n. This notation is used only when the dimension of I_n can be unambiguously determined from the context.
Remark 1.2. In the sequel we consider the coupling constant γ to be a fixed positive number, therefore we will omit the dependence on γ in our notation.
Denote by μ_{T_{w,φ}}(dλ) := (1/N) Σ_{i=1}^N δ_{λ_i}(dλ) the empirical spectral measure of T_{w,φ}, where λ_i are the eigenvalues of the Hermitian matrix T_{w,φ}. To simplify the presentation, we will assume in this paper that the dimensions of the matrices H, W_1 and W_2 grow along a subsequence (N, M) = (kn, ln), n ∈ N, i.e. N/M = φ is kept fixed. One could easily extend our argument to the general situation of two sequences N = N_n and M = M_n tending to infinity such that φ_n = N_n/M_n → φ.
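The empirical spectral measure μ_{T_{w,φ}} can be sampled directly. The sketch below again assumes the Mahaux-Weidenmüller form for (1.1) (our assumption; the displayed equation is not part of this excerpt), takes φ = 0.1, E = 0 and γ = π, and confirms that the sampled transmission eigenvalues lie in [0, 1], as unitarity of S dictates:

```python
import numpy as np

rng = np.random.default_rng(1)

M, N = 300, 30                      # phi = N/M = 0.1
gamma, E = np.pi, 0.0

def ginibre(m, n):
    # i.i.d. complex Gaussian entries of variance 1/M, as in (H2qd)
    return (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * M)

A = ginibre(M, M)
H = (A + A.conj().T) / np.sqrt(2)                 # Wigner matrix, entry variance 1/M
W = np.hstack([ginibre(M, N), ginibre(M, N)])     # W = (W1, W2) of size M x 2N

# assumed shape of (1.1): S = I - 2i*gamma * W^*(E - H + i*gamma*W W^*)^{-1} W
G = np.linalg.inv(E * np.eye(M) - H + 1j * gamma * (W @ W.conj().T))
S = np.eye(2 * N) - 2j * gamma * (W.conj().T @ G @ W)

T = S[N:, :N]                                     # lower-left N x N block of (1.2)
lam = np.linalg.eigvalsh(T @ T.conj().T)          # transmission eigenvalues

# unitarity of S forces the empirical measure of T T^* to live on [0, 1]
print(lam.min(), lam.max())
```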
We now formulate our two main results. The first one, Theorem 1.3, is the global law for the empirical eigenvalue density; it shows that µ T w,φ (dλ) has a deterministic limit denoted by ρ w,φ (dλ). The second result, Theorem 1.4, contains key properties of ρ w,φ (dλ). It shows that the regularization in the spectral parameter w can be removed and that ρ w,φ can be continuously extended to φ → 0; moreover, it explicitly identifies its singularities at zero and one. Theorem 1.3 (Global law). Fix a positive real number γ > 0 and a rational number φ ∈ (0, 1] ∩ Q. Let w be a fixed spectral parameter: for φ ∈ (0, 1/2] we assume that w ∈ C + , while for φ ∈ (1/2, 1] we assume w ∈ C + ∪ R. Then there exists a deterministic probability measure ρ w,φ (dλ) with supp ρ w,φ ⊂ [0, 1] such that µ T w,φ (dλ) converges weakly to ρ w,φ (dλ) in probability (and almost surely) as M, N → ∞ with N/M = φ. Theorem 1.4 (Properties of the transmission eigenvalue density). Let the numbers γ, w, φ and the measure ρ w,φ (dλ) be as in Theorem 1.3.
(ii) If φ = 1 and γ > 0, then ρ_{E,φ}(λ) has the following asymptotics near 0 and 1: where (iii) If φ ∈ (0, 1) and γ > 0, then where ξ_0 > 0 is the unique positive solution of (iv) If φ = 0, γ > 0 and |E| < 2 the density is given globally by the explicit formula Proofs of Theorems 1.3 and 1.4. We now explain the structure of the paper, which contains the proof of both theorems. The density function ρ_{E,φ}(λ) will be derived from a deterministic equation, the Dyson equation of the linearization of (1.10) (see equations (2.13) and (2.27) for φ = 1, as well as (3.7) and (3.18) for φ ∈ (0, 1) below). This equation plays a central role in our analysis. Sections 2 and 3 below contain the proofs of the model specific parts of Theorem 1.4, which use some key conclusions of the general theory of noncommutative rational functions developed in Appendix A. The φ = 1 case is treated in Section 2 with full details, while in Section 3 we explain the modifications for the general rational φ ∈ (0, 1) case. Theorem 1.3 is proven in Lemma 2.5 for φ = 1 and Lemma 3.4 for general rational φ ∈ (0, 1) using the general global law from Appendix A. Part (ii) is proven in Section 2.5 after having established key properties of the solution to the Dyson equation. Section 3 follows the same structure as Section 2 and proves parts (i), (iii) and (iv) of Theorem 1.4 for the φ ∈ [0, 1) case.
Remark 1.5. The restriction |E| < E * in Theorem 1.4 is used to identify the regime in which the density ρ E,φ has two singularities. For |E| > E * , the singularity at λ = 1 disappears and the support of the density becomes bounded away from λ = 1. Physically this indicates that there are no fully open channels in this regime. This effect is most severe in the case φ = 0, where the system becomes completely reflective as |E| increases above the threshold E * = 2. Our approach is also applicable for |E| > E * , but for the simplicity of the presentation we restrict our analysis to the physically more relevant situation when two singularities exist simultaneously.
We formulated Theorem 1.3 about the specific matrix (1.10). However, our method works for very general noncommutative (NC) rational expressions in large matrices with i.i.d. entries (with or without Hermitian symmetry) generalizing our previous work [13] on polynomials. For convenience of the readers interested only in the concrete scattering problem the main part of our paper focuses on this model and we defer the general theory to Appendix A. The details of this appendix are not necessary in order to follow the main arguments in Sections 2 and 3 provided some basic facts from the appendix are accepted. These facts will be re-stated for the specialization to our concrete operator T w,φ from (1.10). On the other hand, Appendix A is written in a self-contained form, so readers interested in the general theory can directly go to it, skipping the concrete application.
In Appendix A, we first give the precise definition of the set of admissible rational expressions that requires some technical preparation, see Section A.1 for details. Roughly speaking, we can consider any rational expression whose denominators are stably invertible with overwhelming probability. We remark that this property holds for (1.10) since the imaginary part of w − H + i γW W * has a positive lower bound as long as φ > 1/2 or w ∈ C + . With this definition at hand, we develop the theory of global and local laws as well as the identification of the pseudospectrum for such rational expressions in Sections A.6 and A.5, respectively.
2 Proof of Theorem 1.4 for the special case φ = 1

In this section we study the model (1.10) for fixed γ > 0, φ = 1 and w ∈ C^+ ∪ R. This special choice of the parameter φ ensures that the linearization of T_{w,φ} has a fairly simple structure, which makes the proofs of Theorems 1.3 and 1.4 more transparent and streamlined. The generalization to φ ∈ (0, 1) is postponed to Section 3. Since the parameter φ is fixed to be equal to 1, we will omit the dependence on φ from the notation throughout the current section. The information about linearizations of general rational functions is collected in Appendix A.1. Here we often refer to the specialization of these results to T_w.

Linearization trick and the Dyson equation for linearization
We consider the random matrix model T_w defined in (1.10) as a self-adjoint rational function of the random matrices H, W_1 and W_2. In order to study its eigenvalues we introduce the self-adjoint linearization matrix H_w ∈ C^{8N×8N} with w ∈ C^+ ∪ R. Denote by J_m ∈ C^{m×m}, m ∈ N, the matrix whose (1, 1) entry is equal to 1 and all other entries are equal to 0. For any n ∈ N and R ∈ C^{n×n} we denote by ‖R‖ the operator norm of R.
The following definition is a specialization of the notion of generalized resolvent from Appendix A.3.
Definition 2.1 (Generalized resolvent). We call the matrix-valued function z → (H w − zJ 8 ⊗ I N ) −1 defined for z ∈ C + the generalized resolvent of H w .
The results below are formulated using the notion of asymptotically overwhelming probability.
Definition 2.2. We say that a sequence of events {Ω_N}_{N∈N} holds asymptotically with overwhelming probability (a.w.o.p. for short) if for any D > 0 there exists C_D > 0 such that Consider the set Note that WW^* is a sample covariance matrix with concentration ratio 1/2, hence its spectrum is asymptotically supported on an arbitrarily small neighborhood of its limiting support with very high probability. The limiting support follows from the classical Bai-Yin theorem [6]; the corresponding large deviation result under somewhat different conditions follows, e.g., from Corollary V.2.1 of [17]. In fact, the boundedness of ‖(WW^*)^{-1}‖ and ‖W_i‖ also follows from [33], at least for subgaussian distributions. Alternatively, under our conditions (1.11) this result also follows from Lemma B.1 in the Appendix by choosing n = M, l = 1 and m = 2. Similarly, from the properties of classical Wigner and i.i.d. ensembles (see, e.g., [5, Section 5]), we obtain that the event {‖H‖ ≤ 3, ‖W_1‖ ≤ 3, ‖W_2‖ ≤ 3} also holds a.w.o.p., and we conclude that for any D > 0 there exists C_D > 0 such that In the rest of this section we consider the random matrix models (1.9) and (1.10) restricted to the set Θ_N. Since on this set the smallest eigenvalue of WW^* cannot be smaller than 1/12, we have that ‖(w − H + iγWW^*)^{-1}‖ ≤ 12/γ for all w ∈ C^+ ∪ R, and thus the model (1.10) is well defined on Θ_N. In the next lemma we establish an a priori bound for the generalized resolvent (H_w − zJ_8 ⊗ I_N)^{-1}.
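The role of the Bai-Yin/Marchenko-Pastur input is easy to check numerically: for φ = 1 the matrix W = (W_1, W_2) is M × 2M, so WW^* is a sample covariance matrix with concentration ratio 1/2 and limiting spectrum [(√2 − 1)², (√2 + 1)²] ≈ [0.17, 5.83], comfortably above the threshold 1/12 appearing in the definition of Θ_N. Gaussian entries are used here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
M = 400                       # phi = 1, so W = (W1, W2) is M x 2M
W = (rng.standard_normal((M, 2 * M)) + 1j * rng.standard_normal((M, 2 * M))) / np.sqrt(2 * M)

eigs = np.linalg.eigvalsh(W @ W.conj().T)

# Marchenko-Pastur edges for concentration ratio M/(2M) = 1/2,
# rescaled so that E[W W^*] = 2*I:  [(sqrt(2)-1)^2, (sqrt(2)+1)^2]
lower, upper = (np.sqrt(2) - 1) ** 2, (np.sqrt(2) + 1) ** 2
print(eigs.min(), eigs.max())   # close to 0.17 and 5.83 for large M
```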
uniformly for all z ∈ C + and w ∈ C + ∪ R.
(ii) For all w ∈ C + ∪ R, z ∈ C + and 1 ≤ i, j ≤ N be the linearization matrix obtained via the linearization algorithm described in Appendix A.2 where δ αβ is the Kronecker delta.
Note that all the transposition matrices P_ij in (2.8) leave the upper-left N × N block intact. Thus (2.6) follows from Definition A.5 of the linearization and the Schur complement formula (see, e.g., (A.20)) by taking A = C^{N×N}.
In order to prove the bound (2.5), consider the set Θ_N defined in (2.3). One can see that H and H_w are related by (2.8). For any R ∈ C^{8×8}, applying the transformation R → P_ij R P_ij does not change the norm of R, while applying the map R → T R T might change the norm only by an irrelevant non-zero constant factor. We thus conclude that H_w also satisfies the bound (2.5) with a constant C_γ possibly different from the one for H and κ 3 = 0 w w 0 , κ 4 = 0 1 1 1 0 0 , κ 5 = 0 0 1 1 1 0 . (2.10) With this notation H_w can be rewritten as where K_0 = K_0(w), K_1, L_1, L_2 ∈ C^{8×8} are given by their block structures as with a linear map Γ : C^{8×8} → C^{8×8} given by (2.14) Lemma 2.4 (Existence and basic properties of the solution to the DEL (2.13)). For any γ > 0, w ∈ C^+ ∪ R and z ∈ C^+ define M_{z,w} ∈ C^{8×8} as where s is a semicircular element, c_1, c_2 are circular elements, all freely independent, and τ_S is a tracial state on a C^*-probability space (S, τ_S) with unit element 1_S. Then (i) For any γ > 0 there exists C_γ > 0 such that M_{z,w} satisfies the a priori bound uniformly for all w ∈ C^+ ∪ R and z ∈ C^+.
(ii) For any γ > 0, w ∈ C + ∪ R and z ∈ C + , matrix M z,w satisfies the DEL (2.13) and has positive semidefinite imaginary part, Im M z,w ≥ 0. Moreover, for all γ > 0 and w ∈ C + ∪ R, the matrix-valued function z → M z,w is analytic on C + .
(iii) For any γ > 0, w ∈ C^+ ∪ R and z ∈ C^+ the function z → M_{z,w} admits the representation M_{z,w} = M_w^∞ + ∫_R (λ − z)^{−1} V_w(dλ), where M_w^∞ ∈ C^{8×8} and V_w(dλ) is a positive semidefinite matrix-valued measure on R with compact support. In particular, lim_{z→∞} M_{z,w} = M_w^∞.
Proof. Fix γ > 0 and denote the noncommutative rational (in fact, polynomial) expression Using the fact that c_1c_1^* + c_2c_2^* has a free Poisson distribution of rate 2, which in particular implies that c_1c_1^* + c_2c_2^* is invertible in S, we conclude that the triple (s, c_1, c_2) belongs to the effective domain D_{q_0,{q_{1,w}};C} with C = γ^{−1}(1 − 1/√2)^{−1} for all w ∈ C^+ ∪ R (see Section A.1 for the corresponding definitions).
Following the proofs of Lemmas A.7 and A.10 and specializing them to our concrete case, we obtain that for any fixed γ > 0 there exists C γ > 0 such that uniformly for all w ∈ C + ∪ R, which yields the a priori bound (2.16). Part (ii) of Lemma 2.4 now follows directly from parts (ii)-(iv) of Lemma A.12. Finally, (2.17) follows from the representation (A.27) in Lemma A.12.
With the DEL (2.13) we associate the corresponding stability operator L_{z,w} : C^{8×8} → C^{8×8} given by The following lemma is directly obtained from Proposition A.19 and establishes Theorem 1.3 and the weak limits in part (i) of Theorem 1.4 for φ = 1.
Lemma 2.5 (Global law for μ_{T_w}). For w ∈ C^+ ∪ R the empirical spectral measure μ_{T_w}(dλ) converges weakly in probability (and almost surely) to Proof. We apply Proposition A.19 to the rational expression in random matrices (1.10). By (A.78), for any w ∈ C^+ and fixed θ > 0 the generalized resolvent of the linearization In particular, the identity (2.6), similarly to (A.79) (see also Definition A.17), implies the pointwise convergence of the Stieltjes transform of the empirical spectral measure of T_w: for any θ, ε, D > 0 there exists a constant C_{θ,ε,D} > 0 such that uniformly on {z : Im z ≥ θ^{−1}, |z| ≤ θ} and w ∈ C^+ ∪ R. The convergence in (2.21) together with (2.17) implies that the weak convergence holds in probability (see, e.g., [4, Theorem 2.4.4]). Now we prove the almost sure convergence. Take z_0 ∈ C^+ and a sequence {z_i} ⊂ C^+ of distinct points with z_i → z_0, so that z_0 is the accumulation point of {z_i}. Define the sequence of events Then it follows from (2.21) and the Borel-Cantelli lemma applied to {A_N} that with probability 1 for all i ∈ N. Finally, applying the Vitali-Porter theorem we conclude that the weak convergence (2.22) holds almost surely.
To prove the bound on the support of ρ_w(dλ), note that the scattering matrix S(w), related to T_w = TT^* via (1.2), is unitary. This implies that all singular values of T are located in the interval [0, 1], thus supp μ_{T_w} ⊂ [0, 1]. But from (2.22) we know that the empirical spectral measure of TT^* converges weakly to ρ_w(dλ), which yields supp ρ_w ⊂ [0, 1].
Note that M_{z,w} is a matrix-valued Herglotz function. Therefore, from the properties of (matrix-valued) Herglotz functions (see, e.g., [22, Theorems 2.2 and 5.4]), the absolutely continuous part of ρ_w(dλ) is given by the inverse Stieltjes transform of M_{z,w}(1, 1) (see Lemma A.12) We call the function ρ_w(λ), defined in (2.27), the self-consistent density of states of the solution to the DEL (2.13). It will be shown in Section 2.5 that ρ_w(dλ) is in fact purely absolutely continuous, i.e., ρ_w(dλ) = ρ_w(λ)dλ. The statements (a) and (b) of part (ii) of Theorem 1.4 will be derived from the study of M_{z,w} for the spectral parameter z close to the real line. Note that our particular choice of the linearization (2.1) allows rewriting the original DEL (2.13) in a slightly simpler form. More precisely, if with R_11, R_22 ∈ C^{3×3} and R_33 ∈ C^{2×2}, then (2.12) yields so that the image Γ[R] is a block-diagonal matrix. Together with the definition of K_0 in (2.12), this implies that the right-hand side of (2.13) is a block-diagonal matrix with blocks of sizes 3, 3 and 2, respectively. We conclude that any solution to the DEL (2.13) has a block-diagonal form, which, in particular, allows us to write M_{z,w} = diag(M_1, M_2, M_3), where we omit the dependence of the blocks on z and w. Now the DEL (2.13) can be decomposed into a system of three matrix equations of smaller dimensions
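The Stieltjes inversion behind (2.27) can be tested against the one case where the density is explicit, namely ρ_Bee from (1.5); one checks directly that its Stieltjes transform is m(z) = −1/√(z(z − 1)). The sketch below (the helper `stieltjes` is our illustration) computes m(λ + iη) by numerical integration and recovers the density as Im m(λ + iη)/π:

```python
import numpy as np
from scipy.integrate import quad

def stieltjes(z):
    """m(z) = \\int_0^1 rho_Bee(t)/(t - z) dt, computed with the substitution
    t = sin^2(theta), so that rho_Bee(t) dt = (2/pi) d(theta)."""
    theta0 = np.arcsin(np.sqrt(min(max(z.real, 0.0), 1.0)))  # location of the peak
    re, _ = quad(lambda th: ((2 / np.pi) / (np.sin(th) ** 2 - z)).real,
                 0, np.pi / 2, points=[theta0], limit=200)
    im, _ = quad(lambda th: ((2 / np.pi) / (np.sin(th) ** 2 - z)).imag,
                 0, np.pi / 2, points=[theta0], limit=200)
    return re + 1j * im

eta = 1e-3                                                # distance to the real axis
densities = {}
for lam in (0.2, 0.5, 0.8):
    recovered = stieltjes(lam + 1j * eta).imag / np.pi    # Stieltjes inversion
    exact = 1 / (np.pi * np.sqrt(lam * (1 - lam)))        # rho_Bee(lam)
    densities[lam] = (recovered, exact)
    print(lam, recovered, exact)
```

Shrinking η further brings the recovered values closer to ρ_Bee, mirroring how the regularization in the spectral parameter is removed in the proof.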

Useful identities
From now on and until the end of Section 2, we study the matrix-valued function M z,w with w = E ∈ R. We start by collecting several important relations between the components of M z,E .
where s is a semicircular element, c_1, c_2 are circular elements, all freely independent in a C^*-probability space (2.36) Using the fact that −s, −c_1^* and −c_2^* again form a freely independent family of one semicircular and two circular elements, we can easily check that (here × denotes multiplication in S^{8×8}) from which (2.33) follows after factorizing Q_−.
Proof. Using the Schur complement formula, the lower-right 2×2 block of the inverse of H (sc) For convenience, change the rows in the above matrix, so that Notice that, since c_1c_1^* + c_2c_2^* has a free Poisson distribution of rate 2, its spectrum is bounded away from zero, and thus both diagonal elements of the matrix on the right-hand side of (2.40) are invertible. Rewrite the matrix in the square brackets in the following way: for the entries of the first row apply the Schur complement formula with respect to the (1, 1)-component, and for the second row apply the Schur complement formula with respect to the (2, 2)-component. This leads to the following expressions for M_{z,E}(7, 7) and M_{z,E}(8, 8) Under τ_S we can swap the labels of c_1 and c_2 and replace a with −a^* without changing the value in (2.43). After completing these operations, we obtain ( Proof. Denote so that (2.32) can be rewritten as Then from (2.38) we have that Subtracting (2.51) from (2.50) gives

2.3 Boundedness of M_{z,E} away from z = 0 and z = 1

Proof. Introduce the following notation for the entries of M_3 Our goal is to show that M_{z,E}, and in particular M_{z,E}(1, 1), is bounded whenever z is away from 0 and 1. From (2.31) we have that (2.58) which after some elementary computations yields The quantities defined above all depend on the variables z and E, but in order to make the exposition lighter we drop the explicit dependence on these variables from the notation. Using (2.58)-(2.60) it is enough to show that for any fixed We now prove some additional relations that can be obtained from (2.56), (2.57) and (2.49)-(2.52). Plugging (2.56) and (2.57) into (2.50) (recall that we are using the notation (2.55)) gives which, after applying (2.52) to the terms in the parentheses, can be rewritten as From the definitions of T_1 and M_3 we have Combining (2.63) and (2.64), we get the following quadratic equation for det M_3 Note that (2.62)-(2.65) hold for all E ∈ R and all z ∈ C^+.
Using the above relations, we proceed with a proof by contradiction. Assume that there exists a sequence (z n , E n ) ∞ n=1 on which |m 11 | → ∞ as n → ∞ (here and below we denote the evaluations at (z n , E n ) by adding the superscript (n)). Solving (2.65) for det M 3 allows us to express det M 3 in terms of m 11 . By passing to a subsequence, we may assume that the choice of the ± sign in (2.66) is constant for all n.
If we take the − sign in (2.66), then  1 | → ∞ on some sequence (z n , E n ) ∞ n=1 , which contradicts the boundedness of m 11 in (2.61). We conclude that all entries of M 3 are bounded everywhere away from the points z ∈ {0, 1}.
which together with the fact that m (n) 11 is bounded implies that Then by (2.56) and (2.57) we find by (2.77) and (2.78), which contradicts (2.79) since z n is bounded away from 0. We conclude that | det T 1 | is bounded away from 0, which together with (2.60) establishes (a non-effective version of) (2.54).
In order to keep the presentation simple, we showed above that M z,E is bounded for z ∈ C + \{0, 1} and E ∈ R without providing an explicit effective bound C θ as formulated in (2.54). Note that the constants hidden in the O(·) terms (for example, in (2.67)-(2.70), (2.71)-(2.74) or (2.75)-(2.80)) depend only on |z n |, |1 − z n | and E n . Therefore, using the assumptions |m where the constant ξ 0 < 0 is as in part (ii) of Theorem 1.4 (see also (2.95) below).
The choices of the branches for (·) 1/3 and (·) 1/2 are specified in the course of the proof below.
Proof. We multiply (2.32) from the left by M 3 and from the right by ( with the shorthand notation Subsequently, the equation for M 3 takes the form ∆ = 0 with and performing the matrix products we see that the entries of ∆ ∈ C 2×2 are polynomials in the entries of M 3 , E and z.

Step 1: Expansion around z = 0. We first construct a solution to (2.32) in a vicinity of z = 0 by asymptotic expansion. Later, in Step 3, we will show that the constructed solution coincides with M 3 defined in (2.15) and (2.30). For this purpose we write t = z 1/3 with an analytic cube root on C \ i(−∞, 0] such that (−1) 1/3 = −1. Then we make the ansatz where we will determine the unknown functions where the entries of P := (p ij ) 2 i,j=1 are polynomials in t, E, ξ ij and where Q := (q ij ) 2 i,j=1 is given by Thus the equation ∆ = 0 is equivalent to Q + tP = 0. For t = 0 the three possible solutions are (2.88). Since det ∇ Ξ Q| t=0 = −3ζ 4 (1 + E 2 ) 2 , the equation Q + tP = 0 is linearly stable at t = 0 and can thus be solved for Ξ t in a neighborhood of Ξ 0 when t is sufficiently small. Since the equation is polynomial, the solution Ξ t admits a power series expansion in t. Now we define the analytic function M 3 (z) through (2.85) on a neighborhood of z = 0 in C \ i(−∞, 0], with Ξ the solution to Q + tP = 0 and the choice ζ < 0 in (2.88).
We now check that for any phase e i ψ ≠ −1 in the complex upper half-plane the imaginary part of M 3 (θe i ψ ) is positive definite, in the sense that for any fixed ε > 0 we have for sufficiently small θ > 0. Indeed, for any vector u = (u 1 , u 2 ) ∈ C 2 we find for some constants β, C > 0, small t, an ε-dependent constant β ε and any R > 0. For R = 2C/β ε and sufficiently small t this is still positive.
Step 2: Expansion around z = 1. For the expansion around z = 1 we proceed similarly to the discussion at z = 0. We set t = √(z − 1), where the square root has a branch cut at i(−∞, 0] and √1 = 1. Here we make an ansatz with a reduced number of unknown functions by exploiting the identities (2.38) and (2.45), namely (2.91). We will determine the unknown functions ξ and ν. Plugging (2.91) into (2.84), multiplying everything out and simplifying afterwards reveals where p 1 , p 2 are polynomials in t, E, ξ, ν and where (2.93). The solution to the system (q 1 , q 2 ) = 0 at t = 0 has the form In (2.94) we choose the unique negative solution of r = 0 (as long as the expression inside the square root is positive). We compute the Jacobian and evaluate at t = 0 with ξ = ξ 0 to find In particular, ξ and ν admit a power series expansion in t.
Step 3: Coincidence of M 3 and M 3 , asymptotic behavior of M z,E (1, 1). Here we show that the asymptotic expansion M 3 around z = 0, 1 coincides with M 3 defined in (2.15) and (2.30), the solution to the DEL (2.32) constructed from free probability. For this purpose, for any small δ > 0, we construct a modification M δ z,E of the explicit solution M z,E (given by (2.36)) as follows where c i are circular and s semicircular elements. Since the original generalized resolvent obeys the bound M z,E ≤ C(1 + 1/Im z), we also get Now, similarly as in (2.31)-(2.32), we derive the DEL corresponding to the generalized resolvent of H (sc),δ E and find where we exploited the block structure of M δ z,E and denoted where the map Φ δ can be read off from the inverse of the right-hand side of (2.102). Clearly, Φ δ with any δ > 0 is a contraction with respect to the Carathéodory metric, mapping the set of 2 × 2 matrices with (strictly) positive definite imaginary parts strictly into itself (for more details see [25]). In particular, there is a unique solution with positive semidefinite imaginary part, and thus M δ 3 = M 3 for δ > 0. Together with (2.100), taking the limit δ ↓ 0 we conclude M 3 = M 3 . In particular, M 3 admits a power series expansion around the singularities at z = 0, 1.
Finally, the asymptotic behavior of ρ E (λ) in (1.13)-(1.14) near its singularities at 0 and 1 follows from Lemma 2.10 and the inverse Stieltjes transform formula (2.27). This, together with the global law established in Lemma 2.5, finishes the proof of Theorem 1.4 for φ = 1.
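For the reader's convenience, we recall the standard Stieltjes-Perron inversion formula behind this step (the display below is a generic reminder, not a verbatim restatement of (2.27)):

```latex
\rho_E(\lambda) \,=\, \lim_{\eta \downarrow 0} \frac{1}{\pi}\, \operatorname{Im}\, m_E(\lambda + i\eta),
\qquad m_E(z) := \int_{\mathbb{R}} \frac{\rho_E(\lambda)\, d\lambda}{\lambda - z}.
```

Heuristically, the fractional powers of the expansion variables t = z^{1/3} (around z = 0) and t = √(z − 1) (around z = 1) obtained in the proof account for the singular exponents of the density ρ E at 0 and 1.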
3 Proof of Theorem 1.4 for general rational φ ∈ (0, 1)

In this section we explain how the techniques described in Section 2 can be used to prove Theorems 1.3 and 1.4 in the case when φ = k/l ∈ (0, 1) for fixed k, l ∈ N, i.e. when M = ln and N = kn for some integer n tending to infinity. In addition to the steps used in Section 2, we need to tensorize the setup to accommodate rectangular matrices. For example, the M × N = ln × kn matrices W 1 and W 2 will be viewed as l × k rectangular block matrices with blocks of dimension n × n. As we will see later, this allows us to treat the linearization matrix as a Kronecker random matrix in independent Wigner and i.i.d. matrices, which in turn makes various probabilistic estimates of the error terms in the corresponding DEL readily accessible from [3]. The restriction to rational φ = k/l makes the presentation conceptually easier.
Note that different values of φ = k/l ∈ (0, 1) require slightly different approaches. For φ ∈ (1/2, 1) the matrix T E,φ ∈ C kn×kn given by (1.10) is well defined and bounded with very high probability. Indeed, in this case the product W W * is a sample covariance matrix with concentration ratio 1/(2φ) ∈ (0, 1); therefore, similarly as in Section 2.1, one can define a sequence of events Θ φ,N holding a.w.o.p. (see (2.4) and (3.2) below) such that for any small δ > 0 and big enough n ≥ n 0 (δ), the spectrum of when restricted to Θ φ,N (see, e.g., [5, Section 5]). Thus, as n tends to infinity, the matrices in the denominator of (1.10) have positive imaginary parts and therefore bounded inverses with very high probability.
On the other hand, for φ ∈ (0, 1/2] we need to proceed via regularization by replacing i γW W * with i γW W * + i εφI M for some small ε > 0 to ensure the invertibility of the denominators in (1.10). This requires a more careful analysis of the ε-dependence of various bounds and identities before we can take ε → 0.
Note also that for φ > 1 we have Rank(T E,φ ) ≤ M, and the spectral measure of the matrix T E,φ has an atom of mass 1 − 1/(2φ) at zero, while, as we will show, for φ < 1 the spectral measure µ T E,φ does not have a pure point component. The regime φ = 1 studied in Section 2 is borderline: the limiting spectral measure of T E,φ does not have an atom at 0, but its behavior near the origin is more singular than in the case φ < 1.

3.1 Linearization trick and the Dyson equation for linearization
In order to apply the linearization trick for φ = k/l ∈ (0, 1), we split H, W 1 and W 2 into blocks of size n × n, so that with H ij , W 1,ij , W 2,ij ∈ C n×n . Note that for any i ∈ {1, . . . , l}, √ l · H ii is a (normalized) Wigner matrix of size n, and for any i ≠ j, √ l · H ij is a (normalized) i.i.d. matrix of size n. Similarly, all the matrices √ l · W 1,ij and √ l · W 2,ij are also i.i.d. matrices of size n. All these matrices are independent apart from the natural constraint H ij = H * ji . Define the events for some δ > 0. Similarly as in Section 2.1, one can show that the events Θ φ,N hold a.w.o.p. and that the random matrix models (1.9) and (1.10) for φ ∈ (1/2, 1) and w ∈ C + ∪ R restricted to Θ φ,N are well defined (see Remark A.21 and Lemma B.1). At the same time, in order to deal with φ ∈ (0, 1/2], we consider the regularized models (1.9) and (1.10) with w ∈ C + , i.e., Im w strictly positive, which guarantees the invertibility of w − H + i γW W * without any additional restrictions on H and W . The linearization matrix H w,φ for (1.10) is defined as in (2.1). This is a Kronecker random matrix consisting of (6k + 2l) × (6k + 2l) blocks of size n. We now introduce a tensorized version of the generalized resolvent that takes into account the additional structure coming from (3.1).
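The block bookkeeping in (3.1) can be sketched numerically; the following toy snippet (sizes and names are our own illustration) checks that reshaping an ln × kn array reproduces the n × n blocks used above:

```python
import numpy as np

rng = np.random.default_rng(1)
l, k, n = 3, 2, 5
M, N = l * n, k * n  # M = ln, N = kn

W = rng.standard_normal((M, N))  # stand-in for the M x N matrix W_1

# view W as an l x k array of n x n blocks:
# blocks[i, j] equals the submatrix W[i*n:(i+1)*n, j*n:(j+1)*n]
blocks = W.reshape(l, n, k, n).transpose(0, 2, 1, 3)

ok = all(
    np.array_equal(blocks[i, j], W[i*n:(i+1)*n, j*n:(j+1)*n])
    for i in range(l) for j in range(k)
)
print(blocks.shape, ok)
```

The same index gymnastics underlies the Kronecker (block) structure of the linearization matrix H w,φ.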
uniformly for all z ∈ C + and w ∈ C + ∪ R.
(ii) For any γ > 0, φ ∈ (0, 1/2] ∩ Q and w ∈ C + there exists C γ,φ,w > 0 such that a.w.o.p. Proof. Denote by D q 0 ,{q 1,w };C the effective domain related to the noncommutative rational function q 1,w (x, y 1 , y 2 , y * 1 , y * 2 ) = w − x + i γ(y 1 y * 1 + y 2 y * 2 ). More precisely (see (A.4) below), the set D q 0 ,{q 1,w };C consists of all triples of elements (x̂, ŷ 1 , ŷ 2 ) such that For φ = k/l ∈ (0, 1/2], on the other hand, the evaluation of the function q 1,w on the random matrices H, W 1 and W 2 is invertible only for Im w > 0, in which case the norm of the inverse of q 1,w evaluated on any triple (x̂, ŷ 1 , ŷ 2 ) is bounded by 1/(γ Im w). With the above choice of the constant C in D q 0 ,{q 1,w };C depending on the value of φ, the proof of (i)-(iii) can be obtained from the same argument as in Lemma 2.3 by restricting H w,φ to the events Θ φ,N defined in (3.2) and taking into account the dimensions of H, W 1 and W 2 and the relation N = kn (see Remark A.21).
From the structure of the linearization, we derive the DEL corresponding to H w,φ for an unknown matrix-valued function M depending on z, w and φ, having the following components: (i) the expectation matrix is given by with matrices κ 1 , κ 2 , κ 3 (w) defined as in (2.9)-(2.10); (ii) the operator Γ φ : C (6k+2l)×(6k+2l) → C (6k+2l)×(6k+2l) maps an arbitrary matrix with R 11 , R 22 ∈ C 3k×3k , R 33 ∈ C 2l×2l into a block diagonal matrix with the first 3k × 3k block equal to the second 3k × 3k diagonal block equal to (3.11) and the lower-right 2l × 2l block equal to (3.12). Here, for n ∈ N, we denote by Tr n the trace of an n × n matrix; κ 4 , κ 5 are defined as in (2.10) and σ 1 is a standard Pauli matrix. The operator Γ φ is the tensorized analogue of Γ from (2.29).
This leads to two different a priori estimates: a bound (3.14) uniform in w for φ ∈ (1/2, 1) and a w-dependent bound for φ ∈ (0, 1/2]. The rest of the proof follows directly from Lemma A.12. We omit the dependence of M z,w on φ for brevity. With these notations we have the following global law, which establishes Theorem 1.3 and partially part (i) of Theorem 1.4 for φ ∈ (0, 1). The proof of the weak limit (1.12) for φ ∈ (0, 1/2) is postponed to Section 3.6.
Proof. The proofs are similar to the case φ = 1 (see Lemma 2.5 and Remark A.21) after taking into account the dimensions of the matrices H, W 1 and W 2 and the additional structure (3.1).
Definition 3.5 (Self-consistent density of states). We call the function Since supp(ρ w,φ ) ⊂ [0, 1] by unitarity of S(w), part (iii) of Theorem 1.4 can be established by proving the boundedness of the upper-left k × k minor of M z,w for the spectral parameter z bounded away from 0 and 1 (Section 3.3 below), and by analyzing the asymptotic behavior of this upper-left submatrix in the vicinity of the special points z = 0 and z = 1 (Section 3.4 below). The study of M z,w is simplified by the particular form of K 0 (w) and Γ φ , which implies that Similarly as in the case φ = 1, plugging (3.20) into (3.21) leads to the following self-consistent equation, which is the analogue for general φ of (2.32).
(ii) M z,w (6k + l + 1, 6k + l + 1) = 4γ 2 zM z,w (6k + 1, 6k + 1); (iii) M z,w (6k + l + 1, 6k + 1) − M z,w (6k + 1, 6k + l + 1) = i γ M z,w (6k + l + 1, 6k + l + 1) 1 − γ Im w 2φ det T 1 , where, similarly as in (2.46), we denoted Then parts (ii) and (iii) of the above lemma can be rewritten as Proof. In order to establish Lemma 3.6, we can follow the proofs of Lemmas 2.6-2.8, applying them to the matrix and H (sc) correspondingly, and taking into account the dimensions of these matrices. For example, if we replace each diagonal entry of the matrix Q − from the proof of Lemma 2.6 by the tensor product of this entry and a corresponding identity matrix (I k or I l ), we obtain that the diagonal blocks of M z,w and M z,−w coincide.
In order to prove (3.24), similarly as in the proof of Lemma 2.7, we use the Schur complement formula with respect to the invertible upper-left 6k × 6k submatrix of (3.26) to write the 2l × 2l lower-right submatrix of its inverse as (3.27). Then we can switch the blocks of (3.27) as in (2.40) and apply the Schur complement formula with respect to (3.28). The expression in (3.28) is invertible: for (φ, w) ∈ (((1/2, 1) ∩ Q) × (C + ∪ R)) the spectrum of (W 1 , W 2 )(W 1 , W 2 ) * follows the free Poisson distribution with rate 2φ ∈ (1, 2) and is therefore bounded away from zero, while for (φ, w) ∈ (((0, 1/2] ∩ Q) × C + ) the invertibility is guaranteed by Im w being strictly positive. Due to the properties of freely independent circular and semicircular elements, switching the labels of the pair (W 1 , W 2 ) or changing the sign of H (sc) does not change the value of an expression involving these matrices after applying id 6k+2l ⊗ τ S . Therefore, by proceeding as in (2.42)-(2.44) with M z,w (7, 7) and M z,w (8, 8) replaced by the corresponding l × l blocks of M z,w , and using the diagonal structure (3.19) of these blocks and part (i) of this lemma, we obtain (3.24). Now it is straightforward to check, by plugging (3.24) into (3.22) and following the proof of Lemma 2.8, that (3.25) holds. This proves Lemma 3.6.
3.3 Boundedness of M z,w away from z = 0 and z = 1 for φ ∈ (0, 1)

The goal of this section is to establish a uniform bound on M z,w for the parameter w close to the real line and the parameter z bounded away from 0 and 1. This will be used later in the proof of Theorem 1.4, in particular to show the absolute continuity of the measure ρ E,φ (dλ). In the case φ ∈ (1/2, 1) we can set w = E ∈ R and work directly with M z,E . For φ ∈ (0, 1/2] we prove the uniform bound for M z,w with w = E + i εφ and small ε > 0, which will allow taking the limit ε → 0 in Section 3.4. (i) Case φ ∈ (1/2, 1]: For any γ > 0 and small θ > 0 there exists C θ,γ > 0 such that For any γ > 0, small θ > 0, φ 0 ∈ (0, 1/2) and ε 0 > 0 small enough there exists C θ,γ,φ 0 ,ε 0 > 0 such that uniformly on the set Proof. Consider first (3.29), for which φ ∈ (1/2, 1). By setting ε = 0 in Lemma 3.6 we can proceed by establishing (3.29) in the same manner as in the proof of Lemma 2.9. To guarantee a bound uniform in the parameters z, E and φ, instead of a sequence (z n , E n ) ∞ n=1 as in the proof of Lemma 2.9, we assume the existence of a sequence ((z n , E n , φ n )) ∞ n=1 , z n ∈ C + , |E n | ≤ 1/θ, φ n ∈ (1/2, 1), on which |m For (3.30), i.e. φ ∈ [φ 0 , 1 − φ 0 ], the above argument has to be slightly adjusted to ensure a uniform bound for small ε > 0. Instead of a sequence ((z n , E n , φ n )) ∞ n=1 as in the first part of the proof, we now assume the existence of a sequence ((z n , E n , φ n , ε n )) ∞ n=1 , z n ∈ C + , |E n | ≤ 1/θ, φ n ∈ [φ 0 , 1 − φ 0 ], ε n > 0, on which |m  and observe that since the analogue of (2.52), in the regime |m with | det T 1 (n) | → 0, the parameter φ n disappears after dividing (3.37) by φ n ; therefore we can proceed exactly as in the proof of Lemma 2.9. We conclude, similarly as in Lemma 2.9, that M z,w is uniformly bounded provided that min{|z|, |1 − z|} ≥ θ, |E| ≤ 1/θ, φ 0 ≤ φ ≤ 1 − φ 0 and ε ≤ ε 0 for some φ 0 > 0 and θ, ε 0 > 0 small enough.

3.4 Singularities of M z,E (1, 1)
For φ ∈ (1/2, 1), the solution matrix M z,E with E ∈ R is given directly via (3.13). For φ ∈ (0, 1/2] this formula cannot be applied directly when w = E ∈ R is real; we need an additional regularization argument. Nevertheless, in the next lemma we show the existence of the solution to the Dyson equation (3.20)-(3.22) for φ ∈ (0, 1/2] and w = E ∈ R, and we establish the asymptotic behavior of M z,E (1, 1) near z = 0 and z = 1 for φ ∈ (0, 1). We start by constructing an expansion of the solution M z,w in the vicinity of z = 0 for φ ∈ (0, 1) and w = E + i εφ with ε > 0 sufficiently small. We then use this expansion to extend M z,w to w = E ∈ R for φ ∈ (0, 1/2] by taking ε ↓ 0 and to study the asymptotic behavior of M z,E at the special points z = 0 and z = 1. Recall that the solution to the Dyson equation has the block structure (3.19), where M 1 , M 2 are determined by M 3 through (3.20) and M 3 satisfies (3.22). Lemma 3.8 (Existence of M z,E for φ ∈ (0, 1/2] and singularities of M z,E (1, 1) for φ ∈ (0, 1)).
The branch of the square root is chosen to be continuous on C \ (i(−∞, 0]) such that √ 1 = 1.
Proof. The analysis of (3.22) for φ ∈ (0, 1) follows steps similar to the analysis of (2.32) for the case φ = 1 performed in Section 2.4, and we will omit the details of some straightforward albeit tedious calculations. In the first step below we analyze the solution to (3.22) for rational φ ∈ (0, 1) and w = E + i εφ with E ∈ R and ε > 0 small enough. Using the same procedure that led to (2.84), we rewrite this equation as ∆ = 0 with and the two 2 × 2 matrices

Step 1: Expansion around z = 0. We construct an expansion of M 3 as a power series in t = √ z in a neighborhood of z = 0. We make an ansatz compatible with the symmetries (3.24) and (3.25), namely is a polynomial in all variables t, E, γ, ξ, ν, and f , in which it is quadratic. The choice (3.46) for f ensures that the symmetry condition (3.25) is satisfied. We plug (3.45) into (3.43) and, after a mechanical but very long calculation that uses q 0 = 0 in the (2, 1)- and (2, 2)-entries of ∆, find where q 1,ε,t = q 1 + ε q 1 + t 4γ 3 p 1 , q 2,ε,t = q 2 + t γ p 2 , and p 1 , p 2 are polynomials in t, E, γ, φ, ξ, ν, and q 1 = q 1,0,0 , q 2 = q 2,0,0 , q 1 are the following explicitly defined functions of the unknowns (ξ, ν): In particular, the equation ∆ = 0 is equivalent to (q 1,ε,t , q 2,ε,t ) = 0 and thus to (q 1 , q 2 ) = 0 in the limit t → 0 and ε → 0. This, in turn, fixes the values ν 0 = ν| t=0 and ξ 0 = ξ| t=0 through where we choose the positive solution ξ 0 of r = 0. The fact that r = 0 has a unique positive solution ξ 0 > 0 follows from an explicit elementary calculation. The positivity of ξ 0 will ensure the positive definiteness of Im M 3 for z ∈ C + . We compute the Jacobian of the function (q 1 , q 2 ) from (3.48) as Using that at t = 0 we have q 1 = 0 and ξ 0 ≠ 0, we can eliminate the quadratic terms in ν and obtain Again, an elementary calculation using the defining equation r = 0 for ξ 0 shows that J(ξ 0 , ν 0 ) never vanishes.
Thus, the ansatz (3.45) solves the Dyson equation in a small neighborhood of z = t 2 = 0. Furthermore, for t = u(1 + iu) with sufficiently small u > 0 it is easy to see that its imaginary part is positive definite. By using a regularization argument analogous to the one from Step 3 of Section 2.4 and combining it with the uniqueness of solutions to the Dyson equation with positive definite imaginary part, this implies that M 3 (t) = M 3 (z) for all √ z = t = u(1 + iu), φ ∈ (0, 1) and small enough ε > 0. Since both functions are analytic, this also implies equality for z = t 2 in the complex upper half-plane intersected with a neighborhood of z = 0.
Combining the above information, we see that for any index pair (i, j), {M z,E+iε n φ n (i, j)} n≥1 is a family of analytic functions, locally bounded on C + , that converges on {|z| ≤ δ, Im z > 0} to M z,E (i, j). Applying the Vitali-Porter theorem (see, e.g., Section 2.4 in [34]), we conclude that for any z ∈ C + the limit M z,E (i, j) := lim (φ n ,ε n )→(φ,0) M z,E+iε n φ n (i, j) exists, the convergence holds uniformly on compact subsets of C + and, as a result, the function z → M z,E (i, j) is analytic on C + . Taking M z,E (i, j) as the entries of the matrix-valued function M z,E defines the solution to the Dyson equation (3.21) for φ ∈ (0, 1) and w = E ∈ R.
To compute the asymptotic behavior of M z,E (1, 1) we use (3.45) with ε = 0.

Step 3: Expansion around z = 1. We apply exactly the same procedure as in Step 2 of Lemma 2.10, except that we insert the parameters φ and γ into the identities (2.83) and (2.84) and follow them through the analysis. Here we record the final result of this elementary calculation. Our ansatz is The expansion in t = √(z − 1) gives that for all |E| ≤ E 0 with the upper-left component of M z,E is given by where ξ 0 is defined by This finishes the proof of the lemma.

3.6 Proof of parts (i), (iii) and (iv) of Theorem 1.4
In this section we collect the results established in Sections 3.1-3.5 and complete the proof of Theorem 1.4. Recall that Theorem 1.3 was proven in Lemma 2.5 for φ = 1 and Lemma 3.4 for φ ∈ (0, 1).
Proof of part (i) of Theorem 1.4. The extension of ρ w,φ (dλ) to w = E ∈ R for φ ∈ (0, 1/2] ∩ Q, as well as the limit (1.12) and the extension of ρ E,φ (dλ) to irrational φ ∈ (0, 1), follow from the equivalence between the weak convergence of measures defined by (3.16) and the pointwise convergence of M z,w established in Lemma 3.4 and part (a) of Lemma 3.8 for the corresponding limits.
Similarly as for the case φ = 1 in Section 2.5, Lemma 2.11 and the integrability of M z,E that can be deduced from Lemmas 3.7 and 3.8 yield the absolute continuity of ρ w,φ (dλ) = ρ w,φ (λ)dλ.

4 Comparison with the results of Beenakker and Brouwer
Consider the scattering matrix (1.9) with w = E ∈ R in the regime φ = N/M → 0 as M → ∞, with N 0 = 2N. This model was studied in the case of Gaussian entries by Beenakker and Brouwer in [7,10], and one of the remarkable results of their theory is that in the experimentally relevant setting of ideal coupling the limiting transmission eigenvalue density is given by the arcsine law (1.5) (see [8, Eq. (3.12)]). The ideal coupling assumption is formulated in terms of the matrix S(E) having zero mean [8, Eq. (3.8)]. Below we show that in the regime φ → 0 the assumption E[S(E)] = 0 is equivalent to E = 0 and γ = 1. By plugging these values into (1.19) and (3.61) we recover the arcsine distribution and the corresponding Fano factor F (0, 1) = 1/4. Since the results of the current section do not affect the main outcomes of this paper and are of an expository nature, we keep the presentation rather informal, focusing only on the crucial steps and omitting the technical details.
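As a numerical cross-check of the constants quoted above, one can integrate against the arcsine density ρ(T) = 1/(π √(T(1 − T))); here we take the standard formula F = ⟨T(1 − T)⟩/⟨T⟩ for the Fano factor as an assumption:

```python
import numpy as np

# arcsine density rho(T) = 1/(pi*sqrt(T(1-T))) on (0,1);
# the substitution T = (1 - cos(theta))/2 turns rho(T) dT into d(theta)/pi,
# which removes the endpoint singularities
n_grid = 200000
dtheta = np.pi / n_grid
theta = (np.arange(n_grid) + 0.5) * dtheta  # midpoint rule
T = (1.0 - np.cos(theta)) / 2.0

mean_T = np.sum(T) * dtheta / np.pi               # <T>       (expect 1/2)
mean_T1mT = np.sum(T * (1 - T)) * dtheta / np.pi  # <T(1-T)>  (expect 1/8)
fano = mean_T1mT / mean_T                         # expect 1/4

print(mean_T, mean_T1mT, fano)
```

The quadrature reproduces ⟨T⟩ = 1/2 and F = 1/4 to high accuracy, consistent with F (0, 1) = 1/4.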
For simplicity we will assume in this section that H is Gaussian. Note that W ∈ C M ×N 0 , so with unitary matrices U ∈ C M ×M and V ∈ C N 0 ×N 0 , where γ i are the singular values of W and N 0 ≤ M. Note that N 0 = 2N = 2φM, and thus the eigenvalues of W * W have the Marchenko-Pastur distribution with parameter 2φ. In the regime φ → 0 that we are interested in, the eigenvalues of W * W concentrate around the point 1 in a neighborhood of size O( √ φ), so for simplicity of presentation we will omit the asymptotically small term O( √ φ) and assume throughout these computations that all γ i = 1, i.e., Γ = I N 0 .
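The O(√φ) concentration claim can be sanity-checked numerically (a rough illustration with arbitrary sizes, assuming entries of variance 1/M):

```python
import numpy as np

rng = np.random.default_rng(2)
M, phi = 2000, 0.01
N0 = int(2 * phi * M)  # N0 = 2N = 2*phi*M

# W with i.i.d. complex Gaussian entries of variance 1/M
W = (rng.standard_normal((M, N0)) + 1j * rng.standard_normal((M, N0))) / np.sqrt(2 * M)
eigs = np.linalg.eigvalsh(W.conj().T @ W)

# Marchenko-Pastur with parameter 2*phi has support
# [(1 - sqrt(2 phi))^2, (1 + sqrt(2 phi))^2], a window of width O(sqrt(phi)) around 1
half = np.sqrt(2 * phi)
print(eigs.min(), eigs.max(), (1 - half)**2, (1 + half)**2)
```

For φ = 0.01 the whole spectrum of W * W sits within roughly ±0.3 of the point 1, consistent with the O(√φ) window.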
After applying the singular value decomposition to W and factoring out the matrices U and U * from the inverse, we get and note that (2φ) −1/2 H 1 and (1 − 2φ) −1/2 H 3 are both independent GUE matrices. Now the inverse matrix in (4.6) can be rewritten as (4.8). Using the Schur complement formula we have that the upper-left N 0 × N 0 block of (4.8), the only part that does not vanish after sandwiching (4.8) by (Γ t , 0) and its transpose, is given by The semicircular law for the Hermitian (GUE) matrix (1 − 2φ) −1/2 H 3 implies that as M → ∞, where m sc (z) denotes the Stieltjes transform of the semicircular distribution and "≈" denotes that the corresponding equality holds asymptotically with a vanishing additive term and with high probability. Note that the random matrices H 1 , H 2 and H 3 are independent. Therefore, by the concentration for quadratic forms (see, e.g., [15, Theorem C.1]), can be approximated by From (4.9) and (4.11) it remains to check the limiting behavior of (4.12). Recall that since φ is small, we assumed that Γ Γ t = I N 0 . In this case, using again the semicircular law for the Hermitian matrix (2φ) −1/2 H 1 , we have From the asymptotics zm sc (z) → −1 as |z| → ∞, we get that We conclude that as N 0 , M → ∞, φ → 0 (see (4.6) and (4.16)). For E ∈ R, the expression E + m sc (E) is purely imaginary if and only if E = 0.
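Both asymptotic facts used here — zm sc (z) → −1 and the criterion for E + m sc (E) to be purely imaginary — can be checked against the explicit formula m sc (z) = (−z + √(z² − 4))/2 with the branch mapping C + to C + (a small self-contained check; the branch-selection trick below is our own implementation detail):

```python
import numpy as np

def m_sc(z):
    # Stieltjes transform of the semicircle law on [-2, 2];
    # pick the square-root branch giving Im m_sc(z) >= 0
    z = complex(z)
    r = np.sqrt(z * z - 4 + 0j)
    m = (-z + r) / 2
    return m if m.imag >= 0 else (-z - r) / 2

# (a) z * m_sc(z) -> -1 as |z| -> infinity
val = 1e4j * m_sc(1e4j)

# (b) for real |E| < 2: E + m_sc(E) = E/2 + i*sqrt(4 - E^2)/2,
#     which is purely imaginary if and only if E = 0
shift0 = 0.0 + m_sc(0.0)
shift7 = 0.7 + m_sc(0.7)
print(val, shift0, shift7)
```

At E = 0 the shift is exactly i, while at E = 0.7 it acquires the nonzero real part E/2 = 0.35.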

A Spectral properties for a general class of rational expressions in random matrices
A.1 Noncommutative (NC) rational expressions and their linearizations

NC rational expressions are formally defined as expressions obtained by applying the four algebraic operations (including taking inverses) to a tuple of NC variables. A systematic overview of the abstract theory of NC rational expressions and functions (equivalence classes of rational expressions) can be found in [9]. Note that, unlike polynomials, rational expressions do not have a canonical representation, which may lead to a situation where, after evaluation on some algebra, two different rational expressions represent the same function. In this paper we leave aside the question of identification of rational functions and will instead work directly with rational expressions and their evaluations, specifying each time on which domain the evaluation takes place. Below we introduce the standard setup in which we will work and define recursively the classes of rational expressions, denoted by the letter Q, together with the corresponding domains of evaluation, denoted by D.
Let H be a Hilbert space, A ⊆ B(H) be a C * -algebra (of bounded operators on H) with norm · A , and let x 1 , . . . , x α * , y 1 , . . . , y β * be the NC variables taking values in A with x α = x * α for 1 ≤ α ≤ α * . Denote by A sa ⊂ A the set of self-adjoint elements of A and let C x, y, y * be the set of polynomials in x := (x 1 , . . . , x α * ), y := (y 1 , . . . , y β * ) and y * := (y * 1 , . . . , y * β * ). We define the NC rational expressions recursively on their height using the following procedure: (i) Let q 0 := 1 A . The set of rational expressions of height 0 is defined to be the set of polynomials in x, y, y * with the domain of definition A α * sa × A β * (Q q 0 , D q 0 ) := (C x, y, y * , A α * sa × A β * ).

(A.4)
This domain will allow an effective control of the norm of rational expressions of height n.

Remark A.1. Similar constructions of rational functions/expressions involving their height have been exploited actively in the literature (see, e.g., [32], [9, Chapter 4], [39]). Note that here we allow "denominators" without constant terms, so they cannot be automatically expanded into geometric series for small x, y. Hence we need to introduce and keep track of explicit domains.

Remark A.2. In the sequel, the statement "q is a rational expression of height n" will implicitly mean that there (uniquely) exist q 1 ∈ (Q q 0 ) 1 , q 2 ∈ (Q q 0 ,q 1 ) 2 , . . ., q n ∈ (Q q 0 ,...,q n−1 ) n and C > 0 such that q ∈ Q q 0 ,...,qn and q is evaluated on the effective domain D q 0 ,...,qn;C . Note that many of the basic results, in particular those about constructing the linearizations, can be formulated in a completely abstract form or without restriction to effective domains.

Remark A.3. When evaluating a rational expression of height n on different C * -algebras A 1 , . . . , A k , we will use the notation D q 0 ,...,qn (A i ) and D q 0 ,...,qn;C (A i ) correspondingly.
The linearization of q can be written as The idea of studying noncommutative rational functions/expressions via linearizations goes back to Kleene [26]. Since the publication of this work, various approaches and algorithms have been developed for constructing linearizations of general classes of rational functions/expressions (see, e.g., [9] or [24] for a pedagogical presentation of the subject). For the reader's convenience we provide below a simple linearization algorithm based on the method described in [13, Section A.1]. We use the following observation: for matrices A i ∈ (C x, y, y * ) m i ×m i and B j ∈ (C x, y, y * ) m j ×m j+1 , m j ∈ N, the lower right m 1 × m k submatrix of the inverse of the block matrix Now, if q 1 , . . . , q ℓ are rational expressions having known linearizations A 1 , . . . , A ℓ , then one can easily check that the linearization of the product w 0 with b 0 = (w 0 , 0, . . . , 0) ∈ (C x, y, y * ) 1×m 1 , b k = (w k , 0, . . . , 0) t ∈ (C x, y, y * ) m k ×1 and B j being matrices of the form Using (A.9), we can construct a linearization of rational expressions by induction on the height. In the case of a rational expression of height 0 (a polynomial), the linearization can be constructed using, e.g., the algorithm from [13, Section A.1].
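To make the linearization mechanism concrete, here is a toy example of our own (not the paper's construction): for the height-1 expression q(x, y) = x y −1 x, the self-adjoint pencil L = [[0, x], [x, −y]] satisfies, by the Schur complement formula, ((L − zJ) −1 ) 11 = (q − z) −1 with J = diag(I, 0), which is exactly the generalized-resolvent property exploited below:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6

def herm(n):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

x = herm(n)
B = rng.standard_normal((n, n))
y = B @ B.T + np.eye(n)          # positive definite, hence safely invertible
q = x @ np.linalg.inv(y) @ x     # the rational expression q = x y^{-1} x

z = 0.3 + 1.0j
I, O = np.eye(n), np.zeros((n, n))
L = np.block([[O, x], [x, -y]])  # self-adjoint linearization of q
J = np.block([[I, O], [O, O]])   # J acts as identity on the first block only

G = np.linalg.inv(L - z * J)     # generalized resolvent of the pencil
err = np.abs(G[:n, :n] - np.linalg.inv(q - z * I)).max()
print(err)
```

The inverse in q is traded for a larger matrix that is affine-linear in x and y, at the cost of enlarging the dimension.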
If q 1 ∈ (Q q 0 ) 1 and q ∈ Q q 0 ,q 1 is a rational expression of height 1, the linearization can be obtained using the following algorithm:
(B0) write q as a sum of monomials in x, y, y * , 1/q 1 , 1/q * 1 ;
(B1) for each polynomial q 1,γ 1 , q * 1,γ 1 , 1 ≤ γ 1 ≤ ℓ 1 , construct a linearization (not necessarily self-adjoint) using the algorithm from [13, Section A.1];
(B2) the linearization of a monomial in x, y, y * , 1/q 1 , 1/q * 1 of the form w 1 (1/q 1 ) w 2 (1/q 2 ) · · · w k−1 (1/q k−1 ) w k is given by (A.9) with A i being either linearizations of the polynomials q i ∈ {q 1,γ 1 , q * 1,γ 1 };
(B3) the (possibly not self-adjoint) linearization of a linear combination of monomials (and thus of q) is constructed by putting the linearizations of the monomials obtained at (B2) into a block-diagonal form using a procedure similar to (R1)-(R2) from [13, Section A.1];
(B4) if after step (B3) the resulting linearization is not self-adjoint, the symmetrized linearization of q = (q + q * )/2 can be obtained by putting the linearization obtained at step (B3) and its conjugate transpose into a block-skew-diagonal form similarly as in (R3) from [13, Section A.1].
Suppose that we know how to construct linearizations for rational expressions of height ≤ n − 1 and suppose that q ∈ Q q 0 ,...,qn is a rational expression of height n. Then the linearization of q can be constructed using the following algorithm, which is an adaptation of (B0)-(B4):
(S0) write q as a sum of monomials in x, y, y * , 1/q t,γ t , 1/q * t,γ t ;
(S1) construct linearizations (not necessarily self-adjoint) of each rational expression q t,γ t of height ≤ n − 1 for 1 ≤ t ≤ n, 1 ≤ γ t ≤ ℓ t ;
(S2) the linearization of a monomial in x, y, y * , 1/q t,γ t , 1/q * t,γ t
The steps (S3)-(S4) are identical to (B3)-(B4).
Remark A.6. It is always possible to obtain the specific form of monomials required in steps (B0) and (S0) by adding factors 1 A or 1/1 A , as, for example, in Note that there is some ambiguity at steps (B0) and (S0), reflecting the fact that we have freedom to choose the order in which the monomials appear in the sum, as well as freedom to put the product of the constant terms 1 A and 1/1 A between the terms w j and 1/q i . All the results of Section A hold for any choice of the representation of q as a sum of monomials; therefore we do not fix any particular order or other rules to guarantee the uniqueness of the representation in (B0) and (S0).
The condition (ii) in the definition of the linearization is satisfied by construction. The condition (i) follows from the following lemma, which gives a bound on the norm of ( L(x, y, y * )) −1 in C (m−1)×(m−1) ⊗ A, where for any n ∈ N and R = (R ij ) n i,j=1 ∈ C n×n ⊗ A we denote

Lemma A.7 (Invertibility of L). Let q be a self-adjoint rational expression of height n and let L = L q ∈ (C x, y, y * ) m×m be the linearization of q constructed via the above algorithm. Let L be the submatrix of L defined using the decomposition (A.5). Then there exist C Lq > 0 and n Lq ∈ N such that for any (x, y) ∈ D q 0 ,...,qn;C

Proof. We prove (A.13) by induction on n. For n = 0 (the special case of polynomials) the bound (A.13) follows, for example, from [13, (3.16)]. Suppose (A.13) holds for all rational expressions of height k ≤ n − 1, and consider q of height n with linearization obtained via (S0)-(S4). Steps (S3) and (S4) of the linearization algorithm endow L with a block-diagonal (S3) or block-skew-diagonal (S4) structure, with blocks being the linearizations of the monomials obtained at step (S2). Therefore, in order to obtain the bound (A.13) it is enough to consider only the inverses of the blocks of the form with A i being the linearizations of the rational expressions and B j being of the form (A.10). One can easily check that the inverse of (A.14) consists of blocks of the type The induction hypothesis, together with the Schur complement formula for A −1 i and the condition that (x, y) ∈ D q 0 ,...,qn;C , implies that for each block of type (A.14) there exist C > 0 and n ∈ N such that (A.16) holds, where the A i 's and B j 's in the left-hand side are evaluated at (x, y) ∈ D. Taking for C Lq and n Lq , respectively, the maximum of the C 's and the maximum of the n 's in the bounds (A.16) over all monomials in the representation of q leads to (A.13).

Remark A.8. Suppose that P ∈ C m×m is of the form with Q ∈ C (m−1)×(m−1) invertible.
It is easy to see that if $L \in (\mathbb{C}\langle x, y, y^*\rangle)^{m\times m}$ is a linearization of a rational expression $q$, then so is $(P \otimes 1_{\mathcal{A}})\,L\,(P^{-1} \otimes 1_{\mathcal{A}})$. We will use this freedom to bring linearizations to a more convenient form.
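To make the linearization mechanism concrete, here is a small numerical sketch (an illustration of the Schur complement identity, not the paper's full algorithm): for $q(x) = x^2$ a standard self-adjoint linearization is the $2\times 2$ block matrix with $0$ and $x$ on the diagonal blocks replaced by $\begin{pmatrix} 0 & x \\ x & -1 \end{pmatrix}$, and the upper-left block of the generalized resolvent $(L - zJ)^{-1}$, $J = E_{11}$, recovers the resolvent $(q - z)^{-1}$.

```python
import numpy as np

# For q(x) = x^2, a self-adjoint linearization is L = [[0, x], [x, -1]]:
# the Schur complement of the (2,2) block is 0 - x * (-1)^{-1} * x = x^2.
# Hence the upper-left block of (L - z*J)^{-1}, with J = E_11, equals
# (x^2 - z)^{-1}, the resolvent of q.

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = (A + A.conj().T) / np.sqrt(2 * n)        # Hermitian test matrix
z = 0.3 + 1.0j                               # spectral parameter in C^+

I = np.eye(n)
L_minus_zJ = np.block([[-z * I, X], [X, -I]])
G = np.linalg.inv(L_minus_zJ)                # generalized resolvent

direct = np.linalg.inv(X @ X - z * I)        # resolvent of q(X) = X^2
err = np.linalg.norm(G[:n, :n] - direct)
print(err)                                   # tiny, up to numerical precision
```

The same computation, with $J$ replaced by $J_m = E_{11}$ in higher dimension, underlies the generalized resolvent studied in the next subsection.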

A.3 A priori bound on generalized resolvents
Definition A.9 (Generalized resolvent). Let $L \in (\mathbb{C}\langle x, y, y^*\rangle)^{m\times m}$. We call the matrix-valued function $z \mapsto (L - zJ_m \otimes 1_{\mathcal{A}})^{-1}$, defined for $z \in \mathbb{C}^+$, the generalized resolvent of $L$.
Lemma A.10. Let $q$ be a self-adjoint rational expression of height $n$ and let $L \in (\mathbb{C}\langle x, y, y^*\rangle)^{m\times m}$ be the linearization of $q$ constructed using the algorithm from Section A.2. Then there exist $C_{L_q} > 0$ and $n_{L_q} \in \mathbb{N}$ such that the bound (A.18) holds for all $C > 1$, $(x, y) \in \mathcal{D}_{q_0,\dots,q_n;C}$ and $z \in \mathbb{C}^+$.

Proof. Rewrite $L - zJ_m \otimes 1_{\mathcal{A}}$ using the block decomposition from (A.5) with $\hat{L} \in (\mathbb{C}\langle x, y, y^*\rangle)^{(m-1)\times(m-1)}$. By (ii) in the definition of the linearization and the Schur complement formula we obtain (A.20). Now (A.18) follows from Lemma A.7 and the trivial bound (A.21) for resolvents of self-adjoint elements.

A.4 Dyson equation for linearizations of NC rational expressions
Let $q$ be a self-adjoint rational expression of height $n$ and let $L \in (\mathbb{C}\langle x, y, y^*\rangle)^{m\times m}$ be its linearization constructed using the algorithm from Section A.2. Write $L$ in the form (A.6) with coefficient matrices $K_0, K_\alpha, L_\beta \in \mathbb{C}^{m\times m}$, where $K_0$ and $K_\alpha$ are self-adjoint, and define the completely positive map $\Gamma : \mathbb{C}^{m\times m} \to \mathbb{C}^{m\times m}$ built from these coefficients.

Definition A.11 (Dyson equation for linearizations). We call the resulting equation the Dyson equation for the linearization (DEL) (of a rational expression).
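Equations of DEL type can be solved numerically by fixed-point iteration on the upper half-plane. The sketch below is a toy version with the schematic normalization $-M(z)^{-1} = zI + \Gamma[M(z)]$ and an assumed averaging self-energy $\Gamma[R] = (\operatorname{tr} R / m)\,I$, chosen so that the solution reduces to the semicircle Stieltjes transform; it illustrates the iteration scheme, not the paper's exact coefficient matrices.

```python
import numpy as np

def gamma(R):
    # toy completely positive self-energy: Gamma[R] = (tr R / m) * I
    m = R.shape[0]
    return (np.trace(R) / m) * np.eye(m, dtype=complex)

def solve_mde(z, m=2, tol=1e-12, max_iter=10_000):
    # fixed-point iteration for -M^{-1} = z*I + Gamma[M]
    M = (-1.0 / z) * np.eye(m, dtype=complex)
    for _ in range(max_iter):
        M_new = -np.linalg.inv(z * np.eye(m) + gamma(M))
        if np.linalg.norm(M_new - M) < tol:
            return M_new
        M = M_new
    return M

z = 0.5 + 0.8j
M = solve_mde(z)
# For this Gamma the normalized trace m(z) solves m^2 + z*m + 1 = 0,
# i.e. it is the Stieltjes transform of the semicircle law on [-2, 2]:
m_sc = (-z + np.sqrt(z * z - 4)) / 2
print(np.trace(M) / 2, m_sc)   # the two values agree
```

The iteration converges for $\operatorname{Im} z > 0$ because the map is a contraction toward the unique solution with positive imaginary part, mirroring the existence and uniqueness statements below.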
Lemma A.12 (Solution of DEL: existence and basic properties). Let $\mathcal{H}$ be a Hilbert space and let $\mathcal{S} \subset \mathcal{B}(\mathcal{H})$ be a $C^*$-algebra containing a freely independent family $\{s_1, \dots, s_{\alpha_*}, c_1, \dots, c_{\beta_*}\}$ of $\alpha_*$ semicircular and $\beta_*$ circular elements in a NC probability space $(\mathcal{S}, \tau_{\mathcal{S}})$. Let $q \in \mathcal{Q}_{q_0,\dots,q_n}$ and assume that $(s, c) \in \mathcal{D}_{q_0,\dots,q_n;C}$ for $s := (s_1, \dots, s_{\alpha_*})$, $c := (c_1, \dots, c_{\beta_*})$ and some $C > 0$. Then the DEL has a unique solution with positive imaginary part, which we denote by $M^{(sc)}_z$ (see (A.25)).

A.5 Convergence of spectrum for rational expressions in random matrices and the a priori bound for the generalized resolvent in random matrices

The next two sections are devoted to the study of the eigenvalues of a general class of rational expressions evaluated on random Wigner and i.i.d. matrices.
Denote $X := (X_1, \dots, X_{\alpha_*})$, $Y := (Y_1, \dots, Y_{\beta_*})$, $Y^* := (Y_1^*, \dots, Y_{\beta_*}^*)$, and let $q$ be a (self-adjoint) rational expression in $\alpha_*$ self-adjoint and $\beta_*$ non-self-adjoint noncommutative variables. In order to prove the local law for $q(X, Y, Y^*)$ we need to show that the spectrum of $q(X, Y, Y^*)$ converges to the spectrum of $q(s, c, c^*)$. To this end, for any $\varepsilon > 0$, $m \in \mathbb{N}$ and operator $R \in \mathbb{C}^{mN\times mN}$, denote by $\mathrm{Spec}_\varepsilon(R)$ the $\varepsilon$-pseudospectrum of $R$, i.e. the set of $\omega \in \mathbb{C}$ with $\|(R - \omega I)^{-1}\| > \varepsilon^{-1}$, with the convention that the spectrum of $R$ is always contained in $\mathrm{Spec}_\varepsilon(R)$. It is easy to check that (A.35) holds for any $R \in \mathbb{C}^{mN\times mN} \cong \mathbb{C}^{m\times m} \otimes \mathbb{C}^{N\times N}$, where $\|\cdot\|_{\mathbb{C}^{mN\times mN}}$ and $\|\cdot\|_{\mathbb{C}^{N\times N}}$ denote the operator norms on $\mathbb{C}^{mN\times mN}$ and $\mathbb{C}^{N\times N}$, respectively.

For any (not necessarily self-adjoint) rational expression $r(x, y, y^*)$ in NC variables $x, y, y^*$, denote by $L_r := L_r(x, y, y^*) \in (\mathbb{C}\langle x, y, y^*\rangle)^{m_r\times m_r}$ its (not necessarily self-adjoint) linearization, which can be constructed, for example, using the algorithm from Section A.2 with steps (B4) and (S4) omitted if $r$ is not self-adjoint. Define the corresponding Hermitized linearization $L_{r,z}(x, y, y^*) \in (\mathbb{C}\langle x, y, y^*\rangle)^{2m_r\times 2m_r}$ as the Hermitization (A.36) of $L_r(x, y, y^*) - zJ_{m_r}$. For the Hermitized linearization (A.36) we define (similarly as in (A.23)) the self-energy operator $\Gamma_{r,z} : \mathbb{C}^{2m_r\times 2m_r} \to \mathbb{C}^{2m_r\times 2m_r}$, a completely positive map built from the coefficient matrices $K^{r,z}_\alpha$ and $L^{r,z}_\beta$ of $L_{r,z}$ (see, e.g., (A.6)). Note that if we evaluate $L_{r,z}$ on the tuple of random matrices $(X, Y, Y^*)$, then $L_{r,z}(X, Y, Y^*)$ belongs to the class of Kronecker random matrices, which were studied in [3]. Therefore, by [3, Lemma 2.2], the corresponding Matrix Dyson Equation (A.38) has a unique solution $M^{r,z}_\omega$ with positive semidefinite imaginary part $\mathrm{Im}\, M^{r,z}_\omega \ge 0$. Moreover, for each $z \in \mathbb{C}$, the solution matrix $M^{r,z}_\omega$ admits the Stieltjes transform representation (A.39), where $\{V^{r,z}\}_{z\in\mathbb{C}}$ is a family of measures taking values in the set of positive definite matrices.
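The $\varepsilon$-pseudospectrum introduced above can be probed numerically via the smallest singular value: $\omega \in \mathrm{Spec}_\varepsilon(R)$ exactly when $\sigma_{\min}(R - \omega I) < \varepsilon$. A minimal sketch (the Jordan-block example is ours, chosen to show that for non-normal $R$ the pseudospectrum is much larger than an $\varepsilon$-neighborhood of the spectrum):

```python
import numpy as np

def in_pseudospectrum(R, omega, eps):
    # omega is in Spec_eps(R) iff the smallest singular value
    # of R - omega*I lies below eps
    n = R.shape[0]
    s_min = np.linalg.svd(R - omega * np.eye(n), compute_uv=False)[-1]
    return bool(s_min < eps)

# 2x2 nilpotent Jordan block: Spec(R) = {0}, but R is far from normal
R = np.array([[0.0, 1.0], [0.0, 0.0]])
print(in_pseudospectrum(R, 0.05, 0.1))   # True:  sigma_min ~ 0.0025 < 0.1
print(in_pseudospectrum(R, 0.50, 0.1))   # False: sigma_min ~ 0.21  > 0.1
```

For Hermitian $R$ the pseudospectrum reduces to the $\varepsilon$-neighborhood of the spectrum, which is why the Hermitization $L_{r,z}$ is a convenient object to control.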
In the limit $N \to \infty$ the solution $M^{r,z}_\omega \otimes I_N$ approximates $(L_{r,z}(X, Y, Y^*) - \omega I)^{-1}$ well in the entrywise maximum norm (see [3, Lemma B.1]).
Remark A.14. The statement of [3, Lemma 2.2] is formulated in the more general setting of Wigner-type matrices, allowing independent but not necessarily identically distributed entries. This model, in general, leads to a system of $N$ matrix equations. In our case, the matrices in $X$ and $Y$ have i.i.d. entries (up to symmetry constraints), which reduces the system of $N$ possibly different matrix equations (see, e.g., [3, Eq. (2.6)]) to $N$ identical matrix equations of the form (A.38).
For any rational expression $r$ with linearization $L_r$, define the set $D^{L_r}_\varepsilon := \{ z \in \mathbb{C} : \mathrm{dist}(0, \mathrm{supp}\,\rho_{r,z}) < \varepsilon \}$, where $\rho_{r,z}(d\lambda) := \frac{1}{2m_r}\,\mathrm{Tr}\, V^{r,z}(d\lambda)$ and the family of measures $V^{r,z}(d\lambda)$ was defined in (A.39). The set $D^{L_r}_\varepsilon$ is called the self-consistent $\varepsilon$-pseudospectrum of $r$ associated with its linearization $L_r$, and $\rho_{r,z}$ is called the self-consistent density of states of $L_{r,z}(X, Y, Y^*)$.
The next lemma contains the main result of this section.
Lemma A.15 (Convergence of the (pseudo)spectrum). Suppose that $q \in \mathcal{Q}_{q_0,\dots,q_n}$ is a (not necessarily self-adjoint) rational expression of height $n$, and $(s, c) \in \mathcal{D}_{q_0,\dots,q_n;C}(\mathcal{S})$ for some constant $C > 0$. Then there exists $C^w_{L_q} > 0$ such that (A.41) and (A.42) hold, where $\hat{L}_q$ is defined as in (A.5). There exists also a constant $C' > 0$, depending only on the linearization $L_q$ and the constant $C$, such that (A.43) holds.

Proof. We split the proof of this lemma into two parts. First we show that the condition (A.41), together with (A.42), implies (A.43) for any, not necessarily self-adjoint, rational expression and its linearization. After that we prove that (A.41) and (A.42) are satisfied for $q$ whenever $(s, c) \in \mathcal{D}_{q_0,\dots,q_n;C}(\mathcal{S})$, using induction on the height $n$. Suppose that we have an arbitrary rational expression $r$ and its linearization $L_r$ of size $m_r$, and suppose that there exists $C^w_{L_r} > 0$ such that (A.44) holds a.w.o.p.,
where $\hat{L}_r$ is defined similarly as in (A.5). Then, from the definition of the linearization (Definition A.5) and the Schur complement formula (A.20), we can choose $C_3, C_4 > 0$ such that the corresponding sequence of inequalities holds a.w.o.p. for all $z \in \mathbb{C}$. Here the first and third inequalities follow from the Schur complement formula (A.20) and the norm bounds $\max_{1\le \alpha\le \alpha_*} \|X_\alpha\|_{\mathbb{C}^{N\times N}} \le 3$ and $\max_{1\le \beta\le \beta_*} \|Y_\beta\|_{\mathbb{C}^{N\times N}} \le 3$, which hold a.w.o.p.; the second inequality holds deterministically for all realizations of $X$ and $Y$ (see (A.35)). The constant $C_3 > 0$ depends on $C_r$, $m_r$ and the norms of $X_\alpha$, $Y_\beta$, which we bound by 3, while $C_4 > 1$ depends additionally on $C_3$.

Note again from (A.20) that if $\hat{L}_r(X, Y, Y^*)$ is invertible, then (A.49) holds. On the other hand, using the definition of $L_{r,z}$ from (A.36), the set on the right-hand side of (A.49) can be described via the spectrum of $L_{r,z}(X, Y, Y^*)$ as in (A.50), with the identity (A.50) holding deterministically for all realizations of $X$ and $Y$. Under the condition (A.44), the equality (A.49) can be rewritten in terms of the pseudospectrum, using (A.46), as (A.51).

In order to study the spectrum of $L_{r,z}(X, Y, Y^*)$, we exploit the fact that $L_r(X, Y, Y^*) - zJ_{m_r} \otimes I_N$ belongs to the class of Kronecker random matrices, and thus we can use the results from [3] on the location of the spectrum for this class of random matrix ensembles. Applying part (i) of [3, Theorem 4.7] to $L_{r,z}$, we obtain (similarly as in the proof of [3, Lemma 6.1] for bounded $\zeta$) that (A.54) holds a.w.o.p. for any $z$ satisfying $\mathrm{dist}(0, \mathrm{supp}\,\rho_{r,z}) \ge 2\varepsilon$.
Now we claim that (A.54) holds simultaneously a.w.o.p. for all $z$ in $\{z : |z| \le 2C_3,\ z \notin D^{L_r}_{2\varepsilon}\}$. To prove this strengthening, we apply the standard grid argument (again analogously to the proof of [3, Lemma 6.1]) together with the Lipschitz continuity of the eigenvalues of $L_{r,z}(X, Y, Y^*)$ in $z$. Together with (A.51) and (A.53), this implies that the analogue of (A.43) for $r$ holds a.w.o.p., which completes the first part of the proof.
Suppose now that the statement of the lemma is true for all rational expressions of height $\le n-1$. Together with $(s, c) \in \mathcal{D}_{q_0,\dots,q_n;C}(\mathcal{S})$ this means, in particular, that the bounds (A.58) and (A.59) hold a.w.o.p. for some $C_1 > 0$ and for all $r \in T_q := \{q_{i,\gamma_i} : 0 \le i \le n,\ 1 \le \gamma_i \le \ell_i\}$, with $q_{i,\gamma_i}$ as in the definition of $\mathcal{Q}_{q_0,\dots,q_n}$. We now show that there exists $\tilde{C} > 0$ such that $(X, Y) \in \mathcal{D}_{q_0,\dots,q_n;\tilde{C}}(\mathbb{C}^{N\times N})$ a.w.o.p., or, equivalently, that (A.60) holds a.w.o.p. for all $\gamma_n \in \{1, \dots, \ell_n\}$. Using the result established in the first part of the proof, (A.58) and (A.59) with $r = q_{n,\gamma_n}$ imply the corresponding bound a.w.o.p.
Next, notice from the definition of $L_{r,z}$ (see (A.36)) that $L_{q_{n,\gamma_n},0}(s, c, c^*)$ is the Hermitization of $L_{q_{n,\gamma_n}}(s, c, c^*)$; combining this with the preceding bound yields (A.60).

The bound (A.41), together with the Schur complement formula (A.20) applied to $(L_q(X, Y, Y^*) - zJ_m \otimes I_N)^{-1}$, the trivial bound (A.21) applied to $q(X, Y, Y^*)$, and the norm bounds $\|X_\alpha\|_{\mathbb{C}^{N\times N}} \le 3$ and $\|Y_\beta\|_{\mathbb{C}^{N\times N}} \le 3$, holding a.w.o.p. (similarly as in (2.3)-(2.4)), implies the following trivial bound for the generalized resolvent in random matrices.
Corollary A.16. There exists $C^w_{L_q} > 0$, depending only on the linearization $L_q$, such that the corresponding bound holds a.w.o.p.
In order to formulate the local law, we need the notion of stochastic domination (see Definition A.17). Let $G_z := (H - zJ \otimes I_N)^{-1} \in \mathbb{C}^{mN\times mN}$ be the generalized resolvent of $H$. Note that the generalized resolvent $G_z$, when viewed as taking values in $\mathbb{C}^{m\times m} \otimes \mathbb{C}^{N\times N}$, can be written as $G_z = \sum_{i,j=1}^{N} G_{z,ij} \otimes E_{ij}$, where the matrices $E_{ij} := (\delta_{ki}\delta_{lj})_{1\le k,l\le N}$ form the standard basis of $\mathbb{C}^{N\times N}$ and $G_{z,ij} \in \mathbb{C}^{m\times m}$ is an $m \times m$ matrix for each pair $(i, j)$. In general, we follow the convention that for any $A \in \mathbb{C}^{m\times m} \otimes \mathbb{C}^{N\times N}$ we denote by $A_{kl} \in \mathbb{C}^{N\times N}$, $k, l = 1, \dots, m$, the $(k, l)$-th block with respect to the $\mathbb{C}^{m\times m}$ factor of the tensor product, while $A_{ij} \in \mathbb{C}^{m\times m}$, $i, j = 1, \dots, N$, denotes the $(i, j)$-th block with respect to the second factor; in particular, $A_{ij}(k, l) = A_{kl}(i, j)$.

Let $M^{(sc)}_z : \mathbb{C}^+ \to \mathbb{C}^{m\times m}$ be the matrix-valued function given by (A.25). For each $z \in \mathbb{C}^+$ we define the stability operator $\mathcal{L}_z : \mathbb{C}^{m\times m} \to \mathbb{C}^{m\times m}$ corresponding to $M^{(sc)}_z$. For $n \in \mathbb{N}$ and an operator $R : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ we denote by $\|R\|_{\mathbb{C}^{n\times n}\to\mathbb{C}^{n\times n}}$ the operator norm of $R$ induced by the operator norm on $\mathbb{C}^{n\times n}$. Then the following holds.
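The two block decompositions of $A \in \mathbb{C}^{m\times m} \otimes \mathbb{C}^{N\times N}$ above, and the identity $A_{ij}(k,l) = A_{kl}(i,j)$, can be checked directly in coordinates. A small sketch (assuming the standard Kronecker ordering with the $\mathbb{C}^{m\times m}$ factor coarse, i.e. entry $(kN+i,\, lN+j)$):

```python
import numpy as np

m, N = 3, 4
rng = np.random.default_rng(1)
A = rng.standard_normal((m * N, m * N))

def block_kl(A, k, l, N):
    # (k,l)-th N x N block w.r.t. the C^{m x m} tensor factor
    return A[k * N:(k + 1) * N, l * N:(l + 1) * N]

def block_ij(A, i, j, N, m):
    # (i,j)-th m x m block w.r.t. the C^{N x N} tensor factor
    return A.reshape(m, N, m, N)[:, i, :, j]

i, j, k, l = 2, 1, 0, 2
# the defining identity A_ij(k,l) = A_kl(i,j)
print(block_ij(A, i, j, N, m)[k, l] == block_kl(A, k, l, N)[i, j])  # True
```

This is exactly the bookkeeping used when passing between the $m \times m$ "structural" blocks of the linearization and the $N \times N$ random-matrix blocks.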
Proposition A.19 (Global law). For any $\theta > 0$, uniformly on $\{z : \mathrm{Im}\, z \ge \theta^{-1},\ |z| \le \theta\}$, the blocks of $G_z$ are close ($\max_{1\le i,j\le N}$-entrywise) to their deterministic approximation given by $M^{(sc)}_z$, where $\prec_\theta$ indicates that the constant in the definition of stochastic domination (see Definition A.17) may depend on $\theta$. In particular, an averaged version of the estimate holds uniformly on $\{z : \mathrm{Im}\, z \ge \theta^{-1},\ |z| \le \theta\}$. The global law for the trace of the resolvent of a rational expression, $\mathrm{Tr}\, g_z$, has already been proven in [39] with a somewhat different method; we comment on this point in Remark A.20.
Proof. First we show that for any rational expression $q$ as defined at the beginning of this section and any $\theta > 0$, the bounds (M1)-(M2) from Theorem A.18 hold uniformly on $\{z : \mathrm{Im}\, z \ge \theta^{-1},\ |z| \le \theta\}$; namely, for any $\theta > 0$ there exists $C_\theta > 0$ such that (A.80) holds on $\{z : \mathrm{Im}\, z \ge \theta^{-1},\ |z| \le \theta\}$. The first estimate in (A.80) follows directly from (A.26) and $(\mathrm{Im}\, z)^{-1} \le \theta$. The second estimate follows from a standard resolvent identity.

Remark A.21. The results of Sections A.3-A.6 can be extended to the setting in which the NC variables $x_1, \dots, x_{\alpha_*}, y_1, \dots, y_{\beta_*}$ are replaced by matrices of NC variables
$$x_\alpha = \big(x_\alpha(i,j)\big)_{1\le i,j\le l_\alpha} \in \mathcal{A}^{l_\alpha\times l_\alpha}, \qquad y_\beta = \big(y_\beta(i,j)\big)_{1\le i\le l_\beta,\,1\le j\le k_\beta} \in \mathcal{A}^{l_\beta\times k_\beta}, \quad (\mathrm{A.82})$$
and $q(x, y, y^*) \in \mathcal{A}^{l_q\times l_q}$, with the dimensions chosen so that the corresponding matrix operations are well defined. The linearization of such a model has the same structure (A.6), and we can analyze it by considering the matrix entries $x_\alpha(i,j)$, $1 \le i, j \le l_\alpha$, and $y_\beta(i,j)$, $1 \le i \le l_\beta$, $1 \le j \le k_\beta$, as the new NC variables. Note that in this case the diagonal entries of $x_\alpha$ are self-adjoint. In order to relate the spectrum of the rational expression $q(x, y, y^*)$ to its linearization, the generalized resolvent should be replaced by
$$z \mapsto (L - zJ_{l_q})^{-1}, \quad (\mathrm{A.83})$$
where $J_{l_q} = \sum_{i=1}^{l_q} E_{ii} \in \mathbb{C}^{m\times m}$, with $E_{ij}$ the standard basis of $\mathbb{C}^{m\times m}$ and $m$ the dimension of the linearization. With this definition the upper-left $l_q \times l_q$ corner of $(L - zJ_{l_q})^{-1}$ equals $(q(x, y, y^*) - zI_{l_q} \otimes 1_{\mathcal{A}})^{-1}$, the resolvent of the rational expression $q(x, y, y^*)$.
We can then proceed with the study of the corresponding random matrix model, with the $x_\alpha(i, i)$ independent Wigner matrices and all other entries independent i.i.d. matrices. The linearization of this model is again a Kronecker random matrix, so all the probabilistic estimates in the above analysis remain valid. The proofs of the main results of Sections A.3-A.6 follow the same lines as in the $l_q = l_\alpha = l_\beta = k_\beta = 1$ setting; the main changes are notational and reduce to incorporating the additional structure (A.82) and (A.83).

B Norm bounds for random sample covariance matrices
In this section we prove the norm bounds for random sample covariance ensembles that are used frequently in the paper. Although various forms of these bounds are well known in the literature (see e.g. [6], [17] or [33]), we provide a short alternative proof to make sure that the assumptions about the random matrix ensembles and the probabilistic estimates match the setting of the present paper.
Lemma B.1. Let $l, m \in \mathbb{N}$ with $m < l$, and let $W \in \mathbb{C}^{ln\times mn}$ be a (non-Hermitian) random matrix with independent centered entries of variance $1/(ln)$. Suppose that the entries of $W$ have finite moments of all orders. Then for any $\delta > 0$, as $n \to \infty$, a.w.o.p. $\|W\| \le 1 + \sqrt{m/l} + \delta$.
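As a numerical sanity check (under the standard Marchenko-Pastur heuristic; the parameters $l = 4$, $m = 1$, $n = 300$ below are illustrative), the operator norm of such a $W$ concentrates near the edge value $1 + \sqrt{m/l}$:

```python
import numpy as np

rng = np.random.default_rng(2)
l, m, n = 4, 1, 300
# complex i.i.d. entries with E|w_ij|^2 = 1/(l*n), matching the lemma
W = (rng.standard_normal((l * n, m * n)) +
     1j * rng.standard_normal((l * n, m * n))) / np.sqrt(2 * l * n)

norm = np.linalg.svd(W, compute_uv=False)[0]   # largest singular value
edge = 1 + np.sqrt(m / l)                      # Marchenko-Pastur edge = 1.5
print(norm, edge)                              # norm lies close to the edge
```

For these dimensions the fluctuations of the norm around the edge are of a much smaller order than $\delta$, so the a.w.o.p. bound is clearly visible already at moderate $n$.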