1 Introduction

The quantum random energy model (QREM) draws its motivation from various directions. In mathematical biology, it has been put forward as a simple model for the expression of genotypes under mutation in a random fitness landscape [4, 14]. More recently, it gained attention as a basic testing ground of quantum annealing algorithms for searches in unstructured energy landscapes (cf. [6, 18] and references therein) as well as in the context of many-body localization [5, 9, 15, 19, 25]. Its original motivation stems from the quest of understanding quantum effects in mean-field spin glasses [10, 13, 17, 22, 26].

The classical backbone, the random energy model (REM), was put forward by Derrida [11, 12] in the early 1980s as the limiting and solvable case of a class of mean-field spin glasses. The space of N-bit strings \( \mathcal {Q}_N = \{ -1, 1\}^N \) serves as the configuration space of the REM. The energy associated with \( \sigma = (\sigma _1 , \dots , \sigma _N) \in \mathcal {Q}_N \) is a rescaled Gaussian random variable

$$\begin{aligned} U(\sigma ) := \sqrt{N} \, g(\sigma ) \end{aligned}$$

with \( g(\sigma ) \) forming an independent and identically distributed (i.i.d.) process with standard normal law. \( \mathcal {Q}_N \) may be interpreted as the state space of a system of N spin-\(\tfrac{1}{2}\) quantum objects recorded, e.g., in the z-basis. The corresponding Hilbert space is given by the N-fold tensor product \( \otimes _{j=1}^N \mathbb {C}^2\), which is unitarily equivalent to \( \ell ^2( \mathcal {Q}_N) \). Effects of a transversal (e.g. in the negative x-direction) constant magnetic field of strength \( \varGamma \ge 0 \) on the spins are taken into account through the componentwise flip operators \( F_j \sigma := (\sigma _1, \dots , - \sigma _j , \dots , \sigma _N ) \), which are implemented on \( \psi \in \ell ^2( \mathcal {Q}_N) \) as

$$\begin{aligned} \left( T\psi \right) (\sigma ) := - \sum _{j=1}^N \psi ( F_j \sigma ) . \end{aligned}$$

This operator coincides with the negative sum of x-components of the Pauli matrices. The energy of the QREM is then given by an Anderson-type random matrix

$$\begin{aligned} H := \varGamma \ T + U \end{aligned}$$
(1)

where U acts as a multiplication operator on \( \ell ^2( \mathcal {Q}_N) \).
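To make the definition (1) concrete, the following minimal sketch applies \( H \) to a vector for small N. It assumes, purely for illustration, that a configuration \( \sigma \) is encoded as an integer in which bit j records the value of \( \sigma _{j+1} \); the function names `apply_T`, `sample_rem` and `apply_H` are hypothetical and not taken from the text.

```python
import math
import random

def apply_T(psi, N):
    """Return T psi, where (T psi)(sigma) = - sum_j psi(F_j sigma).

    Configurations are encoded as integers 0..2^N-1 (an encoding chosen
    here for illustration); flipping spin j is an XOR with bit j.
    """
    return [-sum(psi[s ^ (1 << j)] for j in range(N)) for s in range(2 ** N)]

def sample_rem(N, rng):
    """Sample U(sigma) = sqrt(N) * g(sigma) with i.i.d. standard normal g."""
    return [math.sqrt(N) * rng.gauss(0.0, 1.0) for _ in range(2 ** N)]

def apply_H(psi, U, gamma, N):
    """Return H psi = gamma * T psi + U psi, with U acting by multiplication."""
    T_psi = apply_T(psi, N)
    return [gamma * t + u * p for t, u, p in zip(T_psi, U, psi)]

N = 4
rng = random.Random(0)
U = sample_rem(N, rng)
flat = [1.0] * 2 ** N            # unnormalised maximally delocalised state
print(apply_T(flat, N)[:3])      # every entry equals -N
print(apply_H(flat, U, 0.5, N)[:3])
```

The constant vector is an eigenvector of T with eigenvalue \( -N \), which is the ground state of T referred to later in the text.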

The process \( U(\sigma ) \) is the limiting case \( p \rightarrow \infty \) of the Gaussian family of p-spin models characterized by its mean and covariance function,

$$\begin{aligned} \mathbb {E}\left[ U(\sigma ) \right] = 0 , \qquad \mathbb {E}\left[ U(\sigma ) U(\sigma ') \right] = N \left( N^{-1} \sum _{j=1}^N \sigma _j \sigma _j' \right) ^p =: N \xi _p(\sigma ,\sigma ') . \end{aligned}$$
(2)

The case \( p= 2 \) corresponds to the famous Sherrington–Kirkpatrick model. The simplifying feature of the limit \( p \rightarrow \infty \) is the lack of correlations. The quantum p-spin generalisation of the QREM is then given by the random matrix (1) in which U is a multiplication operator by the correlated field.

1.1 Main Result

In this paper, we will be interested in thermodynamic properties of the QREM which are encoded in its partition function

$$\begin{aligned} Z(\beta , \varGamma ) := 2^{-N}\, {{\text {Tr}}\,}e^{-\beta H} \end{aligned}$$

at inverse temperature \( \beta \in [0, \infty ) \), or, equivalently, its pressure

$$\begin{aligned} p_N(\beta , \varGamma ) := N^{-1} \; \ln Z(\beta , \varGamma ) . \end{aligned}$$
(3)

Up to a factor of \( - \beta ^{-1} \), the latter coincides with the specific free energy.

In the thermodynamic limit \( N \rightarrow \infty \) the pressure of the REM converges almost surely [7, 11, 12],

$$\begin{aligned} \lim _{N\rightarrow \infty } p_N(\beta , 0) = p^{\mathrm {REM}}(\beta ) = \left\{ \begin{array}{ll} \tfrac{1}{2} \beta ^2 &{}\quad \text{ if } \; \beta \le \beta _c , \\ \tfrac{1}{2} \beta _c^2 + (\beta - \beta _c) \beta _c &{}\quad \text{ if } \; \beta > \beta _c .\end{array} \right. \end{aligned}$$
(4)

It exhibits a freezing transition into a low-temperature phase characterized by the vanishing of the specific entropy above

$$\begin{aligned} \beta _c := \sqrt{2 \ln 2 } . \end{aligned}$$
Fig. 1. Phase diagram of the QREM as a function of the transversal magnetic field \( \varGamma \) and the temperature \( \beta ^{-1}\). The first-order transition occurs at fixed \( \beta \) and \( \varGamma _c(\beta ) \). The freezing transition is found at temperature \( \beta _c^{-1} \), which is unchanged in the presence of a small magnetic field.

Under the influence of the transversal field, the spin-glass phase of the REM disappears for large \( \varGamma > 0 \) and a first-order phase transition into a quantum paramagnetic phase characterised by

$$\begin{aligned} p^{\mathrm {PAR}}(\beta \varGamma ) := \ln \cosh \left( \beta \varGamma \right) \end{aligned}$$

is expected to occur. The precise location of this first-order transition and the shape of the phase diagram of the QREM were predicted by Goldschmidt [17] in the 1990s on the basis of arguments using the replica trick and the so-called static approximation in the associated path integral. His calculations have been repeated and refined in various papers, all still based on the replica trick and further approximations [13, 22] (see also [26] and references therein). As a main result of this paper, we give a rigorous proof of this prediction.

Theorem 1

For any \( \varGamma , \beta \ge 0 \) almost surely:

$$\begin{aligned} \lim _{N\rightarrow \infty } p_N(\beta , \varGamma ) = \max \{ p^{\mathrm {REM}}(\beta ) , p^{\mathrm {PAR}}(\beta \varGamma ) \} . \end{aligned}$$
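The limiting pressure in Theorem 1 is explicit and easy to evaluate. The sketch below implements (4), the paramagnetic pressure and their maximum; the function names `p_rem`, `p_par` and `p_qrem` are illustrative choices, and \( \ln \cosh \) is rewritten in an overflow-safe form as an implementation detail that is not part of the text.

```python
import math

BETA_C = math.sqrt(2.0 * math.log(2.0))   # freezing point beta_c = sqrt(2 ln 2)

def p_rem(beta):
    """REM pressure (4): quadratic below beta_c, linear above."""
    if beta <= BETA_C:
        return 0.5 * beta ** 2
    return 0.5 * BETA_C ** 2 + (beta - BETA_C) * BETA_C

def p_par(x):
    """ln cosh(x), using ln cosh x = |x| + ln(1 + e^{-2|x|}) - ln 2."""
    x = abs(x)
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)

def p_qrem(beta, gamma):
    """Limiting QREM pressure according to Theorem 1."""
    return max(p_rem(beta), p_par(beta * gamma))

for gamma in (0.0, 0.5, 1.0, 2.0):
    print(gamma, p_qrem(1.0, gamma))
```

At \( \beta = 1 \) the printed values stay at the REM value for small \( \varGamma \) and switch to the paramagnetic branch once \( \ln \cosh (\beta \varGamma ) \) exceeds \( p^{\mathrm {REM}}(\beta ) \).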

As will become clear from the proof, which is found in Sect. 2 below, the special structure of the pressure as a maximum of competing extremal cases is mainly caused by the fact that the REM’s energy landscape is steep and rough due to the lack of correlations. This renders the model solvable. Before diving into the details of the proof, let us add some comments (see also Fig. 1):

  1.

    As in the classical case, the pressure \( p_N(\beta , \varGamma ) \) is self-averaging, i.e. in the thermodynamic limit it coincides with its probabilistic average, the so-called quenched pressure \( \mathbb {E}\left[ p_N(\beta , \varGamma ) \right] \). For the QREM, this follows immediately from the Gaussian concentration inequality for Lipschitz functions. The square of the Lipschitz constant of the pressure’s variations with respect to the i.i.d. standard Gaussian variables \( g(\sigma ) \) is bounded by

    $$\begin{aligned} \displaystyle \sum _{\sigma \in \mathcal {Q}_N} \left( \frac{\partial p_N(\beta , \varGamma )}{\partial g(\sigma )} \right) ^2 = \frac{\beta ^2}{N \, 2^{2N} Z(\beta , \varGamma )^2 } \sum _\sigma \langle \sigma | e^{-\beta H} | \sigma \rangle ^2 \le \frac{\beta ^2}{N}. \end{aligned}$$

    Here and in the following we use bracket notation for matrix elements. Consequently, we have the Gaussian tail estimate

    $$\begin{aligned} \mathbb {P}\left( \left| p_N(\beta , \varGamma ) - \mathbb {E}\left[ p_N(\beta , \varGamma ) \right] \right| > \frac{t \, \beta }{\sqrt{N} } \right) \le C \, \exp \left( - c t^2\right) \end{aligned}$$
    (5)

    for all \( t > 0 \) and all \( N \in \mathbb {N} \) with some constants \( c, C \in (0,\infty ) \). In fact, self-averaging for more general quantum p-spin models has already been established in [10].

  2.

    For fixed \( \beta \) a first-order phase transition is found at

    $$\begin{aligned} \varGamma _c(\beta ) := \beta ^{-1} {\text {arcosh}}\left( \exp \left( p^{\mathrm {REM}}(\beta )\right) \right) . \end{aligned}$$

    In particular, \( \varGamma _c(0) = 1 \) and \( \varGamma _c(\beta _c) = \beta _c^{-1} {\text {arcosh}}(2) \). In the low-temperature limit, \( \lim _{\beta \rightarrow \infty } \varGamma _c(\beta ) = \beta _c \), the first-order transition connects to the known location of the quantum phase transition of the ground state [18]. In this context, it is useful to recall that the REM’s extreme energies are almost surely found at \(\Vert U \Vert _\infty = \beta _c N + o(N)\), cf. [7, Ch. 9]. For \( \varGamma < \beta _c \), the energetically separated ground state is sharply localized near the lowest-energy configuration of the REM. For \( \varGamma > \beta _c \), the energetically separated ground state resembles the maximally delocalized state given by the ground state of T. Near \( \varGamma = \beta _c \), the ground-state gap closes exponentially [1].

  3.

    For \( \varGamma > \varGamma _c(\beta ) \), the magnetization in the x-direction is strictly positive,

    $$\begin{aligned} \beta ^{-1} \, \frac{\partial }{\partial \varGamma } \, p^{\mathrm {PAR}}(\beta \varGamma ) = \tanh (\beta \varGamma ) > 0 . \end{aligned}$$
  4.

    For all \( \varGamma < \varGamma _c(\beta ) \) the line of the freezing transition remains unchanged at \( \beta = \beta _c \). In the frozen regime, the QREM has zero specific entropy.
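The special values of \( \varGamma _c(\beta ) \) quoted in the second comment above can be checked numerically. The sketch below is one possible implementation; it uses the identity \( {\text {arcosh}}(e^{p}) = p + \ln \big ( 1 + \sqrt{1 - e^{-2p}} \big ) \) to avoid overflow at large \( \beta \), which is a numerical convenience introduced here and not part of the text. The names `gamma_c` and `p_rem` are illustrative.

```python
import math

BETA_C = math.sqrt(2.0 * math.log(2.0))

def p_rem(beta):
    """REM pressure (4)."""
    if beta <= BETA_C:
        return 0.5 * beta ** 2
    return 0.5 * BETA_C ** 2 + (beta - BETA_C) * BETA_C

def gamma_c(beta):
    """Critical field beta^{-1} arcosh(exp(p_rem(beta))) for beta > 0.

    Uses arcosh(e^p) = p + ln(1 + sqrt(1 - e^{-2p})), which stays finite
    even when exp(p) itself would overflow.
    """
    p = p_rem(beta)
    return (p + math.log1p(math.sqrt(-math.expm1(-2.0 * p)))) / beta

print(gamma_c(1e-6))                                  # close to 1
print(gamma_c(BETA_C), math.acosh(2.0) / BETA_C)      # the two values agree
print(gamma_c(1e6), BETA_C)                           # low-temperature limit
```

By construction, \( \ln \cosh (\beta \varGamma _c(\beta )) = p^{\mathrm {REM}}(\beta ) \), so the two branches of the maximum in Theorem 1 cross exactly at \( \varGamma _c(\beta ) \).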

1.2 Comments and Open Problems

We close the introduction with some further comments and open problems:

  1.

    For the quantum p-spin model it is conjectured that the structure of the phase diagram in Fig. 1 only changes smoothly in 1/p at low temperatures (see e.g. [13]). Non-rigorous 1/p expansions in a replica analysis have been the basis of these assertions. (A tiny step towards a proof of the continuity of the pressure at \( p =\infty \) has been undertaken recently on the basis of the methods presented here in [21].)

    Such expansion-based arguments have been extended in [22] to cover the case of ferromagnetic bias, in which the Gaussian spin-p couplings are tilted towards a ferromagnetic interaction. The paper [22] argues that the spin glass phase will also disappear in favour of a ferromagnetic phase for sufficiently large tilting.

  2.

    As in the classical case, the quenched pressure \( \mathbb {E}\left[ p_N(\beta , \varGamma ) \right] \) is generally smaller than the annealed pressure \( N^{-1} \ln \mathbb {E}\left[ Z(\beta ,\varGamma )\right] \). However, in the high-temperature phase, \( \beta < \beta _c \), asymptotic equality holds—even in the quantum case as is not hard to show by performing the annealed average in the path-integral representation. The fluctuation properties of the partition function are well studied in classical cases (see e.g. [3, 8] and [7, Ch. 9–10] for further references). We leave it to a future work to extend these results to the quantum case.

  3.

    For a large class of mean-field spin glasses, the pressure in the thermodynamic limit is known to be universal in that it does not depend on the details of the randomness (cf. [28] and references therein). Such universality results have been extended to the quantum case in [10].

  4.

    Most recently, there has been some progress in understanding the free energy of the quantum Sherrington–Kirkpatrick model. The absence of a spin-glass phase for high temperatures was addressed in [20]. In particular, it is shown that in the high-temperature phase the quenched pressure asymptotically coincides with the annealed pressure thereby generalising some of the results in [3]. The paper [2] identified the thermodynamic limit of the quenched pressure with a certain limit of a variational principle involving classical vector-spin glasses.

2 Proof

The proof of Theorem 1 consists of a pair of asymptotically coinciding upper and lower bounds.

Proof of Theorem 1

The assertion is a consequence of Lemma 1 and Corollary 1 below. \(\square \)

The following two subsections contain the details of the argument.

2.1 Lower Bound

Not surprisingly, our lower bound is more robust and also holds for more general p-spin models. Let us first recall that if \( U(\sigma ) \) is a Gaussian random field of the form (2) with \( p \in [1,\infty ] \), then its pressure

$$\begin{aligned} p^{\mathrm {U}}(\beta ) := \lim _{N\rightarrow \infty } N^{-1} \ln 2^{-N} \sum _{\sigma \in \mathcal {Q}_N} e^{-\beta U(\sigma ) } \end{aligned}$$
(6)

is known to converge almost surely to a non-random expression, which is in fact given by the famous Parisi formula [23, 24, 27, 28]. In the special case \( p= \infty \) this reduces to \( p^{\mathrm {U}}(\beta ) = p^{\mathrm {REM}}(\beta ) \).

Lemma 1

Consider the quantum p-spin model, i.e. \( H = \varGamma \, T + U \) with U diagonal and Gaussian of the form (2) with \( p \in [1,\infty ] \). For any \( \varGamma , \beta \ge 0 \) and almost surely

$$\begin{aligned} \liminf _{N\rightarrow \infty } p_N(\beta , \varGamma ) \ge \max \{ p^{\mathrm {U}}(\beta ) , p^{\mathrm {PAR}}(\beta \varGamma ) \} . \end{aligned}$$
(7)

Proof

We use the Gibbs variational principle,

$$\begin{aligned} \ln {{\text {Tr}}\,}e^{-\beta H} = - \inf _{\varrho } \left[ \beta \, {{\text {Tr}}\,}\left( H \varrho \right) + {{\text {Tr}}\,}\left( \varrho \ln \varrho \right) \right] \end{aligned}$$
(8)

in which the infimum is taken over all density matrices, \( \varrho \ge 0 \), \( {{\text {Tr}}\,}\varrho = 1 \), on \( \ell ^2(\mathcal {Q}_N) \). There are two natural choices:

  1.

    We may pick \( \varrho = e^{-\beta U }/ {{\text {Tr}}\,}e^{-\beta U } \). In this case, the right-hand side in (8) is bounded from below by

    $$\begin{aligned} \ln {{\text {Tr}}\,}e^{-\beta U } - \beta \varGamma {{\text {Tr}}\,}\left( T \, \varrho \right) = \ln {{\text {Tr}}\,}e^{-\beta U } . \end{aligned}$$

    The last step follows from the fact that the diagonal matrix elements of T vanish. Consequently, we arrive at the bound,

    $$\begin{aligned} p_N(\beta , \varGamma ) \ge \frac{1}{N} \ln \left( \frac{1}{2^N} \sum _{\sigma \in \mathcal {Q}_N} e^{-\beta U(\sigma ) }\right) , \end{aligned}$$

    which together with the known convergence (6) yields the first part of the claim.

  2.

    We may also pick \( \varrho = e^{-\beta \varGamma T }/ {{\text {Tr}}\,}e^{-\beta \varGamma T } \). In this case, the right-hand side in (8) reduces to

    $$\begin{aligned} \ln {{\text {Tr}}\,}e^{-\beta \varGamma T } - \beta \, {{\text {Tr}}\,}\left( U \varrho \right) = N \ln \left( 2\cosh (\beta \varGamma ) \right) - \frac{\beta }{2^N} \sum _{\sigma \in \mathcal {Q}_N} U(\sigma ) , \end{aligned}$$

    where we used \( \langle \sigma | e^{-\beta \varGamma T } | \sigma \rangle = \cosh (\beta \varGamma )^N \) for the diagonal matrix element of the semigroup generated by \(- T\). Consequently, we arrive at the bound,

    $$\begin{aligned} p_N(\beta , \varGamma ) \ge p^{\mathrm {PAR}}(\beta \varGamma ) - \frac{\beta }{N 2^N} \sum _{\sigma \in \mathcal {Q}_N} U(\sigma ) . \end{aligned}$$

    The last term converges to zero almost surely by the strong law of large numbers. More precisely, for any \( \varepsilon > 0 \), an exponential Chebychev bound yields

    $$\begin{aligned} \mathbb {P}\left( \frac{1 }{N 2^N} \sum _{\sigma \in \mathcal {Q}_N} U(\sigma ) > \varepsilon \right)&\le e^{- N \varepsilon ^2/2 } \, \mathbb {E}\left[ \exp \left( \frac{ \varepsilon }{2^{N+1}} \sum _{\sigma \in \mathcal {Q}_N} U(\sigma ) \right) \right] \\&= e^{- N \varepsilon ^2/2 } \exp \left( \frac{\varepsilon ^2}{2^{2N+3}} \sum _{\sigma , \sigma '} N \, \xi _p(\sigma ,\sigma ') \right) \\&\le e^{- N \varepsilon ^2/4 } . \end{aligned}$$

    The same bound also applies to \( - \sum _{\sigma } U(\sigma ) \). Since the right-hand side is summable in N, a Borel–Cantelli argument ensures the claimed almost-sure convergence.

\(\square \)
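Both trial states in the proof above give deterministic lower bounds on \( \ln {{\text {Tr}}\,}e^{-\beta H} \), valid for every realisation of U, so they can be checked numerically for small N in the REM case \( p = \infty \). The following sketch assumes NumPy is available; the helper names `qrem_hamiltonian` and `log_trace_exp`, the integer encoding of configurations, and the parameter values are illustrative choices.

```python
import numpy as np

def qrem_hamiltonian(N, gamma, rng):
    """Return (H, U) with H = gamma * T + diag(U) on the 2^N configurations."""
    dim = 2 ** N
    T = np.zeros((dim, dim))
    for s in range(dim):
        for j in range(N):
            T[s, s ^ (1 << j)] = -1.0     # flip of spin j, encoded as XOR
    U = np.sqrt(N) * rng.standard_normal(dim)
    return gamma * T + np.diag(U), U

def log_trace_exp(M, beta):
    """ln Tr exp(-beta M) for symmetric M, via eigenvalues and log-sum-exp."""
    e = -beta * np.linalg.eigvalsh(M)
    m = e.max()
    return m + np.log(np.exp(e - m).sum())

N, beta, gamma = 6, 1.3, 0.8
rng = np.random.default_rng(0)
H, U = qrem_hamiltonian(N, gamma, rng)

lhs = log_trace_exp(H, beta)
# Trial state 1: Gibbs state of U alone.
bound_rem = log_trace_exp(np.diag(U), beta)
# Trial state 2: Gibbs state of gamma * T alone; beta * 2^{-N} sum U = beta * mean(U).
bound_par = N * np.log(2.0 * np.cosh(beta * gamma)) - beta * U.mean()

print(lhs, bound_rem, bound_par)
```

Dividing all three numbers by N and subtracting \( \ln 2 \) turns them into the pressures appearing in (7) at finite N.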

2.2 Upper Bound

Typical values of the REM \( U(\sigma ) \) fluctuate on order \( \mathcal {O}(\sqrt{N}) \). Our upper bound rests on the observation that configurations on which large negative deviations occur,

$$\begin{aligned} \mathcal {L}_\varepsilon := \left\{ \sigma \in \mathcal {Q}_N \, \big | \, U(\sigma ) \le - \varepsilon N \right\} , \end{aligned}$$
(9)

form gap-connected clusters whose maximal size remains bounded uniformly in N even for \( \varepsilon > 0 \) arbitrarily small. For the precise formulation of this result, it is useful to recall that the Hamming distance

$$\begin{aligned} d(\sigma , \sigma ') := \sum _{j=1}^N 1\left[ \sigma _j \ne \sigma _j' \right] \end{aligned}$$

renders \( \mathcal {Q}_N \) (through the nearest-neighbour relation) into a graph called the Hamming cube, in which each vertex has exactly N neighbours. For future purposes, we also introduce the Hamming ball of radius \( r \in [0, N] \) centered at \( \sigma \in \mathcal {Q}_N \),

$$\begin{aligned} B_r(\sigma ) := \left\{ \sigma ' \in \mathcal {Q}_N \, \big | \, d(\sigma , \sigma ') \le r \right\} . \end{aligned}$$

Its volume \( |B_{r} | \) is known to be bounded by \( \exp \left( N \gamma (r/N) \right) \) for all \( r < N/2 \) in terms of the binary entropy, \( \gamma (\xi ) := -\xi \ln \xi - (1-\xi ) \ln (1-\xi ) \). Here, a simpler bound is sufficient:

$$\begin{aligned} \left| B_r \right| = \sum _{j=0}^r \left( {\begin{array}{c}N\\ j\end{array}}\right) \le \sum _{j=0}^r \frac{N^j}{j!} \le e \, N^r . \end{aligned}$$
(10)
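The volume bounds for Hamming balls can be compared with exact values. The sketch below computes \( |B_r| \) from binomial coefficients and checks it against the simple bound (10) and against the entropy bound; the function names are illustrative, and the range of N is an arbitrary choice.

```python
import math

def ball_volume(N, r):
    """Exact number of vertices within Hamming distance r of a fixed vertex."""
    return sum(math.comb(N, j) for j in range(r + 1))

def simple_bound(N, r):
    """Right-hand side of (10): e * N^r."""
    return math.e * N ** r

def entropy_bound(N, r):
    """exp(N * gamma(r/N)) with the binary entropy gamma, for 0 < r < N/2."""
    xi = r / N
    gamma = -xi * math.log(xi) - (1.0 - xi) * math.log(1.0 - xi)
    return math.exp(N * gamma)

N = 20
for r in (1, 3, 5, 9):
    print(r, ball_volume(N, r), simple_bound(N, r), entropy_bound(N, r))
```

For fixed r the simple bound grows only polynomially in N, which is the property used in the proof of Lemma 2.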

Definition 1

Let \(\widetilde{\mathcal {Q}}_N\) be the supergraph of the Hamming cube \(\mathcal {Q}_N\), which one obtains by adding the edges \(\{ \sigma , \sigma ' \}\), where \(\sigma ,\sigma '\) are two vertices with \(d(\sigma ,\sigma ') =2\). We call \( \mathcal {C}_\varepsilon \subset \mathcal {L}_\varepsilon \) a gap-connected component, if \( \mathcal {C}_\varepsilon \) is connected as a subset of \(\widetilde{\mathcal {Q}}_N\). A gap-connected component \( \mathcal {C}_\varepsilon \) is maximal if there is no other vertex \( \sigma \in \mathcal {L}_\varepsilon \backslash \mathcal {C}_\varepsilon \) such that \( \mathcal {C}_\varepsilon \cup \{ \sigma \} \) forms a gap-connected component.

For each realisation of the randomness the large-deviation set then naturally decomposes into a finite (edge-)disjoint union of maximally gap-connected components,

$$\begin{aligned} \mathcal {L}_\varepsilon = \bigcup _\alpha \, \mathcal {C}_\varepsilon ^{(\alpha )} . \end{aligned}$$

On any gap-connected component \( \mathcal {C}_\varepsilon \) with more than one vertex, for every vertex \( \sigma \in \mathcal {C}_\varepsilon \) there is some \( \sigma ' \in \mathcal {C}_\varepsilon \backslash \{\sigma \} \) with \( d(\sigma , \sigma ') \in \{ 1, 2 \}\), but not necessarily \( d(\sigma , \sigma ') = 1 \). By construction, we thus have for all \( \alpha \ne \alpha ' \):

$$\begin{aligned} d\left( \mathcal {C}_\varepsilon ^{(\alpha )}, \mathcal {C}_\varepsilon ^{(\alpha ')} \right) = \min \left\{ d(\sigma , \sigma ') \, | \, \sigma \in \mathcal {C}_\varepsilon ^{(\alpha )} \wedge \sigma ' \in \mathcal {C}_\varepsilon ^{(\alpha ')} \right\} > 2. \end{aligned}$$
(11)
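The decomposition into maximal gap-connected components can be computed by a breadth-first search in which a step may flip one or two spins. The sketch below is one possible implementation under the assumption that configurations are encoded as integers (bit j for spin j); the names `hamming` and `gap_components` and the example set are hypothetical.

```python
from collections import deque
from itertools import combinations

def hamming(s, t):
    """Hamming distance between two configurations encoded as integers."""
    return bin(s ^ t).count("1")

def gap_components(L, N):
    """Split the vertex set L into maximal gap-connected components.

    Two vertices are adjacent in the supergraph of Definition 1 if their
    Hamming distance is 1 or 2, i.e. if they differ by one of these masks.
    """
    masks = [1 << j for j in range(N)]
    masks += [(1 << i) | (1 << j) for i, j in combinations(range(N), 2)]
    remaining = set(L)
    components = []
    while remaining:
        start = remaining.pop()
        comp, queue = {start}, deque([start])
        while queue:
            s = queue.popleft()
            for m in masks:
                t = s ^ m
                if t in remaining:
                    remaining.remove(t)
                    comp.add(t)
                    queue.append(t)
        components.append(comp)
    return components

L = [0b000000, 0b000011, 0b001111, 0b111000]
for comp in gap_components(L, 6):
    print(sorted(comp))
```

In this example the first three configurations are chained by steps of distance 2, while the last one is at distance at least 3 from all of them and forms its own component, in line with (11).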

The next lemma controls with good probability the size of each subset \( \mathcal {C}_\varepsilon ^{(\alpha )}\), which is just the number of its vertices and denoted by \( | \mathcal {C}_\varepsilon ^{(\alpha )} | \).

Lemma 2

For all \( \varepsilon > 0 \) and \( N \in \mathbb {N} \) there is some subset \( \varOmega _{\varepsilon , N} \) of realizations such that:

  1.

    for some \( c_\varepsilon > 0 \), which is independent of N, and all N large enough:

    $$\begin{aligned} \mathbb {P}\left( \varOmega _{\varepsilon , N} \right) \ge 1 - e^{- c_\varepsilon N } , \end{aligned}$$
  2.

    on \( \varOmega _{\varepsilon , N} \): \( \displaystyle \, \max _\alpha \big | \mathcal {C}_\varepsilon ^{(\alpha )} \big | < K_\varepsilon := \left\lceil \frac{4 \ln 2}{\varepsilon ^2} \right\rceil \).

Proof

We start by noting that the event

$$\begin{aligned} \varOmega _{\varepsilon , N} := \bigcap _{\sigma \in \mathcal {Q}_N} \left\{ \left| B_{r_\varepsilon }(\sigma ) \cap \mathcal {L}_\varepsilon \right| < K_\varepsilon \right\} \end{aligned}$$
(12)

with \( r_\varepsilon := 4K_\varepsilon \) implies the second assertion in the lemma. Indeed, on \( \varOmega _{\varepsilon , N} \) there are at most \( K_\varepsilon - 1 \) large-deviation sites in the ball of radius \( r_\varepsilon \) around any fixed \( \sigma \in \mathcal {L}_\varepsilon \). The gap-connected component to which \( \sigma \) belongs must therefore be contained in a ball of radius at most \( 2 (K_\varepsilon - 1 ) < r_\varepsilon -2 \), i.e. it cannot gap-connect to vertices outside the ball \(B_{r_\varepsilon }(\sigma ) \), and hence consists of fewer than \( K_\varepsilon \) vertices.

It therefore remains to estimate the probability of the event complementary to \( \varOmega _{\varepsilon , N} \). Using the union bound we obtain:

$$\begin{aligned}&\mathbb {P}\left( \bigcup _{\sigma \in \mathcal {Q}_N} \left\{ \left| B_{ r_\varepsilon }(\sigma ) \cap \mathcal {L}_\varepsilon \right| \ge K_\varepsilon \right\} \right) \le \sum _{\sigma \in \mathcal {Q}_N} \mathbb {P}\left( \left| B_{ r_\varepsilon }(\sigma ) \cap \mathcal {L}_\varepsilon \right| \ge K_\varepsilon \right) \nonumber \\&\quad \le \sum _{\sigma \in \mathcal {Q}_N} \sum _{j= K_\varepsilon }^{|B_{ r_\varepsilon } |} \mathbb {P}\left( \left| B_{ r_\varepsilon }(\sigma ) \cap \mathcal {L}_\varepsilon \right| = j \right) \nonumber \\&\quad \le 2^N \sum _{j= K_\varepsilon }^{|B_{ r_\varepsilon } |} \left( {\begin{array}{c}|B_{ r_\varepsilon } |\\ j\end{array}}\right) e^{- j \varepsilon ^2 N /2} \le 2^N \sum _{j=K_\varepsilon }^\infty \frac{|B_{ r_\varepsilon } |^j}{j!} e^{-j \varepsilon ^2 N /2} \nonumber \\&\quad \le 2^N \frac{|B_{ r_\varepsilon } |^{K_\varepsilon }}{K_\varepsilon !} e^{-K_\varepsilon \varepsilon ^2 N /2} \exp \left( |B_{ r_\varepsilon } | e^{- \varepsilon ^2 N /2 }\right) \nonumber \\&\quad \le \frac{|B_{ r_\varepsilon } |^{K_\varepsilon }}{K_\varepsilon !} e^{-K_\varepsilon \varepsilon ^2 N /4} \exp \left( |B_{ r_\varepsilon } | e^{- \varepsilon ^2 N /2 }\right) . \end{aligned}$$
(13)

Here the first inequality in the third line relies on the fact that the number of subsets of a given size equals the binomial coefficient. Moreover, specifying the large-deviation sites in \( B_{ r_\varepsilon }(\sigma ) \) allows one to compute the probability of the event using the independence of the random field \( U(\sigma ) \). To estimate this probability, we use the elementary estimate on the complementary error function,

$$\begin{aligned} \mathbb {P}\left( \sigma \in \mathcal {L}_\varepsilon \right) = \int _{-\infty }^{-\varepsilon \sqrt{N} } e^{-x^2/2} \frac{dx}{\sqrt{2\pi }} \le e^{-\varepsilon ^2 N /2} , \end{aligned}$$
(14)

as well as the trivial bound on the probability of the complementary elementary event. The last inequality in the third line of (13) results from a simple bound on the binomial coefficient. The fourth line is the standard estimate of the remainder of the exponential series. Finally, the last line follows from the definition of \( K_\varepsilon \). Since the volume of the ball \( |B_{ r_\varepsilon } | \) grows only polynomially in N by (10), the right-hand side of (13) is exponentially small for large enough N. This completes the proof. \(\square \)
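To get a feeling for the constants in Lemma 2, one can evaluate the logarithm of the last line of (13) numerically. The sketch below does this with the exact ball volume; the function names, the cut-off `r = min(4K, N)` for small N, and the overflow guard are implementation choices made here, not part of the proof.

```python
import math

def K_eps(eps):
    """Component-size threshold K_eps = ceil(4 ln 2 / eps^2) from Lemma 2."""
    return math.ceil(4.0 * math.log(2.0) / eps ** 2)

def gaussian_tail(eps, N):
    """P(sigma in L_eps) = P(g <= -eps * sqrt(N)) for a standard normal g."""
    return 0.5 * math.erfc(eps * math.sqrt(N) / math.sqrt(2.0))

def log_failure_bound(N, eps):
    """Logarithm of the last line of (13), using the exact volume |B_r|."""
    K = K_eps(eps)
    r = min(4 * K, N)                       # a ball cannot exceed the whole cube
    log_B = math.log(sum(math.comb(N, j) for j in range(r + 1)))
    # |B_r| * exp(-eps^2 N / 2), evaluated through logarithms
    exponent = log_B - eps ** 2 * N / 2.0
    tail = math.exp(exponent) if exponent < 700.0 else math.inf
    return K * log_B - math.lgamma(K + 1) - K * eps ** 2 * N / 4.0 + tail

print(K_eps(1.0))
for N in (50, 200, 500, 1000):
    print(N, log_failure_bound(N, 1.0))
```

For \( \varepsilon = 1 \) the bound is vacuous at small N and becomes negative only at larger N, consistent with the restriction to N large enough in the first assertion of Lemma 2.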

Our main idea behind an upper bound on the partition function \( Z(\beta , \varGamma ) \) is to decompose H into the multiplication operator U restricted to vertices in \( \mathcal {L}_\varepsilon \) and the QREM H restricted to the complementary set \( \mathcal {L}_\varepsilon ^c \) plus a remainder term \( A_{ \mathcal {L}_\varepsilon } \). For this purpose, we write \( \ell ^2(\mathcal {Q}_N) = \ell ^2( \mathcal {L}_\varepsilon ) \oplus \ell ^2( \mathcal {L}_\varepsilon ^c) \) and set \( U_{\mathcal {L}_\varepsilon } \) the multiplication operator by the REM values on \( \ell ^2( \mathcal {L}_\varepsilon ) \). On the orthogonal complement \( \ell ^2( \mathcal {L}_\varepsilon ^c) \), we define the natural restriction of (1). Note that \(-T \) is the adjacency matrix on the Hamming cube. In the restriction \( H_{\mathcal {L}_\varepsilon ^c} \), we simply restrict the adjacency matrix to the subgraph associated with \( \mathcal {L}_\varepsilon ^c \). We then define \( A_{ \mathcal {L}_\varepsilon } \) through:

$$\begin{aligned} H =: U_{\mathcal {L}_\varepsilon } \oplus H_{\mathcal {L}_\varepsilon ^c} - \varGamma A_{ \mathcal {L}_\varepsilon } . \end{aligned}$$
(15)

Clearly, the matrix elements of the remainder term are related to all edges reaching \( \mathcal {L}_\varepsilon \):

$$\begin{aligned} \langle \sigma | A_{ \mathcal {L}_\varepsilon } | \sigma ' \rangle = {\left\{ \begin{array}{ll} 1 &{}\quad \text{ if } d(\sigma , \sigma ') = 1 \text{ and } ( \sigma \in \mathcal {L}_\varepsilon \text{ or } \sigma ' \in \mathcal {L}_\varepsilon ) , \\ 0 &{}\quad \text{ else. } \end{array}\right. } \end{aligned}$$
(16)

The following lemma contains an estimate on the operator norm of the remainder. In case the components in the decompositions are of small size, this estimate is not so wasteful.

Lemma 3

Let \( \mathcal {L}_\varepsilon = \bigcup _\alpha \, \mathcal {C}_\varepsilon ^{(\alpha )} \) stand for a finite (edge-)disjoint union of maximally gap-connected components of the large deviation set (9). Then

$$\begin{aligned} \left\| A_{ \mathcal {L}_\varepsilon } \right\| \le \sqrt{2N \, \max _{\alpha } \big | \mathcal {C}_\varepsilon ^{(\alpha )} \big | } . \end{aligned}$$
(17)

Proof

Since the components are edge-disjoint in the sense that (11) holds, we have

$$\begin{aligned} \left\| A_{ \mathcal {L}_\varepsilon } \right\| = \max _{\alpha } \big \Vert A_{ \mathcal {C}_\varepsilon ^{(\alpha ) }} \big \Vert , \end{aligned}$$

where the operators in the right-hand side satisfy (16) with \( \mathcal {L}_\varepsilon \) substituted by \( \mathcal {C}_\varepsilon ^{(\alpha ) }\). Consequently, their operator norms are bounded by a Frobenius estimate

$$\begin{aligned} \big \Vert A_{ \mathcal {C}_\varepsilon ^{(\alpha ) }} \big \Vert \le \sqrt{\sum _{\sigma , \sigma ' } \left| \langle \sigma | A_{\mathcal {C}_\varepsilon ^{(\alpha )}} | \sigma ' \rangle \right| ^2} . \end{aligned}$$

Since the double sum is restricted to \( \sigma \in \mathcal {C}_\varepsilon ^{(\alpha ) } \) or \( \sigma ' \in \mathcal {C}_\varepsilon ^{(\alpha ) } \) and, in each of the two cases, the other sum has at most N terms, the assertion follows. \(\square \)
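The bound (17) is deterministic and can be tested on an arbitrary vertex set. The following sketch, which assumes NumPy, builds \( A_{\mathcal {L}} \) from (16) for a randomly chosen set (not a REM large-deviation set; the choice is only for illustration), computes its spectral norm, and compares it with \( \sqrt{2N \max _\alpha |\mathcal {C}^{(\alpha )}|} \). All function names and parameters are hypothetical.

```python
from collections import deque
from itertools import combinations

import numpy as np

def gap_components(L, N):
    """Maximal gap-connected components: BFS with one- or two-spin flips."""
    masks = [1 << j for j in range(N)]
    masks += [(1 << i) | (1 << j) for i, j in combinations(range(N), 2)]
    remaining, components = set(L), []
    while remaining:
        start = remaining.pop()
        comp, queue = {start}, deque([start])
        while queue:
            s = queue.popleft()
            for m in masks:
                t = s ^ m
                if t in remaining:
                    remaining.remove(t)
                    comp.add(t)
                    queue.append(t)
        components.append(comp)
    return components

def remainder_matrix(L, N):
    """A_L from (16): entry 1 on every Hamming-cube edge with an endpoint in L."""
    dim = 2 ** N
    A = np.zeros((dim, dim))
    for s in L:
        for j in range(N):
            t = s ^ (1 << j)
            A[s, t] = A[t, s] = 1.0
    return A

N = 8
rng = np.random.default_rng(1)
L = set(rng.choice(2 ** N, size=12, replace=False).tolist())

norm = np.linalg.norm(remainder_matrix(L, N), 2)      # spectral norm
max_size = max(len(c) for c in gap_components(L, N))
bound = np.sqrt(2 * N * max_size)
print(norm, bound)
```

For a single isolated vertex the matrix is the adjacency matrix of a star with N edges, whose norm is \( \sqrt{N} \); this shows that the \( \sqrt{N} \) scaling in (17) cannot be improved in general.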

The fact that the bound on the operator norm in the preceding lemma grows only like \( \sqrt{N} \), rather than linearly in N, might sound remarkable at first sight. However, we remind the reader that even the norm of the full adjacency matrix \( -T_{B_{N\rho }} \) restricted to a Hamming ball of radius \(N\rho \) with \( \rho \in (0,1/2) \) is known [16] to be bounded by \(\big \Vert T_{B_{N\rho }} \big \Vert \le 2 N\sqrt{\rho (1-\rho )} +o(N) \).

We are now ready to conclude our asymptotically sharp upper bound.

Corollary 1

For any \( \varGamma , \beta \ge 0 \) almost surely:

$$\begin{aligned} \limsup _{N\rightarrow \infty } p_N(\beta ,\varGamma ) \le \max \left\{ p^{\mathrm {REM}}(\beta ), p^{\mathrm {PAR}}(\beta \varGamma ) \right\} \, . \end{aligned}$$

Proof

We pick \( \varepsilon > 0 \) arbitrarily small and start from the decomposition (15) of the Hamiltonian. The Golden–Thompson inequality yields

$$\begin{aligned} Z(\beta , \varGamma )&\le 2^{-N}\, {{\text {Tr}}\,}e^{-\beta \left( U_{\mathcal {L}_\varepsilon } \oplus H_{\mathcal {L}_\varepsilon ^c} \right) } \, e^{ \beta \varGamma A_{ \mathcal {L}_\varepsilon }} \\&\le 2^{-N}\, e^{ \beta \varGamma \Vert A_{ \mathcal {L}_\varepsilon } \Vert } \left( {{\text {Tr}}\,}_{\ell ^2(\mathcal {L}_\varepsilon )} e^{-\beta U_{\mathcal {L}_\varepsilon } } + {{\text {Tr}}\,}_{\ell ^2(\mathcal {L}_\varepsilon ^c)} e^{-\beta H_{\mathcal {L}_\varepsilon ^c}} \right) . \end{aligned}$$

The first term in the bracket on the right-hand side is trivially estimated in terms of the partition function of the REM:

$$\begin{aligned} 2^{-N}\, {{\text {Tr}}\,}_{\ell ^2(\mathcal {L}_\varepsilon )} e^{-\beta U_{\mathcal {L}_\varepsilon } } \le Z(\beta ,0) = e^{N p_N(\beta ,0) } . \end{aligned}$$

For the second term we use the fact that the adjacency matrix \( - T_{\mathcal {L}_\varepsilon ^c} \) has non-negative matrix elements and hence generates a positivity preserving semigroup on \( \ell ^2(\mathcal {L}_\varepsilon ^c)\). Since the diagonal values of its perturbation are bounded from below by \( - \varepsilon N \) by assumption on \( \mathcal {L}_\varepsilon ^c \), we conclude

$$\begin{aligned} 2^{-N}\, {{\text {Tr}}\,}_{\ell ^2(\mathcal {L}_\varepsilon ^c)} e^{-\beta H_{\mathcal {L}_\varepsilon ^c}}&\le e^{\beta \varepsilon N} 2^{-N}\, {{\text {Tr}}\,}_{\ell ^2(\mathcal {L}_\varepsilon ^c)} e^{-\beta \varGamma T_{\mathcal {L}_\varepsilon ^c} } \\&\le e^{\beta \varepsilon N} 2^{-N}\, {{\text {Tr}}\,}e^{-\beta \varGamma T} = \exp \left( N \left( \beta \varepsilon + p^{\mathrm {PAR}}(\beta \varGamma ) \right) \right) . \end{aligned}$$

Here, the last inequality follows from the monotonicity of \(e^{-\beta \varGamma T_{\mathcal {L}_\varepsilon ^c}}\) with respect to \(\mathcal {L}_\varepsilon ^c\), which is in turn a consequence of the non-negativity of the matrix elements of the adjacency matrix. To summarize, we thus obtain

$$\begin{aligned} p_N(\beta ,\varGamma ) \le \max \left\{ p_N(\beta ,0) , \beta \varepsilon + p^{\mathrm {PAR}}(\beta \varGamma ) \right\} + \tfrac{ 1}{N} \left( \beta \varGamma \, \Vert A_{ \mathcal {L}_\varepsilon } \Vert + \ln 2 \right) . \end{aligned}$$
(18)

According to Lemma 2 there is some \( \varOmega _{\varepsilon , N} \) whose complementary probability is exponentially small in N and on which Lemma 3 guarantees that for all N large enough:

$$\begin{aligned} p_N(\beta ,\varGamma ) \le \max \left\{ p_N(\beta ,0) , p^{\mathrm {PAR}}(\beta \varGamma ) \right\} + 2 \beta \varepsilon \, . \end{aligned}$$

Since the probabilities of the complementary event are summable in N, a Borel–Cantelli argument together with the known almost sure convergence (4) of the REM thus finishes the proof. \(\square \)
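Inequality (18) holds for every realisation of the randomness and every finite N, so it can be verified numerically on small systems. The sketch below, which assumes NumPy, does so for one sample; the parameter values, the integer encoding of configurations and the helper name `pressure` are illustrative assumptions.

```python
import numpy as np

def pressure(M, beta, N):
    """N^{-1} ln( 2^{-N} Tr exp(-beta M) ) for a symmetric matrix M."""
    e = -beta * np.linalg.eigvalsh(M)
    m = e.max()
    return (m + np.log(np.exp(e - m).sum()) - N * np.log(2.0)) / N

N, beta, gamma, eps = 8, 1.5, 0.7, 0.5
rng = np.random.default_rng(2)
dim = 2 ** N

adjacency = np.zeros((dim, dim))            # this is -T
for s in range(dim):
    for j in range(N):
        adjacency[s, s ^ (1 << j)] = 1.0

U = np.sqrt(N) * rng.standard_normal(dim)   # REM energies
H = -gamma * adjacency + np.diag(U)         # H = gamma * T + U

in_L = U <= -eps * N                        # large-deviation set (9)
touches_L = in_L[:, None] | in_L[None, :]   # edges with an endpoint in L_eps
A = adjacency * touches_L                   # remainder term (16)

lhs = pressure(H, beta, N)
p_rem_N = pressure(np.diag(U), beta, N)     # p_N(beta, 0)
p_par = np.log(np.cosh(beta * gamma))
rhs = max(p_rem_N, beta * eps + p_par) \
    + (beta * gamma * np.linalg.norm(A, 2) + np.log(2.0)) / N

print(lhs, rhs)
```

At such small N the correction term \( N^{-1} ( \beta \varGamma \Vert A_{\mathcal {L}_\varepsilon } \Vert + \ln 2 ) \) is not yet small; the proof uses that it vanishes as \( N \rightarrow \infty \) on the event \( \varOmega _{\varepsilon , N} \).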