1 Introduction

Rayleigh–Schrödinger perturbation theory (RSPT) [1] is a simple, yet powerful tool for approximating Hamiltonian spectra and eigenfunctions. Its application is so ubiquitous that anyone who has ever done any quantum mechanics calculations is likely to have used it at some point. Corrections to the eigenvalues and eigenvectors of an unperturbed problem are given as a power series in a small perturbation. In the standard textbook approach (e.g. [2, Ch. 11]) corrections are determined recursively as a function of all lower-order terms. Explicit expressions have long been known as well [3,4,5,6,7,8], but, notably, not for the normalised eigenfunctions.

Kato [3, 4] gave the first explicit solution of (generally degenerate) RSPT. Instead of choosing an arbitrary eigenbasis, he stated the results in terms of projectors onto (possibly still degenerate) eigenspaces. Bloch [5] modified these projectors, reducing the number of terms by a factor of 2. He assumed a perturbation that completely lifts the degeneracy and concerned himself with the construction of an appropriate basis of the degenerate unperturbed eigenspace (“les ‘bonnes fonctions’ non perturbées”). Bloch also introduced the diagrammatic representation described below, as well as an alternative choice of bonnes fonctions that allowed for the restriction to a subset of diagrams, called convex, further reducing the number of terms in order n by a factor of \((n+1)/2\).

Earlier, following Brueckner [9], Goldstone [10] used Feynman diagrams to explicitly write down corrections to the nondegenerate ground state of an interacting fermionic system. Huby [6] restated Bloch’s results in a form suggested by Brueckner [9], where the same terms are constructed in a different way. Because he considered the nondegenerate case, he could give explicit formulas for eigenvectors and not merely projectors; these expressions for the eigenvectors were not normalised. Salzman [7] similarly focused on the nondegenerate, unnormalised case and developed a new diagrammatic formalism, set up to collect equivalent terms. This in principle allows for a further reduction in the number of terms; the rules he gave for constructing diagrams do not provide this reduction automatically, however, so equivalent terms still had to be collected manually. From the more mathematical direction, the equivalence of Bloch’s diagram counting with that of the leaves of ordered trees can be found in the work of Stanley [11]. More recent surveys have related term counting in Rayleigh–Schrödinger perturbation theory to other combinatorial objects [12].

Silverstone and Holloway [8] derived alternative formulas for the nondegenerate eigenvalues and their unnormalised eigenvectors that formally lead to the fewest terms, albeit at the price of evaluating a large number of derivatives. Quantifying the number of terms in the resulting Silverstone–Holloway expression, beyond ‘more than the number of partitions of n into positive integers,’ is a non-trivial task. More recently, Magesan and Gambetta [13] developed a formalism that preserves norms exactly by perturbatively expanding the generator of a unitary operator in a power series. For a given order, the canonical transformation of the Hamiltonian by that unitary is then in turn series expanded. This method does not directly give an explicit series for eigenvectors and -values. Bloch’s original work still finds application in the context of effective Hamiltonians in Jordan and Farhi’s arbitrary order perturbative gadgets [14].

In the present work, we consider anew the perturbation of a nondegenerate eigenvalue in standard RSPT. The phase and normalisation freedom of the eigenvector significantly influences the expansion. To the best of our knowledge, here we give the first explicit solution which preserves the norm of the eigenvector at 1 in every order.

In Sect. 2, we briefly review nondegenerate RSPT and the main results of Bloch [5] that we build upon. Our new results are derived in Sect. 3. We comment on their efficiency and how they can be improved in Sect. 4. Finally, in Sect. 5, we focus more on the diagrammatic aspect and show how our results work in practice, going up to fourth order, and conclude in Sect. 6.

2 Rayleigh–Schrödinger Perturbation Theory

2.1 Recursive Definition

Given a Hamiltonian \(H=H_0+\epsilon V\) parameterised by \(\epsilon \in [0,1]\), we assume we can expand any of its eigenvalues \(\lambda \), and the corresponding eigenvector \(\left| \lambda \right\rangle \), in a power series in \(\epsilon \),

$$\begin{aligned} \lambda =\sum _{n=0}^\infty \epsilon ^n\lambda _n,\qquad \left| \lambda \right\rangle =\sum _{n=0}^\infty \epsilon ^n\left| \lambda _n \right\rangle , \end{aligned}$$
(1)

i.e. they satisfy

$$\begin{aligned} \left( H_0+\epsilon V\right) \sum _{n=0}^\infty \epsilon ^n\left| \lambda _n \right\rangle =\sum _{n,m=0}^\infty \epsilon ^{n+m}\left| \lambda _m \right\rangle \lambda _n. \end{aligned}$$
(2)

Sorting Eq. (2) by powers of \(\epsilon \) and equating the coefficients gives in zeroth order \(H_0\left| \lambda _0 \right\rangle =\lambda _0\left| \lambda _0 \right\rangle \). Usually \(H_0\) is chosen to be analytically diagonalisable; we call \(\lambda _0\) and \(\left| \lambda _0 \right\rangle \) the unperturbed eigenenergies and -vectors, respectively. Here we further assume that they are discrete and nondegenerate, and we define the complementary projectors

$$\begin{aligned} P_{\lambda }=\left| \lambda _0 \right\rangle \left\langle \lambda _0 \right| ,\qquad Q_{\lambda }=1-P_{\lambda }. \end{aligned}$$
(3)

For nonzero powers \(n\in \mathbb {N}\) of \(\epsilon \), Eq. (2) gives

$$\begin{aligned} H_0\left| \lambda _{n} \right\rangle +V\left| \lambda _{n-1} \right\rangle =\sum _{m=0}^n\left| \lambda _{n-m} \right\rangle \lambda _m. \end{aligned}$$
(4)

We can then consider the \(P_{\lambda }\)- and \(Q_{\lambda }\)-components of Eq. (4) separately to derive equations for \(\lambda _n\) and \(\left| \lambda _{n} \right\rangle \), respectively. For the energies we get

$$\begin{aligned} \lambda _n=\left\langle \lambda _{0} \right| V\left| \lambda _{n-1} \right\rangle -\sum _{m=1}^{n-1}\left\langle \lambda _{0} \big | \lambda _{n-m} \right\rangle \lambda _m. \end{aligned}$$
(5)

Note, here and throughout the paper, the convention

$$\begin{aligned} \sum _{n=N}^M A_n=0\quad \text {for any sequence }A_n\text { if }M<N \end{aligned}$$
(6)

applies. The \(Q_{\lambda }\)-component gives

$$\begin{aligned} Q_{\lambda }\left| \lambda _{n} \right\rangle =\frac{Q_{\lambda }}{\lambda _0-H_0}\left( V\left| \lambda _{n-1} \right\rangle -\sum _{m=1}^{n-1}\left| \lambda _{n-m} \right\rangle \lambda _m\right) , \end{aligned}$$
(7)

where on the right-hand side we see appearing the reduced resolvent

$$\begin{aligned} S=\frac{Q_{\lambda }}{\lambda _0-H_0}=Q_{\lambda }\frac{1}{\lambda _0-H_0}Q_{\lambda }=\sum _{\lambda ^\prime \ne \lambda }\frac{\left| \lambda ^\prime _0 \right\rangle \left\langle \lambda ^\prime _0 \right| }{\lambda _0-\lambda ^\prime _0} \end{aligned}$$
(8)

which, since by assumption \(\lambda _0\) is nondegenerate, is well-defined. For the sake of a compact notation, there is no index \(\lambda \) on S, but it should be understood as implicit. For later reference we also define powers of S, where it will be convenient to define \(S^0\) separately [3]

$$\begin{aligned} S^0=-P_{\lambda },\qquad S^k=\frac{Q_{\lambda }}{\left( \lambda _0-H_0\right) ^k}\quad \text {for }k\in \mathbb {N}. \end{aligned}$$
(9)

Clearly \(\left\langle \lambda _{0} \big | \lambda _{n} \right\rangle \) is not constrained by Eq. (7) or Eq. (5), and by extension Eq. (2). The simplest choice, and one employed by many authors [6,7,8], is \(\left\langle \lambda _{0} \big | \lambda _{n} \right\rangle =0\). But it can be more convenient to use this degree of freedom to normalise the eigenvector to 1 in every order, i.e.

$$\begin{aligned} \sum _{n,m=0}^N\epsilon ^{n+m}\left\langle \lambda _{n} \big | \lambda _{m} \right\rangle =1+\mathcal {O}\left( \epsilon ^{N+1}\right) \qquad \forall \,N\in \mathbb {N}_0. \end{aligned}$$
(10)

This way the calculated eigenvectors immediately form an orthonormal basis (up to higher-order terms) and can be used straightforwardly to calculate expectation values without having to manually renormalise. Equation (10) requires the unperturbed eigenvector to be normalised, \(\left\langle \lambda _{0} \big | \lambda _{0} \right\rangle =1\), and fixes the real part of \(\left\langle \lambda _{0} \big | \lambda _{n} \right\rangle \). Choosing to set the imaginary part to 0, we arrive at

$$\begin{aligned} \left\langle \lambda _{0} \big | \lambda _{n} \right\rangle =-\frac{1}{2}\sum _{m=1}^{n-1}\left\langle \lambda _{m} \big | \lambda _{n-m} \right\rangle . \end{aligned}$$
(11)

This is not a unique phase choice (see Sect. 4), though it is the conventional one [2, Ch. 11]. In Theorem 1, we collect Eqs. (5), (7) and (11). This is not a new result, but it is rarely stated explicitly for arbitrary orders.

Theorem 1

(Cohen-Tannoudji et al.) The sequences \(\lambda _n\) and \(\left| \lambda _{n} \right\rangle \), \(n\in \mathbb {N}\) that satisfy the coupled recurrence relations

$$\begin{aligned} \lambda _n&= \left\langle \lambda _{0} \right| \left( V\left| \lambda _{n-1} \right\rangle -\sum _{m=1}^{n-1}\left| \lambda _{n-m} \right\rangle \lambda _m\right) , \end{aligned}$$
(12)
$$\begin{aligned} \left| \lambda _{n} \right\rangle&= -\frac{1}{2}\left| \lambda _{0} \right\rangle \sum _{m=1}^{n-1}\left\langle \lambda _{m} \big | \lambda _{n-m} \right\rangle +\frac{Q_{\lambda }}{\lambda _0-H_0}\left( V\left| \lambda _{n-1} \right\rangle -\sum _{m=1}^{n-1}\left| \lambda _{n-m} \right\rangle \lambda _m\right) , \end{aligned}$$
(13)

where the starting values \(\lambda _0\), \(\left| \lambda _{0} \right\rangle \) are an eigenvalue and corresponding unit eigenvector of \(H_0\), respectively, solve Eq. (2) while preserving the normalisation of \(\sum _{n=0}^N\epsilon ^n\left| \lambda _{n} \right\rangle \) for every \(N\ge 0\) [2, Ch. 11].
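The recurrences of Theorem 1 translate directly into a short numerical routine. The following sketch (the function name and the test Hamiltonian are our own, chosen purely for illustration; a real symmetric V keeps all quantities real) computes the corrections for a diagonal \(H_0\) and can be checked against exact diagonalisation:

```python
import numpy as np

def rspt(H0_diag, V, level, order):
    """Normalised RSPT per Theorem 1: eigenvalue corrections lam[n], Eq. (12),
    and eigenvector corrections vec[n], Eq. (13), for the nondegenerate
    `level` of the diagonal unperturbed Hamiltonian H0_diag."""
    d = len(H0_diag)
    lam = [float(H0_diag[level])]
    vec = [np.eye(d)[level]]
    denom = lam[0] - np.asarray(H0_diag, dtype=float)
    denom[level] = np.inf              # 1/inf = 0 implements the projector Q
    S = np.diag(1.0 / denom)           # reduced resolvent, Eq. (8)
    for n in range(1, order + 1):
        # common bracket of Eqs. (12) and (13)
        r = V @ vec[n - 1] - sum(lam[m] * vec[n - m] for m in range(1, n))
        lam.append(vec[0] @ r)                                   # Eq. (12)
        overlap = -0.5 * sum(vec[m] @ vec[n - m] for m in range(1, n))
        vec.append(overlap * vec[0] + S @ r)                     # Eq. (13)
    return lam, vec
```

For small \(\epsilon \), the partial sums \(\sum _n\epsilon ^n\lambda _n\) approach the exact eigenvalue, and the partial sums of the eigenvector stay normalised to the computed order, as Eq. (10) demands.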

2.2 Bloch Sequences and Diagrams

Fig. 1

Bloch diagrams. On the left are all diagrams for \(n=1\) and \(n=2\), including the non-convex (0, 2). On the right is some order n diagram. Following Bloch [5], we sometimes use curved, dashed lines to represent an arbitrary diagram

Bloch’s seminal paper on degenerate RSPT [5] was the main inspiration for this paper. In this subsection, we summarise the relevant definitions and results that we adopt from there. These also apply to the nondegenerate case straightforwardly, see e.g. [6], and are adapted to our notation accordingly. The result for the eigenvector is

$$\begin{aligned} \left| \overline{\lambda _n} \right\rangle ={\sum _{\{k\}_n}}'\,S^{k_{1}}VS^{k_{2}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle , \end{aligned}$$
(14)

where the sum is over Bloch sequences of length n defined by

$$\begin{aligned} {\{k\}_n}:\,\,\,\,\,\,k_i\in \mathbb {N}_0,\,i=1,2,\dots ,n,\qquad \sum _{i=1}^n k_i=n, \end{aligned}$$
(15)

and the prime indicates it is restricted to those sequences that satisfy

$$\begin{aligned} \sum _{i=1}^p k_i\ge p,\quad p=1,2,\dots ,n-1. \end{aligned}$$
(16)

\(\left| \overline{\lambda _n} \right\rangle \) is a solution to Eq. (2); it is not normalised, but satisfies the condition \(\left\langle \lambda _0\big |\overline{\lambda _n}\right\rangle =0\), so that \(\left\langle \lambda _0\big |\overline{\lambda }\right\rangle =1\). The overline distinguishes it from the normalised correction defined in Theorem 1.

Using Eq. (14), Eq. (5) becomes

$$\begin{aligned} \lambda _n={\sum _{\{k\}_{n-1}}}'\left\langle \lambda _{0} \right| VS^{k_{1}}VS^{k_{2}}V\!\dots S^{k_{n-1}}V\left| \lambda _{0} \right\rangle . \end{aligned}$$
(17)

Bloch sequences can be represented graphically as staircase diagrams where step i has height \(k_i\) and width 1, as illustrated in Fig. 1. The diagrams satisfying Eq. (16) are those that always stay above the diagonal. They are also called convex and are known in the combinatorics literature as Dyck paths [15, p.76].
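A short enumeration sketch (the helper names are ours) makes this correspondence concrete: it generates all Bloch sequences of length n, Eq. (15), and filters the convex ones, Eq. (16), which are counted by the Catalan numbers.

```python
from math import comb

def bloch_sequences(n):
    """All Bloch sequences of length n: k_i >= 0 with sum k_i = n, Eq. (15)."""
    def rec(prefix, remaining, slots):
        if slots == 1:
            yield prefix + (remaining,)
        else:
            for k in range(remaining + 1):
                yield from rec(prefix + (k,), remaining - k, slots - 1)
    return rec((), n, n)

def is_convex(seq):
    """Partial sums stay on or above the diagonal, Eq. (16)."""
    return all(sum(seq[:p]) >= p for p in range(1, len(seq)))
```

For n = 3, for instance, there are 10 Bloch sequences, of which 5 (the Catalan number \(C_3\)) are convex.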

3 Stepwise Diagrammatic Solution

We find that when we require a normalised state vector, given the phase choice Eq. (11), the basic structure of the solution is retained:

Theorem 2

The coupled recurrence relations in Theorem 1 are solved by \(\lambda _n\), \(\left| \lambda _{n} \right\rangle \) of the form

$$\begin{aligned} \lambda _n&= \sum _{\{k\}_{n-1}}e\!\left( k_1,k_2,\dots ,k_{n-1}\right) \left\langle \lambda _{0} \right| VS^{k_{1}}VS^{k_{2}}V\!\dots S^{k_{n-1}}V \left| \lambda _{0} \right\rangle , \end{aligned}$$
(18)
$$\begin{aligned} \left| \lambda _{n} \right\rangle&= \sum _{\{k\}_n} c\!\left( k_1,k_2,\dots ,k_n\right) S^{k_{1}}VS^{k_{2}}V\!\dots S^{k_{n}}V \left| \lambda _{0} \right\rangle . \end{aligned}$$
(19)

where c, e are rational-valued functions.

Note that the absence of the primes on the sums in Eqs. (18) and (19) means that we must also allow non-convex diagrams (recall Eq. (16)).

Note further that we have again introduced a series for the eigenvalue, Eq. (18), which may appear unnecessary given Bloch’s existing result, Eq. (17), with known, simple values for the coefficients e. One should, however, view Eq. (18) as an auxiliary equation: not useful for the actual evaluation of \(\lambda _n\) (Bloch’s result is best for that), but quite useful for obtaining the coefficients c in Eq. (19) according to the recurrence derived below. This is made possible by the non-uniqueness of the perturbation theory in terms of Bloch diagrams, which we explore more thoroughly in Sect. 4.

Proof

This is by complete induction on n. For \(n=1\), Theorem 1 gives the first-order corrections \(\lambda _1=\left\langle \lambda _{0} \right| V\left| \lambda _{0} \right\rangle \) and \(\left| \lambda _{1} \right\rangle =S^1V\left| \lambda _{0} \right\rangle \). These are of the form of Eqs. (18) and (19) with \(e(\emptyset )=c(1)=1\), proving the base case.

For \(n>1\), assume Eqs. (18) and (19) hold for all \(m<n\). We compute \(\lambda _n\), \(\left| \lambda _{n} \right\rangle \) using Theorem 1:

$$\begin{aligned} \lambda _n&=\left\langle \lambda _{0} \right| V\sum _{\{k\}_{n-1}}c\!\left( k_1,\dots ,k_{n-1}\right) S^{k_{1}}V\!\dots S^{k_{n-1}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad - \left\langle \lambda _{0} \right| \sum _{m=1}^{n-2}\sum _{\{k\}_{n-m}}c\!\left( k_1,\dots ,k_{n-m}\right) S^{k_{1}}V\!\dots S^{k_{n-m}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad \times \sum _{\{j\}_{m-1}}e\!\left( j_1,\dots ,j_{m-1}\right) \left\langle \lambda _{0} \right| VS^{j_{1}}V\!\dots S^{j_{m-1}}V\left| \lambda _{0} \right\rangle \end{aligned}$$
(20a)
$$\begin{aligned}&=\sum _{\{k\}_{n-1}}c\!\left( k_1,\dots ,k_{n-1}\right) \left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{n-1}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad - \sum _{m=1}^{n-2}\sum _{\begin{array}{c} \{k\}_{n-m}\\ k_1=0 \end{array}}\sum _{\{j\}_{m-1}}c\!\left( 0,k_2,\dots ,k_{n-m}\right) e\!\left( j_1,\dots ,j_{m-1}\right) \nonumber \\&\quad \times \left\langle \lambda _{0} \right| VS^{k_{2}}V\!\dots S^{k_{n-m}}VS^0VS^{j_{1}}V\!\dots S^{j_{m-1}}V\left| \lambda _{0} \right\rangle , \end{aligned}$$
(20b)

where in the second line we have changed the upper limit on the sum over m from \(n-1\) to \(n-2\) since \(\left\langle \lambda _{0} \big | \lambda _{1} \right\rangle =0\), and in the last line we used \(\left\langle \lambda _{0} \right| S^{k_1}=-\delta _{0,k_1}\left\langle \lambda _{0} \right| \) and \(\left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| =-S^0\). The result is of the form (18) with \(e\!\left( k_1,\dots ,k_{n-1}\right) \) given by Eq. (23). In Eq. (23), the \(\delta _{n-m,K_{n-m}}\) ensures that the arguments of c and e are Bloch sequences.

We repeat the same reasoning for the eigenvector

$$\begin{aligned}&\left| \lambda _{n} \right\rangle = -\frac{1}{2}\left| \lambda _{0} \right\rangle \sum _{m=1}^{n-1}\sum _{\{k\}_m}\sum _{\{j\}_{n-m}}\quad c\!\left( k_1,\dots ,k_m\right) \nonumber \\&\quad \quad \times c\!\left( j_1,\dots ,j_{n-m}\right) \left\langle \lambda _{0} \right| VS^{k_{m}}\!\dots VS^{k_{1}} S^{j_{1}}V\!\dots S^{j_{n-m}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad \quad +S^1V\sum _{\{k\}_{n-1}}c\!\left( k_1,\dots ,k_{n-1}\right) S^{k_{1}}V\!\dots S^{k_{n-1}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad \quad -S^1\sum _{m=1}^{n-1}\sum _{\{k\}_{n-m}}\sum _{\{j\}_{m-1}}c\!\left( k_1,\dots ,k_{n-m}\right) \nonumber \\&\quad \quad \times e\!\left( j_1,\dots ,j_{m-1}\right) S^{k_{1}}V\!\dots S^{k_{n-m}}V \left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| VS^{j_{1}}V\!\dots S^{j_{m-1}}V\left| \lambda _{0} \right\rangle \end{aligned}$$
(21a)
$$\begin{aligned}&=\sum _{m=1}^{n-1}\sum _{\{k\}_m}\sum _{\{j\}_{n-m}}\frac{1}{2}\left( 1-\delta _{0,k_1}-\delta _{0,j_1} \right) c\!\left( k_1,\dots ,k_m\right) c\!\left( j_1,\dots ,j_{n-m}\right) \nonumber \\&\quad \times S^0VS^{k_{m}}V\!\dots S^{k_{2}}V S^{k_1+j_1}VS^{j_{2}}V\!\dots S^{j_{n-m}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad +\sum _{\{k\}_{n-1}}c\!\left( k_1,\dots ,k_{n-1}\right) S^1VS^{k_{1}}V\!\dots S^{k_{n-1}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad +\sum _{m=1}^{n-1}\sum _{\{k\}_{n-m}}\sum _{\{j\}_{m-1}}\left( 1-\delta _{0,k_1}\right) c\!\left( k_1,\dots ,k_{n-m}\right) e\!\left( j_1,\dots ,j_{m-1}\right) \nonumber \\&\quad \times S^{k_1+1}VS^{k_{2}}V\!\dots S^{k_{n-m}}VS^0VS^{j_{1}}V\!\dots S^{j_{m-1}}V\left| \lambda _{0} \right\rangle . \end{aligned}$$
(21b)

Here we have again used \(\left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| =-S^0\) as well as

$$\begin{aligned} S^kS^j={\left\{ \begin{array}{ll}-S^0=-S^{k+j} &{} \text {if }k=j=0,\\ S^{k+j} &{} \text {if }k,j>0,\\ 0 &{} \text {else.}\end{array}\right. } \end{aligned}$$
(22)

The result is of the form (19) with \(c(k_1,\dots ,k_n)\) given by Eq. (24). Note that in Eqs. (20) and (21) (and therefore in equations throughout the following) argument lists can be empty. Specifically, for \(m=1\), \(e(j_1,\dots ,j_{m-1})=e(\emptyset )\). This corresponds to an appearance of \(\lambda _1\) in Theorem 1. \(\square \)

Corollary 2.1

The functions c and e defined in Theorem 2 satisfy (\(n\ge 2\))

$$\begin{aligned}&e\!\left( k_1,\dots ,k_{n-1}\right) \nonumber \\&\quad = c\!\left( k_1,\dots ,k_{n-1}\right) - \sum _{m=1}^{n-2} \delta _{0,k_{n-m}} \delta _{n-m,K_{n-m}} c\!\left( 0,k_1,\dots ,k_{n-m-1}\right) e\!\left( k_{n-m+1},\dots ,k_{n-1}\right) , \end{aligned}$$
(23)
$$\begin{aligned}&c\!\left( k_1,\dots ,k_n\right) \nonumber \\&\quad =\delta _{0,k_1}\frac{1}{2}{\sum _{m=1}^{n-1}}'\left( 1-\delta _{0,m-K_m}-\delta _{0,k_{m+1}-m+K_m}\right) c\!\left( m-K_m,k_m,\dots ,k_2\right) c\!\left( k_{m+1}-m+K_m,k_{m+2},\dots ,k_n\right) \nonumber \\&\qquad +\delta _{1,k_1}\,c\!\left( k_2,\dots ,k_n\right) \nonumber \\&\qquad +\left( 1-\delta _{0,k_1}-\delta _{1,k_1}\right) \sum _{m=1}^{n-1}\delta _{0,k_{n-m+1}}\,\delta _{n-m+1,K_{n-m+1}}\,c\!\left( k_1-1,k_2,\dots ,k_{n-m}\right) e\!\left( k_{n-m+2},\dots ,k_n\right) , \end{aligned}$$
(24)

with the primed sum restricted to \(k_{m+1}\ge m-K_m \ge 0\) so that all the arguments of c are non-negative, and \(K_m=\sum _{i=1}^m k_i\). For \(m=1\), in some of the argument lists, the final index is smaller than the initial index; as in Eq. (6), such an argument list should be interpreted as empty. The starting values of c and e (\(n=1\)) are

$$\begin{aligned} c(1)=1,\qquad e(\emptyset )=1. \end{aligned}$$
(25)

Note that functions c and e are independent of the Hamiltonian. We will refer to the three cases in Eq. (24) as the \(k_1=0\), \(k_1=1\), and \(k_1>1\) rules. For a diagrammatic explanation of the recurrence relations, refer to Sect. 5.1.

In Eqs. (14) and (17), all diagrams are summed with a coefficient of 1, or 0 if the sum is extended to non-convex diagrams. The same cannot be true for c and e because of the factor 1/2 in \(\left\langle \lambda _{0} \big | \lambda _{n} \right\rangle \). A nonzero \(\left\langle \lambda _{0} \big | \lambda _{n} \right\rangle \) means that some diagrams start below the diagonal and so are certainly not convex, and the factor 1/2 means their coefficients generally differ from 1.

From calculating c for all diagrams up to fourth order, cf. Sect. 5, and selected higher-order diagrams, we anticipate that it will have the following property, which will be useful in the subsequent analysis:

Definition 1

(Crossing Property). For any Bloch sequence \((k_1,\dots ,k_n)\) let \(x(k_1,\dots ,k_n)\) be the number of times its associated diagram intersects the main diagonal. We say a function f has the crossing property if there is another function g such that

$$\begin{aligned} f(k_1,\dots ,k_n)=g(\lceil {x(k_1,\dots ,k_n)/2} \rceil ) \end{aligned}$$
(26)

for all Bloch sequences \((k_1,\dots ,k_n)\), i.e. f depends only on the number of times a diagram crosses from below to above the main diagonal. Here \(\lceil {\cdot } \rceil \) is the ceiling function.

If the function c has the crossing property, the problem of evaluating it only needs to be performed on a set of representative diagrams. These diagrams can be taken to be the ones with the Bloch sequences (cf. Eq. (15))

$$\begin{aligned} \{k\}_{2n}=(0,2)^n, \end{aligned}$$
(27)

meaning 0,2 repeated n times. It is helpful below to have a separate symbol for these specific instances of the c function:

Definition 2

\(t(n)=c\!\left( (0,2)^n\right) \) for \(n>0\), and \(t(0)=c(1)=1\).

t(n) can be computed:

Lemma 1

$$\begin{aligned} t(n)=\left( {\begin{array}{c}n-\frac{1}{2}\\ n\end{array}}\right) =\frac{1}{2^{2n}} \left( {\begin{array}{c}2n\\ n\end{array}}\right) =\frac{\Gamma \!\left( n+\frac{1}{2}\right) }{\sqrt{\pi }\Gamma (n+1)}. \end{aligned}$$
(28)

Proof

The generalised binomial coefficient is defined in the usual way

$$\begin{aligned} \left( {\begin{array}{c}r\\ n\end{array}}\right) =\frac{r(r-1)\dots (r-n+1)}{n!}. \end{aligned}$$
(29)

We use the \(k_1=0\) and \(k_1=1\) rules of Eq. (24) to derive a recurrence relation for t(n),

$$\begin{aligned} t(n)&\overset{k_1=0}{=} \frac{1}{2}\left[ t(0)c\!\left( 1,(0,2)^{n-1}\right) -t(1)t(n-1)+c(1,0,2)c\!\left( 1,(0,2)^{n-2}\right) \right. \nonumber \\&\quad -t(2)t(n-2)+\dots \left. +c\!\left( 1,(0,2)^{n-1}\right) c(1)\right] \nonumber \\&\overset{k_1=1}{=}\frac{1}{2}\left[ t(0)t(n-1) -t(1)t(n-1)+t(1)t(n-2)-t(2)t(n-2)+\dots +t(n-1)t(0)\right] \nonumber \\&\quad =t(0)t(n-1)-t(1)t(n-1)+t(1)t(n-2)\nonumber \\&\qquad -t(2)t(n-2)+\dots +\frac{(-1)^{n-1}}{2}t\!\left( \left\lfloor {\frac{n}{2}}\right\rfloor \right) ^2\nonumber \\&\quad {=}\sum _{m=0}^{n-1}(-1)^m t\!\left( \left\lceil {\frac{m}{2}}\right\rceil \right) t\!\left( n-1-\left\lfloor {\frac{m}{2}}\right\rfloor \right) 2^{-\delta _{m,n-1}}\nonumber \\&\quad {=}\frac{1}{2}\sum _{m=0}^{2(n-1)}(-1)^m t\!\left( \left\lceil {\frac{m}{2}}\right\rceil \right) t \!\left( n-1-\left\lfloor {\frac{m}{2}}\right\rfloor \right) \nonumber \\&\quad {=}\frac{1}{2}\sum _{m=0}^{n-1}t(m)t(n-1-m)-\frac{1}{2}\sum _{m=1}^{n-1}t(m)t(n-m). \end{aligned}$$
(30)

Here \(\lfloor {\cdot } \rfloor \), \(\lceil {\cdot } \rceil \) are the floor and ceiling functions, respectively, rounding down/up to the nearest integer. The summands are symmetric under reversing the order of summation, so for all but one term the factor 1/2 cancels. We find it convenient, however, to keep all terms and group them by even and odd indices, as done in the last line of Eq. (30). Then by bringing the latter sum to the left-hand side, which we can also write as \(t(n)=\frac{1}{2}t(n)t(0)+\frac{1}{2}t(0)t(n)\), and multiplying by 2, we can rewrite Eq. (30) as

$$\begin{aligned} \sum _{m=0}^n t(m)t(n-m)=\sum _{m=0}^{n-1} t(m)t(n-1-m)=t(0)^2=1, \end{aligned}$$
(31)

i.e. we find that the sum is independent of n, so we can set, for example, \(n=1\) to evaluate it.

To complete the proof constructively, consider that Eq. (31) has the form of a discrete convolution, so we can restate it in terms of the (ordinary) generating function of t,

$$\begin{aligned} g(x)=\sum _{n=0}^\infty t(n)x^n, \end{aligned}$$
(32)

as

$$\begin{aligned} g(x)^2=\sum _{n=0}^\infty x^n=\frac{1}{1-x} \end{aligned}$$
(33)

with a geometric series, so we can express the generating function as a binomial series to determine t,

$$\begin{aligned} g(x)=\frac{1}{\sqrt{1-x}}=\sum _{n=0}^\infty \left( {\begin{array}{c}-\frac{1}{2}\,\\ n\end{array}}\right) (-x)^n=\sum _{n=0}^\infty \left( {\begin{array}{c}n-\frac{1}{2}\\ n\end{array}}\right) x^n \end{aligned}$$
(34)

which gives Eq. (28). \(\square \)

Note that t(n) decreases only slowly with n; asymptotically, \(t(n)\sim 1/\sqrt{\pi n}\).
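The identity Eq. (31) also admits a quick machine check of the closed form; the following lines (our own sketch, in exact rational arithmetic) verify that the self-convolution of t is identically 1:

```python
from fractions import Fraction
from math import comb

def t(n):
    """Closed form of Lemma 1: t(n) = C(2n, n) / 4^n, kept exact."""
    return Fraction(comb(2 * n, n), 4 ** n)

# Eq. (31): the self-convolution of t equals 1 in every order
for n in range(12):
    assert sum(t(m) * t(n - m) for m in range(n + 1)) == 1
```

The same closed form exhibits the slow asymptotic decay \(t(n)\sim 1/\sqrt{\pi n}\) noted above.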

Fig. 2

Illustration of crossing numbers. The diagram (2, 0, 0, 2, 0, 2, 0, 3, 0) is the lowest order diagram with crossing numbers 1, 3, 1, 0. The diagram \((0,2)^n\) from Definition 2 is the lowest order diagram with crossing numbers 0, n, shown here with \(n=4\). Both the (main) diagonal and the upper diagonal are illustrated here

The solution for e is slightly more complicated as, in contrast to c, it does not depend solely on the diagram’s topology w.r.t. the main diagonal, but also w.r.t. the upper diagonal, which is defined as the diagonal line one unit higher than the main diagonal, as illustrated in Fig. 2.

Definition 3

(Crossing Numbers). We say a Bloch sequence \((k_1,\dots ,k_n)\) has crossing numbers \(N_1,n_1,N_2,n_2,\dots ,N_m,n_m\) if its associated diagram crosses, in order, above the upper diagonal \(N_1\) times, below the main diagonal \(n_1\) times, then above the upper diagonal \(N_2\) times, etc. Here m is some integer with \(1\le m\le n/3+1\). By convention, there is always an even number of crossing numbers; the first and last, \(N_1\) and \(n_m\), may be 0, while all others are strictly positive integers, so that m is well-defined.

Given a Bloch sequence \((k_1,\dots ,k_n)\), the crossing numbers can be constructed as follows:

figure a

A canonical diagram that has crossing numbers \(N_1,\dots ,n_m\) is

$$\begin{aligned} (k_1,\dots ,k_n)= & {} ((2,0)^{N_1},(0,2)^{n_1-1},0,3,0,(2,0)^{N_2-1},(0,2)^{n_2-1},0,3,0,\nonumber \\&\dots ,(2,0)^{N_m-1},(0,2)^{n_m}) \end{aligned}$$
(36)

some examples of which are given in Fig. 2. For the special case that the crossing numbers are \(N_1,n_1=0,0\), this sequence is empty, and we can instead take \((k_1)=(1)\) as this canonical diagram.

The upper limit \(m\le n/3+1\) is derived by setting \(N_1=n_m=0\), and all other \(N_i=n_i=1\) in the lowest order diagram. Such a large m is an outlier, though. If we consider a Bloch diagram (rotated by \(-\pi /4\)) as a random bridge, we can develop the notion of a typical diagram. The number of times an nth-order diagram touches or intersects the diagonal, which is an upper bound on m, asymptotically follows a Rayleigh distribution with mean \(\sqrt{\pi n}\) [15, p.708]. This seems to indicate that typically \(m\sim \sqrt{n}\), i.e. in most instances \(m\ll n\).
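The construction of figure a is not reproduced here, but one plausible implementation (our own sketch, consistent with Definition 3 and the examples of Fig. 2, rather than the paper's algorithm) labels the excursions of the staircase relative to the two diagonals and then counts consecutive excursions of the same type:

```python
def crossing_numbers(seq):
    """Crossing numbers N_1, n_1, ..., N_m, n_m of a Bloch sequence
    per Definition 3, computed from the staircase heights."""
    labels = []
    height = prev = 0
    for p, k in enumerate(seq, start=1):
        height += k
        d = height - p          # 0 on the main diagonal, 1 on the upper one
        if d >= 1 and prev < 1:
            labels.append('U')  # new excursion reaching the upper diagonal
        elif d <= -1 and prev > -1:
            labels.append('L')  # new excursion below the main diagonal
        prev = d
    runs = []                   # group consecutive excursions of one type
    for lab in labels:
        if runs and runs[-1][0] == lab:
            runs[-1][1] += 1
        else:
            runs.append([lab, 1])
    if not runs or runs[0][0] == 'L':
        runs.insert(0, ['U', 0])    # N_1 may be 0
    if runs[-1][0] == 'U':
        runs.append(['L', 0])       # n_m may be 0
    return [count for _, count in runs]
```

It reproduces the crossing numbers quoted in Fig. 2: the diagram (2, 0, 0, 2, 0, 2, 0, 3, 0) gives 1, 3, 1, 0 and \((0,2)^4\) gives 0, 4.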

We will now proceed to the main result of the paper, Theorem 3, in which explicit formulas for c and e are obtained. We first briefly review the heuristics that led us to the formulation of this theorem. We noted that if we assumed that c had the crossing property, Lemma 1 would be sufficient to calculate c for all diagrams. By calculating a number of examples, we made observations about the structure of the solution, noting the dependence on the crossing numbers only, and used these to allow further simplification of the recurrence relations. We came to an ansatz for solving the coupled recurrence relations, guided by the observation that our solution for e has to be consistent with c having the crossing property.

In the end, the ansatz is proved in the following theorem by induction, accompanied by a straightforward algebraic analysis:

Theorem 3

Let \((k_1,\dots ,k_n)\) be a Bloch sequence with crossing numbers \(N_1,n_1,\dots ,N_m,n_m\). The functions c and e defined in Theorem 2 are

$$\begin{aligned} c\!\left( k_1,\dots ,k_n\right)&=t\!\left( \sum _{i=1}^m n_i\right) , \end{aligned}$$
(37)
$$\begin{aligned} e\!\left( k_1,\dots ,k_n\right)&=\sum _{i=1}^m \left[ t\!\left( \sum _{l=1}^i N_l\right) -t\!\left( \sum _{l=1}^{i-1} N_l\right) +\delta _{i,1}\right] t\!\left( \sum _{j=i}^m n_j\right) , \end{aligned}$$
(38)

with \(t(x)=\left( {\begin{array}{c}2x\\ x\end{array}}\right) 2^{-2x}\) as given in Eq. (28), i.e. c has the crossing property and e is a function of the crossing numbers only.

Proof

We verify that Eqs. (37) and (38) are consistent with Corollary 2.1. At \(n=1\), we have \(k_1=1\) with \(N_1=n_1=0\), which are also the crossing numbers for an empty diagram (\(\emptyset \), \(n=0\)). Then Eqs. (37) and (38) give \(c(1)=e(1)=e(\emptyset )=1\), consistent with Eq. (25).

Suppose Theorem 3 holds for all diagrams of degree less than n. (Note that degree simply refers to the number of entries in the Bloch sequence \(\{k\}_n\).) We apply Eq. (23) to compute \(e(k_1,\dots ,k_n)\) and show it is consistent with Eq. (38):

$$\begin{aligned} e\!\left( k_1,\dots ,k_n\right)&=t\!\left( \sum _{j=1}^m n_j\right) -\sum _{i=1}^m\sum _{k=1}^{N_i}t\!\left( \sum _{j=1}^{i-1}N_j+k\right) \nonumber \\&\quad \times \sum _{l=i}^m\left[ t\!\left( \sum _{h=i}^l N_h-k\right) -\left( 1-\delta _{l,i}\right) t\!\left( \sum _{h=i}^{l-1} N_h-k\right) \right] t\!\left( \sum _{g=l}^m n_g\right) \end{aligned}$$
(39a)
$$\begin{aligned}&=t\!\left( b_1\right) -\sum _{i=1}^m\sum _{k=1+a_{i-1}}^{a_i}t(k)\sum _{l=i}^m \left[ t\!\left( a_l-k\right) -\left( 1-\delta _{l,i}\right) t\!\left( a_{l-1}-k\right) \right] t\!\left( b_l\right) \end{aligned}$$
(39b)
$$\begin{aligned}&=t\!\left( b_1\right) -\sum _{l=1}^m t\!\left( b_l\right) \sum _{i=1}^l\sum _{k=1+a_{i-1}}^{a_i}t(k) \left[ t\!\left( a_l-k\right) -\left( 1-\delta _{l,i}\right) t\!\left( a_{l-1}-k\right) \right] \end{aligned}$$
(39c)
$$\begin{aligned}&=t\!\left( b_1\right) -\sum _{l=1}^m t\!\left( b_l\right) \left[ \sum _{k=1}^{a_l} t(k)t\!\left( a_l-k\right) - \sum _{k=1}^{a_{l-1}}t(k)t\!\left( a_{l-1}-k\right) \right] \end{aligned}$$
(39d)
$$\begin{aligned}&=\sum _{l=1}^m t\!\left( b_l\right) \left[ t\!\left( a_l\right) -t\!\left( a_{l-1}\right) +\delta _{l,1}\right] \end{aligned}$$
(39e)

where in Eq. (39b) we introduce \(a_i=\sum _{j=1}^i N_j\) and \(b_i=\sum _{j=i}^m n_j\) to simplify notation, and shift the summation index k by \(a_{i-1}\). Then in Eq. (39c) we switch the sums over i and l. Note that \(a_0=0\), \(a_1=N_1\ge 0\), and \(a_{i+1}>a_i\) for \(i>0\). So in Eq. (39d) we can combine the double sum over ik into one over k. And finally in Eq. (39e) we add and subtract the \(k=0\) terms, then use Eq. (31) and get Eq. (38).

Similarly, we calculate \(c(k_1,\dots ,k_n)\) using Eq. (24)

$$\begin{aligned}&c\!\left( k_1,\dots ,k_n\right) \nonumber \\&\quad =\delta _{0,k_1}\frac{1}{2} \left[ \sum _{i=1}^{b_1}t(i-1)t\!\left( b_1-i\right) -\sum _{i=1}^{b_1-1} t(i)t\!\left( b_1-i\right) \right] +\delta _{1,k_1}t\!\left( b_1\right) \end{aligned}$$
(40a)
$$\begin{aligned}&\qquad +\left( 1-\delta _{0,k_1}-\delta _{1,k_1}\right) \sum _{i=1}^m\sum _{j=1}^{N_i}t\!\left( a_{i-1}+j-1\right) s \!\left( N_i-j,n_i,\dots ,N_m,n_m\right) \nonumber \\&=\delta _{0,k_1}\frac{1}{2} \left[ \sum _{i=0}^{b_1-1}t(i)t\! \left( b_1-1-i\right) -\left( 1-2t\!\left( b_1\right) \right) \right] +\delta _{1,k_1}t\!\left( b_1\right) +\left( 1-\delta _{0,k_1}-\delta _{1,k_1}\right) \end{aligned}$$
(40b)
$$\begin{aligned}&\qquad \times \sum _{i=1}^m\sum _{j=1}^{N_i}t\!\left( a_{i-1}+j-1\right) \sum _{k=i}^m t\!\left( b_k\right) \left[ t\!\left( \sum _{h=i}^k N_h-j\right) -\left( 1-\delta _{k,i}\right) t\!\left( \sum _{h=i}^{k-1} N_h-j\right) \right] \nonumber \\&\quad =\left( \delta _{0,k_1}+\delta _{1,k_1}\right) t\!\left( b_1\right) \end{aligned}$$
(40c)
$$\begin{aligned}&\qquad +\left( 1-\delta _{0,k_1}-\delta _{1,k_1}\right) \sum _{k=1}^m t\!\left( b_k\right) \sum _{i=1}^k\sum _{j=a_{i-1}}^{a_i-1}t(j)\left[ t\!\left( a_k-1-j\right) -\left( 1-\delta _{k,i}\right) \right. \nonumber \\&\qquad \times \left. t\!\left( a_{k-1}-1-j\right) \right] \nonumber \\&=\left( \delta _{0,k_1}+\delta _{1,k_1}\right) t\!\left( b_1\right) \end{aligned}$$
(40d)
$$\begin{aligned}&\qquad +\left( 1-\delta _{0,k_1}-\delta _{1,k_1}\right) \sum _{k=1}^m t\!\left( b_k\right) \left[ \sum _{j=0}^{a_k-1} t(j)t\!\left( a_k-1-j\right) -\sum _{j=0}^{a_{k-1}-1}t(j)t\!\left( a_{k-1}-1-j\right) \right] \nonumber \\&=\left( \delta _{0,k_1}+\delta _{1,k_1}\right) t\!\left( b_1\right) +\left( 1-\delta _{0,k_1}-\delta _{1,k_1}\right) \sum _{k=1}^m t\!\left( b_k\right) \left[ 1-\left( 1-\delta _{k,1}\right) \right] \end{aligned}$$
(40e)
$$\begin{aligned}&=t\!\left( b_1\right) . \end{aligned}$$
(40f)

In Eq. (40a) note that for \(k_1=0\) the diagram will have \(2b_1-1\) intersections with the main diagonal, \(b_1-1\) of which are horizontal and thus come with a negative sign. (This rule is a consequence of the negative sign in Eq. (24); see Sect. 5.1, Fig. 3.) Then in Eq. (40b) we shift the index of the first sum, add and subtract the \(i=0\) and \(i=b_1\) terms in the second sum, and immediately evaluate the latter with Eq. (31), which we then also apply to the first sum in the following step. In Eq. (40c) we shift the index of the j sum by \(a_{i-1}-1\) and switch the k and i sums. In Eq. (40e) we use Eq. (31) again, but we have to be careful not to apply it where the sums vanish because of Eq. (6). Since we are in the \(k_1>1\) term, we know that \(a_1=N_1\ge 1\), and \(a_{i+1}>a_i\) for \(i\ge 1\) still holds, so the only term Eq. (6) applies to is the second j sum for \(k=1\), since \(a_0=0\). \(\square \)

4 On Non-uniqueness of Diagrammatic Representations

By stating a recurrence relation and initial conditions we uniquely define a quantity. For example, combined with the initial conditions, Eqs. (12) and (13) fix the eigenenergy and eigenvector corrections, and Eqs. (23) and (24) uniquely define the coefficients c and e. That does not mean, however, that c and e are necessarily the unique solutions of Eqs. (12) and (13) or that Eqs. (12) and (13) are the unique solutions of Eq. (2).

Since we are considering the nondegenerate case, eigenvectors are determined up to a factor. We are fixing the normalisation with Eq. (10), but that still leaves a phase freedom. The zeroth-order phase is set by our choice of \(\left| \lambda _{0} \right\rangle \). We can modify this phase in higher orders of \(\epsilon \) by adding an imaginary part to Eq. (11). An arbitrary imaginary part would generally change the structure of Eq. (19), but we could preserve it, e.g. by setting

$$\begin{aligned} \left\langle \lambda _{0} \big | \lambda _{n} \right\rangle \rightarrow {\left\{ \begin{array}{ll} -{\sum }_{m=1}^{(n-1)/2}\left\langle \lambda _{m} \big | \lambda _{n-m} \right\rangle &{} \text {if }n\text { is odd}, \\ -{\sum }_{m=1}^{n/2-1} \left\langle \lambda _{m} \big | \lambda _{n-m} \right\rangle -\frac{1}{2} \left\langle \lambda _{n/2} \big | \lambda _{n/2} \right\rangle &{} \text {if }n\text { is even}.\end{array}\right. } \end{aligned}$$
(41)

This reduces the number of diagrams but comes at the cost of a more complicated rule requiring an even/odd distinction.

Note that if the Hamiltonian is real-symmetric, Eqs. (11) and (41) are equal, yet Eq. (41) still provides the more compact description in terms of the number of diagrams. This brings us to the main point of this section: Once norm and phase are fixed, the eigenvector is uniquely defined, but the representation in terms of diagrams is not. The eigenenergy is of course independent of the factor in front of the eigenvector but has a similar freedom with regard to the decomposition into diagrams.

Definition 4

A Bloch sequence \((k_1,\dots ,k_n)\) containing \(q-1\) zeroes, \(q=1,\dots ,n\), can be represented equivalently by q strings of positive integers \(z_i\in \mathbb {N}^k\), \(k=0,\dots ,n-q+1\) (note that null strings, \(k=0\), are allowed). \(\mathcal {Z}\) is defined as the mapping between a Bloch sequence and the set of \(z_i\) strings:

$$\begin{aligned} \mathcal {Z}\!\left( k_1,\dots ,k_n\right) =\left( z_1,\dots ,z_q\right) ,\qquad \left( k_1,\dots ,k_n\right) =\left( z_1,0,z_2,0,\dots ,0,z_q\right) , \end{aligned}$$
(42)

We also define the operators T, L, and D (“total”, “length”, and “difference”) applied to string z:

$$\begin{aligned} \text {for }z=j_1j_2\dots j_k:\quad T(z)=\sum _{i=1}^k j_i,\quad L(z)=k,\quad D(z)=T(z)-L(z). \end{aligned}$$
(43)

As an example for the map \(\mathcal {Z}\), we can write

$$\begin{aligned} \mathcal {Z}\!\left( 3,0,0,1,1\right) =\left( (3),(\,),(1,1)\right) . \end{aligned}$$
(44)

Here \(n=5\), \(q=3\), and we see the appearance of integer strings \(z_i\) of varying length, including the null string.
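The map \(\mathcal {Z}\) and the operators of Eq. (43) can be sketched in a few lines of Python. The function names mirror the paper's notation; the representation of strings as tuples is our choice.

```python
def Z(seq):
    """Split a Bloch sequence at its zeroes into z-strings.

    A sequence containing q - 1 zeroes yields q (possibly empty)
    tuples of positive integers, cf. Definition 4."""
    strings, current = [], []
    for k in seq:
        if k == 0:
            strings.append(tuple(current))  # a zero acts as a separator
            current = []
        else:
            current.append(k)
    strings.append(tuple(current))  # the string after the last zero
    return strings

def T(z):  # "total": sum of the entries
    return sum(z)

def L(z):  # "length": number of entries
    return len(z)

def D(z):  # "difference"
    return T(z) - L(z)
```

For instance, `Z((2, 0, 0, 2))` (the sequence of Eq. (48)) returns `[(2,), (), (2,)]`: two zeroes yield \(q=3\) strings, one of them null.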

Theorem 4

In the expansion of the eigenenergy, Eq. (18), all diagrams that differ only by a permutation of z strings correspond to the same matrix element.

Similarly, in the expansion of the eigenvector, Eq. (19), diagrams that share the same first string and otherwise differ only by a permutation of the remaining strings correspond to the same matrix element.

Proof

Suppose the \(m\hbox {th}\) component of a Bloch sequence \((k_1,\dots ,k_n)\) vanishes, \(k_m=0\). The term this sequence contributes to the energy correction is

$$\begin{aligned} {\begin{matrix} &{}\left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{m-1}}VS^0VS^{k_{m+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle \\ &{}\quad =-\left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{m-1}}V\left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| VS^{k_{m+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle \\ &{}\quad =-\left\langle \lambda _{0} \right| VS^{k_{m+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{m-1}}V\left| \lambda _{0} \right\rangle \\ &{}\quad =\left\langle \lambda _{0} \right| VS^{k_{m+1}}V\!\dots S^{k_{n}}VS^0VS^{k_{1}}V\!\dots S^{k_{m-1}}V\left| \lambda _{0} \right\rangle . \end{matrix}} \end{aligned}$$
(45)

Suppose \(k_m\) is the \(M\hbox {th}\) zero in the Bloch sequence, and \(\mathcal {Z}(k_1,\dots ,k_n)=(z_1,\dots ,z_q)\). Equation (45) implies that \((z_1,\dots ,z_M,z_{M+1},\dots ,z_q)\) has the same operator content as \((z_{M+1},\dots ,z_q,z_1,\dots ,z_M)\), i.e. the operator content is invariant under cyclical permutation of strings. Now, let \(k_j\) be the \(J\hbox {th}\) zero, and w.l.o.g. assume \(j>m\):

$$\begin{aligned}&\left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{m-1}}VS^0VS^{k_{m+1}}V\!\dots S^{k_{j-1}}VS^0VS^{k_{j+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad =(-1)^2\left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{m-1}}V\left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| VS^{k_{m+1}}V\!\dots S^{k_{j-1}}V\left| \lambda _{0} \right\rangle \nonumber \\&\qquad \times \left\langle \lambda _{0} \right| VS^{k_{j+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad =(-1)^2\left\langle \lambda _{0} \right| VS^{k_{m+1}}V\!\dots S^{k_{j-1}}V\left| \lambda _{0} \right\rangle \left\langle \lambda _{0} \right| VS^{k_{1}}V\!\dots S^{k_{m-1}}V\left| \lambda _{0} \right\rangle \nonumber \\&\qquad \times \left\langle \lambda _{0} \right| VS^{k_{j+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle \nonumber \\&\quad =\left\langle \lambda _{0} \right| VS^{k_{m+1}}V\!\dots S^{k_{j-1}}VS^0VS^{k_{1}}V\!\dots S^{k_{m-1}}VS^0VS^{k_{j+1}}V\!\dots S^{k_{n}}V\left| \lambda _{0} \right\rangle , \end{aligned}$$
(46)

i.e. \((z_1,\dots ,z_q)\) has the same operator content as \((z_{M+1},\dots ,z_J,z_1,\dots ,z_M,z_{J+1},\dots ,z_q)\). For example, by setting \(M=1\) we can permute the first string \(z_1\) to the \(J\hbox {th}\) position without changing the order of the other strings. From this we can compose all permutations.

For the eigenvector, the calculation works out analogously, with the only difference that here the operator content does not start with a \(\left\langle \lambda _{0} \right| V\), so we can never permute \(z_1\). The rest of the diagram \((z_2,\dots ,z_q)\) has the same structure as a term in the energy expansion, so the same permutation rules apply. \(\square \)
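The factorisation underlying the proof can be checked mechanically. The sketch below (the helper names `z_strings` and `energy_content` are ours) encodes the operator content of an energy diagram as the multiset of its z-strings, which Eq. (45) shows is all that matters: each zero splits off one \(\left\langle \lambda _{0} \right| V\!\dots V\left| \lambda _{0} \right\rangle \) factor with a sign \(-1\).

```python
def z_strings(seq):
    """Split a Bloch sequence at its zeroes into tuples of positive integers."""
    strings, current = [], []
    for k in seq:
        if k == 0:
            strings.append(tuple(current))
            current = []
        else:
            current.append(k)
    strings.append(tuple(current))
    return strings

def energy_content(seq):
    """Operator content of an energy diagram: the z-strings up to permutation.
    (For the eigenvector, z_1 would have to be kept fixed in first place.)"""
    return sorted(z_strings(seq))

# The order-5 example from Sect. 4.1: these two sequences differ only by
# a permutation of their z-strings, hence give the same matrix element.
assert energy_content((3, 0, 2, 0, 0)) == energy_content((2, 0, 3, 0, 0))
```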

Part of the redundancy identified by Theorem 4 already appears when we define the recurrence relations for c and e. For example, in the last term of Eq. (12), we can switch the order to \(\lambda _m\left\langle \lambda _{0} \big | \lambda _{n-m} \right\rangle \), which would change Eq. (20) and would lead to a different recurrence relation for e and thus different values for c and e.

We could also compare to the Bloch-style result for the energy, Eq. (17), which is a sum over all convex diagrams, and note that in our language a convex diagram is described by \(N_1,0\) with \(e=t(N_1)>0\), but there are also many non-convex diagrams for which \(e\ne 0\). So our result for the energy is less efficient. But even when restricting to convex diagrams, Theorem 4 still leads to a lot of redundancy. Salzman [7] addresses this for the (unnormalised) eigenvector by separating terms into an operator part (what we would call \(z_1\)) and a coefficient containing the matrix elements. The number of different \(z_1\)’s across the order n convex diagrams is \(2^n-n\). (Salzman already gave this as a sum; we merely confirmed and evaluated it.) Unfortunately, the rules he gives to list all diagrams are relatively complicated, and equivalent coefficients are collected manually. Silverstone and Holloway [8], again for unnormalised eigenvectors, give a formally minimal result which still requires evaluating many derivatives.

To reduce the number of diagrams in our result down to a minimum, we can sum up c and e for all diagrams that are equivalent by Theorem 4 and only keep one representative diagram. For example, we can declare an ordering on strings and choose as representative diagram the one where strings are ordered descending.

Definition 5

(Ordering of strings). Let \(y=(k_1,\dots ,k_n)\ne z=(j_1,\dots ,j_m)\) be strings of positive integers. We say \(y>z\)

$$\begin{aligned}&\text {if }D(y)>D(z),\nonumber \\&\text {else if }L(y)>L(z),\nonumber \\&\text {else if }k_1>j_1,\nonumber \\&\text {else if }k_2>j_2,\nonumber \\&\vdots \end{aligned}$$
(47)

The canonical representative of a permutation group of strings has \(z_1\ge z_2\ge \dots \ge z_q\).

Giving the sum over all c or e for an arbitrary representative diagram is generally not an easy task. Of course, given a string representation \(z_1^{m_1}\dots z_k^{m_k}\) (k distinct strings \(z_i\) with multiplicity \(m_i\)), we can write down all the \((\sum _{i=1}^km_i)!/\prod _{i=1}^km_i!\) permutations, calculate their c and e and sum them up to get a \(c_\text {eff}\) and \(e_\text {eff}\). The difficulty lies in automating this, i.e. listing only the canonical diagrams and finding an explicit function on them that gives \(c_\text {eff}\) and \(e_\text {eff}\) directly, preferably without having to invoke Eq. (35) for the whole permutation group. This is less of a concern for the energy where we can alternatively start from Eq. (17). Then the problem becomes counting all the convex permutations, a nested sum for which can be written down but perhaps cannot be evaluated explicitly without specifying a Bloch sequence first.
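The bookkeeping just described — the ordering of Eq. (47), the canonical (descending) arrangement of Definition 5, and the size \((\sum _i m_i)!/\prod _i m_i!\) of each permutation class — can be sketched as follows. The helper names are ours; strings are tuples of positive integers.

```python
from math import factorial
from collections import Counter

def order_key(z):
    """Sort key realising Eq. (47): first D(z), then L(z), then entrywise.

    When D and L agree the strings have equal length, so Python's
    lexicographic tuple comparison matches the entrywise rule."""
    return (sum(z) - len(z), len(z), z)

def canonical(strings):
    """Canonical representative: z-strings sorted descending (Definition 5).
    (For the eigenvector, z_1 would be held fixed and only the rest sorted.)"""
    return tuple(sorted(strings, key=order_key, reverse=True))

def class_size(strings):
    """Number of distinct permutations of the multiset of z-strings."""
    counts = Counter(strings)
    size = factorial(sum(counts.values()))
    for m in counts.values():
        size //= factorial(m)
    return size
```

For example, `canonical([(), (1, 1), (2,)])` gives `((2,), (1, 1), ())`, and `class_size([(2,), (2,), ()])` gives the 3 permutations whose c and e would be summed into \(c_\text {eff}\) and \(e_\text {eff}\).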

4.1 Number of Terms

We take a look at how many diagrams are generated by our method and by previous methods, and how many of them may correspond to distinct operator expressions. This subsection is summarised in Table 1.

At order n, there are \(\left( {\begin{array}{c}2n-1\\ n\end{array}}\right) \) distinct Bloch sequences [5]. This can easily be seen by considering that to construct all diagrams we have to list all distinct arrangements of n unit vertical steps and \(n-1\) unit horizontal steps (the nth horizontal step is always fixed at the end). This is the number of terms in our perturbation expansion for \(\left| \lambda _{n} \right\rangle \) and \(\lambda _{n+1}\) (though e can be 0). If we apply Theorem 4, it becomes an upper bound for the number of canonically ordered diagrams, i.e. the minimum number of terms required to cover all distinct operators. Asymptotically it scales as \(4^n/(2\sqrt{\pi n})\).

From Bloch [5] we know that convex diagrams are sufficient for the expansion of the energy (or the unnormalised vector). The number of these diagrams at order n is simply the Catalan number \(C_n=(2n)!/\bigl (n!\,(n+1)!\bigr )=\frac{2}{n+1}\left( {\begin{array}{c}2n-1\\ n\end{array}}\right) \) [5]; [15, p. 76]. Asymptotically \(C_n\sim 4^n/\sqrt{\pi n^3}\) [15, p. 7], i.e. the exponential scaling is the same; only the algebraic pre-factor is improved.
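These counts are easy to check numerically. In the sketch below (helper names are ours) we read a Bloch sequence of order n as an n-tuple of non-negative integers summing to n, consistent with the step arrangements described above and with the examples in Sect. 5.2.

```python
from math import comb
from itertools import product

def num_bloch_sequences(n):
    """binom(2n-1, n): arrangements of n vertical and n-1 free horizontal steps."""
    return comb(2 * n - 1, n)

def catalan(n):
    """C_n = (2n)!/(n! (n+1)!): the number of convex diagrams of order n."""
    return comb(2 * n, n) // (n + 1)

def brute_force_count(n):
    """Enumerate Bloch sequences directly: n non-negative integers summing to n."""
    return sum(1 for s in product(range(n + 1), repeat=n) if sum(s) == n)
```

For \(n=4\) both counts give the 35 sequences and \(C_4=14\) convex diagrams quoted in Sect. 5.2.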

A lower bound for the minimum number of diagrams is the number of partitions of n into positive integers, cf. [8]. There is no known explicit expression for this partition function, but it has a generating function, recurrence relations, and the asymptotic form \(\exp (\pi \sqrt{2n/3})/(4\sqrt{3}\,n)=4^{(\pi /\ln 4)\sqrt{2n/3}}/(4\sqrt{3}\,n)\) [15, p. 41].
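Lacking a closed form, the partition-function lower bound is easiest to tabulate with the standard dynamic-programming recurrence; a minimal sketch (the function name is ours):

```python
def partitions(n):
    """p(n): number of partitions of n into positive integers."""
    p = [1] + [0] * n              # p[0] = 1: the empty partition
    for part in range(1, n + 1):
        for k in range(part, n + 1):
            p[k] += p[k - part]    # extend partitions of k - part by `part`
    return p[n]
```

For example, `partitions(4)` returns 5, counting 4, 3+1, 2+2, 2+1+1 and 1+1+1+1; for large n the values approach the asymptotic expression above.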

As stated above, Salzman [7] grouped diagrams by \(z_1\) (the string of positive integers before the first 0 in the Bloch sequence) and counted \(2^n-n\) distinct groups within convex diagrams of length n. We can view this as a lower bound on the number of terms in the unnormalised eigenvector correction \(\left| \overline{\lambda _n}\right\rangle \), since \(z_1\) cannot be permuted with the other strings without changing the operator content. By adding the number of \(z_1\)’s leading to a non-convex diagram, we can generalise this to a lower bound for the number of terms in \(\left| \lambda _{n} \right\rangle \): \(2^n-1\). For the energy the situation is slightly more complicated, as diagrams with distinct \(z_1\) can still be equivalent. For convex diagrams this becomes relevant at \(n\ge 5\), which is why it does not appear in Figs. 7 or 8; e.g. (3, 0, 2, 0, 0) is equivalent to (2, 0, 3, 0, 0). Yet any of the \(2^n-n\) \(z_1\)’s that can start a convex diagram can be the greatest string of a canonically ordered diagram, so this lower bound also applies to \(\lambda _{n+1}\) after all.

Table 1 Number of terms in the energy correction \(\lambda _{n+1}\), the unnormalised eigenvector correction \(\left| \overline{\lambda _n}\right\rangle \), the normalised eigenvector correction \(\left| \lambda _{n} \right\rangle \), and the correction to the projector onto a degenerate eigenspace \(P_n\), by order n, including the asymptotic behaviour in the last row, where known

Clearly none of the bounds is tight for sufficiently large n, though the latter set of lower bounds shows that the minimal number of diagrams scales as \(\exp (c n)\) rather than \(\exp (c \sqrt{n})\).

All these considerations are independent of the Hamiltonian. If we include information about the Hamiltonian, further simplifications can be made. For example, as noted above, if the Hamiltonian is real-symmetric, the operator content of a diagram is invariant under reversing the ordering within strings (except for \(z_1\) in the eigenvector expansion). A more generally applicable scenario is a completely off-diagonal (in the unperturbed eigenbasis) perturbation, since this can always be achieved by absorbing the diagonal part of V into \(H_0\). In particular, this means that \(\left\langle \lambda _{0} \right| V\left| \lambda _{0} \right\rangle \) vanishes, so any Bloch sequence that ends (and/or starts) in zero and/or contains two zeroes in succession does not contribute to the eigenvector (energy) expansion. As already noted by Salzman, this greatly reduces the number of necessary diagrams [7].

5 Practical Demonstration of Diagrammatics

5.1 Diagrammatic Interpretation of Recurrence Relations

To facilitate a more thorough understanding of Corollary 2.1, here we recount the recurrence relations diagrammatically.

Broadly speaking, the recurrence relations in Theorem 1 and Corollary 2.1 both define how to compute higher-order terms from lower-order terms. Considered in terms of diagrams, however, there is a key difference. Theorem 1 defines how to construct (the sum of) all order-n diagrams by combining lower-order diagrams. Corollary 2.1, on the other hand, gives the coefficients of a single order-n diagram by deconstructing it into all possible compositions of lower-order diagrams. Thus there is an implicit change of approach in going from the proof of Theorem 1 to Corollary 2.1. In the following, the point of view of Corollary 2.1 is illustrated more transparently: we show at which points (marked with dots in Figs. 3, 4, 5 and 6) diagrams are to be cut in two, and how.

Fig. 3

Diagrammatic illustration of the recurrence relation for e, Eq. (23). The sum is over all horizontal intersections with the upper diagonal, including (if applicable) the one at height n, in which case the argument of e on the right-hand side is \(\emptyset \). Here and in the following we mark the intersections we sum over and where we cut the diagrams with a dot

For e, we take c of the same diagram, then for every horizontal intersection with the upper diagonal we subtract a decomposition where we take c of a diagram beginning with a 0-step followed by everything before the intersection, multiplied by e of the part of the original diagram following the 0-step after the intersection, see Fig. 3.

Fig. 4

\(k_1=0\) rule of c recurrence, Eq. (24). At the intersection with the diagonal the diagram is cut in two. The first part is rotated by \(\pi \), i.e. read backwards. The pictorial exponent is to be read as 1 if the diagram intersects the diagonal horizontally, and 0 if the intersection is vertical

We split up the c recurrence relations, Eq. (24), into three parts again. For \(k_1=0\), see Fig. 4, we sum over all intersections with the main diagonal. There is at least one such intersection, since we start below the diagonal and end above it. The part of the diagram before the intersection is read backwards or equivalently is rotated by \(\pi \). The part after the intersection is left as is. We multiply c of both diagram parts and divide by 2. Horizontal intersections get a minus sign. (The example in Fig. 4 shows a vertical intersection.)

Fig. 5

\(k_1=1\) rule of c recurrence, Eq. (24). Adding or removing a \(k_1=1\) step in front of a diagram (or at any position) does not change its c-value

The \(k_1=1\) rule remains the simplest. If a diagram starts with a 1-step, remove it, see Fig. 5. This easily generalises to: remove all \(k_i=1\) steps. We should remember to stop at \(n=1\); alternatively, one could define \(c(\emptyset )=1\), which would effectively make \(\left| \lambda _{0} \right\rangle \) the starting point instead of \(\left| \lambda _{1} \right\rangle \).

Fig. 6

\(k_1>1\) rule of c recurrence, Eq. (24). Similarly to Fig. 3, a diagram is cut in two at every horizontal intersection with the upper diagonal. The first upwards unit towards the upper diagonal and the 0-step after the intersection are discarded

Fig. 7

Bloch diagrams for orders 1 through 3, labelled with their corresponding Bloch sequences and c and e values. By Theorem 4, the horizontal square brackets group diagrams that contribute the same operator content to the expansion of the eigenenergy. Whenever there are double brackets, the inner ones indicate diagrams that contribute the same operator content to the eigenvector expansion. We can now compute effective coefficients \(c_\text {eff}\), \(e_\text {eff}\) by summing the bracketed c, e and assign them to the left-most diagram (the canonical representative)

Fig. 8

Bloch diagrams for order 4, arranged analogously to Fig. 7

For \(k_1>1\), we have a sum over horizontal intersections with the upper diagonal, see Fig. 6. The decomposition of the diagram is similar to the one in the e-recurrence (Fig. 3). The second part of the diagram is treated the same, but in the first part, instead of adding a 0-step, the first step is lowered by 1. Another difference is that here we are guaranteed to have at least one summand, since the diagram starts above the upper diagonal.

Again a decomposition based on horizontal intersections with the upper diagonal results in one diagram where the upper diagonal becomes the main diagonal and (up to) one diagram containing the remainder.

As an example, consider

$$\begin{aligned} c(2,0,0,2)\overset{k_1>1}{=}c(1)e(0,2)\overset{e}{=}c(1)c(0,2)\overset{k_1=0}{=}\frac{1}{2}c(1)^3=\frac{1}{2}. \end{aligned}$$
(48)

5.2 Diagrams up to Fourth Order

Figures 7 and 8 show all Bloch diagrams up to fourth order with their c and e values. Diagrams producing equivalent operator content are grouped together. One can easily verify that the results for the energy are consistent with Bloch’s, Eq. (17), by summing up all the grouped e coefficients and getting the number of convex diagrams in the group.

Fourth-order perturbation theory is not exactly an outlandish endeavour, yet it is sufficiently complex that, even though the associated Talk page has noted since 2010 that the expressions listed on Wikipedia contain mistakes, to date no one has corrected them [16]. There are 35 Bloch sequences for \(n=4\), of which 14 are convex; 13 need to appear in the energy series (4 if V is completely off-diagonal), and 26 need to appear in the normalised eigenvector series (12 if V is completely off-diagonal).

6 Conclusion

We have shown how to explicitly solve the conventionally normalised Rayleigh–Schrödinger perturbation series to arbitrary order. The structure of earlier results for unnormalised vectors is readily adapted to this problem. The normalisation necessarily increases the number of terms in the expansion. We surveyed how the number of terms varies between different methods, and how to identify equivalent diagrams. An efficient summation of these equivalent diagrams remains an open problem, and there is likely no simple solution.

No matter how efficiently terms are summarised, their number grows exponentially with the order of perturbation.

Counting and analysing Bloch diagrams and associated quantities offers a rich trove of combinatorics problems, many of which may have already been studied in the context of paths, random walks, and bridges.