1 Introduction

Green’s function methods are widely used to study many-body systems and they represent a natural framework that connects the microscopic details of a theory with its macroscopic properties [1].

The attempt to determine these quantities self-consistently has a long history and remains one of the central paradigms in the study of strongly correlated systems. The approaches range from the recent DMFT [2], where Green’s functions are used to fix the mapping of a lattice model onto an impurity model, to the older equation of motion approach [3, 4]. In the latter method, given an interacting Hamiltonian, an extensively growing chain of coupled equations is derived [5, 6]. For few-body systems it is possible to use various implementations of this method to obtain the single-particle Green’s function exactly [7].

However, in order to study thermodynamic properties of an interacting system, a truncation procedure able to approximately decouple this extensively growing system of coupled equations plays a crucial role. Early attempts explored arbitrary truncations and decoupling schemes of Tyablikov type [3, 4]. Despite some successful applications, these decoupling schemes often violated the analytical structure of the Green’s functions, predicting imaginary poles and negative spectral weight for the single-particle Green’s function. Despite these difficulties, Hubbard in his pioneering work [8] managed to find a useful decoupling for a two-pole approximation of the Hubbard model. This decoupling (Hubbard-I) is still often used to treat strong correlations in the presence of local interactions, especially in studies of quantum systems out of equilibrium [9] and multi-orbital systems [10].

Almost a decade after these early works, Roth developed a universal decoupling scheme able to enforce the correct analytical properties of approximate Green’s functions [11]. This decoupling scheme, now called the Roth procedure, often relies on parameters that cannot be determined within the scheme itself, making further approximations unavoidable. For this reason the method is often regarded as an uncontrolled approximation, which severely limits its applicability. The works of Mancini and Avella et al. [12] show that the Roth procedure leads to violations of other physical principles, such as the Pauli principle, and that it is possible to constrain some, if not all, of the unknown decoupling parameters by enforcing such physical requirements. Despite much progress in finding easily extendable decoupling schemes [13], the possibility of systematically checking which approximations are involved in the decoupling remains a neglected aspect.

In this paper we present a decoupling scheme based on a partial orthogonalization of the operators involved, where the relation between the true Green’s function and the approximate one can readily be obtained. The paper is organized as follows: in Sect. 2 we provide a general discussion of the formalism, we clarify the role of the Hermiticity of the E-matrix and we present our decoupling scheme based on the partial orthogonalization of the operators. In Sect. 3 we apply our scheme to a two-pole approximation of the Hubbard model making evident the relationship between the approximate and the true Green’s function. In Sect. 4 we analyze the global sum rules that should be respected in the two-pole approximation of the Hubbard model and we present a variational scheme as a guiding principle for the determination of the unknown orthogonalization parameters. In Sect. 5 we provide numerical results at half-filling and in Sect. 6 we give analytical formulas that are useful to understand the Green’s function. In Sects. 7 and 8 we discuss numerical results for hole doping in the strong- and intermediate-coupling regimes respectively. Finally, in Sect. 9 we provide some conclusions and an outlook.

2 A scheme for the truncation of the EoM

2.1 Formalism review

We will mainly use the notation of Tserkovnikov in the following [5]. For completeness we briefly review what we will need for this paper. Let us first assume we have a set of fermionic operators \(\{{\hat{A}}_i\}_{i=1}^M\) that is closed under commutation with the Hamiltonian for some evolution matrix K

$$\begin{aligned}{}[{\hat{A}}_i,H]=\sum _j K_{ij}{\hat{A}}_j . \end{aligned}$$
(1)

Then the equation of motion (EOM) for the Green’s function matrix gives

$$\begin{aligned} z \langle \negthinspace \langle A^{\,}_i | A_j^\dagger \rangle \negthinspace \rangle _z = \langle A^{\,}_i | A_j^\dagger \rangle + \sum _k K_{ik} \langle \negthinspace \langle A^{\,}_k | A_j^\dagger \rangle \negthinspace \rangle _z , \end{aligned}$$
(2)

where the normalization matrix N is

$$\begin{aligned} N_{ij} = \langle A^{\,}_i | A^\dagger _j \rangle = \langle \{ {\hat{A}}^{\,}_i ,{\hat{A}}_j^\dagger \} \rangle . \end{aligned}$$
(3)

Consequently the Green’s function, viewed as a matrix, becomes

$$\begin{aligned} \langle \negthinspace \langle A | A^\dagger \rangle \negthinspace \rangle _z = \frac{1}{z\mathbbm {1}- K} N = N \frac{1}{z\mathbbm {1}- K^\dagger } , \end{aligned}$$
(4)

where the second form is obtained making use of the fact that \({\hat{H}}\) is Hermitean. For these two forms to be consistent we have the condition that

$$\begin{aligned} KN=NK^\dagger , \end{aligned}$$
(5)

which will be of crucial importance in the developments below. Finally, one may calculate averages of bilinears of the operators involved using the formula

$$\begin{aligned} \langle {\hat{A}}_{j}^{\dagger } {\hat{A}}_i^{\,} \rangle = \frac{1}{2\pi i}\oint dz f(z) \langle \negthinspace \langle A^{\,}_i | A_{j}^{\dagger }\rangle \negthinspace \rangle _z , \end{aligned}$$
(6)

where f(z) is the Fermi function and the contour encircles the real axis.

2.2 A partial orthogonalization scheme for the truncation of the EOM

As shown in previous work, this framework gives exact results if the set of operators \(\{{\hat{A}}_i\}_{i=1}^M\) is closed under commutation with the Hamiltonian [7]. In an extended many-body system the number of operators necessary to close the equation of motion exactly will typically grow exponentially with the size of the system, making a direct application of this scheme infeasible.

To produce a truncation scheme capable of producing physical Green’s functions, it is important to notice that in a Hermitean theory the average of the operators involved in the dynamics and their evolution are not independent. In particular as noticed by Roth [11] the matrix

$$\begin{aligned} E_{ij} \equiv \langle \{ [{\hat{A}}^{\,}_i,{\hat{H}}],{\hat{A}}_j^\dagger \} \rangle = \sum _k K_{ik}N_{kj} \end{aligned}$$
(7)

needs to be Hermitean. Here \(\langle \dots \rangle \) indicates some average over exact eigenstates of \({\hat{H}}\), K is the full evolution matrix of the operators and N is the normalization matrix introduced above. Since the matrix N is Hermitean by construction (i.e., for averages in any state), this gives the same consistency condition as Eq. (5) above. This condition, together with the fact that N has to be positive definite, guarantees that the Green’s function possesses real poles and positive spectral weight [11].

When the hierarchy of the evolution of an operator \({\hat{A}}_1\) is considered at most one new operator is generated in each step, i.e.,

$$\begin{aligned} \bigl [ {\hat{A}}_1 ,{\hat{H}} \bigr ]= & {} K_{11} {\hat{A}}_1 + K_{12} {\hat{A}}_2 , \end{aligned}$$
(8a)
$$\begin{aligned} \bigl [ {\hat{A}}_2 , {\hat{H}} \bigr ]= & {} K_{21} {\hat{A}}_1 + K_{22} {\hat{A}}_2 + K_{23} {\hat{A}}_3 , \end{aligned}$$
(8b)

etc. until the EOM closes and no new operators are generated. Note that \({\hat{A}}_2\) is not unique since one can add a part of \({\hat{A}}_1\) to it, and similarly for the other higher \({\hat{A}}\)’s. In any event K is only non-zero on the first upper diagonal and below. Let us now truncate the EOM at the q-th operator. A brute force truncation of the matrices involved gives

$$\begin{aligned} K_{trunc}=\begin{pmatrix} K_{11} &{} K_{12}&{} &{}0\\ K_{21}&{} K_{22} &{} &{}0\\ \vdots &{} &{}\ddots &{} \vdots \\ K_{q1}&{}K_{q2} &{}\ldots &{}K_{qq} \end{pmatrix}, \end{aligned}$$
(9)

and the corresponding \(N_{trunc}\)

$$\begin{aligned} N_{trunc}=\begin{pmatrix}N_{11}&{}N_{12}&{} &{}N_{1q}\\ N_{21}&{}N_{22} &{} &{}N_{2q}\\ \vdots &{} &{}\ddots &{} \vdots \\ N_{q1}&{}N_{q2} &{}\ldots &{}N_{qq} \end{pmatrix}. \end{aligned}$$
(10)

Now we note that

$$\begin{aligned} E_{trunc}=K_{trunc}N_{trunc}, \end{aligned}$$
(11)

differs from the corresponding sub-block of the full E only in the last row, through the coupling of \(K_{q , q+1}\) to the \((q+1)\)-th row of the full N matrix. Therefore an arbitrary truncation of the equation of motion is going to generate an evolution that in general does not satisfy the condition in Eq. (5), leading to a potentially unphysical approximation for the Green’s function.
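The statement about the last row can be checked on a small toy hierarchy. The sketch below is only an illustration under stated assumptions: the numbers in `T` and the mixing coefficients `a`, `b`, `c` are hypothetical, and the closed three-operator set is built by hand (an orthonormal set with symmetric tridiagonal evolution, mixed by a unit lower-triangular matrix) so that K has the structure of Eq. (8) and the full E is symmetric.

```python
# Toy illustration (hypothetical numbers): a closed hierarchy of three real
# operators, truncated by brute force after q = 2 operators.

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def block(m, q):
    """Upper-left q x q block of a matrix."""
    return [row[:q] for row in m[:q]]

def toy_hierarchy(a=0.3, b=0.7, c=-0.4):
    """Return (K, N) for a closed set of three operators.

    T is the evolution of an orthonormal set (N = 1). The unit
    lower-triangular S adds parts of lower operators to higher ones,
    which keeps K non-zero only on the first upper diagonal and below
    while making N non-diagonal.
    """
    T = [[0.5, 1.0, 0.0],
         [1.0, -0.2, 0.8],
         [0.0, 0.8, 1.3]]
    S = [[1.0, 0.0, 0.0],
         [a, 1.0, 0.0],
         [b, c, 1.0]]
    # Closed-form inverse of the unit lower-triangular S.
    S_inv = [[1.0, 0.0, 0.0],
             [-a, 1.0, 0.0],
             [a * c - b, -c, 1.0]]
    K = matmul(matmul(S, T), S_inv)
    N = matmul(S, transpose(S))
    return K, N

K, N = toy_hierarchy()
E_full = matmul(K, N)                        # symmetric for the closed set
E_trunc = matmul(block(K, 2), block(N, 2))   # brute-force truncation, q = 2

# The first row of E_trunc agrees with the full E; the last row misses
# the term K[q, q+1] * N[q+1, j], which breaks the symmetry.
asymmetry = E_trunc[0][1] - E_trunc[1][0]
missing = K[1][2] * N[2][0]
print("asymmetry of E_trunc:", asymmetry, " K_23 * N_31:", missing)
```

In this construction the asymmetry of the truncated matrix equals the dropped coupling \(K_{q,q+1}N_{q+1,1}\), so it vanishes only if the neglected operator is already orthogonal to \({\hat{A}}_1\).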

In this paper we propose to restore the Hermiticity of \(E_{trunc}\) adding to the first operator not considered explicitly in the dynamics \({\hat{A}}_{q+1}\) a linear combination of the operators \({\hat{A}}_1,\dots ,{\hat{A}}_q\)

$$\begin{aligned} {\hat{A}}'_{q+1} = {\hat{A}}_{q+1} -\sum _{l=1}^q \lambda _l {\hat{A}}_l . \end{aligned}$$
(12)

Most of the \(\lambda \) parameters will be fixed by demanding that

$$\begin{aligned} \langle A'_{q+1} |A^\dagger _j\rangle =0 ~~~~~~~~\text {for} ~~~j=1,\dots ,q-1. \end{aligned}$$
(13)

This partial orthogonalization procedure ensures that \(E_{trunc}\) is Hermitean, because it makes it identical to the corresponding block of E except for the last element on the diagonal \(E_{qq}\), which is not fixed by our procedure. The Roth procedure corresponds to also orthogonalizing with respect to \({\hat{A}}_q^\dagger \). This gives q equations for q unknowns, and therefore also fixes the value of \(E_{qq}\), whereas in our scheme we have \(q-1\) equations for q unknowns, leaving \(E_{qq}\) arbitrary. We will use this additional freedom to make sure that our approximation fulfills other physically relevant criteria such as Pauli principle constraints or sum rules. In the next section we elucidate this procedure by applying it to a two-pole approximation of the Hubbard model. In particular it will be evident that the effect of this procedure is a non-unique modification of the last row of \(K_{trunc}\). This arbitrariness can be exploited to enforce global sum rules for the Green’s functions and opens up the possibility of using different criteria to fix the free parameters \(\lambda _i\) not fixed by Eq. (13).

A last remark on this scheme is that despite the freedom in the choice of the parameters \(\lambda _i\) one can always write the residual Green’s functions not considered explicitly in the dynamics, making transparent the approximation involved in this truncation of the equation of motion.

3 Application to the Hubbard model in a two-pole approximation

In this section we will apply our scheme to a two-pole approximation to the Green’s function in the Hubbard model. Let us consider the Hubbard Hamiltonian

$$\begin{aligned} {\hat{H}}= \sum _{\mathbf{k}} \epsilon _\mathbf{k}(c^\dagger _{\mathbf{k}\uparrow }c^{\,}_{\mathbf{k}\uparrow } + c^\dagger _{\mathbf{k}\downarrow }c^{\,}_{\mathbf{k}\downarrow } )+ U \sum _i n_{i\uparrow }n_{i\downarrow }, \end{aligned}$$
(14)

We denote the total number of sites by \(N_s\); \(c^{\,}_{\mathbf{k}\sigma }\) is the fermion annihilation operator with momentum \(\mathbf{k}\) and spin \(\sigma \), and \(n_{i\sigma } = c^\dagger _{i\sigma }c^{\,}_{i\sigma }\).

Let us consider the first three operators that appear in the equation of motion hierarchy

$$\begin{aligned} {\hat{A}}_{1\mathbf{k}}= & {} c^{\,}_{\mathbf{k}\uparrow },\\ {\hat{A}}_{2\mathbf{k}}= & {} (c_{\downarrow }^\dagger c^{\,}_{\downarrow }c^{\,}_{\uparrow })_{\mathbf{k}}.\\ {\hat{A}}_{3 \mathbf{k}}= & {} \frac{1}{\sqrt{N_s}}\sum _\mathbf{p}\Bigr ( \epsilon _\mathbf{p}\big [(c^{\dagger }_{\downarrow }c_{\downarrow })_{\mathbf{k}-\mathbf{p}}c_{\mathbf{p}\uparrow } - (c^{\dagger }_{\downarrow }c_{\uparrow })_{\mathbf{k}-\mathbf{p}}c_{\mathbf{p}\downarrow }\bigr ] \\&- \epsilon _{-\mathbf{p}}c^{\dagger }_{-\mathbf{p}\downarrow }(c_{\downarrow }c_{\uparrow })_{\mathbf{k}-\mathbf{p}} \Bigr ). \end{aligned}$$

where we have introduced

$$\begin{aligned} ({\hat{O}}_1 \ldots {\hat{O}}_n )_\mathbf{k}= \frac{1}{\sqrt{N_s}} \sum _{i} e^{i \mathbf{k}\cdot \mathbf{x}_i } {\hat{O}}_{1x_i} \dots {\hat{O}}_{nx_i} . \end{aligned}$$
(15)

Let us first do a brute force truncation of the evolution after two operators. The truncated evolution becomes

$$\begin{aligned} K_{trunc} (\mathbf{k})=\begin{pmatrix} \epsilon _\mathbf{k}&{}U \\ 0&{}U\end{pmatrix}, \end{aligned}$$
(16)

and the respective N matrix becomes

$$\begin{aligned} N_{trunc} (\mathbf{k}) = N = \begin{pmatrix} 1&{} {\bar{n}}_{\downarrow }\\ {\bar{n}}_{\downarrow }&{}{\bar{n}}_{\downarrow } \end{pmatrix}, \end{aligned}$$
(17)

with \({\bar{n}}_\downarrow = \langle n_{i\downarrow } \rangle \), which is independent of the site index i. In this case \(E_{trunc}=K_{trunc} N_{trunc}\) is not Hermitean, since its off-diagonal elements differ by \(\epsilon _\mathbf{k}{\bar{n}}_\downarrow \) (the exceptions are special cases such as \(\epsilon _\mathbf{k}= 0\) or \({\bar{n}}_\downarrow = 0\)), and this leads to an unphysical approximation for the Green’s function for some range of the parameters.
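As a minimal numerical sketch of this failure, one can multiply the matrices of Eqs. (16) and (17) and look at the residues of the resulting \(G_{11}\). The function names and the sample values of \(\epsilon _\mathbf{k}\), U and \({\bar{n}}_\downarrow \) below are illustrative assumptions, not values used elsewhere in the paper.

```python
import math

def brute_force(eps, U, nbar):
    """K_trunc of Eq. (16), N_trunc of Eq. (17) and their product E_trunc."""
    K = [[eps, U], [0.0, U]]
    N = [[1.0, nbar], [nbar, nbar]]
    E = [[sum(K[i][k] * N[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return K, N, E

def pole_weights(K, N):
    """Poles of (z - K)^{-1} N and the residues of its (1,1) element.

    Assumes a real 2x2 K with two distinct real eigenvalues.
    """
    (a, b), (c, d) = K
    mean = 0.5 * (a + d)
    root = math.sqrt(0.25 * (a - d) ** 2 + b * c)
    result = []
    for pole, other in ((mean - root, mean + root), (mean + root, mean - root)):
        # First row of the projector (K - other) / (pole - other).
        p11 = (a - other) / (pole - other)
        p12 = b / (pole - other)
        result.append((pole, p11 * N[0][0] + p12 * N[1][0]))
    return result

# Illustrative parameters (assumed): eps_k = 3, U = 4, nbar = 1/2.
K, N, E = brute_force(eps=3.0, U=4.0, nbar=0.5)
print("E_12 - E_21 =", E[0][1] - E[1][0])   # equals eps * nbar
for pole, weight in pole_weights(K, N):
    print("pole", pole, "weight", weight)    # one weight is negative
```

For these sample values the two residues still sum to \(N_{11}=1\), but one of them is negative, which is the kind of unphysical spectral weight referred to above.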

Let us now apply our scheme to this particular problem. In this case we need to determine \(\lambda _{1\mathbf{k}},\lambda _{2\mathbf{k}}\) such that

$$\begin{aligned} \langle A_{3\mathbf{k}}| c^{\dagger }_\mathbf{k}\rangle - \lambda _{1\mathbf{k}} \langle A_{1\mathbf{k}}| c^{\dagger }_\mathbf{k}\rangle -\lambda _{2\mathbf{k}} \langle A_{2\mathbf{k}}| c^{\dagger }_\mathbf{k}\rangle =0. \end{aligned}$$
(18)

Evaluating the anticommutator averages we obtain

$$\begin{aligned} \epsilon _\mathbf{k}{\bar{n}}_{\downarrow } -\lambda _{1\mathbf{k}} - \lambda _{2\mathbf{k}} {\bar{n}}_{\downarrow }=0 . \end{aligned}$$
(19)

As already anticipated in Sect. 2, the values of \(\lambda _{1\mathbf{k}}\) and \(\lambda _{2\mathbf{k}}\) are not uniquely determined by this procedure. Without any loss of generality let us eliminate \(\lambda _{1\mathbf{k}}\) writing

$$\begin{aligned} \lambda _{1\mathbf{k}} = (\epsilon _\mathbf{k}-\lambda _{2\mathbf{k}}) {\bar{n}}_{\downarrow } \end{aligned}$$
(20)

Using this we can write

$$\begin{aligned} {\hat{A}}_{3\mathbf{k}} = {\hat{A}}'_{3\mathbf{k}} + (\epsilon _\mathbf{k}-\lambda _{2\mathbf{k}}) {\bar{n}}_{\downarrow } {\hat{A}}_{1\mathbf{k}} + \lambda _{2\mathbf{k}} {\hat{A}}_{2\mathbf{k}} \end{aligned}$$
(21)

where \(\langle A'_{3\mathbf{k}}| c^\dagger _\mathbf{k}\rangle =0\).

At this point the equation of motion for the operator \({\hat{A}}_{1\mathbf{k}}\) can be rewritten as (B is here arbitrary)

$$\begin{aligned} z \langle \negthinspace \langle A_{1\mathbf{k}} | B^\dagger \rangle \negthinspace \rangle= & {} \langle A_{1\mathbf{k}} | B^\dagger \rangle \nonumber \\&+ \epsilon _\mathbf{k}\langle \negthinspace \langle A_{1\mathbf{k}} | B^\dagger \rangle \negthinspace \rangle + U \langle \negthinspace \langle A_{2 \mathbf{k}} | B^\dagger \rangle \negthinspace \rangle \end{aligned}$$
(22)

and for \({\hat{A}}_{2\mathbf{k}}\)

$$\begin{aligned} z \langle \negthinspace \langle A_{2\mathbf{k}} | B^\dagger \rangle \negthinspace \rangle= & {} \langle A_{2\mathbf{k}} | B^\dagger \rangle +(\epsilon _\mathbf{k}-\lambda _{2\mathbf{k}}) {\bar{n}}_{\downarrow } \langle \negthinspace \langle A_{1\mathbf{k}} | B^\dagger \rangle \negthinspace \rangle \nonumber \\&+ (U + \lambda _{2\mathbf{k}} ) \langle \negthinspace \langle A_{2 \mathbf{k}} | B^\dagger \rangle \negthinspace \rangle + \langle \negthinspace \langle A'_{3 \mathbf{k}} | B^\dagger \rangle \negthinspace \rangle \nonumber \\ \end{aligned}$$
(23)

Consequently the new evolution given by the partial orthogonalization procedure is

$$\begin{aligned} K(\mathbf{k}) = \begin{pmatrix} \epsilon _\mathbf{k}&{} U \\ {\bar{n}}_{\downarrow }(\epsilon _\mathbf{k}-\lambda _{2\mathbf{k}})&{} U+\lambda _{2\mathbf{k}} \end{pmatrix}. \end{aligned}$$
(24)

The physical condition in Eq. (5) is now satisfied for this evolution for any choice of the model parameters \(\lambda _{2\mathbf{k}} ,{\bar{n}}_{\downarrow },U,\epsilon _\mathbf{k}\). The approximate Green’s function for the truncated theory becomes

$$\begin{aligned} G (z,\mathbf{k}) = \frac{1}{z\mathbbm {1}-K(\mathbf{k})} N . \end{aligned}$$
(25)
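The claim that Eq. (5) holds for any \(\lambda _{2\mathbf{k}}\) can be verified directly. The following sketch builds the matrices of Eqs. (24) and (17), checks that \(E=KN\) is symmetric, and extracts the two poles of \(G_{11}\) with their spectral weights. The helper names and the sample parameter values are assumptions made for illustration.

```python
import math

def evolution(eps, U, nbar, lam2):
    """Evolution matrix of Eq. (24) and normalization matrix of Eq. (17)."""
    K = [[eps, U], [nbar * (eps - lam2), U + lam2]]
    N = [[1.0, nbar], [nbar, nbar]]
    return K, N

def e_asymmetry(K, N):
    """E_12 - E_21 for E = K N; zero when Eq. (5) is satisfied."""
    e12 = K[0][0] * N[0][1] + K[0][1] * N[1][1]
    e21 = K[1][0] * N[0][0] + K[1][1] * N[1][0]
    return e12 - e21

def poles_and_weights(K, N):
    """Poles of G = (z - K)^{-1} N and the residues of G_11.

    Assumes a real 2x2 K with two distinct real eigenvalues.
    """
    (a, b), (c, d) = K
    mean = 0.5 * (a + d)
    root = math.sqrt(0.25 * (a - d) ** 2 + b * c)
    result = []
    for pole, other in ((mean - root, mean + root), (mean + root, mean - root)):
        p11 = (a - other) / (pole - other)
        p12 = b / (pole - other)
        result.append((pole, p11 * N[0][0] + p12 * N[1][0]))
    return result

# Illustrative parameters (assumed): eps_k = 3, U = 4, nbar = 1/2.
for lam2 in (-3.0, 0.0, 1.7, 3.0):
    K, N = evolution(3.0, 4.0, 0.5, lam2)
    print("lambda_2 =", lam2,
          " asymmetry =", e_asymmetry(K, N),
          " poles/weights =", poles_and_weights(K, N))
```

For every value of \(\lambda _{2\mathbf{k}}\) in the loop the asymmetry vanishes to rounding error and both weights are positive and add up to one; only the pole positions and the distribution of weight change.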

Assuming no spin symmetry breaking, the parameter \({\bar{n}}_{\downarrow }\) can be determined self-consistently, by applying the fermionic characterization of the spectral theorem stated in Eq. (6) to \(G_{11}\), obtaining:

$$\begin{aligned} \langle c^\dagger _{\mathbf{k}\uparrow }c^{\,}_{\mathbf{k}\uparrow } \rangle= & {} \frac{1}{2\pi i} \oint dz f(z) G_{11}(z,\mathbf{k}), \end{aligned}$$
(26)
$$\begin{aligned} {\bar{n}}_{\downarrow }= {\bar{n}}_{\uparrow }= & {} \frac{1}{N_s} \sum _\mathbf{k}\langle c^\dagger _{\mathbf{k}\uparrow }c^{\,}_{\mathbf{k}\uparrow } \rangle . \end{aligned}$$
(27)

To see that we can always write the residual Green’s function highlighting the approximation involved in the truncation of the equation of motion let us analyze the special case where \(\lambda _{2\mathbf{k}}=\epsilon _\mathbf{k}\). The equation of motion of the Green’s function with \(B^\dagger = c^\dagger _\mathbf{k}\) becomes:

$$\begin{aligned} (z-\epsilon _\mathbf{k}) \langle \negthinspace \langle A_{1\mathbf{k}} | c^{\dagger }_\mathbf{k}\rangle \negthinspace \rangle= & {} 1 + U \langle \negthinspace \langle A_{2\mathbf{k}}| c^{\dagger }_\mathbf{k}\rangle \negthinspace \rangle , \end{aligned}$$
(28a)
$$\begin{aligned} (z-\epsilon _\mathbf{k}-U)\langle \negthinspace \langle A_{2\mathbf{k}}| c^{\dagger }_\mathbf{k}\rangle \negthinspace \rangle= & {} {\bar{n}}_{\downarrow } + \langle \negthinspace \langle A'_{3\mathbf{k}} | c^{\dagger }_\mathbf{k}\rangle \negthinspace \rangle . \end{aligned}$$
(28b)

Recalling that \(A_{1\mathbf{k}}=c_{\mathbf{k}\uparrow }\) we find that the conventional fermion Green’s function may be written exactly as

$$\begin{aligned} \langle \negthinspace \langle c^{\,}_{\mathbf{k}\uparrow }| c^{\dagger }_{\mathbf{k}\uparrow } \rangle \negthinspace \rangle= & {} \frac{1-{\bar{n}}_{\downarrow }}{z-\epsilon _\mathbf{k}} + \frac{{\bar{n}}_{\downarrow }}{z-\epsilon _\mathbf{k}-U} \nonumber \\&+ \frac{U \langle \negthinspace \langle A'_{3\mathbf{k}} | c^{\dagger }_{\mathbf{k}\uparrow } \rangle \negthinspace \rangle }{(z-\epsilon _\mathbf{k}-U)(z-\epsilon _\mathbf{k})}. \end{aligned}$$
(29)

From this it is clear that truncating the equation of motion implies that the term on the last line is neglected, making the approximation evident. Moreover we note that \(\langle \negthinspace \langle A'_{3\mathbf{k}} | c^{\dagger }_{\mathbf{k}\uparrow } \rangle \negthinspace \rangle \) does not contain poles at \(\epsilon _\mathbf{k}\) and \(\epsilon _\mathbf{k}+U\) (since double poles in the original Green’s function are not allowed) and its total spectral weight is vanishing (since \(\langle A'_{3\mathbf{k}} | c^{\dagger }_{\mathbf{k}\uparrow } \rangle =0\)). We may also note that excitations at \(\epsilon _\mathbf{k}\) and \(U + \epsilon _\mathbf{k}\) appear in the exact thermal Green’s function (although their weight may be exponentially small), since they are exact energy differences between states with charge 1 and 0, and \(2N_s -1\) and \(2N_s\), respectively.
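The two-pole part of Eq. (29) can be cross-checked against the matrix form of Eq. (25). The sketch below evaluates \(G=(z\mathbbm {1}-K)^{-1}N\) with \(\lambda _{2\mathbf{k}}=\epsilon _\mathbf{k}\) at a complex test frequency and compares \(G_{11}\) with the first two terms of Eq. (29). The function names, the test frequency and the parameter values are arbitrary illustrative choices.

```python
def green(z, eps, U, nbar, lam2):
    """G(z, k) = (z - K)^{-1} N with K from Eq. (24) and N from Eq. (17)."""
    a, b = eps, U
    c, d = nbar * (eps - lam2), U + lam2
    det = (z - a) * (z - d) - b * c
    # Inverse of the 2x2 matrix z - K.
    inv = [[(z - d) / det, b / det],
           [c / det, (z - a) / det]]
    N = [[1.0, nbar], [nbar, nbar]]
    return [[sum(inv[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def two_pole(z, eps, U, nbar):
    """First two terms of Eq. (29), i.e. the truncated Green's function."""
    return (1.0 - nbar) / (z - eps) + nbar / (z - eps - U)

# Illustrative values (assumed); z is taken off the real axis.
z, eps, U, nbar = 0.3 + 0.7j, -1.2, 6.0, 0.4
g11 = green(z, eps, U, nbar, lam2=eps)[0][0]
print(g11, two_pole(z, eps, U, nbar))
```

The two numbers agree to rounding error, which reflects that for \(\lambda _{2\mathbf{k}}=\epsilon _\mathbf{k}\) the truncated theory is exactly the expression in Eq. (29) with the residual term dropped.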

4 Global constraints on the two-pole approximation of the Hubbard model

As noticed and stressed by Mancini and Avella [12], the Roth procedure does not ensure that global sum rules, such as those related to the Pauli principle and Ward identities, are satisfied. In the context of a two-pole approximation of the Hubbard model, these violations can be related to global constraints between averages. In particular the average double occupancy of the system can be evaluated in two different ways

$$\begin{aligned} D = \frac{1}{N_s} \sum _i \langle n_{i\downarrow } n_{i\uparrow } \rangle= & {} \frac{1}{N_s} \sum _\mathbf{k}\langle A_{1\mathbf{k}}^\dagger A_{2\mathbf{k}} \rangle \nonumber \\= & {} \frac{1}{N_s} \sum _\mathbf{k}\langle A_{2\mathbf{k}}^\dagger A_{2\mathbf{k}} \rangle . \end{aligned}$$
(30)

At the operator level these two expressions are equal. Consequently, when we evaluate these averages using the spectral theorem and the effective evolution, we have to make sure that

$$\begin{aligned} \varDelta =\sum _\mathbf{k}\frac{1}{2 \pi i } \oint dz \bigl [ G_{12}(z,\mathbf{k}) - G_{22}(z,\mathbf{k}) \bigr ] f(z)=0.\nonumber \\ \end{aligned}$$
(31)

This constraint is very important, because it removes a fundamental ambiguity related to the determination of the energy in the Roth scheme. In particular we notice that in the previously studied solution, where we used \(\lambda _{1\mathbf{k}}=0\) and \(\lambda _{2\mathbf{k}}=\epsilon _\mathbf{k}\), the constraint in Eq. (31) is automatically satisfied because the integrand is identically zero for every \(\mathbf{k}\), making the solution suitable for unambiguous physical interpretation. On a physical level this choice of the parameters makes the evolution diagonal in the two Hubbard operators, which are orthogonal by construction.
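The vanishing of the integrand in Eq. (31) for \(\lambda _{2\mathbf{k}}=\epsilon _\mathbf{k}\) can be made explicit with a short sketch. Inverting \(z\mathbbm {1}-K\) for the matrices of Eqs. (24) and (17) by hand gives \(G_{12}-G_{22}={\bar{n}}_\downarrow (1-{\bar{n}}_\downarrow )(\epsilon _\mathbf{k}-\lambda _{2\mathbf{k}})/\det (z\mathbbm {1}-K)\); this closed form is our own intermediate step and is checked numerically below. Function names and parameter values are illustrative assumptions.

```python
def pauli_integrand(z, eps, U, nbar, lam2):
    """G_12(z) - G_22(z) for G = (z - K)^{-1} N, Eqs. (24) and (17)."""
    a, b = eps, U
    c, d = nbar * (eps - lam2), U + lam2
    det = (z - a) * (z - d) - b * c
    g12 = ((z - d) * nbar + b * nbar) / det
    g22 = (c * nbar + (z - a) * nbar) / det
    return g12 - g22

def closed_form(z, eps, U, nbar, lam2):
    """Hand-derived expression nbar (1 - nbar) (eps - lam2) / det(z - K)."""
    det = (z - eps) * (z - U - lam2) - U * nbar * (eps - lam2)
    return nbar * (1.0 - nbar) * (eps - lam2) / det

# Illustrative values (assumed).
z, eps, U, nbar = 0.3 + 0.7j, -1.2, 6.0, 0.4
print(pauli_integrand(z, eps, U, nbar, lam2=eps))   # vanishes
print(pauli_integrand(z, eps, U, nbar, lam2=0.0))   # does not vanish
```

For any other choice of \(\lambda _{2\mathbf{k}}\) the integrand is non-zero pointwise, so Eq. (31) can then only be satisfied after the frequency integral and the sum over \(\mathbf{k}\).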

4.1 Variational determination of the orthogonalization parameter

As previously stated, the determination of the orthogonalization parameter \(\lambda _{2\mathbf{k}}\) plays a crucial role. Different values of this parameter give different approximations to the true Green’s function; all of them are physical in the sense that the spectral weights are positive and the excitation energies real, which is a fundamental requirement. On the other hand, different values of this parameter may correspond to quite different physics. In some sense \(\lambda _{2\mathbf{k}}\) may be viewed as a kind of mean field parameter, in the sense of variational mean field theory [14]. Any choice for \(\lambda _{2\mathbf{k}}\) is allowed and gives physical results, but we want to determine the parameter so as to approximate the physics in the “best” possible way. The definition of “best” is however not unique, since approximations do not get everything right. Depending on what one chooses to optimize, different approximations will result.

It may be reasonable to demand that the solution possesses the full lattice symmetry (i.e., assuming unbroken lattice symmetry). Then the evolution matrix \(K(\mathbf{k})\) may be expanded in terms of proper basis functions with full lattice symmetry. The simplest non-trivial possibility is to take the ansatz for \(\lambda _{2\mathbf{k}}\) to be

$$\begin{aligned} \lambda _{2\mathbf{k}}= a_0 + a_1 \epsilon _\mathbf{k}, \end{aligned}$$
(32)

where \(a_0\) and \(a_1\) are some real \(\mathbf{k}\)-independent constants. This may be viewed as the first two terms in a locality expansion. Let us also note that this is exactly the form for \(\lambda _{2\mathbf{k}}\) that is obtained in the Roth procedure in the two-pole approximation in the Hubbard model [15].

If we further assume unbroken spin symmetry, the average free energy of the system (i.e., including the chemical potential term in the energy) may be evaluated using

$$\begin{aligned} \langle F \rangle = \sum _\mathbf{k}\Big ( 2(\epsilon _\mathbf{k}-\mu ) \langle A_{1\mathbf{k}}^ \dagger A^{\,}_{1\mathbf{k}} \rangle + U \langle A_{2\mathbf{k}}^\dagger A^{\,}_{2\mathbf{k}} \rangle \Big ). \end{aligned}$$
(33)

To fix the parameters \(a_0\) and \(a_1\) we propose a zero temperature scheme based on minimizing the free energy. In particular we are going to use

$$\begin{aligned} \varDelta (a_0,a_1)=0, \qquad \underset{a_0,a_1}{\text {min}} \langle F \rangle , \end{aligned}$$
(34)

to fix \(a_0\) and \(a_1\). This is a constrained minimization problem and may be studied with standard methods in several ways. We can for example first fix \(a_1\) and then try to solve the equation \(\varDelta (a_0,a_1) = 0\) for \(a_0\). This may in general have more than one solution so it is crucial in this scheme to always check the number of roots of \(\varDelta (a_0)\). In addition the parameter \({\bar{n}}_\downarrow \) will be determined self-consistently.
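A minimal sketch of how \(\varDelta (a_0,a_1)\) and \(\langle F \rangle \) could be evaluated at zero temperature is given below. It rests on several assumptions that are not fixed by the text above: the contour integral of Eq. (6) is replaced by a sum of residues over poles below the chemical potential, the dispersion is taken to be the nearest-neighbor form \(\epsilon _\mathbf{k}=-2t(\cos k_x+\cos k_y)\) on an \(L\times L\) grid, and \({\bar{n}}_\downarrow =1/2\), \(\mu =U/2\) are imposed by hand instead of being determined self-consistently. Both quantities are returned per site.

```python
import math

def residues(K, N):
    """Yield (pole, residue matrix) of (z - K)^{-1} N for a real 2x2 K.

    Assumes two distinct real eigenvalues.
    """
    (a, b), (c, d) = K
    mean = 0.5 * (a + d)
    root = math.sqrt(0.25 * (a - d) ** 2 + b * c)
    for pole, other in ((mean - root, mean + root), (mean + root, mean - root)):
        w = pole - other
        P = [[(a - other) / w, b / w], [c / w, (d - other) / w]]
        yield pole, [[sum(P[i][k] * N[k][j] for k in range(2))
                      for j in range(2)] for i in range(2)]

def delta_and_free_energy(U, a0, a1, L=8, nbar=0.5, t=1.0):
    """Per-site Delta of Eq. (31) and <F> of Eq. (33) at T = 0.

    Assumptions: half filling with mu = U/2 and nbar fixed by hand,
    nearest-neighbor square-lattice dispersion.
    """
    mu = 0.5 * U
    N = [[1.0, nbar], [nbar, nbar]]
    delta, free = 0.0, 0.0
    for ix in range(L):
        for iy in range(L):
            kx, ky = 2.0 * math.pi * ix / L, 2.0 * math.pi * iy / L
            eps = -2.0 * t * (math.cos(kx) + math.cos(ky))
            lam2 = a0 + a1 * eps                      # ansatz of Eq. (32)
            K = [[eps, U], [nbar * (eps - lam2), U + lam2]]
            occ = [[0.0, 0.0], [0.0, 0.0]]
            for pole, R in residues(K, N):
                if pole < mu:                         # T = 0 Fermi function
                    for i in range(2):
                        for j in range(2):
                            occ[i][j] += R[i][j]
            delta += occ[0][1] - occ[1][1]
            free += 2.0 * (eps - mu) * occ[0][0] + U * occ[1][1]
    return delta / L**2, free / L**2

for a1 in (1.0, -3.0):
    print("a1 =", a1, delta_and_free_energy(U=12.0, a0=0.0, a1=a1))
```

A constrained minimization along the lines of Eq. (34) would wrap this function in a root search in \(a_0\) for each \(a_1\), followed by a scan over \(a_1\); a full treatment would also iterate \({\bar{n}}_\downarrow \) and \(\mu \) to self-consistency.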

5 Numerical results for the half filled case

In this section we are going to report some numerical results for the half filled case for a square lattice \(100\times 100\) at \(T=0\). Throughout we will measure energies in units of t, which amounts to setting \(t=1\). Half filling is obtained by taking \(\mu = U/2\). In Sect. 6 below an analytical treatment of the half-filled case will be presented as well.

In particular we report the results obtained for two possible sets of parameters \(a_0\), \(a_1\) which satisfy the constraint Eq. (31): the \(a_0=0\), \(a_1=1\) case and the \(a_0\), \(a_1\) obtained by the variational scheme presented in Sect. 4.1. From Eq. (31) one can see that for \(a_0=0\) we have \(\varDelta (0,a_1)=0\) independently of the value of \(a_1\), and this is the only possible root, as can be seen in Fig. 1 (here we report \(\varDelta (a_0)\) only for a particular value of \(a_1\), but the situation is the same for other values of \(a_1\)).

Fig. 1 The function \(\varDelta (a_0)\) for parameters \(U=12\), \(a_1=-3\)

Fig. 2 Expectation value of the Free energy \(\langle F \rangle (a_1)\) for \(U=8\), \(a_0=0\). The global minimum near \(a_1 = -3\) is clearly visible

To carry out the free energy minimization carefully, it is important to have a sketch of the free energy landscape as a function of \(a_1\), since we will put \(a_0=0\). A representative curve can be seen in Fig. 2, and we notice that \(\langle F \rangle (a_1)\) possesses a global minimum for negative values of \(a_1\). In particular, after carrying out the constrained minimization numerically, we found that the minimum of the free energy is reached for \(a_1=-3\), \(a_0=0\), independently of the coupling strength U. At this point we compare the average energy and double occupancy obtained for the two choices of the decoupling parameter, \(a_1=1\), \(a_0 =0\) and \(a_1=-3\), \(a_0=0\), against the benchmark results gathered from Le Blanc et al. [16], reported in Tables 1 and 2, respectively. From Table 1 one can see that both decouplings, \(a_1=1\) and \(a_1=-3\), predict energies that may be lower than those of the exact methods. Consequently this scheme is not variational in the usual sense. This is expected since we are approximating the Green’s function. In fact, although we know analytically the term neglected in the Green’s function Eq. (29), the properties of the approximate Green’s function are determined self-consistently within the truncated theory, which can be different from the original one.

Table 1 Benchmark zero-temperature energies at half filling for a range of interaction strengths U from Ref. [16] compared with ours. The benchmark results have been rounded to approximately two decimals
Table 2 Benchmark zero-temperature double occupancy at half filling for a range of interaction strengths U. The benchmark results have been rounded to approximately two decimals

From Table 2 we notice that in the case \(a_1=1\) the double occupancy drops to zero for \(U\ge 8\). In the other case, \(a_1=-3\), the double occupancy is predicted to be of the order \((t/U)^2\), which agrees at least in order of magnitude with the benchmark results.

It is important to highlight that, despite the crudeness of the two-pole approximation, this scheme is capable of capturing the effect of the correlations independently of the value of the parameter \(a_1\), predicting a double occupancy that is significantly reduced from the mean field value \(n_{\uparrow }n_\downarrow =1/4\). The two possible choices of parameters, \(a_1=1\), \(a_0=0\) and \(a_1=-3\), \(a_0=0\), nevertheless lead to big differences in the predicted observables. In particular, for the case \(a_0=0\), \(a_1=1\) the truncated theory possesses two energy bands shifted rigidly by U (i.e., independently of the momentum), as may be seen in Eq. (29) and in Fig. 3.

Fig. 3 Band structures for different values of U and \(a_1=1\) at half-filling. The red line indicates the chemical potential

With this choice of parameters all the k-points in the first Brillouin zone are half occupied for \(U>8t\), while for \(U<8t\) a fully occupied region forms around the \(\varGamma \) point, surrounded by a region of half occupied k-points, with an empty region close to the M point. The half-filled region in between shrinks as the interaction strength is decreased, as can be seen in Fig. 4. We can also notice that in the limit \(U \rightarrow 0\) we recover the diamond-shaped Fermi surface for free fermions on the square lattice at half-filling.
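The three occupation regions follow directly from the two-pole form of Eq. (29) evaluated with Eq. (26) at zero temperature. As a sketch, assuming a nearest-neighbor dispersion \(\epsilon _\mathbf{k}=-2t(\cos k_x+\cos k_y)\), \({\bar{n}}_\downarrow =1/2\), \(\mu =U/2\) and a zero-temperature Fermi function (none of which are computed self-consistently here), the occupation is \(n_\mathbf{k}=(1-{\bar{n}}_\downarrow )\theta (\mu -\epsilon _\mathbf{k})+{\bar{n}}_\downarrow \theta (\mu -\epsilon _\mathbf{k}-U)\), and the k-points can simply be counted:

```python
import math

def occupation_regions(U, L=8, t=1.0, nbar=0.5):
    """Count fully, half and un-occupied k-points for a_0 = 0, a_1 = 1.

    Assumptions: T = 0, mu = U/2, nbar = 1/2 and a nearest-neighbor
    square-lattice dispersion on an L x L grid.
    """
    mu = 0.5 * U
    counts = {"full": 0, "half": 0, "empty": 0}
    for ix in range(L):
        for iy in range(L):
            kx, ky = 2.0 * math.pi * ix / L, 2.0 * math.pi * iy / L
            eps = -2.0 * t * (math.cos(kx) + math.cos(ky))
            lower = eps < mu          # lower band, weight 1 - nbar
            upper = eps + U < mu      # upper band, weight nbar
            n_k = (1.0 - nbar) * lower + nbar * upper
            if n_k > 0.75:
                counts["full"] += 1
            elif n_k > 0.25:
                counts["half"] += 1
            else:
                counts["empty"] += 1
    return counts

for U in (12.0, 4.0):
    print("U =", U, occupation_regions(U))
```

Points that lie exactly on a band edge are sensitive to rounding on a finite grid, so the individual counts for \(U<8t\) should be read as indicative only.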

Fig. 4 Colormap of the average occupation in the first Brillouin zone for different values of U and \(a_1=1\) at half-filling

To capture the metallic or insulating behavior of the solution one should in principle evaluate the conductivity or the charge-charge correlation function, which is inaccessible with the operators used here. However, we can obtain an indication of the metallic or insulating behavior of the system by analyzing the density of states. Let us first consider the case \(a_1 = 1\). In this case we can see in Fig. 5 that for \(U>8t\) the density of states possesses a hard gap and separated lower and upper Hubbard bands are formed, which is a signature of an insulating phase. For \(U<8t\) the lower and upper Hubbard bands overlap, giving rise to a gapless density of states, which is an indication of a metallic phase. Consequently, for the choice of parameters \(a_1=1\), \(a_0=0\), \(U=8\) represents a critical value of the interaction above which the system is in an insulating state and below which the system is in a metallic state.

Fig. 5 DOS for different values of U and \(a_1=1\) at half-filling. Energies on the x-axis are measured with respect to the chemical potential

A radically different behavior is predicted by the choice of parameters \(a_1=-3\), \(a_0=0\). In this case the system possesses two bands that repel with increasing interaction strength, and there is always a small gap between the two bands for any non-vanishing value of U. The gap becomes very small for small interactions, as can be seen by looking at the case \(U=0.1\) in Fig. 6. With this choice of parameters the occupation in the first Brillouin zone is characterized by an almost fully occupied region around the \(\varGamma \) point, which changes continuously to a low but non-zero occupation at the corner of the first Brillouin zone (the M point). As the interaction is decreased, the region around the \(\varGamma \) point becomes increasingly occupied and the corner of the Brillouin zone gets increasingly depleted. From Fig. 7 it looks as if a Fermi surface is formed at \(U=0.1\) along the high-symmetry direction \(M\varGamma \). There is however a small gap that is not seen on this scale; this becomes clear in the analytic treatment in Sect. 6. In the limit \(U\rightarrow 0\) we recover, also in this case, the diamond-shaped Fermi surface for a free electron gas on the square lattice. As we did for the case \(a_1=1\), \(a_0=0\), we can also study the density of states in order to get an indication of the phase of the system. In this case there is always a gap between the upper and lower Hubbard bands, but the gap is very small for small U, as can be seen in Fig. 8. However, in order to better characterize the possible phases of the system for this choice of parameters, a study of \(\langle F \rangle (U)\) is beneficial.
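The band structure for a given \((a_0,a_1)\) follows from the eigenvalues of the matrix in Eq. (24). The sketch below computes the two bands on a grid and the separation between the bottom of the upper band and the top of the lower band. It assumes \({\bar{n}}_\downarrow =1/2\) and a nearest-neighbor dispersion \(\epsilon _\mathbf{k}=-2t(\cos k_x+\cos k_y)\); the function names and grid size are illustrative choices, and a negative return value signals overlapping bands.

```python
import math

def bands(eps, U, a0, a1, nbar=0.5):
    """Lower and upper eigenvalue of K(k) in Eq. (24) with Eq. (32)."""
    lam2 = a0 + a1 * eps
    a, b = eps, U
    c, d = nbar * (eps - lam2), U + lam2
    mean = 0.5 * (a + d)
    root = math.sqrt(0.25 * (a - d) ** 2 + b * c)
    return mean - root, mean + root

def band_gap(U, a0, a1, L=64, t=1.0):
    """min(upper band) - max(lower band) on an L x L grid.

    Assumption: nearest-neighbor square-lattice dispersion.
    """
    top_of_lower = -math.inf
    bottom_of_upper = math.inf
    for ix in range(L):
        for iy in range(L):
            kx, ky = 2.0 * math.pi * ix / L, 2.0 * math.pi * iy / L
            eps = -2.0 * t * (math.cos(kx) + math.cos(ky))
            lower, upper = bands(eps, U, a0, a1)
            top_of_lower = max(top_of_lower, lower)
            bottom_of_upper = min(bottom_of_upper, upper)
    return bottom_of_upper - top_of_lower

for U in (0.1, 2.0, 8.0):
    print("U =", U,
          " gap(a1=-3) =", band_gap(U, 0.0, -3.0),
          " gap(a1=1) =", band_gap(U, 0.0, 1.0))
```

Within these assumptions the \(a_1=-3\) bands stay separated for every \(U>0\) in the loop, while the \(a_1=1\) bands overlap until U exceeds the bandwidth 8t.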

Fig. 6
figure 6

Band structures for different values of U and \(a_1=-3\) at half-filling. The red line indicates the chemical potential

Fig. 7

Colormap of the average occupation in the first Brillouin zone for different values of U and \(a_1=-3\) at half-filling

Fig. 8

DOS for different values of U and \(a_1=-3\) at half-filling. Energies on the x-axis are measured with respect to the chemical potential. There is always a tiny gap that is not visible on this scale in the lower two panels

In fact, the solution with \(a_1 = -3\) always has a lower expectation value of the free energy than the solution with \(a_1 = 1\). Moreover, since the solution with \(a_1=-3\) is insulating, our scheme indicates that the insulator is stable at half-filling.

In principle, minimizing the free energy is not guaranteed to better capture the underlying physics of the system, since the free energy is not a variational quantity in terms of the Green’s function. To assess this one needs to estimate the contribution of the residual Green’s function, which cannot be done in a simple way within the theory itself. We note that the solution with \(a_1=-3\) enhances the insulating character of the system, while the one with \(a_1=1\) tends to account for the full bandwidth W of the free dispersion. It is therefore reasonable to assume that the solution with \(a_1=-3\) better describes the system when \(U \gg W\), while \(a_1=1\) better describes it for \(U \ll W\). This qualitative argument is also consistent with the results summarized in Tables 1 and 2.

5.1 Relation to the two-pole approximation of Avella and collaborators

We can relate our approach to that of Avella et al. by comparing the associated E-matrices [15]. The relations between their parameters (\(\varDelta \) and p) and ours (\(a_0\) and \(a_1\)) are given by

$$\begin{aligned} -2dt \varDelta= & {} {\bar{n}}_\downarrow (1- {\bar{n}}_\downarrow ) a_0 , \end{aligned}$$
(35a)
$$\begin{aligned} p= & {} {\bar{n}}_\downarrow (1- {\bar{n}}_\downarrow ) a_1 + {\bar{n}}^2_\downarrow . \end{aligned}$$
(35b)

The determination of these parameters is discussed at length in Ref. [15]. We note that they choose to determine \(\varDelta \) self-consistently from the Green’s function and fix p so that the Pauli principle is satisfied. This differs from our procedure, where \(a_0\) (and therefore \(\varDelta \)) is determined so that the Pauli principle is satisfied, and in the next step \(a_1\) is fixed by minimizing the expectation value of the free energy. Comparing the results, we also find two classes of solutions, but the parameters obtained are not identical.
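As an illustration of Eqs. (35a) and (35b), the mapping between the two parametrizations can be written as a small Python helper. This is only a sketch: the function name `avella_parameters` is introduced here for illustration, and we assume that d denotes the lattice dimension and t the hopping amplitude, as the prefactor \(2dt\) suggests.

```python
def avella_parameters(a0, a1, n_down, t=1.0, d=2):
    """Map (a0, a1) to (Delta, p) using Eqs. (35a) and (35b).

    Assumption: d is the lattice dimension and t the hopping amplitude.
    """
    prefactor = n_down * (1.0 - n_down)
    delta = -prefactor * a0 / (2.0 * d * t)  # Eq. (35a) solved for Delta
    p = prefactor * a1 + n_down**2           # Eq. (35b)
    return delta, p


# Half-filling (n_down = 1/2) for the two classes of solutions discussed in Sect. 5
for a1 in (1.0, -3.0):
    delta, p = avella_parameters(a0=0.0, a1=a1, n_down=0.5)
    print(f"a1 = {a1:+.1f}: Delta = {abs(delta):.3f}, p = {p:+.3f}")
```

At half-filling with \(a_0=0\) both classes of solutions map to \(\varDelta =0\), while Eq. (35b) evaluates to \(p=1/2\) for \(a_1=1\) and \(p=-1/2\) for \(a_1=-3\).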

In previous work two inequivalent solutions of the two-dimensional Hubbard model were found and named COM1 and COM2 [12, 15]. At half-filling all solutions within the two-pole approximation (Hubbard-I, Roth, COM1 and COM2) are characterized by \(\varDelta =0\) [12]. In our scheme this has a clear interpretation: \(\varDelta =0\) is the only value for which the Pauli principle constraint is satisfied, as can be seen both in the numerical study of Sect. 5 and in the analytic study of Sect. 6. At half-filling the solution \(a_1=1,a_0=0\) resembles COM1 in many respects, even though the parameters are different. Both solutions predict a critical \(U_c\) at which the system goes from an insulating state to a metallic one. Moreover, the band structure in COM1 is almost rigidly shifted with a separation proportional to U. On the other hand, the solution \(a_1=-3,a_0=0\) resembles COM2: both are characterized by the absence of a critical U and predict an insulator for arbitrarily small repulsion U.

6 Analytical results

Since the two-pole approximation involves \(2 \times 2\) matrices, everything may be evaluated exactly. A straightforward calculation gives (dropping the \(\mathbf{k}\) indices on \(\epsilon _\mathbf{k}\), \(\lambda _{2\mathbf{k}}\) and the other parameters for brevity)

$$\begin{aligned} \langle \negthinspace \langle c^{\,}_{\mathbf{k}\uparrow } | c^{\dagger }_{\mathbf{k}\uparrow } \rangle \negthinspace \rangle= & {} \frac{1}{2} \Bigl ( \frac{1+\delta _1}{z-E_-} + \frac{1-\delta _1}{z-E_+} \Bigr ) , \end{aligned}$$
(36a)
$$\begin{aligned} \langle \negthinspace \langle \eta ^{\,}_{\mathbf{k}\uparrow } | \eta ^{\dagger }_{\mathbf{k}\uparrow } - c^{\dagger }_{\mathbf{k}\uparrow } \rangle \negthinspace \rangle= & {} \delta _2 \Bigl ( \frac{1}{z-E_-} - \frac{1}{z-E_+} \Bigr ), \end{aligned}$$
(36b)
$$\begin{aligned} \langle \negthinspace \langle \eta ^{\,}_{\mathbf{k}\uparrow } | \eta ^{\dagger }_{\mathbf{k}\uparrow } \rangle \negthinspace \rangle= & {} \frac{{\bar{n}}_\downarrow }{2} \Bigl ( \frac{1+\delta _3}{z-E_-} + \frac{1-\delta _3}{z-E_+} \Bigr ), \end{aligned}$$
(36c)

where the poles are located at

$$\begin{aligned} E_{\pm } = \frac{U+\epsilon + \lambda _2 \pm \sqrt{(U - \epsilon + \lambda _2)^2 + 4 {\bar{n}}_\downarrow U (\epsilon -\lambda _{2})}}{2}.\nonumber \\ \end{aligned}$$
(37)

The other parameters that are related to the weight of the poles are

$$\begin{aligned} \delta _1= & {} \frac{U(1-2 {\bar{n}}_\downarrow ) + \lambda _2 - \epsilon }{\sqrt{(U - \epsilon + \lambda _2)^2 + 4 {\bar{n}}_\downarrow U (\epsilon -\lambda _{2})}}, \end{aligned}$$
(38a)
$$\begin{aligned} \delta _2= & {} \frac{{\bar{n}}_\downarrow (1-{\bar{n}}_\downarrow ) (\epsilon - \lambda _2)}{\sqrt{(U - \epsilon + \lambda _2)^2 + 4 {\bar{n}}_\downarrow U (\epsilon -\lambda _{2})}}, \end{aligned}$$
(38b)
$$\begin{aligned} \delta _3= & {} - \frac{U + (1 - 2 {\bar{n}}_\downarrow ) (\lambda _2 - \epsilon )}{\sqrt{(U - \epsilon + \lambda _2)^2 + 4 {\bar{n}}_\downarrow U (\epsilon -\lambda _{2})}} . \end{aligned}$$
(38c)
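The closed-form expressions above are straightforward to evaluate numerically. The following Python sketch implements Eqs. (37) and (38a)–(38c) for a single momentum; the function and argument names are illustrative choices made here, not part of the original formulation.

```python
import math


def two_pole_parameters(U, eps, lam2, n_down):
    """Return (E_minus, E_plus, delta1, delta2, delta3) from Eqs. (37)-(38c)."""
    x = eps - lam2                                         # epsilon - lambda_2
    root = math.sqrt((U - x) ** 2 + 4.0 * n_down * U * x)  # square root in Eq. (37)
    e_minus = 0.5 * (U + eps + lam2 - root)
    e_plus = 0.5 * (U + eps + lam2 + root)
    delta1 = (U * (1.0 - 2.0 * n_down) - x) / root         # Eq. (38a)
    delta2 = n_down * (1.0 - n_down) * x / root            # Eq. (38b)
    delta3 = -(U - (1.0 - 2.0 * n_down) * x) / root        # Eq. (38c)
    return e_minus, e_plus, delta1, delta2, delta3


# Illustrative values: half-filling with lambda_2 = a1 * eps and a1 = -3
U, eps, a1 = 4.0, 1.0, -3.0
e_minus, e_plus, delta1, delta2, delta3 = two_pole_parameters(U, eps, a1 * eps, 0.5)
# The residues (1 +- delta1)/2 of Eq. (36a) should lie between 0 and 1
assert -1.0 <= delta1 <= 1.0
```

The final check reflects a general property of Eq. (38a): the radicand exceeds the square of the numerator of \(\delta _1\) by \(4 {\bar{n}}_\downarrow (1-{\bar{n}}_\downarrow ) U^2\), so the two residues in Eq. (36a) are non-negative for \(0 \le {\bar{n}}_\downarrow \le 1\).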

Using this we may calculate many quantities of interest, such as the density of spin-up electrons

$$\begin{aligned} {\bar{n}}_\uparrow = \frac{1}{N_s} \sum _\mathbf{k}\Bigl ( \frac{1+\delta _{1\mathbf{k}}}{2}n_{-\mathbf{k}} + \frac{1-\delta _{1\mathbf{k}}}{2} n_{+\mathbf{k}} \Bigr ), \end{aligned}$$
(39)

the Pauli principle constraint (\(\varDelta = 0\))

$$\begin{aligned} \sum _\mathbf{k}\delta _{2\mathbf{k}} (n_{- \mathbf{k}} - n_{+ \mathbf{k}} ) = 0 , \end{aligned}$$
(40)

and average double occupancy

$$\begin{aligned} D = {\bar{n}}_\downarrow \frac{1}{N_s} \sum _\mathbf{k}\Bigl ( \frac{1+\delta _{3\mathbf{k}}}{2}n_{- \mathbf{k}} + \frac{1-\delta _{3\mathbf{k}}}{2} n_{+ \mathbf{k}} \Bigr ) , \end{aligned}$$
(41)

as well as the average kinetic energy (of two spin species)

$$\begin{aligned} \langle {\hat{H}}_0 \rangle = \frac{2}{N_s} \sum _\mathbf{k}\epsilon _\mathbf{k}\Bigl ( \frac{1+\delta _{1\mathbf{k}}}{2}n_{- \mathbf{k}} + \frac{1-\delta _{1\mathbf{k}}}{2} n_{+ \mathbf{k}} \Bigr ).\nonumber \\ \end{aligned}$$
(42)
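The momentum sums in Eqs. (39)–(42) can be evaluated on a finite grid. The sketch below is a minimal illustration under assumptions that are not fixed by the expressions above: a nearest-neighbor dispersion \(\epsilon _\mathbf{k} = -2t(\cos k_x + \cos k_y)\) on the square lattice, \(\lambda _{2\mathbf{k}} = a_1 \epsilon _\mathbf{k}\) (that is, \(a_0=0\)), and occupations \(n_{\pm \mathbf{k}}\) given by Fermi functions of \(E_{\pm \mathbf{k}} - \mu \). The function name and the grid size are hypothetical choices.

```python
import numpy as np


def two_pole_observables(U, a1, n_down, mu, t=1.0, n_k=64, temperature=0.0):
    """Evaluate Eqs. (39)-(42) on an n_k x n_k momentum grid.

    Assumptions: eps_k = -2t(cos kx + cos ky), lambda_2k = a1 * eps_k (a0 = 0),
    and n_{+-k} are Fermi functions of E_{+-k} - mu.
    """
    k = 2.0 * np.pi * np.arange(n_k) / n_k
    kx, ky = np.meshgrid(k, k, indexing="ij")
    eps = -2.0 * t * (np.cos(kx) + np.cos(ky))
    x = eps - a1 * eps                                   # epsilon - lambda_2
    root = np.sqrt((U - x) ** 2 + 4.0 * n_down * U * x)  # square root in Eq. (37)

    e_minus = 0.5 * (U + eps + a1 * eps - root)
    e_plus = 0.5 * (U + eps + a1 * eps + root)
    delta1 = (U * (1.0 - 2.0 * n_down) - x) / root
    delta2 = n_down * (1.0 - n_down) * x / root
    delta3 = -(U - (1.0 - 2.0 * n_down) * x) / root

    def occupation(energy):
        if temperature == 0.0:
            return (energy < mu).astype(float)           # step function at T = 0
        # Fermi function written with tanh to avoid overflow in exp
        return 0.5 * (1.0 - np.tanh((energy - mu) / (2.0 * temperature)))

    n_m, n_p = occupation(e_minus), occupation(e_plus)
    weight1 = 0.5 * (1.0 + delta1) * n_m + 0.5 * (1.0 - delta1) * n_p
    weight3 = 0.5 * (1.0 + delta3) * n_m + 0.5 * (1.0 - delta3) * n_p
    return {
        "n_up": np.mean(weight1),                        # Eq. (39)
        "pauli": np.mean(delta2 * (n_m - n_p)),          # Eq. (40), divided by N_s
        "double_occupancy": n_down * np.mean(weight3),   # Eq. (41)
        "kinetic": 2.0 * np.mean(eps * weight1),         # Eq. (42)
    }


# Illustrative half-filled case with the chemical potential inside the gap
result = two_pole_observables(U=8.0, a1=-3.0, n_down=0.5, mu=4.0)
```

For these illustrative parameters the returned `n_up` should be close to 1/2 and the Pauli residual should vanish to rounding error, since the grid contains every momentum together with its partner shifted by \((\pi ,\pi )\).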

6.1 Simplifying assumptions – insulator

It is possible to find the solution with \(a_1 = -3\) obtained in the numerical study above analytically. In this subsection we present this solution in some detail since it provides an interesting zeroth order approximate Green’s function at half-filling.

Let us assume that U is sufficiently large and the chemical potential sufficiently small so that \(n_{-\mathbf{k}} = 1\) and \(n_{+\mathbf{k}} = 0\) for all \(\mathbf{k}\). We must then have

$$\begin{aligned} \frac{1}{N_s} \sum _\mathbf{k}\delta _{2\mathbf{k}} = 0 . \end{aligned}$$
(43)

Then \({\bar{n}}_\uparrow = {\bar{n}}_\downarrow = 1/2\) solves Eq. (39). With this choice

$$\begin{aligned} \delta _2 = \frac{1}{4} \frac{\epsilon - \lambda _2}{\sqrt{U^2 + (\epsilon - \lambda _2)^2}}, \end{aligned}$$
(44)

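and therefore any \(\lambda _2 = a_1 \epsilon \) will satisfy the Pauli principle constraint, since \(\delta _2\) is then odd in \(\epsilon \) and the sum in Eq. (43) vanishes for a dispersion that is symmetric under \(\epsilon \rightarrow -\epsilon \). The other parameters then become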

$$\begin{aligned} \delta _1= & {} \frac{ (a_1 -1 ) \epsilon }{\sqrt{U^2 + (a_1 - 1)^2 \epsilon ^2}}, \end{aligned}$$
(45a)
$$\begin{aligned} \delta _3= & {} - \frac{U }{\sqrt{U^2 + (a_1 - 1)^2 \epsilon ^2}}. \end{aligned}$$
(45b)

Using this we may write down expressions for average double occupancy

$$\begin{aligned} D = \frac{1}{4} \frac{1}{N_s} \sum _\mathbf{k}\Bigl (1- \frac{U }{\sqrt{U^2 + (a_1 - 1)^2 \epsilon _\mathbf{k}^2}} \Bigr ) , \end{aligned}$$
(46)

and average kinetic energy

$$\begin{aligned} \langle {\hat{H}}_0 \rangle = \frac{1}{N_s} \sum _\mathbf{k}\frac{ (a_1 -1 ) \epsilon ^2_\mathbf{k}}{\sqrt{U^2 + (a_1 - 1)^2 \epsilon _\mathbf{k}^2}} . \end{aligned}$$
(47)

Minimizing \(\langle {\hat{H}}_0 \rangle + U D\) we find a minimum at \(a_1 = -3\), with the energy being

$$\begin{aligned} \langle {\hat{H}}_0 \rangle + U D = \frac{1}{4} \frac{1}{N_s} \sum _\mathbf{k}\Bigl ( U - \sqrt{U^2 + (4 \epsilon _\mathbf{k})^2} \Bigr ). \end{aligned}$$
(48)
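The location of the minimum can be verified term by term. As a short sketch, with the shorthand \(x = a_1 - 1\) (introduced here only for convenience), the summand of \(\langle {\hat{H}}_0 \rangle + U D\) obtained from Eqs. (46) and (47) and its derivative read

$$\begin{aligned} f_\mathbf{k}(x)= & {} \frac{x \epsilon _\mathbf{k}^2}{\sqrt{U^2 + x^2 \epsilon _\mathbf{k}^2}} + \frac{U}{4} \Bigl ( 1 - \frac{U}{\sqrt{U^2 + x^2 \epsilon _\mathbf{k}^2}} \Bigr ) , \\ \frac{\partial f_\mathbf{k}}{\partial x}= & {} \frac{\epsilon _\mathbf{k}^2 U^2 \bigl ( 1 + x/4 \bigr )}{\bigl ( U^2 + x^2 \epsilon _\mathbf{k}^2 \bigr )^{3/2}} . \end{aligned}$$

The derivative changes sign from negative to positive at \(x=-4\) for every \(\mathbf{k}\) with \(\epsilon _\mathbf{k} \ne 0\), so each term is minimized at \(a_1=-3\) independently of the details of the dispersion, and \(f_\mathbf{k}(-4) = \frac{1}{4} \bigl ( U - \sqrt{U^2 + 16 \epsilon _\mathbf{k}^2} \bigr )\) reproduces the summand of Eq. (48).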

The gain in energy due to hopping is increased with respect to more conventional approaches, such as anti-ferromagnetic mean field. Let us also note that the band structure for this solution is

$$\begin{aligned} E_{\pm } = \frac{U -2 \epsilon _\mathbf{k}\pm \sqrt{U^2 + (4 \epsilon _\mathbf{k})^2}}{2} , \end{aligned}$$
(49)

in the large-U limit we therefore get

$$\begin{aligned} E_{\pm } \approx U \Bigl ( \frac{1\pm 1}{2}\Bigr ) - \epsilon _\mathbf{k}, \end{aligned}$$
(50)

giving us two Hubbard bands with the full bare non-interacting bandwidth. Note however that the sign of the kinetic term is opposite to what it would be in the non-interacting case. The solution \(a_0 = 0\), \(a_1 = 1\) in the same region has \(D=0\) and \(\langle {\hat{H}}_0 \rangle =0\), so it is always higher in energy than \(a_0 = 0, a_1 = -3\). This agrees with our numerical findings.
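The analytic results of this subsection can be cross-checked numerically. The following Python sketch scans \(a_1\) using Eqs. (46) and (47) and extracts the band gap from Eq. (37) at \({\bar{n}}_\downarrow = 1/2\). It assumes a nearest-neighbor dispersion \(\epsilon _\mathbf{k} = -2t(\cos k_x + \cos k_y)\) and \(n_{-\mathbf{k}}=1\), \(n_{+\mathbf{k}}=0\) for all \(\mathbf{k}\); the function name, grid size and scan range are illustrative choices.

```python
import numpy as np


def half_filled_solution(U, a1, t=1.0, n_k=64):
    """Energy per site from Eqs. (46)-(47) and band gap from Eq. (37).

    Assumptions: eps_k = -2t(cos kx + cos ky), n_down = 1/2, lambda_2 = a1 * eps,
    and a fully occupied lower band (n_-k = 1, n_+k = 0).
    """
    k = 2.0 * np.pi * np.arange(n_k) / n_k
    kx, ky = np.meshgrid(k, k, indexing="ij")
    eps = -2.0 * t * (np.cos(kx) + np.cos(ky))
    root = np.sqrt(U**2 + (a1 - 1.0) ** 2 * eps**2)
    double_occupancy = 0.25 * np.mean(1.0 - U / root)   # Eq. (46)
    kinetic = np.mean((a1 - 1.0) * eps**2 / root)       # Eq. (47)
    e_minus = 0.5 * (U + (1.0 + a1) * eps - root)       # Eq. (37) at n_down = 1/2
    e_plus = 0.5 * (U + (1.0 + a1) * eps + root)
    gap = e_plus.min() - e_minus.max()
    return kinetic + U * double_occupancy, gap


U = 4.0
# Scan restricted to a1 < 0, where the two bands do not overlap
a1_grid = np.linspace(-5.0, -1.0, 41)
energies = np.array([half_filled_solution(U, a1)[0] for a1 in a1_grid])
best_a1 = a1_grid[np.argmin(energies)]
energy_min, gap = half_filled_solution(U, -3.0)
print(f"minimum at a1 = {best_a1:.1f}, energy = {energy_min:.4f}, gap = {gap:.4f}")
```

On this grid the scan should return its minimum at \(a_1=-3\), in agreement with Eq. (48). The gap at \(U=4\) should be close to \(\sqrt{3}U/2\), which is what extremizing Eq. (49) with respect to \(\epsilon \) gives as long as the extremal values \(\epsilon = \pm U/(4\sqrt{3})\) lie inside the band.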

7 Hole doped case in the strong coupling regime

In this section we apply our scheme to the hole doped case for an interaction strength larger than the bandwidth, namely \(U=12\). From the free energy plots in Fig. 9 it is clear that upon hole doping (decreasing the chemical potential) the minimum around \(a_1=1\) is pushed down in energy with respect to the one near \(a_1 = -3\), until it becomes the global one below a critical value near \(\mu = 1.4\). From the DOS in Fig. 10 one can see that the solution around \(a_1=-3\) is characterized by a hard gap, and the system is predicted to be insulating for chemical potentials down to \(\mu =1.4\). On the other hand, the solution around \(a_1=1\) is characterized by a smaller gap and becomes the global minimum of the free energy when \(\mu <1.4 \). The DOS in Fig. 11 suggests the formation of a metallic phase, with a transfer of spectral weight from high-energy states to low-energy ones.

Fig. 9

Free energy expectation value \(\langle F \rangle \) as a function of \(a_1\) for different values of the chemical potential, decreasing from the top panel, \(\mu \in \{6.0,3.0,1.8,1.4,1.2\}\). The interaction strength is \(U=12\)

Fig. 10

DOS (top left), colormap of the average occupation in the first Brillouin zone (top right), average occupations along high symmetry lines (bottom left), and band structure (bottom right), for \(U=12\), \(\mu =1.4\) and \(a_1=-3\) and \(\langle n_{\downarrow } \rangle =0.5\). The red line indicates the chemical potential

Fig. 11

DOS (top left), colormap of the average occupation in the first Brillouin zone (top right), average occupations along high symmetry lines (bottom left), and band structure (bottom right), for \(U=12\), \(\mu =1.4\) and \(a_1=1.1\) and \(\langle n_{\downarrow } \rangle =0.42\). The red line indicates the chemical potential

Another interesting feature of Fig. 9 for \(\mu \le 1.4\) is that the free energy as a function of \(a_1\) is discontinuous at some values in the range \(a_1\in [-5,-4]\). In this case the lower band gets attracted to the upper one and is pushed above the chemical potential for certain momenta. This results in unoccupied k-points in the first Brillouin zone and in two Fermi surfaces, which may be seen in Fig. 12. This may be viewed as a Lifshitz transition [17, 18]. This solution is however not energetically favorable, and is unlikely to be stabilized without additional interactions.

Fig. 12

Colormap of the occupation in the first Brillouin zone for \(U=12\) and \(\mu =1.2\), for \(a_1=-4.1\) (top left) and \(a_1=-3.9\) (top right). Zoom-in of the band structure for the corresponding two cases (lower two panels), making the Lifshitz transition apparent. The red line indicates the chemical potential

8 Hole doped case in the intermediate coupling regime

Fig. 13

Expectation value of the free energy \(\langle F \rangle \) as a function of \(a_1\) for different values of the chemical potential, decreasing from the top panel, \(\mu \in \{2.0,0.7,0.3,0.0,-0.4\}\), at \(U=4\)

In this section we apply our scheme to the hole doped case for an interaction comparable to the bandwidth, namely \(U=4\). The free energy plots in Fig. 13 indicate that the situation is more involved than in the strong coupling regime of Sect. 7. There appear three local minima: one in the region \(a_1\in [-4,-3]\), one in the region \(a_1 \in [0,1]\), and one in the region \(a_1\in [3,4]\). In the discussion below we call these minima \(m_1\), \(m_2\), and \(m_3\), respectively. For \(\mu >0.3\), \(m_1\) is the global minimum. Upon hole doping the local minimum \(m_3\) is pushed down in energy and the minimum \(m_2\) forms. For \(\mu <0.3\) the minimum \(m_3\) becomes the global one, until for \(\mu <-0.3\) the minimum \(m_2\) becomes the global minimum.

The character of the solution \(m_1\) may be elucidated from Fig. 14. It is characterized by an insulating gap and predicts the system to be half-filled for \(\mu \in [0.3,2]\). This solution is also characterized by the absence of a Fermi surface, and should be viewed as being in the same phase as the corresponding half-filled solution studied above with \(a_1 = -3\), and also the corresponding strong coupling solution.

Fig. 14

Characterization of solution \(m_1\). DOS (top left), colormap of the average occupation in the first Brillouin zone (top right), average occupations along high symmetry lines (bottom left), and band structure (bottom right), for \(U=4\), \(\mu =0.3\) and \(a_1 = -3.0\) and \(\langle n_{\downarrow } \rangle = 0.50\). The red line indicates the chemical potential

The solution \(m_3\) may be characterized by studying Fig. 15. It is gapless, which suggests a metallic state. Moreover, there are two sharp Fermi surfaces with discontinuities in the occupation numbers, and a partially depleted “ring” around the \(\varGamma \)-point is formed. The formation of an additional Fermi surface can be understood by looking at the band structure plot in Fig. 15. The upper Hubbard band crosses the chemical potential between the \(\varGamma \) and X points and this gives a discontinuity in the occupation. There is also a strong wave vector dependence of the relative spectral weight of the two bands, which accounts for the continuous dependence of the occupation in the intermediate region where one of the bands is occupied.

Fig. 15

Characterization of solution \(m_3\). DOS (top left), colormap of the average occupation in the first Brillouin zone (top right), average occupations along high symmetry lines (bottom left), and band structure (bottom right), for \(U=4\), \(\mu =0.0\) and \(a_1=3.5\) and \(\langle n_{\downarrow } \rangle = 0.30\). The red line indicates the chemical potential

The solution \(m_2\) is also characterized by the absence of a gap as can be seen in Fig. 16. There is one sharp Fermi surface and consequently, in contrast to the solution \(m_3\), there is no formation of a depleted ring around the \(\varGamma \)-point.

Fig. 16

Characterization of solution \(m_2\). DOS (top left), colormap of the average occupation in the first Brillouin zone (top right), average occupations along high symmetry lines (bottom left), and band structure (bottom right), for \(U=4\), \(\mu =-0.4\) and \(a_1=0.6\) and \(\langle n_{\downarrow } \rangle = 0.30\). The red line indicates the chemical potential

As in the strong coupling case above, there exist discontinuities in some curves in Fig. 13. In particular, for \(\mu =0.3\) there is a discontinuity in \(\langle F \rangle (a_1)\) around \(a_1=1.8\) that is barely visible in the figure. Its origin is a Lifshitz-type transition in which the Fermi surface changes topology, passing from a connected to a disconnected one, as can be seen in Fig. 17. The discontinuity near \(a_1 \approx -3.2\) at \(\mu =0.3\) is of the same type as the one considered above, see Fig. 12.

Fig. 17

Colormap of the occupation in the first Brillouin zone for \(U=4\) and \(\mu =0.3\), for \(a_1=1.7\) (top left) and \(a_1=1.9\) (top right). Zoom-in of the band structure for the corresponding two cases (lower two panels), making the topological transition in the shape of the Fermi surface apparent. The red line indicates the chemical potential

9 Conclusions and outlook

In the context of the Green’s function equation of motion method, we disclosed the connection between the algebra of the operators and their evolution, stressing that the Hermiticity of the E-matrix is a fundamental property that all physical theories must satisfy. We also found that for an arbitrary truncation the Hermiticity of the E-matrix is generally violated, which leads to unphysical approximations for the Green’s function.

To overcome this type of problem, a novel truncation scheme for the equations of motion based on a partial orthogonalization was developed in the context of the hierarchy of the operators. The main outcome of this procedure is an approximation for the fermionic Green’s function, which can in principle be extended to an arbitrary number of poles. The extension to more poles is possible but technically cumbersome: not all the variables in the theory would be determined by constraints, leading to an increasingly complicated minimization problem. On the other hand, it might be necessary to include additional operators to capture some coherent excitations that are missed if these operators are not explicitly considered. From a physical point of view, another important way to extend the theory is to include the effects of the neglected operators in some way, for example through the Mori–Zwanzig method [19, 20] or the irreducible self-energy method [5]. It is known that this type of extension generates more accurate spectral functions, for example by giving lifetimes to quasiparticles and dressing fermionic quasiparticles with bosons, such as plasmons. Work along these lines is in progress [21].

We applied this truncation scheme to a two-pole approximation for the Hubbard model, showing that the Hubbard-I and Mancini results can be obtained as particular choices within a much wider range of decoupling possibilities. We introduced a variational procedure to determine the partial orthogonalization parameter(s). By employing it we analyzed a set of possible solutions for the two-pole approximation for the Hubbard model and showed that, independently of the choice of the orthogonalization parameter, both the atomic limit and the non-interacting limit are obtained as special cases at half-filling. Furthermore, the solutions obtained suggest the presence of a Mott metal-insulator transition both in the strong coupling regime and in the intermediate one. In the latter case we also find three competing solutions: one with insulating character and two with metallic character, distinguished by different occupations in the first Brillouin zone and a different number of Fermi surfaces. We want to stress that the variational procedure proposed in this paper to fix the parameters is not the only option available, and in principle any decoupling parameters that satisfy the algebra constraint should be considered valid. Nevertheless, this method provides a transparent way to determine the part of the Green’s function that is neglected from the original theory and to constrain its total spectral weight. This enables further refinements of the approximate Green’s function, where the effect of the neglected part can be incorporated in the theory using an adequate form of the self-energy.

Finally, it is important to recall that this scheme can be applied to the study of both fermionic and bosonic systems. Various applications of this novel decoupling scheme, including cases with broken symmetries, are planned for future work.