3.1 From Physical Problems to Two-Dimensional Tensor Networks

3.1.1 Classical Partition Functions

The partition function, which depends on the variables of a thermodynamic state such as temperature and volume, contains the statistical information of a system in thermodynamic equilibrium. From its derivatives of different orders, one can calculate the energy, free energy, entropy, and so on. Levin and Nave pointed out in Ref. [1] that the partition functions of statistical lattice models with local interactions (such as the Ising and Potts models) can be written in the form of a TN. Without loss of generality, we take the square lattice as an example.

Let us start from the simplest case: the classical Ising model on a single square with only four sites. The four Ising spins, denoted by \(s_i\) (\(i = 1, 2, 3, 4\)), are located at the four corners of the square, as shown in Fig. 3.1a; each spin can be up or down, represented by \(s_i = 0\) and \(1\), respectively. The classical Hamiltonian of such a system reads

$$\displaystyle \begin{aligned} \begin{array}{rcl} H_{s_1s_2s_3s_4}=J(s_1s_2 + s_2s_3 + s_3s_4 + s_4s_1) - h (s_1+s_2+s_3+s_4), \end{array} \end{aligned} $$
(3.1)

with J the coupling constant and h the magnetic field.

Fig. 3.1

(a) Four Ising spins (blue balls with arrows) sitting on a single square; the red lines represent the interactions. The blue block is the tensor T (Eq. (3.2)), with the black lines denoting the indexes of T. (b) The graphic representation of the TN on a larger lattice with more than one square. (c) The TN construction of the partition function on the infinite square lattice

When the model reaches equilibrium at temperature T, the probability of each spin configuration is proportional to the Boltzmann factor

$$\displaystyle \begin{aligned} \begin{array}{rcl} T_{s_1s_2s_3s_4}=e^{-\beta H_{s_1s_2s_3s_4}}, {} \end{array} \end{aligned} $$
(3.2)

with the inverse temperature β = 1∕T. Obviously, Eq. (3.2) defines a fourth-order tensor T, where each element gives the relative probability of the corresponding configuration.

The partition function is defined as the sum of the Boltzmann factors over all configurations. In the language of tensors, it is obtained by simply summing over all indexes of T as

$$\displaystyle \begin{aligned} \begin{array}{rcl} Z=\sum_{s_1s_2s_3s_4} T_{s_1s_2s_3s_4}. \end{array} \end{aligned} $$
(3.3)
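As a concrete illustration, the following minimal numpy sketch builds the tensor of Eq. (3.2) and sums it per Eq. (3.3); the values of J, h, and β are hypothetical.

```python
import numpy as np

# Hypothetical parameters for illustration
J, h, beta = -1.0, 0.0, 0.5

# Build the fourth-order tensor T of Eq. (3.2) for spins s_i = 0, 1
T = np.zeros((2, 2, 2, 2))
for s1 in (0, 1):
    for s2 in (0, 1):
        for s3 in (0, 1):
            for s4 in (0, 1):
                H = J * (s1*s2 + s2*s3 + s3*s4 + s4*s1) - h * (s1+s2+s3+s4)
                T[s1, s2, s3, s4] = np.exp(-beta * H)  # Boltzmann factor

Z = T.sum()  # Eq. (3.3): sum over all four indexes
print(Z)
```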

Let us proceed a little bit further by considering four squares, whose partition function can be written in a TN with four tensors (Fig. 3.1b) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} Z = \sum_{\{ss'\}} T_{s_1s_2s_2^{\prime}s_1^{\prime}} T_{s_2^{\prime}s_3s_4s_3^{\prime}} T_{s_4^{\prime}s_3^{\prime}s_5s_6} T_{s_8s_1^{\prime}s_4^{\prime}s_7}. \end{array} \end{aligned} $$
(3.4)

Each of the indexes {s′} inside the TN is shared by two tensors, representing a spin that appears in both of the neighboring squares. The partition function is obtained by summing over all indexes.

For the infinite square lattice, the probability of a certain spin configuration \((s_1, s_2, \cdots)\) is given by the product of an infinite number of tensor elements as

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} e^{-\beta H_{\{s\}}}=e^{-\beta H_{s_1s_2s_3s_4}} e^{-\beta H_{s_4s_5s_6s_7}} \cdots =T_{s_1s_2s_3s_4}T_{s_4s_5s_6s_7} \cdots \end{array} \end{aligned} $$
(3.5)

Then the partition function is given by the contraction of an infinite TN formed by the copies of T (Eq. (3.2)) as

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} Z=\sum_{\{s\}} \prod_{n} T_{s^n_1 s^n_2 s^n_3 s^n_4}, {} \end{array} \end{aligned} $$
(3.6)

where two indexes satisfy \(s^n_j = s^m_k\) if they refer to the same Ising spin. The graphic representation of Eq. (3.6) is shown in Fig. 3.1c. One can see that on the square lattice, the TN itself also has the geometry of a square lattice. In fact, this construction gives a TN with the geometry of the dual lattice of the system, and the square lattice is its own dual.

For the Q-state Potts model on the square lattice, the partition function has the same TN representation as that of the Ising model, except that the elements of the tensor are given by the Boltzmann weight of the Potts model and the dimension of each index is Q. Note that the Potts model with Q = 2 is equivalent to the Ising model.

Another example is the eight-vertex model proposed by Baxter in 1971 [2]. It is one of the “ice-type” statistical lattice models and can be considered as the classical correspondent of the Z2 spin-liquid state. The tensor that gives the TN of the partition function is also of dimension (2 × 2 × 2 × 2); its elements are

$$\displaystyle \begin{aligned} T_{s_1s_2s_3s_4}= \begin{cases} 1, \ \ s_1+s_2+s_3+s_4 \ \text{even};\\ 0, \ \ \text{otherwise}. \end{cases} \end{aligned} $$
(3.7)

We shall remark that there is more than one way to define the TN of the partition function of a classical system. For example, when only nearest-neighbor couplings exist, one can define a matrix \(M_{ss'} = e^{-\beta H_{ss'}}\) on each bond and put on each site a super-diagonal tensor I (also called the copy tensor) defined as

$$\displaystyle \begin{aligned} I_{s_1,\cdots, s_N}= \begin{cases} 1, \ \ s_1 = \cdots = s_N;\\ 0, \ \ \text{otherwise}. \end{cases} \end{aligned} $$
(3.8)

Then the TN of the partition function is the contraction of copies of M and I, and possesses exactly the same geometry as the original lattice (instead of the dual one).
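As a sanity check of this construction, the sketch below contracts M and copy tensors for an Ising ring (coordination number two, so the copy tensor reduces to the identity matrix) and compares with a brute-force sum; J, β, and the ring length L are hypothetical.

```python
import numpy as np

J, beta, L = -1.0, 0.4, 6       # hypothetical parameters
s = np.array([0.0, 1.0])        # spin values s = 0, 1

M = np.exp(-beta * J * np.outer(s, s))  # bond matrix M_{ss'} = e^{-beta H_{ss'}}
I = np.eye(2)                           # copy tensor of a 2-coordinated site

# Contracting the ring of alternating M and I gives Z = Tr[(M I)^L]
Z = np.trace(np.linalg.matrix_power(M @ I, L))

# Brute-force check over all 2^L configurations
Z_exact = sum(
    np.exp(-beta * sum(J * ((c >> n) & 1) * ((c >> ((n + 1) % L)) & 1)
                       for n in range(L)))
    for c in range(2 ** L))
print(Z, Z_exact)  # the two numbers agree
```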

3.1.2 Quantum Observables

With a TN state, computing quantum observables such as \(\langle \psi | \hat {O} | \psi \rangle \) and 〈ψ|ψ〉 amounts to contracting a scalar TN, where \(\hat {O}\) can be any operator. For a 1D MPS , this can be easily done, since one only needs to deal with a 1D TN stripe. For a 2D PEPS , such calculations become contractions of 2D TNs. Taking 〈ψ|ψ〉 as an example, the TN of this inner product is the contraction of the copies of the local tensor (Fig. 3.1c) defined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} T_{a_1a_2a_3a_4} = \sum_{s} P_{s,a^{\prime\prime}_1a^{\prime\prime}_2a^{\prime\prime}_3a^{\prime\prime}_4}^{\ast} P_{s,a_1^{\prime}a_2^{\prime}a_3^{\prime}a_4^{\prime}}, \end{array} \end{aligned} $$
(3.9)

with P the tensor of the PEPS and \(a_i = (a^{\prime }_i, a^{\prime \prime }_i)\). There are no open indexes left, and the TN gives the scalar 〈ψ|ψ〉. The TN for computing the observable \(\langle \hat {O} \rangle \) is similar. The only difference is that a small number of the tensors \(T_{a_1a_2a_3a_4}\) in the original TN of 〈ψ|ψ〉 are substituted with “impurities” at the sites where the operators are located. Taking a one-body operator as an example, the “impurity” tensor at site i can be defined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \widetilde{T}_{a_1a_2a_3a_4}^{[i]} = \sum_{s,s'} P_{s,a^{\prime\prime}_1a^{\prime\prime}_2a^{\prime\prime}_3a^{\prime\prime}_4}^{\ast} \hat{O}_{s,s'}^{[i]} P_{s',a_1^{\prime}a_2^{\prime}a_3^{\prime}a_4^{\prime}}. \end{array} \end{aligned} $$
(3.10)

In such a case, a single-site observable is given by the ratio of two TN contractions as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \frac{\langle \psi| \hat{O}^{[i]}|\psi \rangle}{\langle \psi| \psi \rangle} = \frac{\text{tTr }\ \widetilde{T}^{[i]} \prod_{n\neq i} T}{\text{tTr }\ \prod_{n=1}^{N} T}. \end{array} \end{aligned} $$
(3.11)

For some non-local observables, e.g., correlation functions, the contraction for \(\langle \psi | \hat {O}^{[i]} \hat {O}^{[j]} |\psi \rangle \) simply requires adding another “impurity” as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \langle \psi| \hat{O}^{[i]} \hat{O}^{[j]} |\psi \rangle = \text{tTr }\ \widetilde{T}^{[i]}\widetilde{T}^{[j]} \prod_{n\neq i,j}^{N} T. \end{array} \end{aligned} $$
(3.12)
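A minimal numpy sketch of Eqs. (3.9) and (3.10) is given below; the PEPS tensor P is random, and the bond dimensions and the one-body operator (a Pauli-z matrix) are assumptions made for illustration.

```python
import numpy as np

d, D = 2, 3                                # assumed physical and bond dimensions
P = np.random.rand(d, D, D, D, D)          # hypothetical PEPS tensor P[s,a1,a2,a3,a4]
O = np.array([[1.0, 0.0], [0.0, -1.0]])    # e.g., a Pauli-z one-body operator

# Eq. (3.9): double tensor with each a_i = (a'_i, a''_i) fused to dimension D^2
T = np.einsum('sabcd,sefgh->aebfcgdh',
              P.conj(), P).reshape(D*D, D*D, D*D, D*D)

# Eq. (3.10): impurity tensor with the operator sandwiched between the layers
Ti = np.einsum('sabcd,st,tefgh->aebfcgdh',
               P.conj(), O, P).reshape(D*D, D*D, D*D, D*D)
```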

3.1.3 Ground-State and Finite-Temperature Simulations

Ground-state simulations of 1D quantum models with short-range interactions can also be efficiently transformed into 2D TN contractions. The task is to minimize the energy

$$\displaystyle \begin{aligned} \begin{array}{rcl} E=\frac{\langle \psi | \hat{H} | \psi \rangle} {\langle \psi | \psi \rangle}, \end{array} \end{aligned} $$
(3.13)

where |ψ〉 is written as an MPS . Generally speaking, there are two ways to solve this minimization problem: (1) simply treat all the tensor elements as variational parameters; (2) simulate the imaginary-time evolution

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} |\psi_{gs}\rangle = \lim_{\beta \rightarrow \infty} \frac{e^{-\beta \hat{H}}|\psi\rangle}{\parallel{e^{-\beta \hat{H}}|\psi\rangle}\parallel}. \end{array} \end{aligned} $$
(3.14)

The first way can be realized by, e.g., Monte Carlo methods, where one randomly changes or chooses the value of each tensor element to locate the minimum of the energy. One can also use the Newton method and solve the partial-derivative equations \(\partial E / \partial x_n = 0\), with \(x_n\) standing for an arbitrary variational parameter. In either case, it is inevitable to calculate E (i.e., \(\langle \psi | \hat {H} | \psi \rangle \) and 〈ψ|ψ〉), which amounts to contracting the corresponding TNs as explained above.

We shall stress that without TN , the dimension of the ground state (i.e., the number of variational parameters) increases exponentially with the system size, which makes the ground-state simulations impossible for large systems.

The second way of computing the ground state, with imaginary-time evolution, is more or less like an “annealing” process. One starts from an arbitrarily chosen initial state and acts with the imaginary-time evolution operator on it. The “temperature” is lowered a little at each step, until the state reaches a fixed point. Mathematically speaking, by using the Trotter-Suzuki decomposition, such an evolution is written as a TN defined on a (D + 1)-dimensional lattice, with D the spatial dimension of the model.

Here, we take a 1D chain as an example. We assume that the Hamiltonian only contains at most nearest-neighbor couplings, which reads

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} \hat{H}=\sum_{n} \hat{h}_{n,n+1}, \end{array} \end{aligned} $$
(3.15)

with \(\hat {h}_{n,n+1}\) containing the on-site and two-body interactions of the n-th and (n + 1)-th sites. It is useful to divide \(\hat {H}\) into two groups, \(\hat {H}=\hat {H}^e+\hat {H}^{o}\), as

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} \begin{aligned} \hat{H}^e \equiv\sum_{even\ n}\hat{h}_{n,n+1}, \ \ \ \hat{H}^{o} \equiv\sum_{odd\ n}\hat{h}_{n,n+1}. \end{aligned} \end{array} \end{aligned} $$
(3.16)

By doing so, any two terms within \(\hat {H}^e\) (or within \(\hat {H}^{o}\)) commute with each other. Then the evolution operator \(\hat {U}(\tau )\) for an infinitesimal imaginary time τ → 0 can be written as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{U}(\tau) = e^{-\tau \hat{H}} = e^{-\tau \hat{H}^e} e^{-\tau \hat{H}^o} + O\left(\tau^2\right) \left[\hat{H}^e, \hat{H}^o\right]. {} \end{array} \end{aligned} $$
(3.17)

If τ is small enough, the high-order terms are negligible, and the evolution operator becomes

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{U}(\tau) \simeq \prod_{n} \hat{U}(\tau)_{n,n+1}, {} \end{array} \end{aligned} $$
(3.18)

with the two-site evolution operator \(\hat {U}(\tau )_{n,n+1} = e^{-\tau \hat {h}_{n,n+1}}\).

The above procedure is known as the first-order Trotter-Suzuki decomposition [3,4,5]. Note that higher-order decompositions can also be adopted. For example, one may use the second-order Trotter-Suzuki decomposition, which is written as

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} e^{-\tau \hat{H}} \simeq e^{-\frac{\tau}{2} \hat{H}^e} e^{-\tau \hat{H}^o} e^{-\frac{\tau}{2} \hat{H}^e}. \end{array} \end{aligned} $$
(3.19)

With Eq. (3.18), the time evolution can be transformed into a TN, where the local tensor is given by the coefficients of \(\hat {U}(\tau )_{n,n+1}\), satisfying

$$\displaystyle \begin{aligned} \begin{array}{rcl} T_{s_n s_{n+1} s^{\prime}_n s^{\prime}_{n+1}} = \langle s^{\prime}_n s^{\prime}_{n+1} | \hat{U}(\tau)_{n,n+1} | s_n s_{n+1}\rangle. {} \end{array} \end{aligned} $$
(3.20)

Such a TN is defined on a two-dimensional plane, whose two directions correspond to space and (real or imaginary) time, respectively. The initial state is located at the bottom of the TN (β = 0), and its evolution becomes a TN contraction, which can be efficiently computed by the TN algorithms presented later.
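As an illustration of Eq. (3.20), the sketch below builds the local tensor from the two-site gate of a transverse-field Ising chain; the Hamiltonian term, the couplings, and the Trotter step τ are assumptions made for the example.

```python
import numpy as np

sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])
Id = np.eye(2)

tau, J, hx = 0.01, 1.0, 0.5   # hypothetical Trotter step and couplings
# Two-site term h_{n,n+1}, with the field split evenly between the bonds
h2 = -J * np.kron(sz, sz) - hx * (np.kron(sx, Id) + np.kron(Id, sx)) / 2

# Matrix exponential U = exp(-tau h_{n,n+1}) via eigendecomposition
w, v = np.linalg.eigh(h2)
U = (v * np.exp(-tau * w)) @ v.T

# Eq. (3.20): T[s_n, s_{n+1}, s'_n, s'_{n+1}] = <s'_n s'_{n+1}|U|s_n s_{n+1}>
T = U.reshape(2, 2, 2, 2).transpose(2, 3, 0, 1)
```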

In addition, one can readily see that the evolution of a 2D state leads to the contraction of a 3D TN. Such a TN scheme provides a straightforward picture to understand the equivalence between a (d + 1)-dimensional classical and a d-dimensional quantum theory. Similarly, the finite-temperature simulations of a quantum system can be transferred to TN contractions with Trotter-Suzuki decomposition. For the density operator \(\hat {\rho }(\beta ) = e^{-\beta \hat {H}}\), the TN is formed by the same tensor given by Eq. (3.20).

3.2 Tensor Renormalization Group

In 2007, Levin and Nave proposed the TRG approach [1] to contract the TNs of 2D classical lattice models. In 2008, Gu et al. further developed TRG to handle 2D quantum topological phases [6]. TRG can be considered as a coarse-graining contraction algorithm. To introduce the TRG algorithm, let us consider a square TN formed by an infinite number of copies of a fourth-order tensor \(T_{a_1a_2a_3a_4}\) (see the left side of Fig. 3.2).

Fig. 3.2

For an infinite square TN with translational invariance, the renormalization in the TRG algorithm is realized by two local operations on the local tensor. After each iteration, the bond dimensions of the tensor and the geometry of the network remain unchanged

Contraction and Truncation

The idea of TRG is to iteratively “coarse-grain” the TN without changing the bond dimensions, the geometry of the network, or the translational invariance. Such a process is realized by two local operations in each iteration. Let us denote the tensor in the t-th iteration as \(T^{(t)}\) (we take \(T^{(0)} = T\)). To obtain \(T^{(t+1)}\), the first step is to decompose \(T^{(t)}\) by SVD in two different ways (Fig. 3.2) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} T^{(t)}_{a_1a_2a_3a_4} = \sum_{b} U_{a_1a_2b}V_{a_3a_4b}, \end{array} \end{aligned} $$
(3.21)
$$\displaystyle \begin{aligned} \begin{array}{rcl} T^{(t)}_{a_1a_2a_3a_4} = \sum_{b} X_{a_4a_1b}Y_{a_2a_3b}. {} \end{array} \end{aligned} $$
(3.22)

Note that the singular-value spectrum can be absorbed by multiplying it into the tensor(s), and the dimension of the new index satisfies \(\dim(b) = \chi^2\), with χ the dimension of each bond of \(T^{(t)}\).

The purpose of the first step is to deform the TN, so that in the second step, a new tensor \(T^{(t+1)}\) can be obtained by contracting the four tensors that form a square (Fig. 3.2) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} T^{(t+1)}_{b_1b_2b_3b_4} \leftarrow \sum_{a_1a_2a_3a_4} V_{a_1a_2b_1} Y_{a_2a_3b_2} U_{a_3a_4b_3} X_{a_4a_1b_4}. {} \end{array} \end{aligned} $$
(3.23)

We use an arrow instead of the equal sign because one may need to divide the tensor by a proper number to keep the values of the elements from diverging. The arrows will be used in the same way below.

These two steps define the contraction strategy of TRG. In the first step, the number of tensors in the TN (i.e., the size of the TN ) increases from N to 2N, and in the second step, it decreases from 2N to N∕2. Thus, after t iterations, the number of tensors decreases to \(\frac {1}{2^t}\) of its original number. For this reason, TRG is an exponential contraction algorithm.

Error and Environment

The dimension of the tensor at the t-th iteration becomes \(\chi ^{2^t}\) if no truncations are implemented. This means that truncations of the bond dimensions are necessary. In the original proposal, the dimension is truncated by keeping only the singular vectors of the χ largest singular values in Eq. (3.22). Then the new tensor \(T^{(t+1)}\) obtained by Eq. (3.23) has exactly the same dimension as \(T^{(t)}\).

Each truncation will inevitably introduce some error, which is called the truncation error. Consistent with Eq. (2.7), the truncation error is quantified by the discarded singular values λ as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varepsilon = \frac{\sqrt{\sum_{b=\chi}^{\chi^2-1} \lambda_{b}^2}}{\sqrt{\sum_{b=0}^{\chi^2-1} \lambda_{b}^2}}. {} \end{array} \end{aligned} $$
(3.24)
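A small helper that evaluates Eq. (3.24) from a given singular-value spectrum might look as follows (a sketch; the function name is ours):

```python
import numpy as np

def truncation_error(lams, chi):
    # Eq. (3.24): relative weight of the discarded singular values
    lams = np.sort(np.asarray(lams))[::-1]
    return np.sqrt((lams[chi:] ** 2).sum()) / np.sqrt((lams ** 2).sum())
```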

According to linear algebra, ε in fact gives the error of the SVD in Eq. (3.22), meaning that such a truncation minimizes the error of reducing the rank of \(T^{(t)}\), which reads

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varepsilon = |T^{(t)}_{a_1a_2a_3a_4} - \sum_{b=0}^{\chi-1} U_{a_1a_2b} V_{a_3a_4b}|. {} \end{array} \end{aligned} $$
(3.25)

One may repeat the contraction-and-truncation process until \(T^{(t)}\) converges. It usually takes only ∼10 steps, after which one has in fact contracted a TN of \(2^t\) tensors into a single tensor.
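The following numpy sketch implements one TRG iteration (Eqs. (3.21)–(3.23)) with the truncation described above; absorbing the square root of the singular values into both factors is one common convention that we assume here.

```python
import numpy as np

def trg_step(T, chi):
    """One TRG iteration (a sketch): split T two ways by truncated SVD,
    Eqs. (3.21)-(3.22), then rebuild the coarse-grained tensor, Eq. (3.23)."""
    D = T.shape[0]

    # Split 1, Eq. (3.21): group (a1,a2) vs (a3,a4)
    u, lam, vh = np.linalg.svd(T.reshape(D*D, D*D), full_matrices=False)
    k = min(chi, len(lam))
    U = (u[:, :k] * np.sqrt(lam[:k])).reshape(D, D, k)          # U_{a1 a2 b}
    V = (vh[:k, :].T * np.sqrt(lam[:k])).reshape(D, D, k)       # V_{a3 a4 b}

    # Split 2, Eq. (3.22): group (a4,a1) vs (a2,a3)
    M2 = T.transpose(3, 0, 1, 2).reshape(D*D, D*D)
    u2, lam2, vh2 = np.linalg.svd(M2, full_matrices=False)
    k2 = min(chi, len(lam2))
    X = (u2[:, :k2] * np.sqrt(lam2[:k2])).reshape(D, D, k2)     # X_{a4 a1 b}
    Y = (vh2[:k2, :].T * np.sqrt(lam2[:k2])).reshape(D, D, k2)  # Y_{a2 a3 b}

    # Eq. (3.23): contract the square of V, Y, U, X into the new tensor
    Tnew = np.einsum('abi,bcj,cdk,dal->ijkl', V, Y, U, X)
    return Tnew / np.abs(Tnew).max()   # normalize to avoid divergence
```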

The truncation is optimized according to the SVD of \(T^{(t)}\); thus, \(T^{(t)}\) is called the environment. In general, the tensor(s) that determine the truncations are called the environment, which is a key factor for the accuracy and efficiency of the algorithm. For algorithms that use local environments, like TRG, the efficiency is relatively high since the truncations are easy to compute, but the accuracy is bounded since the truncations are optimized according to only local information (in TRG, the local partitioning of \(T^{(t)}\)).

One may choose other tensors, or even the whole TN, as the environment. In 2009, Xie et al. proposed the second renormalization group (SRG) algorithm [7]. The idea is that in each truncation step of TRG , one defines the global environment, a fourth-order tensor \(\mathscr {E}_{a_1^{\tilde {n}} a_2^{\tilde {n}} a_3^{\tilde {n}} a_4^{\tilde {n}}} = \sum _{\{a\}} \prod _{n \neq \tilde {n}} T^{(n, t)}_{a_1^na_2^na_3^na_4^n}\), with \(T^{(n, t)}\) the n-th tensor in the t-th step and \(\tilde {n}\) the tensor to be truncated. \(\mathscr {E}\) is the contraction of the whole TN after removing \(T^{(\tilde {n}, t)}\), and is computed by TRG. The truncation is then obtained not from the SVD of \(T^{(\tilde {n}, t)}\), but from the SVD of \(\mathscr {E}\). The word “second” in the name of the algorithm comes from the fact that in each step of the original TRG, a second TRG is used to calculate the environment. SRG is obviously more costly, but bears much higher accuracy than TRG. The balance between accuracy and efficiency, which can be controlled by the choice of environment, is one main factor to consider when developing or choosing TN algorithms.

3.3 Corner Transfer Matrix Renormalization Group

In the 1960s, the corner transfer matrix (CTM) idea was developed originally by Baxter in Refs. [8, 9] and in his book [10]. Such ideas and methods have been applied to various models, for example, the chiral Potts model [11,12,13], the eight-vertex model [2, 14, 15], and the 3D Ising model [16]. Combining CTM with DMRG , Nishino and Okunishi proposed CTMRG [17] in 1996 and applied it to several models [17,18,19,20,21,22,23,24,25,26,27]. In 2009, Orús and Vidal further developed CTMRG to deal with TNs [28]. The idea is to introduce eight variational tensors to be optimized on the boundary, namely four corner transfer matrices \(C^{[1]}, C^{[2]}, C^{[3]}, C^{[4]}\) and four row (column) tensors \(R^{[1]}, R^{[2]}, R^{[3]}, R^{[4]}\), and then to contract the tensors of the TN into these variational tensors in the specific order shown in Fig. 3.3. The TN contraction is considered solved when the variational tensors converge in this contraction process. Compared with the boundary-state methods introduced in the next section, the tensors in CTMRG define states on both the boundaries and the corners.

Fig. 3.3

Overview of the CTMRG contraction scheme. The tensors in the TN are contracted to the variational tensors defined on the edges and corners

Contraction

In each iteration step of CTMRG , one chooses two corner matrices on the same side and the row tensor between them, e.g., \(C^{[1]}\), \(C^{[2]}\), and \(R^{[2]}\). The update of these tensors (Fig. 3.4) follows

$$\displaystyle \begin{aligned} \begin{array}{rcl} \tilde{C}^{[1]}_{\tilde{b}_2b^{\prime}_1} &\displaystyle \leftarrow&\displaystyle \sum_{b_1} C^{[1]}_{b_1b_2}R^{[1]}_{b_1a_1b^{\prime}_1}, \end{array} \end{aligned} $$
(3.26)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \tilde{R}^{[2]}_{\tilde{b}_2 a_4 \tilde{b}_3} &\displaystyle \leftarrow&\displaystyle \sum_{a_2}R^{[2]}_{b_2a_2b_3}T_{a_1a_2a_3a_4}, {} \end{array} \end{aligned} $$
(3.27)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \tilde{C}^{[2]}_{\tilde{b}_3b^{\prime}_4} &\displaystyle \leftarrow&\displaystyle \sum_{b_4} C^{[2]}_{b_3b_4}R^{[3]}_{b_4a_3b^{\prime}_4}, \end{array} \end{aligned} $$
(3.28)

where \(\tilde {b}_2=(b_2,a_1)\) and \(\tilde {b}_3=(b_3,a_3)\).

Fig. 3.4

The first arrow shows the absorption of the tensors \(R^{[1]}\), T, and \(R^{[3]}\) to renew the tensors \(C^{[1]}\), \(R^{[2]}\), and \(C^{[2]}\) in a left move. The second arrow shows the truncation of the enlarged bonds of \(\tilde {C}^{[1]}\), \(\tilde {R}^{[2]}\), and \(\tilde {C}^{[2]}\). The inset shows how the truncation matrix Z is obtained

After the contraction given above, one column of the TN (as well as the corresponding row tensors \(R^{[1]}\) and \(R^{[3]}\)) has been contracted. Then one chooses other corner matrices and row tensors (such as \(\tilde {C}^{[1]}\), \(C^{[4]}\), and \(R^{[1]}\)) and implements similar contractions. By iterating this procedure, the TN is contracted in the way shown in Fig. 3.3.

Note that for a finite TN , the initial corner matrices and row tensors should be taken as the tensors located on the boundary of the TN. For an infinite TN, they can be initialized randomly, and the contraction should be iterated until the preset convergence is reached.

CTMRG can be regarded as a polynomial contraction scheme: the number of tensors contracted at each step is determined by the length of the boundary of the TN at that iteration. Taking a 2D TN defined on an (L × L) square lattice as an example, the length of each side is L − 2t at the t-th step, so the boundary length of the TN (i.e., the number of tensors contracted at the t-th step) depends linearly on t as 4(L − 2t) − 4. For a 3D TN such as a cubic TN, the boundary size scales as \(6(L-2t)^2 - 12(L-2t) + 8\); thus a CTMRG for a 3D TN (if it existed) would give a polynomial contraction.

Truncation

One can see that after the contraction in each iteration step, the bond dimensions of the variational tensors increase. Truncations are then needed to prevent the excessive growth of the bond dimensions. In Ref. [28], the truncation is obtained by inserting a pair of isometries V  and V† into the enlarged bonds. A reasonable (but not the only) choice of V  for a translationally invariant TN is obtained from the eigenvalue decomposition of the sum of corner transfer matrices as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{b}\tilde{C}^{[1]\dagger}_{\tilde{b}b}{\tilde{C}^{[1]}}_{\tilde{b}'b} + \sum_{b}\tilde{C}^{[2]\dagger}_{\tilde{b}b}{\tilde{C}^{[2]}}_{\tilde{b}'b} \simeq \sum_{b=0}^{\chi-1} V_{\tilde{b}b} \varLambda_{b} V^{*}_{\tilde{b}'b}. {} \end{array} \end{aligned} $$
(3.29)

Only the χ largest eigenvalues are preserved. Therefore, V  is a matrix of dimension \(D\chi \times \chi\), where D is the bond dimension of T and χ is the dimension cut-off. We then truncate \(\tilde {C}^{[1]}\), \(\tilde {R}^{[2]}\), and \(\tilde {C}^{[2]}\) using V  as

$$\displaystyle \begin{aligned} \begin{array}{rcl} C^{[1]}_{b^{\prime}_1b_2} &\displaystyle =&\displaystyle \sum_{\tilde{b}_2}\tilde{C}^{[1]}_{\tilde{b}_2b^{\prime}_1}V^{*}_{\tilde{b}_2b_2}, \end{array} \end{aligned} $$
(3.30)
$$\displaystyle \begin{aligned} \begin{array}{rcl} R^{[2]}_{b_2a_4b_3} &\displaystyle =&\displaystyle \sum_{\tilde{b}_2,\tilde{b}_3}\tilde{R}^{[2]}_{\tilde{b}_2a_4\tilde{b}_3} V_{\tilde{b}_2b_2} V^{*}_{\tilde{b}_3b_3},{} \end{array} \end{aligned} $$
(3.31)
$$\displaystyle \begin{aligned} \begin{array}{rcl} C^{[2]}_{b_3b^{\prime}_4} &\displaystyle =&\displaystyle \sum_{\tilde{b}_3}\tilde{C}^{[2]}_{\tilde{b}_3b^{\prime}_4} V_{\tilde{b}_3b_3}. \end{array} \end{aligned} $$
(3.32)
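A compact numpy sketch of one such absorption-and-truncation move (Eqs. (3.26)–(3.32)) is given below; the index orders follow the text, while the function signature and the final normalization are assumptions of this example.

```python
import numpy as np

def ctmrg_move(C1, R1, R2, R3, C2, T, chi):
    """One left move (a sketch): absorb a column, Eqs. (3.26)-(3.28),
    then truncate the enlarged bonds, Eqs. (3.29)-(3.32)."""
    D = T.shape[0]

    # Absorption, Eqs. (3.26)-(3.28)
    C1t = np.einsum('ij,iak->jak', C1, R1).reshape(-1, R1.shape[2])
    R2t = np.einsum('ibj,abcd->iadjc', R2, T).reshape(-1, D, R2.shape[2] * D)
    C2t = np.einsum('ij,jak->iak', C2, R3).reshape(-1, R3.shape[2])

    # Truncation matrix from the dominant eigenvectors of Eq. (3.29)
    rho = C1t @ C1t.conj().T + C2t @ C2t.conj().T
    w, v = np.linalg.eigh(rho)
    Z = v[:, np.argsort(w)[::-1][:chi]]   # the isometry (V in the text)

    # Projection onto the kept basis, Eqs. (3.30)-(3.32)
    C1n = C1t.T @ Z.conj()                                # C[1]_{b'1 b2}
    R2n = np.einsum('iaj,ib,jc->bac', R2t, Z, Z.conj())   # R[2]_{b2 a4 b3}
    C2n = Z.T @ C2t                                       # C[2]_{b3 b'4}

    # Normalize to avoid divergence (cf. the arrow convention above)
    return (C1n / np.abs(C1n).max(),
            R2n / np.abs(R2n).max(),
            C2n / np.abs(C2n).max())
```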

Error and Environment

As in TRG and TEBD , the truncations are obtained from the matrix decompositions of certain tensors that define the environment. From Eq. (3.29), the environment in CTMRG is the loop formed by the corner matrices and row tensors. Note that symmetries may be exploited to accelerate the computation. For example, one may take \(C^{[1]} = C^{[2]} = C^{[3]} = C^{[4]}\) and \(R^{[1]} = R^{[2]} = R^{[3]} = R^{[4]}\) when the TN has rotational and reflection symmetries (i.e., \(T_{a_1a_2a_3a_4}\) is invariant under any permutation of its indexes).

3.4 Time-Evolving Block Decimation: Linearized Contraction and Boundary-State Methods

The TEBD algorithm by Vidal was developed originally for simulating the time evolution of 1D quantum models [29,30,31]. The (finite and infinite) TEBD algorithm has been widely applied to a variety of issues, such as criticality in quantum many-body systems (e.g., [32,33,34]), topological phases [35], many-body localization [36,37,38], and the thermodynamic properties of quantum many-body systems [39,40,41,42,43,44,45].

In the language of TN , TEBD solves TN contraction problems in a linearized manner, and the truncation is calculated in the context of an MPS. In the following, let us explain the infinite TEBD (iTEBD) algorithm [31] (Fig. 3.5), again taking the infinite square TN formed by the copies of a fourth-order tensor T as an example. In each step, a row of tensors (which can be regarded as an MPO ) is contracted into an MPS |ψ〉. Inevitably, the bond dimensions of the tensors in the MPS increase exponentially as the contractions proceed. Therefore, truncations are necessary to prevent the bond dimensions from diverging. The truncations are determined by minimizing the distance between the MPSs before and after the truncation. After the MPS |ψ〉 converges, the TN contraction becomes 〈ψ|ψ〉, which can be computed exactly and easily.

Fig. 3.5

The illustration of the contraction and truncation in the iTEBD algorithm. In each iteration step, a row of tensors in the TN is contracted into the MPS, and truncations by SVD are implemented so that the bond dimensions of the MPS remain unchanged

Contraction

We use a two-site translationally invariant MPS, which is formed by the tensors A and B on the sites and the spectra Λ and Γ on the bonds, as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{\{a\}} \cdots \varLambda_{a_{n-1}} A_{s_{n-1}, a_{n-1} a_{n}} \varGamma_{a_{n}} B_{s_{n}, a_{n} a_{n+1}} \varLambda_{a_{n+1}} \cdots. \end{array} \end{aligned} $$
(3.33)

In each step of iTEBD, the contraction is given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} A_{s, \tilde{a} \tilde{a}'} \leftarrow \sum_{s'} T_{s b s' b'} A_{s', a a'}, \ \ B_{s, \tilde{a} \tilde{a}'} \leftarrow \sum_{s'} T_{s b s' b'} B_{s', a a'}, {} \end{array} \end{aligned} $$
(3.34)

where the new virtual bonds are the combined indexes \(\tilde {a} = (b,a)\) and \(\tilde {a}' = (b',a')\). Meanwhile, the spectrum is also updated as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varLambda_{\tilde{a}} \leftarrow \varLambda_{a} {\mathbf{1}}_{b}, \ \ \varGamma_{\tilde{a}'} \leftarrow \varGamma_{a'} {\mathbf{1}}_{b'}, \end{array} \end{aligned} $$
(3.35)

where 1 is a vector with \({\mathbf{1}}_b = 1\) for every b.

It is easy to see that the number of tensors contracted in iTEBD grows linearly as tN, with t the number of contraction-and-truncation steps and N (→∞) the number of columns of the TN. Therefore, iTEBD (as well as finite TEBD) can be considered as a linearized contraction algorithm, in contrast to an exponential contraction algorithm like TRG .

Truncation

Truncations are needed when the dimensions of the virtual bonds exceed the preset dimension cut-off χ. In the original version of iTEBD [31] , the truncations are done by local SVDs . To truncate the virtual bond \(\tilde {a}\), for example, one defines a matrix by contracting the tensors and spectrum connected to the target bond as

$$\displaystyle \begin{aligned} \begin{array}{rcl} M_{s_1 \tilde{a}_1,s_{2} \tilde{a}_2} = \sum_{\tilde{a}} \varLambda_{\tilde{a}_1} A_{s_1, \tilde{a}_1 \tilde{a}} \varGamma_{\tilde{a}} B_{s_2, \tilde{a} \tilde{a}_2} \varLambda_{\tilde{a}_2}. \end{array} \end{aligned} $$
(3.36)

Then, perform SVD on M, keeping only the χ-largest singular values and the corresponding basis as

$$\displaystyle \begin{aligned} \begin{array}{rcl} M_{s_1 \tilde{a}_1,s_2 \tilde{a}_2} = \sum_{a=0}^{\chi-1} U_{s_1, \tilde{a}_1 a} \varGamma_{a} V_{s_2, a \tilde{a}_2}. {} \end{array} \end{aligned} $$
(3.37)

The spectrum Γ is updated by the singular values of the above SVD. The tensors A and B are also updated as

$$\displaystyle \begin{aligned} \begin{array}{rcl} A_{s_1, \tilde{a} a} = (\varLambda_{\tilde{a}})^{-1} U_{s_1, \tilde{a} a}, \ \ B_{s_2, a \tilde{a}} = V_{s_2, a \tilde{a}} (\varLambda_{\tilde{a}})^{-1}. {} \end{array} \end{aligned} $$
(3.38)

Now the truncation of the spectrum Γ and the corresponding virtual bond has been completed. Any other spectra and virtual bonds can be truncated similarly.
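The sketch below carries out this truncation (Eqs. (3.36)–(3.38)) in numpy; it assumes equal bond dimensions on the two outer bonds (by translational invariance) and a spectrum Λ without zeros.

```python
import numpy as np

def truncate_bond(A, B, Lam, Gam, chi):
    d, Dl, _ = A.shape
    Dr = B.shape[2]   # assumed equal to Dl by translational invariance

    # Eq. (3.36): contract the spectra and tensors around the target bond
    M = np.einsum('i,sij,j,tjk,k->sitk', Lam, A, Gam, B, Lam)
    M = M.reshape(d * Dl, d * Dr)

    # Eq. (3.37): SVD, keeping the chi largest singular values
    U, lam, Vh = np.linalg.svd(M, full_matrices=False)
    U, lam, Vh = U[:, :chi], lam[:chi], Vh[:chi, :]
    lam /= np.linalg.norm(lam)   # normalize the new spectrum Gamma

    # Eq. (3.38): strip the outer spectra back off
    A_new = U.reshape(d, Dl, -1) / Lam[None, :, None]
    B_new = Vh.reshape(-1, d, Dr).transpose(1, 0, 2) / Lam[None, None, :]
    return A_new, B_new, lam
```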

Error and Environment

Similar to TRG and SRG , the environment of the original iTEBD is the matrix M in Eq. (3.36), and the error is measured by the discarded singular values of M. Thus, iTEBD seems to use only local information to optimize the truncations. Remarkably, when the MPO is unitary or near-unitary, the MPS converges to a so-called canonical form [46, 47], and the truncations are then optimal with the whole MPS as the environment. If the MPO is far from unitary, Orús and Vidal proposed the canonicalization algorithm [47] to transform the MPS into the canonical form before truncating. We will discuss this issue in detail in the next section.

Boundary-State Methods: Density Matrix Renormalization Group and Variational Matrix Product State

The iTEBD can be understood as a boundary-state method. One may consider one row of tensors in the TN as an MPO (see Sect. 2.2.6 and Fig. 2.10), where the vertical bonds are the “physical” indexes and the bonds shared by two adjacent tensors are the geometrical indexes. This MPO is also called the transfer operator or transfer MPO of the TN. The converged MPS is in fact the dominant eigenstate of the MPO. When the MPO represents a physical Hamiltonian or the imaginary-time evolution operator (see Sect. 3.1), the MPS is the ground state. In more general situations, e.g., when the TN represents a 2D partition function or the inner product of two 2D PEPSs, the MPS can be understood as the boundary state of the TN (or the PEPS ) [48,49,50]. The contraction of the 2D infinite TN then becomes computing the boundary state, i.e., the dominant eigenstate (and eigenvalue) of the transfer MPO.

The boundary-state scheme gives several non-trivial physical and algorithmic implications [48,49,50,51,52], including the underlying resemblance between iTEBD and the famous infinite DMRG (iDMRG) [53]. DMRG [54, 55] follows the idea of Wilson’s NRG [56], and solves the ground states and low-lying excitations of 1D or quasi-1D Hamiltonians (see several reviews [57,58,59,60]); originally, it had no direct relation to TN contraction problems. After the MPS and MPO became well understood, DMRG was re-interpreted in a manner much closer to TN (see the review by Schollwöck [57]). In particular, for simulating the ground states of infinite-size 1D systems, the underlying connections between iDMRG and iTEBD were discussed by McCulloch [53]. As argued above, the contraction of a TN can be computed by solving the dominant eigenstate of its transfer MPO. The eigenstates reached by iDMRG and iTEBD are the same state up to a gauge transformation (the gauge degrees of freedom of an MPS are discussed in Sect. 2.4.2). Considering that DMRG is mostly not used to compute TN contractions and there already exist several comprehensive reviews, we skip the technical details of the DMRG algorithms here. One may refer to the papers mentioned above if interested. Later, however, we will revisit iDMRG from the perspective of multi-linear algebra.

The variational matrix product state (VMPS) method is a variational version of DMRG for (but not limited to) calculating the ground states of 1D systems with periodic boundary conditions [61]. Compared with DMRG, VMPS is more directly related to TN contraction problems. In the following, we explain VMPS by solving the contraction of the infinite square TN. As discussed above, this is equivalent to solving for the dominant eigenvector (denoted by |ψ〉) of the infinite MPO (denoted by \(\hat {\rho }\)) that is formed by a row of tensors in the TN. The task is to extremize \(\langle \psi | \hat {\rho } |\psi \rangle \) under the constraint 〈ψ|ψ〉 = 1, and the eigenstate |ψ〉 is written in the form of an MPS .

The tensors in |ψ〉 are optimized one by one. For instance, to optimize the n-th tensor, all other tensors are kept unchanged and considered as constants. Such a local optimization problem becomes the generalized eigenproblem \(\hat {H}^{eff} |T_n\rangle = \mathscr {E} \hat {N}^{eff} |T_n\rangle \), with \(\mathscr {E}\) the eigenvalue. \(\hat {H}^{eff}\) is a sixth-order tensor defined by contracting all tensors in \(\langle \psi | \hat {\rho } |\psi \rangle \) except the n-th tensor and its conjugate (Fig. 3.6a). Similarly, \(\hat {N}^{eff}\) is a sixth-order tensor defined by contracting all tensors in 〈ψ|ψ〉 except the n-th tensor and its conjugate (Fig. 3.6b). Again, the MPS obtained by VMPS differs from the MPS obtained by TEBD only by a gauge transformation.
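A toy sketch of the local update is given below; it assumes Heff and Neff have already been contracted and reshaped into square matrices, and that Neff is invertible (both are assumptions of this illustration, not part of the original formulation).

```python
import numpy as np

def local_update(Heff, Neff):
    # Solve Heff |T_n> = E Neff |T_n> by reducing to an ordinary
    # eigenproblem of Neff^{-1} Heff (assumes Neff is invertible)
    w, v = np.linalg.eig(np.linalg.solve(Neff, Heff))
    i = np.argmax(np.abs(w))        # dominant eigenvalue of the transfer MPO
    return v[:, i], w[i]
```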

Fig. 3.6

The illustration of (a) \(\hat {H}^{eff}\) and (b) \(\hat {N}^{eff}\) in the variational matrix product state method

Note that the boundary-state methods are not limited to solving TN contractions. An example is the time-dependent variational principle (TDVP) . The basic idea of TDVP was proposed by Dirac in 1930 [62], and it was later combined with the Hamiltonian [63] and action-function [64] formulations. For more details, one may refer to the review by Langhoff et al. [65]. In 2011, TDVP was developed to simulate the time evolution of many-body systems with the help of MPS [66]. Since TDVP (and some other algorithms) directly concerns a quantum Hamiltonian instead of a TN contraction, we skip the details of these methods in this paper.

3.5 Transverse Contraction and Folding Trick

For the boundary-state methods introduced above, the boundary states are defined in real space. Taking iTEBD for real-time evolution as an example, the contraction is implemented along the time direction, which amounts to performing the time evolution explicitly. It is quite natural to consider implementing the contraction along the other direction. In the following, we introduce the transverse contraction and the folding trick proposed and investigated in Refs. [67,68,69]. The motivation of transverse contraction is to avoid the explicit simulation of the time-dependent state |ψ(t)〉, which might be difficult to capture due to the fast growth of its entanglement.

Transverse Contraction

Let us consider calculating the expectation value of a one-body operator, \(o(t) = \langle \psi (t)| \hat {o} |\psi (t)\rangle \), where |ψ(t)〉 is a quantum state of infinite size evolved to time t. The TN representing o(t) is given in the left part of Fig. 3.7, where the green squares give the initial MPS |ψ(0)〉 and its conjugate, the yellow diamond is \(\hat {o}\), and the TN formed by the green circles represents the evolution operator \(e^{it\hat {H}}\) and its conjugate (see how to define this TN in Sect. 3.1.3).

Fig. 3.7

Transverse contraction of the TN for a local expectation value 〈O(t)〉

To perform the transverse contraction, we treat each column of the TN as an MPO \(\hat {\mathscr {T}}\). Then, as shown in the right part of Fig. 3.7, the main task of computing o(t) is to solve for the dominant eigenstate |ϕ〉 (normalized) of \(\hat {\mathscr {T}}\), which is an MPS illustrated by the purple squares. One may solve this eigenstate problem by any of the boundary-state methods (TEBD , DMRG , etc.). With |ϕ〉, o(t) can be exactly and efficiently calculated as

$$\displaystyle \begin{aligned} o(t) = \frac{\langle \psi (t) | \hat{o} | \psi (t) \rangle }{\langle \psi (t) | \psi (t) \rangle} = \frac{\langle \phi | \hat{\mathscr{T}}_o | \phi \rangle }{\langle \phi | \hat{\mathscr{T}} | \phi \rangle}, \end{aligned} $$
(3.39)

with \(\hat {\mathscr {T}}_o\) the column that contains the operator \(\hat {o}\). Note that the length of |ϕ〉 (i.e., the number of tensors in the MPS ) is proportional to the time t; thus, one should use the finite-size versions of the boundary-state methods. It should also be noted that \(\hat {\mathscr {T}}\) may not be Hermitian. In this case, one should not use |ϕ〉 and its conjugate, but compute the left and right eigenstates of \(\hat {\mathscr {T}}\) instead.

Interestingly, similar ideas of transverse contraction appeared long before the concept of TN emerged. For instance, the transfer matrix renormalization group (TMRG) [70,71,72,73] can be used to simulate the finite-temperature properties of a 1D system. The idea of TMRG is to utilize DMRG to calculate the dominant eigenstate of the transfer matrix (similar to \(\hat{\mathscr {T}}\)). In the TN terminology, it is to use DMRG to compute |ϕ〉 from the TN that defines the imaginary-time evolution. We will skip the details of TMRG since it is not directly related to TN. One may refer to the related references if interested.

Folding Trick

The main bottleneck of a boundary-state method is the entanglement of the boundary state: the methods become inefficient when this entanglement grows too large. One example is the real-time simulation of a 1D chain, where the entanglement entropy increases linearly with time. The transverse contraction alone does not essentially solve this problem. Taking the imaginary-time evolution as an example, it has been shown that, due to the dual symmetry of space and time, the boundary states in the space and time directions possess the same entanglement [69, 74].

In Ref. [67], the folding trick was proposed. The idea is to “fold” the TN before the transverse contraction (Fig. 3.8). In the folded TN, each tensor is the tensor product of the original tensor and its conjugate. The length of the folded TN in the time direction is half that of the original TN, and so is the length of the boundary state.

Fig. 3.8

The illustration of the folding trick

The previous work [67] on the dynamic simulations of 1D spin chains showed that the entanglement of the boundary state is in fact reduced compared with that of the boundary state without folding. This suggests that the folding trick provides a more efficient representation of the entanglement structure of the boundary state. The authors of Ref. [67] suggested an intuitive picture to understand the folding trick. Consider a product state as the initial state at t = 0, with a single localized excitation at position x that propagates freely with velocity v. After evolving for a time t, only the sites within (x ± vt) will have become entangled. With the folding trick, the evolution operators (which are unitary) outside the region (x ± vt) take no effect, since they are folded onto their conjugates and become identities. Thus, the spins outside (x ± vt) remain in a product state and do not contribute entanglement to the boundary state. In short, one key factor here is the entanglement structure, i.e., the fact that the TN is formed by unitaries. The transverse contraction with the folding trick is a convincing example showing that the efficiency of contracting a TN can be improved by properly designing the contraction according to the entanglement structure of the TN.
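For a transfer tensor T with (up, left, down, right) indexes, folding amounts to taking the tensor product with the complex conjugate and fusing the paired indexes, as in this small sketch (the index convention is our assumption):

```python
import numpy as np

def fold(T):
    # Tensor product of T[u, l, d, r] with its conjugate; fuse paired indexes
    u, l, d, r = T.shape
    Tf = np.einsum('uldr,ULDR->uUlLdDrR', T, T.conj())
    return Tf.reshape(u*u, l*l, d*d, r*r)
```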

3.6 Relations to Exactly Contractible Tensor Networks and Entanglement Renormalization

The TN algorithms explained above aim at optimally contracting TNs that cannot be contracted exactly. Then a question arises: is a classical computer really able to handle these TNs? In the following, we show that by explicitly keeping the isometries used for the truncations inside the network, the TNs contracted in these algorithms eventually become exactly contractible, dubbed exactly contractible TNs (ECTN) . Different algorithms lead to different ECTNs. This means an algorithm will show high performance if the TN can be accurately approximated by the corresponding ECTN.

Figure 3.9 shows the ECTN emerging in the plaquette renormalization [75] or higher-order TRG (HOTRG) [76] algorithms. Take the contraction of a TN (formed by the copies of a tensor T) on the square lattice as an example. In each iteration step, four nearest-neighbor T's in a square are contracted together, which leads to a new square TN formed by tensors \(T^{(1)}\) with larger bond dimensions. Then, isometries (yellow triangles) are inserted in the TN to truncate the bond dimensions (the truncations are in the same spirit as those in CTMRG , see Fig. 3.4). Let us not contract the isometries with the tensors, but leave them inside the TN. Still, we can move on to the next iteration, where we contract four \(T^{(1)}\)'s (each of which is formed by four T's and the isometries; see the dark-red plaquettes in Fig. 3.9) and obtain more isometries for truncating the bond dimensions of \(T^{(1)}\). By repeating this process several times, one can see that tree TNs appear on the boundaries of the coarse-grained plaquettes. Inside the 4-by-4 plaquettes (light-red shadow), we have two-layer tree TNs formed by three isometries. In the 8-by-8 plaquettes, the tree TN has three layers with seven isometries. These tree TNs separate the original TN into different plaquettes, so that it can be exactly contracted, similar to the fractal TNs introduced in Sect. 2.3.6.

Fig. 3.9

The exactly contractible TN in the HOTRG algorithm

In the iTEBD algorithm [29,30,31, 47] (Fig. 3.10), one starts with an initial MPS (dark-blue squares). In each iteration, one tensor (light-blue circles) in the TN is contracted with a tensor in the MPS, and then the bonds are truncated by isometries (yellow triangles). Viewed globally, the isometries separate the TN into many “tubes” (red shadow) that are connected only at the top. The length of the tubes equals the number of iteration steps in iTEBD. Obviously, this TN is exactly contractible. Such a tube-like structure also appears in the contraction algorithms based on PEPS.

Fig. 3.10

The exactly contractible TN in the iTEBD algorithm

For the CTMRG algorithm [28], the corresponding ECTN is a bit more complicated (one quarter of it is shown in Fig. 3.11). The initial row (column) tensors and the corner transfer matrices are represented by the pink and green squares. In each iteration step, the outermost tensors (light-blue circles) are contracted into the row (column) tensors and the corner transfer matrices, and isometries are introduced to truncate the bond dimensions. Viewed globally, the isometries separate the TN into a tree-like structure (red shadow), which is exactly contractible.

Fig. 3.11

A part of the exactly contractible TN in the CTMRG algorithm

For these three algorithms, each gives an ECTN formed by two parts: the tensors in the original TN and the isometries that make the TN exactly contractible. After optimizing the isometries, the original TN is approximated by the ECTN. The structure of the ECTN depends mainly on the contraction strategy, while the way of optimizing the isometries depends on the chosen environment.

The ECTN picture shows us explicitly how the correlations and entanglement are approximated in different algorithms. Roughly speaking, the correlation properties can be read from the minimal distance of the path in the ECTN that connects two given sites, and the (bipartite) entanglement can be read from the number of bonds that cross the boundary of the bipartition. How well this structure suits the correlations and entanglement should be a key factor in the performance of a TN contraction algorithm. Meanwhile, this picture can assist us in developing new algorithms by designing the ECTN and taking the whole ECTN as the environment for optimizing the isometries. These issues still need further investigation.

The unification of TN contraction and the ECTN has been explicitly utilized in the TN renormalization (TNR) algorithm [77, 78], where both isometries and unitaries (called disentanglers) are put into the TN to make it exactly contractible. Then, instead of tree TNs or MPSs , one obtains MERAs inside (see Fig. 2.7c, for example), which can better capture the entanglement of critical systems.

3.7 A Short Summary

In this section, we have discussed several contraction approaches for dealing with 2D TNs. With these algorithms, many challenging problems can be efficiently solved, including the ground-state and finite-temperature simulations of 1D quantum systems and the simulations of 2D classical statistical models. Such algorithms consist of two key ingredients: contractions (local operations on tensors) and truncations. The local contraction determines how the TN is contracted step by step, or in other words, how the entanglement information is kept according to the ECTN structure. Different (local or global) contractions may lead to different computational costs; thus, optimizing the contraction sequence is necessary in many cases [67, 79, 80]. The truncation is the approximation that discards the less important basis states so that the computational costs are properly bounded. One essential concept in the truncations is the “environment,” which plays the role of the reference when determining the weights of the basis states. Thus, the choice of environment concerns the balance between the accuracy and efficiency of a TN algorithm.