2.1 Scalar, Vector, Matrix, and Tensor

Generally speaking, a tensor is defined as a series of numbers labeled by N indexes, with N called the order of the tensor.Footnote 1 In this context, a scalar, which is a single number carrying no index, is a zeroth-order tensor. Many physical quantities are scalars, including energy, free energy, magnetization, and so on. Graphically, we use a dot to represent a scalar (Fig. 2.1).

Fig. 2.1 From left to right, the graphic representations of a scalar, vector, matrix, and tensor

A D-component vector consists of D numbers labeled by one index, and thus is a first-order tensor. For example, one can write the state vector of a spin-1∕2 in a chosen basis (say the eigenstates of the spin operator \(\hat {S}^{[z]}\)) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\psi\rangle =C_1|0 \rangle +C_2|1 \rangle =\sum_{s=0,1}C_s|s\rangle, \end{array} \end{aligned} $$
(2.1)

with the coefficients \(C_s\) forming a two-component vector. Here, we use |0〉 and |1〉 to represent the spin-up and spin-down states. Graphically, we use a dot with one open bond to represent a vector (Fig. 2.1).

A matrix is in fact a second-order tensor. Taking two spins as an example, the state vector can be written in an irreducible representation as a four-dimensional vector; under the local bases of the two spins, however, we write it as

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\psi\rangle =C_{00}|0\rangle|0\rangle +C_{01}|0\rangle|1\rangle +C_{10}|1\rangle|0\rangle+C_{11}|1\rangle|1\rangle =\sum_{ss'=0}^{1}C_{ss'}|s\rangle|s'\rangle, \end{array} \end{aligned} $$
(2.2)

with \(C_{ss'}\) a matrix with two indexes. Here, one can see that the difference between a (D × D) matrix and a \(D^2\)-component vector is, in our context, just the way of labeling the tensor elements. Such reshaping among vectors, matrices, and tensors will be used frequently later. Graphically, we use a dot with two bonds to represent a matrix and its two indexes (Fig. 2.1).

It is then natural to define an N-th order tensor. Considering, e.g., N spins, the \(2^N\) coefficients can be written as an N-th order tensor C,Footnote 2 satisfying

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\psi\rangle =\sum_{s_1 \cdots s_N=0}^{1}C_{s_1\ldots s_N}|s_1\rangle\ldots|s_N\rangle. \end{array} \end{aligned} $$
(2.3)

Similarly, such a tensor can be reshaped into a \(2^N\)-component vector. Graphically, an N-th order tensor is represented by a dot connected to N open bonds (Fig. 2.1).

Above, we used spin-1∕2 states as examples, where each index can take two values. For a spin-S state, each index can take d = 2S + 1 values, with d called the bond dimension. Besides quantum states, operators can also be written as tensors. By fixing the basis, a spin-1∕2 operator \(\hat {S}^{\alpha }\) (α = x, y, z) is a (2 × 2) matrix, with elements \( S^{\alpha }_{s^{\prime }s} = \langle s^{\prime } |\hat {S}^{\alpha } |s\rangle \). In the same way, an N-spin operator can be written as a 2N-th order tensor, with N bra and N ket indexes.Footnote 3

We would like to stress some conventions regarding the “indexes” of a tensor (including a matrix) and those of an operator. A tensor is just a group of numbers, whose indexes are the labels of its elements. Here, we always write the true indexes as lower symbols; the upper “indexes” of a tensor (if any) are merely part of the symbol used to distinguish different tensors. An operator, which is defined in a Hilbert space, is denoted by a hatted letter and carries no “true” indexes: both its upper and lower “indexes” are just parts of the symbol that distinguish different operators.

2.2 Tensor Network and Tensor Network States

2.2.1 A Simple Example of Two Spins and Schmidt Decomposition

After introducing tensors (and their diagrammatic representation), we now turn to TNs, defined as the contraction of many tensors. Let us start with the simplest situation, two spins, and study, for instance, their quantum entanglement properties. Quantum entanglement, usually abbreviated as entanglement, can be characterized by the Schmidt decomposition [1,2,3] of the state (Fig. 2.2) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} | \psi \rangle = \sum_{ss'=0}^{1} C_{ss'} |s \rangle |s' \rangle = \sum_{ss'=0}^{1} \sum_{a=1}^{\chi} U_{sa} \lambda_{aa} V^{\ast}_{s' a} |s \rangle |s' \rangle, {} \end{array} \end{aligned} $$
(2.4)

where U and V are unitary matrices, λ is a positive-definite diagonal matrix with its elements in descending order,Footnote 4 and χ is called the Schmidt rank. The diagonal elements of λ are called the Schmidt coefficients, since in the new bases after the decomposition the state is written as a summation of χ product states, \(|\psi\rangle = \sum_{a} \lambda_{a} |u\rangle_{a} |v\rangle_{a}\), with the new bases \(|u\rangle_{a} = \sum_{s} U_{sa} |s\rangle\) and \(| v \rangle _{a} = \sum _{s'} V^{\ast }_{s' a} |s' \rangle \).

Fig. 2.2 The graphic representation of the Schmidt decomposition (singular value decomposition of a matrix). The positive-definite diagonal matrix λ, which gives the entanglement spectrum (Schmidt numbers), is defined on a virtual bond (dummy index) generated by the decomposition

Graphically, we have a small TN, where we use green squares to represent the unitary matrices U and V, and a red diamond to represent the diagonal matrix λ. There are two bonds in the graph shared by two objects, standing for the summations (contractions) over the two indexes a and a′ in Eq. (2.4). Unlike s (or s′), the space of the index a (or a′) does not come from any physical Hilbert space. To distinguish the two kinds, we call indexes like s physical indexes and those like a geometrical or virtual indexes. Meanwhile, since each physical index connects to only one tensor, it is also called an open bond.

Some simple observations can be made from the Schmidt decomposition. Generally speaking, the index a (and likewise a′, since λ is diagonal) contracted in a TN carries the quantum entanglement [4]. In quantum information sciences, entanglement is regarded as a quantum version of correlation [4], which is crucially important for understanding the physical implications of TNs. One usually measures the strength of the entanglement with the entanglement entropy, defined as \(S = - 2 \sum _{a=1}^{\chi } \lambda _{a}^2 \ln \lambda _{a}\). Since the state is normalized, we have \(\sum _{a=1}^{\chi }\lambda _{a}^2 = 1\). For \(\dim (a) = 1\), obviously \(|\psi\rangle = \lambda_1 |u\rangle_1 |v\rangle_1\) is a product state with zero entanglement (S = 0) between the two spins. For \(\dim (a) = \chi \), the entanglement entropy satisfies \(S \leq \ln \chi \), where S takes its maximum if and only if \(\lambda_1 = \cdots = \lambda_{\chi} = 1/\sqrt{\chi}\). In other words, the dimension of a geometrical index determines the upper bound of the entanglement.
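As a minimal numerical illustration (a sketch of our own, assuming NumPy; not part of the original text), the entanglement entropy of a random two-qubit state can be obtained directly from the singular values of its coefficient matrix:

```python
import numpy as np

C = np.random.rand(2, 2)                      # coefficient matrix C_{ss'}
lam = np.linalg.svd(C, compute_uv=False)      # Schmidt coefficients
lam /= np.linalg.norm(lam)                    # normalize: sum of lam^2 is 1

# for a generic random C both singular values are strictly positive
S = -2 * np.sum(lam**2 * np.log(lam))         # equals -sum lam^2 ln(lam^2)
print(S, "<= ln(chi) =", np.log(len(lam)))    # bounded by ln of the Schmidt rank
```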

Instead of the Schmidt decomposition, it is more convenient to present the later algorithms in another language: singular value decomposition (SVD), a matrix decomposition from linear algebra. The Schmidt decomposition of a state is the SVD of its coefficient matrix, where λ is called the singular value spectrum and its dimension χ is called the rank of the matrix. In linear algebra, the SVD gives the optimal lower-rank approximation of a matrix, which is what makes it useful for TN algorithms. Specifically speaking, given a matrix M of rank χ, the task is to find a rank-χ′ matrix M′ (χ′ ≤ χ) that minimizes the norm

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathscr{D} = |M-M'| = \sqrt{\sum_{ss'} \left(M_{ss'} - M^{\prime}_{ss'}\right)^2}. \end{array} \end{aligned} $$
(2.5)

The optimal solution is given by the SVD as

$$\displaystyle \begin{aligned} \begin{array}{rcl} M^{\prime}_{ss'} = \sum_{a=0}^{\chi'-1} U_{sa} \lambda_{a a} V^{\ast}_{s' a}. \end{array} \end{aligned} $$
(2.6)

In other words, M′ is the optimal rank-χ′ approximation of M, and the error is given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varepsilon = \sqrt{\sum_{a=\chi'}^{\chi-1} \lambda_{a}^2}, {} \end{array} \end{aligned} $$
(2.7)

which will be called the truncation error in the TN algorithms.
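The following sketch (ours, assuming NumPy) makes Eqs. (2.5)–(2.7) concrete: the optimal rank-χ′ approximation keeps the χ′ largest singular values, and the truncation error equals the norm of the discarded part of the spectrum:

```python
import numpy as np

d, chi_prime = 8, 3
M = np.random.rand(d, d)                            # a generic full-rank matrix

U, lam, Vh = np.linalg.svd(M)                       # M = U @ diag(lam) @ Vh
M_prime = U[:, :chi_prime] @ np.diag(lam[:chi_prime]) @ Vh[:chi_prime, :]

err = np.linalg.norm(M - M_prime)                   # the norm of Eq. (2.5)
err_spectrum = np.sqrt(np.sum(lam[chi_prime:]**2))  # Eq. (2.7)
assert np.isclose(err, err_spectrum)                # truncation error matches
```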

2.2.2 Matrix Product State

Now we take an N-spin state as an example to explain the MPS, a simple but powerful 1D TN state. In an MPS, the coefficients are written as a TN given by the contraction of N tensors. Schollwöck in his review [5] provides a straightforward way to obtain such a TN by repetitively using SVD or QR decomposition (Fig. 2.3). First, we group the first N − 1 indexes together as one large index and write the coefficients as a \(2^{N-1} \times 2\) matrix. Then we implement SVD (or any other decomposition, for example QR decomposition), splitting the coefficients into the contraction of \(C^{[N-1]}\) and \(A^{[N]}\),

$$\displaystyle \begin{aligned} \begin{array}{rcl} C_{s_1 \cdots s_{N-1}s_N} = \sum_{a_{N-1}} C^{[N-1]}_{s_1 \cdots s_{N-1},a_{N-1}} A^{[N]}_{s_{N}, a_{N-1} }. \end{array} \end{aligned} $$
(2.8)

Note that, as a convention in this paper, we always put the physical indexes in front of the geometrical indexes and use a comma to separate them. For the tensor \(C^{[N-1]}\), one can do the similar thing by grouping the first N − 2 indexes and decomposing again as

$$\displaystyle \begin{aligned} \begin{array}{rcl} C^{[N-1]}_{s_1 \cdots s_{N-1}, a_{N-1}} = \sum_{a_{N-2}} C^{[N-2]}_{s_1 \cdots s_{N-2},a_{N-2}} A^{[N-1]}_{s_{N-1}, a_{N-2} a_{N-1}}. \end{array} \end{aligned} $$
(2.9)

Then the total coefficients become the contraction of three tensors as

$$\displaystyle \begin{aligned} \begin{array}{rcl} C_{s_1 \cdots s_{N-1}s_N} = \sum_{a_{N-2} a_{N-1}} C^{[N-2]}_{s_1 \cdots s_{N-2},a_{N-2}} A^{[N-1]}_{s_{N-1}, a_{N-2} a_{N-1}} A^{[N]}_{s_{N}, a_{N-1}}. \end{array} \end{aligned} $$
(2.10)

Repeating such decompositions until each tensor contains only one physical index, we obtain the MPS representation of the state:

$$\displaystyle \begin{aligned} \begin{array}{rcl} C_{s_1 \cdots s_{N-1}s_N} = \sum_{a_{1} \cdots a_{N-1}} A^{[1]}_{s_1, a_{1}} A^{[2]}_{s_2, a_{1} a_{2}} \cdots A^{[N-1]}_{s_{N-1}, a_{N-2} a_{N-1}} A^{[N]}_{s_{N}, a_{N-1}}. {} \end{array} \end{aligned} $$
(2.11)

One can see that an MPS is a TN formed by the contraction of N tensors. Graphically, an MPS is represented by a 1D graph with N open bonds. In fact, the MPS given by Eq. (2.11) has open boundary condition; it can be generalized to periodic boundary condition (Fig. 2.4) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} C_{s_1 \cdots s_{N-1}s_N} = \sum_{a_{1} \cdots a_{N}} A^{[1]}_{s_1, a_{N} a_{1}} A^{[2]}_{s_2, a_{1} a_{2}} \cdots A^{[N-1]}_{s_{N-1}, a_{N-2} a_{N-1}} A^{[N]}_{s_{N}, a_{N-1} a_{N}}, {} \end{array} \end{aligned} $$
(2.12)

where all tensors are of third order. Moreover, one can impose translational invariance on the MPS, i.e., \(A^{[n]} = A\) for n = 1, 2, ⋯ , N. We use χ, dubbed the virtual bond dimension of the MPS, to denote the dimension of each geometrical index.

Fig. 2.3 An impractical way to obtain an MPS from a many-body wave-function is to repetitively use the SVD

Fig. 2.4 The graphic representations of the matrix product states with open (left) and periodic (right) boundary conditions

MPS is an efficient representation of a many-body quantum state. For an N-spin state, the number of coefficients is \(2^N\), which increases exponentially with N. For an MPS given by Eq. (2.12), it is easy to count that the total number of elements of all tensors is \(Nd\chi^2\), which increases only linearly with N. The above way of obtaining an MPS by decompositions is known in MLA as tensor-train decomposition (TTD), and the MPS is also called the tensor-train form [6]. The main aim of TTD is to develop algorithms that obtain the optimal tensor-train form of a given tensor, so that the number of parameters can be reduced with well-controlled errors.

In physics, the above procedure shows that any state can be written as an MPS, as long as we do not limit the dimensions of the geometrical indexes. However, this procedure is extremely impractical and inefficient, since in principle the dimensions of the geometrical indexes {a} increase exponentially with N. In the following sections, we will directly apply the mathematical form of the MPS without invoking the above procedure.
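To make the procedure explicit, here is a small sketch of the repeated-SVD (tensor-train) decomposition (our own code, assuming NumPy; the helper name to_mps and the index ordering, left bond–physical–right bond, are ours). For simplicity it sweeps from the first site instead of the last, which yields an equivalent MPS:

```python
import numpy as np

def to_mps(C, d=2):
    """Decompose an N-th order tensor C (shape (d,)*N) into MPS tensors
    of shape (chi_left, d, chi_right) by repeated SVD, with no truncation."""
    N = C.ndim
    tensors, chi = [], 1
    R = C.reshape(1, -1)                         # remaining coefficients
    for _ in range(N - 1):
        R = R.reshape(chi * d, -1)
        U, lam, Vh = np.linalg.svd(R, full_matrices=False)
        tensors.append(U.reshape(chi, d, -1))
        chi = len(lam)
        R = np.diag(lam) @ Vh                    # absorb the spectrum rightwards
    tensors.append(R.reshape(chi, d, 1))
    return tensors

N = 5
C = np.random.rand(*(2,) * N)
mps = to_mps(C)

# contract the MPS back and verify it reproduces C exactly (no truncation)
rec = mps[0]
for A in mps[1:]:
    rec = np.tensordot(rec, A, axes=([-1], [0]))
assert np.allclose(rec.reshape((2,) * N), C)
```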

Now we introduce a simplified notation for MPSs that is widely used in the physics community. In fact, with the physical indexes fixed, the contractions over the geometrical indexes are just products of matrices (which is where the name comes from). In this sense, we write a quantum state given by Eq. (2.11) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\psi \rangle =tTr A^{[1]} A^{[2]} \cdots A^{[N]} |s_1s_2 \cdots s_N \rangle = tTr \prod_{n=1}^{N} A^{[n]} |s_n \rangle. {} \end{array} \end{aligned} $$
(2.13)

Here, tTr stands for the summation over all shared indexes. The advantage of Eq. (2.13) is that it gives a general formula for an MPS of either finite or infinite size, with either periodic or open boundary conditions.

2.2.3 Affleck–Kennedy–Lieb–Tasaki State

MPS is not just a mathematical form: it can represent non-trivial physical states. One important example is found in the AKLT model proposed in 1987, a generalization of the spin-1 Heisenberg model [7]. For 1D systems, the Mermin–Wagner theorem forbids any spontaneous breaking of continuous symmetries at finite temperature for sufficiently short-range interactions. The ground state of the AKLT model, called the AKLT state, possesses a sparse anti-ferromagnetic order (Fig. 2.5) and a non-zero excitation gap, consistent with the Mermin–Wagner theorem. Moreover, the AKLT state provides us with a precious exactly solvable example for understanding edge states and (symmetry-protected) topological orders.

Fig. 2.5 One possible configuration of the sparse anti-ferromagnetically ordered state. A dot represents the S = 0 state. Ignoring all the S = 0 states, the spins are arranged in an anti-ferromagnetic way

The AKLT state can be written exactly as an MPS with χ = 2 (see, e.g., [8]). Without loss of generality, we assume periodic boundary condition. Let us begin with the AKLT Hamiltonian, which can be written with spin-1 operators as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{H}=\sum_n\left[\frac{1}{2} \hat{S}_n\cdot \hat{S}_{n+1}+\frac{1}{6} (\hat{S}_n\cdot \hat{S}_{n+1})^2+\frac{1}{3}\right]. {} \end{array} \end{aligned} $$
(2.14)

By introducing the non-negative-definite projector \(\hat {P}_2(\hat {S}_n+\hat {S}_{n+1})\), which projects two neighboring spins onto the S = 2 subspace, Eq. (2.14) can be rewritten as a summation of projectors,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{H}=\sum_n \hat{P}_2(\hat{S}_n+\hat{S}_{n+1}). \end{array} \end{aligned} $$
(2.15)

Thus, the AKLT Hamiltonian is non-negative-definite, and its ground state lies in its kernel space, satisfying \(\hat {H}|\psi _{AKLT}\rangle = 0\), i.e., having zero energy.

Now we construct a wave-function with zero energy. As shown in Fig. 2.6, we put on each site a projector that maps two (effective) spins-1∕2 to a triplet, i.e., the physical spin-1, where the transformation of the bases obeys

$$\displaystyle \begin{aligned} \begin{array}{rcl} |+\rangle&\displaystyle =&\displaystyle |00\rangle \end{array} \end{aligned} $$
(2.16)
$$\displaystyle \begin{aligned} \begin{array}{rcl} |\tilde{0}\rangle&\displaystyle =&\displaystyle \frac{1}{\sqrt{2}}(|01\rangle+|10\rangle), \end{array} \end{aligned} $$
(2.17)
$$\displaystyle \begin{aligned} \begin{array}{rcl} |-\rangle&\displaystyle =&\displaystyle |11\rangle. \end{array} \end{aligned} $$
(2.18)

The corresponding projector is determined by the Clebsch–Gordan coefficients [9], and is a (3 × 4) matrix. Here, we rewrite it as a (3 × 2 × 2) tensor, whose three components (with respect to the first index) are built from the ascending, z-component, and descending Pauli matrices of spin-1∕2,Footnote 5

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma^+ = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \sigma^{z} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad \sigma^- = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. {} \end{array} \end{aligned} $$
(2.19)

In the language of MPS, absorbing the singlet matrix on one virtual leg of each projector (see Eq. (2.21) below) yields the tensor A, whose relative weights are fixed by the Clebsch–Gordan coefficients:

$$\displaystyle \begin{aligned} \begin{array}{rcl} A_{0,a a'} = \sigma^+_{aa'}, \quad A_{1,a a'} = -\frac{1}{\sqrt{2}} \sigma^z_{aa'}, \quad A_{2,a a'} = -\sigma^-_{aa'}. {} \end{array} \end{aligned} $$
(2.20)
Fig. 2.6 An intuitive graphic representation of the AKLT state. The big circles represent S = 1 spins, and the small ones are effective \(S= \frac {1}{2}\) spins. Each pair of spins-1∕2 connected by a red bond forms a singlet state. The two “free” spins-1∕2 on the boundary give the edge state

Then we put another projector on each bond to map two spins-1∕2 to a singlet, i.e., a spin-0 with

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\bar{0}\rangle = \frac{1}{\sqrt{2}}(|01\rangle-|10\rangle). \end{array} \end{aligned} $$
(2.21)

With the choice of Eq. (2.20), where the singlet structure has already been absorbed into the tensor A, this projector is in fact the (2 × 2) identity,

$$\displaystyle \begin{aligned} \begin{array}{rcl} I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. {} \end{array} \end{aligned} $$
(2.22)

Now, the MPS of the AKLT state with periodic boundary condition (up to a normalization factor) is obtained from Eq. (2.12), with every tensor A given by Eq. (2.20). For such an MPS, every projector \(\hat {P}_2(\hat {S}_n+\hat {S}_{n+1})\) in the AKLT Hamiltonian always acts on a singlet, and thus we have \(\hat {H}|\psi _{AKLT}\rangle = 0\).
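As a numerical check (a self-contained sketch of ours, assuming NumPy; not from the original text), one can build the periodic MPS of Eq. (2.20) for a small chain and verify that the full AKLT Hamiltonian of Eq. (2.14) annihilates it:

```python
import numpy as np

# spin-1 operators in the basis {|+>, |0>, |->}
sp = np.sqrt(2.0) * np.diag([1.0, 1.0], k=1)      # ladder operator S^+
sx, sy = (sp + sp.T) / 2, (sp - sp.T) / 2j
sz = np.diag([1.0, 0.0, -1.0])

# two-site term of Eq. (2.14); it equals the spin-2 projector of Eq. (2.15)
SS = sum(np.kron(s, s) for s in (sx, sy, sz)).real
h2 = (SS / 2 + SS @ SS / 6 + np.eye(9) / 3).reshape(3, 3, 3, 3)

# MPS tensor of Eq. (2.20)
r2 = 1.0 / np.sqrt(2.0)
A = np.array([[[0.0, 1.0], [0.0, 0.0]],           # sigma^+
              [[-r2, 0.0], [0.0, r2]],            # -sigma^z / sqrt(2)
              [[0.0, 0.0], [-1.0, 0.0]]])         # -sigma^-

N = 6                                             # small enough for a full vector
psi = np.empty((3,) * N)
for s in np.ndindex(*(3,) * N):
    M = np.eye(2)
    for i in s:
        M = M @ A[i]
    psi[s] = np.trace(M)                          # Eq. (2.12) with chi = 2
psi /= np.linalg.norm(psi)

E = 0.0
for n in range(N):                                # all bonds, periodic BC
    phi = np.tensordot(h2, psi, axes=([2, 3], [n, (n + 1) % N]))
    phi = np.moveaxis(phi, [0, 1], [n, (n + 1) % N])
    E += np.vdot(psi, phi)
print(abs(E))                                     # ~ 1e-15: zero energy
```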

2.2.4 Tree Tensor Network State (TTNS) and Projected Entangled Pair State (PEPS)

The TTNS is a generalization of the MPS that can encode more general entanglement structures. Unlike an MPS, where the tensors are aligned in a 1D array, a TTNS is given by a tree graph. Figure 2.7a, b shows two examples of TTNSs with coordination number z = 3. The red bonds are the physical indexes and the black bonds are the geometrical indexes connecting adjacent tensors. The physical indexes may be located on every tensor or only on the boundary of the tree. A tree is a graph that has no loops, which leads to many simple mathematical properties parallel to those of an MPS. For example, the partition function of a TTNS can be computed exactly and efficiently. A similar but more powerful TN state called MERA also has this property (Fig. 2.7c). We will come back to this in Sect. 2.3.6. Note that an MPS can be treated as a tree with z = 2.

Fig. 2.7 The illustration of (a) and (b) two different TTNSs, and (c) the MERA

An important generalization to TNs of loopy structures is known as the projected entangled pair state (PEPS), proposed by Verstraete and Cirac [10, 11]. The tensors of a PEPS are located, instead of on a 1D chain or a tree graph, on a d-dimensional lattice, thus graphically forming a d-dimensional TN. An intuitive picture of PEPS is given in Fig. 2.8: the tensors can be understood as projectors that map the physical spins into virtual ones, and the virtual spins form maximally entangled states in a way determined by the geometry of the TN. Note that such an intuitive picture was first proposed for PEPS [10], but it also applies to TTNS.

Fig. 2.8 (a) An intuitive picture of the projected entangled pair state. The physical spins (big circles) are projected to the virtual ones (small circles), which form the maximally entangled states (red bonds). (b)–(d) Three kinds of frequently used PEPSs

Similar to an MPS, a TTNS or PEPS can be formally written as

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\varPsi\rangle = tTr \prod_{n} P^{[n]} |s_n\rangle, {} \end{array} \end{aligned} $$
(2.23)

where tTr means summing over all geometrical indexes. Usually, we do not write out the formula of a TTNS or PEPS, but give the graph instead, which clearly shows the contraction relations.

Such a generalization makes a lot of sense in physics. One key factor regards the area law of entanglement entropy [12,13,14,15,16,17], which we will discuss later in this chapter. In the following, as two straightforward examples, we show that PEPS can indeed represent non-trivial physical states, including the nearest-neighbor resonating valence bond (RVB) state and the \(Z_2\) spin liquid state. Note that these two types of states on trees can be defined similarly with the corresponding TTNS.

2.2.5 PEPS Can Represent Non-trivial Many-Body States: Examples

The RVB state was first proposed by Anderson to explain a possible disordered ground state of the Heisenberg model on the triangular lattice [18, 19]. The RVB state is defined as the superposition of macroscopically many configurations in which all spins are paired into singlet states (dimers). The strong fluctuations are expected to restore all symmetries and lead to a spin liquid state without any local order. The distance between the two spins in a dimer can be short or long range; for the nearest-neighbor RVB state, the dimers connect only nearest neighbors (Fig. 2.9, also see [20]). The RVB state is believed to be related to the high-\(T_c\) copper-oxide-based superconductors: by doping the singlet pairs, the insulating RVB state can transform into a charged superconductive state [21,22,23].

Fig. 2.9 The nearest-neighbor RVB state is the superposition of all possible configurations of nearest-neighbor singlets

For the nearest-neighbor situation, an RVB state (defined on an infinite square lattice, for example) can be written exactly as a PEPS with χ = 3. Without loss of generality, we assume translational invariance, i.e., the TN is formed by infinite copies of a few inequivalent tensors. Two different ways have been proposed to construct the nearest-neighbor RVB PEPS [24, 25]. In addition, Wang et al. proposed a way to construct the PEPS with long-range dimers [26]. In the following, we explain the construction of the nearest-neighbor RVB state proposed by Verstraete et al. [24]. There are two inequivalent tensors: the tensor defined on each site, whose dimensions are (2 × 3 × 3 × 3 × 3), has only eight non-zero elements,

$$\displaystyle \begin{aligned} \begin{array}{rcl} P_{0,0222} = P_{0,2022} = P_{0,2202} = P_{0,2220} = 1, \end{array} \end{aligned} $$
(2.24)
$$\displaystyle \begin{aligned} \begin{array}{rcl} P_{1,1222} = P_{1,2122} = P_{1,2212} = P_{1,2221} = 1. \end{array} \end{aligned} $$
(2.25)

The dimension-2 index of P is the physical index, with s = 0 representing spin up and s = 1 spin down. The extra (third) dimension of each of the other four geometrical indexes is used to carry the vacuum state. The tensor P acts as a projector that maps an occupied geometrical index (either up or down) to a physical spin. For example, \(P_{1,2122}=1\) means mapping a virtual spin down occupying the second geometrical index to a physical spin down. The remaining elements are all zero, meaning that the corresponding projections are forbidden.

Then a projector B is introduced on each shared geometrical bond to build spin singlets between two nearest-neighbor sites in the RVB structure. B is a (3 × 3) matrix with only three non-zero elements,

$$\displaystyle \begin{aligned} B_{01}=1, B_{10}=-1, B_{22}=1. \end{aligned} $$
(2.26)

The matrix B acts as a router: it only lets the singlet state, defined as |01〉−|10〉, and the vacuum state go through the bond.

Then the infinite PEPS (iPEPS) of the nearest-neighbor RVB state is given by the contraction of infinite copies of the P’s on the sites and the B’s on the bonds (Fig. 2.8) as

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\varPsi\rangle = \sum_{\{s,a\}} \prod_{n \in sites} P_{s_n, a_n^1 a_n^2 a_n^3 a_n^4} \prod_{m \in bonds} B_{a^1_m a^2_m } \prod_{j \in sites} |s_j\rangle. {} \end{array} \end{aligned} $$
(2.27)

After the contraction of all geometrical indexes, the state is the superposition of all possible configurations consisting of nearest-neighbor dimers. This iPEPS looks different from the one given in Eq. (2.23), but they are essentially the same, because one can contract the B’s into the P’s so that the PEPS is formed solely by tensors defined on the sites.

Another example is the \(Z_2\) spin liquid state, which is one of the simplest string-net states [27,28,29], first proposed by Levin and Wen to characterize gapped topological orders [30]. In the string picture, the \(Z_2\) state is the superposition of all configurations of closed string loops. Writing such a state as a TN, the tensor on each vertex is a (2 × 2 × 2 × 2) tensor satisfying

$$\displaystyle \begin{aligned} \begin{array}{rcl} P_{a_1 \cdots a_N}&=& \left\{ \begin{array}{lll} 1, \ \ a_1+\cdots + a_N=even, \\ 0, \ \ otherwise. \end{array} \right. \end{array} \end{aligned} $$
(2.28)

The tensor P enforces the fusion rule of the strings: the number of strings connected to a vertex must be even, so that there are no loose ends and all strings form closed loops. This rule is also called in some literature the ice rule [31, 32] or Gauss’s law [33]. In addition, the square TN formed solely by the tensor P gives the famous eight-vertex model, where the number “eight” corresponds to the eight non-zero elements (i.e., allowed string configurations) on a vertex [34].
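For concreteness, a sketch of ours (NumPy assumed) builds the vertex tensor of Eq. (2.28) on the square lattice and counts its non-zero elements, recovering the “eight” of the eight-vertex model:

```python
import numpy as np

P = np.zeros((2, 2, 2, 2))
for idx in np.ndindex(2, 2, 2, 2):
    if sum(idx) % 2 == 0:                 # fusion rule: even number of strings
        P[idx] = 1.0

print(int(np.count_nonzero(P)))           # 8 allowed vertex configurations
```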

The tensors B defined on each bond project the strings to spins; their non-zero elements are

$$\displaystyle \begin{aligned} \begin{array}{rcl} B_{0,00}=1, \ \ B_{1,11}=1. \end{array} \end{aligned} $$
(2.29)

The tensor B is a projector that maps the spin-up (spin-down) state to the occupied (vacuum) state of a string.

2.2.6 Tensor Network Operators

The MPS and PEPS can be readily generalized from representations of states to representations of operators, called MPO [35,36,37,38,39,40,41,42] and projected entangled pair operator (PEPO)Footnote 6 [43,44,45,46,47,48,49,50,51,52], respectively. Let us begin with the MPO, which is also formed by the contraction of local tensors (Fig. 2.10),

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{O} = \sum_{\{s,a\}} \prod_n W_{s_n s_n^{\prime}, a_n a_{n+1}}^{[n]}|s_n\rangle \langle s_n^{\prime}|. {} \end{array} \end{aligned} $$
(2.30)
Fig. 2.10 The graphic representation of a matrix product operator, where the upward and downward indexes represent the bra and ket space, respectively

Different from an MPS, each tensor has two physical indexes, of which one is a bra and the other a ket index (Fig. 2.10). An MPO may represent several non-trivial physical objects, for example a Hamiltonian. Crosswhite and Bacon [53] proposed a general way of constructing an MPO, called an automaton. Now we show how to construct the MPO of a Hamiltonian using the properties of a triangular MPO. Let us start from a general lower-triangular MPO satisfying \(W^{[n]}_{::,00}=C^{[n]}\), \(W^{[n]}_{::,10}=B^{[n]}\), and \(W^{[n]}_{::,11}=A^{[n]}\), with \(A^{[n]}\), \(B^{[n]}\), and \(C^{[n]}\) some d × d square matrices. We can write \(W^{[n]}\) in a more explicit 2 × 2 block-wise form as

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{[n]}= \begin{pmatrix} C^{[n]} &\displaystyle 0 \\ B^{[n]} &\displaystyle A^{[n]} \end{pmatrix}. \end{array} \end{aligned} $$
(2.31)

If one puts such a \(W^{[n]}\) into Eq. (2.30), it gives the summation of all terms of the form

$$\displaystyle \begin{aligned} \begin{array}{rcl} O &\displaystyle =&\displaystyle \sum_{n=1}^N A^{[1]} \otimes \cdots \otimes A^{[n-1]} \otimes B^{[n]} \otimes C^{[n+1]} \otimes \cdots \otimes C^{[N]} \\ &\displaystyle =&\displaystyle \sum_{n=1}^N \prod_{\otimes i=1}^{n-1} A^{[i]} \otimes B^{[n]} \otimes \prod_{\otimes j=n+1}^{N} C^{[j]}, {} \end{array} \end{aligned} $$
(2.32)

with N the total number of tensors and \(\prod_{\otimes}\) the tensor product.Footnote 7 Such a property can easily be generalized to a W formed by D × D blocks.

Fig. 2.11 The graphic representation of a projected entangled pair operator, where the upward and downward indexes represent the bra and ket space, respectively

Imposing Eq. (2.32), we can construct, as an example, the summation of one-site local terms, i.e., \(\sum_n X^{[n]}\),Footnote 8 with

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{[n]}= \begin{pmatrix} I &\displaystyle 0 \\ X^{[n]} &\displaystyle I \end{pmatrix}, \end{array} \end{aligned} $$
(2.33)

with X [n] a d × d matrix and I the d × d identity.

If two-body terms are included, such as \(\sum_m X^{[m]} + \sum_n Y^{[n]} Z^{[n+1]}\), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{[n]}= \begin{pmatrix} I &\displaystyle 0 &\displaystyle 0 \\ Z^{[n]} &\displaystyle 0 &\displaystyle 0 \\ X^{[n]} &\displaystyle Y^{[n]} &\displaystyle I \end{pmatrix}. \end{array} \end{aligned} $$
(2.34)

This can obviously be generalized to L-body terms. With open boundary conditions, the tensors on the two ends are the bottom block row and the first block column of W, respectively,

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{[1]}= \begin{pmatrix} X^{[1]} &\displaystyle Y^{[1]} &\displaystyle I \end{pmatrix}, \ \ \end{array} \end{aligned} $$
(2.35)
$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{[N]}= \begin{pmatrix} I \\ Z^{[N]} \\ X^{[N]} \end{pmatrix}, \end{array} \end{aligned} $$
(2.36)

which select the bottom-left block of the product in Eq. (2.32).

Now we apply the above technique to a Hamiltonian of, e.g., the Ising model in a transverse field,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{H} = \sum_n \hat{S}^{z}_n \hat{S}^{z}_{n+1} + h \sum_m \hat{S}^{x}_m. \end{array} \end{aligned} $$
(2.37)

Its MPO is given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{[n]}= \begin{pmatrix} I &\displaystyle 0 &\displaystyle 0 \\ \hat{S}^z &\displaystyle 0 &\displaystyle 0 \\ h \hat{S}^x &\displaystyle \hat{S}^z &\displaystyle I \end{pmatrix}. \end{array} \end{aligned} $$
(2.38)
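As a cross-check (our own sketch, assuming NumPy; the field value h = 0.7 and N = 4 are arbitrary), one can contract the MPO tensors of Eq. (2.38), with the boundary row and column of Eqs. (2.35)–(2.36), and compare with the Hamiltonian of Eq. (2.37) built directly from Kronecker products:

```python
import numpy as np

Sz = np.diag([0.5, -0.5])
Sx = np.array([[0.0, 0.5], [0.5, 0.0]])
I2, O = np.eye(2), np.zeros((2, 2))
h, N = 0.7, 4

W = np.array([[I2, O, O],
              [Sz, O, O],
              [h * Sx, Sz, I2]])               # Eq. (2.38), shape (3, 3, 2, 2)
Wl, Wr = W[2], W[:, 0]                          # bottom row / first column

row = Wl                                        # block row of operators on site 1
for _ in range(N - 2):                          # absorb the bulk tensors
    D = row.shape[1]
    row = np.einsum('aij,abkl->bikjl', row, W).reshape(3, 2 * D, 2 * D)
H_mpo = sum(np.kron(row[a], Wr[a]) for a in range(3))

def op_at(op, n):                               # op on site n, identity elsewhere
    out = np.eye(1)
    for m in range(N):
        out = np.kron(out, op if m == n else I2)
    return out

H = sum(op_at(Sz, n) @ op_at(Sz, n + 1) for n in range(N - 1))
H = H + h * sum(op_at(Sx, n) for n in range(N))
assert np.allclose(H_mpo, H)                    # the MPO reproduces Eq. (2.37)
```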

Such a way of constructing an MPO is very useful. Another example is the Fourier-transformed particle number operator of the Hubbard model in momentum space, \(\hat {n}_k = \hat {b}_k^{\dagger } \hat {b}_{k}\). The Fourier transformation is written as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{n}_k = \sum_{m,n=1}^{N} e^{i(m-n)k} \hat{b}_m^{\dagger} \hat{b}_{n}, \end{array} \end{aligned} $$
(2.39)

with \(\hat {b}_n\) (\(\hat {b}_n^{\dagger }\)) the annihilation (creation) operator on the n-th site. The MPO representation of such a Fourier transformation is given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{W}_n = \begin{pmatrix} \hat{I} &\displaystyle 0 &\displaystyle 0 &\displaystyle 0 \\ \hat{b}^{\dagger} &\displaystyle e^{ik} \hat{I} &\displaystyle 0 &\displaystyle 0 \\ \hat{b} &\displaystyle 0 &\displaystyle e^{-ik} \hat{I} &\displaystyle 0 \\ \hat{b}^{\dagger} \hat{b} &\displaystyle e^{+ik} \hat{b}^{\dagger} &\displaystyle e^{-ik} \hat{b} &\displaystyle \hat{I} \end{pmatrix}, \end{array} \end{aligned} $$
(2.40)

with \(\hat {I}\) the identity operator in the corresponding Hilbert space.

The MPO formulation also allows for a convenient and efficient representation of Hamiltonians with longer-range interactions [54]. The geometrical bond dimension will in principle increase with the interaction range. Surprisingly, only a small dimension is needed to approximate Hamiltonians with long-range interactions that decay polynomially [46].

Besides, an MPO can be used to represent the time evolution operator \(\hat {U}(\tau ) = e^{-\tau \hat {H}}\) with the Trotter–Suzuki decomposition, where τ is a small positive number called the Trotter–Suzuki step [55, 56]. Such an MPO is very useful for calculating real-, imaginary-, or even complex-time evolutions, which we will present later in detail. An MPO can also represent a mixed state.

Similarly, PEPS can also be generalized to projected entangled pair operator (PEPO , Fig. 2.11), which on a square lattice, for instance, can be written as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{O} = \sum_{\{s,a\}} \prod_n W_{s_n s_n^{\prime}, a_n^1 a_n^2 a_n^3 a_n^4}^{[n]} |s_n\rangle \langle s_n^{\prime}|. {} \end{array} \end{aligned} $$
(2.41)

Each tensor has two physical indexes (bra and ket) and four geometrical indexes. Each geometrical bond is shared by two adjacent tensors and will be contracted.

2.2.7 Tensor Network for Quantum Circuits

A special class of TNs is that of quantum circuits [57]. Quantum circuits encode computations made on qubits (or, in general, qudits). Figure 2.12 shows the TN representation of a quantum circuit made of unitary gates that act on a product state of many constituents initialized as ∏|0〉.

Fig. 2.12 The TN representation of a quantum circuit. Two-body unitaries act on a product state of a given number of constituents |0〉⊗⋯ ⊗|0〉 and transform it into a target entangled state |ψ〉

An Example of Quantum Circuits

In order to make contact with TNs, we will consider the specific case of quantum circuits where all the gates act on at most two neighbors. An example of such a circuit is the Trotterized evolution of a system described by a nearest-neighbor Hamiltonian \(\hat {H}=\sum _{i} \hat {h}_{i,i+1}\), where i, i + 1 label the neighboring constituents of a one-dimensional system. The evolution operator for a time t is \(\hat {U}(t)=\exp (-i\hat {H}t)\), and can be decomposed into a sequence of infinitesimal time evolution steps [58] (more details will be given in Sect. 3.1.3)

$$\displaystyle \begin{aligned} \hat{U}(t)=\lim_{N\to\infty} \left[\exp\left(-i\frac{t}{N}\hat{H}\right)\right]^{N}. \end{aligned} $$
(2.42)

In this limit, we can decompose the evolution into a product of two-body evolution gates,

$$\displaystyle \begin{aligned} \hat{U}(t)=\lim_{\tau \to 0}\prod_{i,i+1} \hat{U}(\tau)_{i,i+1}, \end{aligned} $$
(2.43)

where \(\hat {U}_{i,i+1}(\tau )=\exp (-i\tau \hat {h}_{i,i+1})\) and τ = t∕N. This is obviously a quantum circuit made of two-qubit gates with depth N. Conversely, any quantum circuit naturally possesses an arrow of time: it transforms a product state into an entangled state through a sequence of two-body gates.
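A minimal sketch of a single two-body Trotter gate (ours; NumPy and SciPy assumed, with an illustrative Heisenberg-type two-site term standing in for \(\hat{h}_{i,i+1}\)):

```python
import numpy as np
from scipy.linalg import expm

Sz = np.diag([0.5, -0.5])
Sx = np.array([[0.0, 0.5], [0.5, 0.0]])
Sy = np.array([[0.0, -0.5j], [0.5j, 0.0]])

h2 = sum(np.kron(S, S) for S in (Sx, Sy, Sz))    # nearest-neighbor term
tau = 0.01                                       # small step t/N

U = expm(-1j * tau * h2)                         # two-qubit gate U(tau)_{i,i+1}
gate = U.reshape(2, 2, 2, 2)                     # fourth-order tensor in the TN
assert np.allclose(U @ U.conj().T, np.eye(4))    # the gate is unitary
```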

Causal Cone

One interesting concept in a quantum circuit is that of the causal cone, illustrated in Fig. 2.13, which becomes explicit with the TN representation. Given a quantum circuit that prepares (i.e., evolves the initial state to) the state |ψ〉, we can ask a question: which subset of the gates affects the reduced density matrix of a certain subregion A of |ψ〉? This can be seen by constructing the reduced density matrix of the subregion A, \(\rho _{A}=tr_{\bar {A}}|\psi \rangle \langle \psi |\), with \(\bar {A}\) the rest of the system besides A.

Fig. 2.13 (a) The past causal cone of the red site. The unitary gate \(U_5\) does not affect the reduced density matrix of the red site. This is verified by computing ρ_A explicitly by tracing over all the other constituents. (b) In the TN of ρ_A, \(U_5\) is contracted with \(U^{\dagger }_5\), which gives an identity

The TN of the reduced density matrix is formed by the set of unitaries that define the past causal cone of the region A (see the area between the green lines in Fig. 2.13). The remaining unitaries (for instance, \(\hat {U}_5\) and its conjugate in the right sub-figure of Fig. 2.13) are eliminated from the TN of the reduced density matrix. The contraction of the causal cone can thus be rephrased in terms of the multiplication of a set of transfer matrices, each performing the computation from t to t − 1. The maximal width of these transfer matrices defines the width of the causal cone, which can be used as a good measure of the complexity of computing ρ_A [59]. The best computational strategy one can find to compute ρ_A exactly will indeed always scale exponentially with the width of the cone [57].

Unitary Tensor Networks and Quantum Circuits

The simplest TN, the MPS, can be interpreted as a sequential quantum circuit [60]. The idea is that one can think of the MPS as a sequential interaction between each constituent (a d-level system) and an ancillary D-level system (the auxiliary qudit, red bonds). The first constituent interacts (say the bottom one shown in Fig. 2.14), and then all the constituents sequentially interact with the same D-level system. With this choice, the past causal cone of a constituent is made of all the MPS matrices below it. Interestingly, in the MPS case the causal cone can be changed using gauge transformations (see Sect. 2.4.2), something very different from what happens in two-dimensional TNs. This amounts to finding appropriate unitary transformations acting on the auxiliary degrees of freedom that reorder the interactions between the D-level system and the constituents. In such a way, a desired constituent can be made to interact first, followed by the others. An example of the causal cone in the center gauge used in iDMRG calculations [61] is presented in Fig. 2.15. This idea allows one to minimize the number of tensors in the causal cone of a given region. However, the scaling of the computational cost of the contraction is not affected by such a temporal reordering of the TN, since in this case the width of the cone is bounded by one unitary in any gauge. The gauge choice just changes the number of computational steps required to construct the desired ρ_A. In the case that A includes non-consecutive constituents, the width of the cone increases linearly with the number of constituents, and the complexity of computing ρ_A increases exponentially with it.

Fig. 2.14 The MPS as a quantum circuit. Time flows from right to left, so that the lowest constituent is the first to interact with the auxiliary D-level system. Here we show the past causal cone of a single constituent. Similarly, the past causal cone of a region A made of adjacent constituents has the same form, starting from the upper boundary of A

Fig. 2.15 Using the gauge degrees of freedom of an MPS, we can modify its past causal cone structure to make the cone as small as possible, thereby decreasing the computational cost of actually computing a specific ρ_A. A convenient choice is the center gauge used in iDMRG

Again, the gauge degrees of freedom can be used to modify the structure of the past causal cone of a certain spin. As an example, the iDMRG center gauge is represented in Fig. 2.15.

An example of a TN with a larger past causal cone can be obtained by using more than one layer of interactions. Now the support of the causal cone becomes larger, since it includes transfer matrices acting on two D-level systems (red bonds shown in Fig. 2.16). Notice that this TN has loops, but it is still exactly contractible since the width of the causal cone remains finite.

Fig. 2.16 The width of the causal cone increases as we increase the depth of the quantum circuit generating the MPS

2.3 Tensor Networks that Can Be Contracted Exactly

2.3.1 Definition of Exactly Contractible Tensor Network States

The notion of the past causal cone can be used to classify TNSs based on the complexity of computing their contractions. It is important to remember that the complexity strongly depends on the object that we want to compute, not just on the TN. For example, the cost of contracting an MPS for an N-qubit state (say, computing its norm) scales only linearly with N. However, to compute the n-site reduced density matrix, the cost scales exponentially with n, since the matrix itself is an exponentially large object. Here we consider computing scalar quantities, such as the observables of one- and two-site operators.

We define a TNS to be exactly contractible when its contractions can be computed with a cost that is polynomial in the elementary tensor dimension D. A more rigorous definition can be given in terms of the tree width; see, e.g., [57]. From the discussion of the previous section, it is clear that such a TNS corresponds to a bounded causal cone for the reduced density matrix of a local subregion. In order to show this, we now focus on the cost of computing the expectation values of local operators and their correlation functions for a few examples of TNSs.

The relevant objects are thus the reduced density matrix of a region A made of a few consecutive spins, and the reduced density matrix of two disjoint blocks \(A_1\) and \(A_2\), each made of a few consecutive spins. Once we have the reduced density matrices of such regions, we can compute arbitrary expectation values of local operators by \(\langle \mathscr {O}\rangle =tr(\rho _{A}\mathscr {O})\) and \(\langle {\mathscr {O}}_{A_{1}}\mathscr {O}^{\prime }_{A_{2}}\rangle =tr(\rho _{A_{1}\cup A_{2}}\mathscr {O}_{A_{1}}\mathscr {O}^{\prime }_{A_{2}})\), with \(\mathscr {O}_{A}\), \({\mathscr {O}}_{A_{1}}\), \(\mathscr {O}^{\prime }_{A_{2}}\) arbitrary operators defined on the regions A, \(A_1\), \(A_2\).

2.3.2 MPS Wave-Functions

The simplest example of the computation of the expectation value of a local operator is obtained by considering MPS wave-functions [8, 62]. Figure 2.17 shows an MPS in the left-canonical form (see Sect. 5.1.3 for more details). Rather than drawing the arrows of time, here we draw the directions in which the tensors of the TN are isometric. In other words, an identity is obtained by contracting the inward bonds of a tensor in |ψ〉 with the outward bonds of its conjugate in 〈ψ| (Fig. 2.18). Note that |ψ〉 and 〈ψ| have opposite arrows, by definition. These arrows are drawn directly on the legs of the tensors. Comparing Fig. 2.14 with Fig. 2.18, the arrows in |ψ〉 point in the direction opposite to the time; the two figures indeed represent the MPS in the same gauge. This means that the causal cone of an observable is on the right of that observable, as shown on the second line of Fig. 2.18, where all the tensors on the left side are annihilated as a consequence of the isometric constraints. We immediately see that the causal cone has at most a width of two. The contraction becomes a power of the transfer operator of the MPS, \(E=\sum_i A_i \otimes A_i^{\ast}\), where \(A_i\) and \(A_i^{\ast}\) denote the MPS tensor (for physical index i) and its complex conjugate. The MPS transfer matrix E acts only on two auxiliary degrees of freedom. Using the property that E is a completely positive map and thus has a fixed point [8], we can substitute the large power of the transfer operator by its dominant eigenvector v, leading to the final TN diagram that encodes the expectation value of a local operator.
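The transfer-operator construction can be sketched in a few lines (our own illustration, assuming NumPy; the random tensor stands in for a generic MPS tensor):

```python
import numpy as np

d, chi = 2, 4
A = np.random.rand(d, chi, chi)                     # A[i] is a chi x chi matrix

E = sum(np.kron(A[i], A[i].conj()) for i in range(d))   # E = sum_i A_i (x) A_i*
w, V = np.linalg.eig(E)
order = np.argsort(-np.abs(w))
v = V[:, order[0]]                                  # dominant eigenvector (fixed point)

# two-point correlators decay as (|w_2| / |w_1|)**r with the distance r
xi = -1.0 / np.log(np.abs(w[order[1]] / w[order[0]]))
print(xi)                                           # the correlation length
```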

Fig. 2.17 The MPS wave-function representation in left-canonical form

Fig. 2.18 The expectation value of a single-site operator with an MPS wave-function

In Fig. 2.19, we show the TN representation of a two-point correlation function. Obviously, the past causal cone width is bounded by two auxiliary sites. Note that in the second line, the directions of the arrows on the right side are changed. This in general does not happen in more complicated TNs, as we will see in the next subsection. Before going there, we would like to comment on the properties of the two-point correlation functions of an MPS. From the calculation we have just performed, we see that they are encoded in powers of the transfer matrix, which evolves the system in real space. If that matrix can be diagonalized, we immediately see that the correlation functions naturally decay exponentially, with a rate given by the ratio of the second to the first eigenvalue. Related details can be found in Sect. 5.4.2.

Fig. 2.19 Two-point correlation function of an MPS wave-function

2.3.3 Tree Tensor Network Wave-Functions

An alternative kind of wave-function is the TTNS [63,64,65,66,67,68,69]. In a TTNS, one can add a physical bond to each of the tensors and use it as a many-body state defined on a Cayley-tree lattice [63]. Here, we will focus on the TTNS with physical bonds only on the outer leaves of the tree.

Calculations with a TTNS normally correspond to the contraction of tree TNs. A specific case of a two-to-one TTNS, named the binary Cayley tree, is illustrated in Fig. 2.20. This TN can be interpreted as a quantum state of multiple spins with different boundary conditions. It can also be considered a hierarchical TN, in which each layer corresponds to a different level of a coarse-graining renormalization group (RG) transformation [64]. In the figure, different layers are colored differently. In the first layer, each tensor groups two spins into one, and so on. The tree TN can thus be interpreted as a specific RG transformation. Once more, the arrows on the tensors indicate the isometric property of each individual tensor; their directions are opposite to the time if we interpret the tree TN as a quantum circuit. Note again that |ψ〉 and 〈ψ| have opposite arrows, by definition.

Fig. 2.20 A binary TTNS made of several layers of third-order tensors. Different layers are identified with different colors. The arrows flow in the direction opposite to the time when the TN is interpreted as a quantum circuit

The expectation value of a one-site operator is in fact a tree TN, shown in Fig. 2.21. We see that many of the tensors are completely contracted with their Hermitian conjugates, which simply gives identities. What is left is again a bounded causal cone. If we now build an infinite TTNS made of infinitely many layers and assume scale invariance, the multiplication of infinitely many powers of the scale transfer matrix can be substituted by the corresponding fixed point, leading to a very simple expression for the TN that encodes the expectation value of a single-site operator.

Fig. 2.21 The expectation value of a local operator of a TTNS. After applying the isometric properties of the tensors, the past causal cone of a single site has a bounded width. The calculation again boils down to a calculation of transfer matrices; this time the transfer matrices evolve between different layers of the tree

Similarly, if we compute the correlation function of local operators at a given distance, as shown in Fig. 2.22, we can once more get rid of the tensors outside the causal cone. Strictly speaking, the causal cone width now increases to four sites, since it consists of two different two-site branches. However, if we order the contractions as shown in the middle, we see that they boil down again to a two-site causal cone. Interestingly, since the computation of two-point correlations at very large distances involves powers of transfer matrices that translate in scale rather than in space, one would expect these matrices to be all the same (as a consequence of scale invariance, for example). Thus, we would get polynomially decaying correlations [70].

Fig. 2.22 The computation of the correlation function of two operators separated by a given distance boils down to the computation of a certain power of transfer matrices. The contraction of the causal cone can be simplified in a sequential way, as depicted in the last two sub-figures

2.3.4 MERA Wave-Functions

Until now, we have discussed TNs that contain no loops, even though they can be embedded in a 2D space. In the context of network complexity theory, they are called mean-field networks [71]. However, there are also TNs with loops that are exactly contractible [57]. A particular case is that of the 1D MERA (and its generalizations) [72,73,74,75,76]. The MERA is again a TN that can be embedded in a 2D plane, and it is full of loops, as seen in Fig. 2.23. This TN has a very peculiar structure, again inspired by RG transformations [77]. The MERA can also be interpreted as a quantum circuit where the time evolves radially along the network, once more opposite to the arrows that indicate the direction along which the tensors are unitary. The MERA is a layered TN, where each layer (in different colors in the figure) is composed of the appropriate contraction of some third-order tensors (isometries) and some fourth-order tensors (disentanglers). The concrete form of the network is not really important [76]. In this specific case, we are plotting the two-to-one MERA that was discussed in the original version of Ref. [75]. Interestingly, an operator defined on at most two sites gives a bounded past causal cone, as shown in Figs. 2.24 and 2.25.

Fig. 2.23 The TN of the MERA. The MERA has a hierarchical structure consisting of several layers of disentanglers and isometries. The computational time flows radially from the center towards the edge when considering the MERA as a quantum circuit. The unitary and isometric tensors and the network geometry are chosen in order to guarantee that the width of the causal cone is bounded

Fig. 2.24 Past causal cone of a single-site operator for a MERA

Fig. 2.25 Two-point correlation function in the MERA

As in the case of the TTNS, we can indeed perform the explicit calculation of the past causal cone of a single-site operator (Fig. 2.24). There we show the TN contraction of the required expectation value, and then simplify it by taking into account the contractions of the unitary and isometric tensors outside the causal cone, which has a bounded width involving at most four auxiliary constituents.

The calculation of a two-point correlation function of local operators follows a similar idea and leads to the contraction shown in Fig. 2.25. Once more, we see that the computation of the two-point correlation function can be done exactly due to the bounded width of the corresponding causal cone.

2.3.5 Sequentially Generated PEPS Wave-Functions

The MERA and TTNS can be generalized to two-dimensional lattices [64, 74]. The generalization of MPS to 2D, on the other hand, gives rise to PEPS. In general, PEPS belongs to the class of 2D TNs that cannot be contracted exactly [24, 78].

However, for a subclass of PEPS called sequentially generated PEPS [79], the contraction can be implemented exactly. Differently from the MERA, where the computation of the expectation value of any sufficiently local operator leads to a bounded causal cone, a sequentially generated PEPS has a central site, and the local observables around the central site can be computed easily; the local observables in other regions of the TN give larger causal cones. For example, in Fig. 2.26a we represent a sequentially generated PEPS for a 3 × 3 lattice. The norm of the state is computed in (b), where the TN boils down to the norm of the central tensor. Some of the reduced density matrices of the system are also easy to compute, in particular those of the central site and its neighbors (Fig. 2.27a). Other reduced density matrices, such as those of spins close to the corners, are much harder to compute. As illustrated in Fig. 2.27b, the causal cone of a corner site in a 3 × 3 PEPS has a width 2. In general, for an L × L PEPS, the causal cone of a corner would have a width L∕2.

Fig. 2.26 (a) A sequentially generated PEPS. All tensors but the central one (green in the figure) are isometries, from the in-going bonds (marked with ingoing arrows) to the outgoing ones. The central tensor represents a normalized vector in the Hilbert space constructed from the physical Hilbert space and the four copies of auxiliary spaces, one for each of its legs. (b) The norm of such a PEPS, after implementing the isometric constraints, boils down to the norm of its central tensor

Fig. 2.27 (a) The reduced density matrix of a sequentially generated PEPS for two consecutive spins (one of them being the central spin). (b) The reduced density matrix of a local region far from the central site is generally hard to compute, since it can give rise to an arbitrarily large causal cone. For the reduced density matrix of any of the corners of an L × L PEPS, which is the most expensive case, it leads to a causal cone with a width up to L∕2. This means the computation is exponentially expensive in the size of the system

Differently from MPS, the causal cone of a PEPS cannot be transformed by performing a gauge transformation. However, as first observed by F. Cucchietti (private communication), one can try to approximate a PEPS with a given causal cone by another one with a different causal cone, for example by moving the central site. This is not an exact operation, and the approximations involved in such a transformation need to be addressed numerically; the effects of these approximations have been studied systematically in [80, 81]. In general, we have to say that the contraction of a PEPS wave-function can only be performed exactly with exponential resources. Therefore, efficient approximate contraction schemes are necessary to deal with PEPS.

2.3.6 Exactly Contractible Tensor Networks

Above, we have considered, from the perspective of quantum circuits, whether a TNS can be contracted exactly according to the width of its causal cones. Below, we reconsider this issue from the aspect of the TN itself.

Normally, a TN cannot be contracted without approximation. Let us consider a square TN, as shown in Fig. 2.28. We start by contracting an arbitrary bond in the TN (yellow shadow). Consequently, we obtain a new tensor with six bonds that contains \(\chi^6\) parameters (χ is the bond dimension). To proceed, the bonds adjacent to this tensor are probably a good choice to contract next; we will then have to store a new tensor with eight bonds. As the contraction goes on, the number of bonds increases linearly with the boundary of the contracted area, and thus the memory increases exponentially as \(O(\chi^{L})\), with L the length of the boundary. For this reason, it is impossible in practice to exactly contract a TN of even moderate size, and approximations are inevitable. This computational difficulty is closely related to the area law of entanglement entropy [17] (see also Sect. 2.4.3), or to the width of the causal cone as in the case of PEPS. Below, we give three examples of TNs that can be contracted exactly.

Fig. 2.28 If one starts by contracting an arbitrary bond, a tensor with six bonds results. As the contraction goes on, the number of bonds increases linearly with the boundary of the contracted area, and thus the memory increases exponentially as \(O(\chi^{L})\), with χ the bond dimension and L the boundary length

Tensor Networks on Tree Graphs

We here consider a scalar tree TN (Fig. 2.29a) with \(N_L\) layers of third-order tensors. Some vectors are put on the outermost boundary. An example of what a tree TN may represent is an observable of a TTNS. A tree TN is written as

$$\displaystyle \begin{aligned} \begin{array}{rcl} Z = \sum_{\{a\}} \prod_{n=1}^{N_L} \prod_{m=1}^{M_n} T^{[n,m]}_{a_{n,m,1},a_{n,m,2},a_{n,m,3}} \prod_k v^{[k]}_{a_k}, \end{array} \end{aligned} $$
(2.44)

with \(T^{[n,m]}\) the m-th tensor in the n-th layer, \(M_n\) the number of tensors in the n-th layer, and \(v^{[k]}\) the k-th vector on the boundary.

Fig. 2.29 Two kinds of TNs that can be exactly contracted: (a) tree and (b) fractal TNs. In (b), the shadow shows the Sierpiński gasket, where the tensors are defined on the triangles

Now we contract each tensor in the \(N_L\)-th layer with the corresponding two vectors on the boundary as

$$\displaystyle \begin{aligned} \begin{array}{rcl} v_{a_3}^{\prime} = \sum_{a_1a_2} T^{[N_L, m]}_{a_1a_2a_3} v^{[k_1]}_{a_1} v^{[k_2]}_{a_2}. \end{array} \end{aligned} $$
(2.45)

After the vectors are updated by the equation above, the number of layers of the tree TN becomes \(N_L - 1\). The whole tree TN can be contracted exactly by repeating this procedure, as sketched below.
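One leaf-contraction step of Eq. (2.45) reads, in a sketch of ours (NumPy assumed):

```python
import numpy as np

chi = 3
T = np.random.rand(chi, chi, chi)            # a tensor on the outermost layer
v1, v2 = np.random.rand(chi), np.random.rand(chi)

v_new = np.einsum('abc,a,b->c', T, v1, v2)   # absorb two boundary vectors
# repeating this for every tensor of the layer replaces the N_L-th layer by
# new boundary vectors, leaving a tree with N_L - 1 layers
```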

We can see from the above contraction that if the graph does not contain any loops, i.e., has a tree-like structure, the dimensions of the tensors obtained during the contraction will not increase unboundedly. Therefore, a TN defined on such a graph can be contracted exactly. This is again related to the area law of entanglement entropy that a loop-free TN satisfies: to separate a tree-like TN into two disconnected parts, only one bond needs to be cut. Thus, the upper bound of the entanglement entropy between these two parts is a constant, determined by the dimension of the bond that is cut. This is also consistent with the analysis based on the maximal width of the causal cones.

Tensor Networks on Fractals

Another example that can be contracted exactly is the TN defined on the fractal called the Sierpiński gasket (Fig. 2.29b) (see, e.g., [82, 83]). Such a TN can represent the partition function of a statistical model defined on the Sierpiński gasket, such as the Ising or Potts model. The tensor is then given by the probability distribution (Boltzmann weight) of the three spins in a triangle.

Such a TN can be contracted exactly by iteratively contracting each group of three tensors located in the same triangle as

$$\displaystyle \begin{aligned} \begin{array}{rcl} T_{a_1a_2a_3}^{\prime} = \sum_{b_1b_2b_3} T_{a_1b_1b_2} T_{a_2b_2b_3} T_{a_3b_3b_1}. \end{array} \end{aligned} $$
(2.46)

After each round of contractions, the dimension of the tensors and the geometry of the network remain unchanged, but the number of tensors in the TN decreases from N to N∕3. This means we can contract the whole TN exactly by repeating the above process.
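The coarse-graining step of Eq. (2.46) is a single contraction (a sketch of ours, assuming NumPy):

```python
import numpy as np

chi = 2
T = np.random.rand(chi, chi, chi)
# contract the three inner bonds of a triangle of tensors; the result is again
# a third-order tensor with the same bond dimension, so the step can be iterated
T_new = np.einsum('aij,bjk,cki->abc', T, T, T)
```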

Algebraically Contractible Tensor Networks

The third example is the class of algebraically contractible TNs [84, 85]. The tensors that form such a TN possess special algebraic properties, so that even though the bond dimensions increase after each contraction, the rank of the bonds remains unchanged. This means one can introduce projectors that lower the bond dimension without causing any error.

The simplest algebraically contractible TN is the one formed by the super-diagonal tensor I defined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{a_1,\cdots, a_N}&=& \left\{ \begin{array}{lll} 1, \ \ a_1 = \cdots = a_N, \\ 0, \ \ \text{otherwise}. \end{array} \right. \end{array} \end{aligned} $$
(2.47)

I is also called the copy tensor, since it forces all of its indexes to take the same value.

For a square TN of arbitrary size formed by fourth-order copy tensors I, the contraction obviously gives Z = d, with d the bond dimension: the contraction is a summation over only d non-zero terms, each equal to 1.
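This can be checked directly; the sketch below contracts a small 2 × 2 square TN of copy tensors with periodic boundaries (the einsum pattern encodes the torus geometry, where every index letter appears on the two ends of one bond):

```python
import numpy as np

d = 3
# Fourth-order copy tensor: 1 iff all four indexes coincide.
I4 = np.zeros((d, d, d, d))
for b in range(d):
    I4[b, b, b, b] = 1.0

# 2x2 square TN on a torus; since the network is connected, the copy
# tensors force every bond to the same value, so Z = d.
Z = np.einsum('abcd,baef,ghdc,hgfe->', I4, I4, I4, I4)
print(Z)  # prints d
```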

To demonstrate its contraction, we will need one important property of the copy tensor (Fig. 2.30): if there are n ≥ 1 bonds contracted between two copy tensors, the contraction gives a copy tensor

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_{a_1 \cdots b_1 \cdots} = \sum_{c_1 \cdots} I_{a_1\cdots c_1 \cdots} I_{b_1 \cdots c_1 \cdots}. \end{array} \end{aligned} $$
(2.48)

This property is called the fusion rule, and can be understood in the opposite way: a copy tensor can be decomposed as the contraction of two copy tensors.

Fig. 2.30
figure 30

The fusion rule of the copy tensor: the contraction of two copy tensors of N_1-th and N_2-th order gives a copy tensor of (N_1 + N_2 − 2N)-th order, with N the number of contracted bonds

With the fusion rule, one readily obtains the property used for dimension reduction: if there are n ≥ 1 bonds contracted between two copy tensors, the contraction is unchanged after replacing the n bonds with a single bond,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{c_1 \cdots c_n} I_{a_1\cdots c_1 \cdots c_n} I_{b_1 \cdots c_1 \cdots c_n} = \sum_{c} I_{a_1\cdots c} I_{b_1 \cdots c}. \end{array} \end{aligned} $$
(2.49)

In other words, the dimension of the contracted bonds can be exactly reduced from χ^n to χ. Applied to TN contraction, this means that each time the bond dimension increases after contracting several tensors into one, the dimension can be exactly reduced back to χ, so that the contraction can continue until all bonds are contracted.
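The dimension reduction of Eq. (2.49) can also be verified numerically; the sketch below checks it for two fourth-order copy tensors sharing n = 2 bonds (the helper copy_tensor is an illustrative name):

```python
import numpy as np

def copy_tensor(order, d):
    # Super-diagonal tensor of the given order, Eq. (2.47).
    I = np.zeros((d,) * order)
    for b in range(d):
        I[(b,) * order] = 1.0
    return I

d = 2
I4, I3 = copy_tensor(4, d), copy_tensor(3, d)

# Eq. (2.49): contracting two bonds between two fourth-order copy
# tensors equals contracting a single bond between third-order ones.
lhs = np.einsum('abcd,efcd->abef', I4, I4)
rhs = np.einsum('abc,efc->abef', I3, I3)
print(np.allclose(lhs, rhs))  # True
```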

From the TN of copy tensors, a class of exactly contractible TNs can be defined, where the local tensor is the multiplication of the copy tensor by several unitary tensors. Taking the square TN as an example, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} T_{a_1a_2a_3a_4} = \sum_{b_1b_2b_3b_4} X_{b_1} I_{b_1b_2b_3b_4} U_{a_1b_1} V_{a_2b_2} U^{\ast}_{a_3b_3} V^{\ast}_{a_4b_4}, \end{array} \end{aligned} $$
(2.50)

with U and V two unitary matrices. X is an arbitrary d-dimensional vector that can be understood as a vector of "weights" (which need not be positive for the tensor to be defined). After the tensors are contracted in the TN, all unitary matrices cancel to identities. One can then use the fusion rule of the copy tensor to exactly contract the TN, and the contraction gives \(Z=\sum _b (X_{b})^{N_T}\) with N_T the total number of tensors.
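The cancellation of the unitaries can be verified numerically. The sketch below builds T from Eq. (2.50) with random U and V (generated by QR, an illustrative choice) and contracts a 2 × 2 torus; the index order (right, down, left, up) and the helper rand_unitary are assumptions of this sketch:

```python
import numpy as np

d = 2
rng = np.random.default_rng(3)

def rand_unitary(n):
    # A random unitary from the QR decomposition of a complex Gaussian.
    q, _ = np.linalg.qr(rng.standard_normal((n, n))
                        + 1j * rng.standard_normal((n, n)))
    return q

I4 = np.zeros((d, d, d, d))
for b in range(d):
    I4[b, b, b, b] = 1.0

U, V = rand_unitary(d), rand_unitary(d)
X = rng.standard_normal(d)          # arbitrary real weight vector

# Eq. (2.50): dress the copy tensor with weights and unitaries;
# index order of T is (right, down, left, up).
T = np.einsum('ijkl,i,ai,bj,ck,dl->abcd', I4, X, U, V, U.conj(), V.conj())

# On the 2x2 torus every U meets a U* and every V meets a V* across a
# bond, so they cancel to identities and only the copy tensors remain.
Z = np.einsum('abcd,ceaf,gdhb,hfge->', T, T, T, T)
print(np.allclose(Z, np.sum(X**4)))  # Z = sum_b (X_b)^{N_T}, N_T = 4
```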

The unitary matrices are physically non-trivial. If we take d = 2 and

$$\displaystyle \begin{aligned} U=V= \begin{bmatrix} \sqrt{2}/2 & \sqrt{2}/2 \\ \sqrt{2}/2 & -\sqrt{2}/2 \\ \end{bmatrix}, \end{aligned} $$
(2.51)

the TN is in fact the inner product of the Z_2 topological state (see the definition of the Z_2 PEPS in Sect. 2.2.3). If one cuts the system into two subregions, all the unitary matrices cancel to identities inside the bulk. However, those on the boundary survive, which can lead to exotic properties such as topological orders, edge states, and so on. Note that the Z_2 state is only a special case; a systematic picture is given by the string-net states of X. G. Wen and collaborators [27,28,29].

2.4 Some Discussions

2.4.1 General Form of Tensor Network

One can see that a TN (state or operator) is defined as the contraction of a set of tensors {T^{[n]}}, with the general form

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathscr{T}_{\{s\}} = \sum_{\{a\}} \prod_{n} T^{[n]}_{s^n_1s^n_2 \cdots,a^n_1 a^n_2 \cdots}. {} \end{array} \end{aligned} $$
(2.52)

The indexes {a} are geometrical indexes, each of which is normally shared by two tensors and will be contracted. The indexes {s} are open bonds, each of which belongs to only one tensor. After contracting all the geometrical indexes, the TN represents an \(\mathscr {N}\)-th order tensor, with \(\mathscr {N}\) the total number of open indexes {s}.

Each tensor in the TN can possess a different number of open or geometrical indexes. For an MPS, each tensor has one open index (called the physical bond) and two geometrical indexes; for a PEPS on the square lattice, each tensor has one open and four geometrical indexes. For the generalizations to operators, each tensor has two open indexes. Hierarchical structures of the TN, such as TTNS and MERA, are also allowed.

One special kind of TN is the scalar TN, which has no open bonds and is denoted as

$$\displaystyle \begin{aligned} \begin{array}{rcl} Z = \sum_{\{a\}} \prod_{n} T^{[n]}_{a^n_1 a^n_2 \cdots}. {} \end{array} \end{aligned} $$
(2.53)

It is very important because many physical problems can be transformed into computing the contractions of scalar TNs. A scalar TN can be obtained from a TN that has open bonds, e.g., as \(Z = \sum _{\{s\}} \mathscr {T}_{\{s\}}\) or \(Z = \sum _{\{s\}} \mathscr {T}_{\{s\}}^{\dagger } \mathscr {T}_{\{s\}}\), where Z can be the cost function (e.g., energy or fidelity) to be maximized or minimized. TN contraction algorithms mainly deal with scalar TNs.
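As a concrete example of a scalar TN, the norm ⟨ψ|ψ⟩ of a finite MPS can be contracted by sweeping transfer matrices from left to right; the following numpy sketch uses random tensors standing in for a physical MPS:

```python
import numpy as np

d, chi, N = 2, 4, 6
rng = np.random.default_rng(4)

# A finite MPS |psi> with tensors A[n] of shape (physical, left, right).
A = ([rng.standard_normal((d, 1, chi))]
     + [rng.standard_normal((d, chi, chi)) for _ in range(N - 2)]
     + [rng.standard_normal((d, chi, 1))])

# Z = <psi|psi> is a scalar TN: every physical and virtual bond appears
# twice and is summed over. Sweep the boundary "environment" E.
E = np.ones((1, 1))                 # (bra left bond) x (ket left bond)
for An in A:
    # E_{l m} A_{s l r} A_{s m q} -> E_{r q}
    E = np.einsum('lm,slr,smq->rq', E, An, An)
Z = E.item()
print(Z)
```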

2.4.2 Gauge Degrees of Freedom

For a given state, its TN representation is not unique. Let us take a translationally invariant MPS as an example. One may insert a (full-rank) matrix U and its inverse U^{−1} on each of the virtual bonds and then contract them, respectively, into the two neighboring tensors. The tensors of the new MPS become \(\tilde {A}^{[n]}_{s, a a'} = \sum _{bb'} U_{ab} A^{[n]}_{s, b b'} U^{-1}_{a' b'}\). In fact, we have only inserted an identity I = UU^{−1}, thus no change is made to the MPS itself. However, the tensors that form the MPS change, meaning the TN representation changes. The same holds when inserting a matrix and its inverse on any virtual bond of a TN state, which changes the tensors without changing the state itself. Such degrees of freedom are known as the gauge degrees of freedom, and the corresponding transformations are called gauge transformations.
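The following sketch illustrates this on a two-site MPS: the local tensors change while the represented state does not (the placement of U and U^{−1} here is one possible convention, chosen for simplicity rather than to match the index placement of the formula above):

```python
import numpy as np

d, chi = 2, 3
rng = np.random.default_rng(5)

# Two MPS tensors sharing one virtual bond (shape: physical, left, right).
A1 = rng.standard_normal((d, 1, chi))
A2 = rng.standard_normal((d, chi, 1))

# Insert U^{-1} U = I on the shared bond and absorb the two factors.
U = rng.standard_normal((chi, chi))      # almost surely full rank
A1t = np.einsum('slb,ba->sla', A1, np.linalg.inv(U))
A2t = np.einsum('ab,sbr->sar', U, A2)

# The tensors changed, but the represented state did not.
psi  = np.einsum('sab,tbc->st', A1, A2)
psit = np.einsum('sab,tbc->st', A1t, A2t)
print(np.allclose(psi, psit))  # True
```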

On the one hand, the gauge degrees of freedom may cause instabilities in TN simulations; algorithms for finite and infinite PEPS have been proposed to fix the gauge and reach higher stability [86,87,88]. On the other hand, one may use gauge transformations to bring a TN state into a special form so that, for instance, one can truncate the local basis while minimizing the error non-locally [45, 89] (we will come back to this issue later). Moreover, gauge transformations are closely related to other theoretical properties, such as the global symmetry of TN states, which has been used to derive more compact TN representations [90], to classify many-body phases [91, 92], and to characterize non-conventional orders [93, 94], just to name a few.

2.4.3 Tensor Network and Quantum Entanglement

The numerical methods based on TNs face great challenges, primarily because the dimension of the Hilbert space increases exponentially with the system size. Such an "exponential wall" has been treated in different ways by many numerical algorithms, including DFT methods [95] and QMC approaches [96].

The power of TNs has been understood in the sense of quantum entanglement: the entanglement structure of low-lying energy states can be efficiently encoded in TNSs. This takes advantage of the fact that not all quantum states in the total Hilbert space of a many-body system are equally relevant to the low-energy or low-temperature physics. It has been found that the low-lying eigenstates of a gapped Hamiltonian with local interactions obey the area law of entanglement entropy [97].

More precisely, for a certain subregion \(\mathscr {R}\) of the system, its reduced density matrix is defined as \(\hat {\rho }_{\mathscr {R}}= \mathrm {Tr}_{\mathscr {E}} (\hat {\rho })\), with \(\mathscr {E}\) denoting the spatial complement of \(\mathscr {R}\). The entanglement entropy is defined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} S(\rho_{\mathscr{R}}) = - \mathrm{Tr} \lbrace \rho_{\mathscr{R}} \mathrm{log} (\rho_{\mathscr{R}} ) \rbrace . \end{array} \end{aligned} $$
(2.54)

Then the area law of the entanglement entropy [17, 98] reads

$$\displaystyle \begin{aligned} \begin{array}{rcl} S(\rho_{\mathscr{R}})= O(\vert \partial \mathscr{R} \vert) , \end{array} \end{aligned} $$
(2.55)

with \(\vert \partial \mathscr {R}\vert \) the size of the boundary. In particular, for a D-dimensional system, one has

$$\displaystyle \begin{aligned} \begin{array}{rcl} S=O(l^{D-1}), {} \end{array} \end{aligned} $$
(2.56)

with l the length scale of \(\mathscr {R}\). This means that for 1D systems, S = const. The area law suggests that the low-lying eigenstates live in a "small corner" of the full Hilbert space of the many-body system, and that they can be described by a much smaller number of parameters. We stress that the locality of the interactions is not sufficient for the area law: Vitagliano et al. showed that simple 1D spin models can exhibit a volume law, where the entanglement entropy scales with the volume of the subregion [99, 100].
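For a state given as a full vector, the entropy of Eq. (2.54) follows from the singular values of the reshaped state; the sketch below computes it for a random pure state, which indeed exhibits a volume law rather than an area law:

```python
import numpy as np

# Entanglement entropy of a random pure state of N = 10 spins-1/2,
# bipartitioned into the first n sites and the rest.
N, n = 10, 5
rng = np.random.default_rng(6)
psi = rng.standard_normal(2**N) + 1j * rng.standard_normal(2**N)
psi /= np.linalg.norm(psi)

# The squared singular values of the reshaped state are the eigenvalues
# of the reduced density matrix rho_R, from which Eq. (2.54) follows.
s = np.linalg.svd(psi.reshape(2**n, 2**(N - n)), compute_uv=False)
p = s**2
p = p[p > 1e-12]                 # discard numerical zeros
S = -np.sum(p * np.log(p))
print(S, n * np.log(2))          # close to the maximal (volume-law) value
```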

The area law of entanglement entropy is intimately connected to the fact that a non-critical quantum system exhibits a finite correlation length. The correlation functions between two blocks in a gapped system decay exponentially as a function of the distance between the blocks [101], which is argued to lead to the area law. An intuitive picture can be seen in Fig. 2.31. Let us consider a 1D gapped quantum system whose ground state |ψ_ABC〉 possesses a correlation length ξ_corr. Dividing the chain into three subregions A, B, and C, the reduced density operator \(\hat {\rho }_{AC}\) is obtained by tracing out the block B, i.e., \(\hat {\rho }_{AC} = \mathrm {Tr}_{\mathrm {B}} | \psi _{\mathrm {ABC}} \rangle \langle \psi _{\mathrm {ABC}} |\) (see Fig. 2.32). In the limit of a large distance between the A and C blocks, l_AC ≫ ξ_corr, the reduced density matrix satisfies

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{\rho}_{AC} \simeq \hat{\rho}_{A} \otimes \hat{\rho}_{C}, \end{array} \end{aligned} $$
(2.57)

up to some exponentially small corrections. Then |ψ ABC〉 is a purificationFootnote 9 of a mixed state with the form \(| \psi _{A B_l} \rangle \otimes | \psi _{B_r C} \rangle \) that has no correlations between A and C; here B l and B r sit at the two ends of the block B, which together span the original block.

Fig. 2.31
figure 31

Bipartition of a 1D system into two half chains. Significant quantum correlations in gapped ground states occur only on short length scales

Fig. 2.32
figure 32

To argue the 1D area law, the chain is separated into three subsystems denoted by A, B, and C. If the size of B (denoted by l_AC) is much larger than the correlation length ξ_corr, the reduced density matrix obtained by tracing out B approximately satisfies \(\hat {\rho }_{AC} \simeq \hat {\rho }_{A} \otimes \hat {\rho }_{C}\)

It is well known that all possible purifications of a mixed state are equivalent to each other up to a local unitary transformation on the virtual Hilbert space. This naturally implies that there exists a unitary operation \(\hat {U}_B\) on the block B that completely disentangles the left from the right part as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hat{I}_A \otimes \hat{U}_B \otimes \hat{I}_C \vert \psi_{ABC} \rangle \rightarrow \vert \psi_{A B_l} \rangle \otimes \vert \psi_{B_rC} \rangle . \end{array} \end{aligned} $$
(2.58)

The existence of \(\hat {U}_B\) implies that there exists a tensor \(B_{s,a a'}\) with 0 ≤ a, a′, s ≤ χ − 1 and bases {|ψ^A〉}, {|ψ^B〉}, {|ψ^C〉} defined on the Hilbert spaces belonging to A, B, and C, such that

$$\displaystyle \begin{aligned} \begin{array}{rcl} \vert \psi_{ABC} \rangle \simeq \sum_{s a a'} B_{s,a a'} \vert \psi^A_{a} \rangle \vert \psi^B_{s} \rangle \vert \psi^C_{a'} \rangle . {} \end{array} \end{aligned} $$
(2.59)

This argument directly leads to the MPS description and gives a strong hint that the ground state of a gapped Hamiltonian is well represented by an MPS of finite bond dimension, where B in Eq. (2.59) is analogous to the tensor of an MPS. Let us remark that every state of N spins has an exact MPS representation if we allow χ to grow exponentially with the number of spins [102]. The whole point of MPS is that a ground state can typically be represented by an MPS whose bond dimension χ is small and scales at most polynomially with the number of spins: this is the reason why MPS-based methods are more efficient than exact diagonalization.
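Such an exact (and exponentially expensive) MPS representation can be constructed by successive SVDs; the sketch below is one standard way to do it, and truncating each spectrum to the χ largest singular values would give the efficient approximate MPS used in practice:

```python
import numpy as np

# Decompose an arbitrary state of N spins-1/2 into an exact MPS by
# successive SVDs; the bond dimension grows as min(2**n, 2**(N - n)).
N = 8
rng = np.random.default_rng(7)
psi = rng.standard_normal(2**N)
psi /= np.linalg.norm(psi)

tensors, M = [], psi.reshape(1, -1)
for n in range(N - 1):
    chi_l = M.shape[0]
    # Split off the n-th physical index and orthogonalize it by SVD.
    U, s, Vh = np.linalg.svd(M.reshape(chi_l * 2, -1), full_matrices=False)
    tensors.append(U.reshape(chi_l, 2, -1))   # MPS tensor (left, s, right)
    M = s[:, None] * Vh                       # pass the remainder rightward
tensors.append(M.reshape(-1, 2, 1))
print([t.shape for t in tensors])  # bond dimensions grow, peak at 2**(N/2)
```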

For a 2D PEPS, it is more difficult to rigorously justify the area law of entanglement entropy. However, we can make some sense of it from the following aspects. One is the fact that PEPS can exactly represent some non-trivial 2D states that satisfy the area law, such as the nearest-neighbor RVB and Z_2 spin liquid mentioned above. Another is to count the dimension of the geometrical bonds \(\mathscr {D}\) between two subsystems, from which the entanglement entropy satisfies the upper bound \(S \leq \log \mathscr {D}\).Footnote 10

After dividing a PEPS into two subregions, one can see that the number of geometrical bonds N_b crossing the boundary increases linearly with the length scale, i.e., N_b ∼ l. This means the dimension \(\mathscr {D}\) satisfies \(\mathscr {D} \sim \chi ^{l}\), and the upper bound of the entanglement entropy fulfills the area law given by Eq. (2.56), which is

$$\displaystyle \begin{aligned} \begin{array}{rcl} S \leq O(l). \end{array} \end{aligned} $$
(2.60)

However, as we will see later, exactly this property of PEPS is what makes its contraction computationally difficult.