1 Introduction

Surprising connections between geometry and information have an honorary place in current research in theoretical physics. These ideas date back to the Bekenstein-Hawking formula [1, 2] relating the entropy and area of a black hole. The discovery of the AdS/CFT correspondence – the observation that certain gauge theories are equivalent (or “dual”) to gravitational theories in one higher dimension (see e.g., [3, 4]) – enabled putting the relation between gravity and information on firm ground. Specifically, it permitted Ryu and Takayanagi (RT) to formulate a proposal [5] (later proven by [6]) that relates the entanglement entropy – a quantity characterizing quantum correlations between two regions in conformal field theory (CFT) – and areas of minimal surfaces in asymptotically anti-de Sitter (AdS) spaces.

The RT proposal was the starting point for many interesting developments. It was used to study entanglement in strongly correlated systems and as a consequence improved our understanding of critical points and topological phases, chaos and thermalization, and RG flows (see [7] for a review). Furthermore, it provides an interpretation of spacetime as emergent from quantum entanglement. Specifically, it can be used to understand the way in which the boundary information is encoded in the bulk, and vice versa, in the AdS/CFT correspondence.

However, black holes pose a barrier to our understanding of spacetime in terms of entanglement. The reason is that the space behind a black hole horizon is only partially accessible to the minimal surfaces of the RT proposal, and therefore much of the geometry remains uninterpreted in terms of quantum information. This is not a mere technicality: it has been suggested that the geometry behind the horizon cannot be fully reconstructed from the boundary data, and this question is still being debated (see, e.g., [8]). Furthermore, despite recent progress in reconstructing the Page curve of black hole evaporation [9, 10], we still lack a full understanding of how black holes process and store information about objects which are thrown into them.

One aspect of these problems is that the volume behind the horizon of black holes keeps growing for a very long time while the entanglement of a subsystem saturates at times of the order of the subsystem size [11]. In fact, it is non-trivial to identify dual field theory quantities which have a similar long-term growth behavior.

To begin addressing this difficulty, Susskind et al. proposed that the volume behind a black hole horizon should be dual to a quantity from quantum information theory known as quantum computational complexity [12,13,14,15]. Quantum computational complexity estimates how hard it is to construct a given quantum “target state”, starting from a simple (usually unentangled) “reference state”, using a set of simple universal “gates” [16, 17]. For example, if we start with a quantum system consisting of a large number of spins initialized with all spins aligned, we could ask: what is the minimal number of one- and two-spin unitary operations, taken from a given set, required to reach a given target state?

As we will explain in this review, in chaotic systems the complexity grows linearly as time evolves and reacts to perturbations in a distinctive way. All these behaviors have a counterpart in the behavior of the volume behind the horizon. The duality between complexity and certain geometric quantities – specifically the volume and gravitational action – was conjectured based on these similar features. We will refer to these conjectures as the “holographic complexity proposals”.

At first, the holographic complexity proposals suffered from lack of rigor due to the absence of a proper definition of complexity outside the traditional spin-chain formulation, in particular for quantum field theory (QFT) states. However, this difficulty was circumvented, first for Gaussian states in free and weakly interacting field theories [18,19,20,21,22] and later for strongly interacting conformal field theories using different approaches [23,24,25,26,27,28]. In fact, the study of complexity in field theory is interesting in its own right, apart from the relation to black holes. Quantum Computational Complexity is expected to have purely condensed matter applications for the detection of phase transitions [29, 30] and in the study of thermalization and chaos [31, 32] as a natural extension of entanglement entropy.

With the surge in literature on complexity in field theory and holography, and with many people coming into this field from different disciplines, we thought it would be good to have an introductory text. This review was written to be comprehensible but by no means comprehensive. We only review those ingredients which are strictly necessary to enter the field with the hope of getting the reader to a point where it is easy to read relevant research articles in the field.

This article was written for the special issue of EPJ-C Frontiers in Holographic Duality. Other aspects of the relation between holography and quantum information are reviewed in [33,34,35], submitted as a part of the same issue.

This review is organized as follows. In Sect. 2 we begin with an overview of quantum computation. Then, in Sect. 3 we define Quantum Computational Complexity, discuss its properties in spin chains with fast scrambling dynamics, and explain how it relates to scrambling and chaos. In Sect. 4 we present a continuous definition of complexity due to Nielsen. In Sect. 5 we discuss the complexity of systems of coupled simple harmonic oscillators in preparation for our study of complexity in free and weakly interacting QFTs. In Sect. 6 we review the complexity of Gaussian and coherent states in free and weakly interacting QFTs, both pure and mixed, and discuss complexity in strongly interacting CFTs. In Sects. 7 and 8 we discuss the holographic complexity conjectures and the relevant evidence. We conclude in Sect. 9 with a summary and outline of open questions.

2 A quantum computation primer

Quantum computers can famously achieve exponential speed-up of computation compared to classical ones, at least for some problems. They can do this by taking advantage of the possibility of putting a quantum system in a superposition of states; performing operations on a superposition is, roughly speaking, like performing the computation in parallel on all the states in the superposition. Of course this is not precisely true, since in order to read the result one has to perform a measurement, which will cause the collapse of the state of the system on an eigenstate of the measured observable. One might then expect that each input requires a different measurement and the advantage of having the superposition is lost. But this is not the case: by a judicious choice of the algorithm and the initial state one can extract the information in an efficient way.

It is very instructive to see how these ideas work in practice on a simple example: Deutsch’s algorithm (we follow the presentation given in [36]). Suppose we have the task of computing a function \(f(x): \{0,1\} \rightarrow \{0,1\}\). One can build a circuit that implements the two-qubit unitary operator \(U_f: |x,y \rangle \rightarrow |x,y+f(x) \rangle \), where the addition is understood to be mod 2. We could read out the value of f(x) by applying the operator on \(|x,0 \rangle \) and reading the second qubit, and we assume that this operation can be done with the same efficiency as in the classical case. Now let us consider an initial state in a superposition. Let us define \(|\pm \rangle = \frac{|0 \rangle \pm |1 \rangle }{\sqrt{2}}\). First observe that \(U_f |x,- \rangle = (-)^{f(x)} |x,- \rangle \). Then one can compute

$$\begin{aligned} U_f |+,- \rangle&= \,\frac{1}{2} ((-)^{f(0)}+(-)^{f(1)}) |+,- \rangle \nonumber \\&\quad + \, \frac{1}{2} ((-)^{f(0)}-(-)^{f(1)}) |-,- \rangle \, . \end{aligned}$$
(1)

If we project the first qubit on the \(|\pm \rangle \) basis, we can read off whether \(f(0)=f(1)\) or \(f(0)\ne f(1)\) (we could equivalently say that we computed \(f(0)+f(1)\) mod 2). The point of the example is that there is no way of doing this classically without computing f(0) and f(1) separately, whereas quantum mechanically we get the result with a single computation. Not only do the computations proceed in parallel, but they can also be recombined using interference of different states. This simple example is not very impressive, but it can be generalized to an analogous problem involving a function on n qubits; the Deutsch-Jozsa algorithm solves the problem with one computation instead of the \(2^{n-1}+1\) required classically (see [36]).
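As a concrete illustration of Eq. (1) (our own minimal sketch, not code from the cited references), the following snippet builds the oracle \(U_f\) for each of the four possible functions f and verifies that a single application of \(U_f\) on \(|+,-\rangle \), followed by a projection of the first qubit on the \(|\pm \rangle \) basis, distinguishes constant from balanced functions.

```python
import numpy as np
from itertools import product

# Computational basis |x, y> ordered as |00>, |01>, |10>, |11>.
def U_f(f):
    """Oracle U_f |x, y> = |x, y + f(x) mod 2> as a 4x4 permutation matrix."""
    U = np.zeros((4, 4))
    for x, y in product([0, 1], repeat=2):
        U[2 * x + ((y + f(x)) % 2), 2 * x + y] = 1.0
    return U

plus = np.array([1.0, 1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)
psi0 = np.kron(plus, minus)                       # initial state |+,->

for f0, f1 in product([0, 1], repeat=2):
    f = lambda x, f0=f0, f1=f1: (f0, f1)[x]
    psi = U_f(f) @ psi0                           # a single call to the oracle
    amp = np.kron(plus, minus) @ psi              # overlap with |+,->, cf. Eq. (1)
    verdict = "f(0) = f(1)" if abs(amp) > 0.5 else "f(0) != f(1)"
    print(f"f = ({f0},{f1}):  <+,-|U_f|+,-> = {amp:+.1f}   =>   {verdict}")
```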

Another important point illustrated by the example is that an efficient computation will typically require a particular initial state. We started from \(|\psi _0 \rangle = |+- \rangle \), but supposing that our computer starts in a canonical state \(|00 \rangle \), we will need to apply some operations to prepare \(|\psi _0 \rangle \). Analogously, in the final step we need to measure the state in the \(|\pm \rangle \) basis, but if we can only measure in the computational basis (i.e., the \(|0 \rangle ,|1 \rangle \) basis), we have to use another operator to move between the two bases.

We can then formalize a quantum computation as a series of operations on a set of qubits, and the number of operations required to go from the initial to the final state is a measure of the difficulty of the task. This is the notion of quantum computational complexity. In the next section we will give a more precise definition.

We should point out that the notion of computational complexity is related to the question of the resources needed to solve a problem. We are typically interested in finding the fastest algorithm for a given problem. Assuming that each quantum operation (gate) requires a fixed amount of time, the number of operations is a measure of the total time required for the computation. The real physical time will of course depend on the physical implementation of the gates, but there are some unavoidable limits imposed by quantum mechanics; the Margolus–Levitin [37] and the Aharonov et al. [38,39,40] bounds give the minimum time required for evolving a given state into an orthogonal state, \(t_{min} = \frac{\pi \hbar }{2 E}\), where E is, respectively, the expectation value of the energy above the ground state, \(\langle H - E_0 \rangle \), or the variance of the energy in the state, \((\langle E^2\rangle - \langle E \rangle ^2)^{1/2}\).

Alternative notions of complexity exist, related to the optimization of different resources. For example one could take into account the number of qubits used in a quantum algorithm similarly to storage in classical complexity. A different notion is the Kolmogorov complexity. In the classical setup, this is the length of the minimal program that can produce a given string; so it is a measure of the amount of information contained in the string, or how much it can be compressed without losing information. Quantum versions of Kolmogorov complexity have also been proposed [42]. One can of course also combine the requirements of limitation on time, storage space and algorithmic complexity all together.

In this review, we will focus only on one notion of quantum computational complexity, related to the number of operations. The reason is that this notion has been found (or rather, conjectured) to play an interesting role in the holographic duality, in connection with the properties of black hole interiors, and as a consequence it has been developed in the last few years from a point of view slightly different from that of quantum computing. We cannot rule out that other notions will also become relevant as we understand more and more of the relation between geometry and information (see for example [43] for a discussion of the Kolmogorov complexity in the context of holography).

3 Complexity in qubit systems

3.1 Quantum computational complexity

We have explained that a quantum computation can be formalized as the problem of producing a certain state, from an initial state, through a series of unitary operations. In practice we can only build a quantum circuit using a discrete set of gates, each one implementing a simple operation, typically acting only on one or two qubits at the time. Two questions arise naturally: first, is it possible to construct an arbitrary unitary operator using a finite predetermined set of gates? Second, if a unitary can be constructed, how many gates are needed?

For the first question, it is obvious that the set of all finite circuits built out of a finite set of gates can only reproduce a discrete subset of the unitary group. However if we allow for a margin of error, i.e., if we only ask that for any operator U we can find a circuit that gives an operator V such that \(\Vert U - V \Vert < \varepsilon \) in the operator norm, then the answer is positive: there exist sets of universal gates, using which any unitary can be constructed with arbitrary precision. The full argument can be found in [36]. Here, we only give an outline of the proof. Let us consider first operators acting on a single qubit, i.e., elements of SU(2). A generic element can be written as a rotation of an angle \(\theta \) around the axis \(\mathbf {n}\), \(R_{\mathbf {n}} (\theta )\equiv e^{-i \theta \mathbf {n} \mathbf {\sigma }/(2|\mathbf {n}|)}\), where \(\mathbf {\sigma }\) is the vector of Pauli matrices. We can use two gates: the Hadamard gate (denoted by H) and the T gate (sometimes referred to as the \(\pi /8\) phase gate)

$$\begin{aligned}&H = \frac{1}{\sqrt{2}} \left( \begin{array}{cc} 1 &{} 1\\ 1 &{} - 1 \end{array}\right) = \frac{1}{\sqrt{2}} (\sigma _x + \sigma _z), \nonumber \\&T = \left( \begin{array}{cc} e^{-i \pi /8} &{} 0 \\ 0 &{} e^{i \pi / 8} \end{array}\right) \,. \end{aligned}$$
(2)

One can check that \(H T H = R_{{\hat{x}}} (\pi / 4)\), and \(T H T H = R_{\mathbf {n}} (\theta )\), where \(\mathbf {n} = (\cos \pi / 8, \sin \pi / 8, \cos \pi / 8 )\) and \(\cos (\theta /2) = \cos ^2(\pi /8)\). Note that the angle \(\theta \) is an irrational multiple of \(2 \pi \). This implies that we can approximate any angle of rotation by taking powers of \(R_{\mathbf {n}} (\theta ) \). Furthermore one can see that \(H R_{\mathbf {n}} (\theta ) H = R_{\mathbf {m}} (\theta ) \) with \(\mathbf {m} = (\cos \pi / 8, - \sin \pi / 8, \cos \pi / 8 )\). Since \(\mathbf {m}\) and \(\mathbf {n}\) are not parallel, one can find a parametrization of an arbitrary rotation as

$$\begin{aligned} U = e^{i \phi } R_{\mathbf {n}} (\alpha ) R_{\mathbf {m}} (\beta ) R_{\mathbf {n}} (\gamma )\,. \end{aligned}$$
(3)

These would be the Euler angles in the case where \(\mathbf {m} \perp \mathbf {n}\). This shows that the gates H and T are universal for a single qubit.
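These identities are easy to check numerically. The short sketch below (our own verification, using the conventions defined above) confirms that \(HTH = R_{{\hat{x}}}(\pi /4)\) and that the rotation angle \(\theta \) of THTH satisfies \(\cos (\theta /2)=\cos ^2(\pi /8)\).

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

H = (sx + sz) / np.sqrt(2)
T = np.diag([np.exp(-1j * np.pi / 8), np.exp(1j * np.pi / 8)])

def R(n, theta):
    """Rotation by theta about axis n: exp(-i theta n.sigma / (2|n|))."""
    n = np.asarray(n, dtype=float) / np.linalg.norm(n)
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * (n[0] * sx + n[1] * sy + n[2] * sz)

print(np.allclose(H @ T @ H, R([1, 0, 0], np.pi / 4)))       # H T H = R_x(pi/4): True

U = T @ H @ T @ H                                            # a rotation R_n(theta)
cos_half_theta = np.real(np.trace(U)) / 2                    # tr R_n(theta) = 2 cos(theta/2)
print(np.isclose(cos_half_theta, np.cos(np.pi / 8) ** 2))    # cos(theta/2) = cos^2(pi/8): True
print(2 * np.arccos(cos_half_theta) / (2 * np.pi))           # theta/(2 pi) ~ 0.1744...
```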

For the case of more than one qubit, an arbitrary unitary cannot be approximated using only the H and T gates since those do not generate quantum correlations between multiple qubits. However, it turns out that adding one kind of two-qubit operation is enough to generate a universal gate set on any number of qubits. An example of such a gate is the CNOT gate:

$$\begin{aligned} CNOT =\frac{1}{2} (1 + \sigma _z^{(1)}) \otimes \mathbb {1}^{(2)} + \frac{1}{2}(1 - \sigma _z^{(1)}) \otimes \sigma _x^{(2)}. \end{aligned}$$
(4)

One can easily see that, in the computational basis, this gate flips the second qubit only if the first qubit is in the state 1. With the CNOT gate in hand, the proof of universality amounts to a linear algebra theorem, and it proceeds as follows. First, we can show that any unitary operator can be decomposed as a product of two-level operators, which act non-trivially only on a subspace spanned by two computational basis vectors. Then, essentially, one has to map any such two-dimensional subspace to a single qubit; this can be achieved by acting with CNOT gates. This proves that every unitary operation can be decomposed as a product of H and T gates acting on the different qubits and CNOT gates acting on pairs of qubits.

An alternative proof can be given, which is perhaps more suggestive and closer to a physicist’s mindset. We can write a generic unitary operator as

$$\begin{aligned} U = \exp \left( i \sum y_a h_a \right) \end{aligned}$$
(5)

where the sum is over all operators of the form \(h_a = \prod _i \sigma _{k_i}^{(i)}\). This can be approximated as \(U = (\prod _a e^{i \frac{y^a}{n} h_a})^n + \mathcal{O}(4^N/n)\), where N is the number of qubits. Note that we assume that \(n\gg 4^N\) so that the correction is much smaller than the leading term. Using single-qubit operations, one can convert any \(h_a\) into \(h=\prod _i \sigma _z^{(i)}\). The operator \(e^{i \alpha h}\) can be implemented using a single-qubit operator and CNOT gates as follows: we apply CNOT gates successively between each qubit and an ancillary qubit. The effect is to encode the parity of all the bits (equivalently, the product of the \(\sigma _z\) eigenvalues) on the extra qubit; one can then act on it with \(e^{i \alpha \sigma _z}\) and reverse the series of CNOTs. The circuit is represented in Fig. 1. In this way, we have demonstrated that using only one- and two-qubit operations any unitary can be constructed to arbitrary precision. The arbitrary precision is achieved by tuning n to be as large as we wish.

Fig. 1 Illustration of a circuit implementing the unitary transformation \(\exp {\left( i \alpha \prod _i \sigma _z^i\right) }\)
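The construction of Fig. 1 is simple to verify explicitly. The sketch below (our own illustration, with N=3 system qubits plus one ancilla and an arbitrary angle \(\alpha \)) builds the CNOT ladder onto the ancilla and checks that, on the subspace where the ancilla starts in \(|0\rangle \), the circuit reproduces \(\exp (i \alpha \, \sigma _z\otimes \sigma _z\otimes \sigma _z)\).

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

I2, sx, sz = np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]]), np.diag([1.0, -1.0])

def op(single, site, n):
    """Embed a single-qubit operator at position `site` of an n-qubit register."""
    return reduce(np.kron, [single if i == site else I2 for i in range(n)])

def cnot(control, target, n):
    P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])        # projectors on the control qubit
    return op(P0, control, n) + op(P1, control, n) @ op(sx, target, n)

N, alpha = 3, 0.37                      # N system qubits plus one ancilla (last position)
n, anc = N + 1, N

# CNOT ladder onto the ancilla, phase rotation on the ancilla, then undo the ladder
ladder = reduce(np.matmul, [cnot(j, anc, n) for j in range(N)])
circuit = ladder.T @ op(expm(1j * alpha * sz), anc, n) @ ladder

# Target unitary exp(i alpha Z x Z x Z) acting on the system qubits only
target = expm(1j * alpha * reduce(np.kron, [sz] * N))

# Restrict the circuit to the subspace where the ancilla starts (and ends) in |0>
embed = np.kron(np.eye(2 ** N), np.array([[1.0], [0.0]]))    # |psi>  ->  |psi>|0>
print(np.allclose(embed.T @ circuit @ embed, target))        # True
```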

Notice that this logic could be applied also at the level of one qubit: any element of SU(2) can be written as \(e^{a_x \sigma _x +a_y \sigma _y +a_z \sigma _z}\) and can be approximated using the three gates \(e^{i \epsilon \sigma _x}, e^{i \epsilon \sigma _y}\), \(e^{i \epsilon \sigma _z}\). In fact, the third gate \(e^{i \epsilon \sigma _z}\) can be replaced by further combinations of the first two gates using the group commutation relations. We have again a set of two universal gates on one qubit. However, now the gates have to be adjusted according to the required precision; moreover, for \(\epsilon \) very small, the gates are very close to the identity and a circuit built with them would be very susceptible to noise, although such considerations are outside our purview.

Having established the possibility of approximating an arbitrary unitary operator, we can address the second question: how efficiently can we simulate a given operator? This question leads us finally to the notion of complexity. Let us start with a definition.

Definition (complexity of a unitary operator): given a universal set of gates and a tolerance \(\epsilon \), the complexity \({\mathcal {C}}(U,\epsilon )\) of a unitary U is the minimal number of gates in a circuit whose unitary V approximates U within \(\Vert U - V \Vert < \epsilon \).

The answer should depend on the allowed error \(\epsilon \) (also known as the tolerance), on the allowed set of gates, and on the size of the system, that is on the number of qubits N. At the single qubit level, the Solovay–Kitaev theorem [44] states that any operator can be built with \(O \left( \log ^c \frac{1}{\epsilon } \right) \) gates, where \(c \approx 2\). For a system of N qubits, we can give an estimate by computing how many balls of radius \(\epsilon \) are needed to cover the unitary group \(U (K\equiv 2^N)\). This group has dimension \(K^2\), and its volume (see e.g., [45], Corollary 3.5.2) is given by

$$\begin{aligned} \text {Vol} \, (U (K)) = \frac{(2 \pi )^{(K^2 + K) / 2}}{2!3! \ldots (K - 1) !} \, . \end{aligned}$$
(6)

The volume of an \(\epsilon \)-ball of the same dimension is

$$\begin{aligned} \text {Vol} (B_{\epsilon }) = \frac{(\sqrt{\pi } \epsilon )^{K^2}}{ (K^2 / 2) !} \end{aligned}$$
(7)

and the ratio of the two volumes gives an estimate of the required number of balls. For large N one finds, using Stirling’s formula,

$$\begin{aligned} \log \left( \frac{\text {Vol} (U (2^N))}{\text {Vol} (B_{\epsilon })} \right) \sim 2^{2N}\left( \frac{N}{2} \log 2 + \log \frac{1}{\epsilon } \right) \,. \end{aligned}$$
(8)

The main thing to notice is that the dependence on the error is only logarithmic, just as in the case of one qubit, but the dependence on the size of the system is exponential. Given a set of p gates, the number of circuits with m elements is bounded by \(p^m\). Therefore, the number of unitaries with complexity less than or equal to m is bounded by \(p^m\). Together with Eq. (8), this implies that most unitary transformations are exponentially complex. In other words, simulating a unitary operator is generically exponentially hard. Enlarging the set of gates cannot improve the situation: one can show that if a circuit can be built with m gates, then it can be built with \({\mathcal {O}}(m \log ^c(\frac{m}{\epsilon }))\) gates from a different universal set [48]. Combining the estimate in Eq. (8) with the Solovay–Kitaev theorem, one can show that a unitary over N qubits may be approximated with tolerance \(\epsilon \) using at most \(O(N^2 2^{2N}\log ^c(N^2 2^{2N}/\epsilon ))\) gates [36].
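To get a feeling for these numbers, the short sketch below (our own estimate, with an illustrative gate-set size and tolerance) evaluates the leading behavior (8) of the number of \(\epsilon \)-balls and the corresponding lower bound on the circuit size m obtained by demanding that \(p^m\) exceed it.

```python
import numpy as np

def log_num_balls(N, eps):
    """Leading estimate (8) for log[ Vol(U(2^N)) / Vol(B_eps) ]."""
    return 2.0 ** (2 * N) * (N / 2 * np.log(2) + np.log(1 / eps))

p, eps = 4, 1e-2                      # illustrative gate-set size and tolerance
for N in [2, 4, 6, 8, 10]:
    # p^m must exceed the number of balls  =>  m >= log(#balls) / log(p)
    m_min = log_num_balls(N, eps) / np.log(p)
    print(f"N = {N:2d} qubits:  m_min ~ 10^{np.log10(m_min):.1f} gates")
```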

In this section we have considered the operator complexity; the question of the complexity of a state is related but not identical, because many unitary operators can produce the same state.

Definition (complexity of a state): given a universal set of gates, a reference state \(|\psi _R \rangle \) and a tolerance \(\epsilon \), the complexity \({\mathcal {C}}(|\psi _T \rangle ,|\psi _R \rangle ,\epsilon )\) of a target state \(|\psi _T \rangle \) is the minimal number of gates in a circuit which, acting on \(|\psi _R \rangle \), produces a state within distance \(\epsilon \) of \(|\psi _T \rangle \).

We will dwell more on the difference between the two later on; for now we can just notice that a similar counting argument shows that the state complexity has the same qualitative behavior as the operator complexity: the number of \(\epsilon \)-balls needed to cover the space of states \(\mathbb {CP}^{K-1}\) is exponential in N and logarithmic in \(\epsilon \).

3.2 Complexity in fast scramblers

In the previous section we considered complexity from the point of view of computation, i.e., we focused on the complexity of a unitary operation designed to perform a certain task. From a physics perspective, unitaries arise as operators that describe the evolution of a system in time. It is natural then to ask how complexity changes with time. Under some assumptions, the result follows from the volume counting of the last section. We follow here the presentation given in [13, 46].

We model the evolution of a Hamiltonian system with a discrete circuit of the form shown in Fig. 2.

Fig. 2 Illustration of a circuit representing time evolution according to a k-local (in this case 2-local) Hamiltonian

We assume that the circuit contains only k-local gates, i.e., gates that act on \(k\ll N\) qubits at a time. The evolution happens in discrete steps: at each step the qubits are divided into groups of k and acted on by the gates; however, the partition changes at every step, so that all the qubits interact with each other. This is a feature of systems with the property of fast scrambling, namely that information contained in a part of the system is quickly distributed over the whole system [49]. After n steps of evolution, the number of unitaries that could be generated is

$$\begin{aligned} \left( \frac{N!}{(N/k)!\ (k!)^{N/k}} \right) ^n \sim \exp \left( n \frac{k-1}{k} N \log N \right) \,. \end{aligned}$$
(10)

This is much smaller than the total number of unitaries in (8), unless n is exponentially large. We can often assume that all these unitaries are different from each other, and that there is no other circuit that generates them more efficiently; under these assumptions, the complexity is

$$\begin{aligned} {\mathcal {C}}=n N/k \,, \end{aligned}$$
(11)

so it grows linearly with the number of steps and with the size of the system. The linear growth is expected to continue until most of the group has been explored, which happens for \(n = {\mathcal {O}}(2^{2N})\), and then the complexity saturates and oscillates close to its maximal value. Eventually quantum recurrence will make it return to small values but on a doubly-exponential time scale, see Fig. 3.
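Comparing the per-step count (10) with the total count (8) gives a rough estimate of the saturation time. The sketch below (our own, with an illustrative tolerance) solves for the number of steps at which the two logarithms become comparable, confirming the \(n\sim 2^{2N}\) scaling up to polynomial factors.

```python
import numpy as np

def n_saturation(N, k, eps=1e-2):
    """Step at which the count (10) reaches the ball-counting estimate (8)."""
    log_total = 2.0 ** (2 * N) * (N / 2 * np.log(2) + np.log(1 / eps))
    log_per_step = (k - 1) / k * N * np.log(N)
    return log_total / log_per_step

for N in [4, 8, 12]:
    print(f"N = {N:2d}:  n_sat ~ 10^{np.log10(n_saturation(N, k=2)):.1f}"
          f"   (2^(2N) = 10^{2 * N * np.log10(2):.1f})")
```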

Fig. 3 Illustration of the time dependence of complexity during chaotic Hamiltonian evolution. The complexity grows linearly until it reaches its maximal value, which is exponential in the number of degrees of freedom, and is expected to decrease significantly around the quantum recurrence time, which is doubly exponential in the number of degrees of freedom in the system, once the full unitary group has been explored

Another natural question that one can ask is: how does the complexity grow when the system is subject to a perturbation? We can consider an operator W that is simple, e.g., it acts on a single qubit, and let it evolve, so we need to find the complexity of the so-called precursor

$$\begin{aligned} W(t) = U(t) W U(-t) \,. \end{aligned}$$
(12)

A precursor is defined [50] as a non-local operator which acts at one time to simulate the effect of a local operator acting at a different time (later or earlier). For the present purposes, we can just think of the forward or backward time evolution of a local operator. It is clear that this is a very different question from finding the complexity of U(t) itself; for instance, when W is the identity operator, W(t) is also the identity operator for any t, so its complexity does not grow. The circuit model explains why [51, 52]: a discretized version of the circuit that represents W(t) can be drawn as in Fig. 4, with a layer in the middle representing W, and series of layers on the left and the right representing \(U(t)\) and \(U(-t)\). Here we have discretized time into a series of steps which we label by n. The gates on the right are the inverses of the corresponding ones on the left. But this is not the optimal circuit for W(t), because gates on the two sides that act on qubits that are not affected by W have no effect and can be canceled out. At the second layer, for example, the cancellation is obstructed not only by the qubit acted on by W but also by those qubits that have interacted with W indirectly via gates which operated in the first layer. We will refer to those qubits as infected qubits. This concept generalizes to the following layers too: at every step we will call infected those qubits which have interacted with the qubit acted on by W, or with any qubit which has interacted (directly or indirectly) with it through the operations of the previous layers. This is illustrated in Fig. 4.

Fig. 4 Illustration of the switchback effect. The perturbation W is acted on by U(t) on the left and \(U^\dagger (t)\) on the right to create the precursor operator. Two-qubit gates participating in the most efficient preparation of U(t) are labeled \(g_i\); they appear as light-purple circles before being applied to W (and as light-red circles after being applied). Estimating the complexity of the precursor operator at different times depends on delicate cancellations which can be seen after applying the gates. For example, the gate \(g_2\) commutes with the perturbation and the previously applied gates and therefore does not contribute to the complexity

Let us define s(n) to be the number of qubits that have been infected after the action of n layers of the circuit, and \(p(n)=s(n)/N\) the fraction of infected qubits. When another layer is applied, the probability that a qubit is infected is the probability that it was already infected plus the probability that it was not, multiplied by the probability that one of the \(k-1\) qubits that it interacts with is infected. It is easier to write it in terms of \(q(n)=1-p(n)\). The evolution of the infection is described by

$$\begin{aligned} q(n+1) = q(n)^{k} \,. \end{aligned}$$
(13)

This can be easily solved and we find the number of infected qubits:

$$\begin{aligned} s(n) = N \left( 1 - \left( 1-\frac{s_0}{N}\right) ^{k^n}\right) , \end{aligned}$$
(14)

where \(s_0\) is the initial number of infected qubits. When the initial operator is small, we can approximate this expression for small n with \(s(n) \sim s_0 k^n\). The complexity is given by the sum of the infected sites at different steps. We cannot perform the sum analytically; however, we can see that, because of the exponential behavior, \((s(n+1)-s(n))/s(n)\) becomes small after a few steps. We can then replace the difference equation by a differential equation

$$\begin{aligned} \frac{ds}{dn} = (N-s)\left( 1-\left( 1-\frac{s}{N}\right) ^{k-1} \right) \,. \end{aligned}$$
(15)

The solution can be given explicitly for the inverse function n(s):

$$\begin{aligned} n = \left. \frac{1}{k-1} \log \left( \frac{1-(1-\frac{s}{N})^{k-1}}{(1-\frac{s}{N})^{k-1}} \right) \right| _{s_0}^s. \end{aligned}$$
(16)

This expression can be inverted as follows

$$\begin{aligned} \begin{aligned} \frac{s}{N}&= 1-\left( 1+{\mathfrak {c}}\, e^{(k-1)n} \right) ^{-\frac{1}{k-1}}, \\ {\mathfrak {c}}&= \left( 1-\frac{s_0}{N}\right) ^{-(k-1)} -1, \end{aligned} \end{aligned}$$
(17)

from which we can extract the early time behavior: \(s(n) \sim s_0 e^{(k-1) n}\), and the late time behavior: \(s(n) \sim N (1 - {\mathfrak {c}}^{-\frac{1}{k-1}} e^{-n})\), where for these limits we have assumed that \(s_0\ll N\) and therefore \({\mathfrak {c}}\sim \frac{s_0(k-1)}{N}\). We can also see that the time it takes for a small perturbation to spread to a finite fraction of the system (the scrambling time) is of order \(n_* \sim \frac{1}{k-1} \log \left( \frac{N}{s_0(k-1)}\right) \).

In the case of a 2-local circuit, \(k=2\), the solution (17) takes the form

$$\begin{aligned} s(n) = \frac{N s_0 e^n}{N+s_0(e^n-1)}\,. \end{aligned}$$
(18)

We can then compute the complexity by summing the number of infected qubits over the different time steps:

$$\begin{aligned} {\mathcal {C}}(n) = \int _0^n s(n')d n' =N \log \left( 1+e^{(n-n_*)}\right) \,, \end{aligned}$$
(19)

where here again, we have assumed \(s_0 \ll N\) and defined \(n_*=\log \frac{N}{s_0}\).
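The epidemic model is easy to treat numerically. The sketch below (our own check, with illustrative values of N and \(s_0\)) integrates the differential equation (15) for k=2, compares with the closed-form solution (18), and accumulates the complexity, reproducing the exponential-then-linear behavior of Eq. (19).

```python
import numpy as np
from scipy.integrate import solve_ivp

N, s0, k = 10_000, 1.0, 2            # illustrative numbers

def rhs(n, s):
    # Eq. (15): ds/dn = (N - s)(1 - (1 - s/N)^(k-1))
    return (N - s) * (1 - (1 - s / N) ** (k - 1))

n_grid = np.linspace(0, 25, 2001)
s_num = solve_ivp(rhs, (0, 25), [s0], t_eval=n_grid, rtol=1e-9).y[0]

# Closed form (18) for k = 2, and complexity (19) with n_* = log(N/s0)
n_star = np.log(N / s0)
s_cl = N * s0 * np.exp(n_grid) / (N + s0 * (np.exp(n_grid) - 1))
C_cl = N * np.log(1 + np.exp(n_grid - n_star))

# Complexity as the accumulated number of infected qubits (trapezoidal sum)
C_num = np.concatenate([[0.0], np.cumsum(0.5 * (s_num[1:] + s_num[:-1]) * np.diff(n_grid))])

print("max |s_num - s_closed| / N =", np.max(np.abs(s_num - s_cl)) / N)
for n in [5, 9, 12, 20]:
    i = np.argmin(np.abs(n_grid - n))
    print(f"n = {n:2d}:  s/N = {s_num[i]/N:5.3f}   C_num/N = {C_num[i]/N:6.2f}   C_cl/N = {C_cl[i]/N:6.2f}")
```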

There are two notable features of this result. (1) The complexity grows linearly for times larger than the scrambling time; the delay in the onset of the linear growth is called the switchback effect [13]. Just as for the unperturbed evolution, the linear growth will eventually come to an end and the complexity will saturate on exponentially long time scales. This linear growth behavior is very important; it is one of the motivations for the holographic conjectures that we will present later in Sect. 7.1, and we will comment further on it in the discussion section. (2) The early-time behavior is exponentially growing, but with a small prefactor that is suppressed as 1/N. It can be argued that this behavior is related to the Lyapunov growth of out-of-time-order correlators [53], which is a signature of quantum chaos. Under the assumption of maximal chaos, this yields the identification \( (k-1) n = 2 \pi T t\). The number of qubits corresponds to the entropy of the system. Up to prefactors, we find that the rate of growth is expected to be proportional to TS. This expectation is borne out by the two holographic complexity proposals, CV and CA, applied to black holes, which we will discuss later in Sect. 7. The time dependence of the complexity of the precursor is illustrated in Fig. 5.

Fig. 5 Illustration of the time dependence of complexity of the precursor. An initial exponential regime is followed by linear growth starting at the scrambling time \(n_*\)

4 Continuous complexity

4.1 Nielsen’s approach

We have estimated the number of gates needed to reproduce a given unitary, but how can one go about finding the actual optimal circuit that does the job? This appears to be a very difficult problem.

An approach to this question, proposed by Nielsen [54,55,56], turns it into a geometric problem and, as such, provides a universally applicable strategy. The idea is suggested by the proof of universality given in the previous section: if the universal gates are chosen to be \(e^{i \epsilon h}\), then a circuit explores the unitary group in small steps, and in the limit \(\epsilon \rightarrow 0\) it traces a continuous path, which can be constructed by means of a time-dependent Hamiltonian,

$$\begin{aligned} U(t) = \overleftarrow{{\mathcal {P}}} \exp \left( \int _0^{t} H (s) ds \right) . \end{aligned}$$
(20)

The Hamiltonian can be expanded in a basis of operators

$$\begin{aligned} H (t) = \sum _I Y^I (t) {\mathcal {O}}_I . \end{aligned}$$
(21)

The complexity of a given target unitary \(U_T\) is defined by the minimization of a suitable cost functional \(F [U(t),\dot{U}(t)]\) as

$$\begin{aligned} {\mathcal {C}}_F [U_T] = \min _{\{ U(t) \}} \int dt \, F [U(t), \dot{U}(t)] \,, \end{aligned}$$
(22)

with the constraint that the desired operator is reached at some fixed time \(t_f\). In this way the problem is translated into a Hamiltonian control problem.

A particularly relevant class of cost functionals consists of those satisfying the following conditions [18, 54]:

  1. F is continuous for all unitaries U(t) and all non-vanishing tangent vectors \(\dot{U}(t)\).

  2. F is positive: \(F[U(t),\dot{U}(t)]\ge 0\), with equality iff \(\dot{U}(t)=0\).

  3. F is positively homogeneous: \(F[U(t),\lambda \dot{U}(t)]= \lambda F[U(t),\dot{U}(t)]\) for any \(\lambda \ge 0\).

  4. F satisfies the triangle inequality: \(F[U, \dot{U}+\dot{V}] \le F[U, \dot{U}] +F[U, \dot{V}]\).

If in addition one assumes smoothness (continuity of all derivatives) and requires that the Hessian of \(F(U,\dot{U})\) as a function of \(\dot{U}\) is strictly positive for all U (a condition that is stronger than the triangle inequality), then the length functional is a Finsler metric and the manifold is a Finsler manifold. The positive homogeneity property allows us to take \(t_f=1\) without loss of generality. The interest of these definitions is that for a Finsler metric the problem of finding minimal length curves translates into a geodesic equation, which is a second-order differential equation, just as in the more usual Riemannian geometry.

As the reader might have already noticed, in this approach, the complexity is not uniquely defined, as it depends on the choice of the cost function. For instance, a quite general family of cost functions, that we will use in the following, is given by

$$\begin{aligned} F_{k,\{p\}} [Y^I] = \left( \sum _I p_I | Y^I |^k\right) ^{\frac{1}{k}}, \end{aligned}$$
(23)

where the positive penalty factors \(p_I>0\) account for the relative difficulty of implementing different gates. In the case \(k = 2\) the cost function is the distance induced by a Riemannian metric on the space of unitaries. This metric is always right-invariant, as it is defined in terms of \(H (t) = \partial _t U (t) U^{-1} (t)\), but in general it is not left-invariant. For \(k\ne 2\), the cost functions satisfy the properties 1–4 and, while they are not Finsler metrics, they can still be approximated arbitrarily well by Finsler metrics.

Notice that the complexity thus defined will depend on the choice of the basis of operators used and in general it is not invariant under a change of basis. One can obtain a basis-independent notion using the Schatten norm:

$$\begin{aligned} S_k [H] \quad = \left( \text {tr} (H^{\dagger } H)^{\frac{k}{2}} \right) ^{\frac{1}{k}} . \end{aligned}$$
(24)

If the operators of the basis are chosen so that \(\frac{1}{2}\text {tr} ({\mathcal {O}}_I {\mathcal {O}}_{J }^{\dagger }) = \delta _{IJ}\), then \(F_{2} [H] = (1/\sqrt{2})\, S_2 [H]\). In this case \(F_2\) corresponds to the left- and right-invariant metric, and is invariant under an orthogonal change of basis.
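As a quick sanity check of this normalization (our own sketch), one can expand a random Hermitian operator of two qubits in the Pauli-string basis, rescaled so that \(\frac{1}{2}\text {tr}({\mathcal {O}}_I{\mathcal {O}}_J^{\dagger })=\delta _{IJ}\), and verify that \(F_2=\sqrt{\sum _I (Y^I)^2}\) coincides with \(S_2[H]/\sqrt{2}\).

```python
import numpy as np
from itertools import product

paulis = [np.eye(2), np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]), np.diag([1, -1])]

# Orthonormal Hermitian basis for 2 qubits: (1/2) tr(O_I O_J) = delta_IJ
basis = [np.kron(a, b) / np.sqrt(2) for a, b in product(paulis, repeat=2)]

rng = np.random.default_rng(0)
Y = rng.normal(size=len(basis))                   # real control coefficients Y^I
Ham = sum(y * O for y, O in zip(Y, basis))        # H = sum_I Y^I O_I (Hermitian)

F2 = np.sqrt(np.sum(Y ** 2))
S2 = np.sqrt(np.real(np.trace(Ham.conj().T @ Ham)))   # Schatten 2-norm of H
print(np.isclose(F2, S2 / np.sqrt(2)))                # True
```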

One may wonder whether the “continuous” complexity defined in this section can be related precisely to the discrete notion defined by the number of gates. The argument given in [55] shows that this is the case, and at the same time it illustrates the role of the penalty factors. They consider a Hamiltonian of the form

$$\begin{aligned} H = \sum _a Y^a \sigma _a + \sum _i \widetilde{Y^i} \sigma _i \end{aligned}$$
(25)

where the \(\sigma _a \) are one- or two-qubit operators and the \(\sigma _i\) are operators acting on three or more qubits, all taken to be tensor products of Pauli matrices. Note that these generators are not normalized as before but rather \(\text {tr}(\sigma _A\sigma _B) = 2^N \delta _{AB}\). With this choice, the relation between the cost functions (23) and (24) is rescaled accordingly. We will keep this normalization until the end of the section to match with the reviewed literature. The cost function is chosen as \(F = \left( \sum _a (Y^a)^2 + p \sum _i (\widetilde{Y^i})^2\right) ^{1/2}\). When the penalty factor p is taken to be very large, one can expect that the optimal path will use only the “easy” gates. This can be formalized using the projector \(P \sigma _a = \sigma _a, P \sigma _i = 0\). First, one can show that if \(U = \exp \int H (t)\,dt\) and \(U_P = \exp \int P H (t)\,dt\), then

$$\begin{aligned} \Vert U - U_P \Vert \le \frac{2^N}{\sqrt{p}} {\mathcal {C}}_F [U]\,. \end{aligned}$$
(26)

This shows that, by penalizing the higher-order gates strongly enough, the operator can be approximated with arbitrary precision using only one- and two-qubit gates. For instance, choosing \(\sqrt{p} > 4^N\), we obtain \(\Vert U - U_P \Vert \le {\mathcal {C}}_F [U]/2^N\).

Then, replacing the functions \(Y^a (t)\) with step-wise constant functions, one can effectively discretize the integral, and exhibit a circuit built with one and two-qubit gates that approximates U. The discrete complexity \({\mathcal {C}}_{\text {d}}(U,\epsilon )\), defined as the number of gates in the optimal circuit that builds U with a tolerance \(\epsilon \), is then related to the continuous one as

$$\begin{aligned} {\mathcal {C}}_\text {d}(U,\epsilon ) \le c \frac{N^6 {\mathcal {C}}_F[U]^3}{\epsilon ^2} \, \end{aligned}$$
(27)

for some constant c. Moreover, as proven in [54], the complexity also gives a lower bound on the number of gates, provided the cost function satisfies certain conditions: given an exactly universal set of gates \({\mathcal {G}} =\{ e^{i X_i} \}\), which allows us to reach the target unitary exactly, and a cost function that satisfies \(F[X_i]<1 \, \forall i\), then for any unitary it holds that \({\mathcal {C}}_F[U] \le {\mathcal {C}}_\mathcal{{G}}[U]\), where the latter is the exact discrete complexity of U with respect to the gate set. This shows that the notions of discrete and continuous complexity are polynomially related to each other. It is not known which cost function gives the tightest bound; notably, \(F_2\) is not optimal, since for all operators \(F_2(U)\le \pi \).

4.2 Complexity of one qubit

In order to get a better understanding of the complexity geometry, it is useful to consider the simplest possible case: a system of a single qubit. We follow mainly the presentation in [57].

As explained in the previous section, the choice of a cost function of the type \(F_2\) is equivalent to the choice of a right-invariant metric on SU(2). As is well-known, there is a unique (up to rescaling) right- and left-invariant metric; when equipped with this metric, the group is isometric to the round sphere \(S^3\). The general right-invariant metric can be written using the right-invariant 1-forms \(\omega ^a\) defined by \(dg \, g^{-1} = \omega ^a i \sigma _a\):

$$\begin{aligned} ds^2 = I_{ab} \, \omega ^a \omega ^b \,. \end{aligned}$$
(28)

The maximally symmetric round sphere is obtained when \(I_{ab} = I \delta _{ab}\). If we choose, for instance, a diagonal matrix but with different entries, \(I_{xx}=I_{yy}=1, I_{zz} = p\), then the geometry is that of a squashed 3-sphere. Let us consider the following parametrization of SU(2):

$$\begin{aligned} g = \begin{pmatrix} z_1 &{} z_2\\ -{\bar{z}}_2 &{} {\bar{z}}_1 \\ \end{pmatrix}, \end{aligned}$$
(29)

with \((z_1,z_2) \in \mathbb {C}^{2} \,, |z_1|^2+|z_2|^2=1\). In these coordinates the metric with the penalty factor p is the pullback on \(S^3\) of the following metric on \(\mathbb {C}^{2}\):

$$\begin{aligned} \begin{aligned} ds^2&= dz_1 d {\bar{z}}_1 + dz_2 d {\bar{z}}_2 \\&\quad - \frac{p-1}{4} (z_1 d{\bar{z}}_1 - {\bar{z}}_1 dz_1 + z_2 d{\bar{z}}_2 - \bar{z}_2 dz_2)^2 \,. \end{aligned} \end{aligned}$$
(30)

The geodesics can be described explicitly as follows [58]: the geodesic starting from the identity with tangent vector v is given by

$$\begin{aligned} g(t) = R_{J}(t |J|) \, R_{{\hat{z}}}(t \gamma J_3) \end{aligned}$$
(31)

where we used the same notation for the rotations as in Sect. 3.1, \(\gamma = \frac{1}{p}-1\) and J is the angular momentum, related to the angular velocity as \(J_a=I_{ab} v^b\). Clearly for \(\gamma =0\) we recover the usual geodesics on the sphere.

In coordinates, the geodesic trajectories are

$$\begin{aligned} \begin{aligned} z_1(t)&= e^{-i \gamma J_3 t/2} \left( \cos \frac{|J| t}{2} - i {\hat{J}}_3 \sin \frac{|J| t}{2} \right) \,, \\ z_2(t)&= e^{-i \gamma J_3 t/2} ({\hat{J}}_1+i {\hat{J}}_2) \sin \frac{|J| t}{2} \,, \quad {\hat{J}} = J/|J| \,. \end{aligned} \end{aligned}$$
(32)

It is instructive to consider the behavior of neighboring geodesics \(g_J(t), g_{J+\delta J}(t)\); their difference gives the Jacobi vector field, whose length tells us whether geodesics converge or diverge; more precisely one has [59]

$$\begin{aligned} ||\delta _w g_v(t)||^2 = t^2 - \frac{1}{3}K_{v,w} t^4 + o(t^4) \end{aligned}$$
(33)

where \(v=\dot{g}_v(0)\), w is a unit vector orthogonal to v, and \(K_{v,w}\) is the sectional curvature of the plane spanned by v and w. The calculation gives

$$\begin{aligned} K_{1,3}=K_{2,3} \propto p \,,\qquad K_{1,2} \propto 4-3p \,. \end{aligned}$$
(34)

We see that for \(p=1\) all the sectional curvatures are equal, as the metric is isotropic. For \(p>4/3\) the sectional curvature becomes negative in the plane 1, 2 spanned by the easy generators. This is a general feature, which can be understood as follows: since the commutator of two easy gates gives a hard one, it may be more efficient, in order to go from \(\sigma _x\) to \(\sigma _y\), to travel along the two axes rather than the hypotenuse. This appearance of hyperbolic geometry is a striking feature of complexity geometry, and it illustrates one important aspect, namely the fact that the distance in complexity can be much larger than the distance in the operator norm. In fact, there always exists a small ball around each point, inside which the direct geodesics are the shortest paths. Then for sufficiently small \({\mathcal {C}}_2(U)=\epsilon \), one has \(\epsilon \le {\mathcal {C}}_{2,p}(U) \le \sqrt{p} \, \epsilon \). For p large the two distances can be very different, even though they go to zero together, so the complexity is still a continuous function of the distance. The difference becomes more significant when we consider systems with more degrees of freedom: in that case, as we have already seen, the complexity can increase exponentially in the number of qubits while the Hilbert space distance cannot.

As pointed out in [51], a hyperbolic geometry similar to the one we saw above, but for a larger number of qubits, accounts for the switchback effect discussed in Sect. 3.2. An initially small operator can be represented as a short segment in the space of unitaries. The precursor is obtained by evolving the two ends of the segment in time. Connecting the ends with geodesics sweeps out a two-dimensional surface; if we assume a constant negative curvature on this surface, then one can show that the geodesic distance grows in time with the same features described by the switchback, i.e., initially exponential and later linear with a time offset. This behavior is illustrated in Fig. 6.

Fig. 6 Illustration of the evolution of the complexity of the precursor as a geodesic deviation in negatively curved space

Finally, we can analyze in detail in this example the difference between operator complexity and state complexity. For the latter, we want to find the shortest path in operator space requiring that we reach a certain target state, so we define

$$\begin{aligned} {\mathcal {C}}(|\psi _T \rangle ,|\psi _R \rangle ) = \min _U {\mathcal {C}}(U), ~~~\text {s.t.}~~~ U|\psi _R \rangle =|\psi _T \rangle \,. \end{aligned}$$
(35)

The space of states of a qubit is \(\mathbb {CP}^1 \approx S^2\). It can be identified with the coset SU(2)/H where H is the stabilizer group of the action of SU(2) on the states. Explicitly we can parametrize the group as

$$\begin{aligned} (z_1,z_2) = \left( \frac{x}{\sqrt{1+x {\bar{x}}}}e^{i \alpha },\frac{1}{\sqrt{1+x {\bar{x}}}}e^{-i \alpha } \right) \, \end{aligned}$$
(36)

and identify x with the local coordinate on \(\mathbb {CP}^{1}\). The minimization over the stabilizer in (35) means that locally we have to choose a direction along the fiber that minimizes the length. When we write the metric (30) in these coordinates, we find that one can extract a term \((d\alpha + \ldots )^2\). Setting this term to zero minimizes the length, and one is left with a metric which is best written in angle coordinates using the stereographic projection \(x=\cot (\frac{\theta }{2})e^{i \phi }\):

$$\begin{aligned} ds^2 =\frac{1}{4} \left( d\theta ^2 + \frac{p \sin ^2 \theta }{\sin ^2\theta + p \cos ^2 \theta } d\phi ^2 \right) \,. \end{aligned}$$
(37)

It is clear from the definition (35) that the state complexity is in general not left-invariant, since the operator complexity is not: \({\mathcal {C}}(g|\psi _T \rangle ,g|\psi _R \rangle ) \ne {\mathcal {C}}(|\psi _T \rangle ,|\psi _R \rangle )\), and indeed the metric (37) is not homogeneous. For large p it has negative curvature everywhere except in a small region around the equator.
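The statement about the curvature can be verified directly. The sketch below (our own check, using sympy and an illustrative value p=100) computes the Gaussian curvature of the two-dimensional metric (37) and evaluates it at several polar angles; for large p the curvature comes out negative away from the equator (approaching \(12/p-8\) near the poles) and large and positive (of order 4p) in a narrow band around \(\theta =\pi /2\).

```python
import numpy as np
import sympy as sp

theta, p = sp.symbols('theta p', positive=True)

# Metric (37): ds^2 = E dtheta^2 + G dphi^2
E = sp.Rational(1, 4)
G = sp.Rational(1, 4) * p * sp.sin(theta)**2 / (sp.sin(theta)**2 + p * sp.cos(theta)**2)

# Gaussian curvature of an orthogonal metric whose components depend only on theta:
#   K = -1/(2 sqrt(EG)) d/dtheta [ (dG/dtheta) / sqrt(EG) ]
sqrtEG = sp.sqrt(E * G)
K = sp.simplify(-sp.diff(sp.diff(G, theta) / sqrtEG, theta) / (2 * sqrtEG))

K_num = sp.lambdify(theta, K.subs(p, 100), 'numpy')
for th in [0.1, 0.5, 1.0, 1.4, np.pi / 2]:
    print(f"theta = {th:4.2f}:  K = {K_num(th):+9.2f}")
```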

So far we have considered only the geometry corresponding to the penalized \(F_2\) cost. We could ask what the distance is for other costs, for instance \(F_1\). Unfortunately, it is quite complicated to compute the geodesics, even in this simple setup of a single qubit. Looking at the definition (23), it is clear that there is a simple case in which \({\mathcal {C}}_1\) and \({\mathcal {C}}_2\) coincide: when there is only one non-vanishing \(Y^I\). In this case the geodesic can be written as the exponential of a single gate, and we should assume that the gate is contained in the basis. However, inspection of the geodesics (32) shows that they do not have this simple form, except for the unpenalized case \(\gamma =0\), or for the special geodesics with \(J_3=0\).

5 Complexity of harmonic oscillators

So far we have discussed the complexity of states over spin chains. Those states live in a finite dimensional Hilbert space. We can also study the complexity in infinite-dimensional Hilbert spaces as long as we focus on a specific sub-manifold of states generated by a closed algebra of operators. One example is that of Gaussian states of bosonic or fermionic systems. We will develop some technology to deal with this example which will come in handy later when studying complexity in free scalar quantum field theory.

5.1 Complexity of Gaussian states

Gaussian states can be fully characterized by their one- and two-point functions. To make use of this fact we will define the Gaussian states in terms of their covariance matrix and displacement vector, see e.g., [60,61,62]

$$\begin{aligned} \text {Tr}({\hat{\rho }} \,{\hat{\xi }}^a {\hat{\xi }}^b) = \frac{1}{2}(G^{(ab)}+i \Omega ^{[ab]}),\qquad \text {Tr}({\hat{\rho }} \,{\hat{\xi }}^a)=w^a, \end{aligned}$$
(38)

where \({\hat{\rho }}\) is the density matrix representing the Gaussian state and \({\hat{\xi }}^a = ({\hat{q}}_1,\ldots , {\hat{q}}_N,{\hat{p}}_1,\ldots ,{\hat{p}}_N)\) are 2N degrees of freedom on the quantum phase space, consisting of position and momentum operators which can be either fermionic or bosonic. In the case of a pure state, (38) simply becomes

$$\begin{aligned} \langle \psi | {\hat{\xi }}^a {\hat{\xi }}^b |\psi \rangle = \frac{1}{2}(G^{(ab)}+i \Omega ^{[ab]}),\quad \langle \psi |{\hat{\xi }}^a |\psi \rangle =w^a\,. \end{aligned}$$
(39)

In Eqs. (38) and (39), \(G^{(ab)}\) encodes the symmetric part of the correlation function and \(\Omega ^{[ab]}\) encodes its anti-symmetric part. To begin with, we take the simplifying assumption that the states have vanishing one-point functions \(w^a=0\) in Eqs. (38) and (39). The case of non-vanishing displacement will be treated later in Sect. 5.3.

We will focus mostly on the bosonic case below, but a lot of this machinery has also been adapted for studying fermionic states, see e.g., [63,64,65]. For a bosonic system \(\Omega ^{[ab]}\) is trivially fixed by the canonical commutation relations of the phase space operators

$$\begin{aligned} \Omega = \begin{pmatrix}0&{}\mathbb {1}_{n\times n}\\ -\mathbb {1}_{n\times n}&{}0\\ \end{pmatrix}, \end{aligned}$$
(40)

and the only non-trivial information is in \(G^{(ab)}\). Hence, from now on we will refer to \(G^{(ab)}\) as the covariance matrix of the state \({\hat{\rho }}\).

For our complexity study we will focus on quantum circuits which move entirely within the space of Gaussian states with vanishing displacement and will therefore be parametrized using covariance matrices. Such circuits are generated by exponentiating quadratic generators as follows

$$\begin{aligned} {\hat{\rho }}(\sigma ) = {\widehat{U}}(\sigma ) {\hat{\rho }}(0) \widehat{U}^\dagger (\sigma ), \quad {\widehat{U}} (\sigma ) =e^{-\frac{i}{2} {\hat{\xi }}^a k_{(ab)}(\sigma ) {\hat{\xi }}^b} \end{aligned}$$
(41)

where \({\widehat{U}}(\sigma )\) is a unitary transformation parametrized by a symmetric matrix \(k_{(ab)}(\sigma )\) and \({\hat{\rho }}(\sigma )\) is the instantaneous density matrix along the circuit, with \(\sigma \in [0,1]\) a path-parameter along the circuit. Then, with some algebra one can easily demonstrate that (see e.g., [66, 67])

$$\begin{aligned} \begin{aligned}&{\widehat{U}}^{\dagger }(\sigma ) \, {\hat{\xi }}^a \, {\widehat{U}}(\sigma ) = S(\sigma )^a{}_b \, {\hat{\xi }}^b, \\&G(\sigma ) = S(\sigma ) \cdot G(0) \cdot S^T(\sigma ),\\&S^a{}_b(\sigma ) = \left( e^{K(\sigma )}\right) ^a{}_b, ~~~~ K(\sigma )^a{}_b = (\Omega \cdot k(\sigma ))^a{}_b, \end{aligned} \end{aligned}$$
(42)

where \(G(\sigma )\) is the covariance matrix of the state \(\hat{\rho }(\sigma )\) along the circuit. Note that \(S(\sigma )\) in the last equation belongs to the symplectic group \(Sp(2N,{\mathbb {R}})\) by virtue of satisfying

$$\begin{aligned} S(\sigma ) \cdot \Omega \cdot S^T(\sigma ) = \Omega . \end{aligned}$$
(43)

To make connection with the complexity functionals of Eq.  (23), we should decompose the symplectic transformation using a fixed basis of generators \(K_I\) of the symplectic group \(Sp(2N,{\mathbb {R}})\)

$$\begin{aligned} S(\sigma ) = \overleftarrow{{\mathcal {P}}} \exp \int _0^\sigma d\sigma ' \, Y^I(\sigma ') K_I \end{aligned}$$
(44)

and extract the control functions \(Y^I\).

The complexity depends on this choice of basis. One option is to fix the basis of generators \(K_I\) in terms of our choice \({\hat{\xi }}^a\) of the operators on the quantum phase space. That is, we select

$$\begin{aligned}&(K_{I=(a',b')})^a{}_b=(\Omega \cdot k_{I=(a',b')})^a{}_b,\quad a',b'\in {1,\ldots ,2N},\nonumber \\&k^I_{(ab)} = \frac{1}{\sqrt{1+\delta _{a'b'}}} (\delta _a^{a'} \delta _b^{b'}+\delta _b^{a'} \delta _a^{b'}), \end{aligned}$$
(45)

which represent the generator \(\exp {\left[ -i\frac{ {\hat{\xi }}_{a'}{\hat{\xi }}_{b'}+{\hat{\xi }}_{b'}{\hat{\xi }}_{a'}}{2\sqrt{1+\delta _{a'b'}}}\right] }\), see Eqs. (41) and (42). The proportionality factor is fixed such that the different generators are orthonormal, i.e., \(\frac{1}{2}\text {Tr}(K_I K_J^T) = \delta _{IJ}\). With this choice of basis we can extract the control functions

$$\begin{aligned} Y^I = \frac{1}{2}\text {Tr}(\partial _{\sigma }S S^{-1} K_I^T)\,. \end{aligned}$$
(46)

The norm (23) with \(p_I=1\) and \(k=2\), which we refer to as the unpenalized \(F_2=\sqrt{\sum _I |Y^I|^2}\) norm, can be expressed directly from the matrices \(S(\sigma )\) along the circuit as follows

$$\begin{aligned} ds^2 = \frac{1}{2} \text {Tr} \left( dS \, S^{-1} \, \, (dS \, S^{-1})^T \, \right) . \end{aligned}$$
(47)

This expression is written covariantly and does not require a particular choice of basis to be evaluated. However, note that to prove its equivalence with the unpenalized \(F_2\) norm, we had to assume that the generators of the circuit are chosen to be orthonormal.

A natural generalization of the \(F_2\) norm in Eq. (23) is defined in terms of a given covariance matrix \(G_{\text {metric}}\)

$$\begin{aligned} ds^2 = \frac{1}{2} \text {Tr} \left( dS \, S^{-1} \, G_{\text {metric}} \, (dS \, S^{-1})^T \, G_{\text {metric}}^{-1} \right) . \end{aligned}$$
(48)

In effect, the choice of \(G_{\text {metric}}\) introduces some penalty factors into the definition of the \(F_2\) norm. When the generators of the symplectic group satisfy

$$\begin{aligned} \frac{1}{2} \text {Tr} \left( K_I \, G_{\text {metric}} \, K_J^T G_{\text {metric}}^{-1} \right) = \delta _{IJ}, \end{aligned}$$
(49)

we recover the unpenalized \(F_2\) norm. More generally, we have

$$\begin{aligned} \frac{1}{2} \text {Tr} \left( K_I \, G_{\text {metric}} \, K_J^T G_{\text {metric}}^{-1} \right) = \gamma _{IJ} \end{aligned}$$
(50)

and \(F_2 = \sqrt{\gamma _{IJ}Y^I Y^J}\) where \(\gamma _{IJ}\) function as penalty factors. We would like to emphasize that the unpenalized \(F_2\) norm is basis dependent. While remaining unmodified under orthogonal transformations which mix the positions among themselves (accompanied by the same orthogonal transformation on momenta), the unpenalized \(F_2\) norm in fact changes under more general symplectic transformations which modify the orthogonality condition (49), even with \(G_{\text {metric}}=1\).

The complexity problem, i.e., finding the optimal trajectory (or circuit) between a reference state \(G_R\) and a target state \(G_T\) within the complexity geometry (48), can now be formulated explicitly as a geodesic problem, namely

$$\begin{aligned} \begin{aligned}&{\mathcal {C}}_2 = \min _{S(\sigma )} \int _0^1 d\sigma \left( \frac{ds}{d\sigma }\right) , \qquad \text {such that}\\&S(\sigma =1) G_R S^T (\sigma =1) = G_T\,. \end{aligned} \end{aligned}$$
(51)

It was proven in [20, 66] that, when the matrix \(G_{\text {metric}}\) used to define the geometry (48) coincides with the covariance matrix \(G_R\) of the reference state, the geodesics from the reference state to the target state take the particularly simple form of “straight lines”, i.e.,

$$\begin{aligned} S(\sigma ) = \exp \left[ \frac{\sigma }{2} \log \Delta \right] , \qquad \Delta \equiv G_T G_R^{-1}, \end{aligned}$$
(52)

where \(\Delta \) is the relative covariance matrix between the reference and the target state.

With the choice \(G_{\text {metric}}=G_R\), and for generators satisfying the condition (49), the unpenalized \({\mathcal {C}}_2\) complexity, associated with the unpenalized \(F_2\) cost function reads

$$\begin{aligned} {\mathcal {C}}_2 (G_R,G_T) = \frac{1}{2\sqrt{2}}\sqrt{\text {Tr}[(\log \Delta )^2]}\,. \end{aligned}$$
(53)

While the trajectory (52) does not necessarily minimize the unpenalized \(F_1\) norm given by Eq. (23) with \(k=1\) and \(p_I=1\), we could still evaluate its cost to obtain an upper bound on the unpenalized \({\mathcal {C}}_1\) complexity

$$\begin{aligned} {\mathcal {C}}_1 \le {\mathcal {C}}_1^{UB} = \sum _I |Y^I| = \frac{1}{4} \sum _I |\text {Tr}(\log \Delta \cdot K_{I}^T)|\,. \end{aligned}$$
(54)
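The formulas (52)–(54) are straightforward to implement numerically. Below is a minimal numpy/scipy sketch (our own; the helper names are ours) that draws two random pure-state covariance matrices, checks that the straight-line circuit (52) indeed maps \(G_R\) to \(G_T\), and evaluates the \({\mathcal {C}}_2\) complexity (53) and the \({\mathcal {C}}_1\) upper bound (54) in the generator basis (45).

```python
import numpy as np
from scipy.linalg import expm, logm

def symplectic_Omega(N):
    return np.block([[np.zeros((N, N)), np.eye(N)], [-np.eye(N), np.zeros((N, N))]])

def random_pure_covariance(N, rng):
    """G = S.S^T for a random symplectic S, i.e. a random pure Gaussian state with w = 0."""
    k = rng.normal(size=(2 * N, 2 * N))
    S = expm(symplectic_Omega(N) @ (k + k.T) / 2)
    return S @ S.T

def generator_basis(N):
    """Orthonormal Sp(2N,R) generators K_I = Omega.k_I of Eq. (45): (1/2)Tr(K_I K_J^T) = delta_IJ."""
    Omega, Ks = symplectic_Omega(N), []
    for a in range(2 * N):
        for b in range(a, 2 * N):
            k = np.zeros((2 * N, 2 * N))
            k[a, b] = k[b, a] = np.sqrt(2.0) if a == b else 1.0
            Ks.append(Omega @ k)
    return Ks

rng = np.random.default_rng(1)
N = 2
G_R, G_T = random_pure_covariance(N, rng), random_pure_covariance(N, rng)

log_Delta = np.real(logm(G_T @ np.linalg.inv(G_R)))        # log of the relative covariance matrix

# The straight-line circuit (52) indeed maps G_R to G_T
S1 = expm(0.5 * log_Delta)
print(np.allclose(S1 @ G_R @ S1.T, G_T))                   # True

# Unpenalized C_2, Eq. (53), and the C_1 upper bound, Eq. (54)
C2 = np.sqrt(np.trace(log_Delta @ log_Delta)) / (2 * np.sqrt(2))
C1_UB = sum(abs(np.trace(log_Delta @ K.T)) for K in generator_basis(N)) / 4
print(f"C_2 = {C2:.4f},   C_1^UB = {C1_UB:.4f}")
```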

5.2 Single harmonic oscillator

As a specific example, let us focus on the bosonic case of a simple harmonic oscillator described by the following Hamiltonian

$$\begin{aligned} H = \frac{1}{2M}P^2+\frac{1}{2}M\omega ^2 Q^2 \end{aligned}$$
(55)

with M and \(\omega \) the mass and frequency of the oscillator, respectively, and Q and P its position and momentum. In what follows it will be more convenient to work in terms of rescaled position and momentum variables, and hence we define

$$\begin{aligned} p \equiv P/\omega _g, \qquad q\equiv \omega _g Q. \end{aligned}$$
(56)

(In the case of several position and momentum operators, we rescale all of them.) Later on, the scale \(\omega _g\) will participate in defining a gate scale when discussing complexity. More precisely, it will play a role in rendering the control functions \(Y^I\) dimensionless. With the rescaled variables, the Hamiltonian takes the form

$$\begin{aligned} H = \frac{\omega _g^2}{M} \, \left( \frac{1}{2}p^2 + \frac{1}{2} \lambda ^2 q^2\right) , \qquad \lambda \equiv \frac{M\omega }{\omega _g^2}. \end{aligned}$$
(57)

A general Gaussian wavefunction takes the form

$$\begin{aligned} \psi (q) = \langle q | \psi \rangle = \left( \frac{a}{\pi }\right) ^{1/4} \exp \left[ -\frac{1}{2} (a+ib)q^2\right] \end{aligned}$$
(58)

where a and b are real numbers and a has to be positive in order for the wavefunction to be normalizable. For the special case of the vacuum state of the Hamiltonian (57) we have \(a=\lambda \) and \(b=0\).

Explicitly evaluating the covariance matrix for the wavefunction (58) we obtain

$$\begin{aligned} G = \begin{pmatrix} \frac{1}{a} &{}~ -\frac{b}{a} \\ -\frac{b}{a} &{}~~ \frac{a^2+b^2}{a} \end{pmatrix} \end{aligned}$$
(59)

and in particular for the vacuum state

$$\begin{aligned} G_{\text {vac}} = \begin{pmatrix} \frac{1}{\lambda } &{}~ 0 \\ 0 &{}~~ \lambda \end{pmatrix}. \end{aligned}$$
(60)
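
As a quick cross-check of Eq. (59), the sketch below (assuming the convention \(G^{ab}=\langle \psi |\{\xi ^a,\xi ^b\}|\psi \rangle \) with \(\xi =(q,p)\), and arbitrary sample values of a and b) computes the covariance matrix entries directly as moments of the wavefunction (58) on a grid.

```python
# Numerical moments of the Gaussian wavefunction (58) versus the entries of Eq. (59).
import numpy as np

a, b = 1.3, 0.7                                  # sample parameters (a > 0)
q = np.linspace(-12.0, 12.0, 40001)
dq = q[1] - q[0]
integ = lambda f: np.sum(f) * dq                 # simple quadrature on the grid

psi = (a/np.pi)**0.25 * np.exp(-0.5*(a + 1j*b)*q**2)
dpsi = np.gradient(psi, q)                       # d(psi)/dq, so that p*psi = -i*dpsi

G_qq = 2*integ(np.abs(psi)**2 * q**2)                    # <{q,q}> = 2<q^2>
G_pp = 2*integ(np.abs(dpsi)**2)                          # <{p,p}> = 2<p^2>
G_qp = 2*np.real(integ(psi.conj() * q * (-1j)*dpsi))     # <qp + pq>

print(np.round([G_qq, G_pp, G_qp], 3))
print([1/a, (a**2 + b**2)/a, -b/a])              # entries of Eq. (59)
```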

As we will motivate later when discussing complexity in QFT, the reference state is often taken to be the ground state of another Hamiltonian with a different frequency \(\omega =\mu \) and hence its covariance matrix is

$$\begin{aligned} G_R = \begin{pmatrix} \frac{1}{\lambda _R} &{}~ 0 \\ 0 &{}~~ \lambda _R \end{pmatrix}, \qquad \lambda _R = \frac{M\mu }{\omega _g^2}. \end{aligned}$$
(61)

The relative covariance matrix between the reference state and the vacuum reads

$$\begin{aligned} \Delta = \begin{pmatrix} \frac{\lambda _R}{\lambda } &{}~ 0 \\ 0 &{}~~ \frac{\lambda }{\lambda _R} \end{pmatrix} \end{aligned}$$
(62)

and so the unpenalized \({\mathcal {C}}_2\) complexity is simply

$$\begin{aligned} {\mathcal {C}}_2 (G_R,G_T) = \frac{1}{2} \left| \log \left( \frac{\lambda }{\lambda _R}\right) \right| = \frac{1}{2} \left| \log \left( \frac{\omega }{\mu }\right) \right| . \end{aligned}$$
(63)

Note that in this expression the gate scale \(\omega _g\) has canceled.

To evaluate the bound (54) on the unpenalized \({\mathcal {C}}_1\) complexity, \({\mathcal {C}}_1\le {\mathcal {C}}_1^{UB}\), we should first select a basis. As described around Eq. (45), we could consider circuits associated with the generators

$$\begin{aligned} {\widehat{K}}_1 = \frac{1}{2}\left( pq+qp\right) , \quad {\widehat{K}}_2 = \frac{q^2}{\sqrt{2}},\quad {\widehat{K}}_3 = \frac{p^2}{\sqrt{2}}. \end{aligned}$$
(64)

Using the relations (41) and (42) we may read off the relevant matrices \(k_{(ab)}\)

$$\begin{aligned} \begin{aligned} k_{(ab)}^1 = \begin{pmatrix}0&{}1\\ 1&{}0\end{pmatrix},~ k_{(ab)}^2 = \begin{pmatrix}\sqrt{2}&{}0\\ 0&{}0\end{pmatrix},~ k_{(ab)}^3 = \begin{pmatrix}0&{}0\\ 0&{}\sqrt{2}\end{pmatrix}, \end{aligned} \end{aligned}$$
(65)

and the corresponding \(Sp(2,{\mathbb {R}})\) generators:

$$\begin{aligned} \begin{aligned} K_1 = \begin{pmatrix}1&{}0\\ 0&{}-1\end{pmatrix},~ K_2 = \begin{pmatrix}0&{}0\\ -\sqrt{2}&{}0\end{pmatrix},~ K_3 = \begin{pmatrix}0&{}\sqrt{2}\\ 0&{}0\end{pmatrix}. \end{aligned} \end{aligned}$$
(66)

This leads to

$$\begin{aligned} {\mathcal {C}}_1^{UB} = \frac{1}{2} \left| \log \left( \frac{\lambda }{\lambda _R }\right) \right| = \frac{1}{2} \left| \log \left( \frac{\omega }{\mu }\right) \right| . \end{aligned}$$
(67)

Note that in this very special case we have obtained the same result for the two cost functions. Generally this will not be the case. If we consider, for example, a system of many decoupled harmonic oscillators, each with a Hamiltonian of the form (55) but with different frequencies \(\omega _i\), the complexities are simply given by

$$\begin{aligned} {\mathcal {C}}_1^{UB} = \frac{1}{2} \sum _{i} \left| \log \frac{\omega _i}{\mu }\right| ; \quad {\mathcal {C}}_2 =\frac{1}{2} \sqrt{\sum _{i} \left( \log \frac{\omega _i}{\mu }\right) ^2}. \end{aligned}$$
(68)
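
The closed-form results (63) and (67) can be checked numerically; the short sketch below (with sample frequencies only) evaluates Eqs. (53) and (54) directly from the covariance matrices (60)–(61) and the generators (66). Summing the same per-mode expressions over decoupled oscillators reproduces Eq. (68).

```python
# Check of Eqs. (63) and (67) for a single oscillator, assuming G_metric = G_R.
import numpy as np
from scipy.linalg import logm

lam, lam_R = 2.7, 0.4               # lambda = M*omega/omega_g^2, lambda_R = M*mu/omega_g^2 (sample values)

G_T = np.diag([1/lam, lam])         # vacuum covariance matrix, Eq. (60)
G_R = np.diag([1/lam_R, lam_R])     # reference covariance matrix, Eq. (61)

Delta = G_T @ np.linalg.inv(G_R)    # relative covariance matrix, Eq. (62)
logD = logm(Delta).real

C2 = np.sqrt(np.trace(logD @ logD)) / (2*np.sqrt(2))        # Eq. (53)

# Sp(2,R) generators of Eq. (66)
K = [np.array([[1, 0], [0, -1]]),
     np.array([[0, 0], [-np.sqrt(2), 0]]),
     np.array([[0, np.sqrt(2)], [0, 0]])]
C1_UB = 0.25*sum(abs(np.trace(logD @ Ki.T)) for Ki in K)    # Eq. (54)

print(C2,    0.5*abs(np.log(lam/lam_R)))   # should match Eq. (63)
print(C1_UB, 0.5*abs(np.log(lam/lam_R)))   # should match Eq. (67)
```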

5.3 Complexity of coherent states

We can extend the discussion of Sect. 5.1 to the case of Gaussian states with non-zero displacement (cf. (38) and (39)), i.e., coherent states. We follow mostly the treatment of [68], with some modifications (see also [69] for a different approach). For simplicity, we focus on wavefunctions of the form

$$\begin{aligned} \psi (q_i) = \mathcal{N} \text {exp} \left[ - \frac{1}{2} A_{ij} (q_i -a_i)(q_j - a_j) \right] \,, \end{aligned}$$
(69)

with \(A_{ij}\) and \(a_i\) for \(i\in \{1,\ldots ,N\}\) real parameters. As a consequence, the displacement vector in (39) is non-vanishing only along the coordinate directions and vanishes in the momenta, \(\langle q_i \rangle = a_i,\) \(\langle p_i \rangle =0\). Clearly this restricts the choice of symplectic transformations, as we can only allow transformations that do not mix coordinates and momenta.Footnote 15 The transformations we consider take the form

$$\begin{aligned} q_i \rightarrow m_{ij} (q_j + b_j), \end{aligned}$$
(70)

where \(m_{ij}\) is a general real matrix. These transformations keep us within the class of real wavefunctions (69), in addition to preserving the vanishing expectation value of the momenta. The transformations (70) form the group \(GL(N,{\mathbb {R}}) \ltimes {\mathbb {R}}^N\).

We could generalize the discussion of Sect. 5.1 by introducing new gates that move within the space of coherent states. We will follow a different route which allows us to borrow the previous results directly. We observe that a coherent state wavefunction can be interpreted as a Gaussian wavefunction in a space with one more coordinate. We rewrite (69) as

$$\begin{aligned} \psi (q_I) = \mathcal{N} \text {exp} \left[ - \frac{1}{2} {\tilde{A}}_{IJ} q_I q_J \right] \,, \end{aligned}$$
(71)

with \(q_I = (q_0,q_i)\). At \(q_0=1\), this reduces to (69) if \({\tilde{A}}_{ij} = A_{ij}\), \({\tilde{A}}_{i0}=- A_{ij} a_j\), whereas the value of \({\tilde{A}}_{00}\) can be reabsorbed in the normalization factor and so is irrelevant.

The transformations (70) can be embedded into the group of linear transformations \(GL(N+1,{\mathbb {R}})\) on the operatorsFootnote 16 of the extended space as follows:

$$\begin{aligned} M = \begin{pmatrix} 1 &{} 0 \\ m \mathbf{b} &{} m \end{pmatrix} \,, \quad m \in GL(N, {\mathbb {R}}) \,. \end{aligned}$$
(72)

The action on the wavefunction induced by \({\hat{q}} \rightarrow M {\hat{q}}\) is given by \(q\rightarrow M^{-1} q\) or equivalently \({\tilde{A}} \rightarrow M^{-1}{}^T {\tilde{A}} \, M^{-1}\). Notice that the value of \(q_0\) does not change under the action of M.

In order to apply the formulas of Sect. 5.1 we need the covariance matrix of the state and the symplectic transformations that act on it. They have a block-diagonal form:

$$\begin{aligned} G = \begin{pmatrix} {\tilde{A}}^{-1} &{} 0 \\ 0 &{} {\tilde{A}} \end{pmatrix} \,, \quad S = \begin{pmatrix} M &{} 0 \\ 0 &{} M^{-1}{}^T \end{pmatrix} \,. \end{aligned}$$
(73)

With these ingredients at hand, we can use the formula (48) for the metric. Choosing as before \(G_{metric} =G_R= \begin{pmatrix} \frac{1}{\lambda _R} \mathbb {1} &{} 0 \\ 0 &{} \lambda _R \mathbb {1} \end{pmatrix}\), this gives

$$\begin{aligned} \begin{aligned} ds^2&= \text {tr} \left( dM \, M^{-1} (dM\, M^{-1})^T \right) \\&= \text {tr}\left( dm \, m^{-1} (dm \, m^{-1})^T \right) + d \mathbf{b}^T m^T m d\mathbf{b}\,. \end{aligned} \end{aligned}$$
(74)

We find that the \({\mathbb {R}}^N\) factor has a flat metric, but it is non-trivially fibered over the GL(N) factor.

In order to give a more explicit description of the geometry we restrict now to the case \(N=2\). We can use the following parametrization of a GL(2) matrix:

$$\begin{aligned} m = \begin{pmatrix} \cos \alpha &{} -\sin \alpha \\ \sin \alpha &{} \cos \alpha \end{pmatrix} \begin{pmatrix} e^{-y_1} &{} 0 \\ 0 &{} e^{-y_{2}} \end{pmatrix} \begin{pmatrix} \cos \beta &{} -\sin \beta \\ \sin \beta &{} \cos \beta \end{pmatrix} . \end{aligned}$$
(75)

In these coordinates the metric (74) reads

$$\begin{aligned} ds^2= & {} dy_1^2 + dy_2^2 + 2 d\alpha ^2 \nonumber \\&+ 4 \cosh (y_1-y_2) d\alpha d\beta +2 \cosh (2y_1-2y_2) d\beta ^2 \nonumber \\&+ e^{-2 y_1} (\cos \beta \, db_1 - \sin \beta \, db_2)^2 \nonumber \\&+ e^{-2 y_2} (\sin \beta \, db_1 + \cos \beta \, db_2)^2 \,. \end{aligned}$$
(76)

The equations for the geodesics in this geometry cannot be solved analytically. An interesting property of this geometry, as was shown in [68], is that if we want to start from the reference state \(A_R = \lambda _R \mathbb {1}\), \(\mathbf{a}_R =0\) and arrive at the target state \(A =\begin{pmatrix} \lambda _1 &{} 0 \\ 0 &{} \lambda _2 \end{pmatrix}\) with \(\lambda _1 \ne \lambda _2\), and with \(a_1,a_2\) both non-vanishing, then the corresponding geodesic will pass through states in which the two oscillators are entangled, even though in both the initial and final states the two oscillators are unentangled.

If instead we turn on only one component of the displacement vector, it is possible to find simple geodesics analytically. One can show that the geodesics satisfying \(\alpha =\beta =n\pi \), \(b_2=0\) can be obtained from the metric induced on this slice:

$$\begin{aligned} ds^2 = dy_1^2+dy_2^2+e^{-2 y_1} db_1^2 \,. \end{aligned}$$
(77)

This geometry is \({\mathbb {H}}^2\times {\mathbb {R}}\), and we see the hyperbolic space in the coordinates \(y_1\), \(b_1\) arising from the fibration. The target states corresponding to this submanifold have \(\langle q_2 \rangle =0\) and are unentangled in the given coordinates. It is easy to evaluate the complexity of a target state with \(A_T=\text {diag}(\lambda _1,\lambda _2 )\), \(\mathbf{a}=(a_1,0)\) and obtain

$$\begin{aligned} {\mathcal {C}}_2 = \sqrt{ \frac{1}{4} \log ^2 \frac{\lambda _2}{\lambda _R} + \text {arccosh}^2\left( \frac{\lambda _R+\lambda _1+ \lambda _1 a_1^2}{2 \sqrt{\lambda _R \lambda _1}} \right) }\,. \end{aligned}$$
(78)

The geometry (77) is simple enough that in this case we can compute explicitly the complexity also for the \(F_1\) cost function, rather than just giving an upper bound.Footnote 17 Since the \(y_2\) direction is decoupled, we can consider trajectories in the \(y_1,b_1\) directions; for simplicity we rename them as y and b. The cost function is

$$\begin{aligned} F_1 = \int ds \left( |\dot{y} | + e^{-y} |\dot{b}| \right) \,. \end{aligned}$$
(79)

This is a singular functional, so the minimizing trajectories cannot be found simply by solving the equations of motion. Let us consider a trajectory from \((y_i,b_i)\) to \((y_f,b_f)\) and assume for simplicity that \(y_f> y_i, b_f > b_i\). If we assume that \(\dot{y}(s)>0\), the first term is independent of the trajectory, and the second term is minimized by making y as large as possible. The minimal trajectory will move in a straight line first along the y axis, and then along the b axis at \(y=y_f\). The cost of this path is \( \Delta y + e^{- y_f} \Delta b\). However, it can be more convenient to minimize the second term by moving along b at a larger value of y, say \({\tilde{y}}\), paying the price of backtracking in the y direction. The minimum length is obtained for \(e^{{\tilde{y}}} = \Delta b/2\), and is \(2+2 \log \frac{\Delta b}{2} - y_i - y_f\). This path has shorter length when \({\tilde{y}}>y_f\), or \(e^{-y_f}\Delta b >2\). In terms of the parameters of the wavefunction, moving from the reference state to the target state with parameters \(\lambda \) and a (with \(\lambda >\lambda _R\)) and using the relations \(y_f=\frac{1}{2}\log \frac{\lambda }{\lambda _R}\), \(b_f=\sqrt{\frac{\lambda }{\lambda _R}}\, a\) and \(b_i=y_i=0\),Footnote 18 we find a cost

$$\begin{aligned} \begin{aligned} {\mathcal {C}}_1&= \frac{1}{2} \log \frac{\lambda }{\lambda _R} + |a| \,, \quad&|a| < 2\,, \\ {\mathcal {C}}_1&= \frac{1}{2} \log \frac{\lambda }{\lambda _R} + 2 + 2 \log \frac{ |a|}{2} \,, \quad&|a| > 2 \,. \end{aligned} \end{aligned}$$
(80)

Similar results can be obtained for \(\lambda < \lambda _R\). Notice that the contribution from the displacement is frequency-independent. The dependence on a is linear for small a, whereas it is quadratic in the \({\mathcal {C}}_2\) case. For large a the leading behavior is \(\log (a^2)\) in both cases, but the subleading terms are different and are frequency-dependent for \({\mathcal {C}}_2\). The path that minimizes \({\mathcal {C}}_1\) is not the same as the one that minimizes \({\mathcal {C}}_2\), so the upper bound \({\mathcal {C}}_1^{UB}\) from the previous sections is not saturated.
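
Both results above are easy to verify numerically. The sketch below (with arbitrary sample values for \(\lambda _R\), \(\lambda _{1,2}\) and a, and using the endpoint identification \(y_f=\frac{1}{2}\log \frac{\lambda }{\lambda _R}\), \(b_f=\sqrt{\lambda /\lambda _R}\,a\) quoted above) checks Eq. (78) against the standard hyperbolic distance on \({\mathbb {H}}^2\) in the coordinates \(z=e^{y_1}\), \(b=b_1\), and recovers the piecewise \({\mathcal {C}}_1\) of Eq. (80) by minimizing the "up, across, back down" path over the backtracking height \({\tilde{y}}\).

```python
# Numerical checks of Eqs. (78) and (80) on the geometry (77); all numbers are sample values.
import numpy as np
from scipy.optimize import minimize_scalar

lam_R, lam1, lam2 = 0.8, 2.5, 1.7

# --- Eq. (78): C_2 from the geodesic distance on H^2 x R ------------------
a1 = 1.3
zf, bf = np.sqrt(lam1/lam_R), np.sqrt(lam1/lam_R)*a1
dH2 = np.arccosh(1 + (bf**2 + (zf - 1)**2)/(2*zf))        # distance from (b, z) = (0, 1)
C2_geo = np.hypot(dH2, 0.5*np.log(lam2/lam_R))
C2_eq78 = np.sqrt(0.25*np.log(lam2/lam_R)**2
                  + np.arccosh((lam_R + lam1 + lam1*a1**2)/(2*np.sqrt(lam_R*lam1)))**2)
print(C2_geo, C2_eq78)                                     # should agree

# --- Eq. (80): C_1 from minimizing over the backtracking height -----------
def C1_numeric(lam, a):
    yf, bf = 0.5*np.log(lam/lam_R), np.sqrt(lam/lam_R)*abs(a)
    cost = lambda y: y + abs(y - yf) + np.exp(-y)*bf       # up to y, across, (back) to y_f
    return minimize_scalar(cost, bounds=(0.0, 50.0), method='bounded').fun

def C1_eq80(lam, a):
    base = 0.5*np.log(lam/lam_R)
    return base + abs(a) if abs(a) < 2 else base + 2 + 2*np.log(abs(a)/2)

for a in (0.5, 5.0):
    print(C1_numeric(lam1, a), C1_eq80(lam1, a))           # should agree
```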

5.4 Complexity of the thermofield double state

A particularly interesting example of a Gaussian state of vanishing displacement whose complexity can be studied using the techniques of Sect.  5.1 is the thermofield double (TFD) state of a single harmonic oscillator. The complexity of this state was studied in [66] (see also [67]). The TFD state is defined with respect to two identical copies of a given system as follows

$$\begin{aligned} |TFD(t) \rangle = {\mathcal {N}}_{\text {TFD}} \sum _n e^{-\frac{\beta E_n}{2}-iE_n t} |E_n\rangle _L |E_n\rangle _R \end{aligned}$$
(81)

where the two copies have been labeled left and right (L/R), \(E_n\) are the energies of the eigenstates \(|E_n\rangle \), t is the time, \(\beta \) is the inverse temperature and \({\mathcal {N}}_{\text {TFD}}\) is a normalization constant. The TFD state is a pure state which evolves non-trivially under time evolution.Footnote 19 It is also a particularly symmetric purification of the thermal state, i.e., tracing out the right subsystem leaves a mixed thermal state on the left subsystem – more on that in the next section.

If we focus on the example of the single harmonic oscillator from Sect. 5.2, we will have energy eigenstates defined according to the Hamiltonian (55)

$$\begin{aligned} H |n\rangle = \omega \left( n+\frac{1}{2}\right) |n \rangle . \end{aligned}$$
(82)

Of course, since we are working with two copies of the system, we will have both left and right energy eigenstates \(|n\rangle _L\) and \(|n\rangle _R\). In terms of these eigenstates the TFD state reads

$$\begin{aligned}&|TFD(t,\omega ) \rangle = {\mathcal {N}}_{\text {TFD}} \sum _{n=0}^\infty e^{-n\beta \omega /2-i(n+\frac{1}{2})\omega t} |n \rangle _L |n \rangle _R \nonumber \\&\quad = e^{-i \omega t/2} {\mathcal {N}}_{\text {TFD}} \exp {\left[ e^{-\beta \omega /2-i\omega t} a_L^\dagger a_R^\dagger \right] } |0 \rangle _L |0 \rangle _R. \end{aligned}$$
(83)

The second line shows that this state is Gaussian since it is produced from the vacuum state using a quadratic operator. It will be convenient to combine the position and momentum operators for the left and right copies as follows

$$\begin{aligned} Q_{\pm } = \frac{1}{\sqrt{2}}(Q_L\pm Q_R), \quad P_{\pm } = \frac{1}{\sqrt{2}}(P_L\pm P_R), \end{aligned}$$
(84)

and define their dimensionless versions according to Eq. (56). In these ± coordinates, the \(4 \times 4\) covariance matrix is block diagonal. The blocks have the form

$$\begin{aligned} \begin{aligned}&G_{\text {TFD}}^{\pm }(t) = \\&{\small \begin{bmatrix} \frac{(\cosh (2\alpha )\pm \sinh (2\alpha )\cos (\omega t))}{\lambda } &{} {\mp }\sinh (2\alpha )\sin (\omega t) \\ {\mp }\sinh (2\alpha )\sin (\omega t) &{} \lambda (\cosh (2\alpha ){\mp }\sinh (2\alpha )\cos (\omega t)) \end{bmatrix}} \,, \end{aligned} \end{aligned}$$
(85)

where we have defined

$$\begin{aligned} \alpha = \frac{1}{2}\log \left[ \frac{1+e^{-\beta \omega /2}}{1-e^{-\beta \omega /2}}\right] , \end{aligned}$$
(86)

and \(\lambda \) has been defined in Eq. (57). The reference state for each of the blocks is taken as in Eq. (61) and selecting \(G_{metric}=G_R\) as described above Eq. (52), we can evaluate the \({\mathcal {C}}_2\) complexity as before. At \(t=0\), we obtain

$$\begin{aligned} {\mathcal {C}}_2 = \sqrt{\frac{1}{2} \log ^2 \frac{\omega }{\mu } + 2\alpha ^2}. \end{aligned}$$
(87)

Note that the gate scale \(\omega _g\) has canceled from this expression. Evaluating the cost of the \({\mathcal {C}}_2\)-optimal circuit with the \(F_1\) cost function at \(t=0\), in the basis defined with respect to the \(Q_{\pm }\) and \(P_{\pm }\) coordinates, yields

$$\begin{aligned} {\mathcal {C}}_1^{(\pm ),UB} = \left| \frac{1}{2}\log \frac{\omega }{\mu } + \alpha \right| + \left| \frac{1}{2}\log \frac{\omega }{\mu } - \alpha \right| . \end{aligned}$$
(88)

When considering a basis which acts naturally on the physical L and R degrees of freedom rather than the ± modes, we obtain the following complexity at \(t=0\)

$$\begin{aligned} {\mathcal {C}}_1^{(LR),UB} = |\log (\omega /\mu )\,| + 2|\alpha |. \end{aligned}$$
(89)

We will see later that the results of the measure \({\mathcal {C}}_1^{(LR)}\) match best with holography.

It is interesting to compare the complexity of the TFD state at \(t=0\) to that of two copies of the vacuum state, see Eqs. (63) and (67). We refer to this difference in complexities as the complexity of formation of the thermal state [70]

$$\begin{aligned} \Delta {\mathcal {C}} \equiv {\mathcal {C}} (|TFD(t=0)\rangle ) - 2{\mathcal {C}} (|0\rangle )\,. \end{aligned}$$
(90)

This yields for the various cost functions

$$\begin{aligned}&\Delta {\mathcal {C}}_2 = \sqrt{\frac{1}{2} \log ^2 \frac{\omega }{\mu } + 2\alpha ^2} - \frac{1}{\sqrt{2}} \left| \log \frac{\omega }{\mu } \right| \,, \nonumber \\&\Delta {\mathcal {C}}_1^{(\pm ),UB} = \left| \frac{1}{2}\log \frac{\omega }{\mu } + \alpha \right| + \left| \frac{1}{2}\log \frac{\omega }{\mu } - \alpha \right| - \left| \log \frac{\omega }{\mu } \right| \,, \nonumber \\&\Delta {\mathcal {C}}_1^{(LR),UB} = 2|\alpha |\,. \end{aligned}$$
(91)

We can also evaluate the complexity at a different time \(t\ne 0\), but the expressions are slightly more cumbersome and we will not write them here. We refer the reader to section 4.4 of [66]. In general at \(t\ne 0\) the gate scale \(\omega _g\) dependence will not cancel out. However, simplified expressions can be obtained when choosing it such that \(\lambda _R=1\). We will make this choice from now on. Let us further remark that due to the periodic time dependence in the covariance matrix (85), it is clear that the complexity will oscillate in time with frequency \(\omega \). The contribution of these oscillations to the complexity can be shown to be exponentially suppressed at large \(\beta \omega \) (i.e., \(\Delta {\mathcal {C}} \sim e^{-\#\beta \omega }\)).
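
For concreteness, the following sketch (sample parameters, with the gate scale chosen such that \(\lambda _R=1\)) builds the covariance matrix blocks (85), evaluates \({\mathcal {C}}_2(t)\) from Eq. (53) with \(G_{\text {metric}}=G_R\), and checks the \(t=0\) value against Eq. (87) while displaying the oscillation with frequency \(\omega \).

```python
# C_2 complexity of the single-oscillator TFD from the covariance matrix (85); sample parameters.
import numpy as np
from scipy.linalg import logm

M, omega, mu, beta = 1.0, 2.0, 1.0, 1.5
omega_g = np.sqrt(M*mu)                   # gate scale chosen so that lambda_R = 1
lam, lam_R = M*omega/omega_g**2, 1.0
alpha = 0.5*np.log((1 + np.exp(-beta*omega/2))/(1 - np.exp(-beta*omega/2)))   # Eq. (86)

def G_block(t, s):                        # s = +1 / -1 labels the +/- block of Eq. (85)
    c, d = np.cosh(2*alpha), np.sinh(2*alpha)
    return np.array([[(c + s*d*np.cos(omega*t))/lam, -s*d*np.sin(omega*t)],
                     [-s*d*np.sin(omega*t), lam*(c - s*d*np.cos(omega*t))]])

G_R = np.diag([1/lam_R, lam_R])

def C2(t):
    tr = 0.0
    for s in (+1, -1):
        logD = logm(G_block(t, s) @ np.linalg.inv(G_R)).real
        tr += np.trace(logD @ logD)
    return np.sqrt(tr)/(2*np.sqrt(2))     # Eq. (53), summed over the two blocks

print(C2(0.0), np.sqrt(0.5*np.log(omega/mu)**2 + 2*alpha**2))        # Eq. (87)
print([round(C2(t), 4) for t in np.linspace(0, 2*np.pi/omega, 5)])   # oscillation with frequency omega
```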

5.5 Complexity of mixed states

So far we have focused on the complexity of pure states. However, it is of interest to try to define complexity for mixed states too. In this section we will focus on one such definition – the complexity of purification, i.e., the circuit complexity minimized over the possible purifications of the mixed state we are interested in.

More precisely, imagine that we start with a mixed state of a system \({\mathcal {A}}\) described by the density matrix \({\hat{\rho }}_{{\mathcal {A}}}\). To purify the mixed state we supplement the degrees of freedom in \({\mathcal {A}}\) with ancillary degrees of freedom in a complementary system \({\mathcal {A}}^c\). We consider purifications of the state \({\hat{\rho }}_{{\mathcal {A}}}\), i.e., pure states on the combined system \(|\psi _{{\mathcal {A}} {\mathcal {A}}^c}\rangle \) such that \({\hat{\rho }}_{{\mathcal {A}}} = Tr_{{\mathcal {A}}^c} |\psi _{{\mathcal {A}} {\mathcal {A}}^c}\rangle \langle \psi _{{\mathcal {A}} {\mathcal {A}}^c}|\). The complexity of purification is simply defined as the minimal pure state complexity among all such possible purifications and all possible ancillary system sizes \({\mathcal {C}}({\hat{\rho }}_{\mathcal {A}}) = \min {\mathcal {C}}(|\psi _{{\mathcal {A}} {\mathcal {A}}^c}\rangle )\) starting with a completely unentangled reference state on the combined \({\mathcal {A}} {\mathcal {A}}^c\) system. Figure 7 illustrates this process.

Fig. 7

Illustration of the definition of complexity of purification. We purify the reduced density matrix \(\rho _{\mathcal {A}}\) in terms of ancilla degrees of freedom on a system \({\mathcal {A}}^c\) and optimize the preparation of the state of the combined system

Several alternative definitions for mixed state complexity have been proposed. For example, we can consider an approach based on the spectrum of eigenvalues \(p_i\) of the density matrix \({\hat{\rho }} = \sum _i p_i |\phi _i\rangle \langle \phi _i |\), see, e.g., [71]. In this approach, one breaks the process of constructing the state \({\hat{\rho }}\) into two separate parts. First, we define the spectrum complexity \({\mathcal {C}}_S\) of the state \({\hat{\rho }}\) as the minimal complexity of purification among all states with the same spectrum as \({\hat{\rho }}\). We will denote the state for which this minimum is achieved by \({\hat{\rho }}_{\text {spec}}\). Second, we turn the state \({\hat{\rho }}_{\text {spec}}\) into our state of interest by using unitary operations with minimal complexity. This is always possible since the two states have the same spectrum. We call this part the basis complexity \(\widetilde{{\mathcal {C}}}_B\). In any case, the complexity of purification \({\mathcal {C}}_P\) is always smaller than \({\mathcal {C}}_S+\widetilde{{\mathcal {C}}}_B\), because reaching the mixed state via \({\hat{\rho }}_{\text {spec}}\) is one possible circuit. The spectrum approach to mixed state complexity is illustrated in Fig. 8.

Fig. 8

Illustration of spectrum and basis complexity for mixed states

Another approach to mixed state complexity is the ensemble complexity, see, e.g., [71]. Here, as before, we decompose the mixed state \({\hat{\rho }}\) as an ensemble of pure states \(\rho = \sum _i p_i |\psi _i\rangle \langle \psi _i|\) and define the ensemble complexity as the weighted average over the complexities of the pure states in this ensemble, minimized over all possible ensembles, i.e., \({\mathcal {C}}_E(\rho ) = \min _{\text {ensemble}} \sum _i p_i {\mathcal {C}}(|\psi _i\rangle )\).

Yet another approach to mixed state complexity is based on using an information metric adapted to trajectories between mixed states directly, without purifying them first. For example, [72, 73] considered the Bures metric or Fisher-Rao information metric.

A more detailed discussion of mixed state circuits and complexity can be found in, e.g., [71,72,73,74,75,76]. However, as we said before, here we will focus on the complexity of purification.

As before, when restricting to Gaussian states we are able to make considerable progress in studying the complexity [74] (see also [77]). Let us start again with the example of a simple harmonic oscillator and consider the most general mixed Gaussian state with real parametersFootnote 20

$$\begin{aligned} \rho (x,x') \equiv \langle x |{\hat{\rho }}| x' \rangle \propto e^{-\frac{1}{2} (a x^2+a x'^2 -2 bx x')} \end{aligned}$$
(92)

where the density matrix is Hermitian \(\rho (x,x') = \rho ^*(x',x)\) as it should be, and a and b are real parameters satisfying \(a>b\) and \(b\ge 0\), such that the density matrix is normalizable and positive semidefinite. The normalization constant is fixed by requiring \(\text {Tr}(\rho ) = \int \rho (x,x)=1\). The most general purification with two degrees of freedom and real parameters reads

$$\begin{aligned} \psi _{12}(x,y) \equiv \langle x,y |\psi \rangle \propto e^{-\frac{1}{2}(\omega _1 x^2+\omega _2 y^2 +2 \omega _3 xy)}, \end{aligned}$$
(93)

where, in order for (93) to indeed be a purification of the state (92), it should satisfy

$$\begin{aligned} \int dy \, \psi _{12} (x,y) \psi ^*_{12} (x',y) = \rho (x,x'). \end{aligned}$$
(94)

Explicitly this yields

$$\begin{aligned} \omega _1 = a+b, \qquad \omega _2 = \frac{\omega _3^2}{2b}, \end{aligned}$$
(95)

where \(\omega _3\) remains a free parameter. We can easily diagonalize the wavefunction (93) and bring it to the form

$$\begin{aligned} \psi _{12} (x_+,x_-) \propto \, e^{-\frac{1}{2}(\omega _+ x_+^2 + \omega _- x_-^2)} \end{aligned}$$
(96)

where \(\omega _{\pm }\) are the eigenvalues of the matrix \( \left( \begin{array}{cc} a+b &{} \omega _3 \\ \omega _3 &{} \frac{\omega _3^2}{2b} \\ \end{array} \right) \). In this form, the two oscillators decouple and we can use Eq. (68) to evaluate the complexity. We focus on the \({\mathcal {C}}_1\) complexity since it will be most closely related to holography as we will see later on. We obtain the upper bound

$$\begin{aligned} {\mathcal {C}}_1^{\text {diag},UB} = \min _{\omega _3} \left[ \frac{1}{2} \left| \log \frac{\omega _+}{\mu }\right| +\frac{1}{2} \left| \log \frac{\omega _-}{\mu }\right| \right] \end{aligned}$$
(97)

where \(\mu \) is the reference state scale and the final answer is obtained by minimizing over the purification free parameter \(\omega _3\). The diag superscript indicates that we evaluate the \({\mathcal {C}}_1\) complexity in the diagonal basis, whose generators are defined with respect to the coordinates \(x_{\pm }\) according to the prescription described in Eq. (45). It is also possible to explore the complexity in the physical basis which distinguishes naturally the physical and ancillary degrees of freedom [74] but we will not pursue this possibility here.Footnote 21

In the above example, we purified a mixed state of a single harmonic oscillator using one additional harmonic oscillator. It is always the case that doubling the number of degrees of freedom in the system is enough to purify it.Footnote 22 However, one might wonder if purifications with more degrees of freedom are more efficient from the complexity point of view. Testing the above with purifications of a single oscillator using two ancillary oscillators, one concludes that at least for such small systems optimal purifications are essential purifications – which use the smallest number of degrees of freedom necessary for the purification.

We can use the above results to answer the following question: is the thermofield double state of two harmonic oscillators of frequency \(\omega \) at \(t=0\) (cf. Eq. (83))

$$\begin{aligned} |TFD\rangle _{12} = {\mathcal {N}}_{TFD} \sum _{n=0}^\infty e^{- \beta \omega n/2} |n\rangle _1 |n\rangle _2 \end{aligned}$$
(98)

the optimal purification of the thermal state

$$\begin{aligned} {\hat{\rho }}_{th} = {\mathcal {N}}_{th} \sum _{n=0}^\infty e^{-\beta \omega n} |n\rangle \langle n|, \end{aligned}$$
(99)

where \(|n\rangle \) are the energy eigenstates of our oscillator and \(\beta \) is the inverse temperature? Using Mehler’s formula for the summation over Hermite polynomials we can show that the thermal state is Gaussian of the form (92) with the following parameters

$$\begin{aligned} a=\omega \coth (\beta \omega ), \quad b=\frac{\omega }{\sinh (\beta \omega )}, \end{aligned}$$
(100)

while the thermofield double state is also Gaussian of the form (93) with parameters

$$\begin{aligned} \omega _1 =\omega _2 = \omega \coth \left( \frac{\beta \omega }{2}\right) , \quad \omega _3=-\frac{\omega }{\sinh \left( \frac{\beta \omega }{2}\right) }. \end{aligned}$$
(101)
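
As a quick consistency check, one can verify numerically that the TFD parameters (101) satisfy the purification conditions (95) with the thermal-state parameters (100); the snippet below uses arbitrary sample values of \(\beta \) and \(\omega \).

```python
# The TFD parameters (101) satisfy the purification conditions (95) with the thermal a, b of (100).
import numpy as np

beta, omega = 1.3, 2.1
a = omega/np.tanh(beta*omega)             # Eq. (100)
b = omega/np.sinh(beta*omega)

w1 = w2 = omega/np.tanh(beta*omega/2)     # Eq. (101)
w3 = -omega/np.sinh(beta*omega/2)

print(np.isclose(w1, a + b), np.isclose(w2, w3**2/(2*b)))   # -> True True
```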

Minimizing over all possible purifications of the thermal state encompasses a larger family of purifications than just the TFD state. Performing the minimization yields the following complexity of purification

$$\begin{aligned} \begin{aligned}&{\mathcal {C}}_1^{UB,\text {diag}}({\hat{\rho }}_{th}) = \\&{\left\{ \begin{array}{ll} \frac{1}{2}\log \frac{\mu }{\omega } + \frac{1}{2} \log \left( \frac{\mu \coth \left( \frac{\beta \omega }{2}\right) -\omega }{\mu -\omega \coth \left( \frac{\beta \omega }{2}\right) }\right) &{} \beta \omega \coth \left( \frac{\beta \omega }{4}\right) \le \beta \mu \\ \log \coth \frac{\beta \omega }{4} &{} {\begin{array}{c} \beta \omega \tanh \left( \frac{\beta \omega }{4}\right) \le \\ \beta \mu \le \beta \omega \coth \left( \frac{\beta \omega }{4}\right) \end{array}} \\ \frac{1}{2}\log \frac{\omega }{\mu } + \frac{1}{2} \log \left( \frac{\omega \coth \left( \frac{\beta \omega }{2}\right) -\mu }{\omega -\mu \coth \left( \frac{\beta \omega }{2}\right) }\right) &{} \beta \mu \le \beta \omega \tanh \left( \frac{\beta \omega }{4}\right) \end{array}\right. }. \end{aligned} \end{aligned}$$
(102)

Comparing this to the complexity of the thermofield double, i.e., without the additional minimization over purifications, we obtain

$$\begin{aligned} \begin{aligned}&{\mathcal {C}}_1^{UB,\text {diag}}(|TFD\rangle _{12}) =\\&{\left\{ \begin{array}{ll} \log \frac{\mu }{\omega } &{} \beta \omega \coth \left( \frac{\beta \omega }{4}\right) \le \beta \mu \\ \log \coth \frac{\beta \omega }{4} &{} {\begin{array}{c} \beta \omega \tanh \left( \frac{\beta \omega }{4}\right) \le \\ \beta \mu \le \beta \omega \coth \left( \frac{\beta \omega }{4}\right) \end{array}} \\ \log \frac{\omega }{\mu } &{} \beta \mu \le \beta \omega \tanh \left( \frac{\beta \omega }{4}\right) \end{array}\right. }. \end{aligned} \end{aligned}$$
(103)

From the comparison of the two above results we see that the thermofield double state is the optimal purification of the thermal state only in the middle regime (which may be quite narrow), see Fig. 9.
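
The minimization over \(\omega _3\) can also be carried out numerically. The sketch below (sample values with \(\beta \mu =7\), matching Fig. 9) reproduces the three regimes of Eq. (102) and compares them with the TFD value (103).

```python
# Numerical minimization over omega_3 in Eq. (97) versus the closed forms (102) and (103).
import numpy as np
from scipy.optimize import minimize_scalar

def C1_of_w3(w3, a, b, mu):
    mat = np.array([[a + b, w3], [w3, w3**2/(2*b)]])      # purification frequencies, cf. (95)-(96)
    return 0.5*np.sum(np.abs(np.log(np.linalg.eigvalsh(mat)/mu)))

def C1_purification(beta, omega, mu):
    a, b = omega/np.tanh(beta*omega), omega/np.sinh(beta*omega)      # Eq. (100)
    res = minimize_scalar(lambda s: C1_of_w3(np.exp(s), a, b, mu),   # minimize over log(omega_3)
                          bounds=(-20.0, 10.0), method='bounded')
    return res.fun

def C1_eq102(beta, omega, mu):
    x, c2 = beta*omega/4, 1/np.tanh(beta*omega/2)
    if mu >= omega/np.tanh(x):
        return 0.5*np.log(mu/omega) + 0.5*np.log((mu*c2 - omega)/(mu - omega*c2))
    if mu <= omega*np.tanh(x):
        return 0.5*np.log(omega/mu) + 0.5*np.log((omega*c2 - mu)/(omega - mu*c2))
    return np.log(1/np.tanh(x))

def C1_TFD_eq103(beta, omega, mu):
    x = beta*omega/4
    if mu >= omega/np.tanh(x):
        return np.log(mu/omega)
    if mu <= omega*np.tanh(x):
        return np.log(omega/mu)
    return np.log(1/np.tanh(x))

for beta, omega, mu in [(1.0, 0.5, 7.0), (1.0, 7.0, 7.0), (1.0, 20.0, 7.0)]:
    print(round(C1_purification(beta, omega, mu), 4),
          round(C1_eq102(beta, omega, mu), 4),
          round(C1_TFD_eq103(beta, omega, mu), 4))
```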

Fig. 9

Plot illustrating the three regimes of the complexity of the thermofield double state and the complexity of purification of the thermal state. The gray dashed curve indicates our choice of the reference state scale (in this case we have chosen \(\beta \mu =7\)); whether it lies above or below the red and blue curves (the functions \(\beta \omega \tanh (\beta \omega /4)\) and \(\beta \omega \coth (\beta \omega /4)\)) determines the relevant complexity regime in Eqs. (102) and (103). Only in the small range of frequencies indicated in purple is the thermofield double the optimal purification of the thermal state

6 Complexity in QFT

After having extensively studied the complexity of a small number of harmonic oscillators, we are now ready to use those results to study the complexity of states in Quantum Field Theory (QFT) – the framework for studying many-body physics with a variable particle number. We will consider the complexity of the vacuum state, the thermofield double state and several interesting examples of mixed states of free (or nearly free) bosonic field theories. We will see that, just like many other quantities in QFT, the complexity diverges due to contributions from short-distance correlations in the system. We will explain how to regulate those divergences. We will conclude this section with a discussion of complexity in strongly interacting conformal field theories.

6.1 Free scalar QFT

Here we describe the pioneering works [18, 19] which were the first to study complexity in a simple QFT. These works studied the complexity of the vacuum state of a free bosonic QFT in d spacetime dimensions described by the following Hamiltonian

$$\begin{aligned} H = \frac{1}{2} \int d^{d-1}x \left[ \pi (x)^2+\mathbf {\nabla }\phi (x)^2+m^2 \phi (x)^2\right] . \end{aligned}$$
(104)

Naively, we expect the vacuum state to be simple and therefore to have low complexity. However, the complexity is defined with respect to a reference state. While there is no canonical choice of a state in a Hilbert space, we will argue below that there is a natural choice of the reference state in the context of studying quantum computational complexity, which is a completely unentangled state. With this choice, it turns out that the complexity of the vacuum state in QFT is highly divergent. This is because the vacuum state has correlations down to arbitrarily short length scales which are absent in the reference state. For readers familiar with the notion of entanglement entropy this should not come as a surprise since a similar divergence appears there. One way to regularize the divergences is by placing the theory on a spatial lattice. Alternatively, we could use a sharp momentum cutoff. Both the entanglement entropy and complexity diverge when the lattice spacing is sent to zero.

Fig. 10

Illustration of a system of coupled harmonic oscillators obtained by discretizing the scalar field theory on a lattice with spacing \(\delta \). Red springs represent contributions from the mass m of the scalar field while blue springs introduce couplings between the different oscillators originating from the derivative term in the Hamiltonian

As in [18], we will regularize the divergences by placing the theory on a \(d-1\) dimensional periodic lattice with lattice spacing \(\delta \) and length L in all directions, see Fig. 10. In this way, the theory becomes that of \(N^{d-1} = (L/\delta )^{d-1}\) coupled harmonic oscillators and the complexity is a natural extension of the results of Sect. 5.2. We will label the different lattice sites in terms of a \(d-1\) dimensional vector \(\mathbf {a}\) where each component \(0\le a_i \le N-1\) is an integer. The discretized version of (104) reads

$$\begin{aligned} H = \frac{1}{2}\sum _{\mathbf {a}} \left[ \delta \, {\tilde{\pi }}_{\mathbf {a}}^2 + m^2 \delta ^{-1} {\tilde{\phi }}_{\mathbf {a}}^2 + \delta ^{-3} \sum _j ({\tilde{\phi }}_{\mathbf {a}+\mathbf {e}_j}-{\tilde{\phi }}_{\mathbf {a}})^2 \right] \end{aligned}$$
(105)

where we have defined \({\tilde{\phi }}_{\mathbf {a}} = \delta ^{d/2}\,\phi (\delta \cdot \mathbf {a})\), \({\tilde{\pi }}_{\mathbf {a}} = \delta ^{d/2-1}\pi (\delta \cdot \mathbf {a})\) and \(\mathbf {e}_j\) denotes the unit vector in the j-th direction. Periodicity implies \({\tilde{\phi }}_{\mathbf {a}+N \mathbf {e}_j}={\tilde{\phi }}_{\mathbf {a}}\) and \({\tilde{\pi }}_{\mathbf {a}+N \mathbf {e}_j}={\tilde{\pi }}_{\mathbf {a}}\) for all \(\mathbf {a}\) and j. The above coordinate and momentum operators satisfy the commutation relations \([{\tilde{\phi }}_{\mathbf {a}},\tilde{\pi }_{\mathbf {b}}] = i \delta _{\mathbf {a} \mathbf {b}}\). To decouple the different oscillators in (105) we employ a discrete Fourier transform

$$\begin{aligned} \phi _{\mathbf {n}} = N^{-\frac{d-1}{2}} \sum _{\mathbf {a}} e^{-\frac{2\pi i \mathbf {n} \mathbf {a}}{N}} {\tilde{\phi }}_{\mathbf {a}},\quad \pi _{\mathbf {n}} = N^{-\frac{d-1}{2}} \sum _{\mathbf {a}} e^{\frac{2\pi i \mathbf {n} \mathbf {a}}{N}} {\tilde{\pi }}_{\mathbf {a}}, \end{aligned}$$
(106)

where \(\mathbf {n}\) is again a \(d-1\) dimensional vector of integers running between 0 and \(N-1\). The position and momentum operators in momentum space also satisfy the commutation relations \([\phi _{\mathbf {n}},\pi _{\mathbf {k}}] = i \delta _{\mathbf {n} \mathbf {k}}\). Using the above transformations, we obtain the diagonalized Hamiltonian in momentum space

$$\begin{aligned} H=\frac{1}{2M}\sum _{k_i=0}^{N-1}\left[ |\pi _{\mathbf {k}}|^2 +M^2 \omega _{k}^2 |\phi _{\mathbf {k}}|^2\right] \end{aligned}$$
(107)

with

$$\begin{aligned} \omega _k^2 = m^2+ \frac{4}{\delta ^2}\sum _{i=1}^{d-1} \sin ^2\left( \frac{\pi k_i}{N}\right) , \qquad M=\frac{1}{\delta }. \end{aligned}$$
(108)

In terms of the momentum space coordinates, the ground-state wave-function reads

$$\begin{aligned} \langle \phi _{\mathbf {k}} | 0 \rangle = \mathcal {N_{\text {vac}}} \exp {\left[ -\sum _{\mathbf {k}} \frac{M \omega _k |\phi _{\mathbf {k}}|^2}{2}\right] } \end{aligned}$$
(109)

where the normalization constant is given by \({\mathcal {N}}_{\text {vac}} = \prod _{\mathbf {k}} \left( \frac{M\omega _{\mathbf {k}}}{\pi }\right) ^{1/4}\). This wave-function is Gaussian and so we can use our techniques from Sect.  5 to evaluate its complexity.

As mentioned earlier, the vacuum state is in fact very complex – its complexity diverges as the lattice spacing is sent to zero. The underlying reason for this divergence is the derivative term in (104). This term is the one responsible for entangling the different lattice sites. Without it, the Hamiltonian would factorize in position space and the quantum states of the different lattice sites would be uncorrelated.

When we pick a reference state, we want it to satisfy quite the opposite property. We would like the different oscillators to be completely unentangled. Therefore, a natural choice for the reference state is the ground state of an ultra-local Hamiltonian

$$\begin{aligned} H = \frac{1}{2} \int d^{d-1}x \left[ \pi (x)^2+\mu ^2 \phi (x)^2\right] , \end{aligned}$$
(110)

where comparing to Eq. (104) we notice that the derivative term has been turned off. The discretized Hamiltonian in momentum space takes the form (107) with \(\omega _{\mathbf {k}}=\mu \) and the relevant wavefunction for the reference state reads

$$\begin{aligned} \langle \phi _{\mathbf {k}} | \mu \rangle = \mathcal {N_{\mu }} \exp {\left[ -\sum _{\mathbf {k}} \frac{M \mu \, |\phi _{\mathbf {k}}|^2}{2}\right] }, \end{aligned}$$
(111)

where again \(\mathcal {N_{\mu }}\) is a normalization constant. Notice that this state is again Gaussian and has a fixed frequency for all momenta.

As in the last section, we will focus on trajectories moving entirely in the space of Gaussian states. The motion between Gaussian states can be studied in terms of symplectic transformations of the corresponding covariance matrices induced by quadratic gates in position and momentum variables. The optimal trajectory takes the form (52), where the relative covariance matrix (62) is replaced with

$$\begin{aligned} \Delta _{\mathbf {k}} = \begin{pmatrix} \frac{\mu }{\omega _{k}} &{}~ 0 \\ 0 &{}~~ \frac{\omega _{k}}{\mu } \end{pmatrix} \end{aligned}$$
(112)

for each momentum mode. The upper bound \({\mathcal {C}}_1^{UB}\) and the complexity \({\mathcal {C}}_2\) are given by Eq. (68) summed over the different momentum modes

$$\begin{aligned} {\mathcal {C}}_1^{UB} = \frac{1}{2} \sum _{\mathbf {k}} \left| \log \frac{\omega _k}{\mu }\right| ; \quad {\mathcal {C}}_2 =\frac{1}{2} \sqrt{\sum _{\mathbf {k}} \left( \log \frac{\omega _k}{\mu }\right) ^2}. \end{aligned}$$
(113)

To improve our intuitive understanding of the optimal circuit constructing the ground state, let us write it explicitly in terms of the relevant unitary transformation in Eq. (41) (see also Eqs. (42) and (52)):

$$\begin{aligned} |\psi (\sigma )\rangle = \exp \left[ -\frac{i}{4} \sigma \sum _{\mathbf {k}} \log \left( \frac{\mu }{\omega _k}\right) (\phi _{\mathbf {k}} \pi _{\mathbf {k}} +\pi _{\mathbf {k}} \phi _{\mathbf {k}}) \right] |\mu \rangle , \end{aligned}$$
(114)

with the path parameter \(\sigma \in [0,1]\) as before. In this way, we see that the optimal circuit consists of “squeezing” the wavefunction for each momentum mode separately. Of course, since we have discretized our theory on the lattice, the state obtained at \(\sigma =1\) is not exactly the ground state of the original continuum Hamiltonian (104) but it approximates it on distances larger than the lattice spacing.

Evaluating the complexity (113) at leading order in the small lattice spacing yields

$$\begin{aligned} \begin{aligned}&{\mathcal {C}}_1^{UB} \simeq \frac{\text {Vol}}{2\delta ^{d-1}} |\log \mu \delta \,|+\dots \\&{\mathcal {C}}_2 \simeq \frac{1}{2} \left( \frac{\text {Vol}}{\delta ^{d-1}}\right) ^{1/2} |\log \mu \delta \,| +\cdots \end{aligned} \end{aligned}$$
(115)

where \(\text {Vol}=L^{d-1}\) is the spatial volume of the system. As we will see later, the behavior of \({\mathcal {C}}_1^{UB}\) matches the results obtained from holography much better, which hints that this cost function is better suited to be identified with the dual of complexity in holography. Note that the free field theory and the strongly coupled holographic theories are very different from each other. However, just as for the entanglement entropy, the structure of divergences is expected to follow a similar pattern. For the above reason, in what follows we will mostly focus on the \({\mathcal {C}}_1^{UB}\) complexity.

Our results for the complexity are expressed in terms of \(\mu \) – the characteristic scale of the reference state. How are we to think about this scale? We can obtain a hint from the divergence structure in Eq. (115). Divergent QFT quantities do not usually mix logarithmic and polynomial divergences. The appearance of such a mixed divergence in the complexity can, however, be remedied by choosing the scale of the reference state to depend on the cutoff, i.e., \(\mu \delta = e^{-{\tilde{\mu }}}\), where \({\tilde{\mu }}\) is an order one constant. In this case

$$\begin{aligned} {\mathcal {C}}_1^{UB} \simeq \frac{\text {Vol}}{2\delta ^{d-1}} | \,{\tilde{\mu }}\,|+\dots \end{aligned}$$
(116)

This choice is also natural from a physical point of view – since we are introducing correlations at all scales down to the lattice scale \(\delta \) it is natural to start with a state whose typical frequency is also of the order of the (inverse) lattice spacing. The result (116) has a volume law divergence. This can be contrasted with the typical area law divergence of the entanglement entropy.Footnote 23 We will later see that this behavior is reproduced in holography. The complexity of the ground state of fermionic systems has been treated using similar methods and there as well one obtains a volume law [20, 21]. The above result is an upper bound on the complexity; however, a simple counting argument shows that the exactly optimized complexity \({\mathcal {C}}_1\) has the same scaling with the cutoff and the volume of the system.
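
A small lattice computation makes this explicit. The sketch below (d = 2, i.e., one spatial dimension, with illustrative parameters and the choice \(\mu \delta =e^{-{\tilde{\mu }}}\)) evaluates Eq. (113) and compares \({\mathcal {C}}_1^{UB}\) with the leading term of Eq. (116); both grow with the number of lattice sites, illustrating the volume-law divergence.

```python
# Lattice evaluation of Eq. (113) in d = 2 and comparison with the leading term of Eq. (116).
import numpy as np

L, m, mu_tilde = 32.0, 0.2, 1.0                  # sample values; d = 2 (one spatial dimension)

for N in (64, 256, 1024):
    delta = L/N
    mu = np.exp(-mu_tilde)/delta                 # reference scale tied to the cutoff, Eq. (116)
    k = np.arange(N)
    omega_k = np.sqrt(m**2 + (4/delta**2)*np.sin(np.pi*k/N)**2)   # lattice dispersion, Eq. (108)
    C1 = 0.5*np.sum(np.abs(np.log(omega_k/mu)))                   # Eq. (113)
    C2 = 0.5*np.sqrt(np.sum(np.log(omega_k/mu)**2))
    leading = L/(2*delta)*abs(mu_tilde)                           # leading term of Eq. (116)
    print(N, round(C1, 1), round(leading, 1), round(C2, 1))
```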

Finally, let us comment on the regularization scheme. Above, we regularized the complexity by placing our theory on a periodic lattice with lattice spacing \(\delta \), as in [18]. A different regularization scheme was used in [19]. In this case, we work with a continuous momentum variable

$$\begin{aligned} \mathbf {k}_{c} = \frac{2\pi \mathbf {k}}{L} \end{aligned}$$
(117)

and replace all the above sums \(\sum _{\mathbf {k}}\) by integrals \(\text {Vol} \int \frac{d^{d-1}k_c}{(2\pi )^{d-1}}\). The momentum integrals are regulated by a sharp cutoff, i.e., we cut them off at \(|\mathbf {k}_c|=\Lambda \). The results in this regularization scheme can be obtained from the former lattice regularization by initially placing the momentum cutoff significantly below the lattice scale, \(\Lambda \ll \frac{2\pi }{\delta }\), and later sending the lattice spacing \(\delta \rightarrow 0\) such that the result remains finite and regulated by the new cutoff \(\Lambda \). In that case, we may approximate the frequency in Eq. (108) by \(\omega _{k_c} = \sqrt{k_c^2+m^2}\). As before, the state \(|\psi (\sigma =1)\rangle \) constructed by the continuous version of the circuit (114)

$$\begin{aligned} |\psi (\sigma )\rangle&= \exp \left[ -\frac{i}{4} \sigma \int _{|k_c|<\Lambda } \frac{d^{d-1} k_c}{(2\pi )^{d-1}} \log \left( \frac{\mu }{\omega _{k_c}}\right) K(\mathbf {k}_c) \right] |\mu \rangle , \nonumber \\ K(\mathbf {k}_c)&\equiv \phi (\mathbf {k}_c) \pi (\mathbf {k}_c) +\pi (\mathbf {k}_c) \phi (\mathbf {k}_c), \qquad \sigma \in [0,1] \end{aligned}$$
(118)

is not exactly the ground state of the Hamiltonian (104) but approximates it for momenta below the cutoff. With this regularization scheme, the complexity (113) reads

$$\begin{aligned} {\mathcal {C}}_1^{UB}= & {} \frac{1}{2} \text {Vol} \int \frac{d^{d-1}k_c}{(2\pi )^{d-1}} \left| \log \frac{\omega _k}{\mu }\right| , \nonumber \\ {\mathcal {C}}_2= & {} \frac{1}{2} \sqrt{ \text {Vol}\int \frac{d^{d-1}k_c}{(2\pi )^{d-1}} \left( \log \frac{\omega _k}{\mu }\right) ^2}, \end{aligned}$$
(119)

and the leading divergences are as in (115) with the replacement \(\delta \rightarrow 1/\Lambda \).

6.2 Weakly interacting QFT

It is clearly of great interest to understand how the analysis of the previous section can be extended to the case of interacting field theories, and to study the dependence of the complexity on the couplings. Unfortunately this is a difficult task, and at the time of writing this review only partial results are available.

The authors of [22] generalized the previous study by considering the complexity of nearly Gaussian states, building on the idea of quantum circuit perturbation theory [78,79,80]. They studied the complexity of the ground state of a \(\lambda \phi ^4\) theory described by the following Hamiltonian

$$\begin{aligned} H \!=\! \frac{1}{2}\int d^{d-1} x\left[ \pi (x)^2 \!+\!(\nabla \phi (x))^2 +m^2 \phi ^2 + \frac{\lambda }{12} \phi (x)^4\right] \end{aligned}$$
(120)

with the coefficient \(\lambda \) treated perturbatively. The authors used perturbation theory in quantum mechanics to express the ground state of this theory as an exponentiated polynomial of order four (rather than two in the Gaussian case). They then enlarged the set of gates, including gates of up to order six in positions and momenta, in order to manipulate these states. This led to a well-defined notion of Nielsen-type complexity. However, they found that within this approach the reference state could not be taken to be Gaussian but had to contain some non-quadratic terms. As a consequence, the cost functional also had to be made dependent on the coupling in order to have a smooth zero-coupling limit. As an aside, the authors proposed an alternative mean-field approximation where one simply includes perturbative corrections to the mass in the Gaussian wavefunction. In this approximation the authors were able to show that at the Wilson–Fisher fixed point around four dimensions the interaction slightly increases the complexity compared to the Gaussian fixed point.

6.3 Complexity of the thermofield double state

Another interesting example of a Gaussian state in free bosonic QFT is the thermofield double state [66]. For the case of a single harmonic oscillator this state was studied in Sect. 5.4. In the full bosonic QFT (107), the TFD is simply the product of the different TFD states for each of the momentum modes, i.e.,

$$\begin{aligned} |TFD(t)\rangle = \bigotimes _{ \mathbf {k}} |TFD(t,\omega _k)\rangle , \end{aligned}$$
(121)

where we defined the TFD for each mode in (83). We will assume that the optimal trajectory does not mix the different momentum modes. This assumption is natural because if we introduce entanglement between the different modes, this entanglement will have to be removed in the final state and that will increase the length of the circuit. However, recall from Sect. 5.3 that coherent states behave counterintuitively in this regard.

Under the no-mode-mixing assumption, the complexity is simply the sum of complexities for each of the momentum modes. We will be particularly interested in the complexity of formation – the difference in complexities between the TFD state at \(t=0\) and two copies of the vacuum state – cf. Eq.  (91), which is given by

$$\begin{aligned} \Delta {\mathcal {C}}(|TFD(t=0)\rangle ) = \sum _{\mathbf {k}} \Delta {\mathcal {C}}(|TFD(t=0,\omega _k)\rangle ), \end{aligned}$$
(122)

where the expressions for the complexity of formation of the individual modes can be found in Eq.  (91). For reasons that we explain below, here we will focus on the \({\mathcal {C}}_1^{(LR),UB}\) complexity

$$\begin{aligned}&\Delta {\mathcal {C}}_1^{(LR),UB} = \text {Vol} \int _{k\le \Lambda } \frac{d^{d-1}k}{(2\pi )^{d-1}} 2|\alpha _k|,\nonumber \\&\quad \alpha _k = \frac{1}{2}\log \left[ \frac{1+e^{-\beta \omega _k/2}}{1-e^{-\beta \omega _k/2}}\right] , \quad \omega _k = \sqrt{k^2+m^2}\,. \end{aligned}$$
(123)

This integral is finite due to the exponential suppression coming from the \(\alpha _k\) at large frequency. Therefore we may remove the cutoff \(\Lambda \) and simply integrate all the way to infinity. The result obtained by integrating this expression in the limit of vanishing mass is simply proportional to the thermal entropy of the system

$$\begin{aligned} S_{\text {th}} = \text {Vol} \int \frac{d^{d-1}k}{(2\pi )^{d-1}} \left[ \frac{\beta \omega _k}{e^{\beta \omega _k}-1}-\log (1-e^{-\beta \omega _k})\right] \end{aligned}$$
(124)

with proportionality factor

$$\begin{aligned} \left. \frac{\Delta {\mathcal {C}}_1^{(LR),UB}}{S_{\text {th}}}\right| _{\beta m=0} = \frac{2^d-1}{d}. \end{aligned}$$
(125)
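
The massless ratio (125) can be checked by numerically integrating Eqs. (123) and (124); the sketch below does so for a few values of d (the temperature drops out of the ratio, so \(\beta =1\) is used, and the common volume and angular factors cancel).

```python
# Ratio of the complexity of formation (123) to the thermal entropy (124) for a massless scalar.
import numpy as np
from scipy.integrate import quad

def ratio(d, beta=1.0):
    kmax = 200.0/beta     # integrands decay exponentially well before this
    dC = quad(lambda k: k**(d-2)*np.log(1.0/np.tanh(beta*k/4)), 0, kmax, limit=200)[0]
    S  = quad(lambda k: k**(d-2)*(beta*k/np.expm1(beta*k) - np.log(-np.expm1(-beta*k))),
              0, kmax, limit=200)[0]
    return dC/S

for d in (2, 3, 4):
    print(d, round(ratio(d), 4), round((2**d - 1)/d, 4))   # compare with Eq. (125)
```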

The proportionality of the complexity of formation and the thermal entropy is a property of complexity which is reproduced in holographic calculations [70]. For finite mass the results are shown in Fig. 11. The complexity of formation in the diagonal basis \({\mathcal {C}}_1^{(\pm ),UB}\) and the \({\mathcal {C}}_2\) complexity vanish for temperatures much lower than the cutoff scale \(T\ll \Lambda \), which is the physical regime. Therefore, we regard them as less useful measures of complexity of the state.

Fig. 11

Complexity of formation as a function of the mass in various dimensions from \(d=2\) (bottom curve) to \(d=6\) (top curve). Figure taken from [66]

While we did not write explicit expressions for the time dependence of the complexity of the TFD state at \(t\ne 0\), such expressions follow directly from its covariance matrix in Eq. (85) and the time dependence can then be evaluated by summing the complexity of the different momentum modes. A plot of the time dependence of the complexity of the TFD state can be found in Fig. 12. In this figure, taken from [66], the complexity evolves in time (either increases or decreases) and saturates after a time of the order of the inverse temperature. This is natural since each mode oscillates and the oscillations are aligned at \(t=0\), but the different modes become dephased at later times and so the contributions from the different normal modes average out. Because of the exponential suppression of the oscillations at large \(\beta \omega \), mentioned at the end of Sect. 5.4, modes with frequency higher than \(1/\beta \) hardly contribute to the complexity and so the saturation is dominated by modes with \(\omega \lesssim 1/\beta \) and happens at times \(t\sim \beta \).

We see that in the free bosonic QFT the complexity of the TFD saturates rather fast, and this is because of the free nature of the system. In holography, which describes chaotic systems, we will see a very different behavior. This highlights a general lesson about which properties are expected to be similar in free QFT and holography and which are not. In general, static quantities will have common properties while dynamical quantities will differ.

Fig. 12

Time dependence of the complexity of the thermofield double state, with \({\tilde{\gamma }} \equiv (\beta \mu )^{-1}\). Figure taken from [66]

6.4 Mixed state complexity in QFT

In Sect.  5.5 we discussed the complexity of mixed states via the complexity of purification. These results can be used to evaluate the complexity of various interesting mixed states of free quantum field theory.

For example, let us start by considering the complexity of thermal states. The thermal state in free QFT can be decomposed as follows

$$\begin{aligned} {\hat{\rho }} (\beta ) = \bigotimes _{k} {\hat{\rho }}_{th}(\beta ,\omega _k), \quad \omega _k = \sqrt{k^2+m^2}, \end{aligned}$$
(126)

where \({\hat{\rho }}_{th}(\beta ,\omega _k)\) is the thermal state of a single oscillator defined in Eq. (99). Hence the complexity is simplyFootnote 24

$$\begin{aligned} {\mathcal {C}}_{1,th}^{UB,\text {diag}}(\beta ) = \sum _k {\mathcal {C}}_{1,th}^{UB,\text {diag}}(\beta ,\omega _k), \end{aligned}$$
(127)

where the complexity for each momentum mode can be found in Eq. (102). Note that the divergences in complexity come from integrating the \(\left| \log \frac{\mu }{\omega _k}\right| \) contributions in Eqs. (102) and (103). Hence, we see that the complexity of the thermofield double state is twice as divergent as that of the thermal state. This reflects a general property: the purification which preserves the most symmetry between the ancillary degrees of freedom and the physical ones is not always the most efficient one. In the case of the thermofield double state, for example, we work very hard to establish short-distance correlations among the ancillary degrees of freedom themselves, which are then removed anyway upon tracing out this part of the system, and so that effort is wasted.

When a mixed state \(\rho _A\) is obtained from an original pure state \(|\psi _{AB}\rangle \), it is often the case that the original state is not the optimal purification. This is because in \(|\psi _{AB}\rangle \) we work too hard to establish all the correlations among the B degrees of freedom and to mimic exactly those between A and B. To estimate how different the correlations in the optimal purification are from those in the original state, we define the mutual complexityFootnote 25

$$\begin{aligned} \Delta {\mathcal {C}}_{\text {mutual}} = {\mathcal {C}}(\rho _A)+{\mathcal {C}}(\rho _B)-{\mathcal {C}}(|\psi _{AB}\rangle )\,, \end{aligned}$$
(128)

see Fig. 13. For example, when considering the process of forming the thermal state by tracing out half of the thermofield double state we obtain

$$\begin{aligned} \Delta {\mathcal {C}}_{\text {mutual}} = 2{\mathcal {C}}(\rho _{th})-{\mathcal {C}}(|TFD\rangle )\,. \end{aligned}$$
(129)

In particular, in the quantum field theory of a free scalar field, this quantity turns out to be finite (i.e., all the UV divergences cancel) and is proportional to the thermal entropy for the case of a massless scalar (the conformal limit). The mutual complexity in the diagonal basis in the various QFT examples studied in [74] was found to be subadditive, i.e., it satisfies \(\Delta {\mathcal {C}}^{\text {diag}}_{\text {mutual}}>0\).
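
A mode-by-mode numerical estimate illustrates these statements. The sketch below (d = 2, massless, with \(\beta \mu =7\) as sample values) integrates the per-mode mutual complexity built from the closed forms (102) and (103) and shows that the result is UV finite and positive; this is an illustrative estimate under the mode-by-mode assumption, not a substitute for the full analysis of [74].

```python
# Mutual complexity (129) per unit volume for a massless free scalar in d = 2 (mode-by-mode).
import numpy as np
from scipy.integrate import quad

beta, mu = 1.0, 7.0

def C1_rho_th(omega):                      # Eq. (102), per mode
    x, c2 = beta*omega/4, 1/np.tanh(beta*omega/2)
    if mu >= omega/np.tanh(x):
        return 0.5*np.log(mu/omega) + 0.5*np.log((mu*c2 - omega)/(mu - omega*c2))
    if mu <= omega*np.tanh(x):
        return 0.5*np.log(omega/mu) + 0.5*np.log((omega*c2 - mu)/(omega - mu*c2))
    return np.log(1/np.tanh(x))

def C1_TFD(omega):                         # Eq. (103), per mode
    x = beta*omega/4
    if mu >= omega/np.tanh(x):
        return np.log(mu/omega)
    if mu <= omega*np.tanh(x):
        return np.log(omega/mu)
    return np.log(1/np.tanh(x))

integrand = lambda k: 2*C1_rho_th(k) - C1_TFD(k)              # per-mode mutual complexity
dC_mutual = quad(integrand, 1e-9, 50.0/beta, limit=400)[0]    # exponentially small for k >> 1/beta
dC_mutual *= 2/(2*np.pi)                                      # both signs of k, measure dk/(2*pi); per unit volume
print(dC_mutual > 0, dC_mutual)
```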

Fig. 13

Illustration of mutual complexity. We start with a pure state on a system AB, which is then split into two mixed states on the subsystems A and B. The sum of the complexities of purification of these mixed states using ancillary systems \(A^c\) and \(B^c\) is not necessarily equal to the complexity of the original pure state

Another interesting example of a mixed state of a free bosonic QFT is the reduced state on a subregion of the vacuum. As before, we focus on the vacuum state on the lattice for a free bosonic QFT; the authors of [74] considered a one-dimensional spatial lattice with N sites. The wavefunction of the vacuum state reads

$$\begin{aligned} \Psi _0(\phi _k) \propto \prod _{k=0\ldots N-1} e^{-\frac{1}{2}\omega _k |\phi _k|^2} \end{aligned}$$
(130)

where \(\omega _k\) and \(\phi _k\) were defined in Eqs. (106) and (108) and we substitute \(d=2\). Translating this expression back to the position basis using Eq. (106) we obtain

$$\begin{aligned} \Psi _0({\tilde{\phi }}_a) \propto \prod _{a,b=0\ldots N-1} e^{-\frac{1}{2}M_{ab} {\tilde{\phi }}_a {\tilde{\phi }}_b} \end{aligned}$$
(131)

where

$$\begin{aligned} M_{ab} = \frac{1}{N} \sum _{k=0\ldots N-1}\omega _k e^{-\frac{2\pi i k}{N}(a-b)}\,. \end{aligned}$$
(132)

To obtain the subregions we divide our lattice into two subsets \(A=\{x_0,\ldots x_j\}\) and \(B=\{x_{j+1},\ldots x_{N-1}\}\) and trace out the region B as follows

$$\begin{aligned} \rho _A(x_A,x'_A) = \int dx_B \Psi _0(x_A,x_B) \Psi _0^*(x'_A,x_B). \end{aligned}$$
(133)

Similarly to what we did earlier with the single harmonic oscillator, it is possible to minimize the complexity over the essential purifications of this mixed state. In fact, [74] used a simplifying assumption: they considered mode-by-mode purifications, introduced after bringing the density matrix to a diagonal form and then purifying each mixed mode separately. This is a subset of all possible purifications which provides a good approximation to the complexity of purification, based on tests with small systems (purifying two by four).

The authors performed this task numerically and found that the original vacuum state is not always the optimal purification. This is similar to what happened before with the TFD and thermal states.

Figure 14 presents the complexity as a function of the subregion size in the limit of small mass. The following expression provides a good fit

$$\begin{aligned} \begin{aligned} {\mathcal {C}}_1^{UB,\text {diag}} =&\frac{\ell }{2\delta } \left| \log \mu \delta \right| + \frac{1}{2}f_1(\mu L)\log \left( \frac{L}{\pi \delta }\sin \frac{\pi \ell }{L}\right) \\&~~~~+\frac{\ell }{L}f_2(\mu L)+f_3(\mu L). \end{aligned} \end{aligned}$$
(134)

Here, \(\mu \) is the scale of the reference state, L is the full system size, \(\ell \) is the subregion size, \(\delta \) is the cutoff and \(f_1,f_2,f_3\) are functions of the reference state scale. These functions could not be determined very accurately because the numerical study examined only very few values of \(\mu \). We see that the leading divergence is a volume law (it scales with the subregion size \(\ell \)) and depends on the cutoff in a similar way to the leading divergence of the full vacuum complexity (115). The subleading divergences are reminiscent of the entanglement entropy, as we will see in a moment. The mutual complexity \(\Delta {\mathcal {C}}_{\text {mutual}} = {\mathcal {C}}(\rho _A) +{\mathcal {C}}(\rho _B) - {\mathcal {C}} (|\psi _0\rangle )\) can also be evaluated and its dependence on the subregion size and cutoff can be fitted (see Fig. 15); in the limit of small mass one obtains

$$\begin{aligned} \Delta {\mathcal {C}}_{1,\text {mutual}}^{UB,\text {diag}} \approx f_1(\mu L) \left( \log \left( \frac{L}{\pi \delta }\sin \frac{\pi \ell }{L}\right) +f_4(\mu L)\right) . \end{aligned}$$
(135)

Here, \(f_4\) is yet another function of \(\mu L\). Some proposed fits for \(f_1\) and \(f_4\) can be found in equation (7.10) of [74]. The above formula is very similar to the entanglement entropy formula of Calabrese and Cardy [82, 83]. This hints at a deeper connection between the subleading divergences in complexity and the entanglement entropy in non-dynamical situations.

Fig. 14 Complexity of subregions of the vacuum state as a function of the interval size. This figure makes it apparent that the leading contribution to the complexity grows linearly with the subsystem size, which is the aforementioned volume law. Figure taken from [74]. Here the mass was fixed to be small \(mL=0.01\) in order to mimic the results of a conformal field theory

Fig. 15 Mutual complexity of subregions of the vacuum as a function of the subregion size. This typical \(\log (\sin (\#))\) behavior is reminiscent of the Calabrese and Cardy formula for the entanglement entropy. Figure taken from [74]. Here the mass was fixed to be small \(mL=0.01\) in order to mimic the results of a conformal field theory

6.5 Complexity in CFT

The approach of studying QFT state complexity restricted to Gaussian or nearly Gaussian states has clear limitations. Many interesting physical systems are strongly interacting. In particular, when making the connection via holography between quantum information and black holes, which is one of the prime motivations for studying QFT complexity, the relevant field theories are strongly interacting. These theories are however special in that they preserve a large spacetime symmetry group – the conformal symmetry. The abundance of symmetry is what helps make progress in this case. Therefore, in this section we will focus on the question of whether one can utilize the conformal symmetry to define a complexity of states within conformal field theory.

This exploration began with the work of [24] who considered the geometric approach to complexity within 2d CFTs. In particular the authors focused on circuits in a unitary representation of the Virasoro algebraFootnote 26

$$\begin{aligned}{}[L_m,L_n] = (m-n) L_{m+n}+\frac{c}{12}m (m^2-1)\delta _{n+m,0}. \end{aligned}$$
(136)

The CFT was taken to live on a circle with angular coordinate \(\theta \equiv \theta +2\pi \) and the corresponding stress tensor can be expressed as

$$\begin{aligned} T(\theta ) = \sum _{n\in {\mathbb {Z}}}{ \left( L_{n} -\frac{c}{24}\delta _{n,0}\right) e^{-i n\theta }}. \end{aligned}$$
(137)

The circuits are constructed from the symmetry generators,

$$\begin{aligned} U(\sigma ) =&\, \overleftarrow{{\mathcal {P}}} \exp \int _0^\sigma d\sigma ' \, Q(\sigma '), \nonumber \\ Q(\sigma ) =&\, \int _0^{2\pi } \frac{d\theta }{2\pi } \epsilon (\sigma ,\theta )T(\theta ) =\sum _{n\in {\mathbb {Z}}}{\epsilon _n(\sigma ) \left( L_{-n} -\frac{c}{24}\delta _{n,0}\right) }, \end{aligned}$$
(138)

where the Fourier modes

$$\begin{aligned} \epsilon _n(\sigma ) = \int _0^{2\pi } \frac{d\theta }{2\pi } \epsilon (\sigma , \theta )e^{i n\theta } \end{aligned}$$
(139)

serve as control functions along the circuit. They should satisfy \(\epsilon _n(\sigma )^* = -\epsilon _{-n}(\sigma )\) in order for the transformation to be unitary. In addition, in order to start our circuit at the identity we require \(\epsilon _n(\sigma =0)=0\).

The Virasoro symmetry without its central extensionFootnote 27 is simply the group of diffeomorphisms of the circle \(f(\theta )\in \text {Diff}(S^1)\). In particular, the function \(\epsilon (\sigma ,\theta )\) in the circuit above fixes the infinitesimal diffeomorphisms whose composition gives the total diffeomorphism function \(f(\sigma ,\theta )\) at each point \(\sigma \) along the circuit. Explicitly, \(\epsilon (\sigma ,f(\sigma ,\theta )) = \partial _\sigma f(\sigma ,\theta )\).

The reference state serving as the starting point for the circuit is taken to be the chiral primary \(|h\rangle \) satisfying

$$\begin{aligned} L_0|h\rangle = h |h\rangle , \quad L_n|h\rangle =0 \text { for } n>0. \end{aligned}$$
(140)

The authors of [23] considered two different cost functions along the circuit

$$\begin{aligned} \begin{aligned} {\mathcal {F}}_1(\sigma ) =&| \langle \psi (\sigma )| \partial _\sigma \psi (\sigma ) \rangle |\,, \\ {\mathcal {F}}_2(\sigma ) =&\sqrt{ \langle \partial _\sigma \psi (\sigma )| \partial _\sigma \psi (\sigma ) \rangle }\,, \end{aligned} \end{aligned}$$
(141)

which become equivalent in the large central charge limit \({\mathcal {F}}_2 \simeq {\mathcal {F}}_1 (1+{\mathcal {O}}(1/c))\). We should point out that the above \({\mathcal {F}}_1\) cost function is in fact different from the \(F_1\) cost function in Eq. (23). The difference is reminiscent of exchanging the order of the absolute value in the complexity definition and the sum over circuit generators. The \({\mathcal {F}}_1\) cost function in Eq. (141) generally has many null directions and therefore does not satisfy the mathematical definition of a norm, making it somewhat disadvantageous as a complexity measure. Nevertheless, it has a nice geometric interpretation in terms of the coadjoint orbits of the Virasoro group and a connection to the Liouville action featuring in the path-integral approach to complexity, see Sect. 6.6. Another useful cost function is the Fubini–Study (FS) metric

$$\begin{aligned} {\mathcal {F}}_{FS}(\sigma ) = \sqrt{ \langle \partial _\sigma \psi (\sigma )| \partial _\sigma \psi (\sigma ) \rangle - | \langle \psi (\sigma )| \partial _\sigma \psi (\sigma ) \rangle |^2}. \end{aligned}$$
(142)

This cost function has the advantage that it assigns zero contributions to circuits which only modify our state by an overall phase.

Using some algebraic manipulations based on the symmetry algebra it is possible to show that the \({\mathcal {F}}_1\) cost function is given by

$$\begin{aligned} {\mathcal {F}}_1(\sigma ) = \biggl |\int _0^{2\pi } \frac{d\theta }{2\pi } \frac{\partial _\sigma f(\sigma ,\theta )}{\partial _{\theta } f(\sigma ,\theta )} \left( \frac{c}{24}-h +\frac{c}{12} \{f,\theta \}\right) \biggr | \end{aligned}$$
(143)

where \(\{f,\theta \} = \frac{f'''}{f'} - \frac{3}{2}\left( \frac{f''}{f'}\right) ^2\) is the Schwarzian derivative.
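As a quick illustration of the Schwarzian derivative entering Eq. (143), the short symbolic sketch below (not part of the original analysis) verifies its defining property, namely that it vanishes on Möbius maps, and evaluates its leading behavior for the single-mode deformation used later in this section.

```python
import sympy as sp

theta, a, b, c, d = sp.symbols('theta a b c d')
eps, m = sp.symbols('epsilon m', positive=True)

def schwarzian(f, x):
    # {f, x} = f'''/f' - (3/2)(f''/f')^2, as defined below Eq. (143)
    f1, f2, f3 = sp.diff(f, x), sp.diff(f, x, 2), sp.diff(f, x, 3)
    return sp.simplify(f3/f1 - sp.Rational(3, 2)*(f2/f1)**2)

# Vanishes identically on Mobius maps (the defining property of the Schwarzian)
print(schwarzian((a*theta + b)/(c*theta + d), theta))      # -> 0

# Leading behavior for the single-mode deformation f = theta + (eps/m) sin(m*theta)
f = theta + eps/m*sp.sin(m*theta)
print(sp.series(schwarzian(f, theta), eps, 0, 2))          # -> -eps*m**2*cos(m*theta) + ...
```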

It turns out that the complexity functional (143) is related to the Polyakov action of induced gravity in two dimensions in a convenient choice of coordinates, which means that 2d induced gravity governs the complexity of Virasoro circuits. Since the Polyakov and Liouville actions are related, this connects nicely to the path integral complexity proposal, see the next subsection.

A similar computation for the Fubini–Study metric was carried out in [26, 27], which leads to

$$\begin{aligned} \begin{aligned}&{\mathcal {F}}_{FS}(\sigma )^2 = \int _0^{2\pi } \frac{d\theta _1}{2\pi }\frac{d\theta _2}{2\pi } \frac{\partial _\sigma f(\sigma ,\theta _1)}{\partial _{\theta _1} f(\sigma ,\theta _1)} \frac{\partial _\sigma f(\sigma ,\theta _2)}{\partial _{\theta _2} f(\sigma ,\theta _2)} \times \\ {}&~~~~~~\left[ \frac{c}{32 \sin ^4[(\theta _1-\theta _2)/2]}- \frac{h}{2 \sin ^2[(\theta _1-\theta _2)/2]}\right] \,. \end{aligned} \end{aligned}$$
(144)

From the above expressions for the cost functions we note that what was earlier a geodesic problem for the control functions \(Y^I(\sigma )\), cf. Eqs. (20) and (21), has now become an infinite-dimensional geodesic problem with the index I replaced by the continuous variable \(\theta \). The geodesic equations for the control function take the form of integro-differential equations for the function \(f(\sigma , \theta )\). The equations of motion are second order in \(\sigma \), which allows one to find circuits connecting two points in the Virasoro group. This makes the Fubini–Study norm a better suited complexity measure compared to the \({\mathcal {F}}_1\) cost function. The authors of [26, 27] used those equations of motion to find the complexity for going between the identity \(f(\sigma =0,\theta )=\theta \) and a perturbation containing a single Fourier mode \(f(\sigma =1, \theta ) =\theta +\frac{\epsilon }{m}\sin (m\theta ) \) with \(\epsilon \ll 1\) and \(m\in {\mathbb {N}}\). The sectional curvatures were found to be negative in most directions for physically relevant values of h and c.

A similar approach can be employed to study the complexity of unitary circuits of the conformal algebra in higher dimensions [25]. The conformal algebra consists of dilatations, translations, special conformal transformations and rotations – \(D,P_\mu , K_\mu , L_{\mu \nu }\) respectively, satisfying the commutation relationsFootnote 28

$$\begin{aligned} \begin{aligned}&[D, P_\mu ] = P_\mu ~, ~~~[D, K_\mu ] = - K_\mu ~, \quad \\&~~[K_\mu , P_\nu ] = 2\left( \delta _{\mu \nu } D - L_{\mu \nu }\right) ~, \end{aligned} \end{aligned}$$
(145)

where the rotations have been omitted from the list (but they satisfy the usual commutation relations). The generators satisfy the following Hermiticity relations

$$\begin{aligned} \begin{aligned} D^\dagger = D~, \quad K_\mu ^\dagger = P_\mu ~, \quad L_{\mu \nu }^\dagger = - L_{\mu \nu }~.\\ \end{aligned} \end{aligned}$$
(146)

As the reference state we consider a scalar primary stateFootnote 29\(|\psi _R\rangle = |\Delta \rangle \) of scaling dimension \(\Delta \) which satisfies

$$\begin{aligned} D|\Delta \rangle = \Delta |\Delta \rangle , \quad K_\mu |\Delta \rangle = L_{\mu \nu } |\Delta \rangle =0. \end{aligned}$$
(147)

A general unitary circuit will pass through states \(|\alpha (\sigma )\rangle = U(\sigma ) |\Delta \rangle \) where the unitary \(U(\sigma )\) is constructed as follows

$$\begin{aligned} U(\sigma ) =e^{i \alpha (\sigma )\cdot P} e^{i \gamma _D(\sigma ) D} \left( \prod _{\mu < \nu } e^{i \lambda _{\mu \nu }(\sigma ) L_{\mu \nu }}\right) e^{i \beta (\sigma ) \cdot K} \end{aligned}$$
(148)

where the various control functions \(\alpha _\mu (\sigma )\), \(\gamma _D(\sigma )\), \(\lambda _{\mu \nu }(\sigma )\), \(\beta (\sigma )\) have to satisfy some constraints to make sure that \(U(\sigma )\) is unitary. For example, one of these constraints is \(\text {Im}(\gamma _D) = -\frac{1}{2}\log (1 - 2 \, \alpha \cdot \alpha ^* + \alpha ^2 \alpha ^{*2})\). The \({\mathcal {F}}_1\) complexity cost function reads:

$$\begin{aligned} \frac{{\mathcal {F}}_1}{\Delta } = \left| \dfrac{\dot{\alpha }\cdot \alpha ^* -\dot{\alpha }^* \cdot \alpha + \alpha ^2 \,(\dot{\alpha }^* \cdot \alpha ^*)-\alpha ^{*2} (\dot{\alpha }\cdot \alpha )}{1 - 2 \, \alpha \cdot \alpha ^* + \alpha ^2 \alpha ^{*2}} + i \text {Re}(\dot{\gamma }_D) \right| , \end{aligned}$$
(149)

while the FS-metric is

$$\begin{aligned} \begin{aligned} \frac{ds_{FS}^2}{d\sigma ^2}&=2 \Delta \left[ \dfrac{\dot{\alpha }\cdot \dot{\alpha }^{*} - 2|\dot{\alpha }\cdot \alpha |^2}{1 - 2 \, \alpha \cdot \alpha ^* + \alpha ^2 \alpha ^{*2}} \right. \\&\quad \left. + 2\dfrac{\left| \dot{\alpha }\cdot \alpha ^* - \alpha ^{*2} \, \alpha \cdot \dot{\alpha }\right| ^2}{(1 - 2 \, \alpha \cdot \alpha ^* + \alpha ^2 \alpha ^{*2})^2}\right] \, . \end{aligned} \end{aligned}$$
(150)

We see that the \({\mathcal {F}}_1\) complexity depends on the overall phase \(\gamma _D\) of the state. In addition it is possible to show that the \({\mathcal {F}}_1\) cost function has many null-directions where the distance vanishes along non-trivial circuits. Once again we see that these properties make the \({\mathcal {F}}_1\) cost function a less favorable measure of complexity. Upon restricting the two-dimensional cost functions (142) and (143) to diffeomorphisms corresponding to the global conformal group, one simply obtains the \(d=2\) case of the higher dimensional cost functions (149) and (150).

Minimizing the Fubini–Study cost, it can be demonstrated that the complexity of a target state \(\alpha _T \equiv |\alpha (t=0)\rangle \) is simply

$$\begin{aligned} {\mathcal {C}}_{FS} = \sqrt{\Delta \left[ (\tanh ^{-1}\Omega ^S)^2+(\tanh ^{-1}\Omega ^A)^2\right] } \end{aligned}$$
(151)

where we can extract \(\Omega ^S\) and \(\Omega ^A\) from the combinations \(\Omega ^S\pm \Omega ^A = \sqrt{2\alpha _T \cdot \alpha ^*_T\pm 2 |\alpha _T^2|}\). We note that this result scales with \(\sqrt{\Delta }\).
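As a concrete illustration of Eq. (151), the snippet below extracts \(\Omega ^S\), \(\Omega ^A\) and evaluates \({\mathcal {C}}_{FS}\); it is only a sketch, in which \(\alpha _T\) is taken to be the complex vector of coefficients labeling the target state and the numerical values of \(\Delta \) and \(\alpha _T\) are arbitrary illustrative choices assumed to satisfy the unitarity constraints quoted above.

```python
import numpy as np

# Evaluating Eq. (151): Omega^S +/- Omega^A = sqrt(2 alpha.alpha* +/- 2 |alpha^2|).
# Delta and the components of alpha_T are arbitrary illustrative values.
Delta = 3.0
alpha = np.array([0.3 + 0.10j, -0.2 + 0.05j])     # target-state coefficients alpha_T

a_sq   = np.sum(alpha * alpha)                    # alpha^2 (Euclidean square)
a_absq = np.sum(np.abs(alpha)**2)                 # alpha . alpha^*

plus  = np.sqrt(2*a_absq + 2*abs(a_sq))           # Omega^S + Omega^A
minus = np.sqrt(2*a_absq - 2*abs(a_sq))           # Omega^S - Omega^A
Os, Oa = (plus + minus)/2, (plus - minus)/2

C_FS = np.sqrt(Delta*(np.arctanh(Os)**2 + np.arctanh(Oa)**2))
print(Os, Oa, C_FS)                               # note the overall sqrt(Delta) scaling
```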

In holography, the Fubini–Study line element has been related to the average of minimal and maximal distances between infinitesimally displaced timelike geodesics in the bulk (each representing the state at some point along the circuit), see [25]. This connection was made by identifying the bulk symplectic form with the one associated to the FS metric in the phase space of the circuits. This suggests that a very natural connection can be made to holography by studying the relevant symplectic forms, as was indeed suggested in [85]. These ideas open the path to an explicit holographic verification of the holographic complexity proposals. Alternatively, a connection with holography for a different class of states was proposed in [86, 87] by comparing variations in holographic complexity under small conformal transformations to equivalent variations in CFT complexity in two dimensions.

The above approach (both in 2d and in higher dimensions) considers only unitary circuits constructed from symmetry generators of the conformal group, and those circuits do not allow one to move between any two states in the CFT Hilbert space but only between states in the same conformal family. The extension to a larger class of circuits remains an open problem.

6.6 Path-integral approach to complexity

A different approach to complexity is based on preparing the state using the Euclidean path integral. The authors of [23, 88] have proposed that the optimization over possible circuits preparing the state is equivalent to optimizing the metric on the space where the path integral is performed. Roughly speaking, we are to understand this metric as the density of gates in a discretized version of the path integral which in turn can be understood as a tensor network.Footnote 30 The idea is that if some gates are not needed for the optimal circuit, they can be deleted, and this will change the effective geometry. The Euclidean time in the path integral is identified with the depth along the (non-unitary) circuit, and it gives rise to an RG direction \(z=-(\tau -\delta )\) which captures the gradual introduction of entanglement into the state at different length scales; the state prepared at the final time is defined at a UV cutoff \(\delta \).

The simplest case is that of a two-dimensional CFT, because every metric can be brought to the form \(ds^2 = e^{2\phi (z,x)} (dz^2+dx^2)\). In the UV we should have one gate for each cutoff-size region, so we should set \(e^{2\phi (z=\delta ,x)}=\frac{1}{\delta ^2}\). The ground state wavefunction in the curved metric is proportional to the one with the flat metric due to conformal symmetry:

$$\begin{aligned} \Psi _{g_{ab} = e^{2\phi }\delta _{ab}} = e^{S_L[\phi ]-S_L[0]} \Psi _{g_{ab} = \delta _{ab}} \end{aligned}$$
(152)

with a proportionality factor given by the Liouville action

$$\begin{aligned} S_L[\phi ] = \frac{c}{24\pi }\int _{-\infty }^\infty dx \int _\delta ^\infty dz \left[ (\partial _x \phi )^2 + (\partial _z \phi )^2 + \mu e^{2\phi } \right] . \end{aligned}$$
(153)

The parameter \(\mu \) can be rescaled by a shift of \(\phi \), so it can be set to one. The circuit that prepares the state is thus effectively computing the Liouville action, and the optimization is equivalent to minimizing the prefactor \(e^{S_L[\phi ]}\) (see also [89], which proposes another argument for the Liouville action in the language of tensor networks). This leads to the following proposal for the complexity

$$\begin{aligned} {\mathcal {C}}_\Psi = \min _{\phi } S_L[\phi (z,x)]\,. \end{aligned}$$
(154)

The conformal factor that minimizes the action, subject to the boundary condition described above, corresponds to the metric on the hyperbolic half-plane \(ds^2 = (dz^2+dx^2)/z^2\), and it can be interpreted as the metric of a time slice of AdS\(_3\). This leads to a complexity \({\mathcal {C}}_\Psi = \frac{cL}{12\pi \delta }\) for the vacuum state of the CFT, which has the same structure of divergences that we saw earlier in the free field theory case (cf. Eq. (116) for \(d=2\)).
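A short symbolic check of this statement is easy to set up; the sketch below assumes \(\mu =1\) and takes the x integral to simply produce the total system size L, and then evaluates the Liouville action (153) on the hyperbolic solution \(\phi =-\log z\), recovering \({\mathcal {C}}_\Psi = cL/(12\pi \delta )\).

```python
import sympy as sp

# Evaluate the Liouville action (153) on the hyperbolic saddle phi = -log z,
# with mu = 1; the x integral is taken to give the total system size L.
z, delta, L, c = sp.symbols('z delta L c', positive=True)
phi = -sp.log(z)

density = sp.diff(phi, z)**2 + sp.exp(2*phi)          # (d_x phi)^2 vanishes here
S_L = c/(24*sp.pi) * L * sp.integrate(density, (z, delta, sp.oo))
print(sp.simplify(S_L))                               # -> L*c/(12*pi*delta)
```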

Using appropriate boundary conditions on the strip and on the cut plane, one can find the solutions corresponding to the TFD and to the mixed state for a subregion of the vacuum state, respectively. In all these cases, the evaluation of the Liouville action (supplemented by boundary terms) gives results that agree qualitatively with the free field theory results and with the CV and CA holographic conjectures which we describe below (i.e., they have the same dependence on the cutoff, but different coefficients).

The generalization to higher dimensions is non-trivial, since the metric has more degrees of freedom than just the conformal factor. Restricting to the class of conformally flat metrics, one can write a natural generalization of the Liouville action:

$$\begin{aligned} S_d \sim \int d^{d-1}x \, dz \left[ e^{d\phi }+e^{(d-2)\phi }((\partial _x \phi )^2+(\partial _z \phi )^2)\right] \,. \end{aligned}$$
(155)

The optimization of this action again gives a constant-time slice of AdS\(_{d+1}\) and a vacuum complexity that agrees with the free field theory results and with the holographic CV/CA results which we will describe in the next section. A different but also natural generalization of the Liouville action to higher dimensions would be an action that reproduces the conformal anomaly of the theory [90, 91]. Such an action would have higher-derivative terms and would not be positive-definite, so its interpretation as complexity would be more problematic.

This framework also allows one to study the complexity of a state created by the insertion of a primary operator. The Liouville equation is modified by a source term, and the corresponding geometry is the Poincaré disc with a conical defect. This agrees with the dictionary of AdS\(_3\)/CFT\(_2\) to first order in \(\Delta /c\), but an exact match seems to require quantizing the Liouville action; it is not clear how this could arise from the optimization problem (see however [92]).

While for a CFT it is possible to perform the optimization varying only the background metric, for a generic QFT that has running couplings along the RG flow one expects to have to allow for variations of some parameters of the network. The case of a CFT perturbed by a relevant operator \(\lambda {\mathcal {O}}\) was considered in [93]. The condition (152) that the wavefunction remains the same up to a prefactor is no longer a consequence of the symmetry but has to be enforced by choosing \(\lambda (z)\) appropriately. The Liouville action is replaced by a functional \(N[\phi ,\lambda ]\) which can be calculated order by order in an expansion in \(\lambda \). The optimal geometry agrees with the backreaction of a scalar field on AdS\(_3\).

7 Complexity in holography

7.1 Complexity conjectures

In a series of papers starting in 2014 [12,13,14,15, 94], Susskind and collaborators have argued that the notion of quantum complexity is crucial to understand the quantum and information-theoretic properties of black holes. A connection was in fact already suggested in [95] in relation to the problem of decoding the information contained in the Hawking radiation. Susskind et al. made the connection much sharper by conjecturing, in the context of the AdS/CFT correspondence, a precise relation between the complexity of a state in the dual theory and the corresponding bulk geometry. The conjecture has two alternative forms: “Complexity=Volume” (CV) and “Complexity=Action” (CA).Footnote 31 In order to formulate them, let us denote by \(\Sigma \) a surface at constant time on the AdS boundary, where the state is defined. CV postulates that the complexity of the state is equal to the volume of a maximal slice in the bulk \({\mathcal {N}}\) such that \(\partial {\mathcal {N}}=\Sigma \):

$$\begin{aligned} {\mathcal {C}}_V = \frac{\text {Vol}({\mathcal {N}})}{G_N \ell _{CV}}, \end{aligned}$$
(156)

where \(\ell _{CV}\) is a length required to make the quantity dimensionless, see Fig. 16.Footnote 32 CA postulates instead that the complexity is equal to the on-shell action of a Wheeler-DeWitt (WDW) patch, which is the domain of dependence of a Cauchy slice in the bulk anchored at the boundary on \(\Sigma \):Footnote 33

$$\begin{aligned} {\mathcal {C}}_A = \frac{S_{WDW}}{\pi \hbar } \,, \end{aligned}$$
(157)

see Fig. 17.

Let us see how these prescriptions work in the case of a two-sided eternal black hole in AdS, which is thought to be the holographic dual of the TFD state. The geometry has two asymptotic boundaries, where the two copies of the theory live, that are separated by a horizon, so the L and R theories are in an entangled state but do not interact with each other.

The metric of the Schwarzschild-AdS\({}_{d+1}\) solution (with conformal boundary \({\mathbb {R}}\times S^{d-1}\)) is

$$\begin{aligned} \begin{aligned} ds^2&= -f(r) dt^2 + \frac{dr^2}{f(r)} + r^2 d\Omega _{d-1}^2 \\&= - \frac{f(r) e^{-4 \pi T r_*}}{(2\pi T)^2} dU dV + r^2 d \Omega _{d-1}^2 \,, \\ f(r)&= 1 + \frac{r^2}{\ell _{AdS}^2} -\frac{\mu }{r^{d-2}} \end{aligned} \end{aligned}$$
(158)

where \(\mu \) is proportional to the mass of the black hole:

$$\begin{aligned} M = \frac{(d-1)\omega _{d-1}}{16 \pi G_N} \mu \,. \end{aligned}$$
(159)

We have denoted by \(\omega _{d-1}\) the area of the sphere \(S^{d-1}\). The mass also determines the Hawking temperature via \(f(r_h)=0\), \(f'(r_h)= 4 \pi T\), where \(r_h\) is the horizon radius. The entropy of the black hole is given by \(S=\omega _{d-1} r_h^{d-1}/(4G_N)\). In the second line of (158), the metric is expressed in terms of the Kruskal coordinates U and V that cover the maximal analytic extension of the spacetime. The full spacetime can be divided into four regions, depending on the signs of U and V (see Fig. 16). The relation with the \(t_R,r\) coordinates defined on the right quadrant is

$$\begin{aligned} U = -e^{-2 \pi T (t_R-r_*)} \,, V = e^{2 \pi T (t_R+r_*)} \end{aligned}$$
(160)

where \(r_*\) is the tortoise coordinate defined by \(d r_* = dr/f(r)\). We can see that the original coordinates only cover the region \(U<0,V>0\). The metric has an isometry \(U \rightarrow e^{-a} U, V \rightarrow e^{a} V\) which is just time translation \(t_R \rightarrow t_R+a\), but on the left boundary it translates time in the opposite direction: \(t_L \rightarrow t_L -a\). We chose the time coordinates \(t_{L},t_{R}\) to run in the same direction on both sides of the Penrose diagram. The isometry reflects the invariance of the TFD state under the evolution generated by \(H_L-H_R\). In Kruskal coordinates, the boundaries are located at \(UV=-1\), the horizon is the union of the lines \(U=0\) and \(V=0\), and the black hole singularity is at a constant value of \(UV >0\).
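As a small numerical companion to Eqs. (158) and (159), the sketch below extracts the horizon radius, temperature, entropy and mass of the black hole; the choice of units \(G_N=1\) and the illustrative values of d, \(\mu \) and \(\ell _{AdS}\) are assumptions of this example.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import gamma as Gamma

# Horizon data of the Schwarzschild-AdS_{d+1} black hole of Eq. (158).
# Units with G_N = 1 and the numerical values of (d, mu, l_AdS) are assumptions.
d, mu, lAdS, GN = 4, 50.0, 1.0, 1.0

f  = lambda r: 1 + r**2/lAdS**2 - mu/r**(d - 2)
df = lambda r: 2*r/lAdS**2 + (d - 2)*mu/r**(d - 1)

r_h   = brentq(f, 1e-6, 1e3)                  # horizon radius, f(r_h) = 0
T     = df(r_h)/(4*np.pi)                     # Hawking temperature, f'(r_h) = 4 pi T
omega = 2*np.pi**(d/2)/Gamma(d/2)             # area of the unit sphere S^{d-1}
S     = omega*r_h**(d - 1)/(4*GN)             # Bekenstein-Hawking entropy
M     = (d - 1)*omega*mu/(16*np.pi*GN)        # Eq. (159)
print(r_h, T, S, M)
```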

CV conjecture: Let us now consider a bulk hypersurface connecting constant time slices at \(t_L, t_R\). We can use the isometry to set \(t_L = t_R\equiv \frac{t}{2}\).

Fig. 16 Penrose diagram of the two-sided black hole in AdS and the maximal-volume surface connecting two constant time slices on the opposite boundaries

Describing the surface by an embedding t(r), its volume is calculated asFootnote 34

$$\begin{aligned} \text {Vol}({\mathcal {N}})= \omega _{d-1} \, \int dr \, r^{d-1} \sqrt{-f(r) t'(r)^2+ \frac{1}{f(r)}} \,. \end{aligned}$$
(161)

The equation obtained by extremizing (161) can be integrated using the existence of an integral of motion \(\gamma \):Footnote 35

$$\begin{aligned} t(r) = \int _{r_0}^\infty \frac{dr}{f \sqrt{1+\gamma ^{-2}f r^{2d-2}}} \,, \quad t(\infty ) = t_R \,. \end{aligned}$$
(162)

The vanishing of the denominator gives the turning point \(r_0\) of the surface: \(\gamma ^2 = |f(r_0)| r_0^{2d-2}\) behind the horizon. So \(\gamma \le \gamma _{max} =\text {max} ( \sqrt{|f|} r^{d-1})\), and as \(\gamma \rightarrow \gamma _{max}\) the integral diverges logarithmically, so \(t_R \rightarrow \infty \). Using the integral of motion, the volume can be rewritten as

$$\begin{aligned} \text {Vol}({\mathcal {N}})= 2 \, \omega _{d-1} \, \int _{r_0} dr \frac{r^{2d-2}}{\sqrt{\gamma ^2+f(r) r^{2d-2}}} \,. \end{aligned}$$
(163)

Comparing the last two equations, we see that the integrals for the time and the volume have the same logarithmic divergence at the lower integration limit, i.e., the region where \(r \approx r_0\), so we can estimate

$$\begin{aligned} \text {Vol}({\mathcal {N}}) \sim \omega _{d-1} \gamma _{max} \, t \quad \text {as} \,\, t \rightarrow \infty \,. \end{aligned}$$
(164)

The maximal volume then grows linearly in time, and this can be attributed to the growth of the region behind the horizon, the ER bridge that connects the L and R theories. For a black hole with a large mass one finds

$$\begin{aligned} \gamma _{max} \sim \frac{\mu \ell _{AdS}}{2} \,, \quad \frac{d {\mathcal {C}}_V}{d t} \sim \frac{8 \pi }{d-1} M \sim \frac{8 \pi }{d} TS \,. \end{aligned}$$
(165)
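The estimate in Eqs. (164)–(165) can be checked numerically; the sketch below maximizes \(\sqrt{|f|}\,r^{d-1}\) behind the horizon and compares the resulting late-time growth rate with \(8\pi M/(d-1)\). Setting \(G_N=\ell _{CV}=\ell _{AdS}=1\) and the particular large value of \(\mu \) are assumptions made only for this illustration.

```python
import numpy as np
from scipy.optimize import brentq, minimize_scalar
from scipy.special import gamma as Gamma

# Check of Eq. (165): for a large Schwarzschild-AdS black hole, gamma_max
# approaches mu*l_AdS/2 and the late-time rate omega_{d-1}*gamma_max/G_N
# approaches 8 pi M/(d-1). G_N = l_CV = l_AdS = 1 is assumed here.
d, mu, lAdS = 4, 1.0e4, 1.0

f   = lambda r: 1 + r**2/lAdS**2 - mu/r**(d - 2)
r_h = brentq(f, 1e-6, 1e4)

# gamma_max = max of sqrt(|f|) r^{d-1} for 0 < r < r_h, where f < 0
res = minimize_scalar(lambda r: -np.sqrt(-f(r))*r**(d - 1),
                      bounds=(1e-6, r_h), method='bounded')
gamma_max = -res.fun

omega = 2*np.pi**(d/2)/Gamma(d/2)
M     = (d - 1)*omega*mu/(16*np.pi)
print(gamma_max, mu*lAdS/2)                   # nearly equal at large mu
print(omega*gamma_max, 8*np.pi*M/(d - 1))     # late-time dC_V/dt vs 8 pi M/(d-1)
```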

So the volume grows at a rate proportional to the total energy. The volume also has a divergence from the upper integration limit \(r \rightarrow \infty \). This is the typical UV divergence coming from the AdS boundary, and as usual we regulate it with a radial cutoff \(r_{\text {max}} = \ell _{AdS}^2 / \delta \). We find that the leading divergent term isFootnote 36

$$\begin{aligned} \text {Vol}({\mathcal {N}})_{div} \sim \frac{2}{d-1} \, \omega _{d-1} \frac{\ell _{AdS}^{~2d-1}}{\delta ^{d - 1}} \,. \end{aligned}$$
(166)

This leads to a complexity

$$\begin{aligned} {\mathcal {C}}_{V,div} \sim \frac{{\tilde{c}}}{d-1} \, \frac{\text {Vol}}{\delta ^{d - 1}} \,, \end{aligned}$$
(167)

where \({\tilde{c}}=\ell _{AdS}^{d-1}/G_N\) is proportional to the central charge of the theory [99] and \(\text {Vol}=2\omega _{d-1}\ell _{AdS}^{d-1}\) is the total spatial volume of the two boundary time slices. Notice that this term is time-independent; this is easy to understand, since when r is large we can neglect the \(\gamma ^2 \) term in the denominator of Eq. (163). Moreover it is also state-independent: different states correspond to asymptotically-AdS geometries with the same metric at leading order and corrections of relative order \(1 / r^{d}\). Therefore the difference of the volume in two different states is finite, and can be regularized by a state-independent subtraction. This state-independent subtraction can be done by focusing on the complexity of formation which we defined in Eq. (90), where we subtracted from the complexity of the TFD state at \(t=0\) that of two copies of the vacuum state (here empty AdS). In the high temperature limit and for \(d>2\) this yields [70]

$$\begin{aligned} \Delta {\mathcal {C}}_V = 4\sqrt{\pi } \frac{(d-2)\Gamma \left( 1+\frac{1}{d}\right) }{(d-1)\Gamma \left( \frac{1}{2}+\frac{1}{d}\right) } S+\ldots , \end{aligned}$$
(168)

where the dots indicate corrections away from high temperatures. Note that the complexity of formation is proportional to the entropy, just like what we found in the free field theory in Eq. (125), although with a different coefficient. In \(d=2\) the coefficient of the entropy in this expression vanishes and we are left with a constant complexity of formation. In particular, if we compare the complexity of the BTZ black hole to that of the Neveu–Schwarz vacuum in the boundary theory we obtain \(\Delta {\mathcal {C}}_V =8\pi c/3\) where \(c=3\ell _{AdS}/(2G_N)\) is the central charge, whereas comparing to the Ramond vacuum instead yields \(\Delta {\mathcal {C}}_V =0\). In all these examples the complexity of formation is non-negative. This property was proven in general in asymptotically AdS spaces in \(d=3\) and in some symmetric spaces in other dimensions in [100].

When the boundary geometry is not flat, the divergent part contains (167) as the leading term, but there are also additional subleading divergences that we will not discuss here; their structure was analyzed in [101, 102].

CA conjecture: As stated before, we need to find the domain of dependence of a Cauchy slice in the bulk, ending on the boundary at \(t_L = t_R\). This is the part of the bulk that can be unambiguously reconstructed if one knows only the initial conditions on the slice. It is easy to see that the WDW patch consists of points that are spacelike-separated from all boundary points in \(\Sigma \). The boundary of the WDW patch is then obtained by considering the innermost null geodesics starting from the boundary at the given time. In Kruskal coordinates, denoting the coordinates of the boundary time slices as \((U_L, V_L )\), \((U_R, V_R)\), with \(U_L V_L=U_R V_R=-1\), \(U_L/V_L=V_R/U_R\), these geodesics are the surfaces \(U = U_L\), \(U = U_R\), \(V = V_L\), \(V = V_R\).

If we are considering a solution to Einstein gravity with a cosmological constant, then naively, the on-shell action will be proportional to the spacetime volume of the WDW patch. However the spacetime region we consider has boundaries, and it is well-known that in the presence of boundaries the Einstein–Hilbert action has to be supplemented by additional boundary terms. For spacelike or timelike boundaries these are the Gibbons–Hawking boundary terms. However, these terms are not well-defined on null surfaces, due to the fact that the induced metric is degenerate. Furthermore, the boundaries of the WDW patch are not smooth. They consist of multiple components that intersect along codimension-two corners. The complete action appropriate in this situation was found in [103] (see also [101, 104, 105]) and can be written as a sum of terms \(S=\sum _j S_j\) associated to regions of codimension j. The terms areFootnote 37

$$\begin{aligned} 16 \pi G_N S_0= & {} \int _M d^{d+1} x \sqrt{-g} (R-2 \Lambda ) \,, \nonumber \\ 16 \pi G_N S_1= & {} 2 \epsilon _K \int _{B_\pm } d^d x \sqrt{|h|} K \nonumber \\&+ 2 \int _{B_0} d^{d-1}\theta d\lambda \sqrt{\gamma } \left( \epsilon _\kappa \kappa - \Theta \log (\ell _{ct}| \Theta |) \right) \,, \nonumber \\ 16 \pi G_N S_2= & {} 2 \epsilon _a \int _J d^{d-1}\theta \sqrt{\sigma } a \,. \end{aligned}$$
(169)

Here \(B_\pm \), \(B_0\) are the spacelike \((+)\), timelike \((-)\) or null (0) components of the boundary, K is the trace of the extrinsic curvature, \((\theta ^a, \lambda )\) are coordinates on \(B_0\) such that \(\lambda \) is a parameter on the null generators of the surface, increasing towards the future; \(\kappa \) is defined by \(k^\mu \nabla _\mu k_\nu = \kappa k_\nu \) for the vector field normal to the surface \(k^\mu \partial _\mu = \partial _\lambda \), \(\Theta = \partial _\lambda \log \sqrt{\gamma }\) is the trace of the second fundamental form, which gives the expansion rate of the congruence of null generators. J denotes the joints, or corners, arising from the intersection of two boundary components. There can be different types of joints: for \(J=B_\pm \cap B_0\), \(a=\log |n\cdot k|\), and for \(J=B_0 \cap B_0'\), \(a= \log |\frac{k\cdot k'}{2} |\). The normal vectors have to be taken pointing outwards from the region M for timelike surfaces and be future-oriented for spacelike and null surfaces. The factors \(\epsilon _K, \epsilon _\kappa , \epsilon _a\) are signs: \(\epsilon _K = 1\) for a timelike boundary while for a spacelike boundary \(\epsilon _K= 1 (-1)\) if the region M lies in the future (past) of the boundary component; \(\epsilon _\kappa = 1 (-1)\) if the region M lies in the future (past) of the boundary component; and \(\epsilon _a=-1\) if the volume of interest lies to the future (past) of the null segment and the joint lies to the future (past) of the segment, otherwise \(\epsilon _a=1\), see appendix C of [103].Footnote 38 The boundary term on the null boundaries is given in (169) using a particular parametrization, but one can show that it is reparametrization-invariant, thanks to the term involving \(\Theta \).Footnote 39 Notice that this term requires the introduction of a length scale \(\ell _{ct}\) on top of the AdS scale.

With all these ingredients at hand, we can compute the action of the WDW patch. It is UV divergent, and there are different ways to regulate it: we can compute the action of the WDW patch restricted to the part of the bulk within the cutoff, or alternatively we can compute the WDW patch in the cutoff space, with null geodesics starting from the cutoff surface \(UV=-1+4 \pi T \delta \). The two regularization schemes lead to the same result for the leading divergence [74, 101, 102]Footnote 40

$$\begin{aligned} S_{div} \sim \frac{2}{4\pi G_N} \frac{\ell _{AdS}^{2d-2}}{\delta ^{d-1}} \omega _{d-1} \log \left( (d-1) \frac{\ell _{ct}}{\ell _{AdS}} \right) \,. \end{aligned}$$
(170)

This gives a divergence in the complexity

$$\begin{aligned} {\mathcal {C}}_{A,div} \sim \frac{{\tilde{c}}}{4\pi ^2} \frac{\text {Vol}}{\delta ^{d-1}} \log \left( (d-1) \frac{\ell _{ct}}{\ell _{AdS}} \right) , \end{aligned}$$
(171)

where, as before, \({\tilde{c}}=\ell _{AdS}^{d-1}/G_N\) is proportional to the central charge of the theory and \(\text {Vol}\) is the total spatial volume of the two boundary time slices. This has the same structure as (167): it is extensive in the field theory volume and diverges as \(\delta ^{1-d}\); the prefactor is different, but in both cases it depends on an arbitrary length scale (recall that in CV the scale enters in the prescription (156)).

As for the time dependence, one can see that thanks to the time-translation isometry, the action of the part of the WDW patch outside the horizon is time-independent. For late times, the part behind the past horizon becomes vanishingly small, so the only contribution comes from the part within the future horizon.Footnote 41 The patch extends to the singularity. However, the relevant contribution to the action is finite due to the fact that the sphere shrinks there and there is no need to regularize the singularity. The computation done in [15] gives

$$\begin{aligned} \frac{dS}{dt} = 2 M \,. \end{aligned}$$
(172)

It is interesting to note that this result is independent of the counterterm scale \(\ell _{ct}\).

As before, it is interesting to consider the complexity of formation (90) where we subtracted from the complexity of the TFD state at \(t=0\) that of two copies of empty AdS. This yields at high temperatures in \(d>2\) [70]

$$\begin{aligned} \Delta {\mathcal {C}}_A = \frac{(d-2)}{d\pi } \cot \left( \frac{\pi }{d}\right) \, S+\ldots , \end{aligned}$$
(173)

where the dots indicate corrections away from high temperatures. Once again we find the proportionality of the complexity of formation to the entropy. In \(d=2\) the coefficient of the entropy in this expression vanishes and we are left with a constant complexity of formation. In particular, if we compare the complexity of the BTZ black hole to that of the Neveu-Schwarz vacuum in the boundary theory we obtain \(\Delta {\mathcal {C}}_A =- c/3\) where \(c=3\ell _{AdS}/(2G_N)\) is the central charge whereas comparing to the Ramond vacuum instead yields \(\Delta {\mathcal {C}}_A =0\).

7.2 Comparison between CV and CA

The first thing to notice is that both the CV and the CA results contain some ambiguities. CV requires a length scale for dimensional reasons; CA appears at first to be more canonically defined, but as we have seen, the presence of null boundary terms naturally reintroduces an additional scale. Moreover, the action could be modified by additional boundary terms. For example when dealing with charged black holes it turns out that the complexity can depend strongly on the boundary conditions one imposes on the associated Maxwell field [108].

Comparing with the results of the previous sections, we see that both the volume of maximal slices and the action of the WDW patch show the same behavior as the complexity in the free-field theory examples from Sect. 6. First, the UV divergent part obeys a volume law, and depends on the cutoff as \(\delta ^{1-d}\), the same as the free-field theory result for \({\mathcal {C}}_1^{UB}\) in Eq. (116). If we consider instead the free-field result for \({\mathcal {C}}_2\) in Eq. (115), we see that it has a different power law and cannot be matched to the holographic result. Comparing to our holographic results in Eqs. (167) and (170) we are led to identify

$$\begin{aligned} |{\tilde{\mu }}| \propto \ell _{AdS}/\ell _{CV} \propto \log (\ell _{ct}(d-1)/\ell _{AdS}) \end{aligned}$$
(174)

where here we introduced back the length scale \(\ell _{CV}\) involved in the definition of the CV proposal. We see that in fact the choice we could make on the field theory side for the scale of the reference state is naturally identified with the freedom which we have in the CV and CA proposals.

Second, the linear growth in time matches the expectation from the circuit model (11). Recall that using the relation with the Lyapunov exponent under the assumption of maximal chaos, the circuit time n is related to the physical time as \(n \propto T t\), and the number of qubits N is proportional to the entropy of the system. With these identifications, the rate of growth of the complexity for a black hole is expected to be proportional to TS at late times. This expectation is borne out both by CV and CA. It is worth noting that the linear growth of complexity for a very long time is not reproduced in the free field theory model in Sect. 6.3. Indeed, in such a simple theory the dynamical properties of complexity are expected to differ significantly from those of chaotic systems.

The result for CA in Eq. (172) may look more satisfactory, giving a rate of linear growth exactly equal to the mass, while for CV there is a proportionality factor that depends on the dimension. However, given the uncertainty in the identification of time, and the fact that the definition of complexity itself does not fix the normalization, we should be skeptical about the significance of the precise prefactor. Nevertheless, the holographic prescription fixes a particular normalization, and one may still be tempted to conjecture that (172) is a universal result for holographic models. This turns out not to be the case: for charged and rotating black holes the rate is a non-trivial function of the charge and angular momentum, and does not coincide with M. Initially [14, 15] speculated that M might give an absolute upper bound on the rate of growth of complexity, based on an analogy with Lloyd's bound on the rate of computation [109] (which in turn is based on the orthogonality bounds discussed in Sect. 2). This turns out to be false as well: it was shown in [98] that at late times the limiting value of the rate of change in complexity using CA is approached from above, thus violating the supposed bound by an amount that can be made arbitrarily large. Another counterexample was given in [110, 111]: in the case of Lifshitz and hyperscaling-violating solutions of Einstein–Maxwell dilaton theory the growth was found to be enhanced compared to the CFT case: \(dS/dt= 2E (1+\frac{z-1}{d-\theta })\) where E is the energy (equal to M in the \(z=1\) case). This however would still be compatible with a putative bound given by 2TS. In fact, a counterexample was already given in the initial papers [14, 15]: the bound is violated for large charged black holesFootnote 42 and this violation is most pronounced close to extremality, but in general such black holes are unstable to the emission of light charged particles. Recently a version of the holographic Lloyd's bound was proven for the case of CV: it was shown [100] that under certain energy conditions, in asymptotically-AdS spacetimes in \(d\ge 3\), the rate of growth of \({\mathcal {C}}_V\) is bounded by \(\frac{8\pi M}{d-1} f(M)\), where f(M) is a function equal to 1 for \(M\le {\hat{M}}\) with \({\hat{M}}\) a mass scale near the Hawking–Page transition, and \(f(M)=1+2(M/{\hat{M}})^{1/(d-2)}\) for \(M>{\hat{M}}\). We will comment further on the bounds on the rate of computation in the discussion section.

Finally, let us note that the complexity of formation in holography using the CV (168) and CA (173) proposals was found to be proportional to the entropy in \(d>2\). We observed a similar behavior in free field theory where the mass was set to zero (125). While the dependence of the proportionality coefficient on the dimension was different in all these cases, as we already mentioned earlier, this coefficient is somewhat arbitrary in the prescriptions for evaluating complexity.

7.3 Tensor network model

A different perspective on the growth of the complexity can be gained by considering a tensor network model. This gives another argument for the linear growth of complexity with a prefactor proportional to the temperature times the entropy of the system [15]. Tensor networks have been used as a computational tool to provide an efficient representation of states (e.g., of a spin system) that are less entangled than a typical state. Typically one is interested in the ground state of a local Hamiltonian, which has area-law entanglement entropy (with logarithmic corrections for a gapless system), whereas the typical state has a volume law entanglement entropy. We cannot give a full account of the topic in this review; the reader can find more details, e.g., in the recent review [112].

Fig. 17 The Wheeler–DeWitt patch used in the computation of CA

It has been proposed that tensor networks can provide a discretized picture of AdS/CFT, in particular using MERA (Multi-scale Entanglement Renormalization Ansatz) tensor networks, which are especially designed for constructing ground states of critical systems [113]. In a MERA network, the ground state of a critical system is produced by iterating two types of operations, as illustrated in Fig. 18. One operation is the disentangler, which introduces entanglement between the pair of qubits that it acts on; the other is the isometry, which performs a coarse-graining of the degrees of freedom. The effect of the two operations is that entanglement is introduced in the state at increasingly larger length scales.

Fig. 18 Illustration of a MERA circuit, implementing the RG flow from the bottom (UV) to the top (IR)

Schematically, one starts from an unentangled state at a UV scale \(\Lambda \), for a system of length L. One layer of the circuit acts on the state with an operator V and gives the wave function at the coarse-grained scale \(\psi (2 L, \frac{\Lambda }{2}) = V \psi (L, \Lambda )\).

The thermofield double state at temperature T has entanglement at length scales smaller than 1/T on each side while points at larger distances are unentangled. Therefore the circuit that builds two copies of the ground state also builds to a good approximation the finite-temperature TFD at short length scales. At the scale of the temperature the state is given by

$$\begin{aligned} \begin{aligned}&|TFD(L,T) \rangle = V^k \otimes (V^*)^k |TFD(L/2^k, \Lambda ) \rangle , \end{aligned} \end{aligned}$$
(175)

where \(k = \log _2 \frac{\Lambda }{T}\) is the number of layers in the circuit. The operation of V in the circuit constructing the TFD is depicted in red/green in Fig. 19.

Fig. 19 Illustration of a tensor network constructing the TFD state and its time evolution

Now, if we consider the evolution of the TFD state in time, we should attach a unitary time evolution operator to the UV part of the circuit. This evolution is depicted in blue in Fig. 19. Naively then, we could expect that the complexity grows as \(N \Lambda \, t\) (with \(N=(L\Lambda )^{d-1}\) the number of UV degrees of freedom) since the Hamiltonian acts on all the UV degrees of freedom. However, it turns out that this is not the most efficient way to prepare the TFD state at finite time. In fact, by swapping the action of the V operators with the time evolution operators we can convince ourselves that it is more efficient in the complexity sense (it requires fewer operations) to act with an effective Hamiltonian on the IR degrees of freedom, see the right panel of Fig. 19. Since we are describing a critical system, we can use the fact that \(H(L) V^k = V^k 2^{-k\Delta } H(L/2^k)\), namely that the Hamiltonian is a scaling operator, with dimension \(\Delta =1\). Then we can act on the IR state with a renormalized Hamiltonian; this is much more efficient since the number of sites on which we need to act is reduced by a factor of \(2^k\) after k steps. At the scale T the number of sites is LT, and so the expected growth rate of complexity is reduced to \(T(LT)^{d-1} \sim T S(T)\). In this way we recover the same prefactor in the rate of growth of complexity that arose from the epidemic model.
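The parametric counting in this argument can be illustrated with a trivial sketch (all order-one prefactors set to one and \(d=2\); the numerical values are arbitrary assumptions): pushing the time evolution through k MERA layers reduces the cost from \(\sim N\Lambda t\) to \(\sim T S(T)\, t\).

```python
import numpy as np

# Toy gate count for the two ways of evolving the TFD discussed above, in d = 2.
# All prefactors are set to one; this only illustrates the parametric scaling.
L, Lam, T, t = 1.0e4, 1.0, 1.0e-2, 1.0e3     # size, UV scale, temperature, time

N_uv  = L*Lam                                # number of UV sites, N = (L*Lambda)^{d-1}
k     = np.log2(Lam/T)                       # MERA layers down to the thermal scale
naive = N_uv*Lam*t                           # H acting on every UV site: ~ N*Lambda*t
smart = (L*T)*T*t                            # renormalized H on L*T IR sites: ~ T*S(T)*t

print(k, naive, smart, naive/smart)          # the IR route wins by a factor (Lambda/T)^2
```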

We hasten to add that the argument is very heuristic, and the precise correspondence of tensor networks with holography is far from being completely established.

8 Additional tests of the holographic conjectures

8.1 Shock waves

Particularly important support for the complexity conjectures can be obtained by studying their behavior under a perturbation of the system, and comparing it to the predictions from the circuit model in Sect. 3.2. In [13] the authors considered the evolution of the TFD state after the application of a precursor:

$$\begin{aligned} U_L(t_L) U_R(t_R) W_L(t_w) |TFD \rangle \,. \end{aligned}$$
(176)

Here W is a local CFT operator of energy \(E \ll M\) – more precisely, \(E={\mathcal {O}}(1)\), while \(M = {\mathcal {O}}(N^2)\). The operator acts on the boundary at a time \(t_w\), and creates an excitation which propagates in the bulk along a null line. As the excitation moves towards the horizon its energy gets more and more blue-shifted, so its backreaction cannot be ignored, even though the initial energy of the excitation is small. The backreaction is described by a shock wave [114]. For simplicity, we consider the case of AdS\(_3\), and we take an excitation created by an operator which is smeared uniformly along the circle at the boundary, sent from the left at some very early time, see Fig. 20.Footnote 43 In this case the perturbed metric can be written in Kruskal coordinates as

$$\begin{aligned} ds^2 = - A(r) (2 dU dV - 2 h \delta (U) dU^2) + r^2 d\phi ^2 \end{aligned}$$
(177)

where \(A(r)= f(r)e^{-4 \pi T r_*}/(8\pi ^2 T^2)\) can be read by comparing to the unperturbed metric in Eq. (158). The perturbation can be interpreted as a shift in the V coordinate across the horizon: \(V \rightarrow V- h \theta (U)\). The bulk stress energy tensor is localized on the shock wave; we can write it as

$$\begin{aligned} T_{UU} =\frac{E}{16 \pi G_N M} e^{2 \pi T |t_w|} \delta (U) \,, \end{aligned}$$
(178)

where E is the energy of the excitation inserted at \(t_w=0\). Solving Einstein’s equations gives

$$\begin{aligned} h \propto \, e^{2\pi T(|t_w|-t_*)} \,, \quad t_* = \frac{1}{2\pi T} \log \frac{M}{E} \,, \end{aligned}$$
(179)

where \(t_*\) is the scrambling time. The solution is valid in the limit where \(E \rightarrow 0, |t_w| \rightarrow \infty \), with h fixed. In this limit the shock wave propagates along the horizon.

Fig. 20 Penrose diagram of the AdS\(_{3}\) black hole geometry perturbed by a shock wave

Due to the shift in V, the maximal slices are displaced when they cross the horizon. The modification of the volume can be computed analytically in the 3d case, and the corresponding complexity is given, up to an additive constant which is UV divergent but time-independent, by the formula [46, 114]

$$\begin{aligned} {\mathcal {C}}_V \sim S \log \left[ \cosh \left( \pi T (t_L+t_R)\right) +{\mathfrak {c}}\, h \, e^{\pi T( t_L-t_R)} \right] \,, \end{aligned}$$
(180)

where S is the entropy, \({\mathfrak {c}}\) is some order one constant and h is given in (179). Setting \(t_L=t_R=0\), the formula has the same dependence on \(|t_w|\) and the scrambling time \(t_*\) as the result of the epidemic model (19): it grows exponentially with \(|t_w|\) for \(|t_w| \ll t_*\), and linearly for \(|t_w| \gg t_*\). Also as a function of \(t_L\), \(t_R\) at fixed h we can see different regimes. Setting for instance \(t_L=-t_R\), we have exponential growth in \(t_L\) for \(t_L \ll t_* - |t_w|\) followed by a linear growth at late times.
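The two regimes described above can be seen directly by evaluating Eq. (180); the following sketch sets \(t_L=t_R=0\), \({\mathfrak {c}}=1\), \(S=1\) and works in units where \(2\pi T=1\), all of which are assumptions made only for the illustration.

```python
import numpy as np

# Eq. (180) at t_L = t_R = 0 with the order-one constant set to 1, S = 1 and
# units in which 2 pi T = 1, so h = exp(|t_w| - t_*).
t_star = 20.0
tw = np.linspace(0.0, 60.0, 7)
h  = np.exp(np.abs(tw) - t_star)
C  = np.log(np.cosh(0.0) + h)                # C_V/S from Eq. (180)

for t, c in zip(tw, C):
    print(f"|t_w| = {t:5.1f}   C_V/S = {c:12.6f}")
# Exponentially small growth ~ e^{|t_w| - t_*} for |t_w| << t_*, and linear
# growth ~ |t_w| - t_* for |t_w| >> t_*.
```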

The formula (180) is actually a good approximation also for the complexity in shockwave backgrounds in higher-dimensional AdS black holes, because one can argue, with a reasoning similar to the one that led to Eq. (164), that the main contribution comes from a region where r is almost constant, and therefore the volume of the angular directions only contributes an overall factor but does not change the shape of the maximal surface.

More explicitly, we can evaluate the leading late-time result as follows [13]: one finds that the volume of a maximal surface connecting the left boundary at \(t_L\) to the horizon at \((U=0,V_R)\) is given by

$$\begin{aligned} \text {Vol}(t_L,V_R) \sim \frac{\omega _{d-1} \gamma _{max}}{2 \pi T} \log (V_R e^{2\pi T t_L}) \,. \end{aligned}$$
(181)

The remaining part of the surface goes from \((U=0,V_R-h)\) to the boundary at \(t_R\). Extremizing the sum of the two contributions over \(V_R\) gives \(V_R = h/2\), and

$$\begin{aligned} \text {Vol}&=\frac{\omega _{d-1} \gamma _{max}}{2 \pi T}\left( \log \left( \frac{h}{2} e^{2\pi T t_L}\right) +\log \left( \frac{h}{2} e^{-2 \pi T t_R} \right) \right) \nonumber \\&= \omega _{d-1} \gamma _{max} (t_L-t_R +2 |t_w| - 2 t_*)+{\mathcal {O}}(1)\,. \end{aligned}$$
(182)

In this derivation we assumed \(t_L > t_w\), \(t_R < - t_w\). The argument can be extended to more complicated insertions of the form \(U_L(t_L)W_L(t_1)\ldots W_L(t_n)U_R(t_R)\). We describe here the results of [13], which constructed the geometries corresponding to multiple shock waves created by the insertion of operators on the left side at different times, building on the work of [118]. Since the times \(t_1, \ldots t_n\) do not have to be ordered, one has to distinguish the operator insertions that are time-ordered from those that are not. The former give rise to shock waves that propagate in the same direction, and only give small perturbations to the geometry. The latter create shock waves propagating in opposite directions and have a larger effect. The geometry corresponding to multiple shock waves can be constructed by patching together portions of AdS along the horizon with shifts \((V,U) \rightarrow (V,U)\pm 2e^{-2\pi T (t_*\pm t_i)}\), where the coordinate being shifted, as well as the sign in the exponent, depends on the direction of the shock wave. One finds, in agreement with the expectation from the circuit model, that the complexity grows linearly with the time differences between insertions, with a switchback offset coming from points where the time contour folds; the generalization of Eq. (182) to multiple insertions then reads

$$\begin{aligned} \text {Vol} \sim |t_L - t_1| + |t_1 - t_2| + \cdots + |t_R + t_n| - 2 n_s t_* \,. \end{aligned}$$
(183)

Here \(n_s\) is the number of switchbacks, i.e., the number of times the ordering of the insertion times reverses along the contour. This result is valid only in the limit when all the time differences between the different shocks and between the shocks and the boundary times are very large compared to \(t_*\); the exact formula, just as for a single shock, will also exhibit different regimes where the volume grows exponentially. It was shown in [15] that the same behavior is obtained also using the CA prescription, although with more cumbersome calculations, especially in the case of multiple shock waves.

In the limit \(E \rightarrow 0\) we have considered, the energy of the shock is negligible and it does not change the mass of the black hole. The case of a finite-energy shock was considered in [117]. In that case, with a single shockwave, the complexity grows linearly at late times at a rate proportional to the final mass of the black hole (after it has absorbed the shock), whereas at early times there is a linear growth with a slope proportional to the energy of the shockwave, and a relatively sharp transition between the two regions.Footnote 44

As observed in [117], the epidemic model of Sect. 3.2 agrees with the AdS\(_3\) result (180) for light shocks, but does not account for the early-time growth of finite-energy shocks. The epidemic model we used assumed that the perturbation is generated by a simple operator. In fact, it is easy enough to account for the insertion of a heavy operator. We simply have to modify the initial conditions for the number of infected sites. The solution is given by \(s(n_0+n)\), with s(n) the number of infected sites at the n-th step as in (17), where \(s(n_0) = N_0\) is the size of the operator serving as the initial perturbation. We want to consider the case when the initial size is a finite fraction of the total size, \(N_0= \alpha N\). From Eq. (16) we find

$$\begin{aligned} n_0 = \frac{1}{k-1} \log \left( \frac{1-(1-\alpha )^{k-1}}{(1-\alpha )^{k-1}} \right) + n_*. \end{aligned}$$
(184)

The time needed for the infection to spread to the whole system is now \(n_*- n_0\), and we see that it is much shorter than the scrambling time, since it does not scale with N.Footnote 45 The early and late time behavior of the complexity can be obtained from the corresponding limits of (17), taken without the assumption \(N_0 \ll N\). One finds

$$\begin{aligned} \begin{aligned} {\mathcal {C}}(n)&\sim N_0 n \,, \quad n \ll 1 \,, \\&\sim N n \,, \quad n \gg 1 \,. \end{aligned} \end{aligned}$$
(185)

We see that the behavior is the same as for a finite-energy shock: there is an early-time linear regime with rate controlled by the size of the perturbation, and a later-time linear regime with rate given by the size of the system,Footnote 46 as illustrated in Fig. 21. The timescale of the transition between the two asymptotic regimes (called the delay time in [117]) is not controlled by the scrambling time but is of order \(t_d \sim 1/T\). In the case \(k=2\) we can give a formula for the full evolution of the complexity:

$$\begin{aligned} {\mathcal {C}}(n) = N \log \left( \frac{N-1+e^{n_0+n}}{N-1+e^{n_0}} \right) . \end{aligned}$$
(186)
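A minimal sketch of this formula and of the two linear regimes of Eq. (185) is given below; it assumes the logistic infection profile \(s(n)=Ne^{n}/(N-1+e^{n})\), whose integral reproduces Eq. (186), and uses arbitrary illustrative values of N and \(\alpha \).

```python
import numpy as np

# Epidemic model with a heavy initial operator, k = 2. The logistic profile
# s(n) = N e^n / (N - 1 + e^n) is assumed; its integral reproduces Eq. (186).
N, alpha = 1.0e6, 0.5
N0 = alpha*N
n0 = np.log(N0*(N - 1)/(N - N0))                 # fixed by s(n_0) = N_0, cf. Eq. (184)

s = lambda n: N*np.exp(n)/(N - 1 + np.exp(n))
C = lambda n: N*np.log((N - 1 + np.exp(n0 + n))/(N - 1 + np.exp(n0)))   # Eq. (186)

eps = 1e-4
print((C(eps) - C(0))/eps, s(n0), N0)            # early slope: dC/dn(0) = s(n_0) = N_0
print(C(20.0) - C(19.0), N)                      # late slope approaches N
```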
Fig. 21 Complexity growth calculated in the epidemic model with an insertion of a heavy operator of size \(N_0=N/2\)

8.2 Subregions

We have considered in the previous sections the complexity of pure states defined by a full holographic geometry. We can also consider mixed states associated to a subregion of the boundary. The information about the density matrix \(\rho _A\) of this mixed state is encoded holographically in the entanglement wedge, i.e., the bulk domain of dependence of the part of the constant time slice contained between the boundary region and the corresponding RT surface [119,120,121]. We recall that the RT surface computes holographically the entanglement entropy of a region on the boundary, and is the minimal surface in the bulk anchored on the boundary of the entangling region, for a review see [7]. It is natural to extend the complexity conjectures to the case of subregions. The extension of CV was first suggested in [122] for the case of static geometries; they proposed to take the volume of the maximal bulk slice bounded by A and by the RT surface. In the case of time-dependent geometries the prescription proposed in [101] makes use of the HRT surface [123] which is the appropriate covariant generalization of the RT surface.

The extension of CA, also proposed in [101], is to take the action of the region formed by the intersection of the entanglement wedge of A with the WDW patch of any boundary constant-time slice that contains A (one can show that the prescription is independent of the choice of the slice).

The case of a subregion given by a ball B of radius R in the vacuum (i.e., pure AdS) was considered in [122]. Using the CV proposal, one finds for the leading divergence

$$\begin{aligned} {\mathcal {C}}_{V,div} =\frac{{\tilde{c}}}{(d-1)} \frac{\text {Vol}(B)}{\delta ^{d-1}}. \end{aligned}$$
(187)

This has a volume law, just like the complexity of the full system. In the case of a BTZ black hole, for a segment of length x, one has

$$\begin{aligned} {\mathcal {C}}_V = \frac{2 c}{3}\left( \frac{x}{\delta } - \pi \right) \,, \end{aligned}$$
(188)

with c the central charge of the dual theory. This result was generalized to multiple segments in [124], who found

$$\begin{aligned} {\mathcal {C}}_V = \frac{2 c}{3}\left( \frac{x_{tot}}{\delta } - \pi \left( 2 \chi - \frac{m}{2}\right) \right) \,, \end{aligned}$$
(189)

where \(\chi \) is the Euler characteristic of the extremal surface, and m the number of joints between the boundary segments and the RT surface. Notice that the finite term is topological, and surprisingly there is no dependence on the temperature of the black hole. This is the case also in global AdS\(_3\), but not for higher dimensions [122, 125, 126].

Using the CA prescription for the same situation of a segment in planar AdS\(_3\) gives [127]

$$\begin{aligned} {\mathcal {C}}_A= & {} \frac{x}{\delta } \frac{c}{6 \pi ^2} \log \left( \frac{\ell _{ct}}{\ell _{AdS}}\right) - \frac{c}{3\pi ^2}\log \left( \frac{2 \ell _{ct}}{\ell _{AdS}}\right) \log \left( \frac{x}{\delta } \right) \nonumber \\&+ \frac{c}{24} \,, \end{aligned}$$
(190)

and for the planar BTZ black hole

$$\begin{aligned} {\mathcal {C}}_A = \frac{x}{\delta } \frac{c}{6 \pi ^2} \log \left( \frac{\ell _{ct}}{\ell _{AdS}}\right) - \log \left( \frac{2 \ell _{ct}}{\ell _{AdS}}\right) \frac{S_{EE}(x)}{\pi ^2} + \frac{c}{24} \,, \end{aligned}$$
(191)

where

$$\begin{aligned} S_{EE}(x) = \frac{c}{3} \log \left( \frac{1}{\pi T \delta } \sinh (\pi T x) \right) \end{aligned}$$
(192)

is the entanglement entropy of the segment. Compared with (188), CA has a subleading logarithmic divergence that persists also in the limit of zero temperature. Notice that the entanglement entropy appears in this formula in the same way as in the field theory result for the mutual complexity (135), although in order to compare the two we should take the limit \(L \rightarrow \infty \) in the latter. However, the relation between complexity and entanglement becomes more intricate in the case of multiple segments; see for example the case of two segments in holography [127] and in field theory [128]. It certainly does not hold in dynamical situations, since the time dependence of the two quantities is drastically different, as we have already seen.
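The statements above can be checked directly from (190)–(192). The short Python sketch below, with values of c, \(\ell _{ct}/\ell _{AdS}\), x and \(\delta \) that are purely illustrative choices of ours, evaluates (191) for decreasing temperature and confirms that it approaches the vacuum result (190) as \(T \rightarrow 0\).

```python
import numpy as np

# Illustrative parameters (our own choices)
c = 12.0         # central charge
ratio_ct = 4.0   # ell_ct / ell_AdS
x = 1.0          # segment length
delta = 1e-3     # UV cutoff

def S_EE(x, T, delta, c):
    # Eq. (192): entanglement entropy of the segment at temperature T
    return (c / 3.0) * np.log(np.sinh(np.pi * T * x) / (np.pi * T * delta))

def C_A_BTZ(x, T, delta, c, ratio_ct):
    # Eq. (191): CA subregion complexity of a segment in planar BTZ
    return ((x / delta) * c / (6 * np.pi**2) * np.log(ratio_ct)
            - np.log(2 * ratio_ct) * S_EE(x, T, delta, c) / np.pi**2
            + c / 24.0)

def C_A_vacuum(x, delta, c, ratio_ct):
    # Eq. (190): the same quantity in planar AdS3 (zero temperature)
    return ((x / delta) * c / (6 * np.pi**2) * np.log(ratio_ct)
            - c / (3 * np.pi**2) * np.log(2 * ratio_ct) * np.log(x / delta)
            + c / 24.0)

for T in (1.0, 0.1, 0.01, 0.001):
    print(f"T = {T}:  C_A = {C_A_BTZ(x, T, delta, c, ratio_ct):.4f}")
print(f"T -> 0 limit, Eq. (190): {C_A_vacuum(x, delta, c, ratio_ct):.4f}")
```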

It was observed in [71] that while CV subregion complexity is additive in a pure state (i.e., \(\Delta {\mathcal {C}}_{V}(\rho _A, \rho _{A^c})=0\), where \(A^c\) is the region complementary to A), and is in general superadditive, \(\Delta {\mathcal {C}}_{V} \le 0\), for CA complexity one cannot make a general statement: it can be subadditive or superadditive, and it may change behavior depending on the value of the counterterm scale \(\ell _{ct}\). However, if \(\ell _{ct}\) is chosen such that the leading divergence in the complexity is positive, then the CA complexity is found to be superadditive, \(\Delta {\mathcal {C}}_A<0\) [71, 74]. This contrasts with the field theory results of Sect. 5.5, where the complexity was found to be subadditive, \(\Delta {\mathcal {C}}^{\text {diag}} >0\), in the diagonal basis. In the physical basis, on the other hand, the complexity was found to be superadditive in several cases [74].

8.3 Defects and boundaries

Another interesting situation is the presence of boundaries or defects in the field theory. Defects in a CFT that preserve part of the conformal symmetry have been investigated extensively, including their holographic realizations. The simplest model is the thin-brane model, where the defect extends into the AdS bulk as a brane [129] (different models were considered in [130,131,132,133]). The action is the Einstein-Hilbert action coupled to the action of the brane:

$$\begin{aligned} S&= \frac{1}{16\pi G_N} \int d^3 x \sqrt{-g}\left( R + \frac{2}{\ell _{AdS}^2}\right) \\&\quad - \frac{T}{8 \pi G_N} \int d^2x \sqrt{-h}\,. \end{aligned}$$
(193)

The gravity solution is obtained by gluing two patches of vacuum AdS\(_3\) along the brane, in the way specified by the Israel junction conditions [134]. In this model there are three parameters: the central charges of the theories joined by the defect, \(c_{L,R}\), and the tension of the brane T. The dependence of the complexity of the vacuum on the tension was studied in [135] for the case of a 2d CFT with \(c_L = c_R =c\). When the theory is put on a circle of length L, with two defects at the diametrically opposite points \(x=0, x= L/2\), one finds

$$\begin{aligned} \begin{aligned} {\mathcal {C}}_V&= \frac{4 c}{3} \left( \frac{\pi L}{\delta } + 2 \log \left( \frac{2 L}{\delta } \right) \sinh (2 y^*) \right) \,, \\ {\mathcal {C}}_A&= \frac{c}{3 \pi } \left( \frac{L}{\delta } \log \left( \frac{e \ell _{ct}}{\ell _{AdS}}\right) + \frac{\pi }{2} \right) \,, \end{aligned} \end{aligned}$$
(194)

where \(y^*\) is related to the tension via \( T \ell _{AdS} = 2 \tanh y^*\). Remarkably, the CA result has no dependence on \(y^*\): it is completely unaffected by the presence of the defect. One may be tempted to take this surprising result as evidence against the CA conjecture. However, it turns out to be consistent with the result obtained in a simple model of a conformal defect for a free scalar in 2d. This defect is also characterized by a single parameter, which determines the matching condition:

$$\begin{aligned} \begin{pmatrix} \partial _x \phi _- \\ \partial _t \phi _- \end{pmatrix} = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda ^{-1} \end{pmatrix} \begin{pmatrix} \partial _x \phi _+ \\ \partial _t \phi _+ \end{pmatrix} \end{aligned}$$
(195)

where \(\phi _{\pm }\) are the values of the field on the two sides of the defect. When one defect is placed at \(x=0\) and the opposite defect (which has \(\lambda \) replaced with \( \lambda ^{-1}\)) at \(x=L/2\), the spectrum of the theory is not affected by the defects, and therefore the vacuum complexity is unaffected as well, see Eq. (113).
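To make the tension dependence in Eq. (194) concrete, the sketch below (with illustrative values of c, L, \(\delta \) and \(\ell _{ct}/\ell _{AdS}\), all our own choices) evaluates both expressions for a few values of \(T\ell _{AdS}\), using \(y^* = \mathrm {arctanh}(T\ell _{AdS}/2)\); only the CV result changes, while the CA result stays constant.

```python
import numpy as np

# Illustrative parameters (our own choices)
c = 12.0
L = 1.0
delta = 1e-3
ratio_ct = 4.0   # ell_ct / ell_AdS

def C_V_defect(T_ell, c, L, delta):
    # Eq. (194), CV: depends on the tension through y* = arctanh(T ell_AdS / 2)
    y_star = np.arctanh(T_ell / 2.0)
    return (4 * c / 3.0) * (np.pi * L / delta
                            + 2 * np.log(2 * L / delta) * np.sinh(2 * y_star))

def C_A_defect(c, L, delta, ratio_ct):
    # Eq. (194), CA: no dependence on the tension at all
    return (c / (3 * np.pi)) * ((L / delta) * np.log(np.e * ratio_ct) + np.pi / 2)

for T_ell in (0.0, 0.5, 1.0, 1.5):   # values of T * ell_AdS
    print(f"T*ell_AdS = {T_ell}:  C_V = {C_V_defect(T_ell, c, L, delta):.1f}, "
          f"C_A = {C_A_defect(c, L, delta, ratio_ct):.1f}")
```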

This calculation was also extended to the case of a subregion symmetric about the defect. Just as in the case without a defect, the CA subregion complexity has a logarithmic divergence depending on \(\ell _{ct}\), but it remains independent of the defect parameter.

Instead of a defect, one can consider the case where the CFT has a boundary. The holographic description of a BCFT in terms of the thin-brane model was proposed in [136], and using this proposal the complexity of a CFT of dimension d, with the boundary on a hyperplane, was computed in [137]. It was found that in \(d>2\) CV and CA have qualitatively similar behavior. In \(d=2\), similarly to (194), CV has a logarithmic divergence which is absent in CA, but CA also has a finite contribution which is tension-dependent. One should note, however, that there is an ambiguity coming from the joints at the boundary: the null normals to the boundary and to the WDW patch are orthogonal, so the prescription (169) is not well-defined in this case.

The same result in \(d=2\) was also found in [138], where in addition the vacuum complexity of a finite harmonic chain with Dirichlet boundary conditions was computed. The computation is similar to the one in Sect. 3.1, but now the boundary condition breaks translational invariance, so the zero mode is lifted and one can take the massless limit. This can be compared more directly to the holographic result, and once again the \({\mathcal {C}}_1\) complexity was found to be in qualitative agreement with the CV proposal.
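As a rough illustration of this type of lattice computation (a sketch under our own assumptions, not the actual calculation of [138]), one can take a massless chain of N oscillators with fixed ends, whose normal-mode frequencies \(\omega _k = 2\Omega \sin \left( \frac{k\pi }{2(N+1)}\right) \) are all nonzero, and evaluate the standard diagonal-basis Gaussian ground-state complexities \({\mathcal {C}}_1 = \frac{1}{2}\sum _k |\ln (\omega _k/\mu )|\) and \({\mathcal {C}}_2 = \frac{1}{2}\big (\sum _k \ln ^2 (\omega _k/\mu )\big )^{1/2}\) with respect to a reference frequency \(\mu \); all numerical values below are illustrative choices of ours.

```python
import numpy as np

# Illustrative parameters (our own choices, not those of the references)
N = 50          # number of oscillators
Omega = 1.0     # nearest-neighbour coupling frequency
mu = 1.0        # reference-state frequency

# Normal-mode frequencies of a massless chain with fixed ends (Dirichlet):
# the zero mode is absent, so the massless limit is well defined.
k = np.arange(1, N + 1)
omega = 2.0 * Omega * np.sin(k * np.pi / (2 * (N + 1)))

# Diagonal-basis Gaussian ground-state complexities with reference frequency mu
logs = np.log(omega / mu)
C1 = 0.5 * np.sum(np.abs(logs))
C2 = 0.5 * np.sqrt(np.sum(logs**2))

print(f"C_1 = {C1:.3f},  C_2 = {C2:.3f}")
```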

It is interesting to observe that in the case of a subregion in a BCFT, the holographic complexity exhibits a phase transition “inherited” from the entanglement entropy [138]. Depending on the ratio of the subregion length to the distance from the boundary, the transition is determined by the minimum area of two possible configurations of the RT surface: one where the surface is the same as it would be without the boundary, and another where the surface ends on the brane in the bulk. At the transition point the two surfaces have the same area, so the entanglement entropy is continuous, but the complexity changes discontinuously, see Fig. 22. This type of transition in the entanglement entropy has been used extensively for studying the formation of islands in the context of the Page curve of black hole evaporation [9, 10, 139,140,141,142,143,144,145,146,147]. The study of complexity and its discontinuity at the phase transition gives additional insights into this problem, see [148,149,150,151,152]. A similar discontinuity also appears without a defect or boundary, in the case of a subregion consisting of two disconnected segments [127].

Fig. 22 Illustration of the entanglement and complexity phase transition in a system with two boundaries, as a function of the size of the boundary region A. The region inside the RT surface is colored in yellow. Note that in the right figure this region extends to the IR cutoff and an IR regulator is needed to give a finite result. This effect is due to working within the Poincaré patch and is not present when considering global AdS

9 Summary and outlook

In this introductory review, we started by presenting the most basic ideas related to quantum complexity in the context of quantum computing, as one measure of the difficulty of solving a problem with a quantum algorithm. We established some generic properties that can be deduced from simple counting arguments on the space of operators. We then introduced the geometric approach of Nielsen, which replaces gate complexity with a notion of continuous complexity. This has many advantages, not least that it is in many cases more amenable to explicit computations. We illustrated the method on examples of increasing system size (i.e., dimension of the Hilbert space): first a single qubit, then a harmonic oscillator, and finally a free QFT. In the last two cases, the complexity is computable for the class of Gaussian states (or equivalently, for operators generated by gates quadratic in the oscillators). We presented a partial further extension to the case of a CFT, in which the states that can be considered are those belonging to a single conformal family, i.e., descendants of a single primary state. We also presented the additional problems that arise when considering mixed states, mostly using one particular definition of complexity, namely the complexity of purification.

We then moved to the holographic complexity conjectures. We showed, working with the example of the eternal two-sided black hole dual to the TFD state, that both CV and CA reproduce qualitatively the features expected for complexity: the divergence structure matches the free-field theory result, and the behavior in time matches the growth expected for a chaotic, fast-scrambling system. We showed that a crucial property of complexity, the switchback effect, is present in simple holographic models where the perturbation of the system is represented by a shock wave. Finally we presented the extension of the conjectures to the case of subregions of the boundary theory, and an application to the thin-brane holographic models of CFTs with defects and boundaries.

While CV and CA give qualitatively similar answers in most cases, we showed that for subregions and for defects/boundaries there are significant differences. This raises an important question: which, if either, of the two conjectures is the correct one? In fact, complexity is not a single observable but a family of them. The holographic definitions have some ambiguities, but far fewer than the QFT definition, which depends on the choice of a cost function, a basis of gates, penalty factors, etc. It could be that the two holographic conjectures each correspond to a specific choice of these parameters, and that all the other choices do not have a natural bulk interpretation, or at least we have not found it yet. If this is true, it would be extremely interesting to understand which complexity is naturally singled out by holography and why. We definitely do not have a “smoking gun” signature comparable to other precision tests of the AdS/CFT correspondence, which require supersymmetry or integrability in order to interpolate between weak and strong coupling. It has not been explored whether supersymmetry and/or integrability play a role in the complexity story.

The tensor network description of holography could shed some light on this question, but it needs to be understood better, particularly in what concerns the dynamical aspects. Another approach is to attack the problem from the other end, as it were, namely to further develop the techniques for studying complexity in QFT. Since holographic theories are strongly coupled, it is essential to develop tools that go beyond Gaussian states and free theories. For the moment, only a few attempts have been made using perturbation theory. As we explained, the computations are manageable only when one can exploit a symmetry of the system; for this reason it seems promising to consider CFTs, but at present it is not known how to compute the relative complexity of two states that do not belong to the same conformal family. As we have seen, in free theories the complexity can be expressed in terms of the spectrum of the theory. Presumably in a CFT there will be some dependence on the OPE coefficients as well. It would be interesting to understand this dependence, and to determine whether some part of the complexity has universal properties.

Penalty factors are a crucial ingredient of the complexity geometry. As we saw in the single-qubit case, and as is true more generally, they can make the sectional curvatures negative [154], which in turn is associated with diverging geodesics and chaotic behavior (notice, however, that in the case of coherent states we found a section with the geometry of hyperbolic space even without any penalty factors). It is therefore important to try to understand how complexity in QFT is affected by penalty factors (see [155] for some work in this direction). This would also help in better understanding the relation between complexity and chaos [32].

An important open question is whether there are universal bounds on the growth rate of complexity. As we have seen in Sect. 7.2, in many cases CA saturates a bound inspired by Lloyd's bound, which yields a maximum computation rate proportional to the energy of the system. However, on the one hand, one can find holographic counterexamples where the bound is violated, and on the other hand, Lloyd's bound, seen as a bound on computational speed, requires some assumptions on how the computation is performed; in particular, it assumes that the operations performed by the gates map a state into an orthogonal state. This assumption is not satisfied by “simple” gates, namely gates that are close to the identity, which are the type of gates used in the definition of continuous complexity. It was argued in [156] that the holographic results imply that a black hole is modeled by simple gates, if one assumes a serial circuit. The authors introduce two time scales: the time \(\tau _{comp}\) required to perform an operation, and the time \(\tau _{coh}\) which characterizes the spread of the wavefunction and can be related to the density of states, for a system with many degrees of freedom, using a saddle-point approximation. For holographic systems \(\tau _{coh} \gg \tau _{comp}\), implying that the gates are simple. However, it seems more reasonable that a circuit modeling a black hole would be parallel, meaning that many gates can act simultaneously on different qubits (generically we expect as many as S/2). The analysis in this case becomes more subtle, and this is a question that certainly warrants further investigation.
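For orientation, Lloyd's bound \(d{\mathcal {C}}/dt \le 2E/(\pi \hbar )\) is easy to evaluate numerically; the back-of-the-envelope snippet below reproduces the familiar estimate of roughly \(5\times 10^{50}\) elementary operations per second for one kilogram of rest energy. This is quoted only to set the scale and is not a holographic computation.

```python
import math

hbar = 1.054571817e-34    # J s
c_light = 2.99792458e8    # m / s

def lloyd_rate(mass_kg):
    """Maximum number of operations per second allowed by Lloyd's bound,
    dC/dt <= 2 E / (pi hbar), with E = m c^2."""
    energy = mass_kg * c_light**2
    return 2.0 * energy / (math.pi * hbar)

print(f"1 kg of rest energy: {lloyd_rate(1.0):.2e} operations per second")
```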

Apart from the question of bounds, the fact that the complexity grows linearly in time is in itself highly significant, and it has important implications for quantum computability. As discussed in [157], if we assume that black holes behave as universal quantum circuits, then their linear growth of complexity for an exponentially long time implies that there exist problems that can be solved by a classical computer with polynomial space and arbitrary time (i.e., they are in the complexity class PSPACE) but which cannot be solved by a quantum computer in polynomial time. Of course, in order to reach this conclusion it is not enough to argue that the growth is generically linear; one has to prove it. This was done recently in [158] (see also [159]) for the case of random circuits built from two-qubit gates, where each gate is drawn randomly according to the Haar measure on SU(4). The proof is basically a refinement of the counting argument, and it shows that the complexity is bounded below by a linear function of time, with probability 1. It is believed that this kind of circuit should be a good model for chaotic quantum dynamics generated by a time-independent Hamiltonian. The result was proven for the exact gate complexity, while it has not yet been proven for approximate or continuous complexity.
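The counting argument alluded to above can be made quantitative with simple arithmetic: the number of circuits built from g gates drawn from a finite gate set grows only exponentially in g, while covering SU\((2^n)\) at resolution \(\epsilon \) requires of order \((1/\epsilon )^{4^n}\) distinguishable unitaries, so most unitaries have complexity exponential in n, which is what makes linear growth over exponentially long times possible at all. The sketch below is only this crude version of the argument, not the refinement of [158], and all numbers in it are illustrative choices of ours.

```python
import math

def log_num_circuits(g, n, gate_set_size):
    # log of an upper bound on the number of circuits with g two-qubit gates:
    # at each step pick a gate and an ordered pair of qubits
    return g * math.log(gate_set_size * n * (n - 1))

def log_num_unitaries(n, eps):
    # log of a crude lower bound on the number of eps-distinguishable
    # unitaries in SU(2^n), whose real dimension is 4^n - 1
    return (4**n - 1) * math.log(1.0 / eps)

# Purely illustrative numbers
n, gate_set_size, eps = 8, 100, 0.01

g_min = log_num_unitaries(n, eps) / math.log(gate_set_size * n * (n - 1))
print(f"n = {n}: a generic unitary needs at least ~{g_min:.0f} gates "
      f"(to be compared with 4^n = {4**n})")
```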

We mentioned in the introduction that one of the most important questions concerning the quantum information properties of gravity is the difficulty of decoding the Hawking radiation emitted by a black hole. The holographic conjectures we have presented address a different, albeit not unrelated, problem, namely the difficulty of distinguishing different states of a black hole. The fact that the holographic duality relates a quantity of the boundary theory that is difficult to compute (in the colloquial sense of the word) to one in the bulk that is easy to compute does not come as a surprise to those familiar with the correspondence. However, in the quantum information-theoretic setting we attribute a precise meaning to the difficulty, and we can wonder, as [160] did, whether this property of the correspondence violates the extended Church–Turing thesis, which postulates that any physical process can be efficiently simulated on a quantum computer. Even though the volume, or the action, of the wormhole is not exactly a physical observable, one can argue that it is a quantity that can be easily extracted from a coarse knowledge of the metric. Therefore a quantity of high complexity can be efficiently determined by evolving the state in the bulk; this suggests that the conversion of bulk quantities into boundary quantities, namely the holographic dictionary, must be extremely complex. As pointed out by [161], this Gedanken experiment requires that the bulk observer has access to the black hole interior, so the horizon plays a role in keeping the Church–Turing thesis valid, under the condition that one only considers the space accessible to outside observers.

Considering the problem of decoding Hawking radiation, one encounters a different puzzle, observed in [162], where a possible solution was also proposed. Suppose a black hole is allowed to radiate for a time that is not too long. According to the ER = EPR conjecture [163], there is a wormhole that connects the interior to the radiation; the volume grows linearly with time, so according to CV the complexity of the state is only polynomial in the entropy at this time, in contrast with the result that distilling the information from the radiation is exponentially hard. The solution proposed in [162] is that the difficulty of the distillation task is in fact measured by a different quantity, since one is not allowed to use all possible gates but only those that act on the radiation without acting on the interior. A different holographic conjecture was proposed for this restricted complexity, involving the area of the maximal cross-section of the wormhole and of the minimal surface in the throat that connects it to the asymptotic region. This shows that there are probably different notions of complexity that are useful for answering different questions about the quantum information-theoretic aspects of gravity, and there is still much to be understood.

Another important question concerns the implications of complexity for many-body systems. In order to characterize properties such as scrambling, chaos, and thermalization, extensive use has been made mostly of two types of observables: low-point correlation functions (especially out-of-time-order correlators) and entanglement entropy. Quantum computational complexity captures properties of the quantum state of a system that are more refined than those visible through these observables. This is why it is sensitive to the evolution of the microstates in the ensemble corresponding to a black hole. It is likely that it can also be used to give new insights into the mechanisms underlying the approach to equilibrium and thermalization, and possibly to detect new types of phase transitions (see e.g., [29, 30, 164]).

Finally, we should point out again that we did not aim to write a comprehensive review of the subject, and therefore left out many topics that we felt were too advanced for an introduction, such as the thermodynamic and resource-theory aspects of complexity [43, 46, 165, 166], the relation with bulk dynamics (in the sense of reconstructing the Einstein equations in the bulk from the complexity of the boundary) [89], alternative conjectures, most notably the one in [96] (sometimes called CV 2.0), complexity in de Sitter space [52, 167,168,169], the evolution of complexity after a quench [77, 170], the relation between complexity and chaos [32, 171], and other notions of complexity such as operator complexity [172, 173]. We hope that our readers will be encouraged to delve further into this fascinating subject and contribute to its development.