1 Introduction

Based on the intuition that the classical spacetime geometry encodes information-theoretic properties of the dual quantum field theory (QFT) in the context of gauge/gravity duality, many quantum information concepts have been applied to investigations of gravity theories. A notable example is the holographic entanglement entropy (EE) of a subregion in a QFT [1]. Even though EE has played a crucial role in understanding the dual gravity, it turned out that EE is not enough [2], in particular when it comes to the interior of the black hole. In the eternal AdS black hole, an Einstein–Rosen bridge (ERB) connecting two boundaries continues to grow for a longer time scale even after thermalization. Because EE quickly saturates at equilibrium, it cannot explain the growth of the ERB, and another quantum information concept, complexity, was introduced as a dual to the growth of the ERB [3, 4]. To ‘geometrize’ the complexity of quantum states in the dual gravity theory, two conjectures were proposed: the complexity-volume (CV) conjecture [4] and the complexity-action (CA) conjecture [5], which are collectively called holographic complexity.Footnote 1 See also Refs. [15,16,17].

However, note that the complexity in information theory is well-defined in discrete systems such as quantum circuits [18]. For example, the so-called circuit complexity is the minimal number of simple elementary gates required to approximate a target operator in a quantum circuit. In contrast, holographic complexity is supposed to be dual to complexity in a QFT, a continuous system. Thus, there may be a mismatch in the duality if we compare holographic complexity with results based purely on intuition from circuit complexity, and it is important to develop the theory of complexity in QFT. Compared with the considerable progress in holographic complexity, the precise meaning of complexity in QFT is still not settled. In order to define complexity in QFT systematically, we start with the complexity of operators. The complexity between states will then be obtained from the complexity of operators. For the complexity of states we make a brief comment in Sect. 8 and refer to [19] for more detail. Our strategy for defining the complexity of operators is: (i) extract minimal and essential axioms for the complexity of operators from circuit complexity; (ii) define the complexity in continuous QFT systems based on those minimal axioms and smoothness (from continuity); and (iii) use general symmetries of QFT to constrain the structure of the complexity. It will turn out that these steps enable us to determine the complexity of SU(n) operators uniquely.

We want to emphasize that not all properties of circuit complexity survive in the complexity in QFT. The difference between discreteness and continuity leads to some essential differences in the properties of the complexity. For example, a few basic concepts in “circuit complexity” (computational complexity), such as “gates”, are not well defined in general QFT, so they should be modified or abandoned. Thus, we will keep only the most essential properties of circuit complexity (which will be abstracted into the axioms G1–G3 in the following section). As another essential ingredient from the QFT side, we will take advantage of basic symmetries of QFT, which may not be necessary in the case of quantum circuits or computer science. Because of these new inputs from QFT, some properties of the complexity in QFT that we obtain may be incompatible with quantum circuits or qubit systems, but they are more appropriate for QFT.

Our work is also inspired by a geometric approach by Nielsen et al. [20,21,22], where the discrete circuit complexity for a target operator is identified with the minimal geodesic distance connecting the target operator and the identity in a certain Finsler geometry [23,24,25,26], which is just Riemannian geometry without the quadratic restriction. Recently, inspired by this geometric method, Refs. [13, 27,28,29,30] also investigated the complexity in QFT. However, in these studies the Finsler metric can be chosen arbitrarily, so the complexity has the shortcoming of depending on the choice of metric. In this paper, we show that, for SU(n) operators, the Finsler metric and the complexity are uniquely determined by four general axioms (denoted by G1–G4) and the basic symmetries of QFT.

In order to make our logic and claims clear, we show a schematic map of the logical structure of this paper in Fig. 1. We want to answer the following questions: (1) for which operators can we define complexity? (2) what basic properties should complexity satisfy? (3) for quantum field theory, what symmetries should appear in the complexity? (4) what can we obtain for complexity from the answers to the above three questions? (5) what are the similarities and differences compared with previous works?

Fig. 1

The logical flow of this paper. By answering the three basic questions at the far left we show that the complexity geometry is determined by a unique bi-invariant Finsler geometry. As an application of our formalism, we show that the Schrödinger equation for isolated systems can be obtained from a “minimal cost principle”. Although the bi-invariant geometry looks very different from the right-invariant k-local Riemannian geometry proposed by Ref. [27], we will show that for all k-local operators the two theories give equivalent results.

Section by section, this paper is organized as follows. In Sect. 2, we introduce minimal and basic concepts of the complexity and propose three axioms G1–G3 for the complexity of operators, which are inspired by circuit complexity. In Sect. 3, we show how the Finsler metric arises from G1–G2 and the smoothness of the complexity (G4). In Sect. 4, by using fundamental symmetry properties of QFT, we investigate constraints on the Finsler metric and the complexity. In particular, we show by several different approaches that the Finsler metric is bi-invariant, and that it is determined uniquely if we take the axiom G3 into account. We also compare our results with previous research regarding bi-invariance. In Sect. 5 we derive the explicit form of the Finsler metric of the SU(n) group. Thanks to the bi-invariance, the geodesic in the Finsler space of the SU(n) group (and hence the complexity) is easily computed. In Sect. 6, as one application of the geodesic in the bi-invariant Finsler metric, we propose a “minimal cost principle” as a new interpretation of the Schrödinger equation. In Sect. 7 we compare our complexity with the complexity for K-qubit systems. In Sect. 8 we conclude.

2 Axioms for the complexity of operators

2.1 Why unitary operators?

In order to make a good definition of the “complexity of operator” we first need to clarify what kind of “operator” we intend to deal with in this paper.

Intuitively, the complexity of an operator measures how “complex” a physical process is. Thus, the operator should correspond to a “realizable” physical process. This concept can be formulated as follows. An operator \(\hat{O}\) is called \(\varepsilon \)-realizable if there is at least one experimental quantum process \(\phi \) (for example, a quantum circuit) such that the following inequality holds for two arbitrary states \(|\psi _1\rangle \) and \(|\psi _2\rangle \)

$$\begin{aligned} \left| \langle \psi _2|\hat{O}|\psi _1\rangle -\langle \psi _2|\phi (\psi _1)\rangle \right| ^2\le \varepsilon , \end{aligned}$$
(2.1)

with \(\varepsilon > 0\). Here \(|\phi (\psi _1)\rangle \) is the output state of the quantum process \(\phi \) for an input state \(|\psi _1\rangle \), and \(\varepsilon \) is the tolerance when we use \(\phi \) to approximate (simulate) the target operator \(\hat{O}\). Any physical system \(\phi \) satisfying the inequality (2.1) is called an \(\varepsilon \)-realization of the operator \(\hat{O}\) and is denoted by \(\phi _{\varepsilon ,O}\). All \(\varepsilon \)-realizable operators form a set \(\mathcal {O}_\varepsilon \). If an operator is \(\varepsilon \)-realizable for arbitrary positive \(\varepsilon \), then we call it a realizable operator. For example, the identity \(\hat{\mathbb {I}}\), which just returns the input as the output, is a realizable operator. All the realizable operators form the set \(\mathcal {O}\). The quantum system \(\phi _{O}:=\lim _{\varepsilon \rightarrow 0^+}\phi _{\varepsilon ,O}\) is called a realization of the operator \(\hat{O}\).

Given the set of input states \(S_{\text {in}}\), the set of output states \(S_{\text {out}}\) produced by realizable operators \(\hat{O} \in \mathcal {O} \) can be expressed as

$$\begin{aligned} S_{\text {out}} =\mathcal {O}S_{\text {in}} :=\{\hat{O}|\psi \rangle |\forall \hat{O}\in \mathcal {O}, \forall |\psi \rangle \in S_{\text {in}}\}. \end{aligned}$$
(2.2)

We assume

$$\begin{aligned} S_{\text {out}}\subseteq S_{\text {in}}, \end{aligned}$$
(2.3)

which makes it possible to use the elements of the output set as new inputs. With the assumption (2.3), we can define products of operators. The product of elements in \(\mathcal {O}\) is defined as

$$\begin{aligned} (\hat{O}_1\hat{O}_2)|\psi \rangle :=\hat{O}_1(\hat{O}_2|\psi \rangle ),~\forall |\psi \rangle . \end{aligned}$$
(2.4)

By the definition of realizable operators, it can be shown that \(\hat{O}_1\hat{O}_2\) is realizable if \(\hat{O}_1\) and \(\hat{O}_2\) are both realizable operators. Thus, \(\mathcal {O}\) forms a monoid (semigroup with identity).

If we restrict physical processes to quantum mechanical processes, Eq. (2.1) implies that realizable operators are all unitary rather than Hermitian. In other words, our target is a property of the physical process rather than a direct observable. As quantum circuits are quantum mechanical processes and the Solovay–Kitaev theorem [31] says that all unitary operators can be approximated by some quantum circuits with any nonzero tolerance, we can conclude that the set of realizable operators is the set of unitary operators. As unitary operators are invertible, the set of realizable operators \(\mathcal {O}\) forms a (finite-dimensional or infinite-dimensional) unitary group.Footnote 2

2.2 Definitions and axioms

Intuitively speaking, the circuit complexity (or computational complexity) of a target operator (or computational task) is defined as the minimal number of fundamental gates (or fundamental steps) required to simulate the target operator (or finish the computational task). Based on this intuitive concept of complexity in quantum circuits and computations, we propose that the complexity defined on an arbitrary monoid \(\mathcal {O}\) should satisfy the following three axioms. We denote the complexity of an operator \(\hat{x}\) in a set of operators \(\mathcal {O}\) by \(\mathcal {C}(\hat{x})\).

G1 [Nonnegativity]: \(\forall \hat{x}\in \mathcal {O}\), \(\mathcal {C}(\hat{x})\ge 0\) and the equality holds iff \(\hat{x}\) is the identity.

G2 [Series decomposition rule (triangle inequality)]: \(\forall \hat{x},\hat{y}\in \mathcal {O}\), \(\mathcal {C}(\hat{x}\hat{y}) \le \mathcal {C}(\hat{x})+\mathcal {C}(\hat{y})\).

G3 [Parallel decomposition rule]: \(\forall (\hat{x}_1 ,\hat{x}_2) \in \mathcal {N}= \mathcal {O}_1 \times \mathcal {O}_2 \subseteq \mathcal {O}\), \(\mathcal {C}\big ((\hat{x}_1,\hat{x}_2)\big )=\mathcal {C}\big ((\hat{x}_1,\hat{\mathbb {I}}_2)\big )+\mathcal {C}\big ((\hat{\mathbb {I}}_1,\hat{x}_2)\big )\).

Here, in G2, it is possible that the operator \(\hat{x}\hat{y}\) can be decomposed in different ways, say \(\hat{x}' \hat{y}'\). In this case, G2 also reads \(\mathcal {C}(\hat{x}\hat{y}) = \mathcal {C}(\hat{x}'\hat{y}') \le \mathcal {C}(\hat{x}')+\mathcal {C}(\hat{y}')\). In G3, we consider the case that there is a sub-monoid \(\mathcal {N} \subseteq \mathcal {O}\) which can be decomposed into the Cartesian product of two monoids, i.e., \(\mathcal {N}= \mathcal {O}_1 \times \mathcal {O}_2 \). Here \(\hat{\mathbb {I}}_1\) and \(\hat{\mathbb {I}}_2\) are the identities of \(\mathcal {O}_1\) and \(\mathcal {O}_2\). The Cartesian product of two monoids implies that \((\hat{x}_1,\hat{x}_2 )(\hat{y}_1,\hat{y}_2)=(\hat{x}_1\hat{y}_1, \hat{x}_2\hat{y}_2)\) for arbitrary \((\hat{x}_1,\hat{x}_2), (\hat{y}_1,\hat{y}_2)\in \mathcal {N}\).

The axiom G1 is obvious by definition. We call the axiom G2 the “series decomposition rule” because the decomposition of the operator \(\hat{O}=\hat{x}\hat{y}\) into \(\hat{x}\) and \(\hat{y}\) is similar to the decomposition of a big circuit into a series of small circuits. Conversely, the ‘product’ of two operators corresponds to a serial connection of two circuits. The axiom G2 answers a basic question: what is the relationship between the complexities of two operators and the complexity of their product? Because the complexity is defined as a minimum, we require only an inequality in G2.Footnote 3 G2 leads to the familiar “triangle inequality” in the concept of distance (see F3 in Sect. 3), so it is also called the “triangle inequality”.

In contrast to G2 (series decomposition rule), we call the axiom G3 the “parallel decomposition rule”, which is chosen as one of the most basic axioms in defining complexity for the first time in this paper.Footnote 4 It comes from the following fundamental question: if an operator (task) \(\hat{O}\) contains two totally independent sub-operators (sub-tasks) \(\hat{x}_1\) and \(\hat{x}_2\), what should be the relationship between the total complexity and the complexities of the two sub-operators (sub-tasks)? Here, “totally independent” means that: (a) \(\hat{O}\) accepts two inputs and yields two outputs through \(\hat{x}_1\) and \(\hat{x}_2\), and (b) the inputs for \(\hat{x}_1\) (or \(\hat{x}_2\)) never affect the outputs of \(\hat{x}_2\) (or \(\hat{x}_1\)). See Fig. 2 for this explanation.

Fig. 2

Schematic diagram for the complexity of the Cartesian product and the parallel decomposition rule. As the two operators \(\hat{x}_1\) and \(\hat{x}_2\) are simulated independently, the minimal number of gates required for \((\hat{x}_1,\hat{x}_2)\) is the sum of the minimal numbers of gates required for \(\hat{x}_1\) and \(\hat{x}_2\). Thus, we have \(\mathcal {C}((\hat{x}_1,\hat{x}_2))=\mathcal {C}((\hat{x}_1,\hat{\mathbb {I}}_2))+\mathcal {C}((\hat{\mathbb {I}}_1,\hat{x}_2))\).

Mathematically, the construction of a bigger operator \(\hat{O}\) from \(\hat{x}_1\) and \(\hat{x}_2\) under the two requirements (a) and (b) corresponds to the Cartesian product, denoted by \(\hat{O}=(\hat{x}_1,\hat{x}_2)\). Note that the Cartesian product of two monoids does not correspond to the tensor product in a linear representation (i.e., a matrix representation). Instead, it corresponds to the direct sum. For example, if the matrices \(M_1\) and \(M_2\) are representations of the operators \(\hat{x}_1\) and \(\hat{x}_2\), then the representation of their Cartesian product \(\hat{O}\) is \(M_1\oplus M_2\), which is neither \(M_1\otimes M_2\) nor \(M_1M_2\).
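The distinction between the direct sum, the tensor product and the ordinary product can be made concrete with small matrices. The following is a minimal sketch in plain Python; the helper names (`matmul`, `identity`, `direct_sum`, `kron`) and the sample matrices are illustrative assumptions and not part of the formalism. It checks that the direct sum reproduces the componentwise product rule of the Cartesian product, and that \((\hat{x}_1,\hat{x}_2)\) factorizes as \((\hat{x}_1,\hat{\mathbb {I}}_2)(\hat{\mathbb {I}}_1,\hat{x}_2)\).

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def direct_sum(A, B):
    """Block-diagonal matrix A (+) B, representing the pair (A, B)."""
    n, m = len(A), len(B)
    return [row + [0] * m for row in A] + [[0] * n + row for row in B]

def kron(A, B):
    """Tensor (Kronecker) product, shown only for contrast."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

# Illustrative matrices (assumed for this example only).
X1 = [[0, 1], [1, 0]]
Y1 = [[1, 0], [0, -1]]
X2 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
Y2 = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

# The direct sum adds dimensions (2 + 3), the tensor product multiplies them (2 * 3).
dims = (len(direct_sum(X1, X2)), len(kron(X1, X2)))

# (x1, x2)(y1, y2) = (x1 y1, x2 y2) holds for the direct sum.
componentwise_ok = (matmul(direct_sum(X1, X2), direct_sum(Y1, Y2))
                    == direct_sum(matmul(X1, Y1), matmul(X2, Y2)))

# (x1, x2) = (x1, I_2)(I_1, x2), the factorization that appears in G3.
factorized_ok = (matmul(direct_sum(X1, identity(3)), direct_sum(identity(2), X2))
                 == direct_sum(X1, X2))

print(dims, componentwise_ok, factorized_ok)
```

Integer entries are used so that the comparisons are exact; the tensor product is computed only to show that it acts on a space of a different dimension.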

In the language of computer science, “totally independent” just means that one task contains two independent parallel tasks. Thus, the axiom G3 tries to answer the following question: if a task contains two parallel sub-tasks, what should be the relationship between the total complexity and the complexities of the sub-tasks? In mathematical terms, it amounts to asking: what should be the relationship between \(\mathcal {C}\big ((\hat{x}_1,\hat{x}_2)\big )\), \(\mathcal {C}\big ((\hat{x}_1,\hat{\mathbb {I}}_2)\big )\) and \(\mathcal {C}\big ((\hat{\mathbb {I}}_1,\hat{x}_2)\big )\)?

G3 answers this question by requiring that the complexity of two parallel tasks is the sum of their complexities, which is very natural. See Fig. 2 for a schematic explanation. In the matrix representation, G3 says that, for an operator \(M=M_1\oplus M_2\), \(\mathcal {C}(M_1\oplus M_2)=\mathcal {C}(M_1)+\mathcal {C}(M_2)\). It can be generalized to the direct sum of many operators: for a finite number of matrices \(M_1, M_2, \ldots , M_k\), we have

$$\begin{aligned} \text {Parallel decomposition rule } \mathbf {G3}: \quad \mathcal {C}\left( \bigoplus _{i=1}^k M_i\right) =\sum _{i=1}^k\mathcal {C}(M_i). \end{aligned}$$
(2.5)
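Eq. (2.5) is restrictive: not every reasonable-looking cost is additive under direct sums. The sketch below illustrates this in a toy model that is purely an assumption made for illustration. Diagonal unitaries \(\text {diag}(e^{i\theta _1},\ldots )\) are stored as lists of phases, and two hypothetical candidate costs (`l1_cost` and `l2_cost`) are compared. Neither is claimed to be the complexity derived in this paper.

```python
import math

def l1_cost(phases):
    """Hypothetical candidate cost: sum of the absolute phases."""
    return sum(abs(t) for t in phases)

def l2_cost(phases):
    """Hypothetical candidate cost: Euclidean norm of the phases."""
    return math.sqrt(sum(t * t for t in phases))

def direct_sum(p1, p2):
    """For diagonal matrices, M1 (+) M2 concatenates the two diagonals."""
    return p1 + p2

# Toy diagonal operators, given by their phases (assumed values).
M1 = [0.3, -0.3]
M2 = [0.5, 0.1, -0.6]
I1 = [0.0] * len(M1)   # identity on the first factor
I2 = [0.0] * len(M2)   # identity on the second factor

def g3_gap(cost):
    """C((M1, I2)) + C((I1, M2)) - C((M1, M2)); G3 requires this to vanish."""
    whole = cost(direct_sum(M1, M2))
    parts = cost(direct_sum(M1, I2)) + cost(direct_sum(I1, M2))
    return parts - whole

gap_l1 = g3_gap(l1_cost)
gap_l2 = g3_gap(l2_cost)
print(gap_l1, gap_l2)
```

In this toy setting the first candidate is additive up to rounding, while the second leaves a finite gap, so a Euclidean-type cost would be excluded by G3.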

One may worry about the self-consistency between G2 and G3 and argue that we can only require \(\mathcal {C}((\hat{O}_1, \hat{O}_2))\le \mathcal {C}(\hat{O}_1)+\mathcal {C}(\hat{O}_2)\), as there may be other operators \(\{\hat{O}_a, \hat{O}_a', \hat{O}_b, \hat{O}_b'\}\) that satisfy \((\hat{O}_a,\hat{O}_a') (\hat{O}_b,\hat{O}_b')=(\hat{O}_1, \hat{O}_2)\) while the total number of gates is less than \(\mathcal {C}(\hat{O}_1)+\mathcal {C}(\hat{O}_2)\). However, this is impossible. The sum of the minimal numbers of gates for \(\{\hat{O}_a, \hat{O}_a', \hat{O}_b, \hat{O}_b'\}\) is \(\mathcal {C}(\hat{O}_a)+\mathcal {C}(\hat{O}_a')+\mathcal {C}(\hat{O}_b)+\mathcal {C}(\hat{O}_b')\). By the product rule of the Cartesian product, \(\hat{O}_a\hat{O}_b=\hat{O}_1\) and \(\hat{O}_a'\hat{O}_b'=\hat{O}_2\), so applying G2 to each component we find that

$$\begin{aligned} \mathcal {C}(\hat{O}_a)+\mathcal {C}(\hat{O}_a')+\mathcal {C}(\hat{O}_b)+\mathcal {C}(\hat{O}_b')\ge \mathcal {C}(\hat{O}_1)+\mathcal {C}(\hat{O}_2) . \end{aligned}$$

Thus, \(\mathcal {C}(\hat{O}_1)+\mathcal {C}(\hat{O}_2)\) is the minimal number of gates needed to obtain \((\hat{O}_1, \hat{O}_2)\).

The axioms G1–G3 are satisfied by both circuit complexity and computational complexity. We have expressed the abstract concepts extracted from circuit complexity and computational complexity in mathematical language and will take them as three basic requirements for defining complexity in other systems as well. The axioms G1 and G2 are satisfied by Nielsen’s original works, Refs. [20,21,22], and by other recent approaches to complexity such as Refs. [13, 27,28,29,30]. However, these works did not take into account the question related to G3 and in general violate the requirement in axiom G3. From the viewpoint of quantum circuits (or computer science), series circuits (or tasks) and parallel circuits (or tasks) are two fundamental ways to decompose a bigger circuit (or task) into smaller ones. Thus, the axiom G3 should be as important as G2. In this paper, we propose the concept of G3 for the first time and show that it plays a crucial role in determining the form of the complexity of SU(n) operators. We may be able to modify G3 in a somewhat unnatural way, which will lead us to another form of the Finsler metric similar to (7.7). This point will be clarified in more detail in Ref. [33].

3 Emergence of the Finsler structure from the axioms for the complexity

In this section, we show that the Finsler metric arises from the minimal and general axioms for the complexity G1–G3 and the smoothness of the complexity. From here on, a group element may represent either an abstract object or a faithful representation; which one is meant will be clear from context.

In Sect. 2 we have shown that the realizable operators are unitary operators, so the question now becomes how to define the complexity for unitary operators. As the unitary operators \(\hat{U}\) and \(e^{i\theta } \hat{U}\) (with \(\theta \in (0,2\pi )\)) produce equivalent quantum states, the complexity of \(\hat{U}\) and \(e^{i\theta } \hat{U}\) should be the same. Thus it is enough to study the complexity for special unitary groups, SU(n). Ultimately, our aim is to investigate the complexity for operators in quantum field theory, whose Hilbert space is infinite-dimensional, so we have to deal with infinite-dimensional special unitary groups. However, they involve infinite-dimensional manifolds and have not been well studied so far, even in mathematics. As an intermediate step, in this paper we first present our whole theory for finite-dimensional cases and assume that the results can be generalized to infinite-dimensional cases by some suitable limiting procedure. The subtle differences between finite- and infinite-dimensional Lie groups are now under investigation [19].

For a given operator \(\hat{O}\in \text {SU}(n)\), as \(\text {SU}(n)\) is connected, there is a curve c(s) connecting \(\hat{O}\) and the identity \(\hat{\mathbb {I}}\). The curve may be parameterized by s with \(c(0)=\hat{\mathbb {I}}\) and \(c(1)=\hat{O}\). See Fig. 3. The tangent of the curve, \( \dot{c}(s)\), is assumed to be given by a right generator \(H_r(s)\) or a left generator \(H_l(s)\):

$$\begin{aligned} \dot{c}(s) =H_r(s)c(s) \ \ \ \mathrm {or} \ \ \ \dot{c}(s) =c(s)H_l(s). \end{aligned}$$
(3.1)

This curve can be approximated by discrete forms:

$$\begin{aligned} \hat{O}_n&=c(s_n) \end{aligned}$$
(3.2)
$$\begin{aligned}&= \delta \hat{O}_n^{(r)}\hat{O}_{n-1} \end{aligned}$$
(3.3)
$$\begin{aligned}&=\hat{O}_{n-1} \delta \hat{O}_n^{(l)}, \end{aligned}$$
(3.4)

where \(s_n=n/N\), \(n=1,2,3,\ldots , N\), \(\hat{O}_{0}=\hat{\mathbb {I}}\) and \(\delta \hat{O}_{n}^{(\alpha )}=\exp [H_\alpha (s_{n})\delta s]\) with \(\alpha =\) r or l and \(\delta s=1/N\). In general, the two generators \(H_r(s)\) and \(H_l(s)\) at the same point of the same curve can be different, i.e., \(H_r(s)\ne H_l(s)\). In fact, from Eq. (3.1), we see that \(H_r(s)\) is an adjoint transformation of \(H_l(s)\),

$$\begin{aligned} H_r(s)=c(s) H_l(s)c(s)^{-1}. \end{aligned}$$
(3.5)
Fig. 3

A curve c(s) connects the identity and a particular operator \(\hat{O}\) with the endpoints \(c(0)=\hat{\mathbb {I}}\) and \(c(1)=\hat{O}\). This curve can be approximated by a discrete form. Every endpoint is also an operator, which is labeled by \(\hat{O}_n\).

Fig. 4

Schematic diagram for two different generators in quantum circuits. To obtain the same target operator \(\hat{O}\) from the quantum circuit \(\phi _0\), we have two different ways to add new circuits.

The availability of two different generators can also be understood through a quantum circuit approximation to an operator, say \(\hat{O}\). As shown in Fig. 4, if a quantum circuit \(\phi _0\) is given, the operator \(\hat{O}\) can be constructed in two ways: (i) by adding a new quantum circuit \(\phi _1\) after the output of \(\phi _0\) (corresponding to Eq. (3.3)) or (ii) by adding a new quantum circuit \(\phi _2\) before the input of \(\phi _0\) (corresponding to Eq. (3.4)). Previous works such as Refs. [20,21,22, 27] assumed that new operators/circuits could appear only on the output side of the original operators/circuits, which corresponds to Eq. (3.3). This is one mathematically allowed choice, but there is no a priori or physical reason for that particular choice. Eq. (3.4) should be equally acceptable.
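The relation (3.5) between the two generators can be checked numerically on a simple curve. The following is a minimal sketch for SU(2) in plain Python. The curve \(c(s)=e^{As}e^{Bs}\), the numerical values and all helper names are assumptions chosen for illustration; the closed form used in `su2_exp` relies on \(H^2=-\det (H)\,\hat{\mathbb {I}}\), which holds for traceless \(2\times 2\) matrices.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

def su2_generator(a, b, c):
    """Anti-Hermitian traceless matrix i(a*sigma_x + b*sigma_y + c*sigma_z)."""
    return [[1j * c, 1j * a + b], [1j * a - b, -1j * c]]

def su2_exp(H, s):
    """exp(s*H) for a nonzero su(2) generator H, using H^2 = -det(H) * I."""
    theta = math.sqrt((H[0][0] * H[1][1] - H[0][1] * H[1][0]).real)
    cos_part, sin_part = math.cos(theta * s), math.sin(theta * s) / theta
    return [[(cos_part if i == j else 0.0) + sin_part * H[i][j]
             for j in range(2)] for i in range(2)]

# Assumed sample curve c(s) = exp(A s) exp(B s), evaluated at s = 0.5.
A = su2_generator(0.7, 0.0, 0.0)
B = su2_generator(0.0, 0.0, 0.4)
s = 0.5

c = matmul(su2_exp(A, s), su2_exp(B, s))
c_inv = dagger(c)                          # c is unitary, so c^{-1} = c^dagger
c_dot = add(matmul(A, c), matmul(c, B))    # derivative of c(s) by the product rule

H_r = matmul(c_dot, c_inv)                 # right generator: c_dot = H_r c
H_l = matmul(c_inv, c_dot)                 # left generator:  c_dot = c H_l

residual_35 = max_abs_diff(H_r, matmul(matmul(c, H_l), c_inv))  # Eq. (3.5)
left_right_gap = max_abs_diff(H_r, H_l)
print(residual_35, left_right_gap)
```

For this curve the two generators differ, yet they are related by conjugation with \(c(s)\), as Eq. (3.5) states.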

The axioms G1–G3 are suitable for arbitrary monoids, both discrete and continuous. Since the SU(n) group is a manifold, it is natural to expect that the complexity on it is a smooth function. In fact, it turns out to be enough to assume a weaker form:

G4 [Smoothness]: The complexity of any infinitesimal operator in SU(n), \(\delta \hat{O}^{(\alpha )} = \exp (H_\alpha \delta s)\), is a smooth function of only \(H_\alpha \ne 0\) and \(\delta s \ge 0\), i.e.,

$$\begin{aligned} \mathcal {C} (\delta \hat{O}^{(\alpha )}) = \mathcal {C}(\hat{\mathbb {I}}) + \tilde{F}(H_\alpha ) \delta s + \mathcal {O} (\delta s^2), \end{aligned}$$
(3.6)

where \(\tilde{F}(H_\alpha ) := \partial _{\delta s} \mathcal {C} (\delta \hat{O}^{(\alpha )})|_{\delta s =0} \) and \(\mathcal {C}(\hat{\mathbb {I}}) = 0\) by G1.

This is our fourth axiom. Notice that \(\mathcal {C}(\delta \hat{O}^{(r)})=\mathcal {C}(\delta \hat{O}^{(l)})\) if \(\delta \hat{O}^{(r)}=\delta \hat{O}^{(l)}\), which implies that an infinitesimal operator gives the same contribution to the total complexity whether it is added on the left side or on the right side.Footnote 5 Thus, the index \(\alpha \) is in fact not necessary in this case, but we keep it for notational consistency.

Let us define the cost (\(L_\alpha [c]\)) of a particular curve c, constructed by only \(\delta \hat{O}_n^{(r)}\) or only \(\delta \hat{O}_n^{(l)}\), as

$$\begin{aligned} L_\alpha [c] := \sum _{i=1}^N\mathcal {C} (\delta \hat{O}_i^{(\alpha )}) \xrightarrow {N \rightarrow \infty } \int _0^1 \tilde{F} (H_\alpha (s)) \text {d}s. \end{aligned}$$
(3.7)
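The limit in Eq. (3.7) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the Frobenius norm \(\sqrt{\text {Tr}(HH^\dagger )}\) is used as a stand-in for \(\tilde{F}\), whose explicit form has not been determined at this stage, and the generator \(H(s)=(1+s)A\) with a fixed anti-Hermitian \(A\) is a hypothetical choice. Each step contributes its leading-order cost \(\tilde{F}(H(s_n))\,\delta s\) from Eq. (3.6).

```python
import math

def frobenius_norm(H):
    """Stand-in for F-tilde (an assumption): sqrt(Tr(H H^dagger))."""
    return math.sqrt(sum(abs(x) ** 2 for row in H for x in row))

# Fixed anti-Hermitian direction A = 0.7 * i * sigma_x (illustrative choice).
A = [[0j, 0.7j], [0.7j, 0j]]

def generator(s):
    """Assumed s-dependent generator H(s) = (1 + s) * A."""
    return [[(1.0 + s) * x for x in row] for row in A]

def discrete_cost(N):
    """Sum of the leading-order step costs F(H(s_n)) * delta_s with s_n = n / N."""
    ds = 1.0 / N
    return sum(frobenius_norm(generator(n * ds)) * ds for n in range(1, N + 1))

# Integral of (1 + s) * F(A) over [0, 1], using positive homogeneity of the norm.
exact = 1.5 * frobenius_norm(A)

errors = {N: abs(discrete_cost(N) - exact) for N in (10, 100, 1000)}
print(errors)
```

The deviation from the integral shrinks roughly like \(1/N\), consistent with the \(\mathcal {O}(\delta s^2)\) remainder per step in Eq. (3.6).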

Geometrically, it is the length of the particular curve, and \(\tilde{F} \text {d}s \) looks like a line element in some geometry. Thus, the natural question is: what kind of geometry is allowed for complexity? We will show that it is Finsler geometry, which emerges naturally from our axioms for the complexity.

First, we can prove that \(\tilde{F}\) satisfies three properties:

F1 (Nonnegativity): \(\tilde{F}(H_\alpha )\ge 0\) and \(\tilde{F}(H_\alpha )=0\) iff \(H_\alpha =0\)

F2 (Positive homogeneity): \(\forall \lambda \in \mathbb {R}^+\), \(\tilde{F}(\lambda H_\alpha )=\lambda \tilde{F}(H_\alpha )\)

F3 (Triangle inequality): \(\tilde{F}(H_{\alpha ,1})+\tilde{F}(H_{\alpha ,2})\ge \tilde{F}(H_{\alpha ,1}+H_{\alpha ,2})\)

using only G1, G2 and G4 (see Appendix B for a proof). Note that F1–F3 are the properties that a ‘norm’ of vectors in a vector space is expected to satisfy. In our case, the vector space is the Lie algebra (the tangent space at the identity) and the generators (\(H_\alpha \)) of the algebra are the vectors. Indeed, a ‘norm’ satisfying the properties F1–F3 is called a Minkowski norm in the mathematical literature. Once we know \(H_\alpha (s)\) we can compute the length of the line element by a Minkowski norm \(\tilde{F}\). (At this stage, we do not know the explicit form of the Minkowski norm, but we will determine it later.)
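As a concrete illustration of F1–F3, one can check them for a familiar norm on sample generators. The sketch below uses the Frobenius norm on su(2) purely as an example of a Minkowski norm; it is an assumption for illustration and not the norm determined later in the paper, and the helper names and numerical values are hypothetical.

```python
import math

def frobenius_norm(H):
    """Example Minkowski norm (an assumption): sqrt(Tr(H H^dagger))."""
    return math.sqrt(sum(abs(x) ** 2 for row in H for x in row))

def su2_generator(a, b, c):
    """Anti-Hermitian traceless matrix i(a*sigma_x + b*sigma_y + c*sigma_z)."""
    return [[1j * c, 1j * a + b], [1j * a - b, -1j * c]]

def scale(lam, H):
    return [[lam * x for x in row] for row in H]

def add(H1, H2):
    return [[H1[i][j] + H2[i][j] for j in range(2)] for i in range(2)]

# Sample generators (assumed values).
H1 = su2_generator(0.3, -0.5, 0.8)
H2 = su2_generator(-0.7, 0.2, 0.4)
zero = su2_generator(0.0, 0.0, 0.0)
lam = 2.5

# F1: positive on a nonzero generator, zero on the zero generator.
f1_ok = frobenius_norm(H1) > 0 and frobenius_norm(zero) == 0.0

# F2: positive homogeneity, F(lam * H) = lam * F(H) for lam > 0.
f2_ok = abs(frobenius_norm(scale(lam, H1)) - lam * frobenius_norm(H1)) < 1e-12

# F3: triangle inequality, F(H1) + F(H2) >= F(H1 + H2).
f3_ok = frobenius_norm(H1) + frobenius_norm(H2) >= frobenius_norm(add(H1, H2))

print(f1_ok, f2_ok, f3_ok)
```

A check on a few samples is of course not a proof; the general statement is the one established in Appendix B.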

For a given \(\tilde{F}\), we have two different natural ways to extend the Minkowski norm \(\tilde{F}\) at the identity to every point on the base manifold via arbitrary curves:

$$\begin{aligned} F_r(c, \dot{c}) := \tilde{F} (H_r)=\tilde{F}(\dot{c} c^{-1}), \ \ \mathrm {or} \ \ F_l(c, \dot{c}) := \tilde{F} (H_l)=\tilde{F}(c^{-1}\dot{c}), \end{aligned}$$
(3.8)

where we introduce a new notation ‘\(F_\alpha (c, \dot{c})\)’, a standard notation for the Finsler metric in mathematics. The introduction of ‘\(F_\alpha (c, \dot{c})\)’ is justified because the Finsler metric is nothing but a Minkowski norm defined at all points on the base manifold, and Eq. (3.8) explains how to assign the Minkowski norm to all the other points. We refer to Refs. [23,24,25,26] for an introduction to the Minkowski norm and Finsler geometry.Footnote 6 A brief introduction to Finsler geometry can be found in Appendix A.

These Finsler metrics have an invariance property. \(F_r(c, \dot{c})\) is right-invariant because \(H_r\) is invariant under the right-translation \(c \rightarrow c \hat{x}\) for all \(\hat{x} \in \) SU(n). Similarly, \(F_l(c, \dot{c})\) is left-invariant because \(H_l\) is invariant under the left-translation \(c \rightarrow \hat{x} c\) for all \(\hat{x} \in \) SU(n). If there is no further restriction on \(F_\alpha \), there are at least two natural Finsler geometries, \(F_r\) and \(F_l\), which may give different costs or lengths.

Finally, the left or right complexity of an operator (\(\mathcal {C}_\alpha (\hat{O})\)) is identified with the minimal length (or minimal cost) of the curves connecting \(\hat{\mathbb {I}}\) and \(\hat{O}\):

$$\begin{aligned} \mathcal {C}_\alpha (\hat{O}) :=\min \{L_\alpha [c]|~\forall c(s),~c(0)=\hat{\mathbb {I}},~c(1)=\hat{O}\}. \end{aligned}$$
(3.9)

We see that, even if we know the complexity of every infinitesimal operator (Eq. (3.6)), we have at least two different ways (left or right-way) to define the complexity of an operator and there is no a priori preferred choice among them. In order for the complexity of an operator to be a well-defined physical observable, this mathematical ambiguity should disappear naturally by some suitable physical considerations. In the following section we will show how this ambiguity is removed.

4 Symmetries of the complexity inherited from QFT symmetries

In the previous section, we have shown that the complexity can be computed by the minimal length of curves in Finsler geometry. We want to emphasize again that in our work the Finsler structure is not assumed, but it has been derived based on G1, G2 and G4. This is a novel feature of our work compared to other works dealing with the Finsler geometry.

However, apart from the defining properties of the Finsler metric, F1–F3, we do not know anything about \(\tilde{F}(H_\alpha )\) so far. In this section, we will show that there are constraints on \(\tilde{F}(H_\alpha )\) if we take into account some symmetries of QFT. This is another important novel feature of our work compared to others. From here on, we do not rely on properties of discrete systems or circuit models, which may be incompatible with QFT and so may mislead us. We will deal directly with QFT and its symmetry properties and see what kind of constraints we can impose on \(\tilde{F}(H_\alpha )\).

Note that such symmetry considerations are not necessary if we use “complexity” as a purely mathematical tool, for example, to study the “NP-completeness” and to analyze how complex an algorithm or a quantum circuit is. However, when we use the complexity to study real physical processes and try to treat the complexity as a basic physical variable hiding in physical phenomena, symmetries relevant to physical phenomena will be a necessary requirement.

In Sect. 4.1, by requiring unitary invariance for complexity we find

$$\begin{aligned} \text {[Independence of left/right generators]} \quad \tilde{F}(H_l)=\tilde{F}(H_r). \end{aligned}$$
(4.1)

It means that the complexity does not depend on our choice of \(H_r\) or \(H_l\). Therefore, we call this property ‘independence of left/right generators’ of \(\tilde{F}\). Recall that for a given curve, we may have two metrics, either \(F_r(c,\dot{c})=\tilde{F}(H_r)\) or \(F_l(c,\dot{c})=\tilde{F}(H_l)\). This is an inherent mathematical ambiguity, but it can be removed by imposing a physical condition, unitary invariance. To support our result (4.1) we will present three more arguments in Sect. 4.2 and Appendix D. Note that the constraint (4.1) also implies that the Finsler geometry is bi-invariant, meaning both right- and left-invariant.

In Sect. 4.3, by requiring the CPT symmetry,Footnote 7 we obtain

$$\begin{aligned} \text {[reversibility]}\quad \tilde{F}(H_\alpha ) = \tilde{F}(-H_\alpha ). \end{aligned}$$
(4.2)

We call this property ‘reversibility’ of \(\tilde{F}\) following the mathematical literature, for example, [25]. In Appendix D, we will provide two more methods to support Eq. (4.2). Geometrically speaking, for a given path connecting A and B, it is the constraint Eq. (4.2) that gives the same length when we go from A to B and from B to A.

4.1 Independence of left/right generators from unitary invariance

In this subsection we consider the effect of the unitary invariance of the quantum field theory on the Finsler metric, cost, and complexity. Let us consider an arbitrary quantum field \(\Phi \) with a Hilbert space \(\mathcal {H}\) and a vacuum \(|\Omega \rangle \), which are collectively denoted by \(\{\Phi ,\mathcal {H}, |\Omega \rangle \}\). Its unitary partner is \(\tilde{\Phi }(\vec {x},t):=\hat{U}\Phi (\vec {x},t)\hat{U}^{\dagger }\), \(\tilde{\mathcal {H}}:=\{\hat{U}|\psi \, \rangle |\, \forall |\psi \rangle \in \mathcal {H}\}\) and \(|\tilde{\Omega }\rangle :=\hat{U}|\Omega \rangle \), which are denoted by \(\{\tilde{\Phi },\tilde{\mathcal {H}},|\tilde{\Omega }\rangle \}\).

In the Heisenberg picture, the dynamics of the quantum field \(\Phi \) is governed by a time evolution operator c(t):

$$\begin{aligned} \Phi (\vec {x},t)=c(t)^\dagger \Phi (\vec {x},0)c(t). \end{aligned}$$
(4.3)

The time evolution of its unitary partner \(\tilde{\Phi }\) is

$$\begin{aligned} \begin{aligned} \tilde{\Phi }(\vec {x},t)&=\hat{U}c(t)^\dagger \hat{U}^{\dagger }\hat{U}\Phi (\vec {x},0)\hat{U}^\dagger \hat{U}c(t)\hat{U}^{\dagger }\\&=\tilde{c}(t)^\dagger \tilde{\Phi }(\vec {x},0)\tilde{c}(t), \end{aligned} \end{aligned}$$
(4.4)

where

$$\begin{aligned} \tilde{c}(t):=\hat{U}c(t)\hat{U}^{\dagger }, \end{aligned}$$
(4.5)

so the evolution of the unitary partner \(\tilde{\Phi }\) is given by \(\tilde{c}(t)\). On the other hand, we cannot distinguish \(\{\Phi ,\mathcal {H},|\Omega \rangle \}\) and its unitary partner \(\{\tilde{\Phi },\tilde{\mathcal {H}},|\tilde{\Omega }\rangle \}\) in the sense that any physical experiment will be invariant under the transformation \(\{\Phi ,\mathcal {H},|\Omega \rangle \}\rightarrow \{\tilde{\Phi },\tilde{\mathcal {H}},|\tilde{\Omega }\rangle \}\). We will call this invariance “unitary-invariance”. Thus, it is natural to expect that the cost cannot distinguish them either, i.e.Footnote 8

$$\begin{aligned} L_\alpha [c]=L_\alpha [\hat{U}c(t)\hat{U}^{\dagger }],~~\forall \hat{U}\in \text {SU({ n})}. \end{aligned}$$
(4.6)

To extract a constraint on \(\tilde{F}\) imposed by Eq. (4.6), it is enough to consider a special curve generated by an arbitrary constant generator \(H_\alpha \)

$$\begin{aligned} c(t)=\exp ( H_\alpha t), \end{aligned}$$
(4.7)

with \(t\in [0,1]\). By the definition of the cost, Eq. (3.7), we have

$$\begin{aligned} \int _0^1\tilde{F}(H_\alpha )\text {d}t=\int _0^1\tilde{F}(\hat{U}H_\alpha \hat{U}^\dagger )\text {d}t, \end{aligned}$$
(4.8)

which implies

$$\begin{aligned} \text {[adjoint invariance]}\quad \tilde{F}(H_\alpha ) = \tilde{F}(\hat{U}H_\alpha \hat{U}^\dagger ), \qquad \forall \hat{U} \in \text {SU({ n})}. \end{aligned}$$
(4.9)
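For completeness, the step from Eq. (4.6) to Eq. (4.8) can be spelled out as follows (a short worked step, using only Eq. (4.7) and the fact that conjugation commutes with the matrix exponential). Conjugating the curve (4.7) gives

$$\begin{aligned} \hat{U}c(t)\hat{U}^{\dagger }=\hat{U}\exp (H_\alpha t)\hat{U}^{\dagger }=\exp (\hat{U}H_\alpha \hat{U}^{\dagger }\, t), \end{aligned}$$

so the conjugated curve is again generated by a constant generator, namely \(\hat{U}H_\alpha \hat{U}^{\dagger }\), whether one uses the left or the right generator. Both integrands in Eq. (4.8) are therefore independent of t, and the equality of the two integrals is equivalent to Eq. (4.9).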

As \(H_r\) is just one adjoint transformation of \(H_l\) (see Eq. (3.5)), it follows that

$$\begin{aligned} \tilde{F}(H_r)=\tilde{F}(H_l) \quad \text {or} \quad F_r(c,\dot{c})=F_l(c,\dot{c}), \end{aligned}$$
(4.10)

where Eq. (3.8) is used. It means that the left generator and the right generator give the same complexity. Although we have the freedom to choose the left or right generator, the complexity will be independent of our choice. In other words, if we know the complexity for every infinitesimal operator (Eq. (3.6)), then we have a unique value of the complexity in spite of the inherent ambiguity due to the availability of the left and right generators. In Fig. 5, we summarize the relation between the constraints on the Finsler metric, cost, and complexity. One important consequence of Eq. (4.10) is that the Finsler geometry is bi-invariant, which means both right and left invariant. This property will be very useful when we determine the geodesic in the geometry in Sect. 5.2.

Fig. 5 Equivalences between (i) the unitary-invariance of QFT, (ii) the independence of the Finsler metric on the left/right generators, and (iii) the adjoint invariance of the complexity

One may argue that the complexity may not be directly observable and it is possible that c(s) and \(\tilde{c}(s)\) give different complexity. If that happens in some framework of computing the complexity, in our opinion, there must be some gauge freedom in the definition of the complexity in the framework, for the complexity still to be a physical object. Thus, we will be able to make a suitable gauge fixing or a redefinition of the complexity so that this “new complexity” is physical and satisfies Eq. (4.6).

Fig. 6 Schematic diagram for the relation and difference between qubit systems and SU(n) groups

4.2 Comparison between SU(n) groups and qubit systems

In order to clarify why the adjoint invariance of the complexity is natural for SU(n) groups, although it may fail in discrete systems, we compare SU(n) groups and qubit systems in Fig. 6. For qubit systems, the set of operators forms a countable monoid \(\mathcal {O}\) and can be obtained from a countable set of fundamental gates g. The complexity of any operator in \(\mathcal {O}\) is the minimal number of gates from g needed to form the target operator. For SU(n) groups, the set of operators forms the SU(n) Lie group and the fundamental gates are replaced by infinitesimal operators, which form the Lie algebra \(\mathfrak {su}(n)\).

For qubit systems, suppose that the complexity measured by gates set g is \(\mathcal {C}_{\alpha }(g;\hat{O})\) (here \(\alpha =r,l\)). If we make a “global” unitary transformation on the operators set and gates set together, i.e., \(\tilde{\mathcal {O}}:=\hat{U} \mathcal {O}\hat{U}^{\dagger }\) and \(\tilde{g}:=\hat{U} g \hat{U}^{\dagger }\) we have the following trivial equality

$$\begin{aligned} \mathcal {C}_{\alpha }(g;\hat{O})=\mathcal {C}_{\alpha }(\tilde{g};\tilde{\hat{O}}), \end{aligned}$$
(4.11)

where \(\mathcal {C}_{\alpha }(\tilde{g};\tilde{\hat{O}})\) denotes the complexity of any \(\tilde{\hat{O}}\in \tilde{\mathcal {O}}\) measured by \(\tilde{g}\).

In general, the gates set is not invariant under the unitary transformation, i.e.

$$\begin{aligned} \tilde{g} =\hat{U} g \hat{U}^{\dagger }\ne g, ~~~~\text {({ g} is the gates set rather than a gate)} \end{aligned}$$
(4.12)

so we will see that the complexity of \(\hat{O}\) and \(\tilde{\hat{O}}\), measured by same gates set g, will not be the same, i.e.

$$\begin{aligned} \mathcal {C}_{\alpha }(g;\hat{O})=\mathcal {C}_{\alpha }(\tilde{g};\tilde{\hat{O}})\ne \mathcal {C}_{\alpha }(g;\tilde{\hat{O}}) \end{aligned}$$
(4.13)

This shows that the complexity for qubit system will not be invariant under \(\hat{O} \rightarrow \hat{U}\hat{O}\hat{U}^\dagger \) if we use the same gates set.
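A toy single-qubit illustration of Eqs. (4.11) to (4.13) is sketched below. The gate set {X, S}, the conjugating unitary (a Hadamard matrix) and the brute-force helper `gate_count` are illustrative assumptions, not taken from the text; the helper counts exact products only, up to floating-point round-off.

```python
import itertools
import numpy as np

def gate_count(gates, target, max_depth=6):
    """Brute-force minimal number of gates whose product equals `target`.

    Returns None if no word of length <= max_depth reproduces the target.
    """
    dim = target.shape[0]
    if np.allclose(np.eye(dim), target):
        return 0
    for depth in range(1, max_depth + 1):
        for word in itertools.product(gates, repeat=depth):
            prod = np.eye(dim, dtype=complex)
            for gate in word:
                prod = gate @ prod
            if np.allclose(prod, target):
                return depth
    return None

# Hypothetical gate set g = {X, S} and conjugating unitary U (Hadamard).
X = np.array([[0, 1], [1, 0]], dtype=complex)
S = np.diag([1, 1j])
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def conj(op):
    """Adjoint action op -> U op U^dagger."""
    return U @ op @ U.conj().T

g = [X, S]
g_tilde = [conj(gate) for gate in g]
O = X
O_tilde = conj(O)          # equals the Pauli matrix Z

c_original = gate_count(g, O)            # C(g; O)
c_both = gate_count(g_tilde, O_tilde)    # C(g~; O~), equal to C(g; O)
c_mixed = gate_count(g, O_tilde)         # C(g; O~), needs S.S = Z
```

In this toy case the first two counts are both 1 while the third is 2, which is the pattern of Eq. (4.13).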

For SU(n) groups, we still obtain an equation similar to Eq. (4.11),

$$\begin{aligned} \mathcal {C}_{\alpha }(\mathfrak {su}(n);\hat{O})=\mathcal {C}_{\alpha }(\widetilde{\mathfrak {su}}(n);\tilde{\hat{O}}). \end{aligned}$$
(4.14)

However, unlike Eq. (4.12) in qubit systems, we have the following equalityFootnote 9

$$\begin{aligned} \widetilde{\mathfrak {su}}(n):=\hat{U} \mathfrak {su}(n)\hat{U}^{\dagger }=\mathfrak {su}(n),~~\forall \hat{U}\in \text {SU({ n})}. \end{aligned}$$
(4.15)

Thus, we see that Eq. (4.14) implies that

$$\begin{aligned} \mathcal {C}_{\alpha }(\mathfrak {su}(n);\hat{O})=\mathcal {C}_{\alpha }(\mathfrak {su}(n);\tilde{\hat{O}}),~~\forall \hat{O}\in \text {SU({ n})}, \end{aligned}$$
(4.16)

which means the complexity of SU(n) group will be invariant under the adjoint transformation, \(\hat{O} \rightarrow \hat{U}\hat{O}\hat{U}^\dagger \).

It is the difference between Eqs. (4.12) and (4.15) that leads to the difference between qubit systems and SU(n) regarding the invariance under adjoint transformations. Because Eq. (4.16) is valid also for any infinitesimal operator, it implies Eq. (4.6). This is another derivation of Eq. (4.6). We have presented two arguments to support the idea that the complexity of SU(n) group should be invariant under adjoint transformations. In Appendix D, we will give the third and the fourth arguments to support this conclusion.

To understand the validity of the adjoint invariance of the complexity, one useful question is the following: what will happen if we restrict our operators set to some subgroup of SU(n)? Let G be a connected real subgroup and let its Lie algebra be \(\mathfrak {g}\). In this case, we can still obtain the following equation under a general unitary transformation \(\tilde{G}=\hat{x}G\hat{x}^{-1}\) and \(\tilde{\mathfrak {g}}=\hat{x}\mathfrak {g}\hat{x}^{-1}\),

$$\begin{aligned} \mathcal {C}_{\alpha }(\mathfrak {g};\hat{O})=\mathcal {C}_{\alpha }(\widetilde{\mathfrak {g}};\tilde{\hat{O}}),~~\forall \hat{x}\in \text {SU({ n})},~\forall \hat{O}\in G. \end{aligned}$$
(4.17)

If \(\mathfrak {g}\) is an ideal of \(\mathfrak {su}(n)\), \(\widetilde{\mathfrak {g}}=\mathfrak {g}\) for all \(\hat{x}\in \)SU(n). However, because \(\mathfrak {su}(n)\) is simple, it has no ideals other than the trivial \(\{0\}\) and itself. Thus, if we restrict the operators set to a proper connected real subgroup of SU(n), the complexity may not be invariant under an adjoint transformation. For qubit systems such as a quantum circuit, the gates set is discrete, which can form only a subgroup of SU(n). As the only proper normal subgroups of SU(n) lie in its finite center, the complexity for qubit systems is not invariant under the general adjoint transformation.

4.3 Reversibility of Finsler metric from the CPT symmetry

In this subsection we consider the effect of the CPT symmetry of the quantum field theory on the Finsler metric, cost, and complexity. Let us denote the CPT partner of \(\Phi (\vec {x},t)\) by \(\bar{\Phi }(\vec {x},t)\), i.e. \(\bar{\Phi }(\vec {x},t) :=C\circ P\circ T[\Phi (\vec {x},t)]=C[\Phi (-\vec {x},-t)]\). By using Eq. (4.3), we have

$$\begin{aligned} \begin{aligned} \bar{\Phi }(\vec {x},t)&= C\circ P\circ T[c(t)^\dagger \Phi (\vec {x},0)c(t)] \\&=c(-t)^\dagger C[\Phi (-\vec {x},0)]c(-t) \\&=c(-t)^\dagger \bar{\Phi }(\vec {x},0)c(-t), \end{aligned} \end{aligned}$$
(4.18)

where we use the fact that c(t) carries no charge and does not depend on the spatial variable \(\vec {x}\). Thus, the evolution of the CPT partner is given by \(\bar{c}(t):=c(-t)\). Given the CPT symmetry of the theory, it is natural to assume that the costs of c(t) and \(\bar{c}(t)\) are the same, i.e.,

$$\begin{aligned} L_\alpha [c(t)]=L_\alpha [c(-t)]. \end{aligned}$$
(4.19)

Similarly to the unitary symmetry case in Sect. 4.1, as a way to understand the general structure of \(\tilde{F}\), we consider a special curve, the time evolution given by an arbitrary constant generator \(H_\alpha \). Because the generators of \(\bar{c}(s)\) are given by \(\bar{H}_\alpha =-H_\alpha \), Eq. (4.19) reads, by the definition of the cost Eq. (3.7),

$$\begin{aligned} \int _0^1\tilde{F}(H_\alpha )\text {d}t=\int _0^1\tilde{F}(-H_\alpha )\text {d}t, \end{aligned}$$
(4.20)

which implies

$$\begin{aligned} \tilde{F}(H_\alpha )=\tilde{F}(-H_\alpha ). \end{aligned}$$
(4.21)

Path-reversal symmetry If we combine the results of the CPT symmetry and the unitary symmetry, Eqs. (4.21) and (4.9) respectively, we can prove the “path-reversal symmetry” for an arbitrary curve:

$$\begin{aligned} L_\alpha [c]=L_{\alpha }[c^{-1}], \qquad \forall c(s). \end{aligned}$$
(4.22)

Note that in general \(c^{-1}(s):=[c(s)]^{-1}\) is not the curve generated by \(-H_\alpha (s)\) when the curve c(s) is generated by \(H_\alpha (s)\). For example, for the right-invariant case, we can show

$$\begin{aligned} \tilde{F}(H_r (c^{-1})) = \tilde{F}(- c^{-1} H_r (c) c) = \tilde{F}( H_r (c)), \end{aligned}$$
(4.23)

which gives Eq. (4.22). Here, we used \(H_r(c^{-1}) = (\text {d}c^{-1}/\text {d}s) c = -c^{-1}(\dot{c} c^{-1}) c =-c^{-1} H_r(c) c\) in the first equality and Eqs. (4.21) and (4.9) in the second equality. In fact, the converse also holds, i.e. Eq. (4.22) implies Eqs. (4.21) and (4.9). First, by considering the special case \(c = e^{Hs}\) with a constant H, Eq. (4.21) can be derived from Eq. (4.22). Then, we end up with \(L_r[c] = L_l[c]\), which implies Eq. (4.9) by the same logic as in Sect. 4.1.
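The identity \(H_r(c^{-1})=-c^{-1}H_r(c)c\) used in Eq. (4.23) can be checked numerically on a curve with a non-constant generator. The sketch below assumes the explicit form \(\text {Tr}\sqrt{HH^\dagger }\) of Eq. (5.1) for \(\tilde{F}\), and the test curve \(c(s)=e^{sA}e^{sB}\) with random \(A,B\in \mathfrak {su}(3)\) is an illustrative choice; the helper names are hypothetical.

```python
import numpy as np

def random_su_algebra(n, rng):
    """Random traceless anti-Hermitian n x n matrix, i.e. an element of su(n)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (a - a.conj().T) / 2
    return h - np.trace(h) / n * np.eye(n)

def exp_su(h):
    """exp(h) for anti-Hermitian h, via the eigen-decomposition of the Hermitian matrix -i h."""
    w, v = np.linalg.eigh(-1j * h)
    return (v * np.exp(1j * w)) @ v.conj().T

def cost_density(h):
    """Tr sqrt(h h^dagger), computed as the sum of singular values of h."""
    return np.linalg.norm(h, "nuc")

rng = np.random.default_rng(0)
n, s = 3, 0.7
A, B = random_su_algebra(n, rng), random_su_algebra(n, rng)

# Curve c(s) = exp(sA) exp(sB) evaluated at one value of s.
eA, eB = exp_su(s * A), exp_su(s * B)
c = eA @ eB
c_inv = c.conj().T

# Right generators, differentiated analytically.
H_c = A + eA @ B @ eA.conj().T        # (dc/ds) c^{-1}
H_cinv = -B - eB.conj().T @ A @ eB    # (dc^{-1}/ds) c

identity_gap = np.linalg.norm(H_cinv + c_inv @ H_c @ c)    # tests H_r(c^-1) = -c^-1 H_r(c) c
cost_gap = abs(cost_density(H_cinv) - cost_density(H_c))   # tests Eq. (4.23)
not_just_minus = np.linalg.norm(H_cinv + H_c)              # generator of c^-1 differs from -H_r(c)
```

The first two quantities vanish to round-off, while the third does not, in line with the remark that \(c^{-1}(s)\) is in general not the curve generated by \(-H_\alpha (s)\).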

Thus, we have the following equivalence between the path-reversal symmetry and the adjoint invariance with the reversibility of the Finsler metric:

$$\begin{aligned}&\forall c(s),~~L_\alpha [c]=L_{\alpha }[c^{-1}]~\nonumber \\&\quad \Leftrightarrow ~ \left\{ \begin{aligned}&\tilde{F}(H_\alpha )=\tilde{F}(\hat{U}H_\alpha \hat{U}^\dagger ),~~~\forall H_\alpha , ~~ \forall \hat{U}\in \text {SU({ n})};\\&\tilde{F}(H_\alpha )=\tilde{F}(-H_\alpha ), \qquad ~ \forall H_\alpha . \end{aligned} \right. \end{aligned}$$
(4.24)

The path-reversal symmetry can also be justified in other ways, for example, based on the inverse symmetry of the relative complexity or the “ket-world”-“bra-world” symmetry. These two arguments are presented in detail in Appendix D. They may serve as further supporting evidence for Eqs. (4.21) and (4.9) because of the equivalence in Eq. (4.24).

5 Complexity of SU(n) operators

5.1 Finsler metric of SU(n) operators

From here we will drop the indexes \(\alpha , r\), and l based on Eq. (4.10). We have found two constraints Eqs. (4.9) and (4.21) by considering basic physical symmetries. These constraints (plus G3) turn out to be strong enough to determine the Finsler metric in the operator space of any SU(n) groups as follows

$$\begin{aligned} F(c(s),\dot{c}(s))= \tilde{F}(H(s)) = \lambda \text {Tr}\sqrt{H(s)H(s)^\dagger }, \end{aligned}$$
(5.1)

where \(H(s) = H_r(s)\) or \(H_l(s)\) for the curve c(s) and \(\lambda \) is an arbitrary constant (see Appendix C for a proof).
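As a quick consistency check, one can verify numerically that the metric (5.1) satisfies the two constraints (4.9) and (4.21), and hence gives the same value for the left and right generators. This is a minimal sketch: the random sampling, the helper names, and the convention \(H_l=c^{-1}H_rc\) for the relation in Eq. (3.5) are assumptions made for illustration.

```python
import numpy as np

def random_su_algebra(n, rng):
    """Random traceless anti-Hermitian n x n matrix, i.e. an element of su(n)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (a - a.conj().T) / 2
    return h - np.trace(h) / n * np.eye(n)

def exp_su(h):
    """exp(h) for anti-Hermitian h; the result is unitary with unit determinant."""
    w, v = np.linalg.eigh(-1j * h)
    return (v * np.exp(1j * w)) @ v.conj().T

def finsler(h, lam=1.0):
    """Eq. (5.1): lam * Tr sqrt(h h^dagger), i.e. lam times the sum of singular values."""
    return lam * np.linalg.norm(h, "nuc")

rng = np.random.default_rng(1)
n = 4
H = random_su_algebra(n, rng)
U = exp_su(random_su_algebra(n, rng))   # random element of SU(n)
c = exp_su(random_su_algebra(n, rng))   # a point on some curve

adjoint_gap = abs(finsler(U @ H @ U.conj().T) - finsler(H))   # Eq. (4.9)
reverse_gap = abs(finsler(-H) - finsler(H))                   # Eq. (4.21)
# Assumed convention: H_l = c^{-1} H_r c, so Eq. (4.10) is a special case of Eq. (4.9).
left_right_gap = abs(finsler(c.conj().T @ H @ c) - finsler(H))
```

All three gaps vanish to round-off because the singular values of H are unchanged by unitary conjugation and by an overall sign.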

The Finsler metric Eq. (5.1) is representation independent and to find the explicit Finsler metric tensor for the SU(n) group we consider the fundamental representation. An arbitrary generator H(s) can be expanded as

$$\begin{aligned} H(s)=iH^a(s)T_a,~~~H^a(s)\in \mathbb {R}, \end{aligned}$$
(5.2)

where \(\{T_a\}\) form a basis of \(\mathfrak {su}(n)\) in the fundamental representation with the following property:

$$\begin{aligned} T_aT_b=\frac{1}{2n}\delta _{ab}\hat{\mathbb {I}}+\frac{1}{2}\sum _{c=1}^{n^2-1}(i{f_{ab}}^c+{d_{ab}}^c)T_c, \end{aligned}$$
(5.3)

where \({f_{ab}}^c\) are the structure constants, antisymmetric in all indices, while \({d_{ab}}^c\), which is nonzero only when \(n>2\), is symmetric in all indices and traceless. Thus, Eq. (5.1) reads

$$\begin{aligned} F(c,\dot{c})=\frac{1}{\sqrt{2n}}{\text {Tr}}\sqrt{H^a(s)H^b(s)[\delta _{ab}\hat{\mathbb {I}}+n{d_{ab}}^cT_c]}, \end{aligned}$$
(5.4)

with

$$\begin{aligned} H^a(s)=2{\text {Tr}}[H(s)T^a]=2{\text {Tr}}[\dot{c}(s)c^{-1}(s)T^a], \end{aligned}$$
(5.5)

and

$$\begin{aligned} T^a:=T_b\delta ^{ab}. \end{aligned}$$
(5.6)

Without loss of generality, we have set \(\lambda =1\).
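Equation (5.4) can be cross-checked numerically against Eq. (5.1). The sketch below builds a basis normalized as \(\text {Tr}(T_aT_b)=\delta _{ab}/2\) (generalized Gell-Mann matrices divided by 2, an assumed but standard choice compatible with Eq. (5.3)), reads off \(d_{abc}=2\text {Tr}(\{T_a,T_b\}T_c)\) from Eq. (5.3), and starts directly from real components \(H^a\) as in Eq. (5.2). Function names are hypothetical.

```python
import numpy as np

def su_basis(n):
    """Generalized Gell-Mann matrices divided by 2, so that Tr(T_a T_b) = delta_ab / 2."""
    basis = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k], asym[k, j] = -0.5j, 0.5j
            basis += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[:l, :l] = np.eye(l)
        diag[l, l] = -l
        basis.append(diag / np.sqrt(2 * l * (l + 1)))
    return np.array(basis)

def d_symbols(T):
    """d_abc = 2 Tr({T_a, T_b} T_c), obtained from Eq. (5.3)."""
    prod = np.einsum("aij,bjk->abik", T, T)
    anti = prod + prod.transpose(1, 0, 2, 3)
    return 2 * np.einsum("abij,cji->abc", anti, T).real

def finsler_from_components(h, T):
    """Right-hand side of Eq. (5.4) for real components h = (H^a), with lambda = 1."""
    n = T.shape[1]
    coeff = np.einsum("a,b,abc->c", h, h, d_symbols(T))
    M = (h @ h) * np.eye(n) + n * np.einsum("c,cij->ij", coeff, T)
    eigs = np.clip(np.linalg.eigvalsh(M), 0.0, None)   # M is Hermitian and non-negative
    return np.sqrt(eigs).sum() / np.sqrt(2 * n)

rng = np.random.default_rng(2)
gaps = {}
for n in (2, 3, 4):
    T = su_basis(n)
    h = rng.normal(size=n * n - 1)
    H = 1j * np.einsum("a,aij->ij", h, T)                # Eq. (5.2)
    gaps[n] = abs(finsler_from_components(h, T) - np.linalg.norm(H, "nuc"))

d_su2_max = np.abs(d_symbols(su_basis(2))).max()         # d_abc vanishes for n = 2
```

The agreement follows because the antisymmetric \({f_{ab}}^c\) term drops out of the symmetric product \(H^aH^bT_aT_b\), leaving \(HH^\dagger \) equal to \(1/(2n)\) times the matrix under the square root in Eq. (5.4).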

For \(n=2\), \({d_{ab}}^c=0\) so Eq. (5.4) is simplified to

$$\begin{aligned} \begin{aligned} F(c,\dot{c})&=\frac{1}{2}{\text {Tr}}\sqrt{H^a(s)H^b(s)\delta _{ab}\hat{\mathbb {I}}}=\sqrt{H^a(s)H^b(s)\delta _{ab}} \\&= 2\sqrt{{\text {Tr}}[\dot{c}(s)c^{-1}(s)T^a]{\text {Tr}}[\dot{c}(s)c^{-1}(s)T^b]\delta _{ab}}, \end{aligned} \end{aligned}$$
(5.7)

where Eq. (5.5) is used. It contains only quadratic terms of \(\dot{c}\) so it gives a Riemannian geometry. For \(n>2\),

$$\begin{aligned} \begin{aligned} F(c,\dot{c})&=\frac{2}{\sqrt{2n}} {\text {Tr}}\sqrt{{\text {Tr}}[H(s)T^a]{\text {Tr}}[H(s)T^b][\delta _{ab}\hat{\mathbb {I}}+n{d_{ab}}^cT_c]} \\&=\frac{2}{\sqrt{2n}} {\text {Tr}}\sqrt{{\text {Tr}}(\dot{c}c^{-1}T^a){\text {Tr}}(\dot{c}c^{-1}T^b)[\delta _{ab}\hat{\mathbb {I}}+n{d_{ab}}^cT_c]}. \end{aligned} \end{aligned}$$
(5.8)

As \(\hat{\mathbb {I}}\) and \(T_c\) are \(n\times n\) matrices, the line element in Eq. (5.8) is not a quadratic form in \(\dot{c}\), so it is not Riemannian but Finsler.
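One way to see the Riemannian/Finsler distinction concretely is the parallelogram law: a norm is induced by an inner product (and hence defines a Riemannian metric) only if \(F(A+B)^2+F(A-B)^2=2F(A)^2+2F(B)^2\) for all A and B. The sketch below evaluates the defect of this identity for \(\text {Tr}\sqrt{HH^\dagger }\) on one pair of generators in \(\mathfrak {su}(2)\) and one pair in \(\mathfrak {su}(3)\); the particular generators are illustrative choices.

```python
import numpy as np

def finsler(h):
    """Tr sqrt(h h^dagger), i.e. Eq. (5.1) with lambda = 1."""
    return np.linalg.norm(h, "nuc")

def parallelogram_defect(a, b):
    """F(a+b)^2 + F(a-b)^2 - 2 F(a)^2 - 2 F(b)^2; it must vanish for an inner-product norm."""
    return (finsler(a + b) ** 2 + finsler(a - b) ** 2
            - 2 * finsler(a) ** 2 - 2 * finsler(b) ** 2)

# su(2): generators i*sigma_x and i*sigma_z.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
defect_su2 = parallelogram_defect(1j * sigma_x, 1j * sigma_z)

# su(3): two diagonal (Cartan) generators.
a3 = 1j * np.diag([1.0, -1.0, 0.0])
b3 = 1j * np.diag([0.0, 1.0, -1.0])
defect_su3 = parallelogram_defect(a3, b3)
```

The su(2) pair gives zero defect, consistent with Eq. (5.7), while the su(3) pair gives a defect of 4, so for \(n=3\) the norm cannot come from a quadratic form. A single vanishing defect does not by itself prove the su(2) case; that follows from Eq. (5.7).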

5.2 Geodesics and complexity of SU(n) operators

Even though we have the precise Finsler metric, to compute the complexity, we still have to find a geodesic path as shown in (3.9). This minimization procedure is greatly simplified thanks to bi-invariance implied by (4.10). The bi-invariance means the geometry is both right and left invariant. It has been shown that, in bi-invariant Finsler geometry, the curve c(s) is a geodesic if and only if there is a constant generator \(H(s) = \bar{H}\) such that [34, 35]

$$\begin{aligned} \dot{c}(s)=\bar{H}c(s) \ \ \text {or} \ \ c(s)=\exp (s\bar{H}). \end{aligned}$$
(5.9)

With the condition \(\hat{O}=c(1)=\exp (\bar{H})\), we can formally solve for \(\bar{H}\) as \(\bar{H}=\ln \hat{O}\). The logarithm of a unitary operator always exists but may not be unique (theorem 1.27 in Ref. [36]). Because \(\bar{H}\) is constant, from Eq. (3.7),

$$\begin{aligned} L[c] = \tilde{F}(\bar{H}) = \text {Tr}\sqrt{\bar{H}\bar{H}^\dagger }. \end{aligned}$$
(5.10)

Finally, the complexity of \(\hat{O}\) in Eq. (3.9) is given by

$$\begin{aligned} \mathcal {C}(\hat{O}) = \min \{ \text {Tr}\sqrt{\bar{H}\bar{H}^\dagger }\ | \ \forall \, \bar{H}=\ln \hat{O}\}. \end{aligned}$$
(5.11)

The minimization ‘min’ in (3.9) in the sense of ‘geodesic’ is already taken care of in (5.9). Here ‘min’ means the minimal value due to multi-valuedness of \(\ln \hat{O}\).

For example, let us consider the SU(2) group in the fundamental representation. For any operator \(\hat{O}\in \)SU(2), there is a unit vector \(\vec {n}\) and a real number \(\theta \) such that,

$$\begin{aligned} \hat{O} = \exp (i \theta \vec {n}\cdot \vec {\sigma }) =\hat{\mathbb {I}}\cos \theta +i(\vec {n}\cdot \vec {\sigma })\sin \theta , \end{aligned}$$
(5.12)

where \(\vec {\sigma }:=(\sigma _x,\sigma _y,\sigma _z)\) stands for the three Pauli matrices. Because \(\ln \hat{O}=i \theta _m\vec {n}\cdot \vec {\sigma }\) with

$$\begin{aligned} \theta _m=\arccos [\text {Tr}(\hat{O})/2]+2m\pi \,, \end{aligned}$$
(5.13)

for \(\forall m\in \mathbb {Z}\), the complexity of \(\hat{O}\) is given by

$$\begin{aligned} \mathcal {C}(\hat{O})= 2\arccos [\text {Tr}(\hat{O})/2], \end{aligned}$$
(5.14)

where \( \bar{H}\bar{H}^{\dagger }=\theta _m^2\hat{\mathbb {I}}\) is used.
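The closed form (5.14) can be compared with a direct evaluation of Eq. (5.11) from the spectrum of \(\hat{O}\). In the sketch below the eigenphases of \(\hat{O}\) are taken on the principal branch \((-\pi ,\pi ]\), which for SU(2) is the branch that minimizes \(\text {Tr}\sqrt{\bar{H}\bar{H}^\dagger }=2|\theta _m|\); the function names and sample angles are illustrative assumptions.

```python
import numpy as np

PAULI = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def su2_element(theta, n_vec):
    """Eq. (5.12): exp(i theta n.sigma) = I cos(theta) + i (n.sigma) sin(theta)."""
    n_vec = np.asarray(n_vec, dtype=float)
    n_vec = n_vec / np.linalg.norm(n_vec)
    n_sigma = np.einsum("a,aij->ij", n_vec, PAULI)
    return np.cos(theta) * np.eye(2) + 1j * np.sin(theta) * n_sigma

def complexity_closed_form(O):
    """Eq. (5.14): 2 arccos(Tr(O) / 2)."""
    return 2 * np.arccos(np.clip(np.trace(O).real / 2, -1.0, 1.0))

def complexity_from_spectrum(O):
    """Eq. (5.11) for SU(2): sum of |eigenphase| on the principal branch of ln O."""
    return np.abs(np.angle(np.linalg.eigvals(O))).sum()

O_small = su2_element(0.7, [1, 2, 2])
O_shifted = su2_element(0.7 + 2 * np.pi, [1, 2, 2])   # same operator, different theta_m
O_large = su2_element(4.0, [0, 0, 1])                 # theta > pi: a shorter branch exists

c_small = complexity_closed_form(O_small)
c_shifted = complexity_closed_form(O_shifted)
c_large = complexity_closed_form(O_large)
```

For \(\theta =0.7\) both evaluations give 1.4, independent of the added \(2\pi \), while for \(\theta =4\) they give \(2(2\pi -4)\), since the shorter branch of the logarithm is selected.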

Fig. 7 The schematic figure showing the curve of a time evolution operator \(\hat{U}(t)\). To reach the same target operator at \(t=t_0\), there are many different curves. The curve given by Schrödinger’s equation coincides with the curve of complexity

6 Minimal cost principle

The fact that the process of complexity is generated by a constant generator allows an interesting interpretation of the Schrödinger equation. In a quantum system with a time-independent Hamiltonian \(\mathbb {H}\), the time evolution of the quantum state \(|\psi (t)\rangle \) is given by a time evolution operator \(\hat{U}(t)\), i.e. \(|\psi (t)\rangle =\hat{U}(t)|\psi (0)\rangle \), where \(\hat{U}(t)\) satisfies the Schrödinger equation,

$$\begin{aligned} \frac{\text {d}}{\text {d}t}\hat{U}(t)=-i\hbar ^{-1}{\mathbb {H}}\hat{U}(t),~~~\hat{U}(0)=\hat{\mathbb {I}}. \end{aligned}$$
(6.1)

Comparing with Eq. (5.9), we may say that \(\hat{U}(t)\) is a geodesic generated by \(-i\hbar ^{-1}{\mathbb {H}}\). Thus, the time-evolution operator \(\hat{U}(t)\) is along the curve of minimal (at least locally minimal) complexity.

Now let us reconsider the problem in the following way. Assume that after a short time \(t=t_0\), the time-evolution operator becomes \(\hat{U}(t_0)=\hat{O}\). As there are many different curves which connect \(\hat{\mathbb {I}}\) and \(\hat{O}\) (see Fig. 7), how can we find the real curve \(\hat{U}(t)\) during \(t\in (0,t_0)\)? One answer is that we assume the time evolution operator will obey the Schrödinger equation (6.1). Alternatively, we may replace the Schrödinger equation with the following principle:

  • Minimal cost principle: For isolated systems, the time-evolution operator \(\hat{U}(t)\) will be along the curve to reach the target operator so that the cost during this process is locally minimal, i.e. the evolution curve will make the following integral to be locally minimal:

    $$\begin{aligned} L[\hat{U}(t)]=\int _0^{t_0}\text {Tr}\sqrt{\dot{\hat{U}}(t)[\dot{\hat{U}}(t)]^\dagger }\text {d}t,~~\hat{U}(0)=\hat{\mathbb {I}},~~~\hat{U}(t_0)=\hat{O},\nonumber \\ \end{aligned}$$
    (6.2)

    where we used Eq. (5.1) with \( H \rightarrow \dot{\hat{U}}(t)[\hat{U}(t)]^\dagger \).

As a result, the time evolution operator \(\hat{U}(t)\) will satisfy Eq. (6.1). In other words, by this principle, the Schrödinger equation is not a first principle but a consequence of the complexity principle.Footnote 10
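The content of the minimal cost principle can be illustrated by discretizing the integral (6.2) and comparing the Schrödinger curve with another curve that has the same endpoints. This is a sketch under stated assumptions: a single qubit with \(\hbar =1\), the Hamiltonian \(\omega \sigma _z\), the particular perturbation, and all numerical values are illustrative and not taken from the text.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def rot(theta, sigma):
    """exp(i theta sigma) for a Pauli matrix sigma, cf. Eq. (5.12)."""
    return np.cos(theta) * np.eye(2) + 1j * np.sin(theta) * sigma

def discretized_cost(curve, t0=1.0, steps=2000):
    """Riemann-sum approximation of Eq. (6.2): sum over steps of Tr sqrt(dU dU^dagger)."""
    ts = np.linspace(0.0, t0, steps + 1)
    points = [curve(t) for t in ts]
    return sum(np.linalg.norm(q - p, "nuc") for p, q in zip(points[:-1], points[1:]))

omega, eps, t0 = 0.5, 0.3, 1.0   # illustrative values, hbar = 1

def schrodinger_curve(t):
    """U(t) = exp(-i omega sigma_z t), the solution of Eq. (6.1) for H = omega sigma_z."""
    return rot(-omega * t, SIGMA_Z)

def perturbed_curve(t):
    """Same endpoints: the extra factor equals the identity at t = 0 and t = t0."""
    return rot(eps * np.sin(np.pi * t / t0), SIGMA_X) @ rot(-omega * t, SIGMA_Z)

cost_geodesic = discretized_cost(schrodinger_curve, t0)
cost_perturbed = discretized_cost(perturbed_curve, t0)
```

For these values the Schrödinger curve has cost \(2\omega t_0=1\) up to discretization error, and the perturbed curve is strictly longer, as expected for a curve that departs from the constant-generator geodesic.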

7 Comparison with the complexity for K-local qubit systems

For a better understanding of the novel aspects of our work compared to previous research it is useful to compare our complexity with the complexity for K-qubit systems [27]. In particular, our complexity is bi-invariant but the complexity geometry in Ref. [27] is only right-invariant. At first glance, the two theories may look different, but the difference in complexity turns out to be small and most of the physical results in Ref. [27] can also appear in our theory.

For a K-qubit system, the operators form an SU\((2^K)\) group and can be generated only by a right Hamiltonian

$$\begin{aligned} H_r(s)=\sum _aH^a_r(s)T_a, \end{aligned}$$
(7.1)

where \(a=1,2,3,\ldots , 4^K-1\) and \(T_a\) is a series of generalized Pauli operators which can span the Lie algebra \(\mathfrak {su}\)(\(2^K\)). In Ref. [27], for the SU(\(2^K\)) group, the following Riemannian metric was proposed

$$\begin{aligned} \text {d}l^2=\text {d}\Omega _a\mathcal {I}^{ab}\text {d}\Omega _b, \end{aligned}$$
(7.2)

where

$$\begin{aligned} \text {d}\Omega _a:=i{\text {Tr}}(\text {d}c^\dagger (s)T_ac(s)). \end{aligned}$$
(7.3)

Here, the matrix \(\mathcal {I}^{ab}\) should be chosen as a block diagonal matrix with one block corresponding to the unpenalized k-local directions, and the other block corresponding to the directions \(T_a\) containing more than k single qubit operators. Note that, for given a and b, \(iT_a\) is a matrix-valued vector in the representation space of \(\mathfrak {su}(2^K)\) and \(\mathcal {I}^{ab}\) is a real number.

The curve c(s) is assumed to be generated by a right generator \(H_r(s)\) such that

$$\begin{aligned} \text {d}c(s)=H_r(s)c(s)\text {d}s,~~~~H_r(s)\in \mathfrak {su}(2^K). \end{aligned}$$
(7.4)

Taking Eq. (7.4) into Eq. (7.3), we have

$$\begin{aligned} \begin{aligned} \text {d}\Omega _a&={\text {Tr}}[c^\dagger (s)H_r(s)T_ac(s)]\text {d}s\\&={\text {Tr}}[H_r(s)T_a]\text {d}s, \end{aligned} \end{aligned}$$
(7.5)

so Eq. (7.2) becomes

$$\begin{aligned} \text {d}l^2={\text {Tr}}[H_r(s)T_a]{\text {Tr}}[H_r(s)T_b]\mathcal {I}^{ab}\text {d}s^2, \end{aligned}$$
(7.6)

which yields the following Finsler (or just Riemannian) metric

$$\begin{aligned} F(c,\dot{c})= \sqrt{{\text {Tr}}[H_r(s)T_a]{\text {Tr}}[H_r(s)T_b]\mathcal {I}^{ab}}. \end{aligned}$$
(7.7)

It has two differences compared with our result (5.8). One is that the “\({\text {Tr}}\)” is inside the square root. The other is that there is a matrix structure \(\mathcal {I}^{ab}\) which is not determined by the Lie algebra uniquely. In the following, we make comparisons between two complexities based on two different Finsler metrics.

  1. (1)

    In our paper, the only basic assumptions are G1–G4. All conclusions such as Finsler geometry and the Finsler metric Eq. (5.1) are the results of these four assumptions and fundamental symmetries of QFTs. In Ref. [27], the Riemannian geometry and metric (7.6) in K-qubit system were proposed directly as the basic assumptions.

  2. (2)

    The complexity given by Eq. (7.7) satisfies our axioms G1 and G2 but breaks G3. It can be shown by considering a simple case, \(\mathcal {I}^{ab}=\delta ^{ab}\), which corresponds to bi-invariant case without any penalty (isotropy). In this case, the complexity of the operator \(\hat{O} = {\text {exp}}(H)\) is given by F(H) because the geodesic is generated by a constant generator (due to bi-invariance) i.e.

    $$\begin{aligned} \mathcal {C}(\hat{O}) = F(H). \end{aligned}$$
    (7.8)

    By using Eqs. (7.8) and (7.7) we have

    $$\begin{aligned}&\mathcal {C}(\hat{O}_1\oplus \hat{O}_2)=\sqrt{\mathcal {C}^2(\hat{O}_1\oplus \hat{\mathbb {I}}_2)+\mathcal {C}^2(\hat{\mathbb {I}}_1\oplus \hat{O}_2)}\nonumber \\&\quad =\sqrt{\mathcal {C}^2(\hat{O}_1)+\mathcal {C}^2(\hat{O}_2)}, \end{aligned}$$
    (7.9)

    and in more general cases,

    $$\begin{aligned} \mathcal {C}\left( \bigoplus _i\hat{O}_i\right) =\sqrt{\sum _ip_i[\mathcal {C}(\hat{O}_i)]^2}\ne \sum _i\mathcal {C}(\hat{O}_i), \end{aligned}$$
    (7.10)

    where \(p_i\) is a weighting factor if \(\mathcal {I}^{ab}\ne \delta ^{ab}\). This means that the total complexity of parallel operations is not the sum of the complexities of the individual operations, so it breaks G3. We want to stress again that G3 is a very natural requirement that has not been considered in previous research.

  3. (3)

    For the same curve in SU(n) group, the tangent vector at a point is unique but the generator is not. It can be a left or right generator. We allow both choices (left generator or right generator) but Ref. [27] considers only one (right generator). As there is no reason to assume that physics favors “left world” or “right world”, a simple and natural possibility is that two generators yield the same complexity. This is the case realized in our framework, unlike the complexities in [27] and Nielsen’s original works [20,21,22] where the left and right generators give different complexities.

    One may argue that Eq. (7.7) could be valid only for the right generator and, for the left generator, there might be another left-invariant metric which has a different penalty \(\mathcal {I}'^{ab}\) and could give the same curve length. In our upcoming work Ref. [33], we will show this is possible only if the geometry is bi-invariant. This gives us another argument that the complexity for SU(n) group should be bi-invariant.

  4. (4)

    When Ref. [27] discusses some particular physical situations such as “particle on complexity geometry”, “complexity equals to action” and “the complexity growth”, the authors restricted the generators in a “k-local” subspace \(\mathfrak {g}_k\) and assumed \(\mathcal {I}_{ab}|_{\mathfrak {g}_k}=\delta _{ab}\) (see the Sect. IV.C and V in Ref. [27] for detailed explanations). As a result, the geodesics in the sub-manifold generated by \(\mathfrak {g}_k\) are also given by constant generators, which are the same as our bi-invariant Finsler geometry. The lengths of such geodesics in Ref. [27] and in our paper are only different by multiplicative factors, which implies that all the results given by “k-local” subspace can also appear similarly in our bi-invariant Finsler geometry.

    Moreover, in order to obtain the complexity geometry as was proposed in Ref. [27], we can choose some two-dimensional sub-manifold in SU(n) geometry. As described in our upcoming paper [33], by using the Gauss–Codazzi equation, we can show that the sub-manifold can have negative induced sectional curvature somewhere even though the SU(n) geometry is positively curved. So it satisfies the same properties as shown in Ref. [27], where the sectional curvature is made negative near the identity by choosing an appropriate penalty factor.
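The contrast drawn in item (2) between Eqs. (7.9) and (7.10) and axiom G3 can be illustrated numerically on the generators of a direct sum. In the sketch below the Frobenius norm stands in for Eq. (7.7) with \(\mathcal {I}^{ab}=\delta ^{ab}\) (to which it is proportional for an orthonormal basis \(T_a\), an assumption of this sketch), and the trace norm is Eq. (5.1) with \(\lambda =1\). The generators are small enough that the constant-generator geodesic gives the complexity in both cases; their values are illustrative.

```python
import numpy as np

def direct_sum(a, b):
    """Block-diagonal matrix a (+) b."""
    out = np.zeros((a.shape[0] + b.shape[0],) * 2, dtype=complex)
    out[:a.shape[0], :a.shape[0]] = a
    out[a.shape[0]:, a.shape[0]:] = b
    return out

def trace_norm_cost(h):
    """Eq. (5.1) with lambda = 1: Tr sqrt(h h^dagger)."""
    return np.linalg.norm(h, "nuc")

def frobenius_cost(h):
    """Stand-in for Eq. (7.7) with I^{ab} = delta^{ab}, up to an overall constant."""
    return np.linalg.norm(h, "fro")

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
H1 = 0.3j * sigma_x
H2 = 0.4j * sigma_z
H = direct_sum(H1, H2)

# Bi-invariant Finsler cost: additive under the direct sum (G3).
additive_gap = abs(trace_norm_cost(H) - (trace_norm_cost(H1) + trace_norm_cost(H2)))

# Riemannian-type cost: adds in quadrature, as in Eq. (7.9), so it is not additive.
quadrature_gap = abs(frobenius_cost(H) ** 2
                     - (frobenius_cost(H1) ** 2 + frobenius_cost(H2) ** 2))
shortfall = frobenius_cost(H1) + frobenius_cost(H2) - frobenius_cost(H)
```

The first two gaps vanish to round-off, while the shortfall is strictly positive, which is the violation of G3 described in item (2).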

8 Discussion and outlook

In this paper we proposed four basic axioms for the complexity of operators: nonnegativity (G1), series decomposition rule (triangle inequality) (G2), parallel decomposition rule (G3) and smoothness (G4). Combining these four axioms and basic symmetries in QFT, we have obtained the complexity of the SU(n) operator without ambiguity: Eq. (5.11). In our derivation the bi-invariance of the Finsler structure plays an important role and this bi-invariance is a natural implication of the symmetry in QFTs rather than an artificial assumption. Our logical flow is shown in Fig. 1.

We argue the importance of the bi-invariance in four ways based on: (i) the unitary-invariance of QFTs (Sect. 4.1); (ii) the nature of continuous operators rather than discrete ones (Sect. 4.2); (iii) inverse-invariance of the relative complexity (Appendix D.1); and (iv) the “ket-world” - “bra-world” equivalence (Appendix D.2). The bi-invariance here is different from the only right-invariance for qubit systems [20,21,22, 27]. We clarify the differences and similarities between our proposal (bi-invariance) and previous research (only right-invariance) in Sects. 4.2 and 7. It can be shown that most of the results in the only right-invariant complexity geometry proposed by Ref. [27] can also appear in our framework. We want to emphasize that the complexity cannot be a well-defined physical observable in general finite dimensional systems if the complexity geometry is not bi-invariant.

Thanks to the bi-invariance of the Finsler metric the process of minimal cost (complexity) is generated by a constant generator. This observation leads us to a novel interpretation of the Schrödinger equation: the quantum state evolves by the process of minimizing “computational cost,” which we call the “minimal cost principle.”

As an application of the complexity of the SU(n) operator, the complexity between two states described by density matrices \(\rho _1\) and \(\rho _2\) may be defined naturally as

$$\begin{aligned} \mathcal {C}(\rho _1,\rho _2):=\min \{\mathcal {C}(\hat{O})\, |\, \rho _2=\hat{O}\rho _1\hat{O}^{\dagger },~\forall \hat{O}\in \text {SU}(n)\}.\nonumber \\ \end{aligned}$$
(8.1)

In our forthcoming paper [19] our proposal turns out to be general enough to include and unify other recent developments for the complexity in QFT: cMERA tensor network [37,38,39,40], Fubini-Study metric [41] and path-integral method [42, 43]. Furthermore, it can be shown that our proposal also correctly reproduces the holographic complexity for thermofield double state (TFD).

In a more general context, geometrizing the complexity in continuous operators sets amounts to giving positive homogeneous norms in some Lie algebras. Our paper deals with only the SU(n) group, so we give the norm for the Lie algebra \(\mathfrak {su}(n)\). For a more general Lie algebra \(\mathfrak {g}\), though we cannot determine the norm uniquely, it is natural that such a norm is determined only by the properties of \(\mathfrak {g}\), for example the structure constants, without any other extra information. As in general relativity where the spacetime metric is determined by matter distribution through Einstein’s equations, can we find any physical equation to determine this norm?