A programming language characterizing quantum polynomial time

We introduce a first-order quantum programming language, named foq, whose terminating programs are reversible. We restrict foq to a strict and tractable subset, named pfoq, of terminating programs with bounded width, that provides a first programming language-based characterization of the quantum complexity class fbqp. We finally present a tractable semantics-preserving algorithm compiling a pfoq program to a quantum circuit of size polynomial in the number of input qubits.


Introduction
Motivations. Quantum computing is an emerging and promising computational model that has been in the scientific limelight for several decades. This interest stems mainly from the advantage of quantum computers over their classical competitors, which relies on purely quantum properties such as superposition and entanglement. The most notable example is Shor's algorithm for finding the prime factors of an integer [16], which is almost exponentially faster than the most efficient known classical factoring algorithm and which is expected to have spin-offs in cryptography (RSA encryption, ...).
Whether due to the fragility of quantum systems, namely the engineering problem of maintaining a large number of qubits in a coherent state, or to a lack of reliable technological alternatives, quantum computing is typically described at a level close to hardware. Non-exhaustive examples include quantum circuits [9,12], measurement-based quantum computers [5,7], and circuit description languages [14]. This low-level machinery drastically restricts the abstraction and programming ease offered by these models, and quantum programs currently suffer from the comparison with their classical competitors, which have many high-level tools and formalisms based on more than 50 years of scientific research, engineering development, and practical and industrial applications.
In order to solve these issues, a major effort is being made to realize the promise of a quantum computer, which requires the development of different layers of hardware and software, together referred to as the quantum stack. Our paper is part of this line of research. We focus on the highest layers of the quantum stack: quantum programming languages and quantum algorithms. We seek to better understand what can be done efficiently on a quantum computer and we are particularly interested in the development of quantum programming languages where program complexity can be certified automatically by some static analysis technique.

arXiv:2212.06656v1 [cs.LO] 13 Dec 2022
Contribution. Towards this end, we take the notion of polynomial time computation as our main object of study. Our contributions are the following.
-We introduce a quantum programming language, named foq, that includes first-order recursive procedures. The input of a foq program consists of a sorted set of qubits, that is, a list of pairwise distinct qubit indexes. A foq program can apply to each of its qubits basic operators corresponding to unary unitary operators. The considered set of operators has been chosen in accordance with [17] to form a universal set of gates.
-After showing that terminating foq programs are reversible (Theorem 1), we restrict programs to a strict subset, named pfoq, for polynomial time foq. The restrictions put on pfoq programs are tractable (i.e., can be decided in polynomial time, see Theorem 2), ensure that programs terminate on any input (Lemma 1), and prevent programs from having any exponential blow up (Lemma 2).
-We show that the class of functions computed by pfoq programs is sound and complete for the quantum complexity class fbqp. fbqp is the functional extension of bounded-error quantum polynomial time, known as bqp [3], the class of decision problems solvable by a quantum computer in polynomial time with an error probability of at most 1/3 for all instances. Hence the language pfoq is, to our knowledge, the first programming language characterizing quantum polynomial time functions. Soundness (Theorem 3) is proved by showing that any pfoq program can be simulated by a quantum Turing machine running in polynomial time [3]. The completeness of our characterization (Theorem 6) is demonstrated by showing that pfoq programs strictly encompass Yamakami's function algebra, known to be fbqp-complete [17].
-We also describe an algorithm, named compile (based on the subroutines described in Algorithms 1 and 2), that compiles any pfoq program to a quantum circuit acting on n qubits and of size polynomial in n, for all n.
The existence of such circuits is not surprising, as a direct consequence of Yao's characterization of the class bqp in terms of uniform families of circuits of polynomial size [18]. However, a constructive generation based on Yao's algorithm is not satisfactory because of the use of quantum Turing machines, which makes the circuits complex and not optimal (in size). We show that, in our setting, circuits can be effectively computed and that the compile algorithm is tractable (Theorem 7).
Our programming language foq and the restriction to pfoq are illustrated throughout the paper, using the Quantum Fourier Transform QFT as a leading algorithm (Example 1).
Related work. This paper belongs to a long-standing line of works trying to specify, understand, and analyze the semantics of quantum programming languages, starting with the cornerstone work of Selinger [15]. The motivations for restricting the considered programs to pfoq were inspired by the works on implicit computational complexity, which seek to characterize complexity classes by putting restrictions (type systems or others) on standard programming languages and paradigms [1,10,13]. These restrictions have to be implicit (i.e., not provided by the programmer) and tractable. Among all these works, we are aware of two results, [17] and [6], studying polynomial time computations on quantum programming languages, works from which our paper was greatly inspired. [6] provides a characterization of bqp based on a quantum lambda-calculus. Our work is an extension to fbqp with a restriction to first-order procedures. Last but not least, [6] is based on Yao's simulation of quantum Turing machines [18] while we provide an explicit algorithm for generating circuits of polynomial size. Our work is also inspired by the function algebra of [17], which characterizes fbqp: our completeness proof shows that any function in [17] can be simulated by a pfoq program (Theorem 6). However, we claim that foq is a more general language for fbqp insofar as it is much less constraining (in terms of expressive power) than the function algebra of [17]: any function of [17] can be, by design, transformed into a pfoq program, whereas the converse is not true.

First-order quantum programming language
Syntax and well-formedness. We consider a quantum programming language, called foq for First-Order Quantum programming language, that includes basic data types such as Integers, Booleans, Qubits, Operators, and Sorted Sets of qubits, lists of finite length where all elements are different. A foq program has the ability to call first-order (recursive) procedures taking a sorted set of qubits as a parameter. Its syntax is provided in Figure 1.
Let x denote an integer variable and p, q denote sorted set variables. The size of the sorted set stored in q will be denoted by $|q|$. We can refer to the i-th qubit in q as q[i], with $1 \le i \le |q|$. Hence, each non-empty sorted set variable q can be viewed as a list $[q[1], \ldots, q[|q|]]$. The empty sorted set, of size 0, will be denoted by nil and q ⊖ [i] will denote the sorted set obtained by removing the qubit of index i in q. For notational convenience, we extend this notation by $q \ominus [i_1, \ldots, i_k]$, for the list obtained by removing the qubits of indexes $i_1, \ldots, i_k$ in the sorted set q.
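The sorted set notation can be made concrete with a small sketch. The following Python fragment is only an illustration, not part of foq: the function names `size`, `qubit`, and `remove` are hypothetical, sorted sets are modelled as Python lists of qubit pointers, and the handling of undefined indexes follows the convention the paper states for rules (Rm∉) and (Qu∉).

```python
from typing import List


def size(q: List[int]) -> int:
    """|q|: number of qubit pointers stored in the sorted set q."""
    return len(q)


def qubit(q: List[int], i: int) -> int:
    """q[i] with a 1-based index i; an undefined index yields 0."""
    return q[i - 1] if 1 <= i <= len(q) else 0


def remove(q: List[int], *indices: int) -> List[int]:
    """q ⊖ [i1, ..., ik]: drop the entries at the given 1-based positions.

    All positions refer to the original list q. If any position is
    undefined, the empty sorted set is returned.
    """
    if any(not 1 <= i <= len(q) for i in indices):
        return []
    dropped = set(indices)
    return [x for pos, x in enumerate(q, start=1) if pos not in dropped]
```

For instance, with `q = [3, 5, 7, 9]`, `remove(q, 1, size(q))` drops the first and last pointers, which is the shape of argument `p ⊖ [1, |p|]` used by procedure inv in the QFT example.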
The language also includes some constructs $U^f$ to represent (unary) unitary operators, for some total function $f \in \mathbb{Z} \to [0, 2\pi) \cap R$. The function f is required to be polynomial-time approximable: its output is restricted to R, the set of real numbers that can be approximated by a Turing machine for any precision $2^{-k}$ in time polynomial in k.
A foq program P(q) consists of a sequence of procedure declarations D followed by a program statement S, ε denoting the empty sequence. In what follows, we will sometimes refer to program P(q) simply as P. Let var(S) be the set of variables appearing in the statement S. Let $|P|$ be the size of program P, that is the total number of symbols in P.
A procedure declaration decl proc[x](p){S} takes a sorted set parameter p and some optional integer parameter x as inputs. S is called the procedure statement, proc is the procedure name and belongs to a countable set Procedures. We will write $S^{proc}$ to refer to S, and proc ∈ P holds if proc is declared in D.

Fig. 1: Syntax of foq programs
Statements include a no-op instruction, applications of a unitary operator to a qubit (q *= $U^f(i)$;), sequences, (classical) conditionals, quantum cases, and procedure calls (call proc[i](s);). A quantum case qcase q of {0 → $S_0$, 1 → $S_1$} provides a quantum control feature that will execute statements $S_0$ and $S_1$ in superposition. For example, the CNOT gate on qubits q[i] and q[j], for i, j ∈ N, i ≠ j, can be simulated by the following statement: qcase q[i] of {0 → skip;, 1 → q[j] *= NOT;}.
Throughout the paper, we restrict our study to well-formed programs, that is, programs P = D ∶∶ S satisfying the following properties: var(S) ⊆ {q}; ∀proc ∈ P, var($S^{proc}$) ⊆ {x, p}; procedure names declared in D are pairwise distinct; for each procedure call, the procedure name is declared in D.
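As a sanity check of the CNOT encoding, a quantum case can be read as the block operator $|0\rangle\langle 0| \otimes U_0 + |1\rangle\langle 1| \otimes U_1$, where $U_0$ and $U_1$ are the operators of the two branches. The sketch below assumes a two-qubit system with the control as first qubit and the target as second; the helper names `kron`, `add`, `qcase`, and `apply` are hypothetical and not part of foq.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


I2 = [[1, 0], [0, 1]]   # skip; on the target qubit
X = [[0, 1], [1, 0]]    # NOT on the target qubit
P0 = [[1, 0], [0, 0]]   # projector |0><0| on the control qubit
P1 = [[0, 0], [0, 1]]   # projector |1><1| on the control qubit


def qcase(u0, u1):
    """Operator of `qcase control of {0 -> u0, 1 -> u1}` on two qubits."""
    return add(kron(P0, u0), kron(P1, u1))


# qcase q[i] of {0 -> skip;, 1 -> q[j] *= NOT;}
CNOT = qcase(I2, X)
```

With basis states ordered as |00⟩, |01⟩, |10⟩, |11⟩, `apply(CNOT, [0, 0, 1, 0])` maps |10⟩ to |11⟩ and leaves |01⟩ unchanged, as expected of a CNOT.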
Semantics. Let $\mathcal{H}_{2^n}$ be the Hilbert space $\mathbb{C}^{2^n}$ of n qubits. We use Dirac notation to denote a quantum state $|\psi\rangle \in \mathcal{H}_{2^n}$. Each $|\psi\rangle \in \mathcal{H}_{2^n}$ can be written as a superposition of bitstrings of size n: $|\psi\rangle = \sum_{w \in \{0,1\}^n} \alpha_w |w\rangle$, with $\alpha_w \in \mathbb{C}$ and $\sum_w |\alpha_w|^2 = 1$. The length $\ell(|\psi\rangle)$ of the state $|\psi\rangle$ is n. Given two matrices M, N, we denote by $M^\dagger$ the conjugate transpose of M and by $M \otimes N$ the tensor product of M by N. $\langle\psi|$ is equal to $|\psi\rangle^\dagger$, and $\langle\psi|\phi\rangle$ and $|\psi\rangle\langle\phi|$ are respectively the inner product and outer product of $|\psi\rangle$ and $|\phi\rangle$. Let $I_n$ be the identity matrix in ) , where C is the set of complex numbers whose real and imaginary parts are both in R. One can check easily that each matrix

Let B be the set of Boolean values b ∈ {false, true}. For a given set X, let L(X) be the set of lists of elements in X. Let $l = [x_1, \ldots, x_m]$, with $x_1, \ldots, x_m \in X$, denote a list of m elements in L(X) and [ ] be the empty list (when m = 0). For l, l′ ∈ L(X), l@l′ denotes the concatenation of l and l′. hd(l) and tl(l) represent the head and the tail of l, respectively. Lists of integers will be used to represent Sorted Sets. They contain pointers to qubits (i.e., indexes) in the global memory.
For each basic type τ, the reduction $\Downarrow_\tau$ is a map in τ × L(N) → τ. Intuitively, it maps an expression of type τ to its value in τ for a given list l of pointers in memory. These reductions are defined in Figure 2, where e and d denote either an integer expression i or a boolean expression b.
Fig. 2: Semantics of expressions

Note that in rule (Rm∉), if we try to delete an undefined index then we return the empty list, and in rule (Qu∉), if we try to access an undefined qubit index then we return the value 0 (defined indexes will always be positive). The standard gates $R_Y(\pi/4)$, $P(\pi/4)$, and CNOT form a universal set of gates [4], which justifies the choice of NOT, $R^f_Y(i)$, and $Ph^f(i)$ as basic operators. For instance, we can simulate the application of a Hadamard gate H on q by the following statement q *= $R^f_Y(0)$; q *= NOT;, with the function f defined by ∀n, $f(n) = \pi/4 \in [0, 2\pi) \cap R$. By abuse of notation, we will sometimes use q *= H; to denote this statement. Using CNOT, we can also define the SWAP operation swapping the state between two qubits q[i] and q[j], with i, j ∈ N, i ≠ j, as a sequence of three alternating CNOT statements.

Let ⊺ and ⊥ be two special symbols for termination and error, respectively, and let ◇ stand for a symbol in {⊺, ⊥}. The set of configurations of dimension $2^n$, denoted $Conf_n$, is defined by $Conf_n \triangleq (Statements \cup \{⊺, \bot\}) \times \mathcal{H}_{2^n} \times \mathcal{P}(\mathbb{N}) \times L(\mathbb{N})$, with $\mathcal{P}(\mathbb{N})$ being the powerset over N. A configuration $c = (S, |\psi\rangle, A, l) \in Conf_n$ contains a statement S to be executed (provided that S ∉ {⊺, ⊥}), a quantum state $|\psi\rangle$ of length n, a set A containing the indexes of qubits that are allowed to be accessed by statement S, and a list l of qubit pointers.
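The Hadamard decomposition mentioned above can be checked numerically. The sketch below assumes the rotation convention $R_Y(\theta) = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$, which is an assumption here since the matrix definitions did not survive in this copy of the text; the function names are hypothetical. Statements are applied left to right, so the sequence q *= R_Y; q *= NOT; corresponds to the matrix product NOT · R_Y.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def r_y(theta):
    """Assumed convention: plane rotation by angle theta (not theta / 2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


NOT = [[0, 1], [1, 0]]


def hadamard_from_basic_ops():
    """Operator of `q *= R_Y^f(0); q *= NOT;` with f(n) = pi / 4."""
    # The first statement acts first, hence it is the rightmost factor.
    return matmul(NOT, r_y(math.pi / 4))
```

Under this convention the product has entries ±1/√2 arranged as in the usual Hadamard matrix; with a different sign or angle convention for $R_Y$ the identity would need to be adapted.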
The program big-step semantics →, described in Figure 3, is defined as a relation in $\bigcup_{n \in \mathbb{N}} Conf_n \times Conf_n$. In the rules of Figure 3, → is annotated by an integer, called level. For example, the level of the conclusion in the (Call[ ]) rule is 1. The level is used to count the total number of procedure calls that are not in superposition (i.e., in distinct branches of a quantum case).
We now give a brief intuition on the rules of Figure 3. Rules (Asg⊥) and (Asg⊺) evaluate the application of a unitary operator, corresponding to $U^f(j)$, to a qubit s[i]. For that purpose, they evaluate the index n of s[i] in the global memory. Rule (Asg⊥) deals with the error case, where the corresponding qubit is not allowed to be accessed. Rule (Asg⊺) deals with the success case: the new quantum state is obtained by applying the result of tensoring the evaluation of $U^f(j)$ to the right index. Rules (Seq◇) and (Seq⊥) evaluate the sequence of statements, depending on whether an error occurs or not. The (If) rule deals with classical conditionals in a standard way. The three rules (Case⊺), (Case⊥), and (Case∉) evaluate the qubit index n of the control qubit s[i]. Then they check whether this index belongs to the set of accessible qubits (is n in A?). If so, the two statements $S_0$ and $S_1$ are intuitively evaluated in superposition, on the projected states $\langle 0|_n |\psi\rangle$ and $\langle 1|_n |\psi\rangle$, respectively. During these evaluations, the index n cannot be accessed anymore. The rule (Call[ ]) treats the base case of a procedure call when the sorted set parameter is empty. In the non-empty case, rule (Call◇) evaluates the sorted set parameter s to l′ and the integer parameter x to n. It returns the result of evaluating the procedure statement $S^{proc}\{n/x\}$, where n has been substituted for x, w.r.t. the updated qubit pointers list l′.
For a given program P = D ∶∶ S and a given quantum state $|\psi\rangle \in$ terminates then it is obviously error-free but the converse property does not hold. Every program P can be efficiently transformed into an error-free

Example 1. A notable example of quantum algorithm is the Quantum Fourier Transform (QFT), used as a subroutine in Shor's algorithm [16], and whose quantum circuit is provided below, with $R_n \triangleq Ph^{\lambda x.\pi/2^{x-1}}(n)$, for n ≥ 2. After applying Hadamard and controlled $R_n$ gates, the circuit performs a permutation of qubits using swap gates.
Hence, it is polynomial time approximable. The above circuit can be simulated for any number of qubits $|q|$ by the following foq program QFT.
Derivation tree and level. Given a configuration c wrt a fixed program P, $\pi^{P}_{c}$ denotes the derivation tree of P, the tree of root c whose children are obtained by applying the rules of Figures 2 and 3 on configuration c with respect to P. We write π instead of $\pi^{P}_{c}$ when P and c are clear from the context. Note that a derivation tree π can be infinite in the particular case of a non-terminating computation. When π′ is finite, π ⊴ π′ denotes that π is a subtree of π′.
In the case of a terminating computation $\pi^{P}_{c}$, there exists a terminal configuration c′ and a level m ∈ N such that $c \xrightarrow{m} c'$ holds. In this case, the level of π is defined as $lv_\pi \triangleq m$. Given a foq program P that terminates, $level_P$ is a total function in N → N defined as $level_P(n) \triangleq \max_{|\psi\rangle \in \mathcal{H}_{2^n}} lv_{\pi^{P}_{c_{init}(|\psi\rangle)}}$.
Intuitively, level P (n) corresponds to the maximal number of non-superposed procedure calls in any program execution on an input of length n.
Example 2. Consider the program QFT of Example 1. Assume temporarily that QFT terminates (this will be shown in Example 3). For all n ∈ N, $level_{QFT}(n) = (n + 1) + \sum_{i=1}^{n} i + (\lfloor n/2 \rfloor + 1)$. Indeed, on sorted sets of size n, procedure rec is called recursively n + 1 times and makes n + 1 calls to procedure rot on sorted sets of size n, n − 1, ..., and 1. On sorted sets of size n, rot performs n recursive calls. Hence the total number of calls to rot is equal to $\sum_{i=1}^{n} i$. Finally, on a sorted set of size n, procedure inv performs $\lfloor n/2 \rfloor + 1$ recursive calls.
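The call counts of Example 2 can be tabulated with a short sketch. It assumes that the level of QFT is simply the sum of the three counts given in the prose (calls to rec, rot, and inv); the function names are hypothetical and the closed form is a reading of the example, not a statement taken from the paper.

```python
def qft_call_counts(n: int):
    """Call counts stated in Example 2 for an input of n qubits."""
    rec_calls = n + 1                 # rec is called recursively n + 1 times
    rot_calls = sum(range(1, n + 1))  # sum_{i=1}^{n} i calls to rot
    inv_calls = n // 2 + 1            # floor(n / 2) + 1 calls to inv
    return rec_calls, rot_calls, inv_calls


def qft_level(n: int) -> int:
    """Assumed level: total number of non-superposed procedure calls."""
    return sum(qft_call_counts(n))
```

Under this assumption the level grows quadratically in n, which is consistent with the polynomial bound on the level that the paper later establishes for pfoq programs.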
A program P is reversible if it terminates and there exists a program $P^{-1}$ such that $P^{-1} \circ P = Id$.
Theorem 1.All terminating foq programs are reversible.
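The intuition behind reversibility can be illustrated on a deliberately tiny fragment: straight-line programs acting on a single qubit, built only from NOT, $R_Y$, and Ph. The sketch below is not the construction used in the paper's proof (which must also handle quantum cases and procedure calls); it assumes $R_Y(\theta)$ is the plane rotation by θ and $Ph(\theta) = \mathrm{diag}(1, e^{i\theta})$, and all names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gate(name, theta=0.0):
    """Matrix of a basic operator (conventions assumed, see lead-in)."""
    if name == "NOT":
        return [[0, 1], [1, 0]]
    if name == "RY":
        c, s = math.cos(theta), math.sin(theta)
        return [[c, -s], [s, c]]
    if name == "PH":
        return [[1, 0], [0, cmath.exp(1j * theta)]]
    raise ValueError(name)


def run(program):
    """Operator of a statement sequence; the first statement acts first."""
    result = [[1, 0], [0, 1]]
    for name, theta in program:
        result = matmul(gate(name, theta), result)
    return result


def inverse(program):
    """Reverse the sequence and negate every angle (NOT is self-inverse)."""
    return [(name, -theta) for name, theta in reversed(program)]
```

Running a program followed by its inverse, for example `run(prog + inverse(prog))`, yields the identity up to floating-point error, which mirrors the condition $P^{-1} \circ P = Id$ on this restricted fragment.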

Polynomial time soundness
In this section, we restrict the set of foq programs to a strict subset, named pfoq, that is sound for the quantum complexity class fbqp.For this, we define two criteria: a criterion ensuring that a program terminates and a criterion preventing a terminating program from having an exponential runtime.
Polynomial-time foq. Given two statements S, S′, we write S ∈ S′ to mean that S is a substatement of S′, and proc ∈ S holds if there are i and s such that call proc[i](s); ∈ S. Given a program P = D ∶∶ S, we define the relation $>_P \subseteq Procedures \times Procedures$ by $proc_1 >_P proc_2$ if $proc_2 \in S^{proc_1}$, for any two procedures $proc_1, proc_2 \in S$. Let the partial order $\succeq_P$ be the transitive and reflexive closure of $>_P$ and define the equivalence relation $\sim_P$ by $proc_1 \sim_P proc_2$ if $proc_1 \succeq_P proc_2$ and $proc_2 \succeq_P proc_1$ both hold. Define also the strict order $\succ_P$ by $proc_1 \succ_P proc_2$ if $proc_1 \succeq_P proc_2$ and $proc_1 \not\sim_P proc_2$ both hold.

Definition 1. Let wf be the set of foq programs P that are error-free and satisfy the well-foundedness constraint: for every proc ∈ P, each call in $S^{proc}$ to a procedure equivalent to proc must decrease its sorted set argument (see Example 3).

Lemma 1. If P ∈ wf, then P terminates.

Example 3. Consider the program QFT of Example 1. The statements of the procedure declarations define the following relation: $rec >_{QFT} rec$, $rec >_{QFT} rot$, $rot >_{QFT} rot$, and $inv >_{QFT} inv$. Consequently, $rec \sim_{QFT} rec$, $rot \sim_{QFT} rot$, $inv \sim_{QFT} inv$, and $rec \succ_{QFT} rot$ hold. For each call to an equivalent procedure, we check that the argument decreases: p ⊖ [1] in rec, p ⊖ [2] in rot, and $p \ominus [1, |p|]$ in inv. Consequently, QFT ∈ wf. We deduce from Lemma 1 that QFT terminates.
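The relations on procedure names can be computed mechanically from the call graph. The following sketch, with hypothetical function names, builds the reflexive and transitive closure of the direct-call relation and derives the equivalence and strict order from it; the call graph used as input is the one listed in Example 3.

```python
from itertools import product


def reflexive_transitive_closure(procs, calls):
    """Closure of the direct-call relation `calls`, a set of (caller, callee)."""
    geq = {(p, p) for p in procs} | set(calls)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that geq can grow inside the loop.
        for (a, b), (c, d) in product(list(geq), repeat=2):
            if b == c and (a, d) not in geq:
                geq.add((a, d))
                changed = True
    return geq


def equivalent(geq, p1, p2):
    """p1 ~ p2: each procedure (transitively) calls the other."""
    return (p1, p2) in geq and (p2, p1) in geq


def strictly_above(geq, p1, p2):
    """p1 > p2 in the strict order: p1 reaches p2 but they are not equivalent."""
    return (p1, p2) in geq and not equivalent(geq, p1, p2)


QFT_PROCS = {"rec", "rot", "inv"}
QFT_CALLS = {("rec", "rec"), ("rec", "rot"), ("rot", "rot"), ("inv", "inv")}
QFT_GEQ = reflexive_transitive_closure(QFT_PROCS, QFT_CALLS)
```

On this input, `strictly_above(QFT_GEQ, "rec", "rot")` holds while rec and rot are not equivalent, matching Example 3. A termination check in the spirit of Definition 1 would additionally inspect the argument of every call between equivalent procedures, which this sketch does not model.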
We now add a further restriction on mutually recursive procedure calls for guaranteeing polynomial time using a notion of width.

Definition 2. Given a program P and a procedure proc ∈ P, the width of proc in P, noted $width_P(proc)$, and the width of proc in P relatively to statement S, noted $w^{proc}_P(S)$, are two positive integers in N. They are defined inductively by: $width_P(proc) \triangleq w^{proc}_P(S^{proc})$, $w^{proc}_P(skip;) \triangleq 0$,

We now show that the level of a pfoq program is bounded by a polynomial in the length of its input.
Lemma 2. For each pfoq program P, there exists a polynomial $Q \in \mathbb{N}[X]$ such that $level_P(n) \le Q(n)$ for all n ∈ N.

Moreover, checking whether a program is pfoq is tractable.
Theorem 2. For each foq program P, it can be decided in time $O(|P|^2)$ whether P ∉ pfoq.
Quantum Turing machines and FBQP. Following Bernstein and Vazirani [3], a k-tape Quantum Turing Machine (QTM), with k ≥ 1, is defined by a triplet (Σ, Q, δ) where Σ is a finite alphabet including a blank symbol #, Q is a finite set of states with an initial state $s_0$ and a final state $s_{⊺} \ne s_0$, and δ is the quantum transition function in $Q \times \Sigma^k \to C^{Q \times \Sigma^k \times \{L,N,R\}^k}$; {L, N, R} being the set of possible movements of a head on a tape. Each tape of the QTM is two-way infinite and contains cells indexed by Z. A QTM successfully terminates if it reaches a superposition of only the final state $s_{⊺}$. A QTM is said to be well-formed if the transition function δ preserves the norm of the superposition (or, equivalently, if the time evolution of the machine is unitary). The starting position of the tape heads is the start cell, the cell indexed by 0. If the machine terminates with all of its tape heads back on the start cells, it is called stationary. We will use stationary in the case where the machine terminates with its input tape head in the first cell, and all other tape heads in the last non-blank cell.
We will further refer to a QTM as being in normal form if the only transitions from the final state $s_{⊺}$ are towards the initial state $s_0$. These will be important conditions for the composition and branching constructions of QTMs. If a QTM is well-formed, stationary, and in normal form, we will call it conservative [17] (N.B.: our notion of stationary QTM differs from, but can be shown to be equivalent to, the definition of stationary QTM in [17]).
A configuration γ of a k-tape QTM is a tuple (s, w, n), where s is a state in Q, w is a k-tuple of words in Σ*, and n is a k-tuple of indexes (head positions) in Z. An initial (resp. final) configuration $\gamma_{init}$ (resp. $\gamma_{fin}$) is a configuration of the shape ($s_0$, w, 0) (resp. ($s_{⊺}$, w, 0)). We use γ(w) to denote a configuration γ where the word w is written on the input/output tape. Following [3], we write S to represent the inner-product space of finite complex linear combinations of configurations of the QTM M with the Euclidean norm. A QTM M defines a linear time operator $U_M : S \to S$, that outputs a superposition of configurations $\sum_i \alpha_i |\gamma_i\rangle$ obtained by applying a single-step transition of M to a configuration $|\gamma\rangle$ (i.e., $U_M |\gamma\rangle = \sum_i \alpha_i |\gamma_i\rangle$). Let $U^t_M$, for t ≥ 1, be the t-steps transition obtained from $U_M$ as follows: $U^1_M \triangleq U_M$ and $U^{t+1}_M \triangleq U_M \circ U^t_M$. Given a quantum state $|\psi\rangle = \sum_{w \in \{0,1\}^n} \alpha_w |w\rangle$ and a configuration γ, let $\gamma(|\psi\rangle) \in S$ be the quantum configuration defined by $\gamma(|\psi\rangle) \triangleq \sum_{w \in \{0,1\}^n} \alpha_w |\gamma(w)\rangle$.

FBQP completeness
In this section we show that any function in fbqp can be faithfully approximated by a pfoq program. Toward this end, we show that Yamakami's [17] fbqp-complete function algebra can be exactly simulated in pfoq.
$\square QP_1$ is the smallest class of functions including the basic initial functions $\{I, Ph_\theta, Rot_\theta, NOT, SWAP\}$, with θ ∈ [0, 2π) ∩ C, and closed under schemes Comp, Branch, and $kQRec_t$, for k, t ∈ N.

To handle general fbqp functions, [17] defines the extended encoding of an input $x \in \{0, 1\}^\star$ as $\phi_P(|x\rangle) \triangleq |0^{\ell(|x\rangle)} 1\rangle\, |0^{P(\ell(|x\rangle))}\, 1\, 0^{11 P(\ell(|x\rangle)) + 6}\, 1\rangle\, |x\rangle$, for some polynomial $P \in \mathbb{N}[X]$ that is an upper bound on the output size of the desired fbqp function. $\phi_P$ simply consists of the quantum state $|x\rangle$ preceded by a polynomial number of ancilla qubits. These ancillas provide space for internal computations and account for the polynomial bound associated with polynomial time QTMs.
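For a classical input, the extended encoding is just a bitstring with a fixed layout, which the following sketch spells out. The layout used here (a block of $\ell$ zeros closed by a 1, then $P(\ell)$ zeros, a 1, $11P(\ell)+6$ zeros, a 1, and finally x) is a reading of the encoding formula from a damaged extraction and should be treated as an assumption; the function name and the way the polynomial is passed are hypothetical.

```python
from typing import Callable


def extended_encoding(x: str, poly: Callable[[int], int]) -> str:
    """Bitstring of the (assumed) extended encoding of a classical input x.

    `poly` plays the role of the polynomial P bounding the output size.
    """
    n = len(x)      # l(|x>): length of the input
    p = poly(n)     # P(l(|x>))
    length_block = "0" * n + "1"
    ancilla_block = "0" * p + "1" + "0" * (11 * p + 6) + "1"
    return length_block + ancilla_block + x
```

Under this reading the encoding has length $2n + 12P(n) + 9$, so the number of ancilla qubits placed in front of x stays polynomial in the input length.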

1. The function f is in fbqp.
2. There exists $F \in \square QP_1$ such that $F \circ \phi_P$ computes f with probability 2/3.
We show the following result by structural induction on a function in $\square QP_1$.
Theorem 5. Let F be a function in $\square QP_1$. Then there exists a pfoq program P such that ⟦P⟧ = F.
We are now ready to state the completeness result.
Theorem 6. For every function f in fbqp with polynomial bound $Q \in \mathbb{N}[X]$, there is a pfoq program P such that $⟦P⟧ \circ \phi_Q$ computes f with probability 2/3.

Compilation to polynomial-size quantum circuits
In this section, we provide an algorithm that compiles a pfoq program on a given input length n ∈ N into a quantum circuit of size polynomial in n.
Quantum circuits [8] are a well-known graphical computational model for describing quantum computations. Qubits are represented by wires. Each unitary transformation U acting on n qubits can be represented as a gate U with n inputs and n outputs. A circuit C is an element of a PROP category ([11], a symmetric strict monoidal category) whose morphisms are generated by gates G and wires. Let 1 be the identity circuit (for any length) and ○ and ⊗ be the composition and product, respectively. By abuse of notation, given k circuits $C_1, \ldots, C_k$, $\bigcirc_{i=1}^{k} C_i$ will denote the circuit $\overline{C}_1 \circ \cdots \circ \overline{C}_k$, where each circuit $\overline{C}_i$ is obtained by tensoring $C_i$ appropriately with identities so that the output of $C_i$ matches the input of $C_{i+1}$. By construction, a circuit is acyclic. Each circuit $C_n$ can be indexed by its number n ∈ N of input wires (i.e., non-ancilla qubits) and computes a function $C_n \in \mathcal{H}_{2^n} \to \mathcal{H}_{2^n}$. To deal with functions in H → H, we consider families of circuits $(C_n)_{n \in \mathbb{N}}$, that are sequences of circuits such that each $C_n$ encodes computation on quantum states of length n. Hence each circuit has n input qubits plus some extra ancilla qubits. These ancillas can be used to perform intermediate computations but also to represent functions whose output size is strictly greater than their input size. To avoid the consideration of families encoding undecidable properties, we put a uniformity restriction.

Definition 7. A family of circuits $(C_n)_{n \in \mathbb{N}}$ is said to be uniform if there exists a polynomial time Turing machine that takes n as input and outputs a representation of $C_n$, for all n ∈ N.
In quantifying the complexity of a circuit, it is necessary to specify the considered elementary gates, and define the complexity of an operation as the number of elementary gates needed to perform it. In our setting, we consider the following set of universal elementary gates $\{R_Y(\pi/4), P(\pi/4), CNOT\}$. The size #C of a circuit C is equal to the number of its gates and wires.

Definition 8. A family of circuits $(C_n)_{n \in \mathbb{N}}$ is said to be polynomially-sized with α ∈ N → N ancilla qubits if there exists a polynomial $P \in \mathbb{N}[X]$ such that, for each n ∈ N, $\#C_n \le P(n)$ and the number of ancilla qubits in $C_n$ is exactly α(n).
Theorem 1 (Adapted from [18] and [12]). A function $f : \{0, 1\}^\star \to \{0, 1\}^\star$ is in fbqp iff there exists a uniform polynomially-sized family of circuits

In Theorem 1, the function $\chi_{\alpha(|x|)}$ pads the input with ancillas in state $|0\rangle$ to match the circuit dimension. The function $\xi_{|f(x)|}$ projects the output of the circuit to match the length of the function output $|f(x)|$. Hence, for $|x\rangle \in$

Compilation to circuits. For each pfoq program P, the existence of a polynomially-sized uniform family of circuits $(C_n)_{n \in \mathbb{N}}$ that computes P is entailed by the combination of Lemma 2 and Theorem 1. However, due to the complex machinery of QTMs, the constructions of both proofs cannot be used in practice to generate a circuit. In this section, we exhibit an algorithm that directly compiles a pfoq program to a polynomially-sized circuit. Note that this compilation process requires some care since recursive procedure calls in quantum cases may yield an exponential number of calls. The remainder of this section is devoted to presenting an algorithm, named compile, which, for a given pfoq program P and a given integer n, produces a circuit. The compile algorithm uses two subroutines, named compr and optimize, and is defined by compile(P, n) ≜ compr(P, [1, ..., n], ⋅).
The subroutine compr (Algorithm 1) generates the circuit inductively on the program statement. It takes as inputs: a program P, a list of qubit pointers l, and a control structure cs. A control structure cs is a partial function in N → {0, 1}, mapping a qubit pointer to a control value (of a quantum case). Let ⋅ be the control structure of empty domain. For n ∈ N and k ∈ {0, 1}, cs[n ∶= k] is the control structure obtained from cs by setting cs(n) ≜ k. For a given $x \in \{0, 1\}^\star$, we say that state $|x\rangle$ satisfies cs if, ∀n ∈ dom(cs), $cs(n) = k \Rightarrow |\langle k|_n |x\rangle|^2 = 1$. Two control structures cs and cs′ are orthogonal if there does not exist a state $|x\rangle$ that satisfies cs and cs′. Note that if ∃i ∈ dom(cs) ∩ dom(cs′), cs(i) + cs′(i) = 1 then cs and cs′ are orthogonal.
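Control structures and the orthogonality test can be sketched with dictionaries. In the fragment below, which is only an illustration with hypothetical names, a control structure is a `dict` from 1-based qubit pointers to control bits, basis states are bitstrings, and the sufficient condition for orthogonality stated above is compared against a brute-force search over basis states of a given width.

```python
from itertools import product
from typing import Dict

ControlStructure = Dict[int, int]


def extend(cs: ControlStructure, n: int, k: int) -> ControlStructure:
    """cs[n := k]: copy of cs with qubit pointer n controlled on value k."""
    new = dict(cs)
    new[n] = k
    return new


def satisfies(x: str, cs: ControlStructure) -> bool:
    """Basis state |x> satisfies cs if every controlled qubit has its value."""
    return all(x[n - 1] == str(k) for n, k in cs.items())


def conflict(cs1: ControlStructure, cs2: ControlStructure) -> bool:
    """Sufficient condition: a shared pointer with opposite control values."""
    return any(cs1[i] + cs2[i] == 1 for i in cs1.keys() & cs2.keys())


def orthogonal_bruteforce(cs1, cs2, width: int) -> bool:
    """No basis state of `width` qubits satisfies both control structures."""
    states = ("".join(bits) for bits in product("01", repeat=width))
    return not any(satisfies(x, cs1) and satisfies(x, cs2) for x in states)
```

The two branches of a quantum case on qubit 1 give `extend({}, 1, 0)` and `extend({}, 1, 1)`, which conflict and are therefore orthogonal, whereas control structures on disjoint pointers are not.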
Given a control structure cs and a statement S, a controlled statement is a pair (cs, S) ∈ Cst ≜ (N → {0, 1}) × Statements. Intuitively, a controlled statement (cs, S) denotes a statement controlled by the qubits whose indices are in dom(cs). For a unitary gate $U \in \mathcal{H}_{2^n} \to \mathcal{H}_{2^n}$, a control structure cs, and a list of pointers $l = [x_1, \ldots, x_n] \in L(\mathbb{N})$ such that $\{x_1, \ldots, x_n\} \cap dom(cs) = \emptyset$, U(cs, l) denotes the circuit applying gate U on qubits $q[x_1], \ldots, q[x_n]$, whenever ∀m ∈ dom(cs), q[m] is in state $|cs(m)\rangle$. As demonstrated in [12], this circuit can be built with O(card(dom(cs))) elementary gates and ancillas, and a single controlled-U gate.
Algorithm 1 (compr). Input: (P, l, cs) ∈ Programs × L(N) × (N → {0, 1})

Similarly, we can define a generalized Toffoli gate as a circuit of the shape NOT(cs, n). Since card(dom(cs)) will not scale with the size of the input, such a circuit has a constant cost in gates and ancillas and can thus be considered as an elementary gate. We will also be interested in rearranging wires under a given control structure. For two lists of qubit pointers as the circuit that swaps the wires in $l_1$ with wires in $l_2$, controlled on cs. This circuit needs in the worst case one ancilla and O(n) controlled SWAP gates (also known as Fredkin gates).
Let D ≜ D(Procedures × Z × N → N × L(N)) denote the set of dictionaries mapping keys of the shape (proc, i, j) to pairs of the shape (a, l), where i is the value of a classical parameter, j is the size of a sorted set, and a is a qubit index. We will denote the empty dictionary by {}. Let also a ← new ancilla() be an instruction that sets a to a fresh qubit index.
The subroutine optimize (Algorithm 2) treats the complex cases where circuit optimizations (merging) are needed, that is, recursive procedure calls. It takes as input a sequence of procedure declarations D, a list of controlled statements $l_{Cst}$, a procedure name proc, a list of qubit pointers l, and a dictionary Anc. The subroutine iterates on the list $l_{Cst}$ of controlled statements, indicating the statements left to be treated together with their control qubits. When recursive procedure calls appear in distinct branches of a quantum case, the algorithm merges these calls together. For that purpose, it uses new ancilla qubits as control qubits. Consider a procedure call of shape call proc[i](s); with respect to a given list l ∈ L(N), such that $(i, l) \Downarrow_{\mathbb{Z}} i$, $(s, l) \Downarrow_{L(\mathbb{N})} l'$, and $(|s|, l) \Downarrow_{\mathbb{N}} j$. If the key (proc, i, j) already exists in the dictionary Anc, the associated ancilla is re-used; otherwise, Anc[proc, i, j] is set to (a, l′).
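The ancilla bookkeeping described above amounts to a get-or-create lookup keyed by (proc, i, j). The sketch below is a minimal illustration and not the paper's Algorithm 2: the class `AncillaDict`, its method names, and the choice of allocating fresh indexes from a counter are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


class AncillaDict:
    """Dictionary Anc mapping (proc, i, j) to (ancilla index, pointer list)."""

    def __init__(self, first_free: int):
        self.table: Dict[Tuple[str, int, int], Tuple[int, List[int]]] = {}
        self.next_index = first_free  # next unused qubit index (assumed scheme)

    def new_ancilla(self) -> int:
        """Return a fresh qubit index."""
        a = self.next_index
        self.next_index += 1
        return a

    def lookup_or_create(self, proc: str, i: int, j: int,
                         pointers: List[int]) -> Tuple[int, bool]:
        """Return (ancilla, reused) for a call with key (proc, i, j)."""
        key = (proc, i, j)
        if key in self.table:
            return self.table[key][0], True   # merge with the earlier call
        a = self.new_ancilla()
        self.table[key] = (a, list(pointers))
        return a, False
```

Two calls to the same procedure with the same classical parameter and the same sorted set size share one ancilla, even if they act on different pointer lists, while a call with a different key gets a fresh ancilla. This is the bookkeeping that lets calls from distinct quantum case branches be merged.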
Some extra ancillas e are also created for swapping wires and are not explicitly indexed since they are not revisited by the subroutine, and are just considered unique.Ancillas a and e are indexed and treated as input qubits, therefore they can be part of the domain of control structures.
Example 6. compile(QFT, n) outputs the circuit provided in Example 1. Notice that there is no extra ancilla as no procedure call appears in the branch of a quantum case.
Polynomial-size circuits. We show Theorem 2 by exhibiting that any exponential growth of the circuit can be avoided by the compile algorithm using an argument based on orthogonal control structures. With a linear number of gates and a constant number of extra ancillas, we can merge calls referring to the same procedure, on different branches of a quantum case, when they are applied to sorted sets of equal size. An example of the construction is given in Figure 5 where two instances of a gate U are merged into one using SWAP gates and gates controlled by orthogonal control structures.

Fig. 5: Example of optimization
The following proposition shows that multiple uses of a gate can be merged in one provided they are applied to orthogonal control structures.
Lemma 4. For any circuit $C_n \triangleq \bigcirc_{i=1}^{k} U(cs_i, l_i)$, with a unitary gate U, pairwise orthogonal $cs_1, \ldots, cs_k \in Cst$, and $l_1, \ldots, l_k \in L(\mathbb{N})$, there exists a circuit C using one controlled gate U, O(kn) gates, and O(k) ancillas, and such that $C = C_n$.

Now we show that orthogonality is an invariant property of compile.

Lemma 5. Orthogonality is an invariant property of the control structures in $l_{Cst}$ of the subroutine optimize. In other words, for any two distinct pairs (cs, S), (cs′, S′) in $l_{Cst}$, cs and cs′ are orthogonal.

Theorem 7. For any program P in pfoq, compile(P, n) runs in time $O(n^{2|P|+1})$.

As there is no circuit duplication in the assignments of compile, we can deduce from Theorem 7 that the compiled circuit is of polynomial size.
Corollary 1. For any program P in pfoq, there exists a polynomial $Q \in \mathbb{N}[X]$ such that $\#compile(P, n) \le Q(n)$ for all n ∈ N.

The rank of program P is defined by $rk(P) \triangleq \max_{proc \in P} rk(proc)$.
We show the result by induction on the rank of procedure calls in S. Take call proc[i](s); ∈ S. If rk(proc) = 0 and the procedure is not recursive then there is only 1 call to a procedure. If the procedure is recursive, it can be called at most once in each branch of a quantum case statement. Hence there can be at most n + 1 such calls in the full quantum case branch of the derivation tree and it holds that level D∶∶call proc

Induction hypothesis: assume that any procedure proc′ such that rk(proc ). Consider a procedure proc such that rk(proc) = k + 1. If the procedure is not recursive then it can call a constant number (bounded by the size of the program) of procedures of strictly smaller rank. By induction hypothesis, If the procedure is recursive, it can be called at most once in each branch of a quantum case statement. Hence there can be at most n + 1 such calls in the full quantum case branch of the derivation tree. Moreover, each of these calls can perform a constant number of calls to procedures of strictly smaller rank. Consequently, We conclude by observing that for a program $P = D ∶∶ S_1 \ldots S_k$, it holds that level

Proof. Consider a terminating foq program P = D ∶∶ S. We build a 4-tape QTM M computing P inductively on the statement S. Fix Σ ≜ {0, 1, #, , &}, where # is the blank symbol and where and & are special separation symbols for encoding stacks. The input tape $t_{in}$ of M contains a word in $\{0, 1, \#\}^n$ encoding the quantum state. The 3 working tapes are $t_{call}$, $t_l$, and $t_K$ for storing the integer values of a procedure call, the list of qubit pointers, and intermediate classical computations, respectively, as words in Σ*. The configurations of M will be in $Q \times (\Sigma^*)^4 \times \mathbb{Z}^4$, for some finite set of states Q. In particular, the initial configuration is ($s_0$, w, ε, ε, ε, 0, 0, 0, 0), with $w \in \{0, 1\}^n$ encoding a quantum state of length n ∈ N; the tapes $t_{call}$, $t_l$, and $t_K$ are initially empty (ε). The tape heads all start on the first cells indexed by 0.
For m ∈ Z, let t(m) denote the symbol at position m on tape t. Given a word w ∈ Σ* and a tape t, tw denotes that the content of t ends with the word w. By abuse of notation, let e denote the result of evaluating the expression e with respect to the machine's current configuration. Also, we assume that deterministic computations, such as taking tape t_i and appending f(i), for any function f, are done by a reversible Turing machine [2], as reversible TMs are well-formed QTMs [3, Theorem 4.2].
We now describe a QTM M simulating P inductively on the statement S. The skip; statement is trivial. If S = q *= U^f(j);, M appends q to t_K. As the program terminates, t_in(q) ≠ # and the transition function is set to: ∀a ∈ {0, 1}, δ(s_S, t_in(q), s_next(S), a, N) ≜ ⟨t_in(q)| U^f(j) |a⟩, where s_S is the state before executing the assignment, when the head of the input tape has been moved to position q, and s_next(S) is the state just after executing the assignment. Finally, the machine erases q at the end of t_K, leaves its head in the last non-blank cell of t_K, and moves the head in t_in back to the initial cell. Program P has level_P(n) = 0 and the simulating machine runs in time O(n).
For the remaining statements, assume by induction hypothesis the existence of two conservative QTMs M_1 and M_2 that compute functions P_1 and P_2, respectively, with P_1 ≜ D ∶∶ S_1 and P_2 ≜ D ∶∶ S_2. By induction hypothesis, M_1 and M_2 run in time O(n + n × level_{P_i}(n)). States of M_i will be denoted by s^i, for i ∈ {1, 2}. By using internal clocks, we can assume without loss of generality that each machine M_i halts in exactly the same time for any quantum input of length n.
Consider the case S = S_1 S_2. Machine M is defined as in [3, Dovetailing Lemma], with the initial state s_0 ≜ s^1_0, its final state s_⊺ ≜ s^2_⊺, and the two machines are composed by setting s^1_⊺ = s^2_0. The machine M is stationary and well-formed, and it is well-behaved since the running time of M_2 only depends on n and the output of M_1 contains a superposition of equally sized quantum states. M computes P in time O(n
For the conditional S = if b then S_1 else S_2, we build a machine M that concatenates b on the working tape t_K and runs M_1 or M_2 deterministically depending on the value of b, using the [3, Branching Lemma]. Then we erase b from the end of tape t_K. M computes P in time max
For the quantum case S = qcase q of {0 → S_1, 1 → S_2}, the machine appends q on tape t_K. It reads t_in(q), sets it to #, and runs M_1 if it reads 0 and M_2 if it reads 1. Finally, from state s^1_⊺, it writes 0 in t_in(q), moves the head to index 0, and transitions to s_⊺; similarly, from state s^2_⊺, the machine writes 1 in t_in(q) before moving the head and transitioning to s_⊺. We have that M computes P in time max
For the procedure call S = call proc[i](s);, inductively define machine M as follows: update t_call by appending i, and update t_l by adding the qubit pointer indices excluded in s, separating them using &. We then run machine M_proc that computes the function P_proc, for P_proc ≜ D ∶∶ S^proc{i/x, s/q}, in time O(m + m × level_{P_proc}(m)), with m ≜ |s| ≤ n, afterwards erasing i and the new indices of t_in. As level_{P_proc}(n) = O(level_P(n)), the complexity of M is O(n + n × level_P(n)). This concludes the proof.
⊓⊔
Theorem 5. Let F be a function in ◻ QP 1. Then there exists a pfoq program P such that P = F.
Proof. We prove this result by structural induction on a function in the ◻ QP 1 algebra. The basic initial function I can be simulated by P(q) = ε ∶∶ skip;. F ∈ {Ph_θ, ROT_θ, NOT} can be simulated using an assignment. In these cases P(q) = ε ∶∶ q[1] *= U^f(0); with f such that U^f(0) = F. The basic initial function SWAP can be simulated by the program P(q) = ε ∶∶ SWAP(q[1], q[2]), with the SWAP statement defined in Section 2.
We now simulate the Comp, Branch and kQRec_t schemes. For that purpose, assume the existence of pfoq programs P_F(q) = D_F ∶∶ S_F, P_G(q) = D_G ∶∶ S_G, and P_H(q) = D_H ∶∶ S_H simulating the ◻ QP 1 functions F, G, and H, respectively. For simplicity, we assume that there are no name clashes between the procedures declared in D_F, D_G, and D_H.
Comp[F, G] can be simulated by P(q) = D_F, D_G ∶∶ S_G S_F. Moreover, P ∈ pfoq by construction.
Branch[F, G] can be simulated by
Procedures proc_F and proc_G are not recursive and therefore P ∈ pfoq.
Proof. We describe the circuit that achieves the result and then prove its correctness. Let a_i, for i ∈ {1, ..., k}, be ancillas in the zero state.
and C_2 defined from C_1 by reversing the order of the gates. The total number of ancillas is k. Circuits C_1 and C_2 contain a constant number of controlled NOT gates and, in the worst case, O(kn) controlled SWAP gates (depending on the overlap between l_k and each l_i). Consider a computational basis state |x⟩, for x ∈ {0, 1}^n. Since all control structures cs_i are pairwise orthogonal, after the two compositions of NOT circuits in C_1, |x⟩ satisfies ⋅[a_i ∶= 1] iff it satisfies cs_i, for i ∈ {1, ..., k − 1}. Moreover, |x⟩ satisfies ⋅[a_k ∶= 1] iff there exists i such that it satisfies cs_i. Therefore, each SWAP sends wires in l_i into l_k iff |x⟩ satisfies cs_i, and then U is applied on l_k if |x⟩ satisfies one of the control structures cs_i. As circuit C_2 reverts the actions of C_1, afterwards the qubits are in the right positions, all ancillas are set to zero, and the result on non-ancillary qubits is the same as in C n.
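The role of orthogonality in this argument can be checked on a small case. The following Python sketch is only an illustration under stated assumptions: a control structure is modelled as a dictionary from qubit index to required bit, two control structures are taken to be orthogonal when they disagree on a shared qubit, and the names `satisfies`, `orthogonal`, `ancilla_after_nots` and `check_merge` are hypothetical helpers, not part of foq. It verifies that a sequence of controlled NOT gates on one ancilla computes the disjunction of the controls when they are pairwise orthogonal.

```python
from itertools import product

def satisfies(x, cs):
    """True iff basis state x (a tuple of bits) meets every constraint in cs.
    Assumed encoding: cs is a dict {qubit index: required bit}."""
    return all(x[q] == b for q, b in cs.items())

def orthogonal(cs1, cs2):
    """Assumed definition: cs1 and cs2 disagree on at least one shared qubit."""
    return any(q in cs2 and cs2[q] != b for q, b in cs1.items())

def ancilla_after_nots(x, controls):
    """Apply NOT(cs, a) for each cs in turn to an ancilla a starting at 0."""
    a = 0
    for cs in controls:
        if satisfies(x, cs):
            a ^= 1  # the controlled NOT fires
    return a

def check_merge(controls, n):
    """True iff, on every basis state of n qubits, the ancilla ends at 1
    exactly when some control structure is satisfied."""
    for x in product((0, 1), repeat=n):
        expected = int(any(satisfies(x, cs) for cs in controls))
        if ancilla_after_nots(x, controls) != expected:
            return False
    return True

# Three pairwise orthogonal control structures on 3 qubits.
controls = [{0: 0}, {0: 1, 1: 0}, {0: 1, 1: 1, 2: 1}]
print(check_merge(controls, 3))
```

With overlapping controls such as `[{0: 0}, {1: 0}]`, two NOT gates can fire on the same basis state and cancel, so the check fails; this is the situation that pairwise orthogonality rules out.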
⊓⊔
Lemma 5. Orthogonality is an invariant property of the control structures in l_Cst of the subroutine optimize. In other words, for any two distinct pairs (cs, S), (cs′, S′) in l_Cst, cs and cs′ are orthogonal.
Proof. The proof is done by induction on a procedure statement, doing a case analysis on the rules of optimize. The initial value of l_Cst is of the form [(cs, S^proc)], which trivially satisfies the condition. Consider the possible cases for a pair (cs, S) in l_Cst. By induction hypothesis, assume that cs is orthogonal to all other control structures. For composition, quantum case and classical if, the replacing pair is either of the shape (cs, S′), or it is two pairs with cs[n ∶= 0] and cs[n ∶= 1]; therefore we conclude that the resulting l_Cst has pairwise orthogonal controls.
For a procedure call, we consider two cases. The first case occurs when a control statement (⋅[a ∶= 1], S^proc) is added to the list l_Cst and a is an ancilla controlled by cs. Since the original pair is removed, this construction also satisfies the invariant property. In the second case, there already exists an ancilla a assigned to the triple (proc, i, j). Let us argue by induction on the number of previous procedure call instances for this ancilla. Let cs_1, ..., cs_k be control structures such that a computational basis state |x⟩ satisfies ⋅[a ∶= 1] iff it satisfies one of the cs_i, which we assume by induction hypothesis are pairwise orthogonal. By the definition of pfoq, cs will be orthogonal to each of the cs_i, and so after gate NOT(cs, a), a state |x⟩ will satisfy ⋅[a ∶= 1] iff it satisfies any of cs, cs_1, ..., cs_k. Since cs was orthogonal to all other control structures appearing in l_Cst, and the original pair is removed, the resulting l_Cst will have pairwise orthogonal controls.
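The quantum case step of this invariant, where a pair with control cs is replaced by two pairs with cs[n ∶= 0] and cs[n ∶= 1], can be illustrated with a short Python sketch. The encoding is an assumption made for illustration only: control structures are dictionaries from qubit index to bit, statements are omitted, and `split_on_qubit` and `pairwise_orthogonal` are hypothetical names rather than rules of optimize.

```python
def orthogonal(cs1, cs2):
    """Assumed definition: the two controls disagree on a shared qubit."""
    return any(q in cs2 and cs2[q] != b for q, b in cs1.items())

def pairwise_orthogonal(controls):
    """True iff every two distinct entries of the list are orthogonal."""
    return all(orthogonal(c, d)
               for i, c in enumerate(controls)
               for d in controls[i + 1:])

def split_on_qubit(controls, index, qubit):
    """Replace controls[index] = cs by cs[qubit := 0] and cs[qubit := 1].
    Assumes the qubit is not already constrained by cs."""
    cs = controls[index]
    if qubit in cs:
        raise ValueError("qubit already controlled")
    return (controls[:index]
            + [{**cs, qubit: 0}, {**cs, qubit: 1}]
            + controls[index + 1:])

# Start from a single, unconstrained control and split twice.
l_cst = [{}]
l_cst = split_on_qubit(l_cst, 0, 0)   # [{0: 0}, {0: 1}]
l_cst = split_on_qubit(l_cst, 1, 1)   # split the branch where qubit 0 is 1
print(l_cst, pairwise_orthogonal(l_cst))
```

The two new entries differ on the split qubit, and each inherits from cs every disagreement that cs had with the other entries, which is why the property is preserved.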
Proof. We first show that an execution of the optimize subroutine performs at most O(n^2) calls to compr. We use a simple counting argument. The dictionary Anc ensures that ancillas related to the same (proc′, i, j) ∈ Procedures × Z × {1, ..., n} will only be created once. The classical parameter can either be updated to a new constant or by adding (or subtracting) a constant, and this can be done at most O(n) times as procedure calls also remove at least an element
In the specific case of procedure proc, there is no need to move around qubits while merging, which simplifies the construction and allows the circuit to scale with O(|q|). We represent each ancilla created for proc on input length i as a_proc,i (note that there is no integer parameter in this case, so we omit it here in the index of ancillas). Since the only non-recursive cases are for input length i = 1, 2, those are the only ancillas with controlled gate U.
We can see graphically how the circuit is built linearly. The directed graph below shows the creation of each ancilla a_proc,i, where each node is an ancilla and each edge indicates a (generalized) Toffoli gate, numbered exactly as in the circuit. We further add a tag ○ or •• to differentiate between the two types of Toffoli gates, which represent the two recursive branches: ○ for the vertical and diagonal edges, where the first qubit is in state 0, and •• for the horizontal edges, which refer to the case where the first two qubits are both in state 1.
[Figure: directed graph over the ancillas a_proc,6, a_proc,5, a_proc,4, a_proc,3, a_proc,2.]
Nodes have as incoming edges the branches that reach the same quantum input length and are merged in the same ancilla. If we were to represent larger input lengths, we would obtain a continuation of this directed graph to the right, following the same pattern.

C Example: quantum teleportation
Quantum teleportation is a technique that allows for the transportation of qubits between distant agents, Alice and Bob. If Alice and Bob share n EPR states, they can teleport any n-qubit state between their labs.
In our case, we consider an input state of length n, and we extend it with qubits |0⟩^{⊗2n}. The following is a foq program where we use the 2n zero state qubits to create a Bell state and then perform the teleportation for each qubit, such that the i-th qubit of the initial state appears in position 3n − 2(i − 1) of the final state for 1 ≤ i ≤ n. It can be easily shown that the program is in pfoq. We apply the principle of deferred measurement to avoid performing any measurements until the end, and therefore use typical controlled NOT and Z gates as opposed to the same gates controlled on results of measurements. The following is an example of the circuit compilation for the teleportation of a state of length n = 3.
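For a single qubit (n = 1), the deferred-measurement idea can be checked with a small state-vector simulation. The sketch below is plain Python rather than foq, and it assumes the textbook teleportation gate order (Hadamard and controlled NOT to build the Bell pair, then controlled NOT, Hadamard, and the controlled NOT and controlled Z corrections); the helper names `apply_gate` and `teleport` are hypothetical. Qubit 1 holds the input and qubit 3 receives it, which matches position 3n − 2(i − 1) = 3 for n = i = 1.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def apply_gate(state, gate, target, n, control=None):
    """Apply a 2x2 gate to qubit `target` (1-indexed, qubit 1 is the most
    significant bit), optionally controlled on qubit `control` being 1."""
    new = list(state)
    tmask = 1 << (n - target)
    cmask = 0 if control is None else 1 << (n - control)
    for idx in range(len(state)):
        if idx & tmask:
            continue  # each amplitude pair is handled from its target=0 index
        if (idx & cmask) != cmask:
            continue  # control qubit is 0: amplitudes stay unchanged
        a0, a1 = state[idx], state[idx | tmask]
        new[idx] = gate[0][0] * a0 + gate[0][1] * a1
        new[idx | tmask] = gate[1][0] * a0 + gate[1][1] * a1
    return new

def teleport(alpha, beta):
    """Teleport alpha|0> + beta|1> from qubit 1 to qubit 3 without measuring."""
    n = 3
    state = [0j] * 8
    state[0b000] = alpha
    state[0b100] = beta
    state = apply_gate(state, H, 2, n)             # Bell pair on qubits 2, 3
    state = apply_gate(state, X, 3, n, control=2)
    state = apply_gate(state, X, 2, n, control=1)  # Bell-basis change
    state = apply_gate(state, H, 1, n)
    state = apply_gate(state, X, 3, n, control=2)  # deferred X correction
    state = apply_gate(state, Z, 3, n, control=1)  # deferred Z correction
    return state

final = teleport(0.6, 0.8j)
for idx, amp in enumerate(final):
    print(format(idx, "03b"), amp)
```

In the final state, qubits 1 and 2 are in a uniform superposition and factor out, so each amplitude is one half of the corresponding input amplitude on qubit 3.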

Definition 4. Given two functions f ∶ {0, 1}⋆ → {0, 1}⋆, F ∶ H → H, and a value p ∈ [0, 1], we say that f is computed by F with probability p if ∀x ∈ {0, 1}⋆, |⟨f(x)|F(|x⟩)|^2 ≥ p.
The class fbqp is the functional extension of the complexity class bqp.
Definition 5 ([3]). A function f ∈ {0, 1}⋆ → {0, 1}⋆ is in fbqp iff there exist a QTM M and a polynomial P ∈ N[X] s.t. M computes f in time P with probability 2/3.
A function f ∈ {0, 1}⋆ → {0, 1}⋆ has a polynomial bound P ∈ N[X] if ∀n ∈ N, ∀x ∈ {0, 1}^n, ∃k ≤ P(n), f(x) ∈ {0, 1}^k. Functions in fbqp have a polynomial bound as the size of their output is smaller than the polynomial time bound.
Soundness. We show that QTMs can simulate the function computed by any terminating foq program. The time complexity of this simulation depends on the length of the input quantum state and on the level of the considered program.
Lemma 3. For any terminating foq program P, there exists a conservative QTM M that computes P in time O(n + n × level_P(n)).
Now we show that any pfoq program computes a fbqp function.
Theorem 3. Given a pfoq program P, a function f ∶ {0, 1}⋆ → {0, 1}⋆, and a value p ∈ (1/2, 1], if f is computed by P with probability p then f ∈ fbqp.
Proof. Using Lemma 2 and Lemma 3.