Abstract
Quantum systems provide a new way of conducting computations based on so-called qubits. Due to the potential for significant speed-ups, this field has received significant research attention in recent years. The Clifford+T library is a very promising and popular gate library for these kinds of computations. Unlike other libraries considered so far, it consists of only a small number of gates, for all of which robust, fault-tolerant realizations are known for many technologies that seem to be promising for large-scale quantum computing. As a consequence, (logic) synthesis of Clifford+T quantum circuits has become an important research problem. However, previous work in this area has several drawbacks: Corresponding approaches are either only applicable to very small quantum systems or lead to circuits that are far from optimal. The latter is mainly caused by the fact that current synthesis realizes the desired circuit by a local, i.e., column-wise, consideration of the underlying unitary transformation matrix to be synthesized. In this paper, we analyze the conceptual drawbacks of this approach and propose to overcome them by taking a global view of the matrices and performing a separation of concerns regarding individual synthesis steps. We precisely describe a corresponding algorithm as well as its efficient implementation on top of decision diagrams. Experimental results confirm the resulting benefits and show improvements of up to several orders of magnitude in costs compared to previous work.
1 Introduction
Quantum computation is an emerging technology where operations are performed on quantum bits (qubits) rather than conventional bits. In contrast to conventional bits, qubits are not limited to a discrete set of states (Boolean 0 and 1), but, exploiting quantum-physical effects, also allow for arbitrary superpositions of these Boolean basis states. This can be utilized to gain significant speed-ups for several interesting and practically relevant problems. Prominent examples include factorization, database search, or the simulation of chemical dynamics, for which corresponding quantum algorithms have been proposed, e.g., in [14, 35], and [18], respectively.
To this end, complex quantum computations are usually described in terms of a cascade of simple quantum operations (quantum gates) which eventually form a quantum circuit [25]. In contrast to conventional circuits, quantum circuits are not supposed to describe a network of wires and physical gates to be connected, but basically define in which order the quantum gates/operations are applied to the qubits.
While in theory there exists an infinite number of quantum gates (even for the case of one-qubit gates), practical quantum circuits need to be composed of a subset of elementary gates which can be physically realized in a fault-tolerant fashion in the considered quantum technology. The latter is required in order to increase the robustness of the operations, as all technologies considered to date are very sensitive to environmental perturbations and, thus, require sophisticated mechanisms for error correction. This motivates the study of how desired quantum functionality is realized in terms of elementary quantum gates, a problem known as quantum (logic) synthesis [16, 17, 22, 31, 33].
In this context, the Clifford+T gate library [4] has received significant attention in the recent past by providing a set of elementary gates that is universal (i.e., any quantum functionality can be realized with it up to an arbitrarily small error \(\epsilon \)) and consists only of a small number of gates, all of which are very well compatible with many established error correction schemes and can be physically implemented in all quantum technologies that seem promising for large-scale quantum computations.
Initial work on quantum logic synthesis utilized a two-stage scheme in which the desired quantum functionality is first synthesized into a quantum circuit composed of arbitrary one-qubit gates and so-called controlled NOT gates (using solutions as, e.g., proposed in [2, 24, 24, 31, 33, 41]). Afterwards, all one-qubit gates are individually compiled to the targeted gate library (using solutions as, e.g., proposed in [3, 9, 22] or, specifically targeting the Clifford+T library, in [19, 20, 21, 37]). Since controlled NOT gates themselves are contained in the Clifford+T library, this yields the desired circuit description. However, following this approach has the main drawback that, despite the fact that it requires expensive decompositions and yields resulting quantum circuits with a tremendous number of gates, only an approximation of the desired quantum functionality is derived. In fact, the decompositions introduce numerical errors (through rounding) that are later on amplified during the compilation of one-qubit gates to Clifford+T. Hence, only an approximate realization is determined even if the initially given quantum functionality would allow for an exact one.
In order to overcome this drawback, researchers considered exact synthesis of Clifford+T quantum circuits in [1, 10, 12, 13, 21, 30], i.e., the desired quantum functionality is realized without any rounding errors rather than being approximated with respect to an \(\epsilon \). However, while the approach in [21] is restricted to 1-qubit operations, the approaches in [1] and [10, 13] perform an exhaustive search in the space of all Clifford+T circuits up to a given depth and are, thus, practically limited. Accordingly, these works only report results for circuits with up to three or four qubits (although the approach in [10] is based on a highly parallel implementation and was run on a supercomputer using 8192 cores). In contrast, the approach by Giles and Selinger [12] was the first to provide a generic, constructive synthesis algorithm based on algebraic properties of exactly synthesizable transformation matrices. However, while guaranteeing exactness, the respective scheme still bears lots of potential for improvement.
In fact, the solution proposed by Giles and Selinger [12] synthesizes the given unitary transformation matrix (representing the desired quantum functionality) in a local fashion, i.e., column by column. This is disadvantageous since the manipulation of a single column has an effect on all other columns, frequently making the remaining columns harder to synthesize. Furthermore, the column-wise consideration yields rather expensive circuits, since often several columns could be considered much more efficiently in one step than in several individual steps (both issues are explained and illustrated in more detail later in Sect. 3). Overall, this leaves significant room for improvement. This is also confirmed by the authors of [12] stating that the resulting circuits are “far from optimal”.
In this work, we explicitly tackle these shortcomings. To this end, we first discuss the previously proposed synthesis approach and its shortcomings. Afterwards, we propose a global consideration of the unitary matrix to be synthesized which is not restricted to a single column during a particular synthesis step, but always keeps track of the entire matrix. This eventually yields an alternative and improved synthesis algorithm which allows the desired quantum functionality to be realized at significantly lower cost than before. Experimental evaluations confirm these benefits and show improvements of up to several orders of magnitude compared to previous work. However, due to the exponential complexity of representing and manipulating the desired quantum functionality, an implementation on top of straightforward matrix representations quickly becomes infeasible (as confirmed by the experimental evaluation performed in [28]). In order to cope with this issue, we employ a dedicated data structure for the compact representation and efficient manipulation of quantum functionality, namely the Quantum Multiple-valued Decision Diagram (QMDD, [29]). This improves the applicability of the proposed approach significantly, especially by exploiting certain QMDD characteristics, which is also confirmed by experimental evaluations.
The remainder of this work is structured as follows: The next section reviews some background on quantum computation as well as the considered Clifford+T library. Section 3 discusses and illustrates previously conducted synthesis of Clifford+T circuits, including an analysis of the corresponding shortcomings and their potential for improvement. This leads to an improved synthesis algorithm which is described in detail in Sect. 4. The implementation of the algorithm on top of the QMDD data structure is outlined in Sect. 5. Finally, Sect. 6 summarizes the results of the conducted experimental evaluation before the paper is concluded in Sect. 7.
2 Background and terminology
This section briefly reviews the basics of quantum computation and circuits as well as the Clifford+T gate library as required for the discussion in this paper. For a more detailed introduction to the field, we refer to [25].
2.1 Quantum computation and circuits
Quantum computations are performed on systems of qubits. Analogously to conventional bits, a qubit has two stable basis states commonly denoted as \(|0\rangle \) and \(|1\rangle \). Additionally taking into account the modeling of quantum-mechanical phenomena, qubits can also assume an arbitrary superposition \(\alpha |0\rangle + \beta |1\rangle \) for complex-valued \(\alpha , \beta \) with \(|\alpha |^2 + |\beta |^2 = 1\). Accordingly, an n-qubit quantum system can be in one of \(2^n\) basis states (\(|0\ldots 00\rangle , |0\ldots 01\rangle ,\ldots ,|1\ldots 11\rangle \)) or a superposition of these states. The state of such a quantum system is represented by a unit vector of dimension \(2^n\) (denoted as state vector). The effect of applying a quantum operation is obtained by multiplying the state vector with a corresponding unitary transformation matrix, i.e., an invertible complex-valued matrix whose inverse is given by the adjoint matrix. In other words, all quantum operations are linear operations on the state space where the k-th column of the matrix describes the image of the k-th basis state.
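The two statements above (unitarity via the adjoint, and columns as images of basis states) can be checked numerically. The following is a minimal sketch in plain Python; the helper names `apply` and `is_unitary` are illustrative choices and do not stem from the paper.

```python
import math

def apply(matrix, state):
    """Multiply a square matrix (given as a list of rows) with a state vector."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]

def is_unitary(matrix, tol=1e-12):
    """Check that the adjoint (conjugate transpose) is the inverse: U^dagger U = I."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            # Entry (i, j) of U^dagger U is the inner product of columns i and j.
            entry = sum(complex(matrix[k][i]).conjugate() * matrix[k][j] for k in range(n))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]          # Hadamard matrix
ket0, ket1 = [1, 0], [0, 1]    # basis states |0> and |1>

image0 = apply(H, ket0)        # image of |0>, i.e., the first column of H
image1 = apply(H, ket1)        # image of |1>, i.e., the second column of H
norm0 = sum(abs(x) ** 2 for x in image0)   # squared norm of the resulting state
```

The image of each basis state is again a unit vector, which is what unitarity guarantees for every state.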
Example 1
Commonly used quantum operations include

the Hadamard operation H (setting a qubit into a balanced superposition, e.g., mapping \(|0\rangle \) to \(\frac{1}{\sqrt{2}} (|0\rangle +|1\rangle )\)),

the NOT operation X (flipping the basis states \(|0\rangle \), \(|1\rangle \)),

as well as the phase shift operations T (\(\pi /4\) gate), \(S=T^2\) (Phase gate), and \(Z=S^2=T^4\).
The corresponding unitary matrices are defined as \( H = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\), \(X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\), and \( T = \begin{pmatrix} 1 & 0 \\ 0 & \omega \end{pmatrix} \) where \(\omega =\frac{1+i}{\sqrt{2}}=e^{i \pi /4}\).
Besides these operations that work on a single qubit, there are also operations on multiple qubits. Usually, these realize controlled operations that only manipulate a single qubit (denoted as the target qubit) depending on the state of a set of control qubits. The most prominent example for such operations is the controlled NOT (CNOT) operation on two qubits whose transformation matrix (with the first qubit as control and the second qubit as target) is defined by \(\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}\).
This operation performs a NOT operation on the target qubit if the control is in the \(|1\rangle \) state and does not change anything (i.e., performs the identity transformation) if the control is in the \(|0\rangle \) state.
Realizations of (complex) quantum operations are represented by quantum gates \(g_i\) which eventually form a quantum circuit \(G=g_1 \dots g_d\) with \(1\le i \le d\). The unitary matrix of the entire circuit is computed as the matrix product of the individual gate matrices (in reversed order).
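To illustrate the reversed order of the matrix product, the following sketch (plain Python, single-qubit gates only; `circuit_unitary` and `close` are hypothetical helper names) multiplies the gate matrices of a circuit and also checks the relations \(S=T^2\) and \(Z=T^4\) from Example 1.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def circuit_unitary(gates):
    """Matrix of the circuit g_1 ... g_d, i.e., the product M_d * ... * M_1."""
    n = len(gates[0])
    unitary = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for gate in gates:              # g_1 is applied first, so it ends up rightmost
        unitary = matmul(gate, unitary)
    return unitary

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices up to a numerical tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
omega = cmath.exp(1j * math.pi / 4)
H = [[s, s], [s, -s]]
T = [[1, 0], [0, omega]]
S = [[1, 0], [0, 1j]]
Z = [[1, 0], [0, -1]]

U = circuit_unitary([H, T])         # circuit: first H, then T; matrix: T * H
```

Swapping the two gates gives a different matrix, so the order of multiplication matters.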
2.2 Clifford+T library
In order to realize a (complex) quantum operation in a particular quantum technology, the operation has to be mapped to a dedicated (possibly technology-specific) gate library. In this work, we focus on the Clifford+T gate library [4] which is popular for its universality (any quantum operation, i.e., unitary transformation matrix, can be realized up to an arbitrary precision) as well as fault-tolerance (robust, fault-tolerant implementations of these gates are known for most technologies that are considered promising for large-scale quantum computers). The most elementary gates in this library are the Clifford group gates (H, CNOT, S) as well as the T gate, as discussed in Example 1, i.e., without any (additional) controls. While the Clifford group gates on their own only allow for the realization of a restricted set of quantum functionality, it is the T gate that makes the Clifford+T library a universal gate library. This is reflected in the fact that the cost of implementing a T gate (in a fault-tolerant way) is significantly higher (around a factor of 100) as compared to Clifford gates [11].
In addition to the basic, i.e., uncontrolled, Clifford+T gates, there are also well-known constructions for multiple-controlled versions of these gates (see, e.g., [12]). By employing the additional controls, the basic Clifford+T operations can be applied on a dedicated subset of basis states. At the level of unitary matrices, this corresponds to manipulating a certain subset of columns in the original matrix (cf. the controlled NOT gate in Example 1). However, these constructions have a high realization cost as can be seen from the detailed overview on the costs of multiple-controlled Clifford+T gates provided in Table 1 (based on [1, 12]). Here, the cost is given in terms of T-depth, i.e., the number of sequential T gates in the circuit, assuming the availability of one additional helper qubit (ancilla). Regarding exact synthesis, Kliuchnikov et al. conjectured in [21] and Giles and Selinger proved in [12] that a unitary matrix can be realized exactly in the Clifford+T library (i.e., without any rounding error) with the help of at most one ancilla if, and only if, all entries are complex numbers of the form \(\frac{1}{\sqrt{2}^k}(a\omega ^3+b\omega ^2+c\omega +d)\) for coefficients \(a,b,c,d,k\in {\mathbb {Z}}\) and \(\omega =\frac{1+i}{\sqrt{2}}=e^{i \pi /4}\) (as in Example 1). Consequently, we will focus on such matrices in the following.
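Entries of this form can be handled with exact integer arithmetic. As a minimal sketch (the tuple encoding `(a, b, c, d, k)` and the function names are assumptions made for illustration, not the data structure used in the paper), multiplication only needs the identity \(\omega ^4=-1\):

```python
import cmath
import math

OMEGA = cmath.exp(1j * math.pi / 4)

def to_complex(entry):
    """Numerical value of (a*w^3 + b*w^2 + c*w + d) / sqrt(2)^k."""
    a, b, c, d, k = entry
    return (a * OMEGA**3 + b * OMEGA**2 + c * OMEGA + d) / math.sqrt(2) ** k

def multiply(x, y):
    """Exact product of two entries (a, b, c, d, k), reducing powers with w^4 = -1."""
    px = [x[3], x[2], x[1], x[0]]   # coefficients of w^0, w^1, w^2, w^3
    py = [y[3], y[2], y[1], y[0]]
    out = [0, 0, 0, 0]
    for i, u in enumerate(px):
        for j, v in enumerate(py):
            if i + j < 4:
                out[i + j] += u * v
            else:
                out[i + j - 4] -= u * v   # w^(4+r) = -w^r
    return (out[3], out[2], out[1], out[0], x[4] + y[4])

W = (0, 0, 1, 0, 0)              # omega
I = (0, 1, 0, 0, 0)              # i = omega^2
SQRT2 = (-1, 0, 1, 0, 0)         # sqrt(2) = omega - omega^3
INV_SQRT2 = (0, 0, 0, 1, 1)      # 1 / sqrt(2)
```

No floating-point rounding occurs in `multiply`; `to_complex` is only used to cross-check values numerically.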
3 Synthesis of Clifford+T circuits
In this work, we consider the synthesis of quantum functionality to elementary circuit descriptions based on the Clifford+T library. More precisely, the task is considered in which a quantum functionality represented in terms of a transformation matrix F is decomposed into a sequence \(G=g_1\dots g_d\) of elementary quantum operations (i.e., quantum gates \(g_i\) with \(1\le i \le d\)). The resulting sequence eventually forms a quantum circuit and is supposed to be composed of Clifford+T gates as reviewed above. In the following, we revisit the current state-of-the-art approach which has been proposed for this purpose and discuss its main drawbacks. Based on this, we sketch the general idea of the improved quantum circuit synthesis proposed in this work. Details on that are later provided in Sect. 4.
3.1 State-of-the-art and motivation
Figure 1 sketches the current state-of-the-art in the exact synthesis of Clifford+T quantum circuits as proposed by Giles and Selinger [12]. The main idea is to apply quantum gates so that, eventually, the given matrix to be synthesized (sketched on the left-hand side of Fig. 1) is transformed to the identity matrix (sketched on the right-hand side of Fig. 1). To this end, the matrix is transformed column by column (as sketched in the middle of Fig. 1). For each column, three steps are applied:

(a) Eliminate superposition, i.e., apply quantum gates so that all nonzero matrix entries in the column are combined to a single nonzero entry.

(b) Move to diagonal, i.e., apply quantum gates which move the remaining nonzero entry to the diagonal of the matrix.

(c) Remove phase shifts, i.e., apply quantum gates which transform the diagonal entry to 1, eventually yielding a column of the identity matrix.
Each of these steps is achieved by using so-called two-level operations that modify pairs of entries in the current column. More precisely, the following operations are utilized:

Combine entries: \((a,b) \Rightarrow \frac{1}{\sqrt{2}}(a+b, a-b)\)

Exchange entries: \((a,b) \Rightarrow (b,a)\)

Modify phase shift: \((a,b) \Rightarrow (a,b\cdot \omega )\) where \(\omega = e^{i \pi /4}\).
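The three two-level operations can be tried out on a single column. The sketch below uses a hypothetical column of length 8 (it is not the matrix of Fig. 2a) and floating-point numbers for brevity; the function names are illustrative.

```python
import cmath
import math

OMEGA = cmath.exp(1j * math.pi / 4)

def combine(column, i, j):
    """(a, b) => (a + b, a - b) / sqrt(2) on the entries at positions i and j."""
    a, b = column[i], column[j]
    column[i], column[j] = (a + b) / math.sqrt(2), (a - b) / math.sqrt(2)

def exchange(column, i, j):
    """(a, b) => (b, a) on the entries at positions i and j."""
    column[i], column[j] = column[j], column[i]

def shift_phase(column, j):
    """Multiply the entry at position j by omega."""
    column[j] *= OMEGA

# Hypothetical column with four nonzero entries of value -1/2.
column = [0, 0, -0.5, -0.5, 0, 0, -0.5, -0.5]

combine(column, 2, 3)        # (-1/2, -1/2)             => (-1/sqrt(2), 0)
combine(column, 6, 7)        # (-1/2, -1/2)             => (-1/sqrt(2), 0)
combine(column, 2, 6)        # (-1/sqrt(2), -1/sqrt(2)) => (-1, 0)
exchange(column, 0, 2)       # move the remaining entry to the diagonal position
for _ in range(4):           # omega^4 = -1 removes the remaining phase
    shift_phase(column, 0)
```

After these steps the column equals the first column of the identity matrix up to rounding errors, which mirrors steps (a) to (c) for one column.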
Example 2
Consider the matrix in Fig. 2a which represents a quantum operation on three qubits \(x_0,x_1,x_2\). The four nonzero entries in the first column can be combined pairwise from \((-\frac{1}{2},-\frac{1}{2})\) to \((-\frac{1}{\sqrt{2}},0)\). The resulting pair \((-\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}})\) is combined to \((-1,0)\). Finally, the \(-1\) is exchanged with the 0 entry in the first row and a phase shift by \(-1\) is applied. This leads to the matrix shown in Fig. 2b where the first column is of the desired form (note the extracted scalar factor of \(\frac{1}{2}\)).
However, this approach has two major drawbacks:

1. Two-level operations do not solely have an effect on that particular column, but on all columns of the matrix. Consequently, locally synthesizing one column without taking the global view, i.e., the remaining columns, into consideration may significantly worsen the degree of superposition in these columns (as already became evident in the previous example where the overall number of nonzero entries increased from Fig. 2a to Fig. 2b).

2. Two-level operations rely on multiple-controlled Clifford+T gates: Hadamard for combining, CNOT for exchanging, and T for modifying the phase shift between entries. These gates have many controls (in fact, \(n-1\) controls where n is the number of qubits). Since the number of control lines significantly increases the costs of those gates (cf. Table 1), this leads to high costs of the resulting circuits.
Regarding the first drawback, it seems to be promising to globally consider the whole matrix at once and to only apply operations that do not lead to a worsening in any other column. Overall, this may lead to fewer steps to be conducted in order to derive the desired identity matrix.
Regarding the second drawback, it can be observed that Clifford+T matrices often exhibit similar structures occurring globally throughout the matrix. In many cases, the corresponding two-level operations applied to these structures are similar and can be combined to a joint, again global, operation that can be realized with lower costs. That is, controls (which enforce a local change in the matrix only) often can be saved as the correspondingly more global change can be conducted at once.
Example 3
The first two combine operations in the previous example correspond to multiple-controlled Hadamard gates with a target on qubit \(x_2\) and controls on \(x_0, x_1\). The only difference between the gates is that the control connection on \(x_0\) is positive in one case and negative in the other case. Thus, this control can be dropped and both gates can be compressed to a single-controlled Hadamard gate. Hence, not only can the two steps be conducted with a single gate (rather than two), but the required gate is also significantly cheaper. Overall, this reduces the cost from \(2\cdot 7=14\) to 1 (cf. Table 1).
Note that simple optimizations as described in Example 3 could probably be conducted as a post-synthesis optimization process. However, such optimizations can only be found if the corresponding quantum gates are located close to each other in the final circuit. In this respect, the approach from [12] yields circuits which have a structure that is completely unsuitable for such post-synthesis optimization. This is because the two-level operations eventually have to be decomposed into elementary quantum gates which hardly allow the detection of redundancies any more. In fact, the resulting cascades of CNOT gates will, in the vast majority of cases, prevent operations on rows/columns whose index differs in one position only from appearing close to each other.
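The merging of two controlled gates that differ only in the polarity of one control can be verified at the matrix level. The following sketch assumes that \(x_0\) is the most significant bit of the row/column index (as suggested by the row labels used in the examples); `controlled_gate` is a hypothetical helper, and Table 1 costs are not modeled.

```python
import math

def controlled_gate(n, target, controls, gate):
    """2^n x 2^n matrix applying the 2x2 `gate` to qubit `target` on all basis
    states whose control qubits have the required values ({qubit: 0 or 1}).
    Assumption: x_0 is the most significant bit of the index."""
    dim = 2 ** n
    shift = lambda q: n - 1 - q
    matrix = [[0.0] * dim for _ in range(dim)]
    for col in range(dim):
        if all(((col >> shift(q)) & 1) == v for q, v in controls.items()):
            t = (col >> shift(target)) & 1
            for out in (0, 1):
                row = (col & ~(1 << shift(target))) | (out << shift(target))
                matrix[row][col] = gate[out][t]
        else:
            matrix[col][col] = 1.0   # identity outside the controlled subspace
    return matrix

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

positive = controlled_gate(3, 2, {0: 1, 1: 1}, H)   # controls x_0 = 1, x_1 = 1
negative = controlled_gate(3, 2, {0: 0, 1: 1}, H)   # controls x_0 = 0, x_1 = 1
merged = controlled_gate(3, 2, {1: 1}, H)           # control on x_1 only
```

Both two-controlled gates act on disjoint sets of basis states, so their product equals the single-controlled gate.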
3.2 General idea of proposed approach
Inspired by these observations, the general idea of the proposed approach for logic synthesis of Clifford+T circuits is the following: Globally consider multiple columns simultaneously and establish recurrent structures that allow combining as many two-level operations as possible. More precisely, the three steps (a)-(c) that have so far been conducted locally for the individual columns are now globally performed on the entire matrix as sketched in Fig. 3:

(a) Regarding the elimination of superposition, we aim to gradually reduce superposition in the whole matrix, i.e., establish recurrent structures that allow reducing the superposition in all columns at once. To this end, all entries of the matrix have to be rearranged and (possibly) phase shifted in such a way that all entries form suitable pairs that can be combined.
Example 4
Consider again the matrix from Fig. 2a. After exchanging rows 011 and 111 and applying a phase shift by i to row 101, all entries are grouped in pairs that can be combined with each other. Thus, an uncontrolled Hadamard gate can be applied on qubit \(x_2\). As a consequence, the number of nonzero entries in the matrix is reduced by a factor of 2 in a single step. The resulting matrix is shown in Fig. 4a. Here, only rows 101 and 111 need to be swapped before another Hadamard gate on qubit \(x_{0}\) eliminates the remaining superposition and leads to the matrix shown in Fig. 4b.

(b) Regarding the movement of entries to the diagonal, there is a large body of research on how to achieve this for permutation matrices (i.e., transformation matrices with Boolean entries only). In fact, this problem is known as reversible circuit synthesis and most of the approaches employed for this purpose (see, e.g., [15, 32, 34, 36, 44]) are based on multiple-controlled Toffoli gates (which are exactly realizable in the Clifford+T library; see, e.g., [12, Sec. 5]). Taking into account that potential phase shifts within the matrix do not affect the applicability of these (highly optimized) approaches, any of these can be utilized here.
Example 5
Consider the resulting matrix from the previous step (shown in Fig. 4b). Applying a CNOT gate with control qubit \(x_2\) and target qubit \(x_0\) exchanges rows 001 and 011 with 101 and 111, respectively. This establishes zero submatrices in the upper-right and lower-left quadrant of the matrix and moves the \(i\) in row 101 to the diagonal (as shown in Fig. 4c).
Finally, rows 000 and 010 as well as 101 and 111 can be swapped by two CNOTs on \(x_1\) (one with a negative control on \(x_0\) and the other with a positive control on \(x_2\)). This leads to the diagonal matrix depicted in Fig. 4d.

(c) Finally, regarding the removal of phase shifts, similar phase shifts can be taken care of simultaneously and the corresponding two-level operations can be joined.
Example 6
After shifting the phase of row 101 by \(\omega ^{-1}\) (from \(\frac{-1+i}{\sqrt{2}}=\omega ^3\) to \(\omega ^2=i\)) and the phases of rows 001, 010, 101, and 110 by \(i\) (from \(\pm i\) to \(\pm 1\)), the remaining phase shifts can be removed by a Z gate on \(x_0\) (without controls).
Overall, this yields substantial improvements compared to the local, i.e., column-wise, synthesis scheme for all three steps. How these ideas are implemented in detail is described in the following two sections. Afterwards, Sect. 6 shows the resulting improvements by means of a summary of the conducted experimental evaluations.
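The row exchanges stated in Example 5 can be reproduced by treating each (controlled) NOT gate as a permutation of row indices. This is a small sketch under the assumption that \(x_0\) is the most significant bit of the row index; the helper names are illustrative.

```python
def cnot_permutation(n, target, controls):
    """Row permutation (list: old index -> new index) of a NOT on qubit `target`
    with controls {qubit: required value}. Assumption: x_0 is the most
    significant bit of the row index."""
    shift = lambda q: n - 1 - q
    perm = []
    for index in range(2 ** n):
        active = all(((index >> shift(q)) & 1) == v for q, v in controls.items())
        perm.append(index ^ (1 << shift(target)) if active else index)
    return perm

def compose(first, second):
    """Permutation obtained by applying `first` and then `second`."""
    return [second[first[i]] for i in range(len(first))]

def swapped_pairs(perm, n=3):
    """Pairs of row labels that are exchanged by the permutation."""
    fmt = "0{}b".format(n)
    return [(format(i, fmt), format(p, fmt)) for i, p in enumerate(perm) if i < p]

# CNOT with control x_2 and target x_0.
first_step = cnot_permutation(3, 0, {2: 1})

# Two CNOTs on x_1: negative control on x_0, positive control on x_2.
second_step = compose(cnot_permutation(3, 1, {0: 0}),
                      cnot_permutation(3, 1, {2: 1}))
```

In the second step, the exchange of rows 001 and 011 is performed by both gates and therefore cancels, leaving exactly the two swaps named in the example.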
4 Implementation
The proposed approach is described along the three steps discussed above and summarized in Fig. 3.
4.1 Eliminating superposition
As discussed in Sect. 2.2, each entry of a Clifford+T matrix can be written as \(\frac{1}{\sqrt{2}^k}(a\omega ^3+b\omega ^2+c\omega +d)\). More precisely, one can show that, for each entry \(\alpha \ne 0\), there exists a representation with a unique smallest denominator exponent \(k_{{\text {min}}}\) such that integer coefficients a, b, c, d can be found for \(k_{{\text {min}}}\), but not for any smaller \(k<k_{{\text {min}}}\). The maximum of these exponents in the entire matrix (denoted as \(k_{\max }\) in the following) determines the degree of superposition and needs to be decreased to 0 in order to completely eliminate superposition. The only way to potentially reduce \(k_{\max }\) is the application of (controlled) Hadamard gates. In fact, a Hadamard gate on qubit \(x_i\) combines pairs of entries of the matrix whose row index only differs at position \(x_i\). More precisely, two entries \(\alpha ,\beta \) are replaced by their (weighted) sum/difference \(\frac{1}{\sqrt{2}}(\alpha +\beta )\) and \(\frac{1}{\sqrt{2}}(\alpha -\beta )\), respectively.
For the weighted sum/difference to indeed have a smaller exponent than \(\alpha ,\beta \), it is required that both have the same exponent k and that the parity of the coefficients a, b, c, d, i.e., their remainder modulo 2, is pairwise the same. To this end, [12] introduces the notion of residues where the (k)residue of \(\frac{1}{\sqrt{2}^k}(a\omega ^3+b\omega ^2+c\omega +d)\) is defined as \((a\% 2, b\% 2, c\% 2, d\% 2)\) where \(\%\) denotes the modulo operation (e.g., the residue of \(5\omega ^3+2\omega ^2-\omega \) is (1, 0, 1, 0)). Accordingly, a pair (\(\alpha ,\beta \)) is termed reducible and \(\alpha , \beta \) are termed reduction partners if, and only if, both have the same exponent k and the same (k)residue. Using this notation, the resulting algorithm to gradually eliminate superposition is as follows:
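The smallest denominator exponent and the effect of a Hadamard gate on a pair of entries can be computed exactly. The sketch below encodes an entry as a tuple `(a, b, c, d, k)` (an assumption made for illustration) and uses \(\sqrt{2}=\omega -\omega ^3\): dividing by \(\sqrt{2}\) is possible within integer coefficients exactly when \(a-c\) and \(b-d\) are even. Restricting the exponent to \(k\ge 0\) is a further simplifying assumption.

```python
def times_sqrt2(a, b, c, d):
    """Coefficients of sqrt(2) * (a*w^3 + b*w^2 + c*w + d),
    using sqrt(2) = w - w^3 and w^4 = -1."""
    return (b - d, a + c, b + d, c - a)

def reduce_exponent(entry):
    """Representation of the entry with the smallest denominator exponent (k >= 0)."""
    a, b, c, d, k = entry
    if (a, b, c, d) == (0, 0, 0, 0):
        return (0, 0, 0, 0, 0)
    while k > 0 and (a - c) % 2 == 0 and (b - d) % 2 == 0:
        # x / sqrt(2)^k = (x * sqrt(2) / 2) / sqrt(2)^(k - 1)
        a, b, c, d = (x // 2 for x in times_sqrt2(a, b, c, d))
        k -= 1
    return (a, b, c, d, k)

def hadamard_pair(alpha, beta):
    """Exact ((alpha + beta) / sqrt(2), (alpha - beta) / sqrt(2)).
    This sketch only handles entries given with a common exponent k."""
    assert alpha[4] == beta[4], "sketch assumes a common exponent"
    k = alpha[4] + 1
    total = tuple(x + y for x, y in zip(alpha[:4], beta[:4])) + (k,)
    diff = tuple(x - y for x, y in zip(alpha[:4], beta[:4])) + (k,)
    return reduce_exponent(total), reduce_exponent(diff)

INV_SQRT2 = (0, 0, 0, 1, 1)          # 1 / sqrt(2)
W_INV_SQRT2 = (0, 0, 1, 0, 1)        # omega / sqrt(2)

reduced = hadamard_pair(INV_SQRT2, INV_SQRT2)      # exponent drops to 0
worsened = hadamard_pair(INV_SQRT2, W_INV_SQRT2)   # exponent rises to 2
```

The second call shows that a Hadamard gate applied to an unsuitable pair increases the exponent instead of reducing it, which is why the entries have to be paired up carefully first.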

1. If \(k_{\max }= 0\), terminate, as there is no more superposition in the matrix. Otherwise, identify and restrict to the entries with the maximum smallest denominator exponent \(k_{\max }\) (i.e., consider the other entries as being 0 in the following).

2. Determine the most promising qubit \(x_p\), i.e., the qubit for which we expect to require the smallest number of matrix operations in order to prepare the matrix for the application of a (controlled) Hadamard gate. To this end, first the residue of all entries is computed. Then, to determine the number of already existing reduction partners for the first qubit, we compare entries in adjacent rows (whose row index only differs at the last position). Likewise for the second qubit, we compare entries in rows whose row index only differs at the second-to-last position and so on. Finally, the qubit with the maximum number of reduction partners is chosen. In case of a tie, the first qubit with this property is chosen.

3. Create a list R of row indices of all rows that contain at least one nonzero entry which does not have a reduction partner w.r.t. \(x_p\) so far. Mark all columns as unvisited. While \(R\ne \emptyset \) and there is an unvisited column left:
   - Consider the leftmost unvisited column c and create a list \(P_c\) containing all row indices corresponding to nonzero entries in that column (within the rows from R) that do not have a reduction partner (w.r.t. \(x_p\)).
   - While \(|P_c| \ge 2\):
     - Choose two row indices \(A,B\in P_c\), preferably such that the corresponding entries in column c (\(\alpha _c\) and \(\beta _c\)) have the same residue.
     - Move row B to the appropriate position such that its row index differs from A only at position \(x_p\), i.e., such that \(\alpha _c,\beta _c\) will be combined by a Hadamard gate on \(x_p\). Align the residues of \(\alpha _c\) and \(\beta _c\) (if necessary) and remove the indices of A, B as well as of all further rows with newly established reduction partners (in column c) from \(P_c\) and R.
   - Mark the column as visited.

4. If all entries have a suitable partner w.r.t. \(x_p\), apply the (controlled) Hadamard gate directly and proceed with Step 1. Otherwise, eliminate superposition in the first column of the matrix using column-wise synthesis (do not perform diagonalization or elimination of phase shift) and continue with Step 1 on the remaining columns.
As each iteration of the algorithm either reduces \(k_{\max }\) in the entire matrix or yields one additional column with a single nonzero entry, it is guaranteed to terminate at some point yielding a matrix with exactly one nonzero entry per row.
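Steps 1 and 2 of the algorithm can be sketched as follows. The sketch assumes that entries are given as tuples `(a, b, c, d, k)` that are already in the representation with the smallest denominator exponent, and it identifies qubits by the bit position of the row index counted from the last position (0 = last position), following the order in which Step 2 inspects them. All function names are hypothetical, and Steps 3 and 4 are not covered.

```python
def residue(entry):
    """(a % 2, b % 2, c % 2, d % 2) of an entry (a, b, c, d, k)."""
    a, b, c, d, _k = entry
    return (a % 2, b % 2, c % 2, d % 2)

def restrict_to_kmax(matrix):
    """Step 1: residues of the entries with exponent k_max; all other entries
    are considered as 0 (represented by None)."""
    k_max = max(e[4] for row in matrix for e in row if any(e[:4]))
    restricted = [[residue(e) if e[4] == k_max and any(e[:4]) else None for e in row]
                  for row in matrix]
    return k_max, restricted

def count_partners(restricted, pos):
    """Number of reduction-partner pairs among rows differing only at bit `pos`."""
    count = 0
    for r in range(len(restricted)):
        if (r >> pos) & 1:
            continue                       # visit each pair once, from its lower row
        partner = r | (1 << pos)
        for c in range(len(restricted[r])):
            a, b = restricted[r][c], restricted[partner][c]
            if a is not None and a == b:   # same exponent (k_max) and same residue
                count += 1
    return count

def most_promising(restricted):
    """Step 2: position with the maximum number of partners (first one on a tie)."""
    n = len(restricted).bit_length() - 1
    counts = [count_partners(restricted, p) for p in range(n)]
    return counts.index(max(counts)), counts

P = (0, 0, 0, 1, 1)     # +1/sqrt(2)
M = (0, 0, 0, -1, 1)    # -1/sqrt(2)
O = (0, 0, 0, 0, 0)     # 0
# Hypothetical 2-qubit input: a Hadamard operation on the last qubit.
matrix = [[P, P, O, O],
          [P, M, O, O],
          [O, O, P, P],
          [O, O, P, M]]

k_max, restricted = restrict_to_kmax(matrix)
position, counts = most_promising(restricted)
```

For this input, all nonzero entries already have a reduction partner with respect to the last position, so a single Hadamard gate on that qubit eliminates the superposition.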
Concerning the movement of rows in the inner while loop in Step 3, note that a movement is necessary if, and only if, the indices A, B differ in at least one position \(x_d\ne x_p\). Then, CNOTs with a control on \(x_d\) can be used to align the indices of A, B (except for position \(x_d\)) and, if necessary, establish different entries at position \(x_p\), such that only \(x_d\) needs to be aligned in order to move B to the desired position. Finally, the \(x_d\) position of A, B as well as the residues of \(\alpha _c, \beta _c\) can be aligned by applying multiple-controlled NOT and T/H gates on \(x_d\) (with controls corresponding to the joint index of A, B). The existence of an adequate sequence of T and H gates is guaranteed by Lemma 4 (row operation) from [12]. All these gates preserve already established pairs of reduction partners, but may, by chance, create additional pairs.
Example 7
Consider the matrix in Fig. 4a. Here, all nonzero entries have the same exponent \(k_{\max } = 1\), so the entire matrix is considered in Step 1. As there are already four pairs of reduction partners (in columns 000, 010, 100 and 110), \(x_p=x_0\) is chosen in Step 2 of the algorithm. Entries without reduction partner are found in rows \(R=\{001, 011, 101, 111\}\). For the first column, there are no entries in these rows, i.e., \(P_{000}=\emptyset \). For the second column, we have \(P_{001}=\{001,111\}\). Both nonzero entries have the same residue. As the indices already differ at exactly two positions \(x_0=x_p\) and \(x_1=x_d\), a two-controlled NOT on \(x_1\) with positive controls on \(x_0\) and \(x_2\) (1X1) is employed to exchange rows 101 and 111. This gate does not only create a pair of reduction partners in column 001, but establishes pairs of reduction partners in all remaining rows from R, such that \(R=\emptyset \) and an H gate is applied to qubit \(x_0\) in Step 4. This yields the matrix in Fig. 4b where \(k_{\max } = 0\), i.e., the algorithm terminates.
4.2 Moving to the diagonal
In the diagonalization phase, all nonzero entries need to be moved to the diagonal. As already outlined above, there are various approaches available to conduct this task. We propose to use the approach from [36] which is one of the most efficient and scalable approaches for this purpose. Moreover, its ability to exploit recurrent structures in order to reduce the cost of the resulting circuit (T-depth) has been demonstrated in [43]. The general idea of this approach is to successively establish a block matrix structure, as already outlined in Example 5. More precisely, the first step is to swap the columns of the matrix such that all nonzero entries are gathered in the top-left as well as bottom-right quadrant of the matrix, while the off-diagonal quadrants, i.e., the top-right and bottom-left quadrant, become zero matrices. In the following steps, the same procedure is likewise applied to the smaller submatrices (top-left and bottom-right quadrant) that were obtained in the previous step until a diagonal matrix is established. For more details about the approach, we refer the interested reader to [36, 43, 44].
4.3 Removing phase shifts
Finally, we end up with a diagonal matrix whose entries all have norm 1 which means that they are of the form \(\omega ^m\) for \(m\in \{0,\ldots , 7\}\) (as shown in [12, Lemma 5]). In order to remove possible phase shifts, all the exponents have to be transformed to the same value \(m_0\). Note that we do not require \(m_0=0\) such that the resulting matrix might not be the identity matrix itself, but is equivalent up to global phase and, hence, physically indistinguishable [25].
In order to align the exponents, the approach from [37] could be applied. However, it is targeted at approximating arbitrary diagonal matrices, while the matrices considered here have the above-mentioned restriction that all diagonal entries are of the form \(\omega ^m\). Thus, we apply a customized approach. At first, we take care of odd exponents. Instead of applying a multiple-controlled T gate for every such column, we aim to merge multiple gates by using techniques from classical logic synthesis. To this end, we interpret the indices of the corresponding rows as the ON-set of a Boolean function and synthesize this function in terms of an Exclusive-Sum-Of-Products (ESOP) [5]. The individual products of this representation are then mapped to controlled T gates (with fewer controls). Afterwards, all exponents in the matrix are even (i.e., all entries are from the set \(\{\pm 1,\pm i\}\)). Then, we proceed similarly for all imaginary entries and, once these have become \(\pm 1\), finally also for the \(-1\) entries.
Example 8
Consider the matrix in Fig. 4d. There is a single entry with an odd exponent, namely \(\omega ^3\) in row 101. The corresponding Boolean function is given as \(f=x_0 \cdot {\overline{x}}_1 \cdot x_2\). Thus, a two-controlled \(T^\dagger \) gate (inverse of T) on \(x_2\) with a positive control on \(x_0\) and a negative control on \(x_1\) shifts this entry to \(\omega ^2=i\). Then, the indices of the imaginary entries are given by the ON-set of \(g=x_1 \oplus x_2\). Consequently, two uncontrolled S gates on \(x_1\) and \(x_2\) shift these entries to \(\pm 1\) such that the Boolean function for all rows with a \(-1\) entry (namely 000, 101, 110, 111) becomes: \(h=x_0 \oplus {\overline{x}}_1 {\overline{x}}_2\). The first term directly translates to an uncontrolled Z gate on qubit \(x_0\). However, since only positive literals can be used as targets of the phase shift gates, some more work is required for the second term. More precisely, we first need to apply a basis transformation using an uncontrolled NOT gate on \(x_2\), then perform the phase shift by a controlled Z gate on \(x_2\) (with a negative control on \(x_1\)), and finally undo the basis transformation with another uncontrolled NOT gate on \(x_2\).
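The three stages of this example can be replayed on the exponents alone. The sketch below is a minimal illustration under stated assumptions: the diagonal `exps` is a hypothetical one, reconstructed only to be consistent with the ON-sets named in the example (it is not read off Fig. 4d), row indices are read as \(x_0 x_1 x_2\) with \(x_0\) most significant, and each gate is modelled only by its effect on the diagonal, i.e., by adding a shift to the exponent of \(\omega \) in every row selected by a product term. All helper names are illustrative.

```python
def bits(row, n=3):
    """Return (x_0, ..., x_{n-1}) for a row index; x_0 is the most significant bit."""
    return tuple((row >> (n - 1 - q)) & 1 for q in range(n))

def on_set(exps, pred):
    """Row indices (as bit strings) whose exponent satisfies `pred`."""
    return [format(r, "03b") for r, m in enumerate(exps) if pred(m)]

def apply_phase(exps, shift, term):
    """Multiply every diagonal entry selected by the product term by omega**shift."""
    return [(m + shift) % 8 if term(*bits(r)) else m for r, m in enumerate(exps)]

# Hypothetical diagonal: exponents m of omega**m for rows 000..111 (an assumption).
exps = [4, 6, 6, 4, 0, 3, 2, 0]

# Stage 1: odd exponents, f = x0 * ~x1 * x2, removed by a controlled T-dagger.
odd = on_set(exps, lambda m: m % 2 == 1)
exps = apply_phase(exps, -1, lambda x0, x1, x2: x0 and not x1 and x2)

# Stage 2: imaginary entries, g = x1 XOR x2, one S gate (omega**2) per ESOP term.
imag = on_set(exps, lambda m: m % 4 == 2)
exps = apply_phase(exps, 2, lambda x0, x1, x2: x1)
exps = apply_phase(exps, 2, lambda x0, x1, x2: x2)

# Stage 3: entries equal to -1, h = x0 XOR (~x1 * ~x2), one Z (omega**4) per term.
minus = on_set(exps, lambda m: m == 4)
exps = apply_phase(exps, 4, lambda x0, x1, x2: x0)
exps = apply_phase(exps, 4, lambda x0, x1, x2: not x1 and not x2)
```

Rows covered by two product terms of an ESOP receive the phase twice. In stage 2, for instance, rows with \(x_1=x_2=1\) are multiplied by \(i^2=-1\), which keeps them real, so the XOR structure of the ESOP is respected.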
Overall, performing the three steps as described above eventually transforms any given Clifford+T matrix F to the identity and yields a quantum circuit that realizes F by solely using Clifford+T gates.
Regarding the complexity of the proposed algorithm, notice that Step 4 of the algorithm for eliminating superposition (cf. Sect. 4.1) uses the column-wise synthesis from [12] as a fallback. Thus, the worst-case complexity of the proposed algorithm is not improved w.r.t. [12]. However, the experimental evaluations (summarized in Sect. 6) indeed demonstrate significant improvements of the proposed method compared to the previous work.
5 Exploiting decision diagrams
Due to the exponential complexity of representing the desired quantum functionality, using straightforward matrix representations quickly becomes infeasible. For instance, the reference implementation of the approach from [12] (written in Haskell) failed to produce results for more than seven qubits. In order to cope with this issue and to obtain an implementation of the proposed approach that allows for an application to larger problem instances, we additionally employ decision diagrams, namely QMDDs as reviewed in the following, for a more efficient processing of the unitary matrices.
5.1 QMDDs
As the unitary matrices corresponding to quantum operations grow exponentially with the size of the quantum systems, dedicated data structures are required that exploit redundancies in the matrices in order to enable a more compact representation and efficient manipulation. A very promising candidate for this task is given by the Quantum Multiple-Valued Decision Diagram (QMDD, [29]). The general idea behind QMDDs is to represent a (unitary) matrix in terms of a directed acyclic graph such that submatrices which occur multiple times are represented by a shared graph structure. While there are several data structures that follow a similar approach, only QMDDs additionally make use of weighted edges. This unique property allows them to use shared structures also for submatrices which differ by a scalar factor, a case that occurs quite often for the unitary matrices considered in quantum computation.
The QMDD for a given unitary matrix is constructed by a recursive partitioning process, i.e., a \(2^n \times 2^n\) matrix is partitioned into four \(2^{n-1} \times 2^{n-1}\) matrices, as illustrated in the following example.
Example 9
Fig. 5a shows a transformation matrix for which a QMDD as shown in Fig. 5b has been built. Starting with a single terminal vertex that represents the lowest partitioning level, i.e., single matrix entries, the next upper level of \(2\times 2\) matrices is represented by vertices labeled \(x_{2}\). For each entry, there is an outgoing edge to the terminal vertex with an edge weight corresponding to the respective complex value. For simplicity, we omit edge weights equal to 1 and indicate edges with a weight of 0 by stubs. The vertices are normalized by dividing the weights of all outgoing edges by a normalization factor (by default: such that the “leftmost” edge with a nonzero weight has weight 1). This factor is propagated to incoming edges, e.g., the factor \(\tfrac{1}{2}\) is propagated upwards from the \(x_2\)-level to the \(x_0\)-level in Fig. 5b. By this, structurally equivalent submatrices, i.e., submatrices that are equal as well as submatrices that only differ by a scalar factor, are compressed to a shared vertex (highlighted in grey in Figs. 5a and 5b, respectively). This procedure is repeated for each level until a single vertex labeled by \(x_0\) is created for the top level. This vertex is called the root vertex. Finally, a possible normalization factor of this vertex is assigned to the weight of the root edge which points to the root vertex, but has no source (here: \(\frac{1}{2}\)).
To obtain the value of a particular matrix entry, one has to follow the corresponding path from the root to the terminal vertex and multiply all edge weights on this path. For example, the matrix entry \(\tfrac{i}{2}\) from the top right submatrix of Fig. 5a (highlighted bold) can be determined as the product of the weights on the highlighted path of the QMDD in Fig. 5b.
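The construction and the entry lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the QMDD package from [29]: nodes are plain tuples, the unique table is a dictionary, weights are floating-point numbers compared exactly (a real implementation would need a tolerant or exact number representation), and the names `build` and `entry` are assumptions made for this sketch. The example matrix is \(H \otimes H\), chosen because all four quadrants differ only by a scalar factor.

```python
def build(matrix, var=0, unique=None):
    """Return an edge (weight, node) representing the square matrix.

    A node is the tuple (var, edges) with four edges ordered top-left,
    top-right, bottom-left, bottom-right; node None marks the terminal.
    """
    if unique is None:
        unique = {}
    if len(matrix) == 1:
        return (matrix[0][0], None)  # single entry: edge to the terminal
    h = len(matrix) // 2
    quads = [[row[c:c + h] for row in matrix[r:r + h]]
             for r in (0, h) for c in (0, h)]
    edges = [build(q, var + 1, unique) for q in quads]
    # Normalize: the leftmost nonzero edge weight becomes 1.
    norm = next((w for w, _ in edges if w != 0), 0)
    if norm == 0:
        return (0, None)  # zero submatrix: zero stub
    edges = tuple((w / norm, node) if w != 0 else (0, None)
                  for w, node in edges)
    key = (var, edges)
    # Structurally equal nodes are shared via the unique table.
    return (norm, unique.setdefault(key, key))

def entry(edge, row, col, n):
    """Multiply the edge weights along the path selected by (row, col)."""
    value, node = edge
    for level in range(n):
        if value == 0:
            return 0
        r = (row >> (n - 1 - level)) & 1
        c = (col >> (n - 1 - level)) & 1
        w, node = node[1][2 * r + c]
        value *= w
    return value

half = 0.5
hh = [[half,  half,  half,  half],
      [half, -half,  half, -half],
      [half,  half, -half, -half],
      [half, -half, -half,  half]]
table = {}
root = build(hh, unique=table)
values = [[entry(root, r, c, 2) for c in range(4)] for r in range(4)]
```

In this sketch, the factor \(\tfrac{1}{2}\) ends up on the root edge and the sixteen entries are represented by only two shared nodes, one per variable level.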
Moreover, efficient algorithms have been presented for applying operations like matrix addition or multiplication directly on the QMDD data structure. Overall, QMDDs allow for both a compact representation and an efficient manipulation of unitary matrices for quantum systems of considerable size. As a consequence, they have already been used in a broad variety of applications in the design of quantum circuits (e.g., verification [7, 8, 27, 39, 42], simulation [45, 47], or synthesis [26, 36, 44]).
5.2 Exploiting QMDDs for an enhanced applicability
Accordingly, we are exploiting these benefits of QMDDs for the proposed synthesis scheme. However, in addition to their general efficiency w.r.t. matrix manipulation, we identified particular characteristics of QMDDs that offer further potential for a speedup of the algorithm. In the following, we provide details of the implementation that illustrate this potential:
Restriction to entries with maximum denominator exponent. An important preprocessing step of the first step of the algorithm is the restriction to matrix entries with the maximum denominator exponent (\(k_{\mathrm{max}}\)). In order to facilitate this step in QMDDs, a normalization scheme can be chosen that extracts the maximum smallest denominator exponent from the (outgoing) edge weights and propagates it to the incoming edges of a vertex. By this, \(k_{\mathrm{max}}\) will occur as the exponent of the root edge weight and all other weights in the QMDD will have a denominator exponent less than or equal to 0. As a consequence, only those parts of the QMDD need to be considered that are reachable by edges with a denominator exponent of 0. If an edge has a negative denominator exponent, all entries in the represented submatrix have a denominator exponent that is less than \(k_{\mathrm{max}}\) and can, thus, be ignored (i.e., the edge is modified to represent a zero matrix).
Example 10
Consider the QMDD in Fig. 6b which represents the matrix from Fig. 6a. As expected, the exponent \(k_{\mathrm{max}}=2\) occurs in the root edge weight \(\tfrac{1}{2} = \tfrac{1}{\sqrt{2}^2}\), while all other weights have a zero or negative exponent (n.b. \(2=\tfrac{1}{\sqrt{2}^{-2}}\)). By pruning all parts that are reached via an edge with a negative exponent, we obtain the QMDD shown in Fig. 6c where all vertices at the \(x_2\) level are collapsed to a single vertex.
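For readers who prefer to see the underlying arithmetic, the following matrix-based Python sketch computes smallest denominator exponents and the restriction to \(k_{\mathrm{max}}\) without any decision diagram. It assumes that an entry is stored as \((a\omega ^3 + b\omega ^2 + c\omega + d)/\sqrt{2}^k\) with integer coefficients; this coefficient order, as well as the function names, are assumptions of the sketch. The reduction step uses \(\sqrt{2}=\omega -\omega ^3\), so that division by \(\sqrt{2}\) is possible exactly if \(a \equiv c\) and \(b \equiv d\) modulo 2.

```python
def reduce_entry(coeffs, k):
    """Return the representation with the smallest denominator exponent.

    Assumption: the entry equals (a*w**3 + b*w**2 + c*w + d) / sqrt(2)**k
    with coeffs = (a, b, c, d) and w = exp(i*pi/4).
    """
    a, b, c, d = coeffs
    if (a, b, c, d) == (0, 0, 0, 0):
        return (0, 0, 0, 0), 0  # convention for zero entries
    # x / sqrt(2) = x * (w - w**3) / 2, which stays integral
    # exactly if a = c and b = d modulo 2.
    while (a - c) % 2 == 0 and (b - d) % 2 == 0:
        a, b, c, d = (b - d) // 2, (a + c) // 2, (b + d) // 2, (c - a) // 2
        k -= 1
    return (a, b, c, d), k

def restrict_to_k_max(matrix):
    """Return k_max and a mask of the entries that carry this exponent."""
    reduced = [[reduce_entry(*e) for e in row] for row in matrix]
    k_max = max(k for row in reduced for coeffs, k in row if any(coeffs))
    mask = [[any(coeffs) and k == k_max for coeffs, k in row]
            for row in reduced]
    return k_max, mask

ONE, TWO, ZERO, I = (0, 0, 0, 1), (0, 0, 0, 2), (0, 0, 0, 0), (0, 1, 0, 0)
# Illustrative 2x2 array of entries: 1/2, 2/2, 0 and i/sqrt(2).
matrix = [[(ONE, 2), (TWO, 2)],
          [(ZERO, 0), (I, 1)]]
k_max, mask = restrict_to_k_max(matrix)
two_reduced = reduce_entry(TWO, 0)
```

Here `two_reduced` reproduces the remark from the example that \(2=\tfrac{1}{\sqrt{2}^{-2}}\), i.e., the value 2 has smallest denominator exponent \(-2\). The array is only meant to exercise the functions and is not a unitary matrix.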
Determining the Most Promising Qubit for Hadamard Application. An even more important step of the algorithm is to determine the most promising qubit by counting the number of already existing pairs of reduction partners in all columns. In order to facilitate this computation in the QMDD, the qubit under consideration (i.e., the corresponding variable) is moved to the bottom of the QMDD by repeatedly exchanging pairs of adjacent variables as shown in [29]. Then each vertex at that level represents two pairs of matrix entries that would be combined by applying a Hadamard at the respective circuit line. If, and only if, both of these are already suitable reduction partners (i.e., they have the same residue), the vertex represents a reducible pattern. If all vertices represent a reducible pattern, a Hadamard gate can be applied directly. Otherwise, we can obtain the number of existing reduction partners in the matrix by counting the number of paths to the respective vertices in the QMDD.
Example 11
Consider the QMDD in Fig. 6c. In the single \(x_2\) vertex, there is a single nonzero edge weight, i.e., the corresponding entry does not have a suitable reduction partner. Consequently, \(x_2\) is not a promising qubit for Hadamard application. By exchanging the adjacent variables \(x_1\) and \(x_2\), we obtain the QMDD depicted in Fig. 6d. Here, the weights of the first and third (second and fourth) outgoing edge in the leftmost \(x_1\) vertex only differ by a factor of \(-1\). Thus, they have the same residue and correspond to a pair of reduction partners. As there are two paths to this vertex, the corresponding pattern occurs twice in the matrix. As the other \(x_1\) vertex also represents a reducible pattern, a Hadamard gate can be applied directly to qubit \(x_1\).
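The counting of existing reduction partners can also be illustrated directly on a matrix, without variable exchanges in a QMDD. The sketch below makes several assumptions: entries with denominator exponent \(k_{\mathrm{max}}\) are given by their integer coefficient tuples, all other entries are marked by `None`, the residue is taken as the coefficients modulo 2, and qubit \(x_0\) corresponds to the most significant index bit. The function names are illustrative, and the example matrix is \(H \otimes I\) on two qubits rather than a matrix from the figures.

```python
def residue(coeffs):
    """Residue of an entry: its integer coefficients modulo 2."""
    return tuple(x % 2 for x in coeffs)

def count_pairs(entries, n, p):
    """Count pairs of reduction partners w.r.t. qubit x_p over all columns.

    Two entries form a pair if they lie in the same column, their rows
    differ only in x_p, and both have the same residue.
    """
    mask = 1 << (n - 1 - p)  # x_0 is the most significant bit
    count = 0
    for r in range(2 ** n):
        if r & mask:
            continue  # visit each row pair once, from the row with x_p = 0
        for c in range(2 ** n):
            a, b = entries[r][c], entries[r | mask][c]
            if a is not None and b is not None and residue(a) == residue(b):
                count += 1
    return count

P, M, E = (0, 0, 0, 1), (0, 0, 0, -1), None  # +1, -1, no entry at k_max
# Numerators of H (x) I, all with denominator exponent k_max = 1.
entries = [[P, E, P, E],
           [E, P, E, P],
           [P, E, M, E],
           [E, P, E, M]]
counts = [count_pairs(entries, 2, p) for p in range(2)]
best = max(range(2), key=lambda p: counts[p])
```

Note that \(+1\) and \(-1\) share the same residue, which mirrors the observation in the example that weights differing only by a factor of \(-1\) form a pair of reduction partners.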
Determining Phase Shifts of Diagonal Entries. In order to determine which diagonal entries have a phase shift that needs to be removed, we can again exploit characteristics of QMDDs. To this end, recall that all diagonal entries are powers of \(\omega \), such that all edge weights in the QMDD will also be powers of \(\omega \). Moreover, recall that our aim in the first step is to express the row indices of all entries with an odd power as a Boolean function. We construct this function in terms of a corresponding decision diagram representation (Binary Decision Diagram, BDD [6]). To this end, we traverse the QMDD in a depth-first manner. When the terminal is reached, a Boolean 0 (0-terminal) is returned. For each non-terminal vertex, a BDD vertex is constructed by taking the resulting BDDs from the first and fourth edge as the low/high child. More precisely, if the corresponding edge weight is an even power of \(\omega \), the original BDD is used, while the negated BDD is used for odd powers. Note that the resulting BDDs can be stored in a computed table and can be reused directly without any further computation when the vertex is processed again during the traversal of the QMDD.
Example 12
The QMDD in Fig. 7a represents the matrix from Fig. 4d. The edge weights of the leftmost and rightmost \(x_2\) vertex are either zero or an even power of \(\omega \). Consequently, these vertices return a constant 0 function (represented by the 0-terminal). As the edges pointing to these vertices from the leftmost \(x_1\) vertex are annotated with an even power of \(\omega \), the resulting BDD vertex is redundant (both children point to the 0-terminal) and can be removed. Likewise, the \(x_2\) vertex in the center of the QMDD yields a BDD vertex, whose low child is pointing to the 0-terminal (edge weight 1) and whose high child is pointing to the 1-terminal, i.e., the negation of the 0-terminal (edge weight \(\omega ^3\)). Overall, the BDD in Fig. 7b is constructed where edges pointing to the low/high child are indicated by dashed/solid lines.
Likewise, BDDs can be constructed for the row indices that correspond to imaginary or \(-1\) entries, as required in the last steps of the algorithm.
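The traversal for odd powers can be sketched as follows. To keep the example short, several simplifications are assumed: a diagonal QMDD vertex is modelled by its first and fourth edge only, each edge stores the exponent \(m\) of its weight \(\omega ^m\), the Boolean function is returned as a truth table instead of a BDD, and `functools.lru_cache` stands in for the computed table. The diagonal used here is hypothetical (chosen to be consistent with the description of the example, not read off Fig. 4d), and all names are illustrative.

```python
from functools import lru_cache

def build_diag(exps):
    """Return an edge (m, node) for a diagonal given by exponents of omega.

    Simplified model: a node is a pair of edges (first and fourth QMDD
    edge); each edge is (m, child), and child None marks the terminal.
    """
    if len(exps) == 1:
        return (exps[0] % 8, None)
    half = len(exps) // 2
    m_lo, lo = build_diag(exps[:half])
    m_hi, hi = build_diag(exps[half:])
    # Normalize so that the first edge has weight omega**0 = 1.
    return (m_lo, ((0, lo), ((m_hi - m_lo) % 8, hi)))

@lru_cache(maxsize=None)  # plays the role of the computed table
def odd_rows(node):
    """Truth table marking the rows below `node` with an odd exponent."""
    if node is None:
        return (False,)  # terminal: constant 0
    table = ()
    for m, child in node:  # low child first, then high child
        sub = odd_rows(child)
        if m % 2 == 1:  # odd power on the edge: negate the function
            sub = tuple(not b for b in sub)
        table += sub
    return table

# Hypothetical diagonal: exponents of omega for rows 000..111 (an assumption).
m_root, root = build_diag([4, 6, 6, 4, 0, 3, 2, 0])
table = odd_rows(root)
if m_root % 2 == 1:  # an odd root weight would negate the whole function
    table = tuple(not b for b in table)
odd = [format(r, "03b") for r, flag in enumerate(table) if flag]
```

Because structurally equal vertices are equal tuples in this model, a vertex that is reached via several paths is evaluated only once, which corresponds to the reuse of results from the computed table described above.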
Overall, QMDDs exhibit multiple useful properties that can readily be exploited for a speedup of the algorithm, on top of their general efficiency for processing huge matrices that has been demonstrated in a broad variety of applications. The resulting benefits of the implementation on top of QMDDs will be evaluated in the next section.
6 Experimental results
In this section, we evaluate the results obtained by the approach and compare them to the synthesis scheme previously proposed in [12] in order to demonstrate whether the proposed heuristic is indeed beneficial and overcomes the conceptual shortcomings of previous work (discussed in Sect. 3), although it has the same worst-case complexity.
To this end, the global synthesis approach discussed above has been implemented in C on top of the QMDD data structure as outlined in the previous section. Moreover, the approach from [12] has also been reimplemented on top of QMDDs in order to benefit from the efficient matrix processing (two-level operations correspond to certain matrix multiplications). In fact, we found that the straightforward matrix representations used in the preliminary version of the paper [28] scaled very poorly and were not able to provide results for circuits with more than 7 qubits due to the expensive matrix multiplications. Motivated by this, we aimed for a more efficient matrix representation in order to make the approaches applicable to larger problem instances and employed QMDDs for this purpose. As benchmarks, we used

arbitrary transformation matrices for medium-sized quantum systems (denoted by arbitrary and covering various cases of multiple smallest denominator exponents in the same matrix),

quantum functionality taken from [23] and realizing Shor’s 9-qubit error-correcting code (denoted by 9qubitN1 and 9qubitN2), a 7-qubit encoding (denoted by 7qubitcode), and an error syndrome measurement circuit for a 5-qubit code (denoted by 5qubitcode), as well as

several classical reversible functions from RevLib [40].
Note that while random benchmarks are rather unusual in the conventional logic synthesis community (where a wide range of established benchmark libraries exists), this is completely different in the quantum computing community, where random benchmarks are the main means to evaluate approaches. Most prominently, this can be observed in the domain of quantum simulation, where random circuits are used to show quantum supremacy. But also the “big players” in the field frequently rely on random benchmarks as can, e.g., be seen in the recent competitions conducted by IBM [38, 46].
The results are summarized in Table 2. The first column provides the identifiers of the respective benchmarks followed by their number of qubits. In the remaining columns, the costs of the resulting circuits are provided. As it is a common understanding that (physical) implementations of T gates are significantly more complex than those of Clifford group gates, the costs are provided in terms of T-depth, i.e., the number of sequential T gates that cannot be conducted in parallel. In order to compute the cost for two-level operations and multiple-controlled Clifford+T gates, we employed the cost metric from Table 1 (based on the elementary circuits and decompositions provided in [1, 12] and assuming the availability of one ancillary qubit).
All experiments have been conducted on a 2.8 GHz Intel Core i7 machine with 8 GB of main memory running Linux. While our QMDD-based implementation of the proposed global synthesis approach easily managed to process matrices for larger quantum systems, the implementations of the approach from [12] (both the reference implementation written in Haskell and the reimplementation on top of QMDDs) in most cases failed to produce results for more than 7 qubits, thereby essentially limiting the size of comparable benchmarks to quantum systems of that size (the timeout was set to 3600 CPU seconds).
Table 2 clearly shows that, using the proposed method, much more compact quantum circuits can be realized for Clifford+T functionality compared to the state-of-the-art approach from [12]. In fact, significant reductions (of up to several orders of magnitude) can be obtained. The tremendous cost reductions for the error correction circuits can be explained by the fact that the corresponding circuits consist of H and CNOT gates only and that all H gates are located at the very beginning and the very end of the circuit. The proposed approach inherently identifies these H gates and, hence, can eliminate the superposition without the need for any T gate, which makes the realization considerably cheaper.
Besides, the significant reductions for the random benchmarks are a consequence of the “global” view taken by the proposed approach (instead of the local view in [12]). As motivated in Sect. 3.1, the resulting sequence of two-level operations allows for combining multiple two-level operations into a joint operation that can be realized with lower costs (cf. Example 3). Thus, the conducted evaluations confirm that the heuristic optimization implemented by the proposed approach is indeed able to overcome the drawbacks of previous works.
Moreover, the results show the enhanced applicability which became possible due to the use of QMDDs as the underlying data structure: much larger quantum functionality can be handled now.
7 Conclusions
In this work, we proposed an improved approach for the synthesis of quantum functionality in terms of Clifford+T quantum circuits. To this end, we explicitly addressed shortcomings of the previously proposed synthesis, which relies on a local, i.e., column-wise, consideration of the given transformation matrix. The proposed method considers this matrix globally, thereby allowing several transformations to be conducted at once and with significantly smaller costs. Although the proposed method is a heuristic optimization with the same worst-case complexity as previous works, experimental evaluations showed that it yields Clifford+T quantum circuits with up to several orders of magnitude smaller costs. Note that while the proposed method aims to determine a circuit realization with a minimized number of T gates, it is not able to determine whether it found the actual minimum (a problem coined COUNT-T in [13]).
In order to enhance the applicability of the approach, we employed more efficient matrix representations (in terms of QMDDs). For future work, one may also have a look into other matrix representations, though we do not expect this to provide a significant benefit, as the improvements gained by the use of QMDD essentially concern the runtime and general applicability of the approach, but not the cost of the resulting circuits. Thus, we rather plan to investigate the further potential for optimization that is, e.g., offered by the degree of freedom that exists when permuting the rows between the application of the Hadamard gate in step (a) of the proposed algorithm.
Notes
A preliminary version of this paper has been published in [28].
From an algebraic perspective, these numbers form the ring \({\mathbb {D}}[\omega ]={\mathbb {D}}[\sqrt{2},i]\) (where \({\mathbb {D}}=\{\frac{d}{2^k} \mid d\in {\mathbb {Z}}, k\in {\mathbb {N}}\} \subset {\mathbb {Q}}\) are the dyadic fractions). This ring has a variety of interesting properties, two of which we will later on make use of in Sect. 4.1, though without going into much detail, as this is beyond the scope of this paper. A more detailed discussion and derivation of these properties can be found in [12].
For a more detailed discussion, we refer to [12]. For the special case of 0-entries, we assume a representation via \(k=a=b=c=d=0\).
Roughly speaking, the multiplicative factor \(\frac{1}{\sqrt{2}}\) increments the exponent such that k can only be decreased if a factor of 2 can be extracted from \(\alpha \pm \beta \). This, in turn, is only possible if the coefficients of \(\alpha ,\beta \) have the same parity.
References
[1] Amy, M., Maslov, D., Mosca, M., Roetteler, M.: A meet-in-the-middle algorithm for fast synthesis of depth-optimal quantum circuits. IEEE Trans. on CAD 32(6), 818–830 (2013). https://doi.org/10.1109/TCAD.2013.2244643
[2] Barenco, A., Bennett, C.H., Cleve, R., DiVincenzo, D., Margolus, N., Shor, P., Sleator, T., Smolin, J., Weinfurter, H.: Elementary gates for quantum computation. Phys. Rev. A 52, 3457–3467 (1995)
[3] Bocharov, A., Roetteler, M., Svore, K.M.: Efficient synthesis of universal repeat-until-success quantum circuits. Phys. Rev. Lett. 114(8), 080502 (2015)
[4] Boykin, P.O., Mor, T., Pulver, M., Roychowdhury, V., Vatan, F.: A new universal and fault-tolerant quantum basis. Inf. Process. Lett. 75(3), 101–107 (2000)
[5] Brayton, R., Mishchenko, A.: ABC: An academic industrial-strength verification tool. In: Computer Aided Verification, pp. 24–40 (2010). https://doi.org/10.1007/978-3-642-14295-6_5
[6] Bryant, R.E.: Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comput. 35(8), 677–691 (1986)
[7] Burgholzer, L., Wille, R.: Improved DD-based equivalence checking of quantum circuits. In: ASP Design Automation Conference, pp. 127–132 (2020)
[8] Burgholzer, L., Wille, R.: Advanced equivalence checking for quantum circuits. arXiv:2004.08420 (2020)
[9] Dawson, C.M., Nielsen, M.A.: The Solovay-Kitaev algorithm. Quantum Inf. Comput. 6(1), 81–95 (2006)
[10] Di Matteo, O., Mosca, M.: Parallelizing quantum circuit synthesis. Quantum Sci. Technol. 1(1), 015003 (2016)
[11] Fowler, A.G., Stephens, A.M., Groszkowski, P.: High-threshold universal quantum computation on the surface code. Phys. Rev. A 80, 052312 (2009). https://doi.org/10.1103/PhysRevA.80.052312
[12] Giles, B., Selinger, P.: Exact synthesis of multiqubit Clifford+T circuits. Phys. Rev. A 87(3), 032332 (2013). https://doi.org/10.1103/PhysRevA.87.032332
[13] Gosset, D., Kliuchnikov, V., Mosca, M., Russo, V.: An algorithm for the T-count. Quantum Inf. Comput. 14(15–16), 1261–1276 (2014)
[14] Grover, L.K.: A fast quantum mechanical algorithm for database search. In: Theory of Computing, pp. 212–219 (1996)
[15] Gupta, P., Agrawal, A., Jha, N.K.: An algorithm for synthesis of reversible logic circuits. IEEE Trans. CAD 25(11), 2317–2330 (2006)
[16] Houshmand, M., Sedighi, M., Zamani, M.S., Marjoei, K.: Quantum circuit synthesis targeting to improve one-way quantum computation pattern cost metrics. J. Emerg. Technol. Comput. Syst. 13(4), 55:1–55:27 (2017). https://doi.org/10.1145/3064834
[17] Jones, N.C.: Logic synthesis for fault-tolerant quantum computers. arXiv:1310.7290 (2013)
[18] Kassal, I., Jordan, S.P., Love, P.J., Mohseni, M., Aspuru-Guzik, A.: Polynomial-time quantum algorithm for the simulation of chemical dynamics. Proc. Natl. Acad. Sci. 105(48), 18681–18686 (2008)
[19] Kliuchnikov, V., Bocharov, A., Roetteler, M., Yard, J.: A framework for approximating qubit unitaries. arXiv:1510.03888 (2015)
[20] Kliuchnikov, V., Maslov, D., Mosca, M.: Asymptotically optimal approximation of single qubit unitaries by Clifford and T circuits using a constant number of ancillary qubits. Phys. Rev. Lett. 110(19), 190502 (2013)
[21] Kliuchnikov, V., Maslov, D., Mosca, M.: Fast and efficient exact synthesis of single-qubit unitaries generated by Clifford and T gates. Quantum Inf. Comput. 13(7–8), 607–630 (2013)
[22] Lin, C., Chakrabarti, A., Jha, N.K.: FTQLS: fault-tolerant quantum logic synthesis. IEEE Trans. VLSI Syst. 22(6), 1350–1363 (2014). https://doi.org/10.1109/TVLSI.2013.2269869
[23] Mermin, N.D.: Quantum Computer Science: An Introduction. Cambridge University Press, Cambridge (2007)
[24] Miller, D.M., Wille, R., Sasanian, Z.: Elementary quantum gate realizations for multiple-control Toffoli gates. In: Int’l Symposium on Multi-Valued Logic, pp. 288–293 (2011)
[25] Nielsen, M., Chuang, I.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2000)
[26] Niemann, P., Wille, R., Drechsler, R.: Efficient synthesis of quantum circuits implementing Clifford group operations. In: ASP Design Automation Conference, pp. 483–488 (2014)
[27] Niemann, P., Wille, R., Drechsler, R.: Equivalence checking in multi-level quantum systems. In: Reversible Computation, pp. 201–215 (2014)
[28] Niemann, P., Wille, R., Drechsler, R.: Improved synthesis of Clifford+T quantum functionality. In: Design, Automation and Test in Europe, pp. 597–600 (2018)
[29] Niemann, P., Wille, R., Miller, D.M., Thornton, M.A., Drechsler, R.: QMDDs: Efficient quantum function representation and manipulation. IEEE Trans. CAD 35(1), 86–99 (2016). https://doi.org/10.1109/TCAD.2015.2459034
[30] Russell, T.: The exact synthesis of 1- and 2-qubit Clifford+T circuits. arXiv e-prints (2014)
[31] Saeedi, M., Arabzadeh, M., Zamani, M.S., Sedighi, M.: Block-based quantum-logic synthesis. Quantum Inf. Comput. 11(3&4), 262–277 (2011)
[32] Saeedi, M., Zamani, M.S., Sedighi, M., Sasanian, Z.: Synthesis of reversible circuit using cycle-based approach. J. Emerg. Technol. Comput. Syst. 6(4), 13 (2010)
[33] Shende, V.V., Bullock, S.S., Markov, I.L.: Synthesis of quantum-logic circuits. IEEE Trans. CAD 25(6), 1000–1010 (2006)
[34] Shende, V.V., Prasad, A.K., Markov, I.L., Hayes, J.P.: Synthesis of reversible logic circuits. IEEE Trans. CAD 22(6), 710–722 (2003)
[35] Shor, P.W.: Algorithms for quantum computation: discrete logarithms and factoring. In: Foundations of Computer Science, pp. 124–134 (1994)
[36] Soeken, M., Wille, R., Hilken, C., Przigoda, N., Drechsler, R.: Synthesis of reversible circuits with minimal lines for large functions. In: ASP Design Automation Conference, pp. 85–92 (2012)
[37] Welch, J., Bocharov, A., Svore, K.M.: Efficient approximation of diagonal unitaries over the Clifford+T basis. Quantum Inf. Comput. 16(1&2), 87–104 (2016)
[38] We have winners! ...of the IBM qiskit developer challenge. https://www.ibm.com/blogs/research/2018/08/winnersqiskitdeveloperchallenge/. Accessed: 2020-02-20
[39] Wille, R., Große, D., Miller, D.M., Drechsler, R.: Equivalence checking of reversible circuits. In: Int’l Symposium on Multi-Valued Logic, pp. 324–330 (2009)
[40] Wille, R., Große, D., Teuber, L., Dueck, G.W., Drechsler, R.: RevLib: an online resource for reversible functions and reversible circuits. In: Int’l Symposium on Multi-Valued Logic, pp. 220–225 (2008). RevLib is available at http://www.revlib.org
[41] Wille, R., Soeken, M., Otterstedt, C., Drechsler, R.: Improving the mapping of reversible circuits to quantum circuits using multiple target lines. In: ASP Design Automation Conference, pp. 85–92 (2013)
[42] Yamashita, S., Markov, I.L.: Fast equivalence-checking for quantum circuits. Quantum Inf. Comput. 10(9&10), 721–734 (2010)
[43] Zulehner, A., Wille, R.: Improving synthesis of reversible circuits: Exploiting redundancies in paths and nodes of QMDDs. In: Reversible Computation, pp. 232–247 (2017)
[44] Zulehner, A., Wille, R.: One-pass design of reversible circuits: Combining embedding and synthesis for reversible logic. IEEE Trans. CAD 37(5), 996–1008 (2017)
[45] Zulehner, A., Wille, R.: Advanced simulation of quantum computations. IEEE Trans. CAD 38(5), 848–859 (2019)
[46] Zulehner, A., Wille, R.: Compiling SU(4) quantum circuits to IBM QX architectures. In: ASP Design Automation Conference, pp. 185–190 (2019)
[47] Zulehner, A., Wille, R.: Matrix-vector vs. matrix-matrix multiplication: Potential in DD-based simulation of quantum computations. In: Design, Automation and Test in Europe, pp. 90–95 (2019)
Acknowledgements
This work has partially been supported by the LIT Secure and Correct Systems Lab funded by the State of Upper Austria as well as by the BMK, BMDW, and the State of Upper Austria in the frame of the COMET program (managed by the FFG).
Funding
Open Access funding provided by Projekt DEAL.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Niemann, P., Wille, R. & Drechsler, R. Advanced exact synthesis of Clifford+T circuits. Quantum Inf Process 19, 317 (2020). https://doi.org/10.1007/s11128-020-02816-0