## Introduction

Scaling a real matrix O with non-negative entries means finding diagonal matrices $$D_1, D_2$$ such that $$B=D_1OD_2$$ is bistochastic. Sinkhorn's theorem gives a necessary and sufficient condition for the existence of such a decomposition. Moreover, the iterative Sinkhorn–Knopp algorithm finds the bistochastic matrix B. Such a decomposition can be used for ranking web pages, preconditioning sparse matrices and understanding traffic circulation.

Since unitary matrices are the complex analogue of orthogonal matrices, it is natural to ask whether there exists a counterpart of Sinkhorn's theorem for them. De Vos and De Baerdemacker considered whether, for an arbitrary unitary matrix $$U\in \mathbf {U}(n)$$, there exist two unitary diagonal matrices $$U_1, U_2$$ such that the matrix $$U_1UU_2$$ has all line sums equal to 1. Such a decomposition exists for an arbitrary unitary matrix, and an algorithm for finding it approximately was presented. Such matrices, called negators, were treated as a quantum counterpart of bistochastic matrices; they form a group $$\mathbf {XU}(n)$$ under multiplication. Idel and Wolf proposed an application of this quantum scaling in quantum optics.

The algorithm converges for an arbitrary unitary matrix U. A similar decomposition of unitary matrices $$U\in \mathbf {U}(2m)$$, called the bZbXbZ decomposition, was also presented. It was shown that there always exist matrices $$A, B, C, D\in \mathbf {U}(m)$$ such that

\begin{aligned} U = \begin{bmatrix} A&\quad 0 \\ 0&\quad B \end{bmatrix} \frac{1}{2}\begin{bmatrix} {I}+C&\quad {I}-C \\ {I}-C&\quad {I}+C \end{bmatrix} \begin{bmatrix} {I}&\quad 0 \\ 0&\quad D \end{bmatrix}, \end{aligned}
(1)

where $${I}$$ is the identity matrix. The matrix in the middle is a block-negator matrix (which is also a negator matrix), while the left and right matrices are block diagonal. An algorithm for finding such a decomposition was also presented.

The group $$\mathbf {XU}(2^n)$$ is isomorphic to $$\mathbf {U}(2^n-1)$$ and can be generated by single-qubit negator and controlled-$$\sqrt{\text {NOT}}$$ gates. However, the proof is non-constructive, since it relies on a decomposition designed for generating random matrices. Although that decomposition is proved to exist for any unitary matrix, obtaining it is a very complex task. Therefore, another approach is needed for an efficient decomposition procedure.

In this article, using a method similar to the one presented by de Vos and de Baerdemacker, we demonstrate an implementation of an arbitrary k-qubit unitary operation using a one-qubit ancilla with controlled-$$\sqrt{\text {NOT}}$$ and single-qubit negator gates. Since a product of these basic negator gates is still a negator matrix, our result can be seen as a quantum analogue of matrix scaling. More precisely, we prove that for an arbitrary matrix $$U\in \mathbf {U}(2^k)$$ acting on a system $$\mathcal {H}$$, there exists a negator $$N\in \mathbf {XU}(2^{k+1})$$ such that for an arbitrary state $$| \psi \rangle \in \mathcal {H}$$, we have

\begin{aligned} U| \psi \rangle = \Psi (N \Phi (| \psi \rangle )). \end{aligned}
(2)

Here, $$\Phi$$ denotes the operation of extending the system with an ancilla register in the $$| - \rangle$$ state, and $$\Psi$$ denotes the partial trace over the ancilla system. Since after performing the operations $$\Phi$$ and N the state is of the form $$| - \rangle \otimes U| \psi \rangle$$, the partial trace simply removes the ancilla system, giving the pure state $$U| \psi \rangle$$. We describe an efficient algorithm that, for a given U, returns an explicit and exact form of N together with its decomposition into a sequence of single-qubit negator and controlled-$$\sqrt{\text {NOT}}$$ gates only, in contrast to the results of de Vos and de Baerdemacker [9, 10].

In Sect. 2, we recall basic facts. In Sect. 3, we show how to perform such a transformation efficiently and give its cost in terms of controlled-$$\sqrt{\text {NOT}}$$ gates.

To illustrate the transformation method, a transformation of Grover’s search algorithm is presented step by step in Sect. 4.

## Basic facts

Negator gates of dimension 2 were introduced by de Vos and de Baerdemacker as unitary matrices $$N\in \mathbf {U}(2)$$ that are combinations $$a{I}+b\,\text {NOT}$$ of the identity matrix and the NOT gate with complex coefficients satisfying $$a+b=1$$. A simple calculation shows that they are of the form

\begin{aligned} N(\theta )=\frac{1}{2} \begin{bmatrix} 1 + e^{i\theta }&\quad 1 - e^{i\theta } \\ 1 - e^{i\theta }&\quad 1 + e^{i\theta } \end{bmatrix}, \end{aligned}

where $$\theta \in [0,2\pi )$$. Negators form a subgroup of the single-qubit unitary operations; in particular, $$N(\phi )N(\psi )=N(\phi +\psi )$$ for any values of $$\phi$$ and $$\psi$$. In the following, we will also use a two-qubit negator operation, the controlled-$$\sqrt{\text {NOT}}$$ gate (which is the controlled-$$N(\frac{\pi }{2})$$ gate):

\begin{aligned} \begin{bmatrix} 1&\quad 0&\quad 0&\quad 0 \\ 0&\quad 1&\quad 0&\quad 0 \\ 0&\quad 0&\quad \frac{1+i}{2}&\quad \frac{1-i}{2} \\ 0&\quad 0&\quad \frac{1-i}{2}&\quad \frac{1+i}{2} \\ \end{bmatrix}. \end{aligned}

As these gates are used as basic operators, each of them is given a simplified symbol in circuit diagrams.

These two kinds of unitary matrices will be called NCN gates (Negators-Controlled-Negator).
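The defining properties of the two NCN gates can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`negator`, `controlled`, `line_sums` and so on) are illustrative and not taken from the original work, and the control is assumed to sit on the first qubit, as in the matrix above.

```python
import cmath

def negator(theta):
    """Single-qubit negator N(theta) as a 2x2 nested list."""
    e = cmath.exp(1j * theta)
    return [[(1 + e) / 2, (1 - e) / 2],
            [(1 - e) / 2, (1 + e) / 2]]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def line_sums(a):
    """All row sums followed by all column sums."""
    return [sum(row) for row in a] + [sum(col) for col in zip(*a)]

def controlled(gate):
    """4x4 controlled version of a 2x2 gate (assumed: control on first qubit)."""
    c = [[0j] * 4 for _ in range(4)]
    c[0][0] = c[1][1] = 1
    for i in range(2):
        for j in range(2):
            c[2 + i][2 + j] = gate[i][j]
    return c

IDENTITY = [[1, 0], [0, 1]]
phi, psi = 0.7, 2.1  # arbitrary test angles

N = negator(phi)
is_unitary = close(matmul(N, dagger(N)), IDENTITY)
sums_are_one = all(abs(s - 1) < 1e-12 for s in line_sums(N))
group_law = close(matmul(negator(phi), negator(psi)), negator(phi + psi))

# Controlled-sqrt(NOT) is the controlled-N(pi/2) gate.
c_sqrt_not = controlled(negator(cmath.pi / 2))
c_sums_are_one = all(abs(s - 1) < 1e-12 for s in line_sums(c_sqrt_not))
```

The checks cover unitarity, unit line sums, and the group law for one pair of angles only, so they illustrate the statements above rather than prove them.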

In Sect. 3, a decomposition of single-qubit unitary gates will be needed. Every unitary matrix $$U\in \mathbf {U}(2)$$ can be presented as a product of a global phase, two z-rotators and one y-rotator:

\begin{aligned} U= & {} e^{i\phi _0} \begin{bmatrix} \cos \frac{\phi _1}{2} e^{i\phi _2}&\sin \frac{\phi _1}{2} e^{i\phi _3}\\ -\sin \frac{\phi _1}{2} e^{-i\phi _3}&\cos \frac{\phi _1}{2} e^{-i\phi _2} \end{bmatrix} \nonumber \\= & {} e^{i\phi _0} \begin{bmatrix} e^{i\frac{\phi _2+\phi _3}{2}}&0 \\ 0&e^{-i\frac{\phi _2+\phi _3}{2}} \end{bmatrix} \begin{bmatrix} \cos \frac{\phi _1}{2}&\sin \frac{\phi _1}{2}\\ -\sin \frac{\phi _1}{2}&\cos \frac{\phi _1}{2} \end{bmatrix} \begin{bmatrix} e^{i\frac{\phi _2-\phi _3}{2}}&0 \\ 0&e^{-i\frac{\phi _2-\phi _3}{2}} \end{bmatrix} \nonumber \\= & {} e^{i\phi _0} R_z(-\phi _2-\phi _3)R_y(\phi _1)R_z(\phi _3-\phi _2). \end{aligned}
(3)
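Eq. (3) can be checked numerically for concrete angles. The sketch below assumes the rotator conventions that can be read off from the middle line of Eq. (3), namely $$R_z(\theta )=\mathrm{diag}(e^{-i\theta /2}, e^{i\theta /2})$$ and $$R_y(\theta )$$ with $$+\sin \frac{\theta }{2}$$ in the upper-right corner; the function names are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """z-rotator, assumed convention: diag(exp(-i theta/2), exp(i theta/2))."""
    return [[cmath.exp(-1j * theta / 2), 0],
            [0, cmath.exp(1j * theta / 2)]]

def ry(theta):
    """y-rotator with the sign convention used in Eq. (3)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, s], [-s, c]]

def general_unitary(p0, p1, p2, p3):
    """First line of Eq. (3): explicit matrix entries."""
    g = cmath.exp(1j * p0)
    c, s = math.cos(p1 / 2), math.sin(p1 / 2)
    return [[g * c * cmath.exp(1j * p2), g * s * cmath.exp(1j * p3)],
            [-g * s * cmath.exp(-1j * p3), g * c * cmath.exp(-1j * p2)]]

def from_rotators(p0, p1, p2, p3):
    """Last line of Eq. (3): phase * Rz(-p2-p3) * Ry(p1) * Rz(p3-p2)."""
    prod = matmul(matmul(rz(-p2 - p3), ry(p1)), rz(p3 - p2))
    g = cmath.exp(1j * p0)
    return [[g * x for x in row] for row in prod]

angles = (0.3, 1.1, -0.8, 2.4)  # arbitrary values of phi_0 .. phi_3
lhs = general_unitary(*angles)
rhs = from_rotators(*angles)
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

For these angles `max_error` stays at the level of floating-point rounding, which is consistent with the first and last lines of Eq. (3) describing the same matrix.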

Since a global phase is not measurable, we can simplify this representation without loss of information:

\begin{aligned} U \cong R_z(\gamma )R_y(\beta )R_z(\alpha ), \end{aligned}
(4)

where ‘$$\cong$$’ means equality up to a global phase. The same applies to a global phase change on one of the registers of a bigger system:

\begin{aligned} U_1\otimes e^{i\phi } U_2 \otimes U_3 = e^{i\phi }(U_1\otimes U_2\otimes U_3) \cong U_1\otimes U_2\otimes U_3 . \end{aligned}
(5)

Together, these two facts allow us to ignore a global phase change on any register.

While this may suggest that our transformation mainly concerns the group $$\mathbf {SU}(n)$$, we decided to stay with the formalism of unitary matrices, since negator gates are not special unitary matrices. The result may be written using special unitary matrices; however, the column and row sums of the negator gates will then equal $$e^{i\theta }$$ in general.

## Circuit transformation method

In this section, we provide a complete description of the transformation method. We first recall a sketch of the proof of the universality theorem relating quantum gates and negator gates from the work of de Vos and de Baerdemacker. Next, we present a method for transforming an arbitrary single-qubit gate into a product of NCN gates. Then, we provide a decomposition method for an arbitrary k-qubit circuit, based on the single-qubit case. Finally, we analyze the cost of the presented transformation.

### Universality theorem

De Vos and de Baerdemacker proved a universality theorem: the group $$\mathbf {XU}(2^k)$$ generated by negators and controlled-$$\sqrt{\text {NOT}}$$ gates is isomorphic to $$\mathbf {U}(2^k-1)$$. The proof consists of several steps:

1. Every matrix $$U\in \mathbf {U}(2^k-1)$$ can be decomposed into a product of m gates $$U_1U_2\cdots U_m$$, where the matrices $$U_i\in \mathbf {U}(2^k-1)$$ are of some special forms.

2. The group $$\mathbf {U}(2^k-1)$$ is isomorphic to the group

   \begin{aligned} \mathbf {^{1}U}(2^k) = \left\{ \begin{bmatrix} 1&\quad \mathbf 0 ^T\\ \mathbf 0&\quad U\end{bmatrix}: U\in \mathbf {U}(2^k-1) \right\} , \end{aligned}
   (6)

   because of the isomorphism $$h: \mathbf {U}(2^k-1) \rightarrow \mathbf {^{1}U}(2^k)$$

   \begin{aligned} h(U) = \begin{bmatrix} 1&\quad \mathbf 0 ^T\\ \mathbf 0&\quad U \end{bmatrix}. \end{aligned}
   (7)

3. The function $$f : \mathbf {^{1}U}(2^k) \rightarrow \mathbf {XU}(2^k)$$ of the form $$f(U)=(H\otimes {I}_{2^k})U(H\otimes {I}_{2^k})$$ is an isomorphism.

4. Every $$f(h(U_i))$$ can be decomposed into a product of NCN gates, where $$U_i$$ comes from step 1.

The proof used the decomposition presented in the work of Poźniak et al., because that decomposition is proven to exist for any unitary matrix. However, obtaining it is a very complex task. Therefore, we need to choose a different decomposition in order to find an efficient decomposition procedure.

Obviously, the group $$\mathbf {U}(2^k)$$ is isomorphic to some subgroup of $$\mathbf {XU}(2^{k+1})$$. In other words, with an ancilla (one additional qubit), every unitary matrix can be replaced with a sequence of NCN gates. For our purpose, we choose the function $$g:\mathbf {U}(2^k) \rightarrow \mathbf {XU}(2^{k+1})$$ given by

\begin{aligned} g(U) = (H\otimes {I})(| 0 \rangle \langle 0 |\otimes {I}+ | 1 \rangle \langle 1 |\otimes U)(H\otimes {I})= \frac{1}{2} \begin{bmatrix} {I}&\quad {I}\\ {I}&\quad -{I}\end{bmatrix} \begin{bmatrix} {I}&\quad \mathbf 0\\ \mathbf 0&\quad U \end{bmatrix}\begin{bmatrix} {I}&\quad {I}\\ {I}&\quad -{I}\end{bmatrix}. \end{aligned}
(8)

Under the function g, every gate U becomes a controlled-U gate, with the ancilla as the control qubit, conjugated by Hadamard gates acting on the ancilla.

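Note that if the first qubit is set to $$| - \rangle$$, the first Hadamard gate maps it to $$| 1 \rangle$$, so the control qubit does not influence the result (the condition is always ‘true’); the second Hadamard gate then returns the ancilla to $$| - \rangle$$.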
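The behaviour of g can be illustrated for a single-qubit U. The sketch below builds g(U) from the block form obtained by multiplying out the right-hand side of Eq. (8), $$g(U)=\frac{1}{2}\begin{bmatrix} {I}+U&{I}-U\\ {I}-U&{I}+U\end{bmatrix}$$, and checks that all line sums equal 1 and that $$| - \rangle \otimes | \psi \rangle$$ is mapped to $$| - \rangle \otimes U| \psi \rangle$$. The test matrix, the test state and the helper names are arbitrary choices made for illustration.

```python
import cmath
import math

def g(u):
    """Block form of Eq. (8): (1/2) [[I+U, I-U], [I-U, I+U]]."""
    n = len(u)
    def block(sign):
        return [[0.5 * ((1 if i == j else 0) + sign * u[i][j])
                 for j in range(n)] for i in range(n)]
    plus, minus = block(+1), block(-1)
    top = [plus[i] + minus[i] for i in range(n)]      # list concatenation
    bottom = [minus[i] + plus[i] for i in range(n)]
    return top + bottom

def apply(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

# An arbitrary single-qubit unitary whose line sums are not equal to 1.
a, b = 0.4, 1.3
U = [[math.cos(a), -math.sin(a) * cmath.exp(1j * b)],
     [math.sin(a),  math.cos(a) * cmath.exp(1j * b)]]
psi = [0.6, 0.8j]  # normalized test state

N = g(U)
sums = [sum(row) for row in N] + [sum(col) for col in zip(*N)]
all_line_sums_one = all(abs(s - 1) < 1e-12 for s in sums)

r = 1 / math.sqrt(2)
state_in = [r * x for x in psi] + [-r * x for x in psi]    # |-> (x) |psi>
u_psi = apply(U, psi)
expected = [r * y for y in u_psi] + [-r * y for y in u_psi]  # |-> (x) U|psi>
state_out = apply(N, state_in)
ancilla_unchanged = all(abs(x - y) < 1e-12 for x, y in zip(state_out, expected))
```

Because the ancilla leaves the circuit in the same state $$| - \rangle$$ in which it entered, discarding it yields the pure state $$U| \psi \rangle$$.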

### Single-qubit gate transformation

Now, we aim at a decomposition of an arbitrary single-qubit gate into NCN gates. By Eq. (4), for any $$U \in {\mathbf {U}}(2)$$ there exist real parameters $$\alpha ,\beta ,\gamma$$ such that

\begin{aligned} U \cong R_z(\gamma )R_y(\beta )R_z(\alpha ). \end{aligned}
(9)

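Therefore, after applying the function g, we obtain a circuit in which the three rotators are controlled by the ancilla qubit and surrounded by Hadamard gates on the ancilla.

We replace the rotators, together with the neighboring Hadamard gates, by NCN gates as shown in Fig. 1.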

Let us note that the symbols of controlled-NOT, controlled-$$\sqrt{\text {NOT}}^\dagger$$ and controlled-negator gates used in the decomposed circuit do not mean that these gates cannot be transformed. We use these symbols as a simplified notation for their decompositions into controlled-$$\sqrt{\text {NOT}}$$ gates, as shown in Fig. 2.
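Two of these replacements follow directly from the group law $$N(\phi )N(\psi )=N(\phi +\psi )$$: two controlled-$$\sqrt{\text {NOT}}$$ gates compose to a controlled-NOT, and three compose to a controlled-$$\sqrt{\text {NOT}}^\dagger$$. The sketch below checks this numerically; it is not claimed to reproduce the circuits of Fig. 2, and the helper names are illustrative.

```python
import cmath

def negator(theta):
    """Single-qubit negator N(theta)."""
    e = cmath.exp(1j * theta)
    return [[(1 + e) / 2, (1 - e) / 2],
            [(1 - e) / 2, (1 + e) / 2]]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def controlled(gate):
    """4x4 controlled version of a 2x2 gate (assumed: control on first qubit)."""
    c = [[0j] * 4 for _ in range(4)]
    c[0][0] = c[1][1] = 1
    for i in range(2):
        for j in range(2):
            c[2 + i][2 + j] = gate[i][j]
    return c

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

sqrt_not = negator(cmath.pi / 2)
V = controlled(sqrt_not)            # controlled-sqrt(NOT)
V2 = matmul(V, V)                   # two gates in sequence
V3 = matmul(V2, V)                  # three gates in sequence

cnot = controlled([[0, 1], [1, 0]])
c_sqrt_not_dagger = controlled(dagger(sqrt_not))

two_give_cnot = close(V2, cnot)
three_give_dagger = close(V3, c_sqrt_not_dagger)
```

A controlled-negator with a general angle cannot be obtained by simply repeating the controlled-$$\sqrt{\text {NOT}}$$ gate, so that case needs the construction referred to in the text.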

### General transformation method

Now, we consider the transformation of an arbitrary k-qubit circuit. Let us assume that we have a circuit which consists of a unitary operation $$U \in {\mathbf {U}}(2^k)$$, a generalized measurement $$\mathbf M = \{M_a\in {L}(\mathbb {C}^{2^k}):a\in \Sigma \}$$, where $$\Sigma$$ is the set of classical measurement outcomes, and a starting state $$| \phi _0 \rangle$$.

In order to construct a decomposition of the unitary U into a sequence of negator gates, we begin by decomposing U into controlled-NOT and single-qubit gates, denoted here by a sequence of gates $$U=V_m\cdots V_1$$. Contrary to the decomposition presented in the work of Poźniak, Życzkowski and Kuś, there exist efficient methods for constructing such a circuit. Next, we add an additional qubit, transform the $$V_i$$ gates into controlled-$$V_i$$ gates and insert pairs of Hadamard gates on the ancilla between them (which is allowed since $$HH={I}$$).

Let us note that the product $$H\cdot \text {controlled-}V_j\cdot H$$ is the image of $$V_j$$ under the homomorphism presented in Eq. (8). Next, we replace each such product with a sequence of NCN gates (here denoted by $$\mathbf N_j$$) as in the previous subsection (if $$V_j$$ is a controlled-NOT gate, then we choose the Toffoli gate transformation from Fig. 1).

For the sake of simplicity, we may change the starting state and the resulting state on the first wire.

Now, we have an equivalent circuit which consists of negators and controlled-$$\sqrt{\text {NOT}}$$ gates only.

### Transformation cost

Now, we consider an upper bound on the cost of the decomposition into a negator circuit. Two kinds of cost will be discussed: memory complexity and the number of single- and two-qubit gates. As for the first, the transformation of an arbitrary k-qubit circuit requires one additional qubit.

Let $$c_{\text {CNOT}}(k)$$ and $$c_s(k)$$ denote upper bounds on the number of, respectively, controlled-NOT and single-qubit gates needed for the implementation of an arbitrary k-qubit circuit. Using the transformation presented above, we need $$17c_{\text {CNOT}}(k) + 64c_s(k)$$ controlled-$$\sqrt{\text {NOT}}$$ gates and $$11c_{\text {CNOT}}(k)+34c_s(k)$$ negators to implement an equivalent circuit (up to a global phase).

Any circuit which consists of controlled-NOT and single-qubit gates can be simplified in such a way that $$c_s(k)\le 2c_{\text {CNOT}}(k)+k$$. This estimate is based on the worst case, in which there are two single-qubit gates between every pair of consecutive controlled-NOT gates. Taking this into account, we can express the previous result in terms of $$c_{\text {CNOT}}$$ only: at most $$17c_{\text {CNOT}}(k)+ 64c_s(k) \le 145 c_{\text {CNOT}}(k)+64k$$ controlled-$$\sqrt{\text {NOT}}$$ gates are needed. In particular, if $$c_{\text {CNOT}} = O(4^k)$$, then so is the number of controlled-$$\sqrt{\text {NOT}}$$ gates.
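The arithmetic behind these bounds can be written as a short helper. This is only a sketch of the counting stated above; the function names are hypothetical, and the worst-case negator count is obtained here by the same substitution $$c_s(k)=2c_{\text {CNOT}}(k)+k$$, which the text applies explicitly only to the controlled-$$\sqrt{\text {NOT}}$$ count.

```python
def ncn_cost(c_cnot, c_single):
    """Upper bounds on NCN gate counts for a circuit with the given numbers
    of controlled-NOT and single-qubit gates."""
    return {
        "controlled_sqrt_not": 17 * c_cnot + 64 * c_single,
        "negators": 11 * c_cnot + 34 * c_single,
    }

def ncn_cost_worst_case(c_cnot, k):
    """Same bounds with c_single replaced by its worst case 2*c_cnot + k."""
    return ncn_cost(c_cnot, 2 * c_cnot + k)

# Example: a 3-qubit circuit with 10 controlled-NOT gates.
bounds = ncn_cost_worst_case(c_cnot=10, k=3)
```

For this example the controlled-$$\sqrt{\text {NOT}}$$ bound evaluates to $$145\cdot 10+64\cdot 3=1642$$, in agreement with the closed form above.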

## Step-by-step transformation example

To illustrate the introduced decomposition, we will present Grover’s algorithm for $$k=2$$ qubits as an NCN circuit. The original circuit for this algorithm is presented in Fig. 3, where $$\omega$$ denotes the searched state.

As in the previous section, we add one qubit, change every H and G gate into controlled-H and controlled-G, respectively, and add Hadamard gates on the ancilla register. The first steps of the decomposition are explicitly presented in Fig. 4. The following facts were used:

• the decomposition of Hadamard gate is $$H\cong R_z(\pi )R_y(\frac{\pi }{2})R_z(0)=R_z(\pi )R_y(\frac{\pi }{2})$$,

• the decomposition of NOT gate is $$\text{ NOT }\cong R_z(\pi )R_y(\pi )R_z(0)=R_z(\pi )R_y(\pi )$$,

• for any $$U, V \in {\mathbf {U}}(2)$$, we have

• Grover’s diffusion operator can be decomposed in the following way

The decomposition of $$U_\omega$$ depends strictly on the value of $$\omega$$; therefore, it is not presented in the example. The full decomposition is presented in Fig. 4.
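The two rotator decompositions listed above can be verified up to a global phase. The sketch below assumes the rotator conventions implied by Eq. (3), $$R_z(\theta )=\mathrm{diag}(e^{-i\theta /2}, e^{i\theta /2})$$ and $$R_y(\theta )$$ with $$+\sin \frac{\theta }{2}$$ in the upper-right corner; the helper `equal_up_to_phase` is illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """z-rotator, assumed convention: diag(exp(-i theta/2), exp(i theta/2))."""
    return [[cmath.exp(-1j * theta / 2), 0],
            [0, cmath.exp(1j * theta / 2)]]

def ry(theta):
    """y-rotator with the sign convention used in Eq. (3)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, s], [-s, c]]

def equal_up_to_phase(a, b, tol=1e-12):
    """True if a = exp(i*phi) * b for some real phi."""
    # Read the candidate phase off the first nonzero entry of b.
    phase = next(a[i][j] / b[i][j]
                 for i in range(2) for j in range(2) if abs(b[i][j]) > tol)
    if abs(abs(phase) - 1) > tol:
        return False
    return all(abs(a[i][j] - phase * b[i][j]) < tol
               for i in range(2) for j in range(2))

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]
NOT = [[0, 1], [1, 0]]

h_ok = equal_up_to_phase(matmul(rz(math.pi), ry(math.pi / 2)), H)
not_ok = equal_up_to_phase(matmul(rz(math.pi), ry(math.pi)), NOT)
```

With these conventions both products equal the target gate multiplied by the global phase $$-i$$, which is why the relations are stated with ‘$$\cong$$’ rather than equality.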

## Concluding remarks

In the presented work, we provide a constructive method of scaling arbitrary unitary matrices $$U\in \mathbf {U}(2^k)$$. More precisely, we proved that for an arbitrary unitary matrix $$U\in \mathbf {U}(2^k)$$, there exists a unitary negator matrix $$N\in \mathbf {XU}(2^{k+1})$$ such that for an arbitrary state $$| \psi \rangle$$, we have

\begin{aligned} U| \psi \rangle = \Psi (N \Phi (| \psi \rangle )). \end{aligned}
(10)

Here, $$\Phi$$ denotes the operation of extending the system with an ancilla register in the $$| - \rangle$$ state, and $$\Psi$$ denotes the partial trace over the ancilla system. We described an efficient algorithm for decomposing N into a product of single-qubit negator and controlled-$$\sqrt{\text {NOT}}$$ gates. Our decomposition consists of $$O(4^k)$$ entangling gates, which is proved to be optimal, and needs a one-qubit ancilla.

Our result can be seen as a complex analogue of the Sinkhorn–Knopp algorithm, which is known to have wide applications. It contrasts with the previous results, which could only be used to prove the existence of such a decomposition, whereas our transformation is exact and can be found constructively. Unlike earlier approaches, our transformation consists only of negator gates. The main difference is that the transformation needs a one-qubit ancilla.