A Universal Quantum Algorithm for Weighted Maximum Cut and Ising Problems

We propose a hybrid quantum-classical algorithm to compute approximate solutions of binary combinatorial problems. We employ a shallow-depth quantum circuit to implement a unitary and Hermitian operator that block-encodes the weighted maximum cut or the Ising Hamiltonian. Measuring the expectation of this operator on a variational quantum state yields the variational energy of the quantum system. The system is driven towards the ground state of the problem Hamiltonian by optimizing a set of angles using normalized gradient descent. Experimentally, our algorithm outperforms the state-of-the-art quantum approximate optimization algorithm on random fully connected graphs and challenges D-Wave quantum annealers by producing better approximate solutions. Source code and data files are publicly available at https://github.com/nkuetemeli/UQMaxCutAndIsing.


Introduction
Quantum computing has emerged as a powerful computation paradigm that takes advantage of the principles of quantum mechanics. It involves two computational models that fundamentally differ in their functioning: The adiabatic quantum model is best suited for optimization problems, typically cast in quadratic unconstrained binary optimization form. Commercial devices such as d-wave annealers [1] can already solve combinatorial problems in various fields such as computer vision [2,3] and database engineering [4,5]. The universal model of quantum computing, also referred to as the gate-based or circuit model, is more flexible and can potentially implement any classical operation, as shown by Bennett [6]. For selected problems such as Shor's factoring [7] and Grover's search [8] algorithms, strong theoretical convergence properties and drastic speed-up of the universal quantum computing model over classical counterparts could be proven. However, for problem sizes of practical interest, these algorithms still require more resources and run-time than existing universal devices can provide. In the era of noisy intermediate-scale devices, it is a challenging task to find real-world applications of universal quantum computing. Hybrid quantum-classical algorithms are therefore considered to be a promising way to obtain practical quantum supremacy.
The Ising model [9] describes a quantum mechanical system with $n \in \mathbb{N}$ particles or spins that can be in two possible states, i.e., the spin $s_i$, $i = 1, \dots, n$, can be in the state $\pm 1$, represented by $s_i = (-1)^{q_i}$, $q_i \in \{0, 1\}$. Each spin $s_i$ can interact with some external energy of strength $C_{ii}$ or with an adjacent spin $s_j$ by a mutual interaction energy $C_{ij}$. The complete system can be modelled by a general undirected graph $G = (S, E, C)$ on $n$ vertices with $S = \{s_1, \dots, s_n\}$, $E \subseteq S \times S$ and a cost function $C : E \to \mathbb{R}$ with $C(s_i, s_j) := C_{ij}$ on $E$. For a quantum system in the state $|q\rangle = \bigotimes_{i=1}^{n} |q_i\rangle$, $q_i \in \{0, 1\}$, the Ising model describes the total energy of the system as the expectation $\langle C \rangle := \langle q|C|q\rangle$ of the $(2^n \times 2^n)$ Hamiltonian
$$C := \sum_{(s_i, s_i) \in E} C_{ii} Z_i + \sum_{(s_i, s_j) \in E,\ i < j} C_{ij} Z_i Z_j, \tag{1}$$
where $Z_k$ denotes the Pauli-Z operator acting on the $k$th particle of the system. The problem is to solve
$$\min_{q \in \{0,1\}^n} \langle q|C|q\rangle, \tag{2}$$
i.e., to search for the state $|q^\star\rangle$ of the system with the minimal energy according to the Ising model in eq. (1). By setting $C_{ii} = 0$ for all $i$, the problem reduces to the so-called weighted maximum cut (maxcut) problem [10]. The observable $C$ in eq. (1) is a diagonal matrix and it is straightforward to verify that the expectation $\langle C \rangle$ is lower-bounded by the smallest eigenvalue $C_{\min}$ of $C$. Indeed, measuring $C$ on a quantum system prepared in the superposition state $|\psi\rangle = \sum_{q=0}^{2^n-1} \alpha_q |q\rangle$ yields
$$\langle C \rangle = \langle \psi|C|\psi\rangle = \sum_{q=0}^{2^n-1} |\alpha_q|^2 \langle q|C|q\rangle \geq C_{\min} \sum_{q=0}^{2^n-1} |\alpha_q|^2 = C_{\min}.$$
In other words, among all norm-1 vectors, $\langle C \rangle$ reaches the minimum exactly when $|\psi\rangle$ is an eigenvector of $C$ to the lowest eigenvalue, i.e., a ground state $|q^\star\rangle$ of $C$. Computing $\langle C \rangle$ for all eigenvectors of $C$, which amounts to computing the diagonal elements of $C$, is practically hard, as it amounts to brute-forcing the problem. Instead, in hybrid quantum-classical approaches, one aims for an efficient way to search for parameters $\Theta \in \Gamma$ that solve the variational problem [14]
$$\min_{\Theta \in \Gamma} \langle \psi(\Theta)|C|\psi(\Theta)\rangle. \tag{5}$$
Thus, determining solutions of the Ising spin model or the maxcut problem, even approximately, is of great practical interest.
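To make the energy of eq. (1) concrete, the following minimal sketch evaluates the cost of a bit-string classically and finds the ground state by enumeration. It assumes the usual form of the Ising energy, a sum of unary terms $C_{ii} s_i$ and pairwise terms $C_{ij} s_i s_j$; the function names and the 3-spin instance are hypothetical and serve only as an illustration.

```python
from itertools import product

def ising_energy(q, unary, pairs):
    """Return <q|C|q> for the bit-string q, using spins s_i = (-1)**q_i."""
    s = [(-1) ** b for b in q]
    energy = sum(w * s[i] for i, w in unary.items())
    energy += sum(w * s[i] * s[j] for (i, j), w in pairs.items())
    return energy

def brute_force_ground_state(n, unary, pairs):
    """Enumerate all 2**n bit-strings; only feasible for very small n."""
    return min(product((0, 1), repeat=n),
               key=lambda q: ising_energy(q, unary, pairs))

# Hypothetical 3-spin instance; the weights are chosen only for illustration.
unary = {0: 1.0, 1: -2.0, 2: 0.5}                  # C_ii
pairs = {(0, 1): 3.0, (1, 2): -1.0, (0, 2): 2.0}   # C_ij, i < j

q_star = brute_force_ground_state(3, unary, pairs)
c_min = ising_energy(q_star, unary, pairs)

# A superposition with probabilities |alpha_q|^2 can never beat c_min.
probs = {(0, 0, 0): 0.25, (1, 0, 0): 0.5, (1, 1, 1): 0.25}
expectation = sum(p * ising_energy(q, unary, pairs) for q, p in probs.items())
```

The enumeration costs $2^n$ energy evaluations, which is exactly the brute-force effort that the variational formulation tries to avoid.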
Adiabatic quantum computation (aqc) relies on the adiabatic theorem [15,16] and solves the Ising problem by performing an adiabatic evolution of the quantum system from the known and easily prepared ground state of an initial Hamiltonian to that of the problem Hamiltonian. d-wave [1] quantum annealers are intermediate-scale devices implementing aqc. It is an established fact that the run-time of aqc algorithms scales inversely with the energy gap between the ground state and the first excited state of the system Hamiltonian [15,16,17]. This makes it necessary to carefully design the system Hamiltonian or to use spectral gap amplification techniques [18]. Another workaround is the design of (hybrid) gate-based algorithms that use moderate quantum resources with outer-loop classical optimization [19].
Quantum approximate optimization algorithms (qaoa), first introduced by Farhi et al. [19] and widely discussed in the literature [20,21,22,23], mimic the aqc model, but use a low-depth variational circuit that is capable of running on near-term devices and benefits from the maturity of classical optimization. qaoa is considered one of the major candidates for dealing with real-world applications at competitive performance using the universal model. However, in practice, optimizing qaoa parameters appears to be extremely difficult due to a non-convex objective [20,22,23]. Also, qaoa is threefold iterative and hence computationally expensive: First, the ansatz itself involves repetitive layers of gates; second, the classical optimization routine used around qaoa often needs to be iterative as the objective function is non-convex; third, evaluating this objective function requires a large number of quantum circuit measurements.
This work aims to facilitate combinatorial optimization on universal quantum computers. We first eliminate the repetitive layers of qaoa and propose an alternative, more effective encoding of the problem on universal quantum hardware. We present a new, easy-to-implement and low-depth variational circuit that block-encodes the maxcut or the Ising Hamiltonian in a Hermitian and unitary operator. Measuring this circuit reveals direct information about the loss function eq. (5). We then derive an optimization routine based on gradient descent for the proposed variational circuit in order to drive the quantum system towards the ground state of the problem Hamiltonian. We experimentally validate the novel algorithm and find that it outperforms the state-of-the-art gate-based qaoa model for solving maxcut; on the Ising model, it also challenges the specialized d-wave annealers.
This paper is structured as follows: In section 2 we recall how d-wave and qaoa solve combinatorial problems; in section 3, we present the new variational quantum circuit and how its parameters are optimized; section 4 provides experimental results and section 5 concludes the work.

Adiabatic Quantum Computing
Adiabatic quantum computation (aqc) is an optimization paradigm which relies on the adiabatic theorem [15,16] and on the observation that the solution of an optimization problem can be encoded as the ground state of a quantum mechanical system [24]. Theoretically, the evolution of a quantum system of $n \in \mathbb{N}$ particles at a time $t \in [0, T]$, typically re-scaled to $s = t/T \in [0, 1]$, can be described by a Hamiltonian $H(s)$ on a Hilbert space $\mathcal{H} = (\mathbb{C}^2)^{\otimes n}$, with the state of the system given by a unit vector $|\psi(s)\rangle \in \mathcal{H}$.
Industrial devices such as the d-wave quantum annealer [1] have already been proposed to solve binary combinatorial problems based on the aqc optimization principle. Their main idea consists in initializing a quantum system with a Hamiltonian $B = \sum_{k=1}^{n} X_k$ whose ground state $|+\rangle^{\otimes n}$ is the perfect superposition state and easy to prepare. As above, $X_k$ denotes the Pauli-X operator acting on the $k$th particle of the system. Then, another Hamiltonian $C$, the problem Hamiltonian, is prepared as in eq. (1). As the time $s$ evolves from 0 to 1, the initial Hamiltonian $B$ is transformed into the problem Hamiltonian $C$, describing a time-dependent system Hamiltonian
$$H(s) = \mathcal{B}(s)\, B + \mathcal{C}(s)\, C,$$
where $\mathcal{B}$ and $\mathcal{C}$, with $\lim_{s \to 1} \mathcal{B}(s) = 0$ and $\lim_{s \to 1} \mathcal{C}(s) = 1$, are annealing functions. The evolution of the system generated by $H(s)$ over the time $s \in [0, 1]$ is governed by the Schrödinger equation; its solution defines a time-dependent unitary operator that transforms the ground state of $B$ into the ground state of $C$ with high probability if $s$ varies sufficiently slowly [16]. The ground state of $C$ is the solution of the optimization problem (2). While being an efficient scheme for combinatorial optimization that has the potential to ultimately supersede classical computers, aqc has a caveat. It is known that the smaller the energy gap between the ground state and the first excited state of the adiabatic Hamiltonian $H(s)$, the longer the annealing time required to guarantee the success of the optimization [15,16,17]. To overcome this, methods for universal quantum computers that take advantage of efficient classical optimization techniques have been proposed in the form of quantum approximate optimization algorithms [19].

Algorithm 1 Quantum Approximate Optimization Algorithms [19]
Require: $G = (S, E, C)$ and $p$
Ensure: $|\psi\rangle = |\gamma, \beta\rangle$
  Initialize the system in $|+\rangle = \sqrt{2}^{-n} \sum_{q=0}^{2^n-1} |q\rangle$
  Initialize parameters $(\gamma, \beta) \leftarrow (\gamma, \beta)_{\text{init}}$
  while stopping criteria not met do
    Prepare $|\gamma, \beta\rangle$ via eq. (7)
    Measure $|\gamma, \beta\rangle$ in the computational basis
    Compute $\langle C \rangle := \langle \gamma, \beta|C|\gamma, \beta\rangle$
    Update $(\gamma, \beta) \leftarrow (\gamma, \beta)_{\text{new}}$ using a classical optimizer
  end while

For $p \to \infty$ the results of [19] guarantee that there exist parameters $(\gamma, \beta)$ for which measuring $|\gamma, \beta\rangle$ gives the desired ground state $|q^\star\rangle$ with high probability. However, the qaoa objective is difficult to optimize [20,22,23]. We believe that this is partially due to the fact that the qaoa ansatz encodes problem information in the argument of exp as phases of the qubits, and that this information is partially lost at measurement. Another issue of qaoa is that its repetitive layers are still too expensive to run on current and near-term devices. In this work, we propose a quantum circuit that encodes the problem more effectively and does not require the repetitive layers of qaoa. Adhering to the promising concept of designing hybrid quantum algorithms, we embed the circuit in a classical optimization method to output the desired ground state $|q^\star\rangle$ with high probability.

Proposed Universal Quantum Algorithm
Our method builds on the notion of block-encoding introduced in [25,26], which allows embedding non-unitary matrices as the principal block of a unitary operator acting on the quantum system. Block-encoding is typically achieved by enlarging the Hilbert space of the quantum system.
By adding a single qubit to the system, our goal is to implement the $(2^{1+n}) \times (2^{1+n})$ unitary operator $U := U(C, K)$ given by
$$U := \begin{pmatrix} \sin \hat{C} & \cos \hat{C} \\ \cos \hat{C} & -\sin \hat{C} \end{pmatrix}, \qquad \hat{C} := C/K, \tag{9}$$
for the Ising Hamiltonian $C$ from eq. (1) and a suitably chosen constant $K \in \mathbb{R}$. As $C$ is a diagonal matrix, the sin and cos functions directly apply to the diagonal elements [27, Section 2.1.8]. This allows us to encode information about the problem as probability amplitudes of the qubits. Note that because $C$ is Hermitian, $U$ is Hermitian as well and thus can serve as a measurement observable. We use the constant $K$ to re-scale all entries of $C$ to $[-\pi/2, \pi/2]$, where sin is strictly increasing and invertible. In particular, eq. (9) block-encodes a bijective transformation of $C$. For reasons that will become clear in section 3.1, this re-scaling also allows for an efficient implementation of $U$.
We stress that $K$ should not be set too large since $\lim_{K \to \infty} U(C, K) = X \otimes I^{\otimes n}$, i.e., for large $K$ the operator $U$ behaves like a not-gate on the cost qubit. In section 3.2 we provide an optimization routine for the proposed circuit, in section 3.3 we discuss its scalability and complexity, and in section 3.4 we provide a suitable choice for $K$.
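The properties claimed for $U$ can be checked numerically on a single diagonal entry. The sketch below assumes that, for a basis state with scaled cost $\hat{c}$, the cost qubit sees the $2 \times 2$ block with rows $(\sin \hat{c}, \cos \hat{c})$ and $(\cos \hat{c}, -\sin \hat{c})$, which is consistent with the reflection axis and with the relation $\langle C \rangle = K \arcsin \langle U \rangle$ stated in section 3.1. The helper names and numerical values are hypothetical.

```python
import math

def u_block(c_hat):
    """2x2 block of U seen by the cost qubit for a basis state with scaled cost c_hat."""
    s, c = math.sin(c_hat), math.cos(c_hat)
    return [[s, c], [c, -s]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

cost, K = 3.0, 4.0                  # hypothetical cost and scaling constant
M = u_block(cost / K)
M_squared = matmul2(M, M)           # identity: M is symmetric and M*M = I
recovered = K * math.asin(M[0][0])  # K * arcsin(<U>) gives back the cost

M_large_K = u_block(cost / 1e9)     # for very large K the block approaches X
```

For a very large $K$ the top-left entry, which carries the cost information, shrinks towards zero, so more measurement shots are needed to resolve it.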

Fig. 1 Circuit implementing the operator $U$ on the cost qubit and the working qubits, with one panel for the quadratic term (left) and one for the unary term (right). The cost qubit is initialized in the state $|\psi\rangle_c = X |0\rangle$ and is the target of all operations. Note that this does not contradict the idea of keeping it in the $|0\rangle$-state, as the operation $X = R_y(\pi) \cdot Z$ implements the last part of $U$ (eq. (13)). (Left) For each coupling edge of weight $C_{ij}$ between nodes $q_i$ and $q_j$, rotate the cost qubit by $R_y(-2\hat{C}_{ij})$ if their corresponding working qubits are in the same state, else by $R_y(2\hat{C}_{ij}) = X \cdot R_y(-2\hat{C}_{ij}) \cdot X$. (Right) For each unary edge of weight $C_{ii}$ involving node $q_i$, rotate the cost qubit by $R_y(-2\hat{C}_{ii})$ if its corresponding working qubit is in the $|0\rangle$-state, else by $R_y(2\hat{C}_{ii}) = X \cdot R_y(-2\hat{C}_{ii}) \cdot X$.

Implementation
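For a system prepared in the basis state $|q\rangle$, the cost of the cut $q$ according to the Ising model in eq. (1) is given by
$$\langle C \rangle = \langle q|C|q\rangle = \sum_{(s_i, s_i) \in E} C_{ii} (-1)^{q_i} + \sum_{(s_i, s_j) \in E,\ i < j} C_{ij} (-1)^{q_i + q_j}. \tag{10}$$
Crucially, for a $(1 + n)$-qubit system prepared in the state $|\psi, \psi\rangle := |\psi\rangle_c \otimes |\psi\rangle$, where $|\psi\rangle_c = \alpha |0\rangle_c + \beta |1\rangle_c$ is a 1-qubit register that we call the cost qubit and $|\psi\rangle$ is the $n$-qubit working register, if the cost qubit $|\psi\rangle_c$ is kept in the state $|0\rangle_c$, it holds for $U$ defined in eq. (9) that
$$\langle U \rangle = \langle 0, \psi|U|0, \psi\rangle = \langle \psi|\sin(\hat{C})|\psi\rangle. \tag{11}$$
Thus, if $K$ is chosen such that the entries of the scaled diagonal matrix $C/K$ fit inside the monotone region of the sine, then minimizing $\langle U \rangle$ with the cost qubit set to zero is equivalent to minimizing $\langle C \rangle$.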
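Observe that $U$ is a block matrix of diagonal matrices and applying it to the basis state $|0, q\rangle$ gives the same result as applying to the cost qubit $|0\rangle_c$ the controlled $(2 \times 2)$-operator
$$\begin{pmatrix} \sin \langle \hat{C} \rangle & \cos \langle \hat{C} \rangle \\ \cos \langle \hat{C} \rangle & -\sin \langle \hat{C} \rangle \end{pmatrix}, \qquad \langle \hat{C} \rangle := \langle q|\hat{C}|q\rangle. \tag{12}$$
As derived in section A, this operator performs a reflection in the Bloch sphere of the cost qubit about the axis $\vec{n} = (1/2) \begin{pmatrix} \cos \langle \hat{C} \rangle & 0 & \sin \langle \hat{C} \rangle \end{pmatrix}^\top$.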
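Up to an irrelevant phase factor, the operator $U$ can be written as
$$\begin{pmatrix} \sin \langle \hat{C} \rangle & \cos \langle \hat{C} \rangle \\ \cos \langle \hat{C} \rangle & -\sin \langle \hat{C} \rangle \end{pmatrix} = R_y\big(2 \arccos(\sin \langle \hat{C} \rangle)\big) \cdot Z = R_y\big(\pi - 2 \langle \hat{C} \rangle\big) \cdot Z = R_y\big(-2 \langle \hat{C} \rangle\big) \cdot R_y(\pi) \cdot Z. \tag{13}$$
The second equation holds because $\arccos(\sin x) = \pi/2 - x$ for $x \in [-\pi/2, \pi/2]$. The last equation is obtained by applying the identity $R_y(\theta_1 + \theta_2) = R_y(\theta_2) \cdot R_y(\theta_1)$ for rotations in two dimensions, where $\theta_1, \theta_2 \in \mathbb{R}$. By recursively applying this same identity to $R_y(-2 \langle \hat{C} \rangle)$ and inserting eq. (10) we find
$$R_y\big(-2 \langle \hat{C} \rangle\big) = \prod_{(s_i, s_i) \in E} R_y\big(-2 \hat{C}_{ii} (-1)^{q_i}\big) \prod_{(s_i, s_j) \in E,\ i < j} R_y\big(-2 \hat{C}_{ij} (-1)^{q_i + q_j}\big). \tag{14}$$
As a result, the weighted sum of Pauli-Z operators naturally translates into a product of unitary transformations, which is well suited to the gate-based model of quantum computing. As the basis state $|q\rangle$ is chosen arbitrarily, $\langle U \rangle$ outputs sine-transformed costs as derived in eq. (11) for arbitrary states. For a system prepared in the basis state $|\psi, \psi\rangle = |0, q\rangle$, we can even recover the exact cost by $\langle C \rangle = K \arcsin \langle U \rangle$.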
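The rotation identities used in this derivation can be verified numerically. The sketch below assumes the standard convention $R_y(\theta) = \exp(-i \theta Y / 2)$, under which $R_y(\pi) \cdot Z = X$ holds exactly; the scaled cost and its splitting into terms are hypothetical values.

```python
import math

def ry(theta):
    """Single-qubit rotation about the y-axis, R_y(theta) = exp(-i*theta*Y/2)."""
    h = theta / 2
    return [[math.cos(h), -math.sin(h)], [math.sin(h), math.cos(h)]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(a, b):
    """Largest entry-wise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

Z = [[1, 0], [0, -1]]

c = 0.4  # hypothetical scaled cost in [-pi/2, pi/2]
reflection = [[math.sin(c), math.cos(c)], [math.cos(c), -math.sin(c)]]
decomposed = matmul2(ry(-2 * c), matmul2(ry(math.pi), Z))

# Splitting the angle into per-edge terms gives a product of rotations.
terms = [0.1, -0.25, 0.55]  # hypothetical signed scaled weights, summing to c
product_of_rotations = [[1, 0], [0, 1]]
for t in terms:
    product_of_rotations = matmul2(ry(-2 * t), product_of_rotations)
```

Both `max_abs_diff(reflection, decomposed)` and `max_abs_diff(product_of_rotations, ry(-2 * c))` vanish up to floating-point rounding.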
The operator $U$ can be efficiently implemented using the circuit given in fig. 1:
• We initialize the cost qubit in the state $|\psi\rangle_c = R_y(\pi) \cdot Z |0\rangle = X |0\rangle$.
• For each weight $C_{ij}$ between two nodes $q_i$ and $q_j$, we rotate the cost qubit by $R_y(-2\hat{C}_{ij})$ if the corresponding working qubits are in the same state, and by $R_y(2\hat{C}_{ij}) = X \cdot R_y(-2\hat{C}_{ij}) \cdot X$ if they are not.
• For each unary weight $C_{ii}$ involving node $q_i$, we rotate the cost qubit by $R_y(-2\hat{C}_{ii})$ if the corresponding working qubit is in the $|0\rangle$-state, and by $R_y(2\hat{C}_{ii}) = X \cdot R_y(-2\hat{C}_{ii}) \cdot X$ if it is not.
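The three steps above can be sketched as a gate list whose action on the cost qubit is tracked classically for a fixed basis state of the working register. The gate layout (cnot gates from the working qubits onto the cost qubit around each rotation) is an assumption that reproduces the conditional rotations described in the list; it is not taken from the authors' repository. The instance and the value of `K` are hypothetical.

```python
import math

def ry(theta):
    """Single-qubit rotation about the y-axis."""
    h = theta / 2
    return [[math.cos(h), -math.sin(h)], [math.sin(h), math.cos(h)]]

def build_gates(unary, pairs, K):
    """Gate list acting on the cost qubit: X, controlled-NOTs and R_y rotations."""
    gates = [("x",)]                       # cost qubit starts in X|0>
    for i, w in unary.items():             # unary term C_ii
        gates += [("cx", i), ("ry", -2 * w / K), ("cx", i)]
    for (i, j), w in pairs.items():        # quadratic term C_ij
        gates += [("cx", i), ("cx", j), ("ry", -2 * w / K), ("cx", j), ("cx", i)]
    return gates

def cost_qubit_state(gates, q):
    """Track the cost qubit classically for a fixed basis state q of the working register."""
    v = [1.0, 0.0]
    for g in gates:
        if g[0] == "ry":
            m = ry(g[1])
            v = [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]
        elif g[0] == "x" or q[g[1]] == 1:   # X, or a CNOT whose control is |1>
            v = [v[1], v[0]]
    return v

# Hypothetical 3-node instance; K is chosen so that all scaled costs lie in [-pi/2, pi/2].
unary = {0: 1.0, 1: -2.0, 2: 0.5}
pairs = {(0, 1): 3.0, (1, 2): -1.0, (0, 2): 2.0}
K = (2 / math.pi) * 9.5                     # 9.5 = sum of absolute weights

gates = build_gates(unary, pairs, K)
amplitude = cost_qubit_state(gates, (1, 0, 0))[0]   # <0|U|0> for this basis state
recovered_cost = K * math.asin(amplitude)           # K * arcsin(<U>)
num_cnots = sum(1 for g in gates if g[0] == "cx")
```

In this layout each unary edge costs two cnot gates and each quadratic edge four, which gives 18 cnot gates for the instance above, in agreement with the count $4|E| - 2|S|$ when every node carries a unary weight.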
Whenever the unary costs satisfy $C_{ii} = 0$ for all $i$, we refer to eq. (14) as the Universal Quantum Maximum Cut (uqmaxcut) model, else as the Universal Quantum Ising (uqising) model.

Workflow
The overall workflow of our algorithm is presented in fig. 2. The complete circuit consists of three registers: a 1-qubit register containing an ancilla qubit, another 1-qubit register for the cost qubit, and an $n$-qubit register for the working qubits encoding the variables of the problem. First, the working qubits are rotated by a set of angles $\Theta = (\theta_1, \dots, \theta_n) \in \mathbb{R}^n$, constructing the ansatz
$$|\psi(\Theta)\rangle := R_y(\theta_1) \otimes \cdots \otimes R_y(\theta_n) |\psi\rangle$$
from a system previously prepared in the state $|\psi\rangle$. Here, the qubit $q_i$, representing the $i$th node of the graph, is rotated by $\theta_i$ around the y-axis. Next, a Hadamard sandwich involving a controlled version of the $U$-operator is applied to the ancilla qubit. Finally, according to the principle of implicit measurement of quantum computing [27, Section 4.4], which states that all qubits that are not measured at the end of a quantum circuit can be assumed to be measured, only the ancilla qubit is measured, leaving the cost and working qubits in the corresponding post-measurement state. Our goal is to solve the variational problem
$$\min_{\Theta \in \mathbb{R}^n} L(\Theta) := \langle 0, \psi(\Theta)|U|0, \psi(\Theta)\rangle, \tag{17}$$
i.e., to find a set of angles $\Theta^\star$ such that $|\psi(\Theta^\star)\rangle = |q^\star\rangle$. The cost $L(\Theta)$ can be efficiently calculated by $L(\Theta) = p(0) - p(1)$, where $p(0)$ and $p(1)$ are the probabilities of measuring the ancilla qubit in the $|0\rangle$ and $|1\rangle$ state, see section B for the proof. This allows us to compute the expectation $\langle U \rangle$ without having to measure $\langle 0, \psi(\Theta)|$ needed for the scalar product. For the optimization, several approaches [28,29] have been developed for evaluating or approximating gradients and improving optimization on quantum computers.
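For small $n$, the loss can be evaluated classically, which is handy for checking a circuit implementation. The sketch below assumes that the working register starts in $|0 \dots 0\rangle$, so that qubit $i$ is measured as 1 with probability $\sin^2(\theta_i / 2)$, and it uses the standard Hadamard-test relation $p(0) = (1 + \langle U \rangle)/2$; the function names and the single-edge instance are hypothetical.

```python
import math
from itertools import product

def loss(thetas, scaled_cost):
    """L(Theta) = sum_q |alpha_q|^2 * sin(C_hat_qq) for the product ansatz on |0...0>."""
    total = 0.0
    for q in product((0, 1), repeat=len(thetas)):
        prob = 1.0
        for t, b in zip(thetas, q):
            prob *= math.sin(t / 2) ** 2 if b else math.cos(t / 2) ** 2
        total += prob * math.sin(scaled_cost(q))
    return total

def ancilla_probabilities(L):
    """Ideal Hadamard-test outcome probabilities, so that p0 - p1 = L."""
    return (1 + L) / 2, (1 - L) / 2

# Hypothetical single-edge maxcut instance with scaled weight 0.3.
def scaled_cost(q):
    return 0.3 * (-1) ** (q[0] + q[1])

L_cut = loss([0.0, math.pi], scaled_cost)   # ansatz equals the basis state |0, 1>
p0, p1 = ancilla_probabilities(L_cut)
```

On hardware, `p0` and `p1` would be estimated from a finite number of shots, so the measured loss carries sampling noise that this idealized sketch ignores.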

Parameter Shift Rule
The parameter shift rule [28] is a simple but exact method for evaluating the analytical gradient of a function given in the form of an expectation as in eq. (17) on quantum hardware. For computing the partial derivative with respect to the parameter $\theta_i$, it uses two function evaluations with the parameter $\theta_i$ shifted by $\pm \pi/2$:
$$\frac{\partial L(\Theta)}{\partial \theta_i} = \frac{L(\Theta_{i+}) - L(\Theta_{i-})}{2}, \tag{18}$$
where $\Theta_{i\pm} := (\theta_1, \dots, \theta_{i-1}, \theta_i \pm \pi/2, \theta_{i+1}, \dots, \theta_n)$. The parameter shift rule uses the same circuit as the function evaluation, yet yields the exact gradient. Its drawback is that it requires $2n$ function evaluations to compute the gradient of $L$.
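The parameter shift rule can be illustrated on a classically evaluated loss. The sketch below reuses the assumption that the ansatz acts on $|0 \dots 0\rangle$, so the loss is a weighted sum of sine-transformed costs; the 2-node instance and all names are hypothetical.

```python
import math
from itertools import product

def loss(thetas, scaled_cost):
    """L(Theta) for the product R_y ansatz on |0...0>, evaluated by enumeration."""
    total = 0.0
    for q in product((0, 1), repeat=len(thetas)):
        prob = 1.0
        for t, b in zip(thetas, q):
            prob *= math.sin(t / 2) ** 2 if b else math.cos(t / 2) ** 2
        total += prob * math.sin(scaled_cost(q))
    return total

def parameter_shift_gradient(f, thetas):
    """Gradient of f via two evaluations per angle, shifted by +/- pi/2."""
    grad = []
    for i in range(len(thetas)):
        plus, minus = list(thetas), list(thetas)
        plus[i] += math.pi / 2
        minus[i] -= math.pi / 2
        grad.append((f(plus) - f(minus)) / 2)
    return grad

# Hypothetical 2-node instance with scaled weights (illustration only).
def scaled_cost(q):
    return 0.5 * (-1) ** q[0] - 0.2 * (-1) ** q[1] + 0.3 * (-1) ** (q[0] + q[1])

f = lambda th: loss(th, scaled_cost)
thetas = [0.7, 2.1]
grad = parameter_shift_gradient(f, thetas)   # 2n = 4 loss evaluations
```

Because each angle enters through a single $R_y$ gate, the loss is a sinusoid in each $\theta_i$ and the shifted difference reproduces the derivative exactly, unlike a finite-difference quotient.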

Update Rule
The circuits (uqmaxcut, uqising) are optimized by normalized gradient descent with decreasing step size. Normalized gradient descent has recently been proposed by Suzuki et al. [30] as a powerful optimization strategy for variational quantum algorithms. Specifically, in [30] it is demonstrated experimentally that normalized gradient steps are more effective in escaping non-global minima than plain gradient steps. It is also known [31] that normalized gradient descent evades saddle points. At each iteration $k$, the angles are updated by the rule eq. (19). The design of the update rule eq. (19) is motivated by the following consideration: We know that we produce bit-strings by either flipping $|q_i\rangle$ or not, thus $\theta_i^\star = \ell \pi$, $\ell \in \mathbb{Z}$. At iteration $k = 0$, the update rule eq. (19) allows each $\theta_i$ to get updated to $\theta_i^1 \leftarrow \theta_i^0 \pm \pi \cdot g_i / 2$, where $g_i \in [-1, 1]$ is the normalized contribution of $\theta_i^0$ to the loss $L(\Theta)$. Subsequently, we let the step size decay exponentially to zero when approaching the maximum number of iterations $k_{\max}$. Given the noisy nature of quantum measurement, it is helpful that this frees us from the difficult task of determining a stopping condition for the algorithm.

Fig. 2 Complete workflow of our proposed algorithm for the Universal Quantum maxcut and Ising Model (uqmaxcut and uqising). First (1), the cost (node and edge weight) information for the given graph is used to implement the operator $U := U(C, K)$ that block-encodes the problem Hamiltonian. This operator is applied to a trial variational quantum state $|\psi(\Theta)\rangle$ tensored with the cost qubit. Second (2), using the principle of implicit measurement, the expectation value $\langle U \rangle$, which equals the cost $L(\Theta)$, is approximated by measuring several copies of the circuit. This computed expectation and, where needed, its gradient are iteratively used in a classical optimization routine that drives the parameterized state $|\psi(\Theta)\rangle$ towards the state $|\psi(\Theta^\star)\rangle$ that potentially gives the globally minimal cost value. Finally (3), the optimal state $|\psi(\Theta^\star)\rangle$ is measured in the computational basis and the most frequently measured state corresponds to the desired optimal cut of the input graph.
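A minimal sketch of normalized gradient descent with a decaying step is given below. The exact form of the update rule is not reproduced here, so the sketch makes its own assumptions: the gradient is normalized in the Euclidean norm, the initial step is $\pi/2$, and the step decays as $\exp(-\text{decay} \cdot k / k_{\max})$. The parameter `decay` and the guard `eps` are hypothetical additions.

```python
import math

def normalized_gradient_descent(grad_fn, theta0, k_max=50, decay=5.0, eps=1e-12):
    """Normalized gradient steps of size (pi/2) * exp(-decay * k / k_max)."""
    theta = list(theta0)
    for k in range(k_max):
        g = grad_fn(theta)
        norm = math.sqrt(sum(x * x for x in g))
        if norm < eps:          # gradient numerically zero: nothing to normalize
            break
        step = (math.pi / 2) * math.exp(-decay * k / k_max)
        theta = [t - step * x / norm for t, x in zip(theta, g)]
    return theta

# One-variable toy loss L(theta) = cos(theta), whose minimizer is theta = pi.
theta_star = normalized_gradient_descent(lambda t: [-math.sin(t[0])], [math.pi / 2])
```

Starting from $\theta = \pi/2$, the first normalized step has length $\pi/2$ and lands on $\theta = \pi$, which illustrates why an initial step of this size suits minimizers located at integer multiples of $\pi$.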

Scalability and Computational Complexity
For a given graph $G = (S, E, C)$, the circuit construction presented in fig. 1 requires at most one not-gate, $|E|$ single-qubit rotation gates and $4|E| - 2|S|$ cnot gates. Note that since the edges can be treated in arbitrary order, two consecutive cnot gates that have the same control qubit cancel each other out as their product is the identity, further reducing the number of required cnot gates. The controlled-$U(C, K)$ gate appearing in fig. 2 and conditioned by the ancilla qubit $|\cdot\rangle_a$ can be fully decomposed into single-qubit rotations and cnot gates without using any Toffoli gates. To see this, note that it can be expressed as in eq. (20). In particular, in order to control the complete $U(C, K)$ gate, it suffices to control only the rotation gate $R_y$.
Finally, the controlled rotation itself can be decomposed into two cnot gates and two single-qubit rotations as
$$[R_y(\theta)]_a = X_a \cdot R_y(-\theta/2) \cdot X_a \cdot R_y(\theta/2). \tag{21}$$
Table 1 recapitulates the main differences between uqising and a conventional qaoa of depth $p$. It shows that for $p \geq 3$, qaoa requires more quantum resources than our uqising model.

Table 1 Resources and complexity comparison of a depth $p$ conventional qaoa and our uqising model on a graph $G = (S, E, C)$. Recall that the set $E$ entails all the edges of the graph, i.e., the quadratic and unary ones. The set $S$ is the set of vertices. Our construction requires fewer quantum and classical resources than qaoa for $p \geq 3$.
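The decomposition in eq. (21) can be checked by looking at the net operation on the target qubit for each value of the control. In the sketch below, $X_a$ is read as a cnot gate controlled by the ancilla, so it acts as $X$ on the target when the control is 1 and as the identity otherwise; the helper names are hypothetical.

```python
import math

def ry(theta):
    """Single-qubit rotation about the y-axis."""
    h = theta / 2
    return [[math.cos(h), -math.sin(h)], [math.sin(h), math.cos(h)]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[0, 1], [1, 0]]

def controlled_ry_branch(theta, control):
    """Net operation on the target for a fixed control bit, following eq. (21)."""
    m = ry(theta / 2)                 # rightmost factor is applied first
    if control:
        m = matmul2(X, m)             # first CNOT flips the target
    m = matmul2(ry(-theta / 2), m)
    if control:
        m = matmul2(X, m)             # second CNOT flips it back
    return m

theta = 0.7                           # hypothetical rotation angle
on = controlled_ry_branch(theta, 1)   # should equal R_y(theta)
off = controlled_ry_branch(theta, 0)  # should equal the identity
```

With the control set to 1 the two half-angle rotations add up to $R_y(\theta)$, and with the control set to 0 they cancel.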
Further, physically mapping the qaoa ansatz onto the quantum hardware has to take into account a graph-dependent qubit connectivity, while our method, independently of the input graph, requires that one qubit (the cost qubit) is connected to all other qubits (ancilla and working qubits).

Impact of the K-Rescaling
When applying the method, it is important to appropriately choose the constant $K \in \mathbb{R}$ for rescaling the costs $\hat{C} = C/K$. The goal is to fix $K$ such that $\operatorname{diag}(\hat{C}) \in [-\pi/2, \pi/2]^{2^n}$, as this guarantees that $\sin(\operatorname{diag}(\hat{C}))$ preserves the order of the original costs in $\operatorname{diag}(C)$.
Although it is tempting to choose $K \gg C_{\max} := \max_k C_{kk}$, where $C_{kk}$ denotes the $k$th diagonal element, recall that in eq. (9) we have $\lim_{K \to \infty} U(C, K) = X \otimes I^{\otimes n}$, i.e., the larger $K$, the more shots are required to accurately measure the entries of $\sin(\hat{C})$. Fortunately, in many problems an upper bound to the maximal cost $C_{\max}$ can be computed from the original weights without knowing $C$. For example, choosing $K$ as in eq. (22) guarantees an error-free transformation of the initial problem (5) into the equivalent problem (17). As presented for 10-node graph examples in fig. 3, the choice of smaller values for $K$ entails that the approximation of $\arccos(\sin x)$ by $\pi/2 - x$, used in eq. (13), is inaccurate, with more pronounced error in the largest absolute entries of $\operatorname{diag}(\hat{C})$. Notably, the solution of the transformed problem will only be a solution of the initial problem as long as $\arg\min_k \hat{C}_{kk} = \arg\min_k C_{kk}$. As expected, the larger $\lambda$, and thus $K$, the better the order of the true costs agrees with the order of the transformed costs. Also, for fixed $\lambda$ the difference between the two costs is largest when the true cost is large in absolute value. The cyan-marked line is the profile that we selected for the experiments; it indicates the value $\lambda = 2/\pi$, which is the minimum $\lambda$ that guarantees an error-free transformation, cf. eq. (22).
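The effect of the rescaling can be illustrated on a small instance. The sketch below assumes one particular bound, namely $K = \lambda \sum |C_{ij}|$ with the sum running over all unary and quadratic weights; this is a hypothetical choice made for illustration and may differ from the rule in eq. (22). With $\lambda = 2/\pi$ all scaled costs lie in $[-\pi/2, \pi/2]$, whereas a value of $K$ that is too small changes the minimizer.

```python
import math
from itertools import product

def cost(q, weights):
    """Ising cost; weights[(i, i)] are unary and weights[(i, j)], i < j, quadratic."""
    return sum(w * (-1) ** (q[i] if i == j else q[i] + q[j])
               for (i, j), w in weights.items())

def choose_K(weights, lam=2 / math.pi):
    """Assumed rule: K = lam * (sum of absolute weights)."""
    return lam * sum(abs(w) for w in weights.values())

def argmin_transformed(weights, K, n):
    """Bit-string minimizing sin(C_qq / K), found by enumeration."""
    return min(product((0, 1), repeat=n),
               key=lambda q: math.sin(cost(q, weights) / K))

# Hypothetical 3-node instance with unary and quadratic weights.
weights = {(0, 0): 1.0, (1, 1): -2.0, (2, 2): 0.5,
           (0, 1): 3.0, (1, 2): -1.0, (0, 2): 2.0}

K_safe = choose_K(weights)
q_safe = argmin_transformed(weights, K_safe, 3)    # true minimizer, cost -8.5
q_small = argmin_transformed(weights, 2.0, 3)      # K too small: order is broken
largest_scaled = max(abs(cost(q, weights)) / K_safe
                     for q in product((0, 1), repeat=3))
```

For $K = 2$ the true minimum $-8.5$ is scaled to $-4.25$, outside the monotone region of the sine, and the transformed problem returns a bit-string of cost $-3.5$ instead.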

Experimental Results
In order to validate the practical usefulness of uqmaxcut and uqising, we benchmark against two state-of-the-art approaches for solving binary combinatorial optimization with quantum computing: qaoa for the gate-based model and d-wave solvers for the adiabatic model. Random graphs in the experiments are generated using the Python package NetworkX [32]. The unary and quadratic edge weights are all randomly and uniformly chosen in the range [1, 10] and the graphs are all fully connected. The gate-based circuits in the experiments (uqmaxcut, uqising, qaoa) are implemented in Python and simulated in a noise-free framework using the Qiskit library and the IBM QASM simulator [33]. For the adiabatic model, d-wave solvers that run on the actual quantum hardware are used. d-wave quantum annealers are made available through the Leap quantum cloud service [34], and the d-wave quantum algorithms can be implemented in Python using the Ocean software [35]. We perform 1024 measurement shots for all gate-based algorithms. On d-wave [1], the experiment is run with the default annealing time of 20 µs and 50 sample reads on the Advantage topology.

Benchmark Metrics
We denote by $|q^\star\rangle$ the ground-truth global minimizer and by $|\psi^\star\rangle$ the ground state proposal of each method (uqmaxcut/uqising, qaoa and d-wave). In the experiments, we adopt the following two metrics to evaluate the performance of the methods:
• The approximation ratio $r(\psi^\star)$ informs about the quality of the result, i.e., how confident the method is with its solution proposal and how far the cost of this proposal is from the minimal cost $C_{\min} := \min_k C_{kk}$, cf. [36]. All the terms appearing in $r$ are classically evaluated. It holds $0 \leq r(\psi^\star) \leq 1$.
• The approximation index $i(\psi^\star)$ is a Boolean variable that indicates whether the most likely state $|q_{\max}\rangle$, obtained with probability $|\alpha_{\max}|^2$, is the desired state $|q^\star\rangle$. It holds $i(\psi^\star) = 1$ if $\langle q^\star|C|q^\star\rangle = \langle q_{\max}|C|q_{\max}\rangle$ and 0 otherwise. Note that this differs from the usual approach of d-wave, where the sampled state with the minimal energy is regarded as the best solution proposal.
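The approximation index can be computed directly from measurement counts. The sketch below assumes counts are given as a dictionary mapping bit-strings to the number of shots; the cost function, the counts and the tolerance `tol` are hypothetical, and the approximation ratio is not reproduced because its formula is not restated here.

```python
def approximation_index(counts, cost_fn, c_min, tol=1e-9):
    """Return 1 if the most frequently measured bit-string attains the minimal cost, else 0."""
    q_max = max(counts, key=counts.get)
    return int(abs(cost_fn(q_max) - c_min) <= tol)

def toy_cost(bits):
    """Hypothetical single-edge maxcut cost on a bit-string such as '01'."""
    return 1.0 if bits[0] == bits[1] else -1.0

# Hypothetical outcomes of 1024 shots each.
counts_good = {"01": 600, "10": 300, "00": 124}
counts_bad = {"00": 600, "01": 424}

index_good = approximation_index(counts_good, toy_cost, c_min=-1.0)
index_bad = approximation_index(counts_bad, toy_cost, c_min=-1.0)
```

Note that in `counts_bad` an optimal bit-string was sampled, yet the index is 0 because it is not the most frequent outcome; this is the difference to picking the lowest-energy sample.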

Before discussing the results on the maxcut problem, it is important to notice that for maxcut, solutions always exist in symmetric pairs. Specifically, for a basis state solution $|q\rangle$, the state obtained by flipping all bits of $q$ is a solution as well. This feature can be enforced by introducing the entanglement circuit given in fig. 4 after the rotation layer in fig. 2. Also, this entanglement allows optimizing over $n - 1$ angles instead of $n$. Without entanglement our method just outputs one of the two possible solutions. In the experiments, uqmaxcut is used without entanglement; if the algorithm outputs either one of the two solutions, we set the approximation index to 1.
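The pairing of maxcut solutions follows from the fact that each term of the cost depends only on whether the two endpoints of an edge agree. A minimal sketch, with hypothetical weights and function names, checks this by enumeration.

```python
from itertools import product

def maxcut_cost(q, pairs):
    """Cost with quadratic terms only: sum of C_ij * (-1)**(q_i + q_j)."""
    return sum(w * (-1) ** (q[i] + q[j]) for (i, j), w in pairs.items())

def complement(q):
    """Flip every bit of q."""
    return tuple(1 - b for b in q)

# Hypothetical fully connected 3-node graph with positive weights.
pairs = {(0, 1): 3.0, (1, 2): 7.5, (0, 2): 2.0}

symmetric = all(maxcut_cost(q, pairs) == maxcut_cost(complement(q), pairs)
                for q in product((0, 1), repeat=3))
```

As soon as a unary weight is non-zero, flipping all bits changes the sign of that term and the symmetry is lost, which is why it applies to maxcut but not to the general Ising model.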

Benchmark Results
For the outer optimization algorithm of the uqmaxcut circuit we use normalized gradient descent as described in section 3.2. Other optimization algorithms such as vanilla gradient descent or adaptive moment estimation (adam [37]) could be used as well. However, it proved difficult in the experiments to find suitable step sizes for those methods, so we leave them for future research.
The qaoa layer depth in the experiments is set to $p = \lceil n/2 \rceil$ to allow for a fair comparison, as then both qaoa and uqmaxcut optimize over approximately $n$ real variables. We also attempted to optimize qaoa using the same optimizer as for uqmaxcut, but the results were not competitive. Hence, we also show the results of qaoa when using a gradient-free optimizer; we used the cobyla solver [38] available in the scipy library [39].
The results are presented in fig. 5 for fully connected graphs of n = 3, 5 and 10 nodes with strictly positive edge weights. For each n, the results are averaged over 20 graph instances and all algorithms are tested on the same instances. The angles for uqmaxcut and qaoa are all initialized to 0.
The approximation ratio for the three methods (qaoa, d-wave, uqmaxcut) is not adversely affected by the number of variables $n$, but the approximation index drops sharply as the size of the problem increases. The gradient-free cobyla-optimized version of qaoa performs much better than qaoa with gradient descent. The latter is on average as good as d-wave regarding the approximation ratio. We conjecture that the gradient-based optimization of qaoa often gets trapped by saddle points of the qaoa loss function landscape. In contrast, uqmaxcut clearly outperforms the two qaoa variants and d-wave in producing good approximate solutions. Furthermore, the approximation index demonstrates that it returns a global minimizer significantly more often than qaoa, but less often than d-wave, whose architecture is specifically designed to solve such problems.

Benchmarking UQIsing against D-Wave
For the Ising model we benchmark the proposed uqising algorithm against the d-wave annealer, the adiabatic quantum computer specialized in solving this type of problem. The variational circuit for uqising is optimized in the same way as for uqmaxcut, see section 3.2, with the exception that the initial angles are set to $\pi/2$ instead of 0. The results are depicted in fig. 6. The differences in performance between uqising and d-wave are consistent with the maxcut experiments. Specifically, its high approximation ratio indicates that uqising always produces either globally optimal solutions or extremely good approximations thereof. In contrast, the small approximation ratio of d-wave reveals that if the annealer does not sample the correct solution, then it may output a state with a comparably high energy value. On the other hand, the approximation index shows that d-wave identifies a globally optimal solution more often than uqising.

Conclusion
We have presented a new low-depth quantum circuit to tackle two important combinatorial problems on universal quantum machines. The resulting Universal Quantum maxcut (uqmaxcut) approach outperforms the state-of-the-art quantum approximate optimization algorithms (qaoa) in terms of circuit depth, approximation ratio and probability of outputting optimal solutions. It also challenges the d-wave quantum annealers that are specifically designed to solve such combinatorial problems; on the maxcut as well as on the Ising spin model, uqmaxcut and uqising, respectively, achieve better approximation ratios and can compete with d-wave in producing globally optimal solutions.
We believe that the proposed approach enables the design of new methods for solving practically sized problems on universal quantum machines. Inspired by the novel operator $U$, future work should focus on designing fully universal algorithms without the classical outer optimization loop, replacing the latter with fully universal methods such as Grover's search [8].