## Abstract

We introduce multiple parametrized circuit ansätze and present the results of a numerical study comparing their performance with a standard Quantum Alternating Operator Ansatz approach. The ansätze are inspired by mixing and phase separation in the QAOA, and also motivated by compilation considerations with the aim of running on near-term superconducting quantum processors. The methods are tested on random instances of a quadratic binary constrained optimization problem that is fully connected for which the space of feasible solutions has constant Hamming weight.

For the parameter setting strategies and evaluation metric used, the average performance achieved by the QAOA is effectively matched by the one obtained by a ”mixer-phaser” ansatz that can be compiled in less than half-depth of standard QAOA on most superconducting qubit processors.

## 1 Introduction

The Quantum Approximate Optimization Algorithm (QAOA) was initially introduced in Ref. Farhi et al. (2014). Its simple structure inspired heuristic algorithms for sampling and exact optimization as well as approximate optimization that generalized the simple structure to include a broader and often more implementable set of operators.

The algorithms following the ansatz alternate *p* times between unitary operators chosen from a one-parameter family of *phase separation* operators and operators chosen from a one-parameter family of *mixing* operators. The mixing operators do not commute with the phase separation operators, enabling exploration of the search space. The aim is to output a state that has good overlap with the low-energy eigenspace of the problem Hamiltonian after *p* layers. Good parameters can sometimes be determined analytically, or estimated efficiently classically, or may be found using a combination of runs on an quantum processing unit (QPU) together with a classical optimization heuristic (Akshay et al. 2021; Rabinovich et al. 2021). The simplest case, one which has been extensively studied in the QAOA literature in terms of theory, numerics, and experiments, is QAOA for \(\mathtt{MaxCut}\), for which the ansatz alternates layers consisting of two-qubit parity gates \(\mathcal {U}^{nm}_{ZZ}(\gamma )=\exp [i\gamma Z_{n}Z_{m}]\) with single qubit X-rotations \(\mathcal {U}^{n}_{X}(\beta )=\exp [i\beta X_{n}]\) (*X-mixer*s).

In Hadfield et al. (2019), the QAOA approach was generalized to the *Quantum Alternating Operator Ansatz*, considering unitary layers that are not necessarily linked to local Hamiltonian evolution^{Footnote 1}. In particular, multi-qubit mixing operators were introduced in lieu of the *X*-rotations when applying the QAOA to hard-constrained optimization in order to restrict the search to the feasible subspace, the space of valid configurations obeying the hard constraints. The simplest among the advanced mixers is the two-qubit *XY gate*

which conserves the total spin projection *Z*_{n} + *Z*_{m}.

In this work, we introduce a new ansatz that combines the mixing and phase-separation operators into a more general two-parameter family of operators. For this reason, we refer to it as the ”Quantum Alternate Mixer-Phaser Ansatz” (QAMPA). The main motivation for this generalization is to reduce the depth of the circuits, potentially reducing performance in the ideal case (by potentially limiting the expressibility of the ansatz) but obtaining improved performance on noisy intermediate-scale quantum (NISQ) processors (by running shorter-depth circuits with that can still find good approximate solutions). We perform a numerical study on the performance of QAMPA on a weighted combinatorial optimization problem with hard constraints. For fully-connected binary quadratic optimization problems, the circuits compile to roughly half the depth of standard QAOA on QPUs with nearest-neighbor connectivity. This is the case in most superconducting qubit quantum computers in which qubits are placed on a two-dimensional grid and interact with nearest neighbors through tunable or fixed frequency couplers. Our numerical simulations show that for the problems studied, in the noiseless case, QAMPA performs almost as well as standard QAOA in parameter regimes that are achievable in current hardware, and thus is expected to have advantages under noise given its reduced depth. This ansatz is therefore a viable and attractive approach, particularly for highly connected optimization problems with hard constraints on NISQ hardware.

## 2 Background and prior work

Experimental benchmarks with *X*-mixers are numerous, especially on superconducting processors, although mostly limited to problems whose topology exactly matches the quantum processor hardware (see Harrigan et al. 2021 for a review). Ref. Harrigan et al. (2021) also explores optimization of the fully connected Sherrington-Kirkpatrick model,which requires significant compilation overhead. Unsurprisingly, the compilation requirements resulted in significant performance degradation with circuit depth, due to the relentless unmitigated action of noise during circuit execution in NISQ hardware.

Many techniques have been developed to optimize gate synthesis and qubit routing (i.e. compilation) for algorithms to be run on noisy intermediate scale quantum (NISQ) devices featuring a sparse native gate set. Although experimental QAOA work with XY mixers has still not appeared in the literature, numerical analyses predict long circuit durations for problems with XY mixers that are not encouraging for very near-term hardware (Do et al. 2020). In Farhi et al. (2017) and Kandala et al. (2017) hardware-efficient ansätze were proposed that match the processor topology and the native gates and use the objective function Hamiltonian only to guide the parameter setting procedure and to evaluate the final performance metric. In that approach, QAOA was used as a form of a quantum neural network that needs to be trained to act as an optimization solver, but there is concern as to how well this method, with its many parameters, would work in general.

The standard QAOA approach applied to combinatorial search was discussed in detail in Hadfield et al. (2019). For a given cost function, it starts from a superposition ideally equally distributed among all possible candidate solutions, i.e.

where \(\mathcal {F}\) represents the *feasible subset* of the optimization problem. This state is evolved to a quantum state |*ψ*_{F}〉 through a circuit composed by alternating sequentially two layers of gates for a number *p* of rounds. Each round consists of an exploitation (phase-separation) layer \(\mathcal {U}_{PS}(\gamma )\) which introduces information related to the cost function to be extremized and an exploration (mixing) layer \(\mathcal {U}_{M}(\beta )\) which rearranges probability amplitudes across \(\mathcal {F}\). The parameters *γ* and *β* are real numbers that need to be optimized layer-by-layer. In practice these layers can be decomposed by products of single and two-qubit gates in an arbitrary order ^{Footnote 2} , e.g. \(\mathcal {U}_{PS}(\gamma )={\prod }_{n, m}\mathcal {U}_{PS}^{nm}(\gamma )\). A final quantum alternating operator ansatz looks like:

Note that these products must be ordered when the two-qubit gates do not commute. Most of the works on QAOA feature exclusively single qubit gates as mixers. Only a few works have discussed the performance of the QAOA using *XY* -mixers: Ref. Wang et al. (2020) studies \(\mathtt{MaxkColorableSubgraph}\), Cook et al. (2020) looks at \(\mathtt{MaxkVertexCover}\) and (Niu et al. 2019) considers QAOA as a quantum state transfer protocol.

## 3 QAMPA: Quantum alternate “Mixer-Phaser” Ansatz

We now introduce the “Quantum Alternate Mixer-Phaser Ansatz” (QAMPA) unitary, which is the ordered product of two-qubit gates:

Here, \({U}_{MP}^{nm}(\gamma _{p},\beta _{p})\) is a two-qubit operation between qubits *n* and *m* that is parameterized by two angles \(\gamma _{p}, \beta _{p} \in \mathbb {R}\). The subscript *p* refers to the *p* th round. We refer to this operation as a “mixer-phaser” (MP) operation in that it optionally implements mixing and phase separation depending on the value of the two independent parameters. A possible choice for the mixer-phase operation is simply^{Footnote 3}:

We will focus on this choice for the analysis in this paper.

Note that

In other words, Eq. 7 show that QAOA with 2*p* parameters \(\gamma _{1},\dots , \gamma _{p}\), \(\beta _{1},\dots , \beta _{p}\) can be mapped to QAMPA with 4*p* parameters if *β*_{2k+ 1} = 0 and *γ*_{2k} = 0 for \(k=0{\dots } p\) (and the non-zero parameters are identified in sequence). However, for the same number of layers the algorithm has double the parameters. It is not clear a priori how QAOA and QAMPA would perform relative to each other when compared at fixed, equal number of parameters. In the next subsection we will investigate this question numerically for a specific illustrative problem.

### 3.1 Application to binary optimization with cardinality constraints

Let’s consider a special case of an integer program, a quadratic binary optimization problem with *N* variables, with a constraint that restricts the feasible subspace to bitstrings with a certain fixed Hamming weight *κ*. Mapping bits (0, 1) into spin variables (-1, + 1), this results in the following Ising cost Hamiltonian and constraints:

For *h*_{n} = 0, this problem is a weighted \(\mathtt{MaxCut}\) with given sizes of partitions (Ageev and Sviridenko 1999) (\(\mathtt{WeightedMaxCutGSP}\)). It can also be seen as a Markowitzian portfolio optimization problem (Markowitz 1952) where the task is to select the best performing *κ* of assets in a pool assuming correlations between their performance indicators. Note that while for small *κ* the problem is clearly tractable, for the case *κ* = *N*/2 the problem turns into the NP-Hard \(\mathtt{GraphBisection}\), which has been mapped to QAOA and studied numerically in the unweighted case where *J*_{nm} are either 0 or 1 using a sparse XY-mixer in (Wilson et al. 2021).

Following the literature, we construct the QAOA and QAMPA gates using *XY* mixers (Hadfield et al. 2019; Wang et al. 2020) as:

The initial state could be taken to be an equal superposition of all solutions with *κ* variables set to 1 on the qubit registers. i.e. a Dicke state (Bärtschi and Eidenbenz 2019). By observing the periodicity of the unitaries composing the ansätze, we can observe that if the possible values of the coefficients are commensurable, the angle parameters could be selected within the domains \(\gamma \in [0,2 \pi / \min \limits _{>0} (|h_{n}|, |J_{nm}|)]\) and *β* ∈ [0,*π*] without loss of generality.

### 3.2 Synthesis and routing

Efficiently compiling a quantum circuit such as Eq. 4 to a real quantum processors, having pre-defined calibrated two-qubit gates active on a sparse subset of all possible pairs of qubits, is a non-trivial planning and scheduling problem (Venturelli et al. 2018). We would like to estimate the advantage of using QAMPA versus QAOA in common implementation scenarios. For \(\mathtt{WeightedMaxCutGSP}\), the required *U*_{ZZ} gates to implement the objective function are the *N*(*N* − 1)/2 edges of a fully-connected graph. The mixer that is responsible for the exploration step of the algorithm by keeping the constraint Eq. 9 in check is also ideally a complete mixer, since it is proven numerically to be the best choice for Hamming weight constraints (Wang et al. 2020; Cook et al. 2020; Bärtschi and Eidenbenz 2020). Choosing a mixer with sparser connectivity between various terms might lead to shorter circuits, but the compilation advantage of using QAMPA versus QAOA is maximal if we use the same graph for both phase-separation and mixing operations.

We should note that the initialization choice Eq. 2 (e.g. the creation of the Dicke state |*ψ*_{0}〉, which requires in principle \({O}(\kappa N)\) gates (Bärtschi and Eidenbenz 2019)), while being the simplest to analyze and possibly the most advantageous based in prior studies of similar problems (Wang et al. 2020), might be impractical in the near-term. As discussed in (Hadfield et al. 2019; Egger et al. 2021), the initialization procedure could be possibly substituted by a simpler to realize superposition of feasible states or a classical warm start candidate followed by a mixing round in QAOA. In QAMPA, the first round contains mixing so initialization might come for free if the gate is appropriately synthesized. Hence, initialization is not a concern for the discussion around compilation efficiency.

The routing requirements to schedule gates between all possible pairs of qubits depend on the underlying topology where swap operations can be performed. For a linear device, it was shown in Kivlichan et al. (2018) that the most efficient swap network allowing the scheduling of the gates could be executed with maximum parallelization in *N* steps. For a more connected topology, the linear result is still a worst case scenario which can be implemented by defining an arbitrary Hamiltonian path on the device graph.

As shown in Fig. 1-(a) for an illustratory *N* = 4 case, the linear efficient compilation of QAOA *p* = 1 is intertwined with a swap network both for the phase-separation layer (blue box) and for the mixing layer (red box). The routing overhead in this case is a total of *p*(*N* − 1)^{2} \(\mathtt{SWAP}\) gates increasing the depth of about 2*p*(*N* − 1) if the \(\mathtt{SWAP}\) gates are not simplified or optimized in synthesis. For QAMPA instead, as shown in Fig. 1-(b), a single swap network is required for the mixing and phase separation layer, resulting in a clear advantage for circuit depth for the same number of parameters.

The optimal synthesis of logical gates depends on the available native operations on the quantum processor. For the sake of illustration, suppose to have access to the common set consisting of CNOT gates and parameterized single qubit rotations about *X*, *Y* and *Z*. This set is universal and admits optimal synthesis formulas for any two-qubit gate utilizing at most 3 CNOTs and 15 single-qubit rotations (Vatan and Williams 2004). Figure 1-(c) illustrates the optimal synthesis of \(\mathtt{SWAP}\mathcal{U}_{MP}^{nm}(\gamma,\beta)\) in terms of the canonical known decomposition. Similar synthesis can be derived for \(\mathtt{SWAP}\mathcal{U}_{ZZ}^{nm}\) and \(\mathtt{SWAP}\mathcal{U}_{XY}^{nm}\), showing that for the pictured case there should be a factor of 2 between the resulting depths of the two ansätze. A different set of hardware primitives might increase or reduce the advantage, for instance if the fsim gate (Harrigan et al. 2021) or the *XY* gates are available natively then the swap could be subsumed in a renormalization of the angles, as illustrated in Fig. 1-(d) where we pictorialize the identity: \(\mathtt{SWAP}\mathcal{U}_{MP}^{nm}(\gamma,\beta)=\exp(i\pi/4)\mathcal{U}_{ZZ}^{nm}(\gamma+\pi/2)\mathcal{U}_{XY}^{nm}(\beta+\pi)\).

If the underlying connectivity on the hardware is all-to-all, a swap network is not required. Still, for fixed number of angles it could still be depth-advantageous ^{Footnote 4} to run QAMPA if the sum of the depth required for the synthesis of both the \({U}_{PS}^{nm}\) and \({U}_{M}^{nm}\) is larger than the depth required to synthesize \({U}_{MP}^{nm}\), which is almost always the case if the QAOA gates are not natively available.

## 4 Numerical evaluation

We benchmark the algorithms by numerically simulating the circuits for 40 random fully-connected \(\mathtt{WeightedMaxCutGSP}\) problems where *J*_{ij} ∈{− 1,− 0.5,0.5,1} (chosen uniformly) and *h*_{i} = 0, for even \(N={4,6,\dots ,16}\) and for *κ* = *N*/2 (representing the largest search space, \(| {F}|\simeq 2^{N}/\sqrt {N}\)). For simplicity, we set linear Zeeman terms *h*_{j} = 0, since it is mostly inconsequential from the perspective of the compilation overhead. While the order of the execution of \({U}_{ZZ}^{nm}\) gates does not matter, different orderings of the \({U}_{XY}^{nm}\) and \({U}_{MP}^{nm}\) are not equivalent. We consider that all runs related to a given instance are performed with a random permutation of the gates chosen among the sequences that are allowing maximum parallelization and minimum depth when intertwined with a swap network.

### 4.1 Performance metric and parameter setting

We want to evaluate the optimization performance of QAOA and QAMPA in terms of their value as a \(\mathtt{WeightedMaxCutGSP}\) solver. The metric we use is related to a goal of running the algorithm to discover as quickly as possible a bitstring associated to a sufficiently good value (as determined by a pre-defined quality threshold) of the objective function. To address this goal, our target performance metric will be the expected value of the objective function when the best result in *R* runs is selected (Kim et al. 2019):

where |*k*〉 is a feasible state whose normalized objective function value is:

with *𝜖*_{0}, *𝜖*^{⋆} being respectively the minimum and the maximum of the objective function spectrum of values. \(F(\epsilon _{k})={\sum }_{p<k}|\langle \psi _{F}(\vec \gamma ,\vec \beta ) | p \rangle |^{2}\) is the cumulative distribution function of *𝜖*_{k}, where all sum over states are meant in ascending order *𝜖*_{k}. It could be straightforwardly computed by accessing the wavefunction of the final state, or its sampling statistics. This metrics, beside being the most relevant for optimization purposes, inherits the advantage discussed in the context of other metrics that are more focused on the high-quality solutions portion of the probability, such as the Conditional Value at Risk (CVaR) (Barkoutsos et al. 2020; Díez-Valle et al. 2021) or Gibbs averages (Li et al. 2020), which are suspected to have some desirable “trainability” properties to guide parameter setting, as opposed to the more traditional \(\langle \psi _{F}(\vec \gamma ,\vec \beta ) | H_{C} | \psi _{F}(\vec \gamma ,\vec \beta )\rangle\) (McClean et al. 2018) (which is simply 〈BEST_{1}〉). For illustration, we will work with 〈BEST_{5}〉, since *R* = 5 seems to be a reasonable value to use to reach good approximation ratios for the moderate sizes of problems that we are studying, as we will demonstrate empirically ex-post (Fig. 2).

The parameter setting strategy of choice for all experiments in this paper (which we call \(\mathtt{scanlast}\), for easiness of reference - see Fig. 3) follows a layerwise constructive optimization protocol employing an external blackbox optimizer that guides the repeated execution of the quantum circuit. The protocol aims to identify good angles for both QAOA and QAMPA at level *p* + 1 using the information of the best found at level *p*. It starts with a random generation of *W*_{0} pairs of angles that are then used as an initialization for each run of the optimizer. The *q* best found results (\(\gamma _{1}^{\star }(q)\) and \(\beta _{1}^{\star }(q)\)) are then going to be each seeding the runs at *p* = 2. More precisely, the *p* = 2 runs will be each seeded by *q* batches of *W* runs each of the form \((\gamma _{1}^{\star }(q), \beta _{1}^{\star }(q), \gamma _{2}, \beta _{2})\) where the last two angles are chosen randomly. Note that the best initial angles from the previous layer (e.g., \(\gamma _{1}^{*}\) and \(\beta _{1}^{*}\)) are allowed to vary when optimizing the next layer. The full layerwise procedure is applying this rule recursively: at layer *p* + 1 we would launch *q* searches where each run would initialize the optimizer with \((\gamma _{1}^{\star }(q), \beta _{1}^{\star }(q), \dots , \gamma _{p}^{\star }(q), \beta _{p}^{\star }(q), \gamma _{p+1}, \beta _{p+1})\) for a total complexity of \({O}((W_{0} + pqW)f_{opt})\) where *f*_{opt} is the number of function evaluations used per optimization attempt.

For most of our tests, we choose *W*_{0} = 50, *q* = 10, *W* = 250 and *f*_{opt} = 250, for a total number of runs of 625000 + 2500*p* per test. Moreover we decide to use Powell’s method for the external optimization loop. While this method is not expected to be a suitable choice for large number of variables, in our problem set it outperformed several other methods used in the literature (e.g. BFGS, Model Gradient Descent, and Nelder-Mead) and provided the closest results to bruteforce search. In Fig. 2 we illustrate the results of the \(\mathtt{scanlast}\) procedure in practice, showing the 10% best-found QAOA/QAMPA for each instance (black dots) that maximize 〈BEST_{5}〉 for 40 instances with *N* = 16 variables. The displayed results illustrate clearly that for a given instance the best-found phase separation \(\gamma _{p}^{\star }\) and mixing angles \(\beta _{p}^{\star }\) for QAOA and QAMPA are similar, at least for low value of *p*.

Indeed, by analyzing data from random instances of sizes we observe that the first angles of the sequence converge very often to a value of constant magnitude, which is often close to 0. The time-reversal symmetry is manifest in the heatmap since the solutions for which all the angles are negated are equivalent, and the choice of one over another is likely associated to the randomness in \(\mathtt{scanlast}\). For the mixing angle the concentration of best performing angles is apparent for almost all tested instances, while for the *γ* angles at *p* > 6 the concentration is not as striking. This could be explained by the fact that the value of the 〈*B**E**S**T*_{5}〉 metric is already very close to the maximum, so the optimization landscape could be rather flat and difficult to numerically optimize.

### 4.2 Instance-by-instance comparison

To compare the performance of the two ansatz we consider the instance-by-instance scatter plot where we compare the 〈BEST_{5}〉 results for QAOA vs QAMPA after the \(\mathtt{scanlast}\) optimization.^{Footnote 5} The figure indicates with cross symbols the metric value for each instance and with the round dots the median/standard deviation of the results for all the instance results associated to a given *p*. What is clear from the results shown in Fig. 4 is that as *p* increases both ansätze perform essentially the same for all tested sizes. This convergence is not surprising as for large *p* QAOA and QAMPA for the same number of parameters could be interpreted as a Trotter-like approximation of the same unitary evolution (Childs et al. 2021). Similar results are observed also for different *R* in the metric (including the common expectation value metric *R* = 1), indicating that the output distribution for the *𝜖*_{k} at the current sizes is smooth and monotonic.

#### 4.2.1 Additional tests and variations of QAMPA/QAOA

The approach that we benchmarked in this study offers some flexibility on its implementation. For instance, given that QAMPA is not fully grounded on insights from a specific Hamiltonian evolution, it might be worth asking whether the information related to the coefficient of objective function (*J*_{nm}) are beneficial as inserted in the circuit or if it is sufficient to train the parameters using that information, like done for VQE hardware-efficient ansätze (Kandala et al. 2017). We investigated empirically these questions by comparing instance-by-instance results for QAOA and QAMPA, using \(\mathtt{scanlast}\) on a set of instances for *N* = 10 with slightly modified parameters (*f*_{opt} = 200*p*). The results are statistically in line with the ones presented for *N* = 16 in Fig. 4 using higher computational effort.

In Fig. 5-(a), (b) we show results for algorithms against a version where for the circuit ansatz all *J*_{nm} are fixed to be 1 (we call this the -noJ version in the figures).^{Footnote 6} Note that the cost function coefficients are still used in the evaluation metric both for the parameter setting and for scoring the performance. Another test we performed, illustrated in Fig. 5-(c), (d), compares the QAMPA performance against the performance of an ansatz that includes only *XY* gates. For the *XY-noJ* variation case, the phase separation step of QAOA is completely eliminated and the phase information is entirely absent. The *XY* label in the figure considers instead the design of an XY-only ansatz that mixes qubits using an angle proportional to the cost function coefficients, i.e. using gates of the form \({U}^{nm}_{XY}(J_{nm} \beta )\).

What is observed is that the standard QAOA approach (which performs only slightly better than QAMPA, as we recall) gives the best performance compared against all other tested variations, and it is always beneficial to include a proportional factor multiplying *γ*_{p}, for each gate between qubits *n* and *m*, corresponding to the objective function coefficients *J*_{nm} in the circuit ansatz. These observations were validated for multiple problem sizes up to *N* = 16.

## 5 Discussion and conclusions

As reviewed in Cerezo et al. (2020) and Bharti et al. (2021), modern quantum algorithms for optimization on NISQ devices have generalized significantly the original structure of the QAOA circuitry. In this paper, we have presented a variation that combines the hardware-efficient spirit of Variational Quantum Eigensolvers with the advanced mixers of the Quantum Alternating Operator Ansatz and the guidance from inclusion of operators derived from the cost function without increasing the number of parameters that need to be optimized. Our numerical results indicate that the mixer-phaser ansatz QAMPA is a compelling choice among NISQ era quantum options; we expect these ideas can be ported beyond the \(\mathtt{WeightedMaxCutGSP}\) problem, to provide compilation advantages to a multitude of other hard-constrained combinatorial optimization problems that require advanced mixers. Multiple questions however remain in order to bridge the gap from proof-of-principles to real-world implementation. We list here few avenues of research towards the goal of deploying a QAMPA solver in a quantum processor, before concluding with some more general thoughts on research directions.

First, the parameter setting procedure used in this study could be refined to avoid optimization bottlenecks and limitations that affect all layerwise training protocols at scale. In particular, the Powell method while effective at small *N* would eventually become intractable, and alternative gradient methods might be hampered by barren plateaus if the parameter setting protocol is kept to be “layerwise” (Campos et al. 2021). Recently developed analytical methods based on series expansion (Hadfield et al. 2021) or quantum control (Larocca et al. 2021) methods might come in handy to analyze further the reachability deficits and strengths of these algorithms (Akshay et al. 2020), study the observed optimal parameter concentration (Akshay et al. 2021), and to estimate the performance of QAMPA at scale. Ties between quantum annealing schedule and QAOA parameter setting (Yang et al. 2017; Zhou et al. 2020; Brady et al. 2021a), further indicating that cross-overs between digital and analog optimization methods are also an interesting possible development for QAMPA (Magann et al. 2021; Headley et al. 2020; Wiersema et al. 2020; Brady et al. 2021b).

Moreover, our performance evaluation procedure based on the 〈BEST_{5}〉 provides just an indication of the ability of a QAMPA solver to identify a good solution of \(\mathtt{WeightedMaxCutGSP}\) in a reasonable time, and a fully-fledged analysis on comparative advantage of using this method versus other heuristics is required. This analysis has to take into account practical issues such as the effect of noise, which is going to adversely impact our performance estimation. The relative performance results will be both problem and hardware dependent. While theoretical frameworks to estimate the impact of noise in circuits featuring XY gates are being developed (Streif et al. 2021), ultimately only the experimental tests on quantum hardware will be able to provide a full picture of performance, including identifying at what layer *p* the solver is most effective. While in the ideal case, performance cannot decrease with *p*, under noise the situation is different; performance is observed to decrease quickly with *p* in current hardware (Harrigan et al. 2021). QAMPA, with its circuit depth reduction compared to QAOA, will enable empirical studies for higher values of *p* than QAOA on a wide variety of quantum hardware. Hybridization of QAMPA with adaptive (Zhu et al. 2020) and recursive (Bravyi et al. 2020) hybrid methods may prove powerful. One advantage of the studied problem, and of others with locally conserved particle number constraints, is that provides a natural error mitigation strategy via post-selection (Shaydulin and Galda 2021). Namely, measured bitstrings which do not obey the Hamming weight constraint are discarded. Post-selection has shown to provide significant improvements in experiments on superconducting processors (Google et al 2020) and can be generally applied beneficially to any situation with constraints. Another recent experiment (Hashim et al. 2021) shows that permuting the ordering of the qubits in the SWAP network and averaging over the results is reducing the systematic coherent errors, a technique that could be generalized to our case where the permutations are not equivalents, possibly helping the parameter setting and/or optimization performance.

In terms of actual implementation, the QAMPA method is readily testable in superconducting processors that natively support XY interactions, such as Rigetti’s devices of the Aspen family (Abrams et al. 2020). However, the initialization to a Dicke state is likely too heavy for near-term implementation, so a warm-start from a classical candidate solution obtained by greedy search is advisable (Egger et al. 2021).

We expect researchers to develop other ansätze that have sweet spots in terms of circuit depth, number of parameters, and performance. As QAMPA illustrates, there is nothing sacrosanct about QAOA; variants may perform as well or better, particularly on NISQ hardware. A rich ecosystem of ansätze that take into account hardware architectures, gate sets, and noise considerations for different types of NISQ processors will enable more rapid understanding of quantum optimization approaches for the NISQ era and beyond.

## Change history

### 26 October 2022

A Correction to this paper has been published: https://doi.org/10.1007/s42484-022-00085-x

## Notes

In the remainder of this paper, we use the acronym QAOA to mean Quantum Alternating Operator Ansatz, which includes the Quantum Approximate Optimization Algorithms as a special case.

Note that in general the order doesn’t matter for phase separation, while \([\mathcal {U}_{M}^{nm}(\beta ),\mathcal {U}_{M}^{kl}(\beta ^{\prime })]\neq 0\), so ordering is important for mixing gates.

Another possibility could be \({U}_{PS}^{nm}(\gamma _{p}) {U}_{M}^{nm}(\beta _{p})\).

We note that while depth correlates with fidelity of circuits, it has been suggested that real-time duration is a better metric to use to design compilers (Venturelli et al. 2018).

Note that we do not consider the runs that are required to discover the best performing angles but we directly report the metric computed on the corresponding

*F*(*𝜖*_{k}) distribution.Similar results are obtained if instead of putting all coefficients to be equal we set them to be proportional to a random number drawn from the same distribution that generated the instance in the first place.

## References

Abrams DM, Didier N, Johnson BR, da Silva MP, Ryan CA (2020) Implementation of xy entangling gates with a single calibrated pulse. Nature Electronics 3(12):744–750

Ageev AA, Sviridenko MI (1999) Approximation algorithms for maximum coverage and max cut with given sizes of parts. In: International conference on integer programming and combinatorial optimization. Springer, pp 17–30

Akshay V, Philathong H, Morales MES, Biamonte JD (2020) Reachability deficits in quantum approximate optimization. Physical review letters 124(9):090504

Akshay V, Rabinovich D, Campos E, Biamonte J (2021) Parameter concentration in quantum approximate optimization. arXiv:2103.11976

Barkoutsos PK, Nannicini G, Robert A, Tavernelli I, Woerner S (2020) Improving variational quantum optimization using CVar. Quantum 4:256

Bärtschi A, Eidenbenz S (2019) Deterministic preparation of Dicke states. In: International symposium on fundamentals of computation theory. Springer, pp 126–139

Bärtschi A, Eidenbenz S (2020) Grover mixers for QAOA Shifting complexity from mixer design to state preparation. In: 2020 IEEE International conference on quantum computing and engineering (QCE). IEEE, pp 72–82

Bharti K, Cervera-Lierta A, Kyaw TH, Haug T, Alperin-Lea S, Anand A, Degroote M, Heimonen H, Kottmann JS, Menke T et al (2021) Noisy intermediate-scale quantum (NISQ,) algorithms. arXiv:2101.08448

Brady LT, Baldwin CL, Bapat A, Kharkov Y, Gorshkov AV (2021a) Optimal protocols in quantum annealing and quantum approximate optimization algorithm problems. Phys Rev Lett 126(7):070505

Brady LT, Kocia L, Bienias P, Bapat A, Kharkov Y, Gorshkov AV (2021b) Behavior of analog quantum algorithms. arXiv:2107.01218

Bravyi S, Kliesch A, Koenig R, Tang E (2020) Obstacles to variational quantum optimization from symmetry protection. Phys Rev Lett 125(26):260505

Campos E, Nasrallah A, Biamonte J (2021) Abrupt transitions in variational quantum circuit training. Phys Rev A 103(3):032607

Cerezo M, Arrasmith A, Babbush R, Benjamin SC, Endo S, Fujii K, McClean JR, Mitarai K, Yuan X, Cincio L et al (2020) Variational quantum algorithms. arXiv:2012.09265

Childs AM, Su Y, Tran MC, Wiebe N, Zhu S (2021) Theory of trotter error with commutator scaling. Physical Review X 11(1):011020

Cook J, Eidenbenz S, Bärtschi A. (2020) The quantum alternating operator ansatz on maximum k-vertex cover. In: 2020 IEEE International conference on quantum computing and engineering (QCE). IEEE, pp 83–92

Do M, Wang Z, O’Gorman B, Venturelli D, Rieffel E, Frank J (2020) Planning for compilation of a quantum algorithm for graph coloring. In: Proceedings of the 24th European conference on artificial intelligence (ECAI’2020)

Díez-Valle P, Porras D, José garcía-ripoll J (2021) Quantum variational optimization: the role of entanglement and problem hardness. arXiv:2103.14479

Egger DJ, Mareček J, Woerner S (2021) Warm-starting quantum optimization. Quantum 5:479

Farhi E, Goldstone J, Gutmann S (2014) A quantum approximate optimization algorithm. arXiv:1411.4028

Farhi E, Goldstone J, Gutmann S, Neven H (2017) Quantum algorithms for fixed qubit architectures. arXiv:1703.06199

Google AI et al (2020) Quantum Hartree-fock on a superconducting qubit quantum computer. Science 369(6507):1084–1089

Hadfield S, Hogg T, Rieffel EG (2021) Analytical framework for quantum alternating operator ansä,tze. arXiv:2105.06996

Hadfield S, Wang Z, O’Gorman B, Rieffel EG, Venturelli D, Biswas R (2019) From the quantum approximate optimization algorithm to a quantum alternating operator ansatz. Algorithms 12(2):34

Harrigan MP, Sung KJ, Neeley M, Satzinger KJ, Arute F, Arya K, Atalaya J, Bardin JC, Barends R, Boixo S et al (2021) Quantum approximate optimization of non-planar graph problems on a planar superconducting processor. Nat Phys 17(3):332–336

Hashim A, Rines R, Omole V, Naik RK, Kreikebaum JM, Santiago DI, Chong FT, Siddiqi I, Gokhale P (2021) Optimized fermionic swap networks with equivalent circuit averaging for qaoa. arXiv:2111.04572

Headley D, Müller T, Martin A, Solano E, Sanz M, Wilhelm FK (2020) Approximating the quantum approximate optimisation algorithm. arXiv:2002.12215

Kandala A, Mezzacapo A, Temme K, Takita M, Brink M, Chow JM, Gambetta JM (2017) Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature 549(7671):242–246

Kim M, Venturelli D, Jamieson K (2019) Leveraging quantum annealing for large mimo processing in centralized radio access networks. In: Proceedings of the ACM special interest group on data communication, pp 241–255

Kivlichan ID, McClean J, Wiebe N, Gidney C, Aspuru-Guzik A, Kin-Lic Chan G, Babbush R (2018) Quantum simulation of electronic structure with linear depth and connectivity. Physical review letters 120(11):110501

Larocca M, Czarnik P, Sharma K, Muraleedharan G, Coles PJ, Cerezo M (2021) Diagnosing barren plateaus with tools from quantum optimal control. arXiv:2105.14377

Li L, Fan M, Coram M, Riley P, Leichenauer S et al (2020) Quantum optimization with a novel Gibbs objective function and ansatz architecture search. Physical Review Research 2(2):023074

Magann AB, Arenz C, Grace MD, Ho T-S, Kosut RL, McClean JR, Rabitz HA, Sarovar M (2021) From pulses to circuits and back again: A quantum optimal control perspective on variational quantum algorithms. PRX Quantum 2(1):010101

Markowitz H (1952) Portfolio selection. J Finance 7(1):77–91

McClean JR, Boixo S, Smelyanskiy VN, Babbush R, Neven H (2018) Barren plateaus in quantum neural network training landscapes. Nature communications 9(1):1–6

Niu MY, Lu S, Chuang IL (2019) Optimizing QAOA: Success probability and runtime dependence on circuit depth. arXiv:1905.12134

Rabinovich D, Sengupta R, Campos E, Akshay V, Biamonte J (2021) Progress towards analytically optimal angles in quantum approximate optimisation. arXiv:2109.11566

Shaydulin R, Galda A (2021) Error mitigation for deep quantum optimization circuits by leveraging problem symmetries. arXiv:2106.04410

Streif M, Leib M, Wudarski F, Rieffel E, Wang Z (2021) Quantum algorithms with local particle-number conservation: Noise effects and error correction. Phys Rev A 103(4):042412

Vatan F, Williams C (2004) Optimal quantum circuits for general two-qubit gates. Phys Rev A 69(3):032315

Venturelli D, Do M, Rieffel E, Frank J (2018) Compiling quantum circuits to realistic hardware architectures using temporal planners. Quantum Science and Technology 3(2):025004

Wang Z, Rubin NC, Dominy JM, Rieffel EG (2020) XY mixers: Analytical and numerical results for the quantum alternating operator ansatz. Phys Rev A 101(1):012320

Wiersema R, Zhou C, de Sereville Y, Carrasquilla JF, Kim YB, Yuen H (2020) Exploring entanglement and optimization within the Hamiltonian variational ansatz. PRX Quantum 1(2):020319

Wilson M, Stromswold R, Wudarski F, Hadfield S, Tubman NM, Rieffel EG (2021) Optimizing quantum heuristics with meta-learning. Quantum Machine Intelligence 3(1):1–14

Yang Z-C, Rahmani A, Shabani A, Neven H, Chamon C (2017) Optimizing variational quantum algorithms using pontryagin’s minimum principle. Physical Review X 7(2):021027

Zhou L, Wang S-T, Choi S, Pichler H, Lukin MD (2020) Quantum approximate optimization algorithm: Performance, mechanism, and implementation on near-term devices. Physical Review X 10 (2):021067

Zhu L, Tang HL, Barron GS, Calderon-Vargas FA, Mayhall NJ, Barnes E, Economou SE (2020) An adaptive quantum approximate optimization algorithm for solving combinatorial problems on a quantum computer. arXiv:2005.10258

## Acknowledgements

R.L. acknowledges support from a NASA Space Technology Graduate Research (NSTGRO) Fellowship, AFRL NYSTEC Contract (FA8750-19-3-6101), and the USRA Feynman Quantum Academy, a program of USRA NASA Academic Mission Services (NNA16BD14C). D.V. also acknowledges support of NAMS. All authors acknowledge support from the DARPA ONISQ program (IAA 8839 and agreement No. HR00112090). All authors acknowledge useful discussion with the NASA Quantum AI Laboratory (QuAIL) team members, in particular Stuart Hadfield, and Benjamin Hall from Michigan State University.

## Author information

### Authors and Affiliations

### Corresponding author

## Ethics declarations

### Conflict of interests

The authors declare no conflicts of interest.

## Additional information

### Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: unfortunately, the original article was published without the author's final correction requests.

This has since been corrected.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

LaRose, R., Rieffel, E. & Venturelli, D. Mixer-phaser Ansätze for quantum optimization with hard constraints.
*Quantum Mach. Intell.* **4**, 17 (2022). https://doi.org/10.1007/s42484-022-00069-x

Received:

Accepted:

Published:

DOI: https://doi.org/10.1007/s42484-022-00069-x

### Keywords

- Quantum optimization
- Gate model quantum computing
- Quantum circuits
- Compilation