Introduction

The quantum approximate optimization algorithm (QAOA)1, like all quantum algorithms, aims to use quantum hardware to efficiently solve problems that are hard for classical computers. It is one of the candidates for achieving quantum supremacy in the noisy intermediate-scale quantum (NISQ) era2. So far, quantum supremacy has only been realized for random circuit sampling tasks3,4. For complex but practical problems in the nondeterministic polynomial-time (NP) class, no quantum advantage has been found, despite trials of QAOA on the Google Sycamore quantum processor5. To search for quantum supremacy, it is crucial to first understand the difference between what is hard or easy for QAOA and for classical algorithms, which can be best explored via the combinatorial Boolean satisfiability (SAT) problem.

In a k-SAT instance, one asks whether multiple clauses, each involving k Boolean variables, can be satisfied simultaneously. The worst-case hardness depends drastically on the value of k: while 3-SAT is NP-complete, 2-SAT can be solved in polynomial time (class P). In addition, the classical empirical hardness of random 3-SAT instances is known to exhibit a computational phase transition6,7,8,9,10 as a function of the problem density, characterized by the clause-to-variable ratio. When the density is small (large), almost all instances are satisfiable (unsatisfiable) and easy to solve, while for densities approaching the critical point of the SAT-UNSAT phase transition, the 3-SAT instances become the hardest to solve.

For quantum algorithms such as QAOA, the above phenomenon is largely unexplored. To begin with, as QAOA always implements SAT problems in their NP-hard optimization versions (Max-SAT)11, it is unclear whether the decision version’s NP (k = 3) versus P (k = 2) contrast has any influence on QAOA’s performance. It is also unclear how classical empirical hardness of 3-SAT connects to QAOA’s performance on its optimization versions. Indeed, ref. 12 does not find a big difference between QAOA’s performance on Max-2-SAT and Max-3-SAT, and only finds QAOA’s performance to worsen as the density increases—a phenomenon they call ‘reachability deficits’.

In this paper, we reveal a computational phase transition in the trainability of QAOA in solving the positive 1-in-k SAT problem. In terms of trainability, characterized by the gradient, the typical gradient amplitude in training SAT problems reaches its minimum at a critical problem density. In general, this quantum critical problem density deviates from the SAT-UNSAT phase transition6,7,8,9,10, where the instances are hardest classically. We link this gradient transition to the controllability of the quantum systems evolving under the QAOA circuit13,14,15,16 and the complexity of the QAOA circuit17,18,19,20. In terms of accuracy on the optimization versions of SAT, we find that QAOA’s approximation ratio is robust and decays slowly with the problem density. Moreover, despite the performance decay due to reachability deficits, it is precisely in the large problem density region that a quantum advantage can be identified when comparing with classical approximate algorithms. In addition, the accuracy in solving Max-2-SAT is higher than that in solving Max-3-SAT, consistent with the P versus NP contrast in the decision version of the problems. Interestingly, for the decision version of the SAT problems, QAOA shows the worst performance at the SAT-UNSAT transition, revealing a remnant of classical empirical hardness. Such a remnant of classical empirical hardness is also confirmed in quantum adiabatic algorithms (QAA)21,22,23.

Results

Quantum approximate optimization algorithm

To solve an optimization problem, QAOA encodes the cost function into the energy of a problem Hamiltonian HC, defined over spin-1/2 particles (qubits), and then seeks an approximation of the ground state that encodes the solution to the optimization problem. An n-qubit QAOA circuit alternately implements, in each layer, dynamics governed by the problem Hamiltonian HC and by a mixing Hamiltonian \({H}_{B}=\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{x}\), where \({\sigma }_{i}^{x}\) is the Pauli-X operator representing the transverse fields. The output state of a p-layer QAOA is therefore \(|\psi (\overrightarrow{\gamma },\overrightarrow{\beta })\rangle =\mathop{\prod }\nolimits_{\ell = 1}^{p}{e}^{-i{\beta }_{\ell }{H}_{B}}{e}^{-i{\gamma }_{\ell }{H}_{C}}|\psi \left(0,0\right)\rangle ,\) where \(\overrightarrow{\gamma }=\left({\gamma }_{1},\ldots ,{\gamma }_{p}\right)\) and \(\overrightarrow{\beta }=\left({\beta }_{1},\ldots ,{\beta }_{p}\right)\) are variational parameters. The initial state is set to be a superposition of all possible spin configurations, \(\left|\psi \left(0,0\right)\right\rangle ={\left|+\right\rangle }^{\otimes n}\) with \(\left|+\right\rangle =\left(\left|0\right\rangle +\left|1\right\rangle \right)/\sqrt{2}\). To solve the problem, variational training is performed over the parameters \(\overrightarrow{\gamma },\overrightarrow{\beta }\) to minimize the cost function \({{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta })=\langle \psi (\overrightarrow{\gamma },\overrightarrow{\beta })|{H}_{C}|\psi (\overrightarrow{\gamma },\overrightarrow{\beta })\rangle .\) The variational training terminates when the cost function stops decreasing significantly, and ideally leads to the optimal parameters \({\overrightarrow{\gamma }}^{* },{\overrightarrow{\beta }}^{* }={{{{\rm{argmin}}}}_{\overrightarrow{\gamma },\overrightarrow{\beta}}}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }).\)
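As an illustration of the state preparation and cost function described above, the following is a minimal NumPy statevector sketch. It assumes that HC is diagonal in the computational basis and is passed as an array of its 2^n eigenvalues; the function names (`apply_mixer`, `qaoa_state`, `cost`) and the toy energies are hypothetical and are not taken from the paper's implementation.

```python
import numpy as np

def apply_mixer(psi, beta, n):
    """Apply exp(-i*beta*H_B) with H_B = sum_i sigma^x_i, one qubit at a time."""
    c, s = np.cos(beta), -1j * np.sin(beta)   # exp(-i*beta*X) = c*I + s*X
    psi = psi.reshape([2] * n)
    for q in range(n):
        psi = np.moveaxis(psi, q, 0)
        psi = np.stack([c * psi[0] + s * psi[1], s * psi[0] + c * psi[1]])
        psi = np.moveaxis(psi, 0, q)
    return psi.reshape(-1)

def qaoa_state(energies, gammas, betas):
    """p-layer QAOA output state for a diagonal H_C given by its eigenvalues."""
    energies = np.asarray(energies, dtype=float)
    n = len(energies).bit_length() - 1
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)   # |+>^n
    for gamma, beta in zip(gammas, betas):
        psi = np.exp(-1j * gamma * energies) * psi        # exp(-i*gamma*H_C)
        psi = apply_mixer(psi, beta, n)                   # exp(-i*beta*H_B)
    return psi

def cost(energies, gammas, betas):
    """Cost function <psi|H_C|psi> at the given variational parameters."""
    energies = np.asarray(energies, dtype=float)
    psi = qaoa_state(energies, gammas, betas)
    return float(np.real(np.vdot(psi, energies * psi)))

# Toy example (assumed): a two-qubit diagonal Hamiltonian with energies 1, 0, 0, 1.
toy_energies = np.array([1.0, 0.0, 0.0, 1.0])
print(cost(toy_energies, [0.3], [0.2]))
```

In practice the returned cost would be handed to a classical optimizer over the 2p angles; that outer loop is omitted here.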

SAT problems

We focus on two types of SAT problems: the k-SAT problem (k ≥ 2) and the positive 1-in-k SAT problem (1-k-SAT+, k ≥ 2), which is also known as the exact-cover k problem. Given n Boolean variables \(V={\{{v}_{i}\}}_{i = 1}^{n}\), a random instance of the SAT problems can be constructed by choosing m clauses \(C={\{{c}_{a}\}}_{a = 1}^{m}\), each containing k different variables \({\{{v}_{aj}\}}_{j = 1}^{k}\) chosen uniformly at random from V. Each of the k elements in a clause is a positive or negative literal with equal probability in k-SAT problems, and always a positive literal in 1-k-SAT+ problems. The conjunctive normal form (CNF) of the SAT instance can be expressed as \(F\left(V\right){ = \bigwedge }_{a = 1}^{m}{c}_{a}({\{{v}_{aj}\}}_{j = 1}^{k}),\) where ‘⋀’ denotes AND and forces the CNF to be true only when all clauses are satisfied. In a k-SAT problem, each clause is true when at least one literal in the clause is true, while in a positive 1-in-k SAT problem, a clause \({c}_{a}({\{{v}_{aj}\}}_{j = 1}^{k})\) is satisfied if and only if exactly one variable among \({\{{v}_{aj}\}}_{j = 1}^{k}\) is true.
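The random-instance construction just described can be sketched as follows. This is only an illustrative sketch: the representation of a clause as a list of (variable index, sign) pairs and all function names are assumptions made here, not the paper's code.

```python
import numpy as np

def random_instance(n, m, k, positive_only=False, rng=None):
    """Draw m clauses, each over k distinct variables chosen uniformly from n.

    Each literal carries a sign: +1 (positive) or -1 (negative). With
    positive_only=True all signs are +1, as in the 1-k-SAT+ problems.
    """
    rng = np.random.default_rng(rng)
    clauses = []
    for _ in range(m):
        variables = rng.choice(n, size=k, replace=False)
        if positive_only:
            signs = np.ones(k, dtype=int)
        else:
            signs = rng.choice([1, -1], size=k)
        clauses.append(list(zip(variables.tolist(), signs.tolist())))
    return clauses

def clause_satisfied(clause, assignment, one_in_k=False):
    """k-SAT: at least one true literal. 1-in-k SAT: exactly one true literal."""
    literals = [bool(assignment[v]) if s > 0 else not assignment[v] for v, s in clause]
    return sum(literals) == 1 if one_in_k else any(literals)

def formula_satisfied(clauses, assignment, one_in_k=False):
    """The CNF is true only when every clause is satisfied."""
    return all(clause_satisfied(c, assignment, one_in_k) for c in clauses)

# Example: a random 3-SAT instance with n = 8 variables and m = 5 clauses.
instance = random_instance(8, 5, 3, rng=0)
print(formula_satisfied(instance, [True] * 8))
```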

The (decision version of the) SAT problem asks whether \(F\left(V\right)\) can be satisfied by an assignment of the variables V, while the optimization version, Max-SAT, aims to find an assignment of the variables V that minimizes the number of clause violations. As the clause-to-variable ratio m/n increases, it becomes harder to satisfy a random SAT instance, and there exists a phase transition of the SAT probability across a critical ratio: m/n = 1 for 2-SAT24, m/n ∼ 4.26 for 3-SAT9, m/n ~ 0.55 for 1-2-SAT+, and m/n ~ 0.62 for 1-3-SAT+10, as shown in Figs. 1(a, b) and 2(a, b).
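For small n, both versions of the problem can be checked by exhaustive enumeration, which is a convenient ground truth when testing a solver. The sketch below is a brute-force illustration under the assumption that clauses are lists of (variable index, sign) pairs; it is not an efficient solver and the function names are hypothetical.

```python
from itertools import product

def count_violations(clauses, assignment, one_in_k=False):
    """Number of clauses violated by a truth assignment (sequence of bools)."""
    violations = 0
    for clause in clauses:
        literals = [assignment[v] if s > 0 else not assignment[v] for v, s in clause]
        ok = (sum(literals) == 1) if one_in_k else any(literals)
        violations += not ok
    return violations

def min_violations(clauses, n, one_in_k=False):
    """Max-SAT by enumeration: the smallest number of violated clauses."""
    return min(count_violations(clauses, a, one_in_k)
               for a in product([False, True], repeat=n))

def is_satisfiable(clauses, n, one_in_k=False):
    """Decision version: SAT exactly when some assignment violates no clause."""
    return min_violations(clauses, n, one_in_k) == 0

# The four possible 2-SAT clauses on two variables cannot all hold at once.
unsat_2sat = [[(0, 1), (1, 1)], [(0, 1), (1, -1)], [(0, -1), (1, 1)], [(0, -1), (1, -1)]]
print(is_satisfiable(unsat_2sat, 2), min_violations(unsat_2sat, 2))
```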

Fig. 1: SAT-UNSAT phase transition and trainability of 3-SAT (left column) and 2-SAT.

a, b Probability of SAT for different system sizes n. c, d The mean of \(1/{{{\rm{SD}}}}({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }))\) in different p-layer QAOA with n = 16 variables. The inverse is plotted for easier comparison with Eq. (4). e, f The ratio of the average gradient SD \(\langle {{{\rm{SD}}}}({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }))\rangle\) for n = 6 variables to that for n = 16 variables in different p-layer QAOA. A larger ratio indicates a barren plateau. g, h The dimension of the dynamical Lie algebra \(\dim ({\mathfrak{g}})\) for the QAOA generators in an n = 6 qubit system (see Eq. 4). i, j Average of the 4-point OTOC for different p-layer QAOA with n = 10 variables (see Eq. 5). The green horizontal dashed line represents the Haar-unitary value \(-{2}^{n}/({4}^{n}-1)\). Vertical dashed lines indicate the critical SAT-UNSAT transition point. All results and error bars are estimated over 100 instances.

Fig. 2: SAT-UNSAT phase transition and trainability of 1-3-SAT+ (left column) and 1-2-SAT+.

a, b Probability of SAT for different system sizes n. c, d The mean of \(1/{{{\rm{SD}}}}({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }))\) in different p-layer QAOA with n = 18 variables. The inverse is plotted for easier comparison with Eq. (4). e, f The ratio of the average gradient SD \(\langle {{{\rm{SD}}}}({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }))\rangle\) for n = 6 variables to that for n = 18 variables in different p-layer QAOA. A larger ratio indicates a barren plateau. g, h The dimension of the dynamical Lie algebra \(\dim ({\mathfrak{g}})\) for the QAOA generators in an n = 6 qubit system (see Eq. 4). The inset log-log plots (g2) and (h2) show \(\dim ({\mathfrak{g}})\) versus n with a fully symmetric HC. Green and orange curves represent the lower bound estimate \(\dim ({\mathfrak{g}})={n}^{2}\) and the upper bound \({\dim }^{{{{\rm{UB}}}}}\). i, j Average of the 4-point OTOC for different p-layer QAOA with n = 10 variables (see Eq. 5). The green horizontal dashed line represents the Haar-unitary value \(-{2}^{n}/({4}^{n}-1)\). Vertical dashed lines indicate the critical SAT-UNSAT transition point. All results and error bars are estimated over 100 instances.

We study the cases of k = 2 and k = 3 for comparison: while 2-SAT and 1-2-SAT+ are in class P and efficiently solvable, 3-SAT and 1-3-SAT+ are NP-complete, and known algorithms, e.g., the well-known algorithm X25, take an exponential amount of time to solve them in the worst case. Despite the contrast in the decision versions, Max-k-SAT and Max-1-k-SAT+ are always NP-hard, even for k = 2 (ref. 26). In addition to the worst-case hardness, empirical studies with classical algorithms on different variants of 3-SAT6,7,8,9,10 show that when m/n is small (large), almost all instances are satisfiable (unsatisfiable) and easy to solve, while for m/n approaching the critical point of the SAT-UNSAT transition, the SAT problem instances become the hardest to solve.

To solve SAT problems with QAOA, we map each Boolean variable vi to the spin state of a qubit, with the spin-down state \(\left|1\right\rangle\) (Pauli-Z operator σz = −1) for true and the spin-up state \(\left|0\right\rangle\) (σz = 1) for false, and obtain the spin Hamiltonians for 2-SAT and 3-SAT as [Eq. 1]

$${H}_{C,k}=\frac{1}{{2}^{k}}\mathop{\sum }\limits_{a=1}^{m}\mathop{\prod }\limits_{\ell =1}^{k}\left(1+{A}_{{a}_{\ell },a}{\sigma }_{{a}_{\ell }}^{z}\right),$$
(1)

where \({A}_{{a}_{\ell },a}\in \{0,+1,-1\}\) is the sign of the ℓth literal in the ath clause, with +1 (−1) for a positive (negative) literal and 0 if the variable is absent from the clause. Similarly, the Hamiltonians for 1-3-SAT+ and 1-2-SAT+ are21,22,23,27 (see Methods) [Eqs. 2, 3]

$${H}_{C,{3}^{+}}=\frac{1}{4}\mathop{\sum }\limits_{a=1}^{m}{\left({\sigma }_{a1}^{z}+{\sigma }_{a2}^{z}+{\sigma }_{a3}^{z}-1\right)}^{2},$$
(2)
$${H}_{C,{2}^{+}}=\frac{1}{4}\mathop{\sum }\limits_{a=1}^{m}{\left({\sigma }_{a1}^{z}+{\sigma }_{a2}^{z}\right)}^{2}.$$
(3)

The gate-based implementation of the QAOA for our problem Hamiltonians can be found in Supplementary Note 1. With the above encoding, an instance is satisfiable if and only if the ground-state energy is zero. As QAOA minimizes the cost function, it can be considered an approximate algorithm for solving Max-SAT problems. Via a threshold decision, the solution of the optimization version also yields a solution to the decision version.
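Because Eqs. (1) to (3) are diagonal in the computational basis, their spectra can be tabulated directly from the σz eigenvalues. The sketch below does this under two assumptions made here for illustration: bit i of the basis-state index encodes variable i (bit 1 = true, σz = −1), and the helper names are hypothetical. For a single clause it shows that Eq. (1) and Eq. (3) count violated clauses, whereas Eq. (2) vanishes exactly on satisfying assignments but assigns energy 4 to the all-true assignment of a clause.

```python
import numpy as np

def sigma_z(n):
    """z[i, b] = +1 if variable i is false (bit 0) in basis state b, else -1."""
    b = np.arange(2 ** n)
    return np.array([1 - 2 * ((b >> i) & 1) for i in range(n)])

def h_ksat(clauses, n):
    """Diagonal of Eq. (1); clauses are lists of (variable, sign) pairs."""
    z = sigma_z(n)
    energies = np.zeros(2 ** n)
    for clause in clauses:
        term = np.ones(2 ** n)
        for v, sign in clause:
            term = term * (1 + sign * z[v]) / 2   # equals 1 only if the literal is false
        energies += term
    return energies

def h_one_in_three(clauses, n):
    """Diagonal of Eq. (2); clauses are triples of variable indices."""
    z = sigma_z(n)
    return sum((z[a] + z[b] + z[c] - 1) ** 2 / 4 for a, b, c in clauses)

def h_one_in_two(clauses, n):
    """Diagonal of Eq. (3); clauses are pairs of variable indices."""
    z = sigma_z(n)
    return sum((z[a] + z[b]) ** 2 / 4 for a, b in clauses)

print(h_ksat([[(0, 1), (1, -1)]], 2))      # clause (v0 OR NOT v1)
print(h_one_in_three([(0, 1, 2)], 3))      # one 1-in-3 clause
print(h_one_in_two([(0, 1)], 2))           # one 1-in-2 clause
```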

Our overall goal in this paper is to understand what is hard and what is easy on QAOA, both in terms of trainability and accuracy.

Gradient of QAOA training

As a variational circuit, QAOA’s cost-function gradients with respect to the variables \(\overrightarrow{\gamma },\overrightarrow{\beta }\) indicate the shape of the cost-function landscape: larger gradient amplitudes indicate sharper changes, and therefore a problem that is easier to train, while small gradient amplitudes lead to barren plateaus16,28,29,30,31 that make the training difficult. As gradients average to zero on random states28,29, we evaluate the standard deviation (SD) of the gradients to characterize their typical amplitudes.

To represent the typical case of training, we evaluate the gradient at random choices of the circuit parameters via numerical finite differences. Without loss of generality, we consider the gradient with respect to the first variable γ1 and denote it as \({\partial }_{1}{{{\mathcal{C}}}}\)28,29. To enable an easier visualization in Figs. 1(c, d) and 2(c, d), we plot the inverse of the gradient SD, \(1/{{{\rm{SD}}}}({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }))\), so that large values indicate hardness in convergence. We consider different numbers of layers p in QAOA to obtain a comprehensive picture.
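The gradient statistic described above can be sketched numerically as follows. This is a minimal illustration rather than the paper's code: it assumes a diagonal HC supplied as an array of energies, a central finite difference with a hypothetical step `eps`, and angles drawn uniformly with γ in [0, 2π) and β in [0, π); the sampling ranges, sample count, and function names are assumptions.

```python
import numpy as np

def qaoa_cost(energies, gammas, betas):
    """<psi|H_C|psi> for a p-layer QAOA with diagonal H_C (statevector simulation)."""
    n = len(energies).bit_length() - 1
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)          # |+>^n
    for gamma, beta in zip(gammas, betas):
        psi = (np.exp(-1j * gamma * energies) * psi).reshape([2] * n)
        c, s = np.cos(beta), -1j * np.sin(beta)
        for q in range(n):                                       # exp(-i*beta*sigma^x) per qubit
            psi = np.moveaxis(psi, q, 0)
            psi = np.stack([c * psi[0] + s * psi[1], s * psi[0] + c * psi[1]])
            psi = np.moveaxis(psi, 0, q)
        psi = psi.reshape(-1)
    return float(np.real(np.vdot(psi, energies * psi)))

def gradient_sd(energies, p, samples=200, eps=1e-4, seed=0):
    """SD of the finite-difference gradient w.r.t. gamma_1 over random angles."""
    energies = np.asarray(energies, dtype=float)
    rng = np.random.default_rng(seed)
    shift = np.zeros(p)
    shift[0] = eps
    grads = []
    for _ in range(samples):
        gammas = rng.uniform(0, 2 * np.pi, p)
        betas = rng.uniform(0, np.pi, p)
        up = qaoa_cost(energies, gammas + shift, betas)
        down = qaoa_cost(energies, gammas - shift, betas)
        grads.append((up - down) / (2 * eps))
    return float(np.std(grads))

# Toy two-qubit example (assumed energies); the plotted quantity is the inverse SD.
sd = gradient_sd([1.0, 0.0, 0.0, 1.0], p=2)
print(sd, 1 / sd)
```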

For all the problems under study, the inverse gradient SD has a clear peak at a critical clause-to-variable ratio m/n, as shown in Figs. 1(c, d) and 2(c, d). However, this peak is in general different from the classical SAT-UNSAT transition indicated by the dashed line. For the special case of 1-3-SAT+, Fig. 2(c) shows that the peak of the inverse gradient coincides with the SAT-UNSAT transition. A large inverse gradient SD indicates a small gradient in the typical case, and therefore a more barren plateau that makes the training hard at the phase transition. When p is small, the peak disappears; however, at the same time, QAOA fails to provide an accurate solution, making trainability irrelevant.

We notice that the cases of k = 3 have the inverse gradient peaked at a much smaller clause-to-variable density, as a result of the more complex clauses. Overall, the results reveal a transition of the trainability measured by gradient that is different from the classical SAT-UNSAT transition, showing that the empirical hardness for quantum algorithms can be different from classical algorithms.

Connection to controllability

To understand the different behaviors of the gradient, we utilize the connection between gradient and controllability measured by the dimension of dynamical Lie algebra (DLA) of QAOA generators, as recently identified in ref. 16.

As explained in ref. 13, the DLA can be used to test the controllability of a quantum system governed by unitary dynamics. Consider an n-qubit system described by a Hilbert space \({{{\mathcal{H}}}}\) and an optimal quantum control model described by a unitary \(U=\mathop{\prod }\nolimits_{k = 1}^{K}{e}^{-i{u}_{k}{H}_{k}}\,\), where \({{{\mathcal{G}}}}\equiv \{{H}_{1},\ldots \,,{H}_{K}\}\) is a set of generators and \(\{{u}_{1},\ldots \,,{u}_{K}\}\subseteq {\mathbb{R}}\) is a set of coefficients, usually representing the control fields. The DLA \({\mathfrak{g}}\equiv {\langle i{H}_{1},\ldots ,i{H}_{K}\rangle }_{{{{\rm{Lie}}}}}\subseteq {\mathfrak{su}}({2}^{n})\) is then constructed from the repeated and nested commutators of the elements in \({{{\mathcal{G}}}}\). The corresponding dynamical Lie group is obtained by taking the exponential of the DLA, \({e}^{{\mathfrak{g}}}\equiv \{{e}^{{V}_{1}}{e}^{{V}_{2}}\ldots {e}^{{V}_{L}}\,|\,{V}_{1},\ldots \,,{V}_{L}\in {\mathfrak{g}}\}.\) Generally, for a finite-time evolution governed by the Schrödinger equation, the system is fully controllable when the set of unitaries obtainable during this evolution covers all unitaries. This is precisely formulated by the so-called Lie algebra rank condition, which states that the system is fully controllable if and only if \(\dim ({\mathfrak{g}})={4}^{n}-1\). Here we assume, without loss of generality, that \({{{\mathcal{G}}}}\) does not include the identity, because the identity only contributes a negligible global phase in the QAOA scenario.
For quantum systems where the whole Hilbert space is not fully controllable, when the DLA \({\mathfrak{g}}\) can be written as a direct sum, i.e., \({\mathfrak{g}}{ = \bigoplus }_{j}{{\mathfrak{g}}}_{j}\), so that the Hilbert space decomposes into a direct sum of subspaces \({{{{\mathcal{H}}}}}_{j}\) as \({{{\mathcal{H}}}}{ = \bigoplus }_{j}{{{{\mathcal{H}}}}}_{j}\), \(\dim ({{\mathfrak{g}}}_{j})\) determines the subspace controllability of \({{{{\mathcal{H}}}}}_{j}\)14,15,16.
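For very small systems, the DLA dimension can be estimated by brute force: repeatedly take commutators of the elements found so far and keep only those that are linearly independent over the reals. The sketch below is one such procedure, written here as an assumption about how the computation could be done (Gram-Schmidt on vectorized matrices with a numerical tolerance `tol`); it is not necessarily the method used for the figures, and its cost grows quickly with n.

```python
import numpy as np

def dla_dimension(generators, tol=1e-8):
    """Dimension of the real Lie algebra generated by {i*H} via nested commutators."""
    shape = np.shape(generators[0])
    basis = []       # orthonormal real vectors spanning the algebra found so far
    elements = []    # the same basis elements stored as matrices

    def try_add(mat):
        norm = np.linalg.norm(mat)
        if norm < tol:
            return                               # vanishing commutator
        mat = mat / norm
        vec = np.concatenate([mat.real.ravel(), mat.imag.ravel()])
        for b in basis:                          # Gram-Schmidt with real coefficients
            vec = vec - np.dot(b, vec) * b
        residual = np.linalg.norm(vec)
        if residual < tol:
            return                               # linearly dependent, nothing new
        vec = vec / residual
        basis.append(vec)
        half = vec.size // 2
        elements.append((vec[:half] + 1j * vec[half:]).reshape(shape))

    for h in generators:
        try_add(1j * np.asarray(h, dtype=complex))
    i = 0
    while i < len(elements):                     # commute each element with all earlier ones
        for j in range(i):
            a, b = elements[i], elements[j]
            try_add(a @ b - b @ a)
        i += 1
    return len(elements)

X = np.array([[0, 1], [1, 0]])
Z = np.diag([1, -1])
I2 = np.eye(2)
print(dla_dimension([X, Z]))                     # one qubit with sigma^x and sigma^z controls
h_b = np.kron(X, I2) + np.kron(I2, X)            # two-qubit mixer
h_c = np.kron(Z, Z)                              # toy two-qubit problem Hamiltonian (assumed)
print(dla_dimension([h_b, h_c]))
```

For a single qubit with σx and σz as generators the procedure returns 3 = 4^1 − 1, so the rank condition for full controllability is met; the symmetric two-qubit example returns a much smaller algebra than the 15 dimensions of su(4).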

With a problem Hamiltonian HC and the mixing Hamiltonian HB as the generators, we can generate the DLA \({\mathfrak{g}}\) and provide an estimate of the SD of gradient from the dimension of the DLA \(\dim \left({\mathfrak{g}}\right)\) as [Eq. 4]

$$1/{{{\rm{SD}}}}\left({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta })\right)\in {{\Omega }}\left({\left[{{{\rm{poly}}}}\left(\dim \left({\mathfrak{g}}\right)\right)\right]}^{1/2}\right)$$
(4)

where “poly” denotes a polynomial function. Here, for two functions f(x) and g(x), f(x) ∈ Ω(g(x)) means that f(x) is bounded below by g(x) asymptotically.

Therefore, we evaluate the DLA dimension numerically to compare with our gradient results. As the numerical evaluation is costly, we are limited to a smaller size of n = 6. Despite the small size, as we see in Figs. 1(g, h) and 2(g, h), the DLA dimension \(\dim \left({\mathfrak{g}}\right)\) has essentially the same dependence on the clause-to-variable ratio m/n as the inverse gradient. This manifests a clear connection between the gradient transition and the DLA dimension transition.

QAOA provides clear physical insight into the concept of trainability of variational quantum algorithms on NISQ devices. Figs. 1(c–h) and 2(c–h) show a trade-off between trainability and controllability, which can be explained as follows. When the system is more (less) controllable, there are more (fewer) control protocols available to transform the initial state to the desired final state. Geometrically, these protocols can be described as the accessible paths, characterized by the parameters \((\overrightarrow{\gamma },\overrightarrow{\beta })\), from the initial state to the desired state. In this picture, our task is to find the optimal path among all the possible paths. Therefore, from the trainability perspective, it becomes harder (easier) to train \((\overrightarrow{\gamma },\overrightarrow{\beta })\) when the system is more (less) controllable. Despite being harder to train, the plurality of paths also provides more hope for good performance. In addition, one can also connect the DLA to controllability via the quantum Fisher information matrix (QFIM)30. As the rank of the QFIM characterizes the number of independent ways to vary the control parameters to change the generated quantum state, it is intuitive that the dimension of the DLA upper bounds the rank of the QFIM, which connects the training difficulty and controllability of a quantum model.

An additional insight can be obtained by evaluating the DLA dimension in the m ≫ n limit for the 1-k-SAT+ problems, where the Hamiltonian is symmetric between all qubits (see inset of Fig. 2(g) and (h)). In this case, we are able to prove an upper bound (see Methods), \(\dim ({\mathfrak{g}})\le {\dim }^{{{{\rm{UB}}}}}\equiv \frac{1}{6}n({n}^{2}+6n+11)\). We also expect \(\dim ({\mathfrak{g}})\) to be above \({n}^{2}\), which is the dimension for a much simpler nearest-neighbor Ising model16. While the upper bound is in general a loose one, it indicates that the gradient in the m ≫ n limit is only polynomially small; in contrast, for the hard instances we would expect an exponentially small gradient. This contrast supports the decay of dimension and the increase of gradient when m/n is large.
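To get a feeling for the scales involved, the short sketch below evaluates the upper bound, the n² estimate, and the full-controllability dimension 4^n − 1 for a few system sizes. The function names are hypothetical. As a side remark that follows from expanding (n+1)(n+2)(n+3), the upper bound can also be written as the binomial coefficient C(n+3, 3) minus one, which confirms that it is always an integer.

```python
from math import comb

def dla_upper_bound(n):
    """Upper bound n(n^2 + 6n + 11)/6 for the fully symmetric H_C."""
    return n * (n * n + 6 * n + 11) // 6

def full_dimension(n):
    """dim su(2^n) = 4^n - 1, the value required for full controllability."""
    return 4 ** n - 1

for n in (6, 10, 18):
    print(n, n ** 2, dla_upper_bound(n), full_dimension(n))

# Equivalent closed form: C(n + 3, 3) - 1.
assert all(dla_upper_bound(n) == comb(n + 3, 3) - 1 for n in range(1, 50))
```

At n = 6 the bound evaluates to 83, compared with 36 for the n² estimate and 4095 for a fully controllable system.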

Barren plateau and complexity

To further understand the barren plateau phenomenon, we study how fast the typical gradient decays with the number of qubits. To begin with, we pick two system sizes, n = 6 and n = 16 for k-SAT (n = 18 for 1-k-SAT+ instead), and evaluate the ratio of the average gradient SD \(\langle {{{\rm{SD}}}}({\partial }_{1}{{{\mathcal{C}}}}(\overrightarrow{\gamma },\overrightarrow{\beta }))\rangle\) between the two sizes for different numbers of layers p. In Figs. 1(e, f) and 2(e, f), we see that right after the peak of the inverse gradients, when the circuit depth p is sufficiently large, the decay ratio saturates to a value independent of the clause-to-variable ratio, indicating the barren plateau. Below the peak of the inverse gradient, in contrast, the decay ratio of the gradient increases gradually with the clause-to-variable ratio.

To confirm an exponential decay of the gradient, in Fig. 3, we focus on 1-k-SAT+ and plot the SD of the gradient versus the number of layers p for different numbers of qubits n, while keeping the clause-to-variable ratio m/n constant in each panel. For 1-3-SAT+, when m/n is small in panel (a), the SD of the gradient saturates and does not decrease with p or n, showing no barren plateau. At the critical value in panel (b), we see an exponential decrease of the SD with the number of qubits n at large p, confirming a barren plateau. Above threshold, as shown in panel (c), a barren plateau can still be confirmed, although with larger gradients than in the critical case of panel (b). On the contrary, for 1-2-SAT+, as we see in panels (d) and (e), we do not see the appearance of a barren plateau around the SAT-UNSAT transition. At large m/n in panel (f), the gradient finally starts to show an exponential decay, indicating a barren plateau.

Fig. 3: Barren plateau.

SD of gradient \({{{\rm{SD}}}}\left({\partial }_{1}{{{\mathcal{C}}}}\left(\gamma ,\beta \right)\right)\) versus the number of QAOA layers p for 1-3-SAT+ (top, a–c) and 1-2-SAT+ (bottom, d–f) problems with different numbers of variables n. From left to right (a and d, b and e, c and f) we plot at three different ratios m/n = 0.2, 0.8, 2. Due to the finite size n ≤ 18, as shown in Fig. 2 the transition happens at around m/n = 0.8. All results and error bars are estimated over 100 instances.

The appearance of barren plateaus is often connected to the complexity of the typical quantum circuit involved29. The complexity of an ensemble of unitaries can in general be characterized by the closeness to unitary t-design, which reproduces the Haar random expectation values of 2t-point correlators. In this regard, when the quantum circuit forms a 2-design17,18,32, it has been shown that the variance of the gradient will vanish exponentially with the system size—which leads to a barren plateau of cost function28,29.

Therefore, we consider the unitary ensemble \({{{{\mathcal{U}}}}}_{p}\) formed by the p-layer QAOA, \({U}_{{{{\rm{QAOA}}}}}=\mathop{\prod }\nolimits_{\ell = 1}^{p}{e}^{-i{\beta }_{\ell }{H}_{B}}{e}^{-i{\gamma }_{\ell }{H}_{C}}\), with each angle γ ∈ [0, 2π) and each angle β ∈ [0, π) independent and uniform random. To measure the closeness to 2-design, we evaluate the ensemble-averaged infinite-temperature 4-point out-of-time-order correlator (OTOC) [Eq. 5]

$${C}_{{{{\rm{OTO}}}}}({W}_{1},{W}_{2};{{{\mathcal{E}}}})=\frac{1}{d}{\left\langle {{{\rm{Tr}}}}\{{W}_{1}^\dagger{U}^{{\dagger} }{W}_{2}^\dagger U{W}_{1}{U}^{{\dagger} }{W}_{2}U\}\right\rangle }_{{{{\mathcal{E}}}}},$$
(5)

where the dimension is \(d={2}^{n}\) for an n-qubit system and the average is over the unitaries \(U\in {{{\mathcal{E}}}}\)18. For an ensemble \({{{\mathcal{E}}}}\) forming a 2-design, \({C}_{{{{\rm{OTO}}}}}({W}_{1},{W}_{2};{{{\mathcal{E}}}})=-d/({d}^{2}-1)\) saturates the Haar result18,32, while for trivial ensembles, \({C}_{{{{\rm{OTO}}}}}({W}_{1},{W}_{2};{{{\mathcal{E}}}})\) is of order one. Therefore, the decay of the OTOC indicates that the ensemble approaches a 2-design. Without loss of generality, we consider the OTOC between single-qubit operators \({C}_{{{{\rm{OTO}}}}}({\sigma }_{1}^{y},{\sigma }_{n/2}^{y};{{{{\mathcal{U}}}}}_{p})\). In Figs. 1(i, j) and 2(i, j), we find that the OTOC of the QAOA ensemble decays towards the Haar value when the clause-to-variable ratio m/n increases to the critical value of the minimum gradient, indicating a transition to a 2-design. We also see a difference between the cases of k = 3 and k = 2: the decay of the OTOC for k = 2 is much slower than for k = 3.
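The ensemble average in Eq. (5) can be evaluated directly for small systems. The sketch below is an illustrative helper (the function name `otoc` and the two example unitaries are assumptions made here); it only demonstrates the definition on two deterministic unitaries and makes no claim about the QAOA ensemble itself. For the identity, the two single-qubit operators commute and the correlator equals one, the "order one" value of a trivial ensemble; for a unitary that maps σy on qubit 2 to ±σx on qubit 1, the operators anticommute and the correlator equals −1.

```python
import numpy as np

def otoc(w1, w2, unitaries):
    """Ensemble-averaged infinite-temperature 4-point OTOC of Eq. (5)."""
    d = w1.shape[0]
    values = []
    for u in unitaries:
        w2_t = u.conj().T @ w2 @ u                       # U^dagger W2 U
        corr = w1.conj().T @ w2_t.conj().T @ w1 @ w2_t   # W1^dag (U^dag W2^dag U) W1 (U^dag W2 U)
        values.append(np.trace(corr) / d)
    return float(np.real(np.mean(values)))

I2 = np.eye(2)
Y = np.array([[0, -1j], [1j, 0]])
W1 = np.kron(Y, I2)                                      # sigma^y on qubit 1
W2 = np.kron(I2, Y)                                      # sigma^y on qubit 2

identity = np.eye(4)
swap = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]])
phase = np.kron(np.diag([1, 1j]), I2)                    # phase gate on qubit 1
mover = swap @ phase                                     # sends sigma^y_2 to sigma^x_1

print(otoc(W1, W2, [identity]))
print(otoc(W1, W2, [mover]))
print(otoc(W1, W2, [identity, mover]))
```

To study a QAOA ensemble, the list of unitaries would be replaced by circuit unitaries sampled at random angles.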

Accuracy of QAOA

In this section, we explore the accuracy of QAOA in solving k-SAT and 1-k-SAT+. To speed up the training, we develop a heuristic pre-optimization initialization strategy (see Supplementary Note 2). To obtain the best accuracy, we run QAOA 10 times on each instance and take the best solution among those results. To benchmark the accuracy of QAOA against classical algorithms, in the case of Max-k-SAT, we consider the lower bound of state-of-the-art approximation algorithms; in the case of the decision versions of k-SAT, we consider the success probability of a random guess; in the case of 1-k-SAT+, as fewer results are known about approximation ratios, we reduce the problem to the maximum weighted independent set (MWIS) problem33,34 and use the greedy approximate MWIS algorithms proposed in refs. 35,36 (see Methods).

The standard accuracy characterization of approximate algorithms for optimization problems is the approximation ratio11,35,36. For our case of Max-SAT problems, we define the approximation ratio r ≤ 1 of a solution to be the ratio between the number of clauses satisfied by the solution and the maximum number of clauses that can be satisfied by any solution. As the output state \(\left|\psi (\overrightarrow{\gamma },\overrightarrow{\beta })\right\rangle\) in QAOA can be in a superposition of multiple solutions, we evaluate the expected approximation ratio by projecting the output state onto the computational basis. For Max-k-SAT, a random assignment satisfies on average \(m(1-1/{2}^{k})\) clauses; for Max-1-k-SAT+, a random assignment satisfies on average \(mk/{2}^{k}\) clauses. For instances in which most clauses are satisfiable, these correspond to approximation ratios of \({r}_{{{{\rm{rand}}}}} \sim 1-1/{2}^{k}\) and \({r}_{{{{\rm{rand}}}}} \sim k/{2}^{k}\), respectively. An exact optimal solution saturates r = 1, and non-trivial approximate algorithms should have \(r\in [{r}_{{{{\rm{rand}}}}},1]\).
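The expected approximation ratio and the random-assignment baselines quoted above can be reproduced for small instances with the following sketch. It assumes clauses given as (variable index, sign) pairs, bit i of the basis index encoding variable i, and a brute-force maximum over all assignments; the function names and the toy instances are hypothetical.

```python
import numpy as np

def satisfied_counts(clauses, n, one_in_k=False):
    """Number of satisfied clauses for each of the 2^n assignments."""
    b = np.arange(2 ** n)
    counts = np.zeros(2 ** n, dtype=int)
    for clause in clauses:
        # Truth value of each literal in every basis state (bit 1 = true).
        lits = np.array([((b >> v) & 1) == (1 if s > 0 else 0) for v, s in clause])
        ok = (lits.sum(axis=0) == 1) if one_in_k else lits.any(axis=0)
        counts += ok
    return counts

def expected_ratio(probs, counts):
    """Expected satisfied clauses under probs, divided by the best achievable count."""
    return float(np.dot(probs, counts) / counts.max())

# A uniformly random assignment on a toy 3-SAT instance with two clauses.
clauses = [[(0, 1), (1, -1), (2, 1)], [(1, 1), (2, 1), (3, -1)]]
counts = satisfied_counts(clauses, 4)
uniform = np.full(16, 1 / 16)
print(expected_ratio(uniform, counts))      # 1 - 1/2^3 for this satisfiable instance

# A single positive 1-in-3 clause.
counts_plus = satisfied_counts([[(0, 1), (1, 1), (2, 1)]], 3, one_in_k=True)
print(expected_ratio(np.full(8, 1 / 8), counts_plus))   # 3/2^3
```

For a QAOA output state `psi`, the distribution `probs` would be `np.abs(psi) ** 2`.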

In Figs. 4(a, b) and 5(a, b), we see that as p increases, QAOA is able to obtain larger approximation ratios. As the clause-to-variable ratio m/n increases, the approximation ratio decays as expected. However, the decay is rather slow and manifests the robustness of QAOA. In the Max-3-SAT case, we see that at small p the approximation ratio is already better than the lower bound of r ~ 0.95 in ref. 37; similarly, the lower bound of r ≥ 21/22 for the Max-2-SAT11 case is also exceeded at small depth p. In the case of Max-1-k-SAT+, we consider the approximation ratio of the classical MWIS approximate algorithm for comparison (see Methods). For Max-1-3-SAT+, we identify a clear quantum advantage at around p ~ 16. For Max-1-2-SAT+, advantages appear even for a shallow depth of p = 8. We emphasize that the quantum advantage appears only when the clause-to-variable ratio is large, despite the reachability deficits12. Indeed, we expect quantum algorithms to be advantageous especially for hard problems, where both classical and quantum algorithms face challenges.

Fig. 4: Accuracy of QAOA in k-SAT.

a, b Approximation ratio r of SAT clauses and c, d success probability in determining SAT/UNSAT for 3-SAT (left) and 2-SAT (right) versus clause-to-variable ratio m/n with n = 10 variables. Green dashed lines represent the lower bounds of approximation algorithms, r ≥ 0.95 for Max-3-SAT37 and r ≥ 21/22 for Max-2-SAT11. The horizontal light green dashed lines in (c, d) represent the success probability of a random guess, which is 7/8 and 3/4 for 3-SAT and 2-SAT, respectively. Vertical black dashed lines in all plots represent the critical point of the SAT-UNSAT transition. All results and error bars are estimated over 100 instances.

Fig. 5: Accuracy of QAOA in 1-k-SAT+.

a, b Approximation ratio r of SAT clauses and c, d success probability in determining SAT/UNSAT for 1-3-SAT+ (left) (from p = 4 to p = 24) and 1-2-SAT+ (right) (from p = 4 to p = 16) versus clause-to-variable ratio m/n with n = 10 variables. Green dots represent the classical approximate results obtained through a reduction to MWIS. The horizontal light green dashed lines in (c, d) represent the success probability of a random guess, which is 3/8 and 1/2 for 1-3-SAT+ and 1-2-SAT+, respectively. Vertical black dashed lines in all plots represent the critical point of the SAT-UNSAT transition. All results and error bars are estimated over 100 instances.

Although all Max-SAT problems being considered are NP-hard, we do see some interesting contrast in the performance. In the Max-2-SAT and Max-3-SAT cases, the approximation ratio performance is similar when p is large, consistent with previous results in ref. 12; however, for the absolute number of additional violated clauses, Max-2-SAT performs slightly better than Max-3-SAT (see Supplementary Note 2). In the Max-1-k-SAT+ cases, the accuracy of QAOA is substantially higher for k = 2 than k = 3 with the same number of p layers. We speculate such a contrast in the performance can be caused by the different connectivity and complexity of the Hamiltonian in the problems.

To connect to the empirical hardness transition in classical algorithms, we can also reinterpret each optimization result as a decision of SAT/UNSAT. This can be done via a threshold decision on the minimized number of UNSAT clauses, e.g., determine an instance as SAT when the expected number of UNSAT clauses is smaller than \({E}_{{{{\rm{th}}}}}=0.5\) and UNSAT otherwise. To characterize the overall performance, we evaluate the success probability of deciding SAT/UNSAT when solving random instances at a fixed clause-to-variable ratio m/n. The results are shown in Figs. 4(c, d) and 5(c, d).
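The threshold decision and the resulting success probability can be written down in a few lines. This sketch assumes that the expected number of UNSAT clauses has already been obtained for each instance and that the true SAT/UNSAT labels are known (for example, from exhaustive search on small instances); the function names and the numbers in the example are hypothetical.

```python
def decide_sat(expected_unsat_clauses, e_th=0.5):
    """Declare SAT when the expected number of UNSAT clauses is below the threshold."""
    return expected_unsat_clauses < e_th

def success_probability(expected_unsat_list, true_labels, e_th=0.5):
    """Fraction of instances whose SAT/UNSAT decision matches the true label."""
    decisions = [decide_sat(e, e_th) for e in expected_unsat_list]
    correct = sum(d == t for d, t in zip(decisions, true_labels))
    return correct / len(true_labels)

# Hypothetical minimized costs for three instances and their true labels.
print(success_probability([0.1, 0.7, 1.3], [True, True, False]))
```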

The success probability increases with the number of QAOA layers p, as we expect. For the k = 3 cases in Figs. 4(c) and 5(c), there is a valley of low success probability at around the critical point of m/n shown in Fig. 2(a), recovering the same hardness transition identified in empirical studies of classical algorithms6,7,8,9,10. For the k = 2 cases in Figs. 4(d) and 5(d), in contrast, despite a similar valley of low success probability at small p, the success probability is almost unity for a circuit depth of p = 16. Similarly, the classical benchmark can be reinterpreted, and a similar transition versus m/n can be seen. Such a valley at the classical SAT-UNSAT transition indicates a remnant of the classical empirical hardness and is different from the trainability transition identified in Figs. 1 and 2.

Combining the above, we see that overall QAOA possesses a similar notion of what is hard and easy as classical algorithms, while showing advantage over the classical algorithms being considered in the large problem-density instances.

Comparison with quantum adiabatic algorithm

The identified trainability transition in general deviates from the classical computational phase transition, while the performance in solving the SAT problems shows a trend consistent with the classical computational phase transition. Such a disparity motivates us to explore the empirical hardness of SAT instances in other Hamiltonian-based quantum algorithms, such as QAA, a popular alternative to and also the predecessor of QAOA.

To obtain the solution, QAA prepares the ground state of the problem Hamiltonian in Eqs (1), (2) and (3) via an adiabatic evolution of the Hamiltonian [Eq. 6]

$$H(s)=s{H}_{C}+(1-s){H}_{B}^{\prime},s\in [0,1],$$
(6)

from an ancillary Hamiltonian \({H}_{B}^{\prime}\) at s = 0 to the problem Hamiltonian HC at s = 1. Here the ancillary Hamiltonian is \({H}_{B}^{\prime}=\mathop{\sum }\nolimits_{i = 1}^{n}| {h}_{i}| {\sigma }_{i}^{x}\), where ∣hi∣ is the number of times the variable vi appears in the clauses (see Methods). The initial Hamiltonian is \(H(0)={H}_{B}^{\prime}\), whose easy-to-prepare ground state \(\left|\psi \left(0\right)\right\rangle \propto {\left(\left|0\right\rangle -\left|1\right\rangle \right)}^{\otimes n}\) is a superposition of all possible spin configurations. As one tunes the parameter s slowly from 0 to 1, so that H(s) approaches H(1) = HC, the adiabatic theorem guarantees that the system stays in the instantaneous ground state; therefore, the final state \(\left|{\phi }_{{{{\rm{QAA}}}}}\right\rangle\) is the ground state of the problem Hamiltonian, which provides the solution to the optimization problem.

From the adiabatic theorem, we can estimate the computation time that QAA needs for the success probability to be close to unity as Ta ~ 1/(ΔE)^2, where ΔE is the minimum gap of the Hamiltonians {H(s), s ∈ [0,1]}21,23. In Figs. 6(a, b) and 7(a, b), we evaluate the inverse gap squared 1/(ΔE)^2 as a measure of instance hardness at different clause-to-variable ratios using QuTiP38. Note that there exist other, more rigorous estimates of the necessary adiabatic evolution time that include higher-order terms39,40. However, the inverse gap squared is sufficient as an approximate estimate for our purpose. In subplots (a), we identify a computational phase transition for 3-SAT and 1-3-SAT+, where the minimum gap is smallest at about the critical SAT/UNSAT transition, up to some small deviation due to finite size, similar to the decision version of SAT on QAOA in subplots (c) of Figs. 4 and 5. For 2-SAT and 1-2-SAT+, by contrast, we see the smallest minimum gap occur at a density higher than the critical point, qualitatively agreeing with the case of QAOA.
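
The inverse-gap hardness measure can be illustrated with a small dense-matrix sketch. It is not the QuTiP code used for the figures, and it rests on several assumptions: `cost_diag` is a hypothetical array holding the diagonal of HC in the computational basis, `weights` holds the values ∣hi∣ (all assumed positive), the gap is sampled on a uniform grid in s, and, when HC has D degenerate ground states, the gap is measured from the lowest level to level D, i.e. to the first level that does not merge into the final ground manifold.

```python
import numpy as np

def transverse_field(weights):
    """Return sum_i w_i * sigma^x_i as a dense 2^n x 2^n matrix."""
    n = len(weights)
    X, I2 = np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(2)
    H = np.zeros((2**n, 2**n))
    for i, w in enumerate(weights):
        op = np.array([[1.0]])
        for q in range(n):
            op = np.kron(op, X if q == i else I2)
        H += w * op
    return H

def min_gap(cost_diag, weights, num_s=201):
    """Minimum gap of H(s) = s*H_C + (1-s)*H_B' over a grid of s in [0, 1]."""
    cost_diag = np.asarray(cost_diag, dtype=float)
    H_C, H_B = np.diag(cost_diag), transverse_field(weights)
    # Assumption: skip the D levels that become degenerate ground states at s = 1.
    D = int(np.sum(np.isclose(cost_diag, cost_diag.min())))
    if D >= len(cost_diag):
        raise ValueError("H_C is fully degenerate; no gap is defined")
    gaps = []
    for s in np.linspace(0.0, 1.0, num_s):
        evals = np.linalg.eigvalsh(s * H_C + (1.0 - s) * H_B)
        gaps.append(evals[D] - evals[0])
    return min(gaps)

# Single-variable toy problem: cost 0 for |0>, cost 1 for |1>, |h_1| = 1.
gap = min_gap([0.0, 1.0], [1.0])
hardness = 1.0 / gap**2
```

For this one-qubit toy problem the gap is sqrt(s^2 + 4(1 − s)^2), which is smallest at s = 0.8.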

Fig. 6: QAA gap size and performance for k-SAT problems.

a, b The median of 1/(ΔE)^2 of QAA with n = 10 variables is shown by blue circles. The green and purple circles represent the SAT and UNSAT instances, respectively. c, d The probability of the final state after QAA evolution being in the ground state, \(P=\mathop{\sum }\nolimits_{i = 1}^{D}| \left\langle {\psi }_{i}| {\phi }_{{{{\rm{QAA}}}}}\right\rangle {| }^{2}\). All results and error bars are estimated over 100 instances.

Fig. 7: QAA gap size and performance of 1-k-SAT+.

a, b The median of 1/(ΔE)^2 of QAA with n = 10 variables is shown by blue circles. The green and purple circles represent the SAT and UNSAT instances, respectively. c, d The probability of the final state after QAA evolution being in the ground state, \(P=\mathop{\sum }\nolimits_{i = 1}^{D}| \left\langle {\psi }_{i}| {\phi }_{{{{\rm{QAA}}}}}\right\rangle {| }^{2}\). All results and error bars are estimated over 100 instances.

To further confirm the transition, we evaluate the success probability \(P=\mathop{\sum }\nolimits_{i = 1}^{D}| \left\langle {\psi }_{i}| {\phi }_{{{{\rm{QAA}}}}}\right\rangle {| }^{2}\) from the overlap between the final evolved state \(\left|{\phi }_{{{{\rm{QAA}}}}}\right\rangle\) and the D degenerate ground states \(\left|{\psi }_{i}\right\rangle\) of the problem Hamiltonian. In Figs. 6(c, d) and 7(c, d), at a finite time Ta, we see that the success probability of QAA decreases as the density approaches the critical point from below, while remaining roughly constant above the critical point. Such a robustness to the problem density coincides with the slow decay of QAOA’s approximation ratio with the problem density. In addition, we also find the success probability of 2-SAT and 1-2-SAT+ (subplots (d)) to be much higher than that of 3-SAT and 1-3-SAT+ (subplots (c)).
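
The success probability P can be estimated for a small instance with the following sketch. It is an illustration under stated assumptions rather than the simulation used for the figures: the schedule is linear in time, s = t/T, the evolution is approximated by piecewise-constant midpoint steps with ħ = 1, all weights ∣hi∣ are positive, and the names `cost_diag`, `weights`, `T` and `steps` are hypothetical. Because HC is diagonal, its ground states are the computational basis states of minimum cost.

```python
import numpy as np

def transverse_field(weights):
    """Return sum_i w_i * sigma^x_i as a dense 2^n x 2^n matrix."""
    n = len(weights)
    X, I2 = np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(2)
    H = np.zeros((2**n, 2**n))
    for i, w in enumerate(weights):
        op = np.array([[1.0]])
        for q in range(n):
            op = np.kron(op, X if q == i else I2)
        H += w * op
    return H

def qaa_success_probability(cost_diag, weights, T, steps=500):
    """Overlap of the evolved state with the ground manifold of H_C."""
    cost_diag = np.asarray(cost_diag, dtype=float)
    H_C, H_B = np.diag(cost_diag), transverse_field(weights)
    # Initial state (|0> - |1>)^{(x) n}, the ground state of H_B' for positive weights.
    minus = np.array([1.0, -1.0]) / np.sqrt(2.0)
    psi = np.array([1.0 + 0.0j])
    for _ in range(len(weights)):
        psi = np.kron(psi, minus)
    dt = T / steps
    for k in range(steps):
        s = (k + 0.5) / steps  # midpoint of the k-th time slice
        evals, evecs = np.linalg.eigh(s * H_C + (1.0 - s) * H_B)
        psi = evecs @ (np.exp(-1j * evals * dt) * (evecs.conj().T @ psi))
    ground = np.isclose(cost_diag, cost_diag.min())
    return float(np.sum(np.abs(psi[ground]) ** 2))

# Single-variable toy problem: slow evolution versus an almost instantaneous one.
p_slow = qaa_success_probability([0.0, 1.0], [1.0], T=50.0)
p_fast = qaa_success_probability([0.0, 1.0], [1.0], T=1e-6)
```

In the fast limit the state has no time to follow the Hamiltonian, so P reduces to the overlap of the initial superposition with the solution.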

Discussion

In this paper, we thoroughly explore the empirical hardness of Hamiltonian-based quantum algorithms in solving SAT problems. In the case of QAOA, we find a trainability phase transition, where the gradient is minimum at a certain critical problem density. Such a phase transition is connected to the controllability and complexity of QAOA circuits. Although the trainability transition in general deviates from the classical SAT-UNSAT transition, in terms of performance, Hamiltonian-based algorithms do show a remnant of the classical SAT-UNSAT transition. Although our results are empirical, we expect analytical results to be challenging, as the classical counterpart of such a transition is also empirical due to the complexity of the SAT problems. We also identify quantum advantages of QAOA over several classical greedy approximate algorithms for a relatively small-scale quantum system, potentially realizable in the near term. Although we have focused on two cases of SAT for convenience, we expect the computational phase transition in QAOA to apply to all combinatorial optimization problems. In particular, as 3-SAT is NP-complete, the clause-to-variable ratio represents a universal characterization of a ‘problem density’12, and the computational phase transition applies to all NP-complete problems in this regard.

Methods

Many-body formulation of the problem Hamiltonian

Here we introduce the many-body formulation of the problem spin Hamiltonian in Eqs. (1), (2) and (3). For convenience, we first introduce an n × m matrix Aij with entries in {0, ±1}, where Aij = 1 (−1) if the variable vi is included in the clause cj as a positive (negative) literal and Aij = 0 otherwise. With this matrix in hand, we can express all Hamiltonians in a standard many-body form34 as [Eqs. 7-13]

$${H}_{C,3}=\frac{1}{8}\mathop{\sum}\limits_{i < j < \ell }{K}_{ij\ell }{\sigma }_{i}^{z}{\sigma }_{j}^{z}{\sigma }_{\ell }^{z}+\frac{1}{8}\mathop{\sum}\limits_{i < j}{J}_{ij}{\sigma }_{i}^{z}{\sigma }_{j}^{z}-\frac{1}{8}\mathop{\sum}\limits_{i}{h}_{i}{\sigma }_{i}^{z}+\frac{m}{8}$$
(7)
$${H}_{C,2}=\frac{1}{4}\mathop{\sum}\limits_{i < j}{J}_{ij}{\sigma }_{i}^{z}{\sigma }_{j}^{z}-\frac{1}{4}\mathop{\sum}\limits_{i}{h}_{i}{\sigma }_{i}^{z}+\frac{m}{4}$$
(8)

for k-SAT problems; and

$${H}_{C,{3}^{+}}=\frac{1}{2}\mathop{\sum }\limits_{i=1}^{n}{h}_{i}{\sigma }_{i}^{z}+\frac{1}{2}\mathop{\sum}\limits_{i < j}{J}_{ij}{\sigma }_{i}^{z}{\sigma }_{j}^{z}+m,$$
(9)
$${H}_{C,{2}^{+}}=\frac{1}{2}\mathop{\sum}\limits_{i < j}{J}_{ij}{\sigma }_{i}^{z}{\sigma }_{j}^{z}+\frac{m}{2},$$
(10)

for 1-k-SAT+ problems. The notations are introduced as

$${h}_{i}=-\mathop{\sum }\limits_{j=1}^{m}{A}_{ij}$$
(11)
$${J}_{ij}=\mathop{\sum }\limits_{a=1}^{m}{A}_{ia}{A}_{ja}$$
(12)
$${K}_{ij\ell }=\mathop{\sum }\limits_{a=1}^{m}{A}_{ia}{A}_{ja}{A}_{\ell a}$$
(13)

Note that the number of clauses containing vi is equal to ∣hi∣.
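
As a sanity check on Eqs. (8), (11) and (12), the sketch below builds A for a small 2-SAT instance and compares the energy of HC,2 with a direct count of violated clauses. The helper names and the example instance are hypothetical. The code assumes that each clause contains two distinct variables and that a true variable corresponds to the σz eigenvalue −1, which is the convention under which Eq. (8) counts UNSAT clauses.

```python
import itertools
import numpy as np

def clause_matrix(n, clauses):
    """Build A (n x m) from clauses given as tuples of signed, 1-based literals."""
    A = np.zeros((n, len(clauses)), dtype=int)
    for a, clause in enumerate(clauses):
        for lit in clause:
            A[abs(lit) - 1, a] = 1 if lit > 0 else -1
    return A

def energy_2sat(A, x):
    """Evaluate H_{C,2} of Eq. (8) on a bit string x (1 = True)."""
    n, m = A.shape
    sigma = 1 - 2 * np.asarray(x)   # assumed convention: False -> +1, True -> -1
    h = -A.sum(axis=1)              # Eq. (11)
    J = A @ A.T                     # Eq. (12); only the i < j entries are used
    pair = sum(J[i, j] * sigma[i] * sigma[j]
               for i, j in itertools.combinations(range(n), 2))
    return 0.25 * pair - 0.25 * (h @ sigma) + m / 4

def count_unsat(clauses, x):
    """Directly count the clauses in which every literal is false."""
    return sum(all((x[abs(l) - 1] == 1) != (l > 0) for l in clause)
               for clause in clauses)

# Hypothetical instance: (v1 or v2), (not v1 or v3), (not v2 or not v3), (v1 or not v3).
n = 3
clauses = [(1, 2), (-1, 3), (-2, -3), (1, -3)]
A = clause_matrix(n, clauses)
mismatches = [x for x in itertools.product([0, 1], repeat=n)
              if abs(energy_2sat(A, x) - count_unsat(clauses, x)) > 1e-12]
```

An empty `mismatches` list indicates that the spin Hamiltonian reproduces the number of UNSAT clauses for every assignment of this instance.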

Dynamical Lie algebra: definition and bounds

Below we give bounds for \(\dim \left({\mathfrak{g}}\right)\) in the m ≫ n limit for both 1-2-SAT+ and 1-3-SAT+, where all coefficients Jij and hi become uniform (see Supplementary Note 3).

For 1-2-SAT+, we have \({H}_{C,{2}^{+}}\propto {\sum }_{i\,{ < }\,j}{\sigma }_{i}^{z}{\sigma }_{j}^{z}\) up to a constant. Then, the set of the initial generators for the corresponding DLA \({{\mathfrak{g}}}_{{H}_{C,{2}^{+}},{H}_{B}}\) is \({{{{\mathcal{G}}}}}_{{2}^{+}}\equiv \{\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{x},\,{\sum }_{i\,{ < }\,j}{\sigma }_{i}^{z}{\sigma }_{j}^{z}\}\). For the fully coupled Ising model with transverse fields along the x and y axes, the set of the initial generators becomes \({{{{\mathcal{G}}}}}_{x,y}\equiv {{{{\mathcal{G}}}}}_{{2}^{+}}\cup \left\{\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{y}\right\}\). From ref. 41, the dimension of the corresponding DLA \({{\mathfrak{g}}}_{x,y}\) is [Eq. 14]

$$\dim ({{\mathfrak{g}}}_{x,y})=\left(\begin{array}{c}n+3\\ n\end{array}\right)-1=\frac{1}{6}n({n}^{2}+6n+11),$$
(14)

where \(\left(\begin{array}{c}a\\ b\end{array}\right)\equiv a!/[(a-b)!\,b!]\) is the binomial coefficient. Since the DLAs are generated by the repeated and nested commutators of the generator sets, we must have \(\dim ({{\mathfrak{g}}}_{{H}_{C,{2}^{+}},{H}_{B}})\le \dim ({{\mathfrak{g}}}_{x,y})\) due to \({{{{\mathcal{G}}}}}_{{2}^{+}}\subset {{{{\mathcal{G}}}}}_{x,y}\), which leads to [Eq. 15]

$$\dim ({{\mathfrak{g}}}_{{H}_{C,{2}^{+}},{H}_{B}})\le \frac{1}{6}n({n}^{2}+6n+11).$$
(15)
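
As a check on the closed form used in Eqs. 14 and 15, the binomial coefficient can be expanded directly:

$$\left(\begin{array}{c}n+3\\ n\end{array}\right)-1=\frac{(n+1)(n+2)(n+3)}{6}-1=\frac{{n}^{3}+6{n}^{2}+11n+6}{6}-1=\frac{1}{6}n({n}^{2}+6n+11).$$

For instance, n = 10 gives an upper bound of 285, far below the dimension \({4}^{n}-1=1048575\) of the full algebra \({\mathfrak{su}}({2}^{n})\).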

We also know that the nearest-neighbor Ising model has a DLA of dimension n^2 (ref. 16). Therefore, we expect the scaling to be between Ω(n^2) and O(n^3).
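
For very small n, the DLA dimension can also be computed numerically by closing the generator set under commutators. The sketch below is an illustration and not the procedure of ref. 41: it uses dense matrices, a Gram-Schmidt linear-independence test with a hypothetical tolerance `tol`, and Hermitian generators, whose complex span has the same dimension as the real Lie algebra generated by the corresponding anti-Hermitian operators. The helper `pauli_sum` is a hypothetical convenience for building the uniform generators considered here.

```python
import itertools
import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_sum(label, n):
    """Sum of a Pauli string over all sites ("X") or all pairs i < j ("ZZ")."""
    total = np.zeros((2**n, 2**n), dtype=complex)
    for sites in itertools.combinations(range(n), len(label)):
        op = np.array([[1.0 + 0.0j]])
        for q in range(n):
            letter = label[sites.index(q)] if q in sites else "I"
            op = np.kron(op, PAULI[letter])
        total += op
    return total

def dla_dimension(generators, tol=1e-9):
    """Dimension of the span of all nested commutators of the generators."""
    basis, frontier = [], []   # orthonormal flattened matrices; elements left to commute

    def try_add(M):
        norm = np.linalg.norm(M)
        if norm < tol:
            return
        v = M.reshape(-1) / norm
        for b in basis:                 # remove components along the known basis
            v = v - np.vdot(b, v) * b
        residual = np.linalg.norm(v)
        if residual > tol:              # a genuinely new direction
            v = v / residual
            basis.append(v)
            frontier.append(v.reshape(M.shape))

    for G in generators:
        try_add(G)
    while frontier:
        M = frontier.pop()
        for G in generators:
            try_add(G @ M - M @ G)
    return len(basis)

def upper_bound(n):
    """Right-hand side of Eq. 15."""
    return n * (n**2 + 6 * n + 11) // 6

n = 3
dim_2plus = dla_dimension([pauli_sum("X", n), pauli_sum("ZZ", n)])
dim_xy = dla_dimension([pauli_sum("X", n), pauli_sum("Y", n), pauli_sum("ZZ", n)])
```

The two dimensions can then be compared with `upper_bound(n)`. The cost of this brute-force closure grows exponentially with n, so it is only practical for a handful of qubits.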

For the 1-3-SAT+, we have \({H}_{C,{3}^{+}}\propto \frac{2}{n-1}{\sum }_{i\,{ < }\,j}{\sigma }_{i}^{z}{\sigma }_{j}^{z}-\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{z}\). Then, the initial set of generators is \({{{{\mathcal{G}}}}}_{3}=\{\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{x},\,\frac{2}{n-1}{\sum }_{i\,{ < }\,j}{\sigma }_{i}^{z}{\sigma }_{j}^{z}-\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{z}\}\). Let \({{\mathfrak{g}}}_{{H}_{C,{3}^{+}},{H}_{B}}\) be the corresponding DLA. Here, because we can write [Eq. 16]

$${e}^{i{H}_{C,{3}^{+}}}={e}^{i\frac{2}{n-1}{\sum }_{i\,{ < }\,j}{\sigma }_{i}^{z}{\sigma }_{j}^{z}}{e}^{-i\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{z}},$$
(16)

if we start from the initial set of generators \({{{{\mathcal{G}}}}}_{3}^{\prime}=\{\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{x},\,\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{z},\,{\sum }_{i\,{ < }\,j}{\sigma }_{i}^{z}{\sigma }_{j}^{z}\}\), the corresponding DLA strictly contains \({{\mathfrak{g}}}_{{H}_{C,{3}^{+}},{H}_{B}}\). Now, because of the commutator \([\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{x},\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{y}]\propto \mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{z}\) and, conversely, \([\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{z},\mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{x}]\propto \mathop{\sum }\nolimits_{i = 1}^{n}{\sigma }_{i}^{y}\), the corresponding DLA of \({{{{\mathcal{G}}}}}_{3}^{\prime}\) becomes exactly \({{\mathfrak{g}}}_{x,y}\). Therefore, we have \(\dim ({{\mathfrak{g}}}_{{H}_{C,{3}^{+}},{H}_{B}})\le \dim ({{\mathfrak{g}}}_{x,y})\), which leads to [Eq. 17]

$$\dim ({{\mathfrak{g}}}_{{H}_{C,{3}^{+}},{H}_{B}})\le \frac{1}{6}n({n}^{2}+6n+11).$$
(17)

The lower-bound estimate n^2 for the dimension of the DLA, obtained from the nearest-neighbor Ising model, still holds.

Classical approximate algorithms

To solve Max-1-k-SAT+, we transform it to the MWIS problem. Given a 1-k-SAT+ instance with n variables and m clauses, one can construct a weighted graph with n vertices \({\{{q}_{i}\}}_{i = 1}^{n}\), each corresponding to a variable vi and having the weight w(qi) = ∣hi∣ ≥ 0. For every two distinct vertices qi, qj, an edge (qi, qj) exists if Jij > 0, that is, when the corresponding variables vi, vj appear together in at least one clause.

One can verify that the SAT/UNSAT version of the 1-k-SAT+ problem reduces to asking whether the weight of the maximum-weight independent set is equal to m. The reason is simple: an independent set of this graph corresponds to an assignment that does not have more than one true variable in any clause. To guarantee a solution to the 1-k-SAT+ instance, we still need to make sure that all clauses have one true variable. Since the total weight of the independent set is equal to the number of clauses satisfied by this assignment, all clauses are satisfied if the total weight is equal to m. At the same time, Max-1-k-SAT+ can be reduced to solving the MWIS. As a classical benchmark, we can utilize various greedy algorithms for MWIS35,36 and choose the best performance among them (see Supplementary Note 4).
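
The reduction can be sketched as follows. The graph construction follows the description above, while the greedy rule (repeatedly pick the remaining vertex with the largest ratio of weight to degree plus one) is only one common heuristic, chosen here for illustration; it is an assumption and not necessarily one of the algorithms of refs. 35, 36 or Supplementary Note 4. Function names and example instances are hypothetical.

```python
import numpy as np

def mwis_graph(n, clauses):
    """Weights |h_i| and adjacency (J_ij > 0) for a 1-k-SAT+ instance.

    Clauses are tuples of 1-based variable indices; all literals are positive.
    """
    A = np.zeros((n, len(clauses)), dtype=int)
    for a, clause in enumerate(clauses):
        for v in clause:
            A[v - 1, a] = 1
    weights = A.sum(axis=1)                  # number of clauses containing v_i
    J = A @ A.T
    adj = (J > 0) & ~np.eye(n, dtype=bool)   # edge if two variables share a clause
    return weights, adj

def greedy_mwis(weights, adj):
    """Greedy independent set using the weight / (degree + 1) rule (an assumption)."""
    alive = np.ones(len(weights), dtype=bool)
    chosen = []
    while alive.any():
        deg = (adj & alive[None, :]).sum(axis=1)      # degree among remaining vertices
        score = np.where(alive, weights / (deg + 1.0), -np.inf)
        v = int(np.argmax(score))
        chosen.append(v)
        alive[v] = False
        alive[adj[v]] = False                         # drop the neighbors of v
    return chosen, int(weights[chosen].sum())

# Hypothetical 1-3-SAT+ instance with clauses (v1, v2, v3) and (v1, v3, v4).
clauses = [(1, 2, 3), (1, 3, 4)]
weights, adj = mwis_graph(4, clauses)
chosen, total = greedy_mwis(weights, adj)
decided_sat = total == len(clauses)

# Hypothetical 1-2-SAT+ triangle, which has no satisfying assignment.
tri_weights, tri_adj = mwis_graph(3, [(1, 2), (2, 3), (1, 3)])
_, tri_total = greedy_mwis(tri_weights, tri_adj)
```

Because a greedy search only gives a lower bound on the maximum weight, a total equal to m certifies SAT, whereas a smaller total is only a heuristic indication of UNSAT.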