Introduction

Combinatorial optimization involves seeking the optimal object within a set of candidates, a problem that arises across research domains such as statistical physics1, applied mathematics2, and computer science3. Because the solution space tends to grow exponentially with the problem size, this “combinatorial explosion” makes it very hard to find the optimal solution with traditional algorithms or brute-force search4,5. To overcome this, numerous heuristic algorithms have been devised for approximating (or identifying sub-optimal) solutions6,7,8,9. However, developing a highly efficient and accurate algorithm for combinatorial optimization problems remains a formidable challenge.

In computational physics, the great majority of combinatorial optimization problems can be mapped to the Ising problem, i.e., finding the ground state of an Ising model10. An Ising model consists of N Ising spins taking values σi = ±1, couplings Jij between pairs of spins, and external fields hi. The Hamiltonian of an Ising model is defined as

$$H=-\frac{1}{2}\sum\limits_{ij}^{N}{J}_{ij}{\sigma }_{i}{\sigma }_{j}-\sum\limits_{i}^{N}{h}_{i}{\sigma }_{i}.$$
(1)

This model can be intuitively represented using a set of quantum bits (qubits). Solving the Ising model is nondeterministic polynomial-time (NP) hard, meaning it is widely believed that no efficient exact classical algorithm exists for this problem. Quantum computers, such as the D-Wave quantum annealer (QA) built from superconducting qubits, have been introduced to tackle the Ising problem. However, experimental results have demonstrated that the current performance of QA is suboptimal on dense graphs, owing to limited qubit connectivity and physical noise11,12,13,14.
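For concreteness, the Hamiltonian in Eq. (1) can be evaluated directly from a coupling matrix and a field vector. The sketch below is only an illustration in NumPy; the dense-matrix representation and the toy two-spin example are our own choices, not part of the benchmark code.

```python
import numpy as np

def ising_energy(J, h, sigma):
    """Evaluate Eq. (1): H = -1/2 * sigma^T J sigma - h^T sigma.

    J     : (N, N) symmetric coupling matrix with zero diagonal
    h     : (N,)   external fields
    sigma : (N,)   spin configuration with entries +/-1
    """
    return -0.5 * sigma @ J @ sigma - h @ sigma

# Toy example: two ferromagnetically coupled spins prefer alignment.
J = np.array([[0.0, 1.0], [1.0, 0.0]])
h = np.zeros(2)
print(ising_energy(J, h, np.array([1, 1])))    # -1.0 (ground state)
print(ising_energy(J, h, np.array([1, -1])))   # +1.0
```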

Inspired by the QA, various special-purpose processors for solving the Ising model have been developed, such as the coherent Ising machine (CIM) implemented with pulsed lasers and degenerate optical parametric oscillators15,16,17, electromechanical systems18, FPGA-based digital annealers19, memristor Hopfield neural networks20, and MRAM-based stochastic computing hardware (so-called p-bits)21,22.

Several quantum-annealing-inspired algorithms (QAIA) have been developed for solving combinatorial optimization problems by simulating the physical mechanisms of quantum-annealing-inspired devices. These algorithms include different versions of Simulated CIM and Simulated Bifurcation (SB). They relax the discrete spin variables into continuous ones and employ an annealing schedule for optimization. From a quantum-mechanical perspective, the CIM is implemented with a network of optical parametric oscillators, each of which exhibits two stable oscillating states above threshold and thereby represents an Ising spin23. The CIM has been numerically simulated on classical computers through SimCIM. To mitigate the adverse effects of relaxing the spin variables, several error-corrected versions of the CIM have been proposed, including CIM with chaotic amplitude control (CAC), chaotic feedback control (CFC) and separated feedback control (SFC)24,25. Note that all of these algorithms simulate some variant of the CIM, whereas the name SimCIM refers specifically to the version presented in ref. 26. Similarly, the quantum bifurcation machine (QbM) carries out quantum adiabatic optimization with Kerr-nonlinear parametric oscillators27; emulating the bifurcation phenomena within classical nonlinear Hamiltonian systems yields the SB algorithms28,29. The original SB algorithm, commonly denoted adiabatic SB (aSB), is also susceptible to errors stemming from the continuous relaxation of the discrete variables30. Ballistic SB (bSB) introduces inelastic walls to alleviate these analog errors, and discrete SB (dSB) further suppresses them by discretizing the spin variables within the mean-field terms28,29. Consequently, these variants not only expedite convergence but also yield more accurate solutions.

As a promising approach to solving Ising problems, these quantum-annealing-inspired algorithms have demonstrated high efficiency and accuracy, and recent benchmark studies have reported their performance. However, works such as refs. 31,32,33,34 only benchmark commercial solvers developed on specialized platforms, while ref. 35 solely examines the effect of different nonlinear terms on the performance of analog Ising machines. To promote and broaden the application of these quantum-inspired algorithms, their performance on general devices such as CPUs and GPUs must be evaluated.

In this work, we present benchmarking experiments of QAIA and compare them with other physics-inspired algorithms as well as D-Wave hardware, with the goal of evaluating their efficiency in solving optimization problems. On the one hand, we find that QAIAs not only achieve a lower time-to-solution (TTS) than other heuristic algorithms and than classical and quantum solvers on chimera graphs, but also outperform the Advantage system of D-Wave (hereafter abbreviated as 'Advantage') when searching for the maximum cut value on pegasus graphs of varying problem size. On the other hand, dSB demonstrates superior overall performance among the QAIAs. In particular, bSB exhibits the highest success probability and lowest TTS on chimera graphs and on small instances, whereas dSB proves to have the lowest TTS on large graphs. However, on skewed graphs, CAC and SFC have a higher probability than dSB of finding the optimal and nearly optimal solutions, respectively. The aSB is prone to getting trapped in local minima owing to the error from continuous relaxation without inelastic walls, and SimCIM struggles to reach optimal solutions because of the numerous hyper-parameters involved.

Results and discussion

We carry out numerical experiments on three different datasets and use the success probability and the TTS or time-to-target (TTT) to evaluate the performance of the algorithms. We first briefly explain the spin-glass and Max-Cut problems and the TTS/TTT metrics, and then present the results of the numerical experiments.

Spin-glass and Max-Cut Problems

The spin-glass model describes Ising models in which the couplings between neighboring spins follow a Gaussian distribution. The spin-glass problem is to find the ground state of the Ising spin glass with the energy function of Eq. (1). In statistical physics and disordered systems, the Max-Cut problem is equivalent to minimizing the Hamiltonian of a spin-glass model36.

The Max-Cut problem is one of the most important combinatorial optimization problems. Consider an undirected graph G = (V, E) with ∣V∣ = N and edge weights wij = wji > 0 for all (i, j) ∈ E. The task is to partition the vertices V into two complementary sets so as to maximize the total weight of the edges connecting vertices in different subsets. The unweighted Max-Cut problem has all weights equal to +1, whereas the weighted Max-Cut problem allows arbitrary weights, continuous or negative. To formulate it as an Ising problem, we assign an Ising spin σi ∈ { − 1, 1} to each node of the graph to represent the two groups. The problem can then be written as

$${\arg\max}_{\sigma_i \in \{-1, 1\}} \, C= \sum\limits_{(i,j)\in E}w_{ij}(1-\sigma_i\sigma_j) \\ = \frac{1}{2}\sum\limits_{ij}^NJ_{ij}\sigma_i\sigma_j +\sum\limits_{(i,j)\in E}w_{ij}$$
(2)

where Jij = − wij. Note that minimizing the Hamiltonian H in Eq. (1) with zero external field is equivalent to maximizing the first term of Eq. (2), while the second term ∑(i,j)∈E wij is a constant. It follows that the Max-Cut problem (2) maps naturally onto the Ising problem (1), which is why we use it for benchmarking.
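The mapping can be made explicit with a few lines of code. The sketch below (an illustration using NetworkX and NumPy; the toy triangle graph and function names are hypothetical) builds J with Jij = −wij and checks the identity between the cut objective of Eq. (2) and the Ising term.

```python
import numpy as np
import networkx as nx

def graph_to_coupling(G):
    """Map a weighted graph (nodes labeled 0..N-1) to the Ising couplings J_ij = -w_ij."""
    N = G.number_of_nodes()
    J = np.zeros((N, N))
    for i, j, w in G.edges(data="weight", default=1.0):
        J[i, j] = J[j, i] = -w
    return J

def cut_objective(G, sigma):
    """C = sum_{(i,j) in E} w_ij (1 - sigma_i sigma_j), as written in Eq. (2)."""
    return sum(w * (1 - sigma[i] * sigma[j])
               for i, j, w in G.edges(data="weight", default=1.0))

G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 1.0), (1, 2, 2.0), (0, 2, 1.0)])
J = graph_to_coupling(G)
sigma = np.array([1, -1, 1])
total_w = sum(w for _, _, w in G.edges(data="weight", default=1.0))
# Identity from Eq. (2): C = 1/2 * sigma^T J sigma + sum_E w_ij.
assert np.isclose(cut_objective(G, sigma), 0.5 * sigma @ J @ sigma + total_w)
```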

The evaluation metrics

All the QAIAs are heuristic algorithms. To assess these methods extensively, we sample 100 times on each graph and compute the fraction of samples achieving the optimal solution, referred to as the success probability P. Meanwhile, TTS and TTT are used to assess the computation speed of the methods19. TTT measures the time required to obtain the reference outcome at least once with a probability Q, conventionally set to 0.99. When a probabilistic solver is run for a time Ts, the probability of yielding the required solution is P(Ts). For k trials, the probability of obtaining at least one correct outcome is given by

$$Q=1-{(1-P({T}_{s}))}^{k},$$
(3)

then the number of runs needed to achieve the correct outcome with probability Q (= 0.99) is \(k=\frac{\log (1-0.99)}{\log (1-P({T}_{s}))}\), so that

$$\,{{\mbox{TTT}}}\,=\left\{\begin{array}{l}{T}_{s}\frac{\log (1-0.99)}{\log (1-P({T}_{s}))},\quad P({T}_{s}) \, < \, 0.99;\quad \\ {T}_{s},\quad P({T}_{s})\ge 0.99,\quad \end{array}\right.$$
(4)

where Ts is the time to run the algorithm once. For TTS, P(Ts) is set to the success probability P, while for TTT the reference target is usually set to 99% of the exact or best-known value. In addition, we compute the median and maximum of the ratio between the cut value of a sample and the optimal one, denoted Rmedian and \({R}_{\max }\).
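The TTS/TTT computation of Eq. (4) amounts to the following small helper; the function name and the handling of P = 0 are our own choices.

```python
import numpy as np

def time_to_target(T_s, P, Q=0.99):
    """Eq. (4): expected wall time to reach the target at least once with probability Q.

    T_s : time of a single run of the solver
    P   : per-run probability of reaching the target (the success probability for TTS,
          or the probability of reaching 99% of the best-known value for TTT)
    """
    if P >= Q:
        return T_s
    if P <= 0.0:
        return np.inf          # the target was never reached in the sampled trials
    return T_s * np.log(1 - Q) / np.log(1 - P)

# Example: a 10 ms run that succeeds 5% of the time needs ~90 repetitions (~0.9 s).
print(time_to_target(0.01, 0.05))
```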

Experiment I

In the first experiment, we benchmark QAIA on small regular graphs with problem sizes N ranging from 10 to 500. We aim to observe how the performance of QAIA on regular graphs evolves as the problem size increases, and therefore set the edge weights to { − 1, 1}. The problems fall into four categories according to edge weight and density: sparse and dense Max-Cut instances, and sparse and dense spin-glass instances. For each problem we generate 10 instances per size to assess the average performance. The instances are generated with NetworkX37, a Python package, and the optimal cut values of the graphs have been computed in advance with the Biq Max solver38, an exact method employing semidefinite programming within a branch-and-bound algorithm.
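As an illustration of how such instances can be generated (a sketch only; the seed, the {−1, 1} weight assignment and the exact NetworkX calls are our own choices, not necessarily those used for the published instances):

```python
import numpy as np
import networkx as nx

def random_regular_instance(N, d=3, seed=0):
    """Generate a d-regular graph on N nodes with edge weights drawn uniformly from {-1, +1}."""
    rng = np.random.default_rng(seed)
    G = nx.random_regular_graph(d, N, seed=seed)
    for i, j in G.edges():
        G[i][j]["weight"] = int(rng.choice([-1, 1]))
    return G

G = random_regular_instance(100)
print(G.number_of_nodes(), G.number_of_edges())  # 100 nodes, 150 edges
```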

To increase the reliability of the experimental results, the metrics are evaluated over 10,000 trials of each QAIA on each instance. For sparse instances, the algorithms perform markedly differently. The bSB exhibits an orders-of-magnitude advantage in success probability and TTS over the other algorithms and remains the most robust as the problem size increases, for both Max-Cut instances (Fig. 1) and spin-glass instances (Fig. 2). SimCIM, aSB and dSB perform similarly: dSB obtains slightly higher success probability and robustness, while SimCIM achieves a lower TTS. NMFA is inferior to them in this experiment. For dense graphs, the performance of the QAIAs tends to be similar, especially on the Max-Cut instances, but bSB still performs best while aSB is the worst (Fig. 3).

Fig. 1: Comparison results of QAIA in 3-regular Max-Cut problems with positive weights on GPU.

(a) Success probability, (b) time-to-solution (TTS), (c) median Rmedian and (d) maximum \({R}_{\max }\) of the ratio between the sample’s cut value and the optimal cut value. The algorithms run for 10,000 trials with 1,000 iterations on each instance. The solid curves in (a) and (b) are obtained by fitting the corresponding data points for each algorithm to show the tendency of the metrics as the problem size increases.

Fig. 2: Comparison results of quantum-annealing-inspired-algorithms in 3-regular spin-glass problems on GPU.

(a) Success probability, (b) time-to-solution (TTS), (c) median Rmedian and (d) maximum \({R}_{\max }\) of the ratio between the sample’s cut value and the optimal cut value. The algorithms run for 10,000 trials with 1,000 iterations on each instance. The solid curves in (a) and (b) are obtained by fitting the corresponding data points for each algorithm to show the tendency of the metrics as the problem size increases.

Fig. 3: Comparison results of quantum-annealing-inspired-algorithms in dense graphs.

(a), (b) Spin-glass problems; (c), (d) Max-Cut problems. The solid curves are obtained by fitting the corresponding data points for each algorithm to show the tendency of the metrics as the problem size increases. Since the optimal solution for the dense instances cannot be obtained in a reasonable time, we show the time-to-target (TTT) rather than the time-to-solution (TTS).

Experiment II

To evaluate the performance of the methods on large-scale problems, the second experiment is conducted on the random Max-Cut instances G22 and G81 and the skewed instances G39-G42 from Gset, as well as the fully connected instance K2000, with best-known cuts provided by the breakout local search (BLS) algorithm39. G22, G39-G42 and K2000 are all 2000-spin instances, whereas the problem size of G81 is ten times larger. G22 has uniform vertex degree; the skewed graphs have a long-tailed vertex-degree distribution, and the authors of the error-corrected CIMs claim that their methods perform better on this type of graph24. K2000 is an SK spin-glass problem. We run each algorithm for 1000 trials with different numbers of time steps Nstep (Nstep denotes the final number of time steps, not an intermediate value). In Fig. 4, the variants of SimCIM and SB perform much better than the original SimCIM and aSB. The dSB achieves the best performance, especially on large dense graphs. Equipped with discretization, dSB not only converges fastest but also has a stronger search capability, so it is more likely to find the optimal or a nearly optimal solution even for the extremely large graph G81; moreover, dSB already performs well for small Nstep, and its cut values keep improving as Nstep increases. Among the error-corrected CIMs, CAC and CFC achieve the highest success probability and the highest probability of reaching 99% of the best-known solution on large skewed and random graphs, respectively. However, we found that the error-corrected CIMs, especially CAC and SFC, are sensitive to their parameters and cannot obtain reasonable results when the parameter values are chosen poorly. SimCIM has many parameters (pump-loss factor, learning rate, noise factor, etc.) that must be tuned to reach a satisfactory solution. By contrast, SB and its variants are easy to operate, since most of their parameters can be fixed according to the literature29. The aSB and NMFA trail in last place.

Fig. 4: Comparison between quantum-annealing-inspired-algorithms for Gset and a fully-connected K2000 graph.

a For G22, b for G39, c for K2000, d for G81. The polygons show the top-1% cut value for the different algorithms. The present best-known cuts for Gset (black dashed line) are given by the Breakout Local Search (BLS) algorithm, and the best cut for K2000 is computed by SA. The results for G40-G42 are shown in Supplementary Fig. 3.

In addition, we compare the convergence of the Ising energy during the evolution of the QAIAs, NMFA and SA on a sparse graph (G22) and a dense graph (K2000), each on one CPU core. The SA implementation follows ref. 40 and is specifically designed for Max-Cut problems; it is extremely fast compared with standard SA and is >10 times faster than the Python packages provided in refs. 41 and 42. The dSB achieves the optimal solutions on both types of graphs, starting to converge at around 0.03 s and 0.1 s respectively, whereas SA needs to spend >10 times as long to obtain a satisfactory result on these 2000-node Ising graphs (Fig. 5). It is worth noting that NMFA performs similarly to SA on K2000 owing to the normalization applied in each iteration.

Fig. 5: Evolution of Ising energy.

a Sparse graph G22. b Dense graph K2000. Both the quantum-annealing-inspired algorithms and simulated annealing (SA) run on one CPU core. The solid lines indicate the average energy, while the dash-dotted lines represent the maximum and minimum cases within the 100 trials.

Experiment III

The last experiment is conducted on the chimera and pegasus graphs of actual D-Wave devices11,43, allowing a comparison of QAIA against QA with the highest available computing capability. For chimera graphs, the ground-state energy of the spin-glass instances is provided by the Tropical Tensor Network (TTN)44. We compare QAIA with other physics-based algorithms, QA, and several exact solvers from the published literature13,45. Figure 6 shows that bSB achieves the highest probability of reaching 99% of the best solution on the chimera graphs. Regarding the success probability, which is not shown in the histogram, all the heuristic algorithms can find the optimal solution on the 4 × 4 × 8 chimera graph, and bSB achieves the highest success probability, nearly 70%. On the larger chimera system, dSB and SimCIM stand out among the QAIAs for efficiently identifying ground-state configurations with extremely few iterations. We then choose the QAIAs with the best success probability on the chimera graphs (bSB for the 4 × 4 × 8 chimera and dSB for the 8 × 8 × 8 chimera) and compare them with D-Wave as well as several other solvers, including TTN44, brute-force search46 and exact belief propagation using bucket sort13. It should be noted that the D-Wave Leap cloud service is not available in our location, so we can only take the QA results for the 4 × 4 × 8 chimera from the published literature. On the CPU, bSB is much faster than TTN and slightly slower than D-Wave; when running on the GPU, however, it far surpasses D-Wave. Meanwhile, dSB obtains the ground-state energy of the 8 × 8 × 8 chimera much faster than any of the other exact solvers listed in the table, on both CPU and GPU (Table 1).

Fig. 6: The energy distribution of the samples of stochastic quantum-inspired algorithms.

a For the 4×4×8 chimera, b for the 8×8×8 chimera. Each algorithm runs for 100 trials with 1000 iterations and is compared with the result of the exact solver Tropical Tensor Network (TTN) on the spin-glass instances. The histograms classify the results of all the algorithms into several intervals. Within each interval, the bins are ordered by the number of samples of the corresponding algorithm. The first interval in the histograms counts the number of samples with an energy within 99% of the ground-state energy.

Table 1 Benchmarking results of energy of spin-glass on the chimera graphs

For theoretical pegasus graphs (also referred to as full pegasus graphs), we compare the Advantage system with the QAIAs. We set up the experiment to be as consistent as possible with ref. 47 so that their performance can be compared fairly. First, we use the procedure outlined in ref. 48 to extract the pegasus graph, which consists of 5640 qubits and 40,484 couplers. We then generate subgraphs of it by varying the number of nodes from 564 to 5640, and configure the experiment with 100 samples and 1000 iterations per instance for each QAIA. The global optima of these subgraphs are computed with the Biq Max solver. The aSB, bSB, CAC and CFC find the optimal solution with 100% probability for problem sizes from 564 to 2820. For larger pegasus graphs, dSB demonstrates superior performance over the other QAIAs. Advantage only reaches more than 99% of the optimal value, but not the optimum itself, on pegasus graphs with the real connectivity, which are easier for the Advantage system than the theoretical pegasus graphs47, whereas dSB, CAC and CFC reach the optimal solution across all problem sizes (Table 2). It should be noted that the QAIAs achieve an execution time of ~0.2 s on an NVIDIA Tesla A100 GPU across all instances, which matches the total time taken by Advantage (the annealing time of the D-Wave Advantage system is 2000 μs). We therefore conclude that QAIA outperforms Advantage on pegasus graphs.

Table 2 Performance of quantum-annealing-inspired-algorithms on the pegasus-like graphs

Conclusions

In this work, we benchmarked quantum-inspired algorithms across a range of graph types for solving combinatorial optimization problems, and compared them with several physics-based algorithms as well as the D-Wave quantum annealer.

We summarize the benchmarking results in Table 3. On chimera graphs, bSB not only excels in comparison with the other QAIAs but also delivers an impressive nearly 50-fold reduction in TTS compared with D-Wave. For pegasus graphs, CAC demonstrates superior performance over the others in general; additionally, CFC and dSB are able to find optimal solutions across a wide range of problem sizes, while the Advantage system of D-Wave fails to do so.

Table 3 Performance ranking of quantum-annealing-inspired-algorithms

For small graphs, bSB consistently demonstrates superior performance; even as the problem size grows, its success probability and TTS change only slightly. For large random, skewed and dense graphs, CAC, CFC and dSB respectively achieve the highest success probabilities, and dSB achieves the lowest TTT.

In the context of solving optimization problems, the choice of solver depends on the characteristics of the graphs involved. For chimera and small graphs, bSB emerges as the ideal option, whereas for larger graphs dSB proves to be the most effective solver. However, given sufficient computational resources and time, CAC and SFC may outperform dSB on large random and skewed graphs, as they offer a higher likelihood of obtaining optimal or near-optimal solutions. The original QAIAs, aSB and SimCIM, are susceptible to getting stuck in local minima49. Nevertheless, when equipped with inelastic walls, discretization or error-correction techniques, the QAIAs exhibit a remarkable improvement in performance. Notably, QAIA excels at rapidly finding optimal solutions, surpassing the D-Wave annealer and the other conventional algorithms mentioned in this paper. As a result, QAIA serves as a valuable baseline for competing quantum algorithms.

Methods

In this paper, we benchmark two categories of quantum-annealing-inspired methods, simulated CIM and SB, and compare them with other physics-based algorithms, including NMFA and TTN, on spin-glass and Max-Cut problems.

Quantum annealing inspired algorithms

Simulated coherent Ising machine

In general, every iteration of SimCIM simulates one round trip of the optical pulses through the fiber loop of the CIM. The operation of the CIM can be modeled by c-number stochastic differential equations that characterize each pulse by its complex amplitude17. These stochastic differential equations describe the optical squeezing, linear and nonlinear loss, mutual coupling of the optical pulses, and noise of the CIM. To simplify the computation, SimCIM drops the nonlinear term and the imaginary part of the amplitude, and then updates the spin variables \({{{{{{\boldsymbol{x}}}}}}}={\{{x}_{i}\}}_{i = 1}^{N}\) in a continuous style as follows.

$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}=\left(v{x}_{i}+\zeta \sum\limits_{j}{J}_{ij}{x}_{j}\right)+\sigma {f}_{i},$$
(5)

where ζ is the coupling strength, v denotes the combination of the parametric gain and the linear loss coefficient, and fi is Gaussian noise. During the optimization, v gradually increases to zero so that the final objective function is equivalent to the Hamiltonian of the Ising problem. This optimization problem is solved via gradient descent with momentum.
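A minimal discretized sketch of the SimCIM update in Eq. (5) is shown below; the Euler step size, the momentum coefficient, the amplitude clipping and the default parameter values are illustrative assumptions rather than the tuned settings used in the benchmark.

```python
import numpy as np

def simcim_step(x, m, J, v, zeta=0.1, sigma=0.03, dt=0.1, beta=0.9):
    """One Euler step of Eq. (5) with heavy-ball momentum (m is the running velocity).

    v is the pump-loss factor; the caller ramps it up to zero during the annealing.
    """
    drift = v * x + zeta * (J @ x)             # deterministic term of Eq. (5)
    noise = sigma * np.random.randn(len(x))    # Gaussian noise f_i
    m = beta * m + (drift + noise) * dt        # momentum accumulation
    x = np.clip(x + m, -1.0, 1.0)              # illustrative bound on the amplitudes
    return x, m

# Spins are read out at the end of the anneal as sign(x_i).
```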

The standard CIM suffers from amplitude heterogeneity, which leads to an improper mapping of the energy function. To overcome this, an auxiliary variable ei (also called the error variable) is introduced for error detection and correction. The time evolution of the spin variable and the error variable can be described as follows:

$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}=-{x}_{i}^{3}+(p-1){x}_{i}+{e}_{i}\sum\limits_{j}\zeta {J}_{ij}{x}_{j},$$
(6)
$$\frac{{{{{{{\rm{d}}}}}}}{e}_{i}}{{{{{{{\rm{d}}}}}}}t}=-\beta {e}_{i}\left({x}_{i}^{2}-\alpha \right),$$
(7)

where p, α and β are the gain parameter, the target amplitude and the rate of change of the error variables, respectively. The introduction of the error variables makes the system exhibit chaotic dynamics that successively explore configurations close to the ground state, thereby accelerating the solution of the Ising problem. This system is referred to as CAC.

Another version of simulated CIM, CFC, is quite similar to CAC; the only difference is that the time evolution of the error variable is controlled by the feedback signal zi instead of the amplitude xi, as follows:

$${z}_{i}=-{e}_{i}\sum\limits_{j}\zeta {J}_{ij}{x}_{j},$$
(8)
$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}=-{x}_{i}^{3}+(p-1){x}_{i}-{z}_{i},$$
(9)
$$\frac{{{{{{{\rm{d}}}}}}}{e}_{i}}{{{{{{{\rm{d}}}}}}}t}=-\beta {e}_{i}\left({z}_{i}^{2}-\alpha \right).$$
(10)

Unlike CAC and CFC, SFC separates the error variable and the mutual coupling into two linear terms rather than combining them in a nonlinear term:

$${z}_{i}=-\sum\limits_{j}\zeta {J}_{ij}{x}_{j},$$
(11)
$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}=-{x}_{i}^{3}+(p-1){x}_{i}-\tanh (c{z}_{i})-k({z}_{i}-{e}_{i}),$$
(12)
$$\frac{{{{{{{\rm{d}}}}}}}{e}_{i}}{{{{{{{\rm{d}}}}}}}t}=-\beta ({e}_{i}-{z}_{i}),$$
(13)

where p, k, c and β are system parameters. The tanh function overcomes the problem of amplitude heterogeneity through the tuning of the parameter c, and local-minima traps are destabilized by the difference between the feedback signal and the error variable. The potential landscape of all the error-corrected CIMs closely resembles that of aSB (Supplementary Fig. 1). However, with the help of error correction, these variants converge faster than the original SB and SimCIM (Supplementary Fig. 2).
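For reference, the three error-corrected updates of Eqs. (6)-(13) can be written side by side as explicit Euler steps; the time step and any default values are placeholders, not the tuned parameters of ref. 24.

```python
import numpy as np

def cac_step(x, e, J, p, zeta, alpha, beta, dt=0.05):
    """Chaotic amplitude control, Eqs. (6)-(7)."""
    dx = -x**3 + (p - 1) * x + e * (zeta * (J @ x))
    de = -beta * e * (x**2 - alpha)
    return x + dt * dx, e + dt * de

def cfc_step(x, e, J, p, zeta, alpha, beta, dt=0.05):
    """Chaotic feedback control, Eqs. (8)-(10): the feedback signal z drives e."""
    z = -e * (zeta * (J @ x))
    dx = -x**3 + (p - 1) * x - z
    de = -beta * e * (z**2 - alpha)
    return x + dt * dx, e + dt * de

def sfc_step(x, e, J, p, zeta, c, k, beta, dt=0.05):
    """Separated feedback control, Eqs. (11)-(13)."""
    z = -zeta * (J @ x)
    dx = -x**3 + (p - 1) * x - np.tanh(c * z) - k * (z - e)
    de = -beta * (e - z)
    return x + dt * dx, e + dt * de
```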

Simulated Bifurcation (SB)

The QbM is designed to solve the Ising problem by mimicking Kerr-nonlinear oscillators, which generate a quantum superposition of two oscillation states50,51,52,53,54. To simulate a large-scale QbM efficiently on present digital computers, aSB is formulated as a classical mechanical Hamiltonian system with the following equations of motion.

$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}={a}_{0}{y}_{i},$$
(14)
$$\frac{{{{{{{\rm{d}}}}}}}{y}_{i}}{{{{{{{\rm{d}}}}}}}t}=-\left({x}_{i}^{2}+{a}_{0}-a(t)\right){x}_{i}+{c}_{0}\sum\limits_{j=1}^{N}{J}_{ij}{x}_{j},$$
(15)

where xi and yi denote the position and momentum of the ith Kerr-nonlinear parametric oscillator, a0 is the positive detuning frequency, a(t) is the time-dependent pump amplitude increasing from zero, c0 denotes the coupling strength, and Jij are the coupling coefficients of the Ising problem without the external magnetic field in Eq. (1).

In bSB, perfectly inelastic walls at xi = ±1 are introduced: xi is replaced by \({{{{{{\bf{sgn}}}}}}}({x}_{i})=\pm 1\) and yi is set to 0 whenever ∣xi∣ > 1 in each iteration. These walls force the positions to be exactly +1 or −1 when a(t) becomes sufficiently large. Moreover, the fourth-order term in the aSB potential VaSB is dropped, because the inelastic walls play a role similar to that of the nonlinear potential walls. The equations of motion then become

$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}={a}_{0}{y}_{i},$$
(16)
$$\frac{{{{{{{\rm{d}}}}}}}{y}_{i}}{{{{{{{\rm{d}}}}}}}t}=-({a}_{0}-a(t)){x}_{i}+{c}_{0}\sum\limits_{j=1}^{N}{J}_{ij}{x}_{j},$$
(17)

In aSB with two spins, the origin is the unique local minimum when t is sufficiently small. As a(t) increases, the origin turns into a saddle point and two local minima appear near the origin after the first bifurcation. For bSB, the origin becomes a saddle and the two local minima appear at [ − 1, − 1] and [1, 1], so the convergence is accelerated (Supplementary Fig. 2).

To further mitigate the error of continuous relaxation, bSB can be reformulated as dSB, whose equations of motion are

$$\frac{{{{{{{\rm{d}}}}}}}{x}_{i}}{{{{{{{\rm{d}}}}}}}t}={a}_{0}{y}_{i},$$
(18)
$$\frac{{{{{{{\rm{d}}}}}}}{y}_{i}}{{{{{{{\rm{d}}}}}}}t}=-({a}_{0}-a(t)){x}_{i}+{c}_{0}\sum\limits_{j=1}^{N}{J}_{ij}{{{{{{\bf{sgn}}}}}}}({x}_{j}),$$
(19)

That is, \({\sum }_{j = 1}^{N}{J}_{ij}{x}_{i}{x}_{j}\) in Eq. (15) is discretized into \({\sum }_{j = 1}^{N}{J}_{ij}{x}_{i}\,{{\mbox{sgn}}}\,({x}_{j})\). In contrast to aSB and bSB, dSB searches the solution space over a wide range at the beginning of the iterations, which makes it converge faster and gives it the ability to jump out of local minima (Supplementary Fig. 2).
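A compact sketch of the bSB/dSB dynamics of Eqs. (16)-(19) is given below; the explicit update order, the time step and the linear pump schedule a(t) are assumptions in the spirit of ref. 29 rather than a verbatim reproduction of the published implementation, and the default c0 is a placeholder.

```python
import numpy as np

def sb_step(x, y, J, a_t, a0=1.0, c0=0.5, dt=0.5, discrete=False):
    """One update of ballistic SB (discrete=False) or discrete SB (discrete=True).

    Implements Eqs. (16)-(17) / (18)-(19): the mean-field term uses x_j for bSB
    and sgn(x_j) for dSB; the inelastic walls clamp |x_i| <= 1 and reset momentum.
    """
    feedback = np.sign(x) if discrete else x
    y = y + dt * (-(a0 - a_t) * x + c0 * (J @ feedback))
    x = x + dt * a0 * y
    wall = np.abs(x) > 1.0                 # perfectly inelastic walls at x = +/-1
    x[wall] = np.sign(x[wall])
    y[wall] = 0.0
    return x, y

def solve(J, n_steps=1000, seed=0, discrete=True):
    """Anneal from small random amplitudes and read out the final spins."""
    rng = np.random.default_rng(seed)
    N = J.shape[0]
    x = 0.02 * (rng.random(N) - 0.5)
    y = 0.02 * (rng.random(N) - 0.5)
    for t in range(n_steps):
        a_t = a0_schedule = t / n_steps    # pump amplitude ramped linearly from 0 to a0
        x, y = sb_step(x, y, J, a_t, discrete=discrete)
    return np.sign(x)
```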

Physics-based algorithms

NMFA is used for comparison in all the experiments owing to its excellent scalability, whereas TTN only provides the exact reference value in the last experiment.

Noisy Mean Field Annealing (NMFA)

The central step of the CIM, the combination of spin measurement and mean-field computation, is performed by a classical FPGA coprocessor55. In NMFA, the remaining optical portion of the CIM is also emulated on a classical computer. More specifically, the discrete spin values are replaced by continuous real numbers in the interval [ − 1, 1], and the Hamiltonian is minimized by mean-field annealing, with Gaussian noise added to escape from local minima. The spin update can then be formulated as follows.

$${\hat{x}}_{i}=\tanh \left[\left({\sum}_{j}{J}_{ij}{x}_{j}/\sqrt{\mathop{\sum}_{j}{J}_{ij}^{2}}+{{{{{{\mathcal{N}}}}}}}(0,\sigma )\right)/{T}_{t}\right],$$
(20)
$${x}_{i}=\alpha \hat{{x}_{i}}+(1-\alpha ){x}_{i}$$
(21)

where σ denotes the noise amplitude, Tt is the temperature, which gradually decreases throughout the annealing, and the parameter α acts like the momentum in gradient methods to accelerate convergence.
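A direct transcription of Eqs. (20)-(21) into code looks as follows; the default values of σ and α and the guard for isolated spins are our own illustrative choices.

```python
import numpy as np

def nmfa_step(x, J, T_t, sigma=0.15, alpha=0.15):
    """One NMFA update, Eqs. (20)-(21).

    The mean field of each spin is normalized by the root-sum-square of its couplings,
    perturbed by Gaussian noise, squashed by tanh at temperature T_t, and mixed into
    the current state with momentum-like weight alpha.
    """
    norm = np.sqrt((J**2).sum(axis=1))
    norm[norm == 0.0] = 1.0                       # avoid division by zero for isolated spins
    mean_field = (J @ x) / norm
    x_hat = np.tanh((mean_field + np.random.normal(0.0, sigma, size=x.shape)) / T_t)
    return alpha * x_hat + (1 - alpha) * x
```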

Tropical Tensor Network (TTN)

Equipped with the tropical algebra defined on the semiring \(({{{{{{\mathcal{R}}}}}}}\cup \{-\infty \},\oplus ,\odot )\)56, tensor network contraction can directly give the exact ground-state energy and entropy of the model at zero temperature, where the ⊕ and ⊙ operators are defined as

$$x\oplus y=\max (x,y),\qquad x\odot y=x+y.$$
(22)

During the contraction, ⊕ selects the optimal spin configuration and ⊙ sums the energies from subregions of the tensor network. Moreover, the contraction of the TTN can be performed in a differentiable way so that the ground-state configuration can be sampled57. Combining the tropical algebra with the usual algebra gives the ground-state degeneracy without enumerating the solutions58.
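As a toy illustration of the semiring in Eq. (22), the sketch below contracts two matrices in the max-plus (tropical) algebra; a real TTN computation of the ground-state energy additionally requires a good contraction order for the lattice tensors, which this fragment does not attempt.

```python
import numpy as np

def tropical_matmul(A, B):
    """(A ⊙ B)_ik = max_j (A_ij + B_jk): contraction in the max-plus semiring.

    The ordinary sum becomes max (selecting the optimal configuration) and the
    ordinary product becomes +, so contracting a network of such tensors tracks
    an extremum of the energy instead of a partition-function sum.
    """
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

A = np.array([[0.0, 1.0], [2.0, -1.0]])
B = np.array([[1.0, 0.0], [3.0, 2.0]])
print(tropical_matmul(A, B))   # [[4., 3.], [3., 2.]]
```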

Experimental setup

We conducted the three experiments using a single thread on one core of an Intel Xeon E5-2699 processor running at 2.20 GHz and on an NVIDIA Tesla V100 GPU with 32 GB of RAM, respectively. This allows a comparative analysis of the various algorithms in finding optimal solutions with the available computing resources.

Parameter settings

Each algorithm has several types of parameters. In every algorithm there is a parameter controlling the annealing process, such as the temperature T in NMFA, the pump amplitude a(t) in SB, and the pump-loss factor v in SimCIM. In NMFA and SB, this parameter is varied linearly. From the potential-energy landscape, we observe that there are four local minima in the landscape of SimCIM, only two of which correspond to minimizers of the Ising problem (Supplementary Fig. 1). Therefore, we use the hyperbolic tangent function rather than a linear function to increase the pump-loss factor, in order to avoid the point being trapped in a local minimum at the beginning of the iterations. For the same reason, the momentum of SimCIM is set to β = 0.9. In SB, the positive detuning frequency is set to a0 = 1. Setting \({c}_{0}=\frac{{a}_{0}}{{\lambda }_{\max }}\) (where \({\lambda }_{\max }\) is the maximum eigenvalue of the coupling matrix \({{{{{{\boldsymbol{J}}}}}}}={({J}_{ij})}_{n\times n}\)) can accelerate the iteration towards an approximate solution. For weighted Max-Cut and spin-glass problems, \({\lambda }_{\max }\) is estimated as \(2\sqrt{N}\sqrt{\frac{{\sum }_{ij}{J}_{ij}^{2}}{N(N-1)}}\) according to Wigner's semicircle law59,60, and for unweighted Max-Cut problems with Jij ∈ {0, 1} it is set to \({\lambda }_{\max }={\max }_{i}{\sum }_{j}{J}_{ij}\)61. The parameters of the error-corrected CIMs are set according to ref. 24, and the remaining parameters of all the algorithms are obtained by grid search62; the parameter values for the different algorithms are listed in Supplementary Tables 1-3.
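The c0 setting described above can be written as a small helper; the function name and the unweighted flag are hypothetical, and the formulas simply transcribe the estimates quoted in the text.

```python
import numpy as np

def coupling_strength(J, a0=1.0, unweighted=False):
    """c0 = a0 / lambda_max, with lambda_max estimated as described in the text.

    Weighted case : Wigner-semicircle estimate 2*sqrt(N)*sqrt(sum_ij J_ij^2 / (N*(N-1))).
    Unweighted    : lambda_max = max_i sum_j J_ij for J_ij in {0, 1}.
    """
    N = J.shape[0]
    if unweighted:
        lam_max = np.max(J.sum(axis=1))
    else:
        lam_max = 2.0 * np.sqrt(N) * np.sqrt((J**2).sum() / (N * (N - 1)))
    return a0 / lam_max
```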