An integrated coupled oscillator network to solve optimization problems

Graber, Markus; Hofmann, Klaus

doi:10.1038/s44172-024-00261-w

An integrated coupled oscillator network to solve optimization problems

Article
Open access
Published: 23 August 2024

Volume 3, article number 116, (2024)
Cite this article

Download PDF

You have full access to this open access article

Communications Engineering

An integrated coupled oscillator network to solve optimization problems

Download PDF

378 Accesses
1 Altmetric
Explore all metrics

Abstract

Solving combinatorial optimization problems is essential in scientific, technological, and engineering applications, but can be very time and energy-consuming using classical algorithms executed on digital processors. Oscillator-based Ising machines offer a promising alternative by exploiting the analog coupling between electrical oscillators to solve such optimization problems more efficiently. Here we present the design and the capabilities of our scalable approach to solve Ising and quadratic unconstrained binary optimization problems. This approach includes routable oscillator connections to simplify the time-consuming embedding of the problem into the oscillator network. Our manufactured silicon chip, featuring 1440 oscillators implemented in a 28 nm technology, demonstrates the ability to solve optimization problems in 950 ns while consuming typically 319 μW per node. A frequency, phase, and delay calibration ensures robustness against manufacturing variations. The system is evaluated with multiple sets of benchmark problems to analyze the sensitivity for parameters such as the coupling strength or frequency.

Oscillator-based optimization: design, emulation, and implementation

Article Open access 15 January 2024

A 1,968-node coupled ring oscillator circuit for combinatorial optimization problem solving

Article 02 May 2022

Analog Coupled Oscillator Based Weighted Ising Machine

Article Open access 15 October 2019

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Introduction

The scaling of semiconductor technologies to smaller, more advanced nodes to increase computing power and energy efficiency is getting more and more difficult with each semiconductor technology step. In addition, the demand for computing power in combination with improved energy efficiency is increasing, particularly due to advances in artificial intelligence training¹. A part of bridging the gap between desired and available computing power are Ising machines². Those specialized hardware implementations focus on solving optimization problems formulated as the Ising model, which are non-deterministic polynomial-time (NP)-hard and thus require intensive computation power to solve³. Researchers are exploring different physical implementations such as quantum computing⁴, optics^5,6, memristors⁷, and spintronics^8,9,10 to directly employ them for computation in the form of a highly efficient Ising machine. However, such systems are difficult to implement physically and are not yet ready for mass production compared to competing approaches using well-established complementary metal-oxide-semiconductor (CMOS) technologies. Besides specialized digital processors^11,12 the class of oscillator-based Ising machines (OIMs), which is usually implemented in CMOS as well, is very promising due to their high speed and energy efficiency. Systems capable of solving problems as fast as 50 ns at a low total power consumption of just 42 mW for a 1968 node system have already been reported¹³. Other systems with varying network topologies have been presented as well^{13,14,15,16,17,18,19}. The efficiency and effectiveness of oscillator-based computing are also highlighted by the closely related oscillatory neural networks, which are used for machine learning tasks^20,21,22.

Well-known examples of NP-complete optimization problems are the traveling salesman, the knapsack problem, the maximum cut, and the remaining of Karp’s 21 NP-complete problems²³. Those can be converted within polynomial time to the Ising model²⁴. A few exemplary applications for the Ising model are routing²⁵, image segmentation²⁶, and collaborative filtering²⁷. There are many more practical applications in finance, economics, traffic management, computer-aided design, and machine scheduling in a QUBO form²⁸, which can be transformed into the Ising model. The Ising model, which was initially introduced to describe the spins of ferromagnetic material²⁹, is defined as shown in Eq. (1). The Hamiltonian H, which is a function of the discrete spins σ_i ∈ {+1, −1} shall be minimized. ${J}_{ij} \,\in\, {\mathbb{R}},i\ne j$ are the coefficients for the interaction between spins, while h_i acts on just a single spin and gives a bias to the discrete state. In the following, we will refer to the term ∑_〈i〉h_i ⋅ σ_i as ‘bias’. The similar QUBO problem form is shown in Eq. (2), where f_Q with ${Q}_{ij}\in {\mathbb{R}}$ shall be minimized by the discrete vector x with binary variables x_i ∈ {0, 1}. Both models can be transformed into each other as shown by conversion between x_i and σ_i in Eq. (3), where a Q_ijx_ix_j QUBO term gets transformed into the Ising coefficients J_ij, h_i, and h_j. Consequently, we focus on the Ising model in the following since it is more commonly used in the field of OIMs.

$$H(\sigma )=-{\sum}_{\langle i,j\rangle }{J}_{ij}\cdot {\sigma }_{i}{\sigma }_{j}-{\sum}_{\langle i\rangle }{h}_{i}\cdot {\sigma }_{i}$$

(1)

$$f_{Q} (x) = {{{\mathbf{x}}}}^{\intercal} Q {{{\mathbf{x}}}} = \sum\limits_{\langle ij \rangle} Q_{ij} \cdot x_{i}x_{j}$$

(2)

$${x}_{i}=\frac{{\sigma }_{i}+1}{2}$$

(3)

The concept of computation using OIMs is outlined in Fig. 1a. An optimization problem from an application must first be converted into the Ising form. This problem, consisting of discrete variables and weights linking those, is then embedded into the physical oscillator network. Each discrete variable must be mapped to a physical oscillator, which represents the binary states of a discrete variable in its phase angle, being either in-phase or anti-phase (π phase shifted). Physical coupling circuits between the oscillators realize the weights of the optimization problem. After the successful embedding, the oscillators run and mutually influence their phases, which forms a solution to the optimization problem. Since phases are continuous while combinatorial optimization problems have only discrete states, a sub-harmonic injection locking (SHIL) is used before readout. The SHIL perturbs the oscillators and forces alignment of their phases in discrete groups. Details about the phase dynamics can be found in^30,31,32. The phases are read out to assign the discrete states of the corresponding variables σ_i based on their position in one of the two by π separated distinct groups. An exemplary problem showing the letter ‘G’, which was computed on the actual chip, is shown in Fig. 1b.

**Fig. 1: Overview of computing with our OIM chip.**

Unfortunately, the graph embedding of the problem graph into a hardware connectivity network can be NP-hard as well^33,34. Creating another NP-hard problem to solve the target NP-hard problem is one of the major drawbacks of such computing systems and can become a serious bottleneck of the overall computation. However, the embedding is a general issue of such hardware-based Ising machines with a sparse network, independent of the underlying physical devices. In general, it is sufficient to find just a single valid graph embedding into the available hardware network. Consequently, the hardware network should be designed to simplify the problem embedding to prevent it from becoming the bottleneck of the computation. While all-to-all network connectivity does not require time-consuming embedding, the number of connections $\frac{n}{2}(n-1)$ scales quadratically with the network size n, making it challenging for large networks. Besides the immense area, the coupling can get difficult. For example, the delay and capacitive load introduced by the numerous connections poses a serious challenge. Embedding approaches often combine multiple physical nodes to form a single discrete variable to increase the connectivity^33,35. However, this adds substantial overhead from the additionally needed physical nodes and their connections and can reduce the solution accuracy^15,35. Hence, we propose a configurable, routable network with strong local connectivity as a trade-off. The idea is, that most of the graph edges should be embedded into the local connectivity. The routing network is just used for the remaining edges, which cannot be embedded locally. While the first edges can usually be easily mapped to the hardware connectivity graph, it gets increasingly difficult to find valid mappings for the remaining edges the more nodes and edges are already mapped to a hardware component. By using the flexible routing connections on demand for these remaining challenging connections, the difficulty can be reduced. Hence, the addition of such routable channels simplifies the embedding and can increase the success rate compared to a local network³⁶.

A very similar configurable topology is the field programmable Ising machine (FPIM) proposed by Jagielski et al.³⁷. They map boolean satisfiability problems (SATs) to a hardware representation and use a modified open-source field-programmable gate array (FPGA) tool flow to embed such problems in the hardware. This flow enables them to precisely analyze the needed hardware connectivity to embed a given set of optimization problems.

Results

The contribution of this work is as follows: We present a scalable architecture of an OIM system designed for solving Ising and QUBO problems with high accuracy and discuss the analog computing circuitry as well as the routing concept. We show extensive benchmarks solving maximum cut problems of the prototype, which is illustrated in Fig. 1c. Further experimental measurements highlight the importance of critical parameters like the coupling strength, SHIL strength, and system frequency for a high solution accuracy. Furthermore, the routable connections are evaluated and the calibration methods to compensate for unavoidable manufacturing mismatch as well as varying operating conditions like voltage, process, and temperature are discussed. Although abstract oscillator simulation models are available^38,39,40,41, they might not consider all non-idealities of real circuits. A more precise transistor-level simulation of the whole network is not feasible because of the excessive simulation runtime. Hence, this work is an experimental approach to investigate and verify the operation of OIMs.

System

The connectivity of the system network is shown in Fig. 2a. Each oscillator has a local connectivity to 11 neighbors forming a fixed, sparse network. It connects to the horizontal, vertical, and diagonal neighbors in a 2-dimensional grid, but also to the neighbor of the horizontal and vertical neighbor. The configurable routing network (blue) with the Wilton-style switch blocks⁴² provides 4 connections per oscillator to any other. To implement the bias term h_i of the Ising model the oscillators can be coupled to a globally distributed phase reference (red). Our 4.6 mm² prototype manufactured in a 28 nm technology of a world-leading foundry contains a total of 1440 oscillators and 11,724 coupling elements. 2584 of these connections are freely routable across the whole chip, while the remaining have predefined local connectivity.

Node Implementation

The block diagram of an oscillator node representing a discrete variable σ_i is shown in Fig. 2b. The central element is the ring oscillator shown in Fig. 2c operating at a frequency of 100 MHz, which keeps signal propagation delays to neighboring oscillators still small compared to the oscillation period. Due to the necessary graph embedding and data transmission for the OIM-system configuration, we think that higher oscillation frequencies will have limited benefit for the overall computation. We use a differential delay cell design shown in Fig. 2d, which was empirically optimized for its high sensitivity to coupling currents. The small number of 4 delay cells provides a good sensitivity but still sustains oscillation at strong injected currents reliably. The differential architecture provides phase shifts of π by swapping both differential signals. This is used to implement in-phase (0-phase difference) and anti-phase (π-phase difference) coupling. Additionally, the higher power supply rejection of differential designs reduces unwanted coupling via the power supply. However, a differential to single-ended converter provides the oscillator signal for the all-digital phase-to-digital converter (PDC)⁴³. Although just a binary phase measurement would be sufficient for the Ising model, a 4-bit resolution is implemented to investigate the OIM computation in more detail. The sensitive oscillator waveform is buffered to avoid loading effects⁴⁴. The configurable coupler circuits convert the phase differences between the oscillators into a current, which is summed up and injected into the oscillator to change its phase. An all-transistor coupler design is proposed in this work to avoid large resistors or capacitors. As shown in Fig. 2e it consists of two differential pairs to modulate the injected current based on the phase difference. Swapping the input signals implements the sign of the weights by coupling in-phase or anti-phase. By setting the bias current via an individual 4-bit digital-to-analog converter (DAC) for each coupler, the weight coefficients of the optimization problem are set. The current summation circuit as shown in Fig. 2f accumulates the current of all couplers for injection into the oscillator, while its self-biasing stabilizes the common mode voltage level. In combination with the output buffer of the oscillator, which mitigates the capacitive loading caused by the connected couplers, several coupling connections can be simply added to achieve the desired connectivity. To implement the bias term h_i, we couple the oscillators to the already available reference clock, which adds just a small area overhead. The SHIL-injector gradually applies the perturbating SHIL signal to gently freeze the oscillator phase and therefore discretize the oscillator phases in the two spin groups. This ensures, that the coupled oscillators try to minimize the Ising Hamiltonian³⁰. Based on the experimental evaluation of our previous generation with adjustable ramps, we decided to ramp up the SHIL in 8 linear steps, which are increased after each nominal oscillation period.

Routing

The routing enables the connection of two arbitrary oscillators on the chip together if suitable routing tracks are not already occupied. When transmitting a signal through the routing network, it gets inevitably delayed by a few hundred picoseconds up to a few nanoseconds depending on the distance. This delay becomes particularly relevant at the oscillation period of 10 ns. The received time-delayed signal creates a perceived phase lag at the distant oscillator, that does not match the actual phase of the receiver. When coupled, the oscillators will react based on this perceived phase potentially causing a runaway situation and a frequency shift. To address this, we have introduced a delay compensation scheme, the design and simulation of which are discussed in our previous work⁴⁵. This method essentially transmits the oscillator signal of the previous stage, which can be seen as the oscillator waveform with a “negative” delay. An adjustable delayline supplements the propagation delay through the routing network to match that “negative” delay so that the received oscillator waveform is time synchronous with the actual transmitting oscillator. So, the perceived phase at the receiving oscillator matches the phase of the transmitter. However, the phase dynamics of the coupling are altered, because any response is perceived at the receiver with a full period delay. Because the propagation delay heavily depends on the traveled distance but also on temperature, process, and supply voltage, the optimal delayline setting is obtained using a calibration, which is discussed later in the calibration section.

Computing & benchmark setup

The overall system concept of our computing setup is shown in Fig. 2g and its physical realization is depicted in Fig. 2h. The OIM chip just requires external clocks and an FPGA, which could be in general integrated on the same die as well. The data transfer and generation of the timing signals are handled by that FPGA, which interfaces with a computer. Matlab scripts handle the test management, OIM configuration, timing scheduling, and analysis of the results. Four off-chip drivers allow us to investigate and verify the behavior of selected oscillators using an external oscilloscope, which is not needed for normal computing operation. External bias voltages are used to adjust the oscillator frequency, SHIL-injection strength, and global coupling strength. A single computation takes in total 950 ns. The first 200 ns are used to let the oscillators couple. Then the SHIL strength is ramped up to gradually force the coupled oscillators into the two phase groups, which takes 182 ns including a small margin for stabilization. Lastly, the phase measurement to obtain the computed result takes 568 ns. The energy consumption per computed problem varies between 404 nJ and 466 nJ across the individual problems of the later discussed ‘difficult’ benchmark set. A single oscillator including its periphery has a measured power consumption of 113.3 μW and a coupling connection needs approximately 23 μW.

Benchmarks

To evaluate the accuracy, we solve randomly generated benchmark problems and compare them against the best-known solution. Multiple network topologies of varying difficulty are tested, which are sketched together with the histograms in Fig. 3. The outcome of an OIM computation using our chip depends on the random initial states and might be affected by noise, so that the computed solution varies randomly when computing the same problem again. As shown in Fig. 3a, we solved each problem 500 times and noted the average (red) and best (blue) solution. The computed cut is normalized with the best-known solution, which was obtained using the commercially available software Gurobi⁴⁶. Each of the 500 runs per problem exhibits a unique solution, where at least 6.4% of states are different compared to every other solution of the same problem. When running the oscillators without any mutual interaction, only 0.18% of oscillators show a weak correlation between their neighbors. Furthermore, no auto-correlation between the previous phases of an oscillator can be detected after roughly 1200 ns.

The ‘simple’⁴⁷ set shown in Fig. 3b contains only connections to vertical and horizontal neighbors on a two-dimensional grid (0.27% of all-to-all connectivity). Since all weights of a maximum cut problem have a positive weight of 1, a checkerboard pattern is a maximum solution, with the cut being the sum of all edge weights. While such a problem might be simple for an iterative algorithm, the parallel nature of OIM systems does not benefit much from this structure. However, the optimum solution of a checkerboard pattern satisfies all coupling connections, which makes this problem less difficult for OIMs than the others. The average cut reaches more than 96.8% of the best-known solution. The ‘difficult’⁴⁸ set in Fig. 3c uses all available local connections (0.73% of all-to-all connectivity) while keeping the weights at positive 1. The denser local connectivity makes it more difficult to solve since there is no contention-free solution anymore, which satisfies all coupling connections. To emphasize the higher difficulty, computing the reference solution using the commercial software Gurobi⁴⁶ for the ‘difficult’ set took on average 10,000x longer than the ‘simple’ set. However, this is just an indication and no proper measurement of difficulty. As expected, the solution accuracy is worse with an average normalized cut of better than 94%. The ‘bias’⁴⁹ set in Fig. 3d adds the bias terms h_i to the structure of the ‘difficult’ problem set, which shows no accuracy degradation from the added bias term. While the previous problems have a fixed weight of +1, the ‘weighted’⁵⁰ set tests the same topology with the full 4-bit weight resolution in Fig. 3e. The performance is very similar to the ‘bias’ benchmark set. Interestingly the spread between individual problems is reduced. The parameters like the coupling strength were individually tuned for all four shown benchmark sets to achieve the best accuracy.

Experiments & optimization

Figure 4a illustrates the impact of the coupler strength, which can be globally scaled using an external bias voltage, on the computing performances by evaluating the ‘simple’, ‘difficult’, and ‘weighted’ benchmark sets at different coupling strengths. A distinct range for the coupling strength exists, which achieves the best performance but depends on the benchmark problem set respectively problem connectivity. Figure 4b graphically illustrates the impact of the coupling strength using the already known example from Fig. 1b. A weak coupling strength causes multiple local areas, which are solved in themselves correctly, but exhibit a distinct boundary to other such local areas. This effect is much less pronounced at higher coupling strengths, but still relevant around the best coupling strength. At higher coupling strengths, the forming of such local areas is less likely, but faulty spots appear, which couple in an undesired phase despite the clear coupling inputs from the neighbors. So, the coupling strength and SHIL can be thought of as the equivalent of an annealing schedule, which is frequently used in simulated annealing⁵¹. We suppose, that similar mechanisms considerably reduce the accuracy of the OIM for most problems. It should be emphasized, that the choice of coupling strength is very system and problem-specific. Therefore, we recommend that those parameters are not fixed by design, but remain adjustable during operation. Figure 4c shows a sweep over the reference clock frequency, where the SHIL frequency is exactly double that value. The best results are achieved for a frequency between 80 and 90 MHz, which is 10% to 20% lower than the actual oscillator free-running frequency. To assess the effectiveness of routable connections, which have slightly different phase dynamics due to their inevitable routing delay with compensation scheme⁴⁵, a comparison is provided in Fig. 4d. A reduced 10 × 10 network size ‘simple’ benchmark set was solved with routable connections and for comparison using local connections. While the local couplings find the best-known solution for all problems, the routing only for 52 out of the 100 problems. The spread between the problems widens as well, where the worst problem achieves just 85% using the routing network compared to 95% for the local routing.

**Fig. 4: Experimentally measured analysis of the coupling parameters.**

Calibration routines

The OIM system is a mixed signal design, where the analog components are responsible for the actual computation. The digital circuitry configures and supports the analog computing core. Unfortunately, random manufacturing mismatches and imperfections influence the sensitive analog circuitry. For example, some transistors might have a higher conductivity than others, causing differences in the frequencies. We employ an on-chip frequency calibration, a Matlab-assisted phase calibration, and a Matlab-assisted routing delay calibration to compensate for such manufacturing mismatch.

Frequency calibration

The frequency of the oscillators is approximately Gaussian distributed with a spread between 75.6 MHz and 119.1 MHz within a silicon die as shown in Fig. 5a. Since such free-running frequency differences cause phase offsets in the coupled state, which can affect the computation⁴⁴, a fine and coarse calibration adjusts the individual frequencies to the target 100 MHz. After calibration, the standard deviation of the oscillator frequency is reduced from 6.7 MHz to just 0.33 MHz, which improves the computing accuracy on average by approximately 0.7 percentage points. According to our previous chip generation, any frequency deviation less than approximately 1 MHz yields negligible performance gains.

**Fig. 5: Calibration principles used to counteract manufacturing imperfections.**

Reference phase calibration

While the weights between two discrete variables (J_ij ⋅ σ_iσ_j) just depend on the difference between two discrete states, the bias terms (h_i ⋅ σ_i) depend on just one state, which requires an absolute phase reference to define the actual states. The calibration principle is shown in Fig. 5b, which adjusts the external delayline ΔT to change the phase between the reference clock and the SHIL clock. The distribution of the reference coupled oscillators, with one half coupled in-phase and the other half anti-phase, is shown in Fig. 5c (blue). The distribution is only slightly influenced by the operating frequency and coupling strength, as well as supply voltage, temperature, and process. The phase in the SHIL locked states (red) is then adjusted using the delayline ΔT to match the reference coupled phase distribution (blue). Without calibration, the oscillators might not couple as intended by the bias term (h_i ⋅ σ_i) leading to poor accuracy. The phase threshold can be determined by finding the center of both reference phase (blue) groups. The phase angle of the +1 spin group can be clearly identified by the phase of the in-phase coupled oscillators.

Routing delay calibration

Any signals, that are transmitted through the routing network, experience a propagation delay. This is greatly affected by the spatial distance between the connected oscillators as well as supply voltage, temperature, and process variations. Unfortunately, the coupled oscillators are sensitive to that delay leading to changes in their frequency and phase. They either couple as intended or settle in the wrong, opposite phase (shifted by π). This observed behavior is a probability function depending on the delay instead of a sharp transition. The measurements in Fig. 5d show the impact of the routing network on the frequency and phase for a pair of coupled oscillators at three different distances. The delay is changed using a delayline with a 100 ps step size, which is implemented for the calibration scheme. The plot is simplified for better clarity by showing the behavior of the majority of oscillators instead of the actual probability density, which is the reason for the distinct transition. While the frequency shows a declining trend with rising delay, the oscillators couple in the wrong phases and increase their frequency by approximately 50% when exceeding a certain threshold. For calibration, we choose a delay setting leading to a coupled frequency closest to the free-running frequency, which still couples as intended with a probability greater than 70%. This delay value, which is separately obtained for all possible routing distances, is stored in a lookup table. Without the calibration (e.g., setting the delaylines to 0 for their minimum delay), the routing channels essentially fail to function, achieving a similar performance as a random number generator. However, the delay setting has a roughly 400 ps tolerance range, which exhibits very similar performance.

Comparison with Other Works

A comparison with other integrated CMOS OIMs is provided in Table 1. This work offers a versatile trade-off and introduces routable connections between oscillators. It has a competitive number of nodes and edges. Together with Lo et al.¹⁵, it is the only system to provide the bias term of the Ising model. However, our approach can be used for any network topology instead of requiring an all-to-all connected oscillator. While all-to-all connectivity is superior, they are limited to very small systems with just 30¹⁶ and 48¹⁵ nodes. The routing channels in this work add great flexibility to the network. In combination with the local connectivity, it is well-suited for large, sparse problems. In comparison to our predecessor¹⁹, we managed to maintain the high solution accuracy despite increasing the number of nodes and edges while reducing the power and area per node.

Table 1 Comparison of integrated CMOS OIMs including the predecessor as “Graber et al.¹⁹”

Full size table

Discussion

We presented a scalable approach for OIMs capable of solving optimization problems in Ising as well as QUBO formulation. The critical characteristic of such OIM systems is the network connectivity. For solving real-world optimization problems, the network should be as large as possible having tens of thousands of nodes with as many connections as possible. However, the quadratic increase of connections in all-to-all networks makes a circuit implementation unfeasible for large systems due to the immense area demand. Our approach unites these two opposing requirements by introducing flexible, routable connections between oscillators. This is a good trade-off for sparse optimization problems as it provides every possible connection between nodes while avoiding hardwiring every connection to save area. Additionally, the flexibility of the routing simplifies the computationally intensive embedding of the optimization problem into the OIM network, which can potentially exceed the computation time of the actual problem.

Our fabricated chip demonstrates the effectiveness of OIMs by solving max-cut problems within 950 ns while consuming an average power of 460.3 mW. The computed solutions reach on average at least 94% of the optimum and even more when selecting the best runs. The two major design challenges are area efficiency and random device variation from manufacturing. To be as compact as possible, we proposed a fully transistor-based coupler circuit with weights set by simple current DACs to avoid large, area-intensive resistors and capacitors. A time-multiplexing of the DACs, which occupy 12.8% of the core area, could further save area. To deal with process variation and mismatch of individual devices, we use calibration schemes and fine-tune parameters such as coupling strength and SHIL frequency. Our experiments emphasize that good performance is reached in just a narrow range of the coupling strength, which, however, shifts with the optimization problem topology. The oscillator frequency calibration increases the accuracy by approximately 0.7 percentage points. A software-based phase angle calibration ensures correct operation of the bias term ‘h_i’ of the Ising model, which is area-efficiently implemented utilizing the already distributed reference clock for the phase measurement. The calibration of the routing delay is crucial. Without it, the network is not usable and just as effective as a random number generator. Unfortunately, despite the compensation scheme including calibration, the routing connections exhibit lower performance due to the unavoidable propagation delay than native local connections between neighboring oscillators.

A calibration scheme for the coupling weights might further increase the accuracy of OIMs. Additionally, a digital post-processing of the measured phases could improve the performance as well. While our approach is well suited for large, sparse optimization problems, we think that large and dense problems might likely not be feasible for OIMs due to the high amount of needed oscillators and coupling links. Instead, hybrid approaches, which could partition a large graph into smaller sections to be solved by OIMs, might combine the advantages of OIMs with digital processors.

Methods

Chip design

The chip was designed in a 28 nm node of a world-leading semiconductor foundry. A 4.6 mm² silicon chip was manufactured using the mini@sic program of the Europractice IC service. The latest process design kit from the foundry was utilized, providing compact models for simulation. The design uses regular core devices operating at 0.9 V with different thresholds. Cadence Virtuoso version IC6.1.8 was used for schematic and layout entry. Cadence Spectre 21.1.0 was used for transistor-level circuit simulations. Siemens Calibre nmDRC, nmLVS, and PEX version v2020.4_15.9 are used for layout rule checking and parasitic extraction. The digital control logic was synthesized utilizing the foundry-provided standard cells with Synopsys Design Version R-2020.09-SP2 and Cadence Innovus 21.10 for the place-and-route. More details about the overall system implementation can be found in our previous work^45,52.

Delay-line calibration

The delayline calibration is explained in detail in our separate paper⁴⁵. For convenience, the principle is outlined in Fig. 6. As shown in Fig. 6a the internal waveforms of the oscillator are phase-shifted. So the output at ‘4’ is a delayed version of the previous stage at ‘3’. The output of this previous stage is transmitted through the routing network as shown in Fig. 6b. The unavoidable delay T_routing is supplemented with the adjustable delayline ΔT. So, the received phase at oscillator B (point ‘3’) can be aligned with the actual output of oscillator A (point ‘4’) to minimize the skew T_skew. Since T_skew cannot directly be measured and varies with the operating conditions, the discussed calibration is performed.

**Fig. 6: Principle of the used delay compensation scheme taken from our previous paper⁴⁵.**

Chip reconfigurability & external adjustments

Every local coupling connection on the chip can be configured independently. Every weight is set using a 4-bit DAC value and an additional bit to select between in-phase and anti-phase coupling. The same applies to the routing connections, which have additionally a 5-bit word for the delayline with approximately 100 ps steps. The coupler for the bias term h_i is similar, but has 5-bit resolution and has four times the maximum strength of the regular J_ij connections. Every switch of the switch block and the switches to connect the oscillator with the routing network can be individually enabled/disabled. The algorithm to configure these switches is mentioned below in section ‘Path Routing’.

External analog bias voltages are internally converted into bias currents using resistors. So, the coupling strength, the SHIL-strength, and the oscillator frequency can be externally set. Furthermore, the SHIL frequency can be set with the external phase-locked loop (PLL). The timing to control the operating states (enable oscillator couple, apply the SHIL signal, phase measurement, oscillator measurement, frequency calibration) is triggered by the external FPGA (Fig. 2g) on the test printed circuit board (PCB) (Fig. 2h).

Experimental results

The dies with 108 pads, whereof 67 pads are for power supply and bias voltages, were wire-bonded into a QFN-64 package and soldered on a pluggable carrier board. The presented results are from die number 1, the first one we have soldered on a carrier board without any selection. The benchmarks were conducted with the other 9 chips as well. The test PCB contains all circuitry for the experiments, just an external power supply was used additionally. A Matlab script coordinates the testing by selecting optimization problems, generating the configuration bitfiles for loading the problem into the OIM chip, and configuring the test PCB. The interface between the computer and FPGA limits the maximum data throughput to 12 Mbit in our experiments. Therefore, the OIM system is disabled during the configuration data transfer from the computer to the chip. During transmission of the measured phase data, the oscillators are kept frozen in SHIL-lock and stay in the same phase for sufficient SHIL strengths, which was validated regularly. We have no control over the initial phases at the start of the computation. All experiments were conducted at room temperature between 20 and 25 °C. The presented calibration data is averaged over 500 samples per oscillator. The routing frequency-delay behavior in Fig. 5d is the averaged behavior of 34 oscillator pairs sampled 500 times each.

G-Letter example problem

The example of the ‘G’ is generated from a black-and-white image of the letter G with identical resolution as the OIM network (40 × 36), so that each pixel corresponds to one oscillator. It is iterated through all pixels in the image and the coupling connections are set based on the color of the direct horizontal, vertical, and diagonal pixel neighbors. If the color of the neighbor pixel is identical, a connection between the corresponding oscillators to couple in-phase is added. If the color of the corresponding neighbor pixel is the opposite, then an anti-phase coupling connection is inserted. Overall, the underlying problem is of a king’s graph structure with all connections enabled, where the sign is determined by the color of the root image. By construction, all oscillators can couple in phases as intended, which is the optimum solution.

Phase Threshold

The threshold to distinguish between the spins σ_i = +1 and σ_i = − 1 state for all shown results was determined by testing all 16 possible thresholds (resolution of the phase angle by the 4-bit PDC) and selecting the threshold with the best Ising Hamiltonian. We have chosen this brute-force method because it ensures a repeatable and reliable calculation of the Ising Hamiltonian independent of any prior calibration or results. The test PCB generates the SHIL clock f_shil and reference clock f_ref from the same clock signal (see Fig. 2g) using Analog Devices ADF4351 PLLs. Since the internal oscillator feedback loop within the PLL is used for better jitter performance, we do not have control of the phase between f_shil and f_ref. Any time the PLL setting is changed, that phase difference and hence the phase threshold changes. For non-experimental usage, one could either determine the threshold just once based on the phase values or use the reference calibration.

Calibrations

The on-chip frequency calibration is conducted at every start of our test system because the previous calibration is lost after a power cycle. The calibration mainly compensates for random manufacturing mismatch and stays similar between calibration cycles. The phase calibration is conducted once at the start-up of the system and when the operating parameters are notably changed. This is the case, if either the PLL configurations are altered or the SHIL strength/coupling strength is considerably changed. The delay settings of the routable connections were calibrated once and the necessary delays depending on the length stored in a lookup table. This table was then used to get the delay settings in all following experiments.

Benchmarks

The benchmark problems were randomly generated by a custom Matlab script tailored for our OIM chip network. To ensure that the generated problems follow the intended network topology, a list with all allowed and physically available edges is generated first. Then edges are randomly deleted until the desired network density is reached, which is randomly selected between 30% and 70% of all edges of that chosen topology (‘simple’: 841 to 1963 edges, ‘difficult’: 2285 to 5331 edges, ‘bias’: 2717 to 6339 edges, ‘weighted’: as difficult). No checks for the existence of isolated sub-graphs were applied. The edge weights (if applicable) were randomly chosen from a uniform distribution. We interpreted the problem as a maximum cut similar to other works, which closely follows the Ising Model. We extended the maximum cut to include a bias term, which can be interpreted as an edge from a node to a fixed set. The edge weights of the maximum cut problem are limited to positive only values (transformed into negative J_ij coefficients of the Ising model) for better comparability of the obtained cut. Consequently, the obtained cut is limited to the range between 0 and the best-known solution, which we normalize to 1. The sign of the weights in the maximum cut is simply inverted for transformation into the Ising model, so the OIM operates with just anti-phase coupling. Due to the differential design, where the in-phase and anti-phase coupling is achieved by just swapping the inputs to the coupler circuits in its multiplexer, the general coupling behavior stays identical. The best-known solution for each generated problem was obtained using the commercial tool Gurobi⁴⁶. Gurobi version 10.0.3 was executed on an AMD Threadripper Pro 5965WX 24-core processor and 256 GB RAM without any time restriction, which took up to 4.67 hours per problem. The solution was guaranteed to be the optimal solution.

Path routing

The connections through the routing network were configured using Matlab. It is based on a graph representation of the available hardware network, which includes every physical track and switch of the switch block. The ‘shortestpath’ function in Matlab, which internally uses the Dijkstra algorithm, calculates a path to connect the two oscillators. The returned path includes all physical tracks and switches of the switch blocks. These used elements are then marked as occupied so that they are avoided for the following paths. The obtained path is checked to be physically possible and verified to not contain any already occupied tracks. All routing connections of an optimization problem are applied in a sequential order, where the path for each connection only considers the occupied tracks from previous connections. For the routing problems in Fig. 4d, this approach connects the oscillators in the shortest, direct way.

Power measurement

The test PCB (Fig. 2h) contains onboard current shunts for every supply and bias voltage rail, which are placed within the feedback loop of the onboard voltage regulator to compensate for the voltage drop over those shunts. A Keysight DAQ970A sequentially scans the voltage drop over the shunt resistors of the bias rails, analog supply, and digital supply. The separate supply rails for the I/O-ring and off-chip driver are excluded from the measurement. The operating state of the coupled as well as SHIL-locked oscillators were extended for multiple seconds to precisely measure the (average) power consumption. The power consumption during the SHIL ramping and phase measurement could not be directly measured due to their short duration but was approximated by the power consumption under SHIL lock. Additionally, the time and energy for writing the configuration data in the chip and sending the phase data out of the chip are omitted. The power consumption varies substantially with the optimization problem since they influence the amount of enabled oscillators and couplers. Thus, we recorded the power consumption for each problem of the ‘difficult’ set. Additionally, the total power consumption depends on the frequency and the global coupling strength, where the optimal parameters for the optimization problem were used.

Data availability

The used benchmark problems ‘simple’⁴⁷, ‘difficult’⁴⁸, ‘weighted’⁴⁹, and ‘bias’⁵⁰ are available at figshare.

References

Sevilla, J. et al. Compute trends across three eras of machine learning. In 2022 International Joint Conference on Neural Networks (IJCNN), 1–8 https://doi.org/10.1109/IJCNN55064.2022.9891914 (2022).
Mohseni, N., McMahon, P. L. & Byrnes, T. Ising machines as hardware solvers of combinatorial optimization problems. Nat. Rev. Phys. 4, 363–379 (2022).
Article Google Scholar
Kalinin, K. P. & Berloff, N. G. Computational complexity continuum within ising formulation of np problems. Commun. Phys. 5, 20 (2022).
Article Google Scholar
Johnson, M. W. et al. Quantum annealing with manufactured spins. Nature 473, 194–198 (2011).
Article Google Scholar
Honjo, T. et al. A coherent ising machine based on a network of 100,000 degenerate optical parametric oscillator pulses. In 2022 IEEE 22nd International Conference on Nanotechnology (NANO), 405–408 https://doi.org/10.1109/NANO54668.2022.9928658 (2022).
Honjo, T. et al. 100,000-spin coherent ising machine. Sci. Adv. 7, eabh0952 (2021).
Article Google Scholar
Jiang, M., Shan, K., He, C. & Li, C. Efficient combinatorial optimization by quantum-inspired parallel annealing in analogue memristor crossbar. Nat. Commun. 14, 5927 (2023).
Article Google Scholar
Yun, C. et al. Electrically programmable magnetic coupling in an ising network exploiting solid-state ionic gating. Nat. Commun. 14, 6367 (2023).
Article Google Scholar
Dutta, S., Khanna, A. & Datta, S. Understanding the continuous-time dynamics of phase-transition nano-oscillator-based ising hamiltonian solver. IEEE J. Exploratory Solid-State Computational Devices Circuits 6, 155–163 (2020).
Article Google Scholar
Vaidya, J., Surya Kanthi, R. S. & Shukla, N. Creating electronic oscillator-based ising machines without external injection locking. Sci. Rep. 12, 981 (2022).
Article Google Scholar
Yamamoto, K. et al. Statica: A 512-spin 0.25m-weight annealing processor with an all-spin-updates-at-once architecture for combinatorial optimization with complete spin-spin interactions. IEEE J. Solid-State Circuits 56, 165–178 (2021).
Article Google Scholar
Zhang, T., Tao, Q., Liu, B. & Han, J. A review of simulation algorithms of classical ising machines for combinatorial optimization. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS), 1877–1881 https://doi.org/10.1109/ISCAS48785.2022.9937767 (2022).
Moy, W. et al. A 1,968-node coupled ring oscillator circuit for combinatorial optimization problem solving. Nat. Electron. 5, 310–317 (2022).
Article Google Scholar
Wang, T., Wu, L. & Roychowdhury, J. Late breaking results: New computational results and hardware prototypes for oscillator-based ising machines. In 2019 56th ACM/IEEE Design Automation Conference (DAC), 1–2 https://doi.org/10.1145/3316781.3322473 (2019).
Lo, H., Moy, W., Yu, H., Sapatnekar, S. & Kim, C. H. An ising solver chip based on coupled ring oscillators with a 48-node all-to-all connected array architecture. Nat. Electron. 6, 771–778 (2023).
Article Google Scholar
Bashar, M. K. et al. Experimental demonstration of a reconfigurable coupled oscillator platform to solve the max-cut problem. IEEE J. Exploratory Solid-State Computational Devices Circuits 6, 116–121 (2020).
Article Google Scholar
Ahmed, I., Chiu, P.-W., Moy, W. & Kim, C. H. A probabilistic compute fabric based on coupled ring oscillators for solving combinatorial optimization problems. IEEE J. Solid-State Circuits 56, 2870–2880 (2021).
Article Google Scholar
Mallick, A., Bashar, M. K., Truesdell, D. S., Calhoun, B. H. & Shukla, N. Overcoming the accuracy vs. performance trade-off in oscillator ising machines. In 2021 IEEE International Electron Devices Meeting (IEDM), 40.2.1–40.2.4 https://doi.org/10.1109/IEDM19574.2021.9720612 (2021).
Graber, M. & Hofmann, K. A coupled oscillator network to solve combinatorial optimization problems with over 95% accuracy. In 2023 IEEE International Symposium on Circuits and Systems (ISCAS), 1–5 https://doi.org/10.1109/ISCAS46773.2023.10181365 (2023).
Hoppensteadt, F. C. & Izhikevich, E. M. Oscillatory neurocomputers with dynamic connectivity. Phys. Rev. Lett. 82, 2983–2986 (1999).
Article Google Scholar
Delacour, C. & Todri-Sanial, A. Mapping hebbian learning rules to coupling resistances for oscillatory neural networks. Front. Neurosci. 15, 694549 (2021).
Article Google Scholar
Delacour, C. et al. A mixed-signal oscillatory neural network for scalable analog computations in phase domain. Neuromorphic Comput. Eng. 3, 034004 (2023).
Article Google Scholar
Karp, R. M.Reducibility among Combinatorial Problems, 85–103 (Springer US, Boston, MA, 1972). https://doi.org/10.1007/978-1-4684-2001-2_9
Lucas, A. Ising formulations of many np problems. Front. Phys. 2, 5 (2014).
Article Google Scholar
Harwood, S. et al. Formulating and solving routing problems on quantum computers. IEEE Trans. Quantum Eng. 2, 1–17 (2021).
Article Google Scholar
Qin, P. & Zhao, J. A polynomial-time algorithm for image segmentation using ising models. 2011 Seventh Int. Conf. Nat. Comput. 2, 932–935 (2011).
Google Scholar
Liu, Z. et al. Ising-cf: A pathbreaking collaborative filtering method through efficient ising machine learning. In 2023 60th ACM/IEEE Design Automation Conference (DAC), 1–6 https://doi.org/10.1109/DAC56929.2023.10247860 (2023).
Kochenberger, G. et al. The unconstrained binary quadratic programming problem: a survey. J. Combinatorial Optim. 28, 58–81 (2014).
Article MathSciNet Google Scholar
Ising, E. Beitrag zur theorie des ferromagnetismus. Z. f.ür. Phys. 31, 253–258 (1925).
Article Google Scholar
Wang, T., Wu, L., Nobel, P. & Roychowdhury, J. Solving combinatorial optimisation problems using oscillator based ising machines. Nat. Comput. 20, 287–306 (2021).
Article MathSciNet Google Scholar
Wang, T.Novel Computing Paradigms using Oscillators. Ph.D. thesis, EECS Department, University of California, Berkeley http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-12.html (2020).
Wang, T. & Roychowdhury, J. Oscillator-based ising machine Preprint at https://doi.org/10.48550/arXiv.1709.08102 (2017).
Lobe, E., Schürmann, L. & Stollenwerk, T. Embedding of complete graphs in broken chimera graphs. Quantum Inf. Process. 20, 234 (2021).
Article MathSciNet Google Scholar
Sugie, Y. et al. Minor-embedding heuristics for large-scale annealing processors with sparse hardware graphs of up to 102,400 nodes. Soft Comput. 25, 1731–1749 (2021).
Article Google Scholar
Date, P., Patton, R., Schuman, C. & Potok, T. Efficiently embedding qubo problems on adiabatic quantum computers. Quantum Inf. Process. 18, 117 (2019).
Article Google Scholar
Graber, M., Wesp, M. & Hofmann, K. A fast graph minor embedding heuristic for oscillator based ising machines. In 2022 Austrochip Workshop on Microelectronics (Austrochip)https://doi.org/10.1109/Austrochip56145.2022.9940722 (2022).
Jagielski, T., Manohar, R. & Roychowdhury, J. Fpim: Field-programmable ising machines for solving sat Preprint at https://doi.org/10.48550/arXiv.2306.01569 (2023).
Izhikevich, E. & Kuramoto, Y. Weakly coupled oscillators. In Françoise, J.-P., Naber, G. L. & Tsun, T. S. (eds.) Encyclopedia of Mathematical Physics, 448–453 (Academic Press, Oxford, 2006). https://www.sciencedirect.com/science/article/pii/B0125126662001061.
Bhansali, P. & Roychowdhury, J. Gen-adler: The generalized adler’s equation for injection locking analysis in oscillators. In 2009 Asia and South Pacific Design Automation Conference, 522–527 https://doi.org/10.1109/ASPDAC.2009.4796533 (2009).
Graber, M., Angeli, N. & Hofmann, K. An efficient modeling approach for large ring oscillator based ising machines. In SMACD / PRIME 2021; International Conference on SMACD and 16th Conference on PRIME, 1–4 (2021).
Ochs, K. & Beattie, B. A. Towards wave digital modeling of neural pathways using two-port coupling networks. In 2022 IEEE International Symposium on Circuits and Systems (ISCAS), 809–812 https://doi.org/10.1109/ISCAS48785.2022.9937250 (2022).
Wilton, S. J.Architectures and Algorithms for Field-Programmable Gate Arrays with Embedded Memory. Ph.D. thesis, University of Toronto https://hdl.handle.net/1807/10686 (1997).
Angeli, N. & Hofmann, K. A scalable fully synthesized phase-to-digital converter for phase and duty-cycle measurement of high-speed clocks. In 2018 IEEE International Symposium on Circuits and Systems (ISCAS), 1–5 https://doi.org/10.1109/ISCAS.2018.8351118 (2018).
Graber, M. & Hofmann, K. Analysis and design of oscillator coupling for solving combinatorial optimization problems. In 2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 1–4 https://doi.org/10.1109/ICECS202256217.2022.9970974 (2022).
Graber, M. & Hofmann, K. Flexible routing to overcome the embedding bottleneck of oscillator-based ising machines. In 2023 30th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 1–4 https://doi.org/10.1109/ICECS58634.2023.10382841 (2023).
Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual https://www.gurobi.com (2023).
Graber, M. Simple benchmark set. figshare https://doi.org/10.6084/m9.figshare.25018898 (2024).
Graber, M. Difficult benchmark set. figshare https://doi.org/10.6084/m9.figshare.25018892 (2024).
Graber, M. Bias benchmark set. figshare https://doi.org/10.6084/m9.figshare.25018862 (2024).
Graber, M. Weighted benchmark set. figshare https://doi.org/10.6084/m9.figshare.25018952 (2024).
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
Article MathSciNet Google Scholar
Graber, M. & Hofmann, K. An enhanced 1440 coupled cmos oscillator network to solve combinatorial optimization problems. In 2023 IEEE 36th International System-on-Chip Conference (SOCC), 1–6 https://doi.org/10.1109/SOCC58585.2023.10256945 (2023).

Download references

Acknowledgements

We want to thank the German Research Foundation (DFG) for supporting this work with grant number 496307198. Furthermore, the authors want to thank Tim Lukas Kessel and Malte Nilges (both Technical University of Darmstadt) for their help and support in the development of the printed circuit board (PCB) for testing. Large language models were used to improve the language of the manuscript. Grammarly by Grammarly, Inc. was used for spell checking. ChatGPT by OpenAI and DeepL by Deepl SE were used to enhance the clarity of the writing.

Funding

Open Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Technical University of Darmstadt, Integrated Electronic Systems Lab, Darmstadt, Germany
Markus Graber & Klaus Hofmann

Authors

Markus Graber
View author publications
You can also search for this author in PubMed Google Scholar
Klaus Hofmann
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.G. conducted the circuit design, simulation, and layout for the manufactured chip. M.G. built the test setup, ran the experiments, and was the primary author of the manuscript. K.H. supervised the work, provided guidance, and contributed to the writing and revision process. All authors reviewed the manuscript.

Corresponding author

Correspondence to Markus Graber.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Engineering thanks Aida Todri, Louis Hutin, Jérémie Laydevant for their contribution to the peer review of this work. Primary Handling Editors: Damien Querlioz and Anastasiia Vasylchenkova. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Peer Review File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Graber, M., Hofmann, K. An integrated coupled oscillator network to solve optimization problems. Commun Eng 3, 116 (2024). https://doi.org/10.1038/s44172-024-00261-w

Download citation

Received: 08 February 2024
Accepted: 05 August 2024
Published: 23 August 2024
DOI: https://doi.org/10.1038/s44172-024-00261-w
Springer Nature Limited

Associated content

Neuromorphic Hardware and Computing 2024

Collection 06 May 2024

An integrated coupled oscillator network to solve optimization problems

Abstract

Similar content being viewed by others

Oscillator-based optimization: design, emulation, and implementation

A 1,968-node coupled ring oscillator circuit for combinatorial optimization problem solving

Analog Coupled Oscillator Based Weighted Ising Machine

Explore related subjects

Introduction

Results

System

Node Implementation

Routing

Computing & benchmark setup

Benchmarks

Experiments & optimization

Calibration routines

Frequency calibration

Reference phase calibration

Routing delay calibration

Comparison with Other Works

Discussion

Methods

Chip design

Delay-line calibration

Chip reconfigurability & external adjustments

Experimental results

G-Letter example problem

Phase Threshold

Calibrations

Benchmarks

Path routing

Power measurement

Data availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Peer Review File

Rights and permissions

About this article

Cite this article

Share this article

Search

Navigation