Computing Graph Edit Distance with Algorithms on Quantum Devices

Distance measures provide the foundation for many popular algorithms in Machine Learning and Pattern Recognition. Different notions of distance can be used, depending on the type of data the algorithm works on. For graph-shaped data, an important notion is the Graph Edit Distance (GED), which measures the degree of (dis)similarity between two graphs in terms of the operations needed to make them identical. Since computing the GED is NP-hard, it is reasonable to consider approximate solutions. In this paper we present a QUBO formulation of the GED problem. This allows us to implement two different approaches, namely quantum annealing and variational quantum algorithms, which run on the two types of quantum hardware currently available: quantum annealers and gate-based quantum computers, respectively. Considering the current state of noisy intermediate-scale quantum computers, we base our study on proof-of-principle tests of their performance.


I. INTRODUCTION
The Graph Edit Distance (GED) [1, 2] represents one of the most common dissimilarity measures used in Pattern Recognition and Image Processing [3, 4]. It has been successfully applied to many real-world tasks such as image recognition, handwritten digit recognition, and face and facial expression recognition [5-9], and has applications in a variety of areas, from computer vision and bioinformatics [10] to cognitive science [11] and hardware security [12]. It has also been used in Machine Learning to define more powerful kernels for support vector machines [13], and in combination with kernel machines for pattern recognition [14].
In general, the notion of edit distance (originally introduced for strings and then extended to graphs and general structured data) is given in terms of the operations that must be performed on a pair of data items in order to make them identical, and represents a quantitative estimate of their dissimilarity. For example, in pattern recognition it measures the strength of the distortions that have to be applied to transform a source pattern into a target pattern. As computing the edit distance essentially amounts to searching for the best (in terms of cost) set of operations in the space of all possible ones, the problem is intrinsically a combinatorial optimization problem, and its complexity depends on the structure of the data in the search space. When data are graphs, calculating the edit distance quickly becomes inefficient, as the size of the search space grows exponentially with the number of nodes of the graphs. In fact, the GED problem is an NP-hard optimization problem [15], which makes exact approaches impossible to use for large graphs. This makes the study of approaches that give near-optimal results in reasonable time both urgent and essential, also in light of the practical impact of the GED problem. It is therefore of great importance to discuss and analyze in this context the feasibility of quantum approaches. Many classical algorithms have been proposed so far [2], which however cannot be straightforwardly translated to the quantum setting.
In this paper we introduce a formulation of the GED problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem, and show that this offers a common base for a direct implementation of the problem on both quantum and classical hardware, thus making it easy to compare the performances of various approaches. Our formulation is based on a quantified version of the graph isomorphism problem, where the expected answer is the actual number of operations that are needed to make two graphs isomorphic, rather than 'yes' or 'no'.
The quest for optimization algorithms that can run on noisy intermediate-scale quantum (NISQ) computers is a very timely topic, as well as a challenging task [16,17]. The two main quantum strategies that have been proposed to tackle optimization tasks are Quantum Annealing (QA) [18] and circuit-based variational approaches. In this paper we will assess the suitability of quantum annealing and of two variational algorithms, the Variational Quantum Eigensolver (VQE) [19] and the Quantum Approximate Optimization Algorithm (QAOA) [20], for finding an approximate solution of the GED problem.
Moreover, we will give a concise comparative study of their performance with respect to a specific classical algorithm, namely Simulated Annealing, which makes use of the same QUBO formulation of the problem as the quantum strategies.

A. The Quantum Strategies
The QA approach can be seen as the quantum analogue of the classical simulated annealing optimization algorithm [21], and essentially consists in using a specific search procedure for finding the state of minimum energy of spin systems studied in statistical mechanics, e.g. the Ising model in a random field [18]. In the quantum case the fluctuations needed to scan the energy landscape are provided by a field that allows quantum tunneling, in contrast to simulated annealing, where the fluctuations are thermal [22]. While QA is based on the adiabatic model of quantum computation, VQE and QAOA are hybrid quantum-classical algorithms implemented on circuit-based quantum computers. This approach has been tested in several proof-of-principle experiments for real-world optimization (e.g. scheduling) problems [23,24].
Both kinds of approaches share a common physical ground, since they are both implemented by means of a total Hamiltonian composed of two non-commuting terms, H_0 and H_1, such that [H_0, H_1] = H_0 H_1 − H_1 H_0 ≠ 0. This Hamiltonian has the form

H(t) = (1 − λ(t)) H_0 + λ(t) H_1,    (1)

where λ(t) is a control function valued in the interval [0, 1] that allows us to switch between the two terms. In fact, assuming that the computation time is in the interval [0, τ], one can initialize the system in the ground state of H_0, imposing λ(0) = 0, and tune the control parameter until reaching λ(τ) = 1. If the variation of λ(t) is such that the hypotheses of the adiabatic theorem [25] are satisfied, both algorithms can be viewed as prefatory approaches to full adiabatic quantum computation [26]. Moreover, the gates composing the QAOA can be seen as a Suzuki-Trotter decomposition [27] of the unitary evolution stemming from a Hamiltonian as defined in (1). Another common aspect of QA and QAOA is that their performance significantly depends on heuristics for the choice of the annealing schedule and of the parameter initialization, respectively.
By defining the problem of calculating the GED as a Quadratic Unconstrained Binary Optimization (QUBO) problem, we are able to experiment both on quantum annealers, such as the D-Wave Systems Inc. machines, and on quantum circuits via variational algorithms such as VQE and QAOA. We introduce a QUBO formulation of GED that is inspired by the one presented in [28] for the Graph Isomorphism problem, i.e. the computational problem of determining whether two finite graphs are isomorphic. Our formulation exploits the fact that GED can be seen as a quantitative generalization of graph isomorphism, which requires a 'counting' phase while checking the nodes/edges in each of the graphs. This allows us to obtain a quantitative answer to the optimization problem rather than just a yes/no answer as in the case of graph isomorphism. This also explains why the GED problem is far more complex than graph isomorphism, for which it was recently shown that the problem is solvable in quasi-polynomial time [29].
Our strategy is to use the QUBO formulation across all the platforms: classical machines, quantum annealers and gate-based quantum computers. This formulation of the problem, which is common to all the platforms, allows their smooth comparison and the identification of the most suitable one. The NISQ devices on which we run the QUBO formulation of GED are the D-Wave System quantum annealer and the IBM circuit-based quantum computer. This will contribute to assessing the actual power of NISQ devices for real-world problems. A benchmark of quantum optimization algorithms on satisfiability and max-cut problems has been introduced in [30]. In contrast to that work, we analyze the GED problem, which requires a more involved QUBO formulation and therefore a larger number of qubits. Moreover, we consider a broader class of quantum algorithms and also include classical algorithms in our comparison.

This paper is structured as follows. In Sec. II we introduce the notion of graph edit distance, and in Sec. III its QUBO formulation. Section IV is devoted to the classical approaches developed to compute GED, while in Section V we discuss the methods we employ in our experiments on both the quantum annealer and the circuit-model quantum computer. In Section VI we present the results of our experiments and provide a comparison among the different strategies. Finally, in Section VII we discuss our results in the light of the currently available technologies and we address some open questions.

II. GRAPH EDIT DISTANCE
A graph G = (V, E) consists of a finite, non-empty set of vertices V of cardinality |V| = n, and a set of edges E, each edge being a pair of vertices. The graph is undirected if each edge is described by an unordered pair of vertices, and directed otherwise. In an undirected graph, the number of edges is at most n(n + 1)/2 or, excluding self-loops (edges starting and ending in the same vertex), n(n − 1)/2. Two graphs G_1 = (V_1, E_1) and G_2 = (V_2, E_2) are isomorphic, denoted by G_1 ≅ G_2, if there exists a bijection (one-to-one mapping) π between the vertex sets of G_1 and G_2 that is edge-preserving [31], i.e. {u, v} ∈ E_1 ⟺ {π(u), π(v)} ∈ E_2. The graph isomorphism problem is the problem of establishing the exact matching of two graphs [31]. Inexact matching is the more general case where there is a difference between two graphs, and a measure of this difference quantifies 'how much' the two graphs are (dis)similar. An important graph similarity measure is the graph edit distance (GED) [1, 2], whose value is the total cost of 'transforming' one graph into the other, thus making them isomorphic. Clearly, when two graphs are isomorphic, the GED between them is zero.
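For tiny graphs, the edge-preserving condition above can be checked directly by enumerating all bijections. The following sketch (the helper name `is_isomorphic` is ours, not from the paper) illustrates this:

```python
from itertools import permutations

def is_isomorphic(n, E1, E2):
    """Brute-force isomorphism test for graphs on vertices 0..n-1,
    with undirected edges given as sets of frozensets."""
    if len(E1) != len(E2):
        return False
    for perm in permutations(range(n)):
        # image of E1 under the bijection u -> perm[u]
        mapped = {frozenset((perm[u], perm[v])) for (u, v) in map(tuple, E1)}
        if mapped == E2:          # edge-preserving in both directions
            return True
    return False

E1 = {frozenset(e) for e in [(0, 1), (1, 2)]}   # path 0-1-2
E2 = {frozenset(e) for e in [(0, 2), (2, 1)]}   # path 0-2-1 (relabelled)
print(is_isomorphic(3, E1, E2))  # a relabelled path is still a path
```

This factorial-time check is of course only a didactic baseline; the point of the paper is precisely to avoid such exhaustive searches.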
The GED problem is an NP-hard optimization problem, i.e., intuitively, it is at least as hard as the hardest problems in NP. This means that it is computationally more expensive than graph isomorphism, for which a recent result shows that a solution can be found in quasi-polynomial time [29].
A graph edit operation is a mapping from the set of graphs to itself. The most common edit operations on a graph G = (V, E) are listed in Table I, where we also specify the part of the graph they act on (Vertex or Edge set) and the condition of their applicability (Constraint). Specifically, these operations can be performed provided that they do not insert a pre-existing vertex/edge, or delete a non-existing vertex/edge; it must also be guaranteed that before a vertex is deleted, each edge incident to that vertex is deleted.

TABLE I: Graph edit operations (columns: Operation, Vertex set, Edge set, Constraint).

A graph edit path P is a composition of graph edit operations, and the number of these operations defines the length ℓ(P) of the path. The graph edit distance between two graphs G_1 and G_2 can be defined as

GED(G_1, G_2) = min { ℓ(P) | P(G_1) ≅ G_2 },    (2)

i.e. the length of a shortest edit path making the two graphs isomorphic. A more fine-grained definition of GED can be found in [32], where labelled graphs are considered, so that each graph edit operation may contribute to the graph edit distance with a different weight. For our purposes, it is sufficient to consider the case of unlabelled graphs, which allows us to keep our implementation simpler. In particular, in the calculation of the GED we will assume that each edit operation has cost 1. This implies that the length of an edit path effectively corresponds to the number of operations composing it, which is a reasonable estimate of the complexity of the calculation. An example of the calculation of GED is shown in Figure 1.

FIG. 1: An edit path P such that P(G_1) ≅ G_2. Moreover, ℓ(P) is the GED, since no shorter path exists that makes the graphs isomorphic.

III. QUBO FORMULATION OF GED
The graph matching problem can be encoded as a linear optimization problem [33,34] or a quadratic optimization problem [28,35]. We show here how the second approach can be extended to inexact graph matching, and precisely to the GED problem. The idea is to reformulate GED as a Quadratic Unconstrained Binary Optimization (QUBO) problem, a class of problems that is well known in multivariate optimization. As the name suggests, a QUBO problem is defined in terms of a quadratic function of binary variables x_i, which is unconstrained or, more precisely, whose constraints are replaced by penalty terms encoded within a matrix Q representing the objective function [36]. The task is to find the value x* such that

x* = argmin_x x^T Q x,    (3)

where x_i ∈ {0, 1} for each of the n entries of x, and Q is an n × n upper-triangular real-valued matrix with elements q_{i,j} encoding the data specification. QUBO problems are NP-hard [37] and share a similar structure and the same computational complexity with spin glass Ising models. This similarity allows for an almost immediate implementation of QUBO objective functions in the adiabatic model of quantum computation.
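At toy sizes, a QUBO instance can be minimized by exhaustive search, which is useful for checking a formulation. The sketch below (our own helper, not the paper's code) evaluates x^T Q x over all binary vectors:

```python
import numpy as np
from itertools import product

def qubo_min(Q):
    """Exhaustively minimize x^T Q x over binary vectors (toy sizes only)."""
    n = Q.shape[0]
    best_x, best_val = None, float("inf")
    for bits in product([0, 1], repeat=n):
        x = np.array(bits)
        val = x @ Q @ x
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy objective: x^T Q x = -x0 - x1 + 2 x0 x1, minimized by setting
# exactly one of the two variables to 1.
Q = np.array([[-1.0, 2.0],
              [0.0, -1.0]])
x, v = qubo_min(Q)
print(x, v)   # one of (0,1) or (1,0), with value -1.0
```

The exponential cost of this enumeration is exactly what motivates the heuristic and quantum solvers discussed in the rest of the paper.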
A QUBO formulation of the GED problem can be given as follows. Given two graphs G_1 = (V_1, E_1) and G_2 = (V_2, E_2), we shall assume that the number of vertices is the same in both graphs, that is, |V_1| = |V_2| = n. This is without loss of generality; in fact, if G_1 has k = |V_1| − |V_2| > 0 more vertices than G_2, we build G'_2 by adding k isolated vertices to G_2, then we calculate the GED between G_1 and G'_2, and finally add k to the resulting value to account for the k node insertion edit operations. The formulation requires n² variables, each denoted by x_{i,j}, with x_{i,j} = 1 if and only if the i-th vertex of G_1 is mapped to the j-th vertex of G_2.
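The padding step and the variable indexing just described can be sketched as follows (helper names are ours):

```python
def pad_to_equal(n1, n2):
    """Common vertex count after adding isolated vertices to the smaller
    graph, and the k node-insertion operations to add back to the GED."""
    return max(n1, n2), abs(n1 - n2)

def var_index(i, j, n):
    """Flat position of x_{i,j}: x_{i,j} = 1 iff vertex i of G1 maps to
    vertex j of G2 (row-major layout of the n x n assignment variables)."""
    return i * n + j

n, k = pad_to_equal(4, 2)
print(n, k)                 # common size 4, plus k = 2 node insertions
print(var_index(2, 3, n))   # x_{2,3} lives at flat position 11
```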
We define the dual path of P as the path P̄ in which every deletion operation becomes an insertion operation and vice versa. If a bijection π corresponds to a graph edit path P such that P(G_1) ≅ G_2, we can partition P into two edit paths P_1 and P_2, where P_1 contains only insertion operations and P_2 only deletion operations. Then

P_1(G_1) ≅ P̄_2(G_2).

Clearly, ℓ(P_1) + ℓ(P̄_2) = ℓ(P_1) + ℓ(P_2) = ℓ(P). Intuitively, the process is illustrated by the example reported in Figure 2.
Considering the two assumptions stated above, we can define the cost of a bijection π as

cost(π) = |E_1 \ π⁻¹(E_2)| + |E_2 \ π(E_1)|,    (4)

where π(E) = {{π(i), π(j)} | {i, j} ∈ E}. For a given bijection π, the value of cost(π) is equal to the number of edges occurring in E_1 and not in E_2, plus the number of edges present in E_2 but not in E_1. The assumption that the graphs have the same number of vertices guarantees that a bijection exists.

FIG. 2: A graph edit path P, ℓ(P) = 2, such that P(G_1) ≅ G_2 according to π; the insertion is denoted by a green dashed line and the deletion by a red dashed line. (c) Paths P_1 and P_2 such that ℓ(P_1) + ℓ(P_2) = 2 and P_1(G_1) ≅ P_2(G_2) according to π; here the only insertion operations are denoted by green dashed lines.
We can now define the GED as

GED(G_1, G_2) = min_π cost(π).    (5)

The QUBO formulation of our problem is composed of two parts: a hard constraint Q_h, whose function is to add a penalty if the solution x does not represent a bijection (in that case, x is not a solution for GED), and a soft constraint Q_s, which introduces a penalty for any edge mismatch, as in Equation (4).
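For very small graphs of equal size, the minimization over bijections can be carried out directly; the sketch below (our own helper names, unit edit costs) computes cost(π) with set operations and minimizes it over all permutations:

```python
from itertools import permutations

def cost(pi, E1, E2):
    """cost(pi) = |E1 \\ pi^{-1}(E2)| + |E2 \\ pi(E1)| for undirected
    edges stored as sets of frozensets; pi maps vertex u to pi[u]."""
    image = {frozenset((pi[u], pi[v])) for (u, v) in map(tuple, E1)}
    # since pi is a bijection, |E1 \ pi^{-1}(E2)| = |pi(E1) \ E2|
    return len(image - E2) + len(E2 - image)

def ged_brute_force(n, E1, E2):
    """Exact GED for equal-sized unlabelled graphs (unit edit costs)."""
    return min(cost(p, E1, E2) for p in permutations(range(n)))

E1 = {frozenset(e) for e in [(0, 1), (1, 2)]}            # path
E2 = {frozenset(e) for e in [(0, 1), (1, 2), (0, 2)]}    # triangle
print(ged_brute_force(3, E1, E2))  # one edge insertion: GED = 1
```

This baseline is useful for validating the QUBO formulation on small instances.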
The hard constraint reads

Q_h(x) = Σ_{i=1..n} (1 − Σ_{j=1..n} x_{i,j})² + Σ_{j=1..n} (1 − Σ_{i=1..n} x_{i,j})²,    (6)

where the first sum is term (a) and the second is term (b), and it ensures the validity of the solution (i.e. x corresponds to a bijection). Whenever the i-th vertex of G_1 is mapped to zero or more than one vertices of G_2, the (a) term of Equation (6) is greater than zero. The (b) term of Equation (6) is greater than zero when the j-th vertex of G_2 is the image of none or more than one vertices of G_1. The soft constraint is

Q_s(x) = Σ_{{i,j} ∈ E_1} R_{i,j}(x) + Σ_{{i',j'} ∈ E_2} S_{i',j'}(x),    (7)

with

R_{i,j}(x) = Σ_{(i',j') : i' ≠ j', {i',j'} ∉ E_2} x_{i,i'} x_{j,j'},   S_{i',j'}(x) = Σ_{(i,j) : i ≠ j, {i,j} ∉ E_1} x_{i,i'} x_{j,j'}.

The term Q_s counts how many edges are not preserved by the bijection π implied by x. In particular, R_{i,j} counts the edges {i, j} in G_1 whose image {π(i), π(j)} is missing in G_2, and S_{i',j'} counts the edges {i', j'} in G_2 whose preimage {π⁻¹(i'), π⁻¹(j')} is missing in G_1. We now show that the term Σ_{{i,j} ∈ E_1} R_{i,j}(x) equals |E_1 \ π⁻¹(E_2)|:

Σ_{{i,j} ∈ E_1} R_{i,j}(x) = Σ_{{i,j} ∈ E_1} (1 − e_{π(i),π(j)}) = |E_1 \ π⁻¹(E_2)|,

where e_{i',j'} is one iff {i', j'} ∈ E_2. The argument for S_{i',j'} is analogous. The complete formulation reads

Q(x) = λ_h Q_h(x) + λ_s Q_s(x),    (8)

where the choice of the parameters λ_h, λ_s > 0 determines the weight of each penalty term.
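As a concrete illustration, the sketch below (our own helper names; the constant offsets of the hard penalty are dropped inside the matrix and restored at the end) builds a QUBO matrix of this shape for two toy graphs and recovers GED = 1 by exhaustive enumeration:

```python
import numpy as np
from itertools import product

def ged_qubo(n, E1, E2, lam_h, lam_s):
    """Sketch of the GED QUBO: variable x_{i,j} sits at flat index i*n + j."""
    N = n * n
    Q = np.zeros((N, N))

    def idx(i, j):
        return i * n + j

    def add(a, b, w):                 # accumulate on the upper triangle
        a, b = min(a, b), max(a, b)
        Q[a, b] += w

    # Hard constraint: expand (1 - sum x)^2, dropping the constant term 1
    for i in range(n):                # each G1 vertex has exactly one image
        for j in range(n):
            add(idx(i, j), idx(i, j), -lam_h)
        for j1 in range(n):
            for j2 in range(j1 + 1, n):
                add(idx(i, j1), idx(i, j2), 2 * lam_h)
    for j in range(n):                # each G2 vertex has exactly one preimage
        for i in range(n):
            add(idx(i, j), idx(i, j), -lam_h)
        for i1 in range(n):
            for i2 in range(i1 + 1, n):
                add(idx(i1, j), idx(i2, j), 2 * lam_h)

    # Soft constraint: penalize mapping an edge of G1 onto a non-edge of G2,
    # and symmetrically an edge of G2 whose preimage is a non-edge of G1.
    for (i, j) in map(tuple, E1):
        for i2, j2 in product(range(n), repeat=2):
            if i2 != j2 and frozenset((i2, j2)) not in E2:
                add(idx(i, i2), idx(j, j2), lam_s)
    for (i2, j2) in map(tuple, E2):
        for i, j in product(range(n), repeat=2):
            if i != j and frozenset((i, j)) not in E1:
                add(idx(i, i2), idx(j, j2), lam_s)
    return Q

# Tiny demo: G1 has one edge, G2 has none, so the GED should be 1
n, lam_h, lam_s = 2, 1.0, 0.1
Q = ged_qubo(n, {frozenset((0, 1))}, set(), lam_h, lam_s)
offset = 2 * n * lam_h            # the dropped constants, one per constraint
best = min(np.array(x) @ Q @ np.array(x)
           for x in product([0, 1], repeat=n * n))
print(round((best + offset) / lam_s, 6))   # -> 1.0
```

At the minimum, the hard penalty vanishes and the soft penalty divided by λ_s returns the edit distance.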
If we set λ_h ≫ λ_s, we are guaranteed that all valid solutions, i.e. all those with a null hard-constraint contribution, have lower cost than any non-valid solution. For graphs of n vertices, we can choose

λ_h > n² λ_s    (9)

to ensure that the contribution of the soft constraint λ_s Q_s(x) to the total QUBO problem is always smaller than the contribution of the hard one, λ_h Q_h(x). This is because n² is the cost of the worst case, i.e. when one graph is complete and the other one empty. If the maximum cardinality of the edge sets is |E| = m < n², then Equation (9) becomes λ_h > m λ_s.

IV. CLASSICAL AND QUANTUM APPROACHES TO GED
Many classical algorithms for the GED problem exist [32,38]. Given the computational complexity of GED, all known exact algorithms run in exponential time. A widely used approach is based on the A* search algorithm [39]. A different option is to consider classical heuristics that run in polynomial time in the size of the input, although not all of them are suitable for QUBO problems. The two most common approaches are based either on reducing the GED problem to LSAPE (Linear Sum Assignment Problem with Error-Correction) [40] or on local search [41].
A different approach is the Simulated Annealing (SA) heuristic [21]. This is a probabilistic algorithm that minimizes multivariate binary objective functions and can be used to solve QUBO problems. Due to its physical background, discussed later in Section V, it can be compared with the approaches presented in the following sections. SA can be seen as a random walk in the solution space according to a Markov chain parametrized by a temperature parameter T [42]. We choose SA as the classical counterpart in our comparison because it can straightforwardly be used to solve QUBO problems.

A. Simulated Annealing
SA works as shown in Algorithm 2 in Appendix A. At the start of the algorithm the temperature T is high, and solutions with higher energy are accepted with a probability that follows the Boltzmann distribution. As the temperature decreases exponentially, SA has fewer chances to accept high-energy solutions. When the temperature reaches zero, SA behaves like gradient descent and converges to a local minimum. SA is restarted many times (shots), each time from a different, randomly chosen point of the state space. Finally, the energy of the solution is the single lowest energy found over all the shots.
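The loop just described can be sketched as follows; this is a toy version with illustrative parameters, not the D-Wave Ocean implementation used in our experiments:

```python
import math, random

def simulated_annealing(Q, shots=20, sweeps=200, t_start=10.0, t_end=0.01, seed=42):
    """Toy SA for a QUBO given as a dict {(i, j): weight} over the upper
    triangle; geometric cooling, Metropolis acceptance, multiple restarts."""
    n = 1 + max(max(i, j) for i, j in Q)
    rng = random.Random(seed)

    def energy(x):
        return sum(w * x[i] * x[j] for (i, j), w in Q.items())

    best = float("inf")
    alpha = (t_end / t_start) ** (1.0 / sweeps)   # exponential cooling factor
    for _ in range(shots):                        # each shot restarts at random
        x = [rng.randint(0, 1) for _ in range(n)]
        e, t = energy(x), t_start
        for _ in range(sweeps):
            k = rng.randrange(n)
            x[k] ^= 1                             # propose a single bit flip
            e_new = energy(x)
            # accept downhill moves always, uphill with Boltzmann probability
            if e_new <= e or rng.random() < math.exp((e - e_new) / t):
                e = e_new
            else:
                x[k] ^= 1                         # reject: undo the flip
            best = min(best, e)
            t *= alpha
    return best

Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}     # minimum -1 at (0,1) or (1,0)
print(simulated_annealing(Q))   # -> -1.0
```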

B. Quantum annealing
Quantum Annealing (QA) [18,43] is a meta-heuristic search algorithm that can be used to tackle QUBO problems. This optimization technique finds its roots in a problem borrowed from statistical physics, namely the search for the minimum-energy state of a spin system exhibiting a glassy phase [44]. We briefly recall the notation and the physics underlying a class of spin systems, hereafter referred to as Ising-like models. The aim is to clarify the connection between unconstrained quadratic problems and the search for the ground state of this class of models.
Let us introduce a classical spin variable s_i that can take values ±1. The function describing the energy of a system of N interacting spins arranged on a d-dimensional discrete lattice is the Hamiltonian

H = − Σ_{⟨i,j⟩} J_{ij} s_i s_j − Σ_i h_i s_i,    (10)

where ⟨i, j⟩ denotes that the sum runs over all pairs of first-neighboring sites, the J_{ij} are the couplings between two sites of the lattice, and h_i is the external magnetic field acting on each spin. A quantum version of the Hamiltonian in Equation (10) is obtained by suitably replacing the binary variables s_i with Pauli matrices σ^α_i, α ∈ {x, y, z}:

H = − Σ_{⟨i,j⟩} J_{ij} σ^z_i σ^z_j − Σ_i h_i σ^z_i.    (11)

The problem of finding the ground state of a Hamiltonian describing an Ising spin glass is NP-hard; how it relates to the solution of many other NP-hard problems is reported in [45].
The QUBO problem is closely related to the problem of finding the ground state of a Hamiltonian written in terms of spin variables, upon introducing the transformation

x_i = (1 + s_i)/2.    (12)

Originally, QA was introduced in [43] as a quantum-inspired classical algorithm. In contrast to SA, in which the fluctuations needed to explore the energy landscape are provided by the temperature parameter T, quantum annealing uses a transverse-field coefficient λ(t), called the tunneling coefficient, that modulates the two terms of the Hamiltonian as in Equation (1).
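The transformation between the two formulations can be sketched as follows; we assume the convention x_i = (1 + s_i)/2 and verify on a small instance that the two energies agree (including the constant offset produced by the change of variables):

```python
def qubo_to_ising(Q):
    """Map x^T Q x (upper-triangular dict) to an Ising energy
    E(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j + offset,
    using x_i = (1 + s_i)/2."""
    h, J, offset = {}, {}, 0.0
    for (i, j), q in Q.items():
        if i == j:
            # q x_i = q/2 + (q/2) s_i
            h[i] = h.get(i, 0.0) + q / 2
            offset += q / 2
        else:
            # q x_i x_j = q/4 + (q/4)(s_i + s_j) + (q/4) s_i s_j
            J[(i, j)] = J.get((i, j), 0.0) + q / 4
            h[i] = h.get(i, 0.0) + q / 4
            h[j] = h.get(j, 0.0) + q / 4
            offset += q / 4
    return h, J, offset

# Consistency check: QUBO and Ising energies must agree for every x
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}
h, J, off = qubo_to_ising(Q)
for x0 in (0, 1):
    for x1 in (0, 1):
        qubo_e = Q[(0, 0)] * x0 + Q[(1, 1)] * x1 + Q[(0, 1)] * x0 * x1
        s0, s1 = 2 * x0 - 1, 2 * x1 - 1
        ising_e = h[0] * s0 + h[1] * s1 + J[(0, 1)] * s0 * s1 + off
        assert abs(qubo_e - ising_e) < 1e-12
print("energies match")
```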
Quantum annealing uses the term H_0, usually referred to as the transverse-field Hamiltonian, which does not commute with the term H_1 in which the optimization problem is encoded. The non-commutativity of the two terms provides the fluctuations necessary to exploit quantum tunneling, and allows the system to escape from local minima by tunneling through hills in the solution landscape [42]. Typically, when the quantum hardware is used, the system is in a superposition of all possible states, with probability amplitudes depending on H(t).
Quantum annealers usually have a certain number of qubits, which are connected according to a given topology. Logically, each QUBO variable is mapped to a qubit. Physically, it may happen that two variables linked by a quadratic term J_{i,j} are mapped to qubits that are not connected. This requires finding a minor embedding [46], i.e. a mapping of variables to physical qubits such that variables bound by quadratic terms are placed on adjacent qubits. When this is not possible, additional qubits are required to represent a single variable. For this reason, each variable is assigned one logical qubit, but this can be mapped to more than one physical qubit. The task of finding a minor embedding consists in searching for a minor of the graph associated with the hardware topology that is isomorphic to the graph associated with the QUBO problem. Minor embedding is currently solved using heuristics, available in the D-Wave Ocean library.
It may happen that the physical qubits associated with a single logical qubit are measured in different states: some in s = −1 and others in s = +1. This phenomenon is called a chain break, and it is resolved in post-processing using either a majority-vote technique (the logical qubit takes the most frequent value among the physical-qubit assignments) or energy minimization (the logical qubit takes the value that minimizes the energy; this requires multiple evaluations of the formulation).

C. Variational quantum algorithms

We can use the QUBO formulation to solve the GED problem also on gate-based quantum computers, via variational hybrid algorithms, which rely on both classical and quantum resources.
We define a parametric quantum circuit (PQC) that depends on an ansatz with parameters θ = (θ_1, ..., θ_M). The number of parameters M depends on the architecture of the circuit and on the number of qubits n (which equals the number of variables of the QUBO problem). For example, we can choose rotation gates, which naturally depend on a set of rotation angles. As a shorthand, we denote the composition of the unitary gates of the PQC by U(θ), and the final state of the circuit by U(θ)|0⟩^⊗n = |Ψ(θ)⟩. Then we compute the expectation value of the problem Hamiltonian H_C:

E(θ) = ⟨Ψ(θ)| H_C |Ψ(θ)⟩.    (13)

The Hamiltonian H_C is related to the QUBO formulation by Equation (12). The value E(θ) is the cost function to be minimized, and the minimization stage is performed classically. Examples of this kind of algorithm are the variational quantum eigensolver (VQE) [19] and the quantum approximate optimization algorithm (QAOA) [20]. They are mostly used to find the ground state of the Hamiltonian of non-integrable spin systems [47], which is indeed a minimization task.
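The expectation-value pipeline can be illustrated on a single qubit with a toy Hamiltonian H_C = σ^z and a single RY rotation as ansatz (a grid search stands in for the classical optimizer; all names here are ours):

```python
import numpy as np

Z = np.diag([1.0, -1.0])          # toy problem Hamiltonian on one qubit

def ry(theta):
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def energy(theta):
    """E(theta) = <psi(theta)| H_C |psi(theta)> with |psi> = RY(theta)|0>,
    which evaluates to cos(theta) for this ansatz."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return float(psi.conj() @ Z @ psi)

# Coarse classical outer loop over the variational parameter
thetas = np.linspace(0, 2 * np.pi, 201)
best = min(thetas, key=energy)
print(round(energy(best), 3))   # -> -1.0, reached near theta = pi
```

In the actual experiments the expectation value is estimated from circuit measurements and the outer loop is a classical optimizer such as COBYLA.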
The Variational Quantum Eigensolver (VQE) is inspired by the Variational Principle [19] and has found its most groundbreaking applications in chemical-physics simulations [48-50]. Depending on the nature of the problem under investigation, many different ansätze can be used [51]. The circuit implementing it is shown in Figure 4, and its construction is detailed in Algorithm 3 in Appendix A.
FIG. 4: Circuit of the variational ansatz used for the VQE algorithm, defined on n = 3 qubits. The structure can be repeated p ≥ 1 times, and the number of parameters θ is 2pn.
The Quantum Approximate Optimization Algorithm (QAOA), recently introduced in [20,23,24,52], is an application of VQE in which the ansatz must respect a particular structure that depends on the Hamiltonian defining the problem.

V. EXPERIMENTAL SETUP
We consider a set of graphs with a number of vertices ranging from three to nine, randomly generated following the procedure explained in [53]. The procedure takes as input the number of vertices n and the probability p of generating each edge of the graph. For every configuration, labelled by the number of vertices n ∈ {3, 4, ..., 9}, we generate four graphs G_1, G_2, G_3, G_4 with edge probabilities p_1 = 0.1, p_2 = 0.33, p_3 = 0.66, p_4 = 0.99. This means that graph G_1, generated with probability p_1, will contain a small number of edges, while graph G_4 is almost always a complete graph.
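A generator of this kind can be sketched as follows (our own helper; the exact procedure of [53] may differ in its details):

```python
import random

def random_graph(n, p, seed=0):
    """Each of the n(n-1)/2 possible undirected edges (no self-loops)
    is included independently with probability p, as in a G(n, p) model."""
    rng = random.Random(seed)
    return {frozenset((u, v))
            for u in range(n) for v in range(u + 1, n)
            if rng.random() < p}

g_sparse = random_graph(9, 0.1, seed=1)
g_dense = random_graph(9, 0.99, seed=2)
print(len(g_sparse), len(g_dense))   # the dense graph is near-complete (max 36 edges)
```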
We then compute the exact GED(G_i, G_j), with i, j = 1, 2, 3, 4, for each number n of vertices by means of the A* algorithm [39]. Note that this is feasible only for small graphs, due to the exponential worst-case complexity of the algorithm. Each computational run has four inputs: the first graph G_i, the second graph G_j, and the parameters λ_h, λ_s of Equation (8). We iterate the runs over all possible pairs of graphs with the same number of vertices, including the pair of a graph with itself.
The formulation shown in Equation (8) requires setting values for the parameters λ_h, λ_s > 0. Experimental results, obtained by running SA on all pairs of values λ_h = 1 and λ_s ∈ {1/i | i = 1, ..., 10} ∪ {0.05, 0.01} on graphs of up to 9 vertices, show that the best value of λ_s strictly fulfilling Equation (9) would be λ_s < λ_h/81 ≈ 0.012. Our choice of λ_h = 1 and λ_s = 0.1 is a compromise between the need to fulfill Condition (9), which calls for small λ_s, and the need to avoid values λ_s ≈ 0 that are so low as to make the soft constraint irrelevant.
We have run SA 1000 times and kept the lowest-energy solution x̃. For each run we have computed the exact GED s and the approximate one s̃. As the absolute error |s − s̃| grows with the problem size, we have introduced another quantity, the relative difference ∆, defined in Equation (14).

The experimental setup is based on state-of-the-art, freely available software. This approach facilitates reproducibility and avoids 'reinventing the wheel' issues, e.g. bugs and non-standard implementations. As argued in [54], reproducibility is a serious issue in quantum science. We have addressed the problem by releasing the code at github.com/incud/GraphEditDistance. However, the quantum-annealer-based experiments require access to D-Wave machines, which is available through the cloud.

A. Classical approaches
The exact calculation of GED is performed through the A* algorithm [39] implemented in NetworkX [55], while as heuristic we use the SA implementation provided by the D-Wave Ocean SDK.
The control parameters in this implementation are set as follows. The temperature is initialized at T_start = M/log 2 and decreases exponentially to T_end = m/log 100, where M and m are an upper bound and a lower bound for the solutions, respectively. The smallest absolute value among the entries of the QUBO matrix is the lower bound m; the sum of the absolute values of the entries of the QUBO matrix is the upper bound M.
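Under our reading of this schedule (with log denoting the natural logarithm), the temperature range can be computed from a QUBO matrix as follows:

```python
import math

def sa_temperature_range(Q):
    """Assumed reading of the schedule described in the text:
    T_start = M/ln 2 and T_end = m/ln 100, where m is the smallest and
    M the sum of the absolute values of the nonzero QUBO entries."""
    entries = [abs(w) for w in Q.values() if w != 0]
    m, M = min(entries), sum(entries)
    return M / math.log(2), m / math.log(100)

Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}
t_start, t_end = sa_temperature_range(Q)
print(round(t_start, 3), round(t_end, 3))   # roughly 5.771 and 0.217
```

The intuition is that at T_start even the largest possible uphill move is accepted with probability at least 1/2, while at T_end the smallest uphill move is accepted with probability at most 1/100.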
Another implementation of SA is provided by the GEDLib library [56] and its Python wrapper GEDLibPy [57]. The comparison between this implementation and the one provided by D-Wave is reported in Appendix B and shows that the results are comparable. We have also tested other state-of-the-art heuristics that are not based on the QUBO formulation; in general, SA is the best performing (see Appendix B). However, we rule out these heuristics from our comparison with the quantum algorithms and restrict ourselves to methods specifically designed for QUBO problems.

B. Quantum annealing approaches

D-Wave 2000Q has 2041 qubits and uses the Chimera topology, with 5974 couplers (physical connections between qubits) [58]. D-Wave Advantage 1.1 has 5436 qubits and uses the Pegasus topology, with 37440 couplers [59]. Chain breaks are resolved using majority vote, which is the cheapest technique and the one suggested by the hardware vendor. D-Wave Leap Hybrid Solver is not a quantum annealer but a hybrid classical-quantum software whose configuration is fully managed by D-Wave and is not customizable.
Both D-Wave 2000Q and Advantage allow us to set the configuration of the annealing process, controlled by a parameter s that grows monotonically from 0 to 1. The parameter is defined as s = t/τ , where τ is the annealing time. For a specified annealing time the evolution proceeds linearly in t.
Due to the enhanced topology, the D-Wave Advantage hardware is expected to yield more compact embeddings [60].
Each GED computation runs on both machines under several configurations of the annealing schedule. According to [61], the performance of pausing within the annealing schedule depends uniquely on the length of the pause (longer is better) and is independent of the annealing process before and after the pause.
The number of runs is the number of times the annealing is restarted. We have tried each configuration with a different number of runs: ST with 10^4, DC with both 10^3 and 10^4, and the others with 10^3, keeping only the lowest-energy result.
We expect the long-time configuration to outperform those with shorter annealing times, thanks to the possibility of exploring the energy landscape more widely. However, for longer times, decoherence effects due to noise may arise, over which there is still no direct control. This increases the probability of errors.
We also expect the PM configuration to outperform the default configuration, given previous evidence suggesting that pausing the schedule in the middle of the process improves performance [61-63].

C. Variational Quantum Algorithms
We used the implementation of VQE and QAOA available on the Qiskit IBM platform [64].
FIG. 5: Scheme of the QAOA circuit for n = 3 qubits. The variational ansatz is repeated twice. The number of parameters is fixed to 2p (and does not depend on n).
We briefly recall the construction of the circuit implementing QAOA as in [20]; in Figure 5 we show the circuit used to tackle the GED problem. The circuit takes as input the state |+⟩^⊗n, where n is the number of variables of the QUBO problem. This state is obtained from |0⟩^⊗n by applying a Hadamard gate to each qubit. The QAOA ansatz is constructed by repeating p times two unitary operators, U_C(γ) and U_B(β). The whole ansatz depends on 2p parameters γ_1, β_1, ..., γ_p, β_p ∈ [−π, π]. The first unitary operator is defined as U_C(γ) = exp{−iγH_C}, where H_C is the problem Hamiltonian and depends on our input; the mapping between the QUBO formulation and the Hamiltonian formulation is detailed in Equation (12). The second is the mixing unitary U_B(β) = exp{−iβH_B}, with H_B = Σ_i σ^x_i. The expectation value represents the energy (cost) associated with a particular choice of parameters and must be minimized:

E(γ, β) = ⟨Ψ(γ, β)| H_C |Ψ(γ, β)⟩,    (15)

which is Equation (13) rewritten for the QAOA case. Finally, the task of the classical optimization is to find the optimal variational parameters

(γ*, β*) = argmin_{γ,β} E(γ, β).    (16)

We perform 2048 runs with randomly initialized parameters; the classical optimization is performed using the COBYLA algorithm [65]. The construction of the QAOA ansatz is detailed in Algorithm 4 in Appendix A.
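A statevector sketch of a depth-one QAOA on a two-variable toy cost (not the GED instance, and independent of Qiskit; all names are ours) illustrates the alternation of U_C and U_B:

```python
import numpy as np
from itertools import product

n = 2
# Diagonal cost Hamiltonian H_C |x> = C(x) |x>; toy cost C(x) = (x0 - x1)^2,
# minimized (value 0) by the agreeing assignments 00 and 11.
costs = np.array([float((x0 - x1) ** 2)
                  for x0, x1 in product([0, 1], repeat=n)])
H_C = np.diag(costs)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def mixer(beta):
    """U_B(beta) = exp(-i beta X) applied to every qubit (Kronecker product)."""
    u = np.cos(beta) * np.eye(2) - 1j * np.sin(beta) * X
    full = np.eye(1, dtype=complex)
    for _ in range(n):
        full = np.kron(full, u)
    return full

def qaoa_energy(gamma, beta):
    psi = np.full(2 ** n, 1 / np.sqrt(2 ** n), dtype=complex)  # |+>^n
    psi = np.exp(-1j * gamma * costs) * psi                    # U_C(gamma), diagonal
    psi = mixer(beta) @ psi                                    # U_B(beta)
    return float(np.real(psi.conj() @ (H_C @ psi)))

# Grid search stands in for COBYLA on this two-parameter, p = 1 instance
grid = np.linspace(-np.pi, np.pi, 33)
best = min(qaoa_energy(g, b) for g in grid for b in grid)
print(round(best, 6))   # -> 0.0, the ground-state energy of H_C
```

For this single-mismatch cost, depth p = 1 already reaches the exact ground-state energy; larger instances generally require deeper circuits.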

VI. NUMERICAL RESULTS
Our analysis takes into account the number of vertices n, the hardware, and its configurations. We evaluate the performance of each algorithm, illustrated in Appendix A, using three measures: the average relative difference ∆, the average success probability, and the average high-quality probability. The relative difference was defined in Equation (14); the success probability is defined as the percentage of experimental results having ∆ = 0, and the high-quality probability as the percentage of results having ∆ ≤ 0.2.
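Given the per-run relative differences, the three measures can be aggregated as in the following sketch (a hypothetical helper, assuming the values ∆ have already been computed per Equation (14)):

```python
def evaluate(deltas, threshold=0.2):
    """Aggregate per-run relative differences into the three measures:
    average Delta, success probability (Delta = 0), and high-quality
    probability (Delta <= threshold). Illustrative helper."""
    n = len(deltas)
    avg_delta = sum(deltas) / n
    success = sum(1 for d in deltas if d == 0) / n
    high_quality = sum(1 for d in deltas if d <= threshold) / n
    return avg_delta, success, high_quality
```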
All experiments use the QUBO formulation shown in Equation (8) with parameters λ_h = 1, λ_s = 0.1, which are optimal according to the preliminary analysis in Table II. This choice of values enforces the hard constraint, maximizing the chances of reaching a valid solution.

A. Resource usage
The resources exploited by the quantum annealer can give us useful information about the performance of our optimization tasks implemented over different configurations. We focus on the number of logical qubits, i.e. the number of variables of the problem, and physical qubits, i.e. the number of qubits needed to encode the problem by means of the minor embedding procedure. We consider the maximum length of the chain, that is the maximum number of physical qubits needed to encode a single logical qubit, and the chain strength that measures the strength of the interaction between physical qubits belonging to the same chain.
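For illustration, given a minor embedding represented as a map from each logical qubit to the chain of physical qubits encoding it, these resource figures can be extracted as follows (a hypothetical helper; the actual embeddings are produced by D-Wave's tooling):

```python
def embedding_stats(embedding):
    """Resource figures for a minor embedding given as a map from each
    logical qubit to the chain (list) of physical qubits representing it.
    Returns (logical qubits, physical qubits, maximum chain length)."""
    n_logical = len(embedding)
    n_physical = sum(len(chain) for chain in embedding.values())
    max_chain = max(len(chain) for chain in embedding.values())
    return n_logical, n_physical, max_chain
```

The chain strength is a separate tunable coupling, not derivable from the embedding alone: it penalizes physical qubits in the same chain that disagree.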
As shown in Table III, the number of logical qubits depends only on the size of the graphs. The other resources depend on both the topology of the quantum annealer and the quality of the software performing the minor embedding (which is the same for both versions of the hardware). It is evident that D-Wave Advantage produces much smaller embeddings; since shorter chains lead to fewer errors, we indeed obtain more accurate solutions.
We have then quantified the resources required by the variational algorithms. In Table IV we report the number of qubits, which is exactly equal to the number of variables. We also report the number of parameters to be trained by the classical optimizer, the depth, i.e. the maximum number of gates acting on a single qubit, and the size, i.e. the total number of gates (see [66] for a more formal definition of the size and depth of a circuit).
To obtain a good estimate of the resources, we transpile the circuit in terms of single-qubit rotations U3 and CNOT gates.
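The two notions of size and depth can be computed directly from a gate list, as in the following sketch (our own illustrative code, using the standard longest-dependency-chain notion of depth; each gate is represented simply by the tuple of qubit indices it acts on):

```python
def circuit_size_and_depth(gates):
    """Size = total number of gates; depth = length of the longest chain
    of gates that must act one after another on some qubit. Each gate is
    the tuple of qubits it touches (illustrative helper)."""
    size = len(gates)
    frontier = {}  # depth currently reached on each qubit
    for qubits in gates:
        # a gate starts one layer after the deepest qubit it involves
        level = 1 + max((frontier.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            frontier[q] = level
    return size, max(frontier.values(), default=0)
```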
We immediately see that in general VQE requires many more parameters than QAOA, although this depends mostly on the choice of the variational form. In general, this should lead to a longer classical optimization phase.
We were able to run experiments with graphs of up to 5 vertices, since larger instances require much more computational power for the simulation, which was not available to us. The number of required gates suggests that this approach is not feasible on NISQ hardware due to the low gate fidelity and decoherence errors. Thus, we have performed the calculations on error-free simulators.

B. Comparing Quantum Annealers
We identify the best performing configuration for each quantum annealer, and then compare their performance with the simulated annealer. Figure 6 compares the different configurations of D-Wave 2000Q and Advantage 1.1. For both quantum annealer versions we see that:
• configurations having 10000 runs perform significantly better than configurations with 1000 runs;
• the annealing time that minimizes the relative difference is τ ≥ 20µs: shorter values return inaccurate solutions, while larger values have similar performance but are costly;
• the introduction of a pause in the annealing process does not improve the performance.
In Figure 7 we compare the two versions of the quantum annealer, i.e. the D-Wave 2000Q and the D-Wave Advantage. We compared both versions in their best performing configuration, that is 20µs of annealing time and 10^4 runs, even though for some values of n other configurations might slightly outperform this one. For most instances, the performance of D-Wave Advantage is better than that of D-Wave 2000Q, although the gap between the two is quite small. The anomalous case (n = 7), in which the average values of the metrics seem to suggest that D-Wave 2000Q performs better, is explained by numerical fluctuations: once the experimental error is taken into account, as shown by the error bars, the conclusions drawn for the other cases remain valid. This case, however, slightly contrasts with our expectation of better performance from D-Wave Advantage, suggested by its more compact minor embedding (fewer physical qubits used).
Both quantum annealers perform worse than SA. Moreover, SA in general performs better than any other classical algorithm we tested (results in Appendix B). For the GED problem, we can therefore state that classical hardware currently outperforms quantum annealers.
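For reference, a minimal single-bit-flip simulated annealing routine for a QUBO instance can be sketched as follows (an illustrative version with arbitrary default parameters, not the exact implementation we benchmarked):

```python
import math
import random

def simulated_annealing_qubo(Q, n, sweeps=200, t_start=2.0, t_end=0.05, seed=0):
    """Minimize f(x) = sum Q_ij x_i x_j over x in {0,1}^n by single-bit-flip
    simulated annealing with a geometric temperature schedule (sketch)."""
    rng = random.Random(seed)

    def energy(x):
        return sum(w * x[i] * x[j] for (i, j), w in Q.items())

    x = [rng.randint(0, 1) for _ in range(n)]
    e = energy(x)
    best_x, best_e = x[:], e
    for s in range(sweeps):
        # geometric cooling from t_start down to t_end
        t = t_start * (t_end / t_start) ** (s / max(sweeps - 1, 1))
        for i in range(n):
            x[i] ^= 1  # propose flipping bit i
            e_new = energy(x)
            # Metropolis rule: always accept downhill, sometimes uphill
            if e_new <= e or rng.random() < math.exp((e - e_new) / t):
                e = e_new
                if e < best_e:
                    best_x, best_e = x[:], e
            else:
                x[i] ^= 1  # reject: undo the flip
    return best_x, best_e
```

The production-grade samplers we compared against use more refined schedules and incremental energy updates, but follow the same Metropolis scheme.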
It is important to notice that the hybrid annealer D-Wave Leap has performance close to that of SA, though it never outperforms it. The significance of this finding is weakened by the opaque internal workings of D-Wave Leap, which might perform SA itself.
Finally, we have identified which configuration offers the best tradeoff between quality of the solution and total annealing time. This observation is relevant because the cost of using a quantum annealer is proportional to the amount of machine time used.
We measure the time to solution (TTS) as

TTS = (number of runs × annealing time per run) / (high-quality probability)

for both D-Wave 2000Q and D-Wave Advantage. It measures how much time in seconds is required on a quantum annealer to find a high-quality solution. Different definitions of TTS can be found in the literature [67, 68], all representing the same concept. The results are shown in Table V. Our experiments show that the shortest annealing time combined with a large number of runs gives the best tradeoff in terms of TTS. For n = 9 vertices no experiment found a high-quality solution, thus it was not possible to estimate this probability.

C. Comparing variational algorithms

Figure 8 shows the comparison of the performance of VQE, QAOA, and SA. Considering the current state of NISQ devices, we have tested the variational approach on instances of GED with graphs having at most 5 vertices (thus our circuit uses 25 qubits). For these small dimensions, the classical approach leads to much better solutions. In particular, no variational approach is able to find an exact solution for instances having n ≥ 4 vertices.
Increasing the number of repetitions may marginally improve the performance: VQE with p = 3 outperforms VQE with p = 1 for any choice of n, while for QAOA the configuration with p = 1 performs equal to or better than the one with p = 3. Theoretically, in the limit p → ∞, the probability of success is guaranteed to be 1 (see [20]). However, increasing the number of layers becomes very costly in terms of the evaluation of the expectation value of the cost Hamiltonian. In fact, when no a priori symmetries are known, the parameter search space scales as [−π, π]^p × [−π, π]^p.

FIG. 8: Performance of variational algorithms to compute the GED for graphs with n vertices. In (8a) we plot the mean relative difference, in (8b) the success solution probability, and in (8c) the high-quality solution probability. The error bars represent the standard deviation. The acronyms refer to the different algorithms as follows: V1 is VQE with p = 1, V3 is VQE with p = 3, Q1 is QAOA with p = 1, and Q3 is QAOA with p = 3.

To gain insight into the low performance of the QAOA algorithm, we consider the energy landscape of the Hamiltonian associated with the QUBO formulation of the GED of two graphs having four vertices, with p = 1. For the case considered, the Hamiltonian H_C has a discrete spectrum with 88 eigenenergies, and its ground state is degenerate. We report the energy landscape in Fig. 9, where it is clear that the Hamiltonian does not have a unique global minimum, leading to low performance of the optimization technique provided by the QAOA for the QUBO formulation of the GED problem.
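For instances of this size, the spectrum and the degeneracy of the ground state can be checked by brute-force enumeration, as in the following sketch (our own illustrative code, feasible only for small n):

```python
from itertools import product

def qubo_spectrum(Q, n):
    """Group all 2^n bitstrings by their QUBO energy, returning the
    spectrum as {energy: [bitstrings]} sorted by increasing energy.
    Brute force, so only usable for small n (illustrative helper)."""
    spectrum = {}
    for bits in product((0, 1), repeat=n):
        e = sum(w * bits[i] * bits[j] for (i, j), w in Q.items())
        spectrum.setdefault(e, []).append(bits)
    return dict(sorted(spectrum.items()))
```

The number of distinct keys gives the number of eigenenergies of the diagonal Hamiltonian H_C, and a ground level with more than one bitstring signals the degeneracy discussed above.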

VII. CONCLUDING REMARKS
Complex systems are nowadays ubiquitous in science. Very useful models for these systems are typically defined by representing them as graphs, i.e. collections of pairwise connected nodes. The nodes constitute the elementary 'units' of the problem, while the edges take into account their interaction. Each edge can also be associated with a label representing some quantitative or qualitative information. This versatility of the graph structures makes them a powerful tool in the most heterogeneous fields of research.
In this paper we have addressed the problem of quantitatively estimating the degree of similarity between a pair of graphs via the Graph Edit Distance. Computing GED is a task that requires the exploration of a space of solutions that is exponential in the size of the input graphs. It is therefore reasonable to consider approximate approaches to the problem, which are able to achieve acceptable approximations of the exact solution. We have thus investigated whether the resources offered by the quantum hardware currently available are plausible candidates to tackle this task. Our proof-of-principle analysis had a twofold aim: on the one hand, we have shown how to practically implement the computation of GED on both a quantum annealer and a gate-based quantum computer; on the other, we have compared the results of running the same GED algorithm on the two types of hardware. Based on our results, the quantum annealer seems today to be the better platform for optimization problems written in QUBO form.
We remark that, concerning the variational approach, we have used noiseless simulators to gain a first insight into the feasibility of evaluating the GED with NISQ devices. The development of gate-based quantum hardware, as well as of the specific software for optimization algorithms, is still in its infancy. In this work we have reported preliminary evidence that the main variational algorithms may not be suited for problems of the type addressed in this paper. Overall, we are confronted with two contrasting aspects of the currently available resources. On one hand, classically simulating a quantum computer is expensive, although it does not suffer from problems related to quantum physical resources, namely qubits and unitary gates acting for a sufficiently long time [69]. On the other hand, these quantum computational resources are so limited as to make it impossible to exploit their full potential. However, we surmise that these detrimental effects will be overcome once quantum hardware with enough resources to support error-correction schemes, as a barrier against noise, becomes available. Another problem with variational algorithms is the need to guess the right encoding of the problem into the parameterized cost function evaluated on the quantum computer. This is a challenging but crucial task for the success of the solution scheme, together with the choice of the best classical optimizer for the parameter training phase.
In this direction, a Quantum Natural Gradient Descent algorithm was recently proposed [70, 71], which seems to offer enhanced performance by exploiting statistical information about the quantum circuit through geometrical methods based on the Quantum Fisher Information [72]. Another seemingly promising strategy is to combine the two approaches studied in this paper, namely using quantum annealing to obtain preliminary results that can then be used to initialize a variational algorithm, as shown for the case of the QAOA in [73].
We believe that the algorithms and the benchmarking of quantum annealers and gate-based quantum computers that we have presented in this paper can also be exploited for machine learning tasks. In fact, the algorithms we have devised in our implementations could be profitably incorporated in machine learning algorithms that deal with graph data, to obtain quantum algorithms that are more efficient than the standard classical machine learning strategies.