1 Introduction

Quantum technology has experienced rapid development in the past decades and has the potential to solve some classically intractable problems. Its contributions are still in the early stage, as current so-called Noisy Intermediate-Scale Quantum (NISQ) devices can only handle simple, small-sized algorithms considering they are limited by size and noise. They also encompass additional hardware constraints such as low qubit connectivity, reduced supported gate set, and limitations related to classical-control resources, which makes it even more difficult to execute a quantum circuit on these processors successfully.

Quantum algorithms, usually represented as quantum circuits, are hardware-agnostic; that is, when described, they do not consider hardware restrictions. To execute such algorithms (quantum circuits) on a quantum processor, they must be modified to fulfill the processor’s limitations through a process called quantum circuit mapping. The quantum circuit mapper, which is part of the compiler, is then at the core of the full-stack quantum computing system, connecting algorithms with quantum devices (Bandic et al. 2022).

Various techniques have been proposed to deal with the mapping of quantum circuits (Li et al. 2019; Murali et al. 2019a; Tannu and Qureshi 2019; Li et al. 2020; Zulehner et al. 2018; Venturelli et al. 2019; Lao et al. 2019a, b; Herbert and Sengupta 2018), which differ in approach (exact or heuristic, local or global solution), methodology (e.g., SMT solver (Lye et al. 2015)), cost functions (optimizing number of gates or circuit depth), and performance metrics (e.g., circuit fidelity). These solutions, however, adopt a bottom-up approach, developing mappers specifically for certain quantum processors and technologies. The majority of quantum circuit mapping techniques have mostly focused on hardware properties (Tannu and Qureshi 2019; Lao et al. 2022) and only considered a rather limited set of algorithm characteristics such as number of qubits, number of quantum gates, two-qubit gate percentage, and qubit interactions (i.e., what pair of qubits perform a two-qubit gate). In addition to this, when mapping outcomes are analyzed, the focus is on the values of the obtained metrics without further evaluating why some circuits show higher or lower overheads. Some works have already pointed out the importance of including more algorithm features in the mapping process (Lao and Browne 2021a). A more complete and in-depth profiling of quantum circuits will help to (i) have a deeper understanding on why specific algorithms have higher fidelity than others when being run on a particular processor using a specific mapping technique; (ii) to categorize (cluster) quantum circuits based on those parameters and predict the performance of additional circuits with similar properties in terms of mapping-related metrics, without actually running them on a given device; and (iii) to develop application-driven and hardware-aware mapping techniques (i.e., mapping techniques tailored for a specific set of algorithms in addition to overcoming hardware constraints) (Bandic et al. 2022; Li et al. 2021a; Lao and Browne 2021b). Note that more broadly, this characterization of quantum circuits will be also crucial for defining a meaningful and complete set of quantum benchmarks to evaluate not only quantum circuit mapping techniques but also full-stack quantum computing systems as well as for having a set of algorithm-level metrics to measure system performance (Tomesh et al. 2022).

One of the most stringent quantum hardware constraints that quantum circuit mapping techniques have to deal with is the limited connectivity of physical qubits, which restricts possible interactions between them. Therefore, in this paper, we propose to extend the profiling of quantum circuits/algorithms by not only extracting “standard” parameters like the number of qubits and gates and percentage of two-qubit gates, but also by performing a deeper analysis of their qubit interaction graphs (i.e., representation of the two-qubit gates or qubit interactions of the circuit). By taking input from graph theory and machine learning, we characterize quantum circuits based on their interaction graph metrics (e.g., average shortest path, connectivity, clustering coefficient). We then map those quantum circuits into several quantum processors using a specific quantum circuit mapping technique. In future work, we will also use different quantum circuit mapping configurations, allowing us to evaluate what quantum circuit features impact the circuit mapping performance the most and identify what combination of mapping technique-quantum hardware works better for a given (set of) algorithm(s). Note that this analysis can in the future help in the codesign of algorithm-driven compilation methods and quantum hardware.

In addition, we present a categorized and, as of now, the most comprehensive set of quantum algorithms (benchmarks) from various sources and platforms and in different quantum programming languages. Most of the currently existing and used quantum algorithms, synthetically generated and application-based circuits are included in this collection and classified based on different criteria. We are hoping that this algorithms/circuits set will be used for benchmarking quantum computing systems as well as parts of it, such as compilation techniques.

The main contributions of this work are as follows:

  1. 1.

    We have performed the first characterization and clustering of quantum circuits that also considers qubit interaction graph parameters in addition to the characteristics related to circuit size (number of gates, number of qubits, amount of two-qubit gates). In-depth profiling and clustering of quantum circuits based on their more structural parameters help to analyze why and when some (families of) quantum algorithms show better performance compared to the rest when being executed on a given quantum processor, as well as which circuit parameters have a higher impact on performance for some hardware-compiler setups. Subsequently, that can also help to predict the mapping performance for additional circuits with similar properties, without actually running them on a given device, and therefore assist in recommending an adequate mapper and hardware configuration to use. Finally, this circuit structural parameters analysis step is crucial for the development of future application-based quantum devices and mappers.

  2. 2.

    We have found that quantum circuits similarly structured in terms of their interaction graph parameters will have comparable results in terms of circuit fidelity and gate overhead when mapped on the same quantum device and by using the same mapping technique. By running these groups of circuits with different hardware configurations, we could make clear suggestions on which group of circuits fits which hardware better.

  3. 3.

    We provide the so-far most comprehensive collection of quantum benchmarks, open-source and available in most currently used high- or low-level quantum languages. The goal is to help the quantum community speed up the research process and in the development of a full-stack quantum system by having an easily accessible, all-in-one-place set of benchmarks that can be used for analyzing the performance of existing and future quantum processors and compilation methods.

The paper is organized as follows: Section 2 presents a short introduction to full-stack quantum computing systems and an overview of the current state-of-the-art quantum circuit mapping techniques as well as benchmark characterization. Section 3 introduces our profiling of quantum algorithms and their clustering based on size and structure. The experimental setup with the details of our benchmark collection is included in Section 4. Section 5 showcases the obtained results on how the mapping performance of quantum circuits when run on a specific chip relates to their structural parameters acquired from the analysis of their interaction graphs and their clusters from Section 3. Finally, in Sections. 6 and 7, conclusions and future work are presented.

2 Background and related work

2.1 Quantum computers nowadays

Quantum hardware has significantly progressed since its inception, and a wide variety of technologies has been developed for implementing qubits like solid-state spins, trapped-ion qubits, or superconducting qubits (Resch and Karpuzcu 2019). Hardware characteristics like the number of qubits and gate fidelity are continuously improving. However, current NISQ devices are still immensely resource-constrained and error-prone. They are not able to keep up with the development of promising quantum algorithms, that might achieve exponential speed-up, as they lack in size (number of qubits), which is required for the implementation of fault-tolerant and error-corrected techniques. Therefore, it was inevitable to develop a set of algorithms that could be successfully executed on current processors, coming from different fields like quantum physics, chemistry, or machine learning (Bharti et al. 2022).

Quantum compilers act like intermediaries between algorithms (expressed as quantum circuits) and quantum processors. They not only translate high-level programming language instructions (e.g., library Qiskit given in Python (Anis et al. 2021)) into low-level ones (quantum assembly-like language, e.g., OpenQASM (Li 2019)), but are also responsible for making transformations and optimizations of the quantum circuit to best fulfill the quantum hardware requirements. The compiler design and complexity highly depend on the constraints imposed by the hardware and chosen technology. In nearest-neighbor architectures (e.g., 2D array of qubits), the primary constraint is the limited connectivity among qubits. As running two-qubit gates requires that the paired qubits are adjacent on the chip, restricted connectivity can become a huge obstacle. The compiler tries to overcome that and other limitations and helps to successfully execute a quantum circuit on a given quantum device through a process called mapping. Note that the mapping of quantum circuits usually results in a gate and latency overhead that in turn decreases the circuit fidelity. Therefore, having efficient mapping techniques is crucial in the NISQ era not only to successfully execute quantum algorithms but also for extracting the most out of constrained NISQ devices.

2.2 Computing with NISQ devices

One of the motivations for building quantum computers in the first place is to run algorithms that solve problems that are intractable for existing classical computers due to limitations in speed and memory. Current NISQ devices can only handle simple algorithms, in terms of the number of qubits and gates and circuit depth, as the presence of noise and limited resources (physical qubits) still constrain them: quantum operations have high error rates and qubits decohere over time resulting in information loss. On top of that, running an algorithm on a NISQ device is not a straightforward process. That is due to hardware constraints that affect the algorithm execution, which can vary between quantum technologies.

One of the restrictions that affects the execution of a quantum algorithm the most is (limited) qubit connectivity. That applies to most technologies, including superconducting qubits and quantum dots, where qubits are arranged in a 2D grid or some other not-fully connected topology, as shown in the top-right part of Fig. 1, allowing only nearest-neighbor interactions. In order to perform a two-qubit gate in such architecture, the two interacting qubits in the circuit have to be placed in neighboring physical qubits on the chip, which is not always possible (see Fig. 1: two two-qubit gates between virtual qubits 1 and 5, and 5 and 6 cannot be directly performed because they do not share a physical connection in the coupling graph). Other constraints that have to be considered are (i) primitive gate set—the gates of the circuit to be executed do not always match the native gate set (supported gates) of the quantum chip. For instance, to run the quantum circuit shown in Fig. 1 on the Surface-17 chip (Lao et al. 2022), its CNOT gates would have to be decomposed into X and Y rotations and CZ-gate supported by the device; (ii) classical control constraints—shared electronics help to scale up quantum systems but may limit parallelization of quantum operations during circuit execution. The process of accommodating these requirements imposed by the quantum hardware to efficiently execute a quantum algorithm is called quantum circuit mapping.

Fig. 1
figure 1

Running a quantum circuit on a 7-qubit quantum processor. a Interaction graph \({G}_{i}({V}_{i},{E}_{i})\) of the circuit shown below; nodes \({V}_{i}\) represent virtual qubits, and edges \({E}_{i}\) show interactions between qubits (i.e., 2-qubit gates). b, c The chip’s coupling graph \({G}_{c}({V}_{c},{E}_{c})\); nodes \({V}_{c}\) represent physical qubits, edges \({E}_{c}\) show connections on the chip (i.e., possible two-qubit interactions). d Circuit qubits (\(qi\in {V}_{i}\)) are mapped onto physical qubits (\(Qi\in {V}_{c}\)). e An extra SWAP gate is required to be able to perform all CNOT gates

The quantum circuit mapping process consists of the following steps (not mandatory in this order): (1) Adapting the gate set of the circuit to the gates supported by the device; (2) Scheduling quantum operations (qubit initialization, gates, and measurements) of the circuit to leverage its parallelism and therefore shorten the execution time; (3) Placing virtual qubits (of the circuit) onto physical qubits (on the actual chip) so that the previously mentioned nearest-neighbor two-qubit-gate constraint is satisfied as much as possible during algorithm execution; and (4) Routing or exchanging positions of virtual qubits on the chip such that all qubits that could not initially interact become adjacent and perform their corresponding two-qubit gates (Fig. 1). This is done by inserting additional quantum gates. How routing is performed and which gates are inserted is technology-dependent with various existing methods (SWAPs, Shuttling). Therefore, the resulting after-mapping circuit will in most cases have more gates and a longer execution time than originally. Due to the previously mentioned highly-erroneous quantum operations and qubit decoherence, the overhead in terms of number of gates and circuit depth caused by the mapping should be minimal as it ultimately impacts the algorithm fidelity.

Various approaches have been proposed to solve the circuit mapping problem, each using different methods and strategies. Some solutions are optimal (exact), but work in a brute-force style and are thus only suitable for small circuits (Zulehner et al. 2018; Lye et al. 2015; Siraichi et al. 2018). For larger circuits and to allow for scalability, heuristic solutions are a better fit (Li et al. 2019; Lao et al. 2022; Wille et al. 2016; Guerreschi and Park 2018). Some methods proposed by related works include the use of SMT solvers (Murali et al. 2019a; Lye et al. 2015), greedy heuristic (Li et al. 2019; Zulehner et al. 2018; Dousti and Pedram 2012; Bahreini and Mohammadzadeh 2015), and machine learning-based algorithms (Herbert and Sengupta 2018; Venturelli et al. 2018; Pozzi et al. 2020). These solutions all focus on the “routing” part of the mapper. In addition to this, it is possible to deal with the mapping problem by optimizing its other stages like scheduling (Lao et al. 2022; Guerreschi and Park 2018), gate transformation (Pozzi et al. 2020; Guerreschi 2019; Itoko et al. 2020; Tan and Cong 2021), or initial placement (Tannu and Qureshi 2019; Jiang et al. 2021; Li et al. 2021b).

Different metrics are being used to assess the performance of the quantum circuit mapping technique depending on the cost function: some works have the goal of minimizing the number of gates or gate overhead (e.g., number of additional SWAP gates) (Zulehner et al. 2018; Lao et al. 2019a, 2022; Itoko et al. 2020; Tan and Cong 2021; Li et al. 2021b; Bandic et al. 2020; Hillmich et al. 2021), some prioritize low circuit depth or latency (circuit execution time) (Zulehner et al. 2018; Lao et al. 2019a, 2022; Pozzi et al. 2007; Tan and Cong 2021; Bandic et al. 2020), and finally, some focus on the success rate of the circuit (Jiang et al. 2021; Blume-Kohout and Young 2020) and maximizing fidelity (Murali et al. 2019a; Tannu and Qureshi 2019; Tan and Cong 2021) by also considering the different error rates of the quantum device. Note that the overall goal in the current NISQ era is to maximize the fidelity and success rate of quantum circuits, which currently mostly depends on the gate and circuit depth overhead. Figure 2 shows the impact of the number of gates and the gate overhead on the circuit fidelity. However, as shown in Fig. 1, not all the circuits end up with the same decrease in fidelity for the same or similar gate overhead. Note that the circuit fidelity is close to 0% for any circuit with more than 500 gates (Fig. 2a). In addition, a gate overhead of over 200% after mapping leads, in most cases, to a 100% fidelity decrease (Fig. 2b).

Fig. 2
figure 2

a Circuit fidelity vs. the number of gates. b Gate overhead (%) and decrease in fidelity. Synthetically generated circuits are marked with orange circles and real ones (i.e., quantum algorithms and routines) with blue squares. Here, only circuits with up to 500 gates were used

These approaches all have in common that they are designed to adapt quantum circuits to the device-specific properties and constraints considering only a reduced set of algorithm properties such as gate and qubit count and two-qubit gate percentage (including qubit interactions). A more in-depth quantum circuit characterization, which for instance could include characteristics of the qubit interaction graph like the number of times each pair of qubits interacts and the distribution of those interactions among the qubits, and of the quantum instruction dependency graph (i.e., graph that represents the dependencies between gates in the circuit and used for scheduling) is still missing. Looking further into interaction graphs is very beneficial for the quantum circuit mapping process, as, like stated before, the most stringent constraint of current quantum hardware is its limited qubit connectivity. Some authors have already pointed out the importance of including application properties (Bandic et al. 2022; Li et al. 2020; Lubinski et al. 2021; Mills et al. 2021) and considering the characteristics of the qubit interaction graphs for improving the mapping of quantum circuits (Bandic et al. 2020; Steinberg et al. 2022). Even in classical computing, we notice that different computing resources are necessarily based on what we use the computers for and which applications are executed. For instance, a dedicated GPU can be used for highly parallelizable processes. Likewise, thorough profiling can help to identify which algorithm characteristics are required to execute it successfully on a given device and vice versa. The structural properties of quantum circuits can also help understand why specific algorithms show better success rates than others when being run on a particular processor using a specific mapping technique.

3 Profiling of quantum circuits based on qubit interaction graphs

This section provides an overview of the qubit interaction graph-based benchmark profiling and clustering process, emphasizing why this could be meaningful for improving future quantum circuit mapping techniques.

3.1 On the importance of qubit interaction graphs for quantum circuit mapping

Qubit interaction graph \(G(V,E)\) is a graphical representation of the two-qubit gates of a given quantum circuit. It is in general a directed connected graph. Figure 1 shows an example of a quantum circuit (Fig. 1d) along with its interaction graph \({G}_{i}({V}_{i},{E}_{i})\) representation (Fig. 1a). Directed edges \({E}_{i}\) represent two-qubit gates, and nodes \({V}_{i}\) are the qubits that participate in them. Since the direction of edges in most cases does not influence the execution of the gates, it is sufficient to perceive the interaction graph as undirected for the mapping problem (A quadratic unconstrained binary 2023). If a circuit comprises multiple two-qubit gates between pairs of qubits, it results in a weighted graph (like in Fig. 3), which shows how often each pair of qubits interacts and how those interactions are distributed among qubits.

Fig. 3
figure 3

Interaction graphs of circuits with the same size-related parameters: no. of qubits = 6, no. of gates = 456, two-qubit gate percentage = 0.135

This additional information can be leveraged to provide more insights into a circuit structure that is otherwise hidden when only considering standard algorithm parameters such as the number of qubits and gates and two-qubit gate percentage. To illustrate this, Fig. 3 shows the interaction graphs of two quantum algorithms, an instance of QAOA and a randomly generated circuit (on the right), which a priori are similar when only characterized in terms of the three common algorithm parameters. What can be noticed is that their qubit interaction graph structure is quite different: the graph of the random circuit is more complex with full connectivity and presents a different distribution of the interactions between qubits, that is, of the weights. This will result in more routing and, therefore, higher overhead, unless we indeed have a fully connected coupling graph of the processor (Section 5).

This shows the importance of quantum circuit structure when developing mapping techniques and the necessity of characterizing the circuits in terms of their qubit interaction graphs. A few works have already pointed out how interaction graph along with quantum instruction dependency graph can be used as a baseline for designing better mapping techniques (Li et al. 2019; Lao et al. 2022; Baker et al. 2020; Bandic et al. 2023). In those works, gate dependency graphs are used as core information for scheduling optimization and look-ahead techniques, whereas interaction graphs are usually only used for the initial placement of qubits of the routing procedure. Considering that the primary constraint affecting the fidelity of the circuit execution is nearest-neighbor connectivity required for performing two-qubit gates, it would be valuable to know in advance how they are distributed among qubits and not only their quantity.

In this paper, we perform profiling of quantum circuits by focusing on interaction-graph properties and their relation to quantum circuit mapping. To that purpose, we took input from graph theory and analyzed qubit interaction graphs based on metrics described in Hernández and Mieghem (2011) with a focus on those that are relevant to the mapping problem.

Quantum circuit profiling in our work consists of the following steps:

  1. 1.

    Benchmark collection—collecting benchmarks (quantum circuits) from various sources, translating them to the same quantum language, and extracting their interaction graphs (Section 4).

  2. 2.

    Parameter selection and extraction—choosing and extracting graph-theory-based parameters from the qubit interaction graph that are relevant to the mapping of quantum circuits.

  3. 3.

    Benchmark clustering—clustering benchmarks based on their size- and interaction graph-related parameters.

After performing these steps, we compiled the quantum circuits using OpenQL (Khammassi et al. 2021) and analyzed the relation between their performance and extracted parameters, as well as clusters (Sections 4 and 5).

3.2 Parameter selection for quantum algorithm profiling

There exists a vast amount of metrics used for describing graphs, which can be classified into different groups and classes. However, not all of these metrics are relevant to our goal in terms of qubit interaction graph analysis. After thoroughly investigating all metrics described in Hernández and Mieghem (2011), we chose those that are key for the circuit mapping problem. These metrics, when calculated from the qubit interaction graphs, should represent features of quantum circuits that have a correlation with the mapping performance metrics (e.g., number of SWAPS). For instance, the node degree distribution is a relevant metric as it defines the connectivity of the graph (i.e., density of qubit interactions). The more connected the graph, the higher the node degrees. In case there is an all-to-all connected interaction graph, all degrees would be \(n-1,(n\) being the number of qubits) and that graph would be more challenging to map onto limited connectivity device topologies, which would result in the insertion of a higher number of additional SWAP gates. Table 1 shows the selected metrics subset and how they relate to the quantum circuit mapping process.

Table 1 Selected metrics for characterizing interaction graphs and their relation to the quantum circuit mapping

We noticed, however, that a large amount of these metrics are correlated, i.e., they scale in the same manner. Therefore, the parameter space was reduced by using a Pearson correlation matrix as shown in Fig. 4 (− 1/1 meaning maximally-correlated, 0 meaning not correlated) (Freedman et al. 2007). For instance, note that a minimal node degree of a graph strongly relates to maximal clique and edge connectivity, so in that case, just using one of the parameters, instead of all three, is sufficient. This method allowed us to reduce our previous metric set to average shortest path (average hopcount), maximal and minimal node degree, and adjacency matrix (interaction graph edge-weight distribution) standard deviation. These metrics and the common circuit parameters can be used to cluster quantum circuits. It is expected that quantum algorithms with similar properties should show similar performance when run on specific chips using a given mapping strategy.

Fig. 4
figure 4

Heatmap of a Pearson correlation matrix for quantum circuit and interaction graph metrics selected for mapping

3.3 Clustering benchmarks outcomes and evaluation

As mentioned earlier, one of our goals is to find structural similarities among quantum circuits and create some sort of “circuit families,” whose elements (quantum circuits) will show similar compilation behavior and require similar hardware resources. The two criteria we have used for clustering benchmarks are properties based on circuit size and qubit interaction graph. Note that we performed a two-step clustering: circuits were first clustered based on size parameters (number of qubits and gates and percentage of two-qubit gates) and then on qubit interaction graph metrics. The reason behind this was to avoid the former to become the most significant criteria of our clustering algorithm. Figure 5 shows the five clusters (different colors) in which a set of 300 selected benchmarks (Section 4) have been divided by using the kmeans (Lloyd 1982) clustering algorithm. Benchmarks are represented as lines in this parallel-coordinates plot. The x-axis contains a list of three different parameters with their values shown in y-axes.

Fig. 5
figure 5

Clustering of quantum algorithms based on size-related parameters

Each of the five size-related clusters can then be further divided into sub-clusters based on previously explained graph parameters: average shortest path length, maximal and minimal degree, and adjacency matrix standard deviation. In this case, we have again selected the kmeans algorithm among several others by evaluating different methods and parameter setups with the silhouette coefficient method (Rousseeuw 1987). In Fig. 6 is an example of when one of the size-parameters-based clusters (cluster 0 from Fig. 5) is divided into sub-clusters based on the interaction graph parameters. It is also pretty straightforward for additional future circuits to be assigned to a specific cluster (size- and interaction graph-based) as each of the clusters and sub-clusters covers the specific range of combinations of parameters (e.g., cluster 4 in Fig. 5 covers benchmarks with less than 25% percentage of two-qubit gates, and cluster 3 in Fig. 6 covers the highest minimal degree values (over 6)). Those circuits should then have similar expected fidelity and gate overhead outcomes as the other circuits in that cluster. How exactly do the mapping performance metrics correlate with our clusters from Fig. 6, and the possible reason for that will be described in the next sections.

Fig. 6
figure 6

Sub-clustering of quantum algorithms of cluster 0 (Fig. 5) based on interaction graph parameters

4 Experimental setup

This section describes all the necessary elements for performing our experiments: (i) our newly created benchmarks collection (qbench benchmark suite 2021) and a subset used in this paper; (ii) the OpenQL compiler with its Qmap mapper (Lao et al. 2022) and Surface-97 platform, IBM Rochester and Aspen-16 configuration files, and (iii) chosen set of metrics for evaluating the performance of the quantum circuit mapping technique.

4.1 Quantum benchmarks collection and classification

The fast development of quantum computing systems dictates the necessity for an all-including and standardized benchmark suite that can serve to test quantum devices as well as compilation techniques and, in general, any part(s) of the full stack. To address this issue, we collected various types of quantum circuits used as benchmarks from a large number of sources (Anis et al. 2021; Li 2019; Li and Krishnamoorthy 2020; UCLA 2020; JKU 2018; Möller and Schalkers 2020; Valada 2020; Microsoft 2020; QuTech n.d; Developers n.d; Smith et al. 2016; Sivarajah et al. 2020; Cross 2018; Last et al. 2020; Wille et al. 2008) written in and translated to different available high- and low-level languages. An overview of our open-source benchmark suite called QBench (qbench benchmark suite 2021) is shown in Fig. 7.

Fig. 7
figure 7

Overview of our QBench repository

Benchmarks are first divided into two high-level groups: real vs. synthetic quantum circuits. The first ones are then further split into two categories depending on whether they are based on quantum algorithms or are simple reversible arithmetic circuits. In the second group, we can find three different subgroups based on how they are generated. According to Nielsen and Chuang (2002), currently used benchmarks based on real algorithms (QFT, search algorithms, application-based algorithms) are the ones that are of the highest importance when measuring the performance of all future quantum systems as they are scalable, meaningful, and can show the advantage in quantum systems comparing to classical counterparts (Tomesh et al. 2022). For the current NISQ era, however, there is a need for benchmark libraries like RevLib (Wille et al. 2008) that are within the domain of reversible and quantum circuit design. Synthetic benchmarks represent the group of randomly generated quantum circuits, which provide a larger variety in terms of their parameters (e.g., number of qubits, gates, two-qubit gate ratio, circuit depth), and are mainly used to test the performance of quantum devices and explore their computational power to the fullest. For this paper, we mainly focused on (i) randomly generated quantum circuits that are created by uniformly randomly choosing single- and two-qubit gates from a predefined set and then applying them on arbitrarily chosen qubits or qubit pairs in the circuit (Valada 2020); (ii) QUEKO circuits (UCLA 2020), which are designed to be optimal for specific devices (e.g., with optimal depth); and (iii) Quantum volume square circuit (Cross et al. 2019) that is used in general for benchmarking quantum system architectures. A summary of all the real-algorithm-based or synthetic circuits that are part of our benchmark set can be found in qbench benchmark suite (2021).

Benchmarks in our set are also classified based on their size (large-, middle-, and small-scale and parameterized ones) and on the higher- or lower-level language they are written in qbench benchmark suite (2021). Note that a parameterized (scalable) version of the circuits allows the creation of new circuits of a desired size, which will be very meaningful for future quantum systems (Tomesh et al. 2022). Furthermore, different translators from one quantum language to another, interaction graphs, and interaction graph-based profiling are also part of this benchmarks suite.

For our experiments, we selected a subset of 300 benchmarks from QBench covering different types (previously described in this section) and qubit number ranges (2–1281 qubits for clustering, 3–54 qubits for mapping experiments).

Note that this benchmark set is to become open source not only for other researchers to use it for the future development of quantum systems, but also for others to participate in its future extensions. There will always be new benchmarks that can be added or quantum languages to translate the current benchmarks to, as we are in the era where we witness a continuous development of new quantum algorithms, compilers, simulators, and programming languages.

4.2 Quantum compiler and targeted quantum devices

To analyze how the previously shown clusters of circuits (Section 3) relate with their after-mapping outcomes, we compiled the 300 selected quantum circuits using as target quantum processor an extended 97-qubit version of the Surface-17 chip (like in Fig. 8a). Surface-17 is a quantum processor with a surface code architecture (Lao et al. 2022), designed to be easily scalable. The device characteristics and all its constraints are included in a configuration file, which is then used as input for the compiler OpenQL (Khammassi et al. 2021). The configuration file of our chosen back-end includes information like error rates, primitive gate set, gate-decomposition rules, and processor qubit topology/connectivity. In addition to this, and in order to compare the performance of the mapper for different groups of circuits, we performed the same experiments for two more quantum processors: the IBM Rochester and the Rigetti 16q-Aspen chips that are shown in Fig. 8b and c, respectively. We selected these device configurations because they are currently commonly used in other research on quantum circuit mapping and provide realistic and different connectivity patterns in their coupling graphs. Note that in our experiments, we do not execute the quantum circuits on actual devices, but instead, they are just mapped into the different quantum processors; that is, their hardware constraints are considered in the compiling process.

Fig. 8
figure 8

Topologies of the quantum architectures used for experiments: a Surface-97; b IBM Rochester, and c Rigetti 16-q Aspen. Figures taken from (Overwater et al. 2022; IBM n.d; Rigetti n.d)

At the core of the OpenQL compiler is its Qmap mapper, which has many options and strategies allowing to create a sort of custom-made compilation technique. The Qmap quantum circuit mapper considers several types of hardware constraints: limited connectivity, primitive gate set, and restrictions derived from classical control electronics. It supports several options for circuit optimization, routing, initial placement as well as scheduling. In addition, it outputs different circuit mapping performance metrics such as the number of additional gates and circuit latency. The routing strategy we opted for was MinExtend (Lao et al. 2022), which, among other features, includes looking back to previously mapped gates and strives to minimally extend the latency of the circuit. It also includes different but common gate transformation and optimization strategies such as gate cancelation or commutation.

4.3 Metrics

The most commonly used metrics for quantum circuit mapper evaluations are the number of added SWAPs, circuit depth, and fidelity/reliability. In our case, we have used the additional gates and extended depth information retrieved from the compiler to calculate the following metrics:

  • 1. Gate overhead is calculated as

    \({G}_{overhead}=\frac{({G}_{after}-{G}_{before)}}{{G}_{before}}\), where \({G}_{before}\) and \({G}_{after}\) represent the number of gates before and after compilation.

  • 2. Latency overhead is defined as:

    \({L}_{overhead}=\frac{({L}_{after}-{L}_{before})}{{L}_{before}}\), where \({L}_{before}\) and \({L}_{after}\) represent the circuit latency before and after compilation. Latency is calculated as the number of cycles of the circuit, which also considers variations in gate duration, making it different from circuit depth in which all gates are considered to take one time-step.

  • 3. Circuit fidelity is defined as the product of error rates of the gates in the circuit. When mapping a circuit, the main goal is to maximize this metric (Murali et al. 2019b; Nishio et al. 2020). We assumed that all one-qubit and two-qubit gates have the same error rates, respectively, for which we used average values of the Starmon-5 chip (QUTECH 2020).

  • 4. Fidelity decrease is calculated as \({F}_{decrease}=\frac{({F}_{before}-{F}_{after})}{{F}_{before}}\), where \({F}_{before}\) and \({F}_{after}\) represent the circuit fidelity before and after compilation.

In the following section, we will discuss the relation of the structural parameters of circuits with the above-stated obtained metrics after mapping them into the Surface-97, IBM Rochester, and Rigetti Aspen-16 devices.

5 Results

5.1 Mapping the circuits to Surface-97 chip architecture

In this section, we evaluate and compare the mapping outcomes of our selected circuits and analyze how the circuit parameters impact the results. Additionally, we compare the performance of different clusters of circuits when using the same mapping technique and processor design (Surface-97).

As previously shown in Section 2 (Fig. 2), the gate overhead and circuit fidelity decrease is, on average, higher for our type of synthetic (randomly generated) circuits than for those based on real algorithms, even when they are in the same range of size.Footnote 1 Furthermore, as shown in Fig. 9, these two groups of circuits (real and synthetic) are further divided into a total of four differently-structured groups that include randomly generated circuits, QUEKO benchmarks, quantum algorithm-based circuits, and reversible arithmetic circuits. Note in Fig. 9 the difference between these groups in terms of three defined mapping performance metrics. Reversible arithmetic circuits showed on average the lowest gate overhead (\(\sim 120\%\)) and therefore decrease in fidelity. Randomly generated circuits have on average the best latency overhead (\(\sim 88\%\)). To give an example, QUEKO circuits show an average gate overhead of \(\sim 348\%\), latency overhead of \(\sim 153\%\), and fidelity decrease of nearly 100%. This all clearly shows the importance of including the structure of the quantum circuit in the mapping process and leads us to using that information to our advantage when choosing an appropriate pair of device and mapping technique.

Fig. 9
figure 9

Mapping performance metrics: gate overhead, latency overhead, and fidelity decrease (all in %) for all groups of benchmarks. We differentiate (i) synthetic circuits: randomly generated circuits (hexagons) and QUEKO circuits (UCLA 2020) (squares) and (ii) real, algorithm-based circuits: simpler arithmetic circuits (“x”) and circuits based on quantum algorithms (“ + ”) (e.g., QFT or Grover’s algorithm, see Section 4). Only circuits with up to 500 gates are shown

Subsequent to this, we unveil how size-related parameters: number of qubits, number of gates, and two-qubit gate percentage relate to gate overhead and fidelity decrease, respectively, as shown in Fig. 10. Each point in the graphs represents a benchmark mapped to the Surface-97 processor, and just like in Fig. 9, different groups of benchmarks are shown using different symbols and in the same way. In this case, we only considered circuits with up to 500 gates, as all those above that threshold had negligible fidelity even before mapping. Note that these three mentioned parameters are correlated with the mapping results of the circuits on the chip: the closer the points in graphs are to 0 in all axes simultaneously, the lower the overhead and fidelity decrease. Another point that can be made from these figures is that synthetic circuits (QUEKO and random circuits) perform in this setup, on average, worse than the algorithm-based circuits in terms of after-mapping fidelity and gate overhead (just like in Fig. 9).

Fig. 10
figure 10

a Gate overhead and b fidelity decrease in % (color bar) vs. size-related parameters: number of qubits, number of gates, and two-qubit gate percentage. We differentiate (i) synthetic circuits: randomly generated circuits (hexagons) and QUEKO circuits (UCLA 2020) (squares) and (ii) real, algorithm-based circuits: simpler arithmetic circuits (“x”) and circuits based on quantum algorithms (“ + ”)

We have noticed earlier (Section 2) that the size of a circuit, even though an important feature, is not the only reason why some circuits have lower after-mapping overheads than others. Figure 11 shows how the parameters minimal degree, maximal degree, and average shortest path of the interaction graph influence fidelity and gate overhead of circuits. As observed before, the closer the points in graphs are to 0 in all axes simultaneously, the lower the overhead and fidelity decrease. The graph shows a strong correlation of both the increase in gate overhead (Fig. 11a) and fidelity decrease (Fig. 11b) with the increase in maximal and minimal node degree and average shortest path. 2D cuts of Fig. 11 are shown in Fig. 12 for a better visualization. The following observations can be made: (1) the higher all three circuit parameters, average shortest path, minimal and maximal node degree are simultaneous, the higher the gate overhead (Fig. 12a) and fidelity decrease (Fig. 12b). This means the fidelity is the highest and overhead the lowest if all three circuit parameters are close to 0. (2) Some patterns for circuits belonging to the same group can be observed based on how they are created. For instance, QUEKO circuits (squares) have a high average shortest path (\(\sim 3\)), random circuits (hexagons) have a high average node degree (\(\sim 8\)), whereas RevLib and algorithm-based circuits (x in graph) have on average low values of the same parameters (\(\sim 1.5\) for average shortest path and \(\sim 4.5\) for node degree).

Fig. 11
figure 11

a Gate overhead and b fidelity decrease in % (color bar) vs. interaction graph-related parameters: minimal node degree, maximal node degree, and average shortest path

Fig. 12
figure 12

2D plots of the graphs shown in Fig. 10: a interaction graph-based metrics vs. gate overhead (color) and b interaction graph-based metrics vs. fidelity decrease (color)

In Section 3, quantum circuits have been clustered based on size and interaction graph parameters. In Fig. 13, we can see how the clusters based on interaction graph similarity (example shown in Fig. 6) relate to the mapping performance metrics gate overhead, latency overhead, and fidelity decrease. As mentioned in Section 4, the lower these metrics are, the better the mapping performance. One can notice that circuits belonging to cluster 0 outperform other circuits in terms of gate overhead and fidelity decrease (up to 200% for gate overhead, and an average of \(\sim 89\%\) for fidelity decrease), whereas clusters 3 and 4 show the best performance in terms of latency (up to \(\sim 150\%\)). What we can further conclude when comparing Figs. 9 and 13a is that clusters mostly consist of benchmarks of the same type: cluster 0 mostly has real circuits, cluster 3 random ones, and cluster 2 QUEKO circuits. That shows, for instance, that real quantum circuits, especially those from cluster 0, present some pattern in the structure that is easier to map without requiring too many additional gates. Finally, Fig. 13b, which represents a 2D cut of Fig. 13a, clearly shows differences in the range of gate and latency overhead for different clusters. For instance, clusters 3 and 4 have almost constant circuit latency overhead, on average lower than for other clusters, whereas circuits in cluster 0 have low and similar gate overhead. Gate overhead values of cluster 2 scale linearly with latency overhead.

Fig. 13
figure 13

Relation of clusters of circuits (that are shown in Fig. 6) with the parameters of their interaction graphs: a Gate and latency overhead and fidelity decrease and b gate and latency overhead

5.2 Quantum chip topology as one rationale behind results

To further look into the reasoning behind the relation between quantum circuit parameters and mapping performance metrics, we first into the device topology. Thus, we map the same groups of circuits on two additional quantum platforms: the IBM Rochester and Aspen-16 quantum devices (Fig. 8). The outcomes are shown in Figs. 14 and 15. Figure 15 showcases detailed information on how much each structural parameter influences the three mapping performance metrics: gate overhead, latency overhead, and fidelity decrease for all three device configurations. In Fig. 19 (see Appendix), additional details can be found.

Fig. 14
figure 14

Results of the circuit compilation when mapping different quantum circuits (Random, QUEKO, Reversible arithmetic circuits—RevLib, Quantum-algorithm based circuits) to the IBM Rochester (a) and Aspen-16 (b) device topologies using the MinExtend mapper

Fig. 15
figure 15

Correlation matrices showing correlations of mapping performance metrics (gate overhead, latency overhead, fidelity decrease) with extracted metrics of the circuit for the three device configurations: Surface-97, IBM Rochester, and Rigetti Aspen 16-q (top-down)

From the figures, we can derive the following:

  1. (i)

    Different groups of benchmarks based on their origin and structure perform differently when executed on different device topologies. The main value of the figures comes from the fact that we can clearly choose a preferred quantum processor topology for each of the benchmark groups (e.g., Surface-97 is preferred for arithmetic reversible circuits, whereas IBM Rochester might be chosen for random ones, as shown in Figs. 9 and 14).

  2. (ii)

    The impact of structural parameters on the results varies depending on the topology. For example, in the case of the two new topologies, the number of qubits was not as strongly correlated with gate overhead, whereas the degree of the graph played a more significant role. The correlation matrix shown in Fig. 15 highlights that certain parameters are more relevant for specific quantum devices. For the IBM Rochester device, the most important parameter for gate overhead is the two-qubit gate percentage, whereas for Aspen-16 is the maximal degree. The most important parameters for fidelity decrease of both devices are the maximal and minimal degree of the qubit interaction graph, the number of qubits and gates, and the two-qubit gate percentage. In contrast, for the Surface-97 device, the most important parameters for gate overhead are the number of qubits and the two-qubit gate percentage, while the most important parameters for fidelity decrease are the number of qubits and the maximal degree of the qubit interaction graph. Latency overhead does not appear to be related to these structural parameters, so we will investigate this metric further in future work with other parameters. These observations suggest that

    • Interaction graph parameters are more relevant for the mapping outcomes of Aspen-16 and IBM Rochester devices than for Surface-97. We can see that the majority of structural parameters are highly correlated with the circuit fidelity decrease. The main reasoning behind this is that these processors have much less connected coupling graphs; in other words, the sparser the coupling graph, the strongest the correlation with the interaction graph parameters. In our case, Aspen-16 has the most restricted coupling graph connectivity, and consequently, its mapping metrics have the highest correlation with interaction graph properties.

    • Two-qubit gate percentage, as expected, shows a very high correlation with the gate overhead metric regardless of the device. Other size-related parameters (number of qubits and gates) are highly correlated with the fidelity decrease of Aspen-16 and IBM Rochester devices due to again limited connectivity of their coupling graphs as well as smaller device size. On the other hand, the number of qubits only correlates with the gate overhead of Surface-97, which can be attributed to the fact it is a much larger device where we could run much bigger and more complex circuits that would then lead to inevitably long routing paths between at least some of the qubits.

  3. (iii)

    The two new topologies used for these experiments have quite similar structures (just in different scales of qubit range), and consequently, experiments showcased similar patterns. In future work, we plan to expand our analysis by including additional device topologies.

  4. (iv)

    In cases where there is no correlation between interaction graph parameters and certain results (such as latency overhead and minimal degree), it suggests that other structural parameters may have played a more significant role. In our future work, we plan to investigate additional parameters such as gate-dependency critical paths and parallelism, which are discussed in Section 6. Similar findings were also observed in a previous study (Tomesh et al. 2022), demonstrating differences between topologies.

To further investigate the benchmark cluster-device relationship, we continued by observing the circuits belonging to the same clusters. We noticed that (Fig. 16) cluster 0 consists of sparse, low-degree graphs and mostly RevLib circuits; cluster 1 is composed of circuits of a very large standard deviation of weight distribution; cluster 2 includes grid-like-shaped circuits with mostly QUEKO benchmarks; cluster 3 has the densest graphs with highest node degree, mostly consisting of randomly generated circuits; and finally, cluster 4 contains circuits with large average shortest path, mostly QUEKO circuits based on some existing algorithms (UCLA 2020).

Fig. 16
figure 16

Qubit interaction graphs for circuits belonging to cluster 0 (a) and to cluster 3 (b)

As expected, the sparse graphs of low node degree in cluster 0, which are easier to map to the 2D-grid-resembling qubit topology, required the lowest amount of additional SWAPs, but due to specific, algorithm-based structure could not be well optimized in terms of depth (more difficult to parallelize operations). Cluster 0 is the only cluster with circuits whose fidelity did not drop 100%

On the other hand, the 2D-grid qubit topology, which is the most common state-of-the-art for quantum chips, could not handle well the dense graphs belonging to cluster 3, most of which are random circuits. However, they did perform fine in terms of their latency. What is also interesting, based on these outcomes, is that having, for instance, high average shortest path (like circuits in cluster 4) leads to low latency overhead—as explained in Section 1, which means that the circuit depth was not extended so much. That was as expected, considering that it means that those circuits are much less connected and easier to parallelize.

Furthermore, we have also analyzed the relationship between different circuit clusters and the mapping performance metrics for the experiments performed with the latter two quantum devices, the 53q Rochester and the 16q Aspen processor (see Fig. 17). This time, we clearly see different outcomes. For instance, cluster 0 is not anymore outperforming the others in terms of gate overhead—cluster 4 shows the lowest gate overhead of \(\sim 12\%\); cluster 3 fluctuates much more in terms of latency—it goes up to \(\sim 450\%\) instead of the previous \(\sim 150\%\); and cluster 4 is doing way better in terms of fidelity decrease—\(\sim 90\%\) instead of previous \(\sim 100\%\). This is more evident for the Rochester device as the number of circuits included is significantly larger. As 16q-Aspen is on a smaller scale (lower number of qubits) similar to Rochester device in terms of connectivity, we also notice that they have similarly distributed clusters regarding mapping metrics. The data points in Fig. 17b could even be a subset of those in Fig. 17a. This outcome means that other devices with similar topology and higher numbers of qubits would still show similar patterns.

Fig. 17
figure 17

Quantum circuit mapping metrics vs. clusters of quantum circuits when targeting IBM Rochester (a) and Aspen-16 (b) topologies

We discuss other possible reasoning behind the results in Future work section.

6 Discussion and future work

In Section 3, we mentioned that for completing the description of the structure of quantum circuits, in addition to the interaction graph, we also require gate dependency graph properties. Gate dependency graphs can give insight into how a circuit evolves in time. The critical path within the graph is the most relevant property as it is related to the parallelization degree of the gates, which directly influences the circuit depth. This would also help to explore the oracles or other patterns and repetitions within the circuit. In addition to gate dependency graphs, properties like the amount of parallelism in the circuit (gate density), measurement, and idle gates are influencing the success rate of the circuit a lot (Tomesh et al. 2022).

In addition to this, we must not underestimate the role of the mapping technique in these outcomes. For example, including features like look-ahead/back approaches or optimal initial qubit placement would probably have a stronger influence in terms of mapping results when used on circuits with already predefined, steady, and repetitive structures. To verify this assumption, we plan to compare the performance of quantum circuits when using different types of mappers and optimization properties to investigate the mapper-circuit relationship in contrast to the device-circuit relationship demonstrated in this paper. That could then lead to providing guidelines for designing and optimizing algorithm-aware mapping techniques. To this purpose, structured design space exploration methodologies can be used as pointed out in Bandic et al. (2020).

To conclude, in our future work, we would like to explore further (i) other structural parameters of quantum circuits based on gate dependency graphs such as critical path, the density of gates per layer, and the amount of measurement and idle gates. With this, we will ensure to encapsulate all structural perspectives of quantum circuits when performing benchmark clustering and profiling; (ii) how these observed patterns (with current parameters and additional ones) can help us to predict the mapping performance of new circuit samples assigned to our clusters, without actually running them on the device; (iii) how exactly the interaction graph and coupling graph similarity relate to the mapping result; and (iv) investigate a relationship between interaction graphs and gate dependencies with the chosen mapping technique and to which extend that affects the circuit mapping performance on-chip. For this, we will include more compiling options when performing comparisons. This insight into a circuit structure could help us compare and improve currently existing mapping techniques and enable us to have algorithm-driven mappers and quantum devices.

7 Conclusion

Current quantum devices are still bounded by size and noise and can only handle small and simple quantum algorithms. To execute quantum algorithms, expressed as quantum circuits, on these error-prone and resource-constrained devices, they need to be adapted to overcome those limitations and therefore prevent additional errors. That process is referred to as the mapping of quantum circuits and represents a complex optimization problem that is dependent on both, processor and algorithm properties. In addition to hardware properties, in this paper, we have analyzed how the structure of quantum circuits affects their mapping performance. Our selected quantum circuits were characterized in terms of not only standard parameters, such as the number of qubits and gates and percentage of 2-qubit gates, but also in terms of their interaction graph (i.e., graph theory-based) parameters that include average shortest path, minimal and maximal node degree, and standard deviation of the edge-weight distribution. Our results show a strong correlation between these parameters and circuit mapping metrics: gate overhead, latency overhead, and fidelity decrease increased with the increase in all the chosen parameters. The effect of these parameters varies across different devices and metrics. For example, the degree parameter has a larger impact on fidelity decrease for the IBM Rochester device than for the Surface-97 device. From these findings, we can identify the preferred devices for an algorithm with specific individual metrics. Furthermore, after clustering the circuits based on mentioned parameters, we found patterns in mapping performance (in terms of the three mentioned metrics) of the circuits belonging to the same cluster, when mapped using the same technique on the same device. For instance, clusters with simpler, low node-degree graphs showed better performance when targeting a 2D-grid topology regarding gate overhead, whereas clusters consisting of complex and dense circuits outperformed others in latency. On the other hand, different performance results were noted when running the same groups of circuits on two other less-connected devices: size parameters like the number of qubits were far less relevant, and synthetic circuits outperformed real ones (which was not the case for Surface-97), and finally, the correlation between clusters of benchmarks and mapping results was unlike to the previously obtained ones. It was also shown that the way circuits were created is very related to their structure and impacts the results (e.g., if they were uniformly randomly generated circuits), as those circuits were in most cases grouped in the same clusters. Finally, we could see how the clusters scale with different mapping metrics. For instance, in one of the clusters, gate overhead scales linearly with latency overhead; in another, gate overhead is constantly within a specific range regardless of the increase in latency.

The proposed method and current findings will help to enhance circuit mapping techniques by including information about the structure of the circuit as well as to have a deeper understanding on the disparity of the observed outcomes when executing different quantum algorithms. In addition, structural parameters of circuits could be used to predict their fidelity decrease and gate and latency overhead for some specific processor and compilation technique without running them on actual devices. This could help to analyze and perform a design space exploration as well as codesign of current compilers, quantum processors, and quantum applications. Ultimately, this process contributes to the development of application-specific quantum systems, where algorithms will be run with higher performances.

Quantum circuits are also used as benchmarks for evaluating mapping and quantum processors. However, the quantum community still does not agree on one benchmark set used, which resulted in an overwhelming amount of sources of quantum circuits. In this work, we have created a soon-to-be open-sourced easy-to-use benchmark collection having benchmarks from various sources cataloged in folders based on how they are implemented (e.g., based on a real algorithm, random, application-based), the language they are written in, and their size. The set also contains various scripts for translating circuits from one language to another, circuit interaction graphs, and profiling results, as described in this paper. We hope this collection will be useful for testing new quantum processors, updated regularly by the research community to keep up with the new technologies, compilers, programming languages, and most importantly applications, and eliminate the over-the-top amount of benchmark sources.