1 Introduction

Quantum computing is a promising technology which is expected to efficiently solve certain classes of problems that are challenging for classical computers in terms of computational time and/or hardware resources. The term quantum advantage was coined for the demonstration of this algorithmic speed-up on quantum hardware. Several quantum algorithms have been devised to demonstrate this, for example the prime factorization of large integers [1] with an exponential speed-up compared to its best known classical counterpart. A similar speed-up exists when simulating the chemical and physical properties of molecules and the dynamics of fundamental physical models [2]. We should note that most of these algorithms assume error-free quantum hardware with a number of quantum bits, or qubits, beyond the reach of current technology. Because the loss of quantum information is inherent in any physical system, a fault-tolerant quantum computer [3] would employ built-in quantum error correction, where the number of error-free logical qubits is smaller than the number of error-prone, noisy physical qubits.

However, even in the absence of error-correction, noisy intermediate-scale quantum (NISQ) computers [4] are thought to exhibit quantum advantage over classical high-performance computers (HPC) in the range of 100 to 1000 qubits, depending on the quality of the quantum hardware and the connectivity between the qubits. Among many physical platforms, superconducting quantum hardware is well-suited for scaling the number of qubits and improving their fidelity while maintaining connectivity and thus becomes a preferred technology in the NISQ era with roadmaps towards fault tolerance [5].

Here, we introduce the IQM Spark™ [6] prototype, a 5-qubit superconducting quantum computer designed and developed to enable low-barrier access to both its hardware and software components. The hardware is self-contained with a packaged superconducting quantum processing unit (QPU), a dilution refrigerator, and control electronics, while the software components allow both direct manipulation of the qubits by microwave pulses and the execution of small-scale quantum algorithms composed of quantum gates [6]. As we will demonstrate with different use cases, this system can be harnessed for a range of educational activities from teaching the concepts of superconducting quantum hardware to developing an understanding of quantum error mitigation and performing experiments from different fields of research.

The rest of the paper is organized as follows. In Sect. 2, we introduce the hardware, in which, after the overview, the basics of the transmon qubits, tunable couplers, traveling wave parametric amplifiers, dilution refrigerators, and control electronics are explained. In Sect. 3, we introduce the software necessary to operate the quantum computer. The next two sections are dedicated to applications of a quantum computer to education and research. Some applications are available only for an on-premises quantum computer. In Sect. 4 we introduce use cases for educational purposes, namely

  • calibration,

  • benchmarking,

  • visualization of pulses with an oscilloscope,

  • error mitigation, and finally

  • execution of simple quantum algorithms.

In Sect. 5, we reproduce some research results which appeared recently in scientific journals, namely

  • simulation of neutrino oscillation,

  • estimation of Jones polynomials, and

  • an introduction into embedding techniques for quantum chemistry.

Sections 6 and 7 are devoted to summary and discussion.


Throughout this paper, \(I_{k}\) stands for the k-dimensional unit matrix. Pauli matrices are denoted by X, Y and Z, with an optional subscript to denote the qubit they apply to:

$$ P_{i} = I_{2} \otimes \cdots \otimes I_{2} \otimes P \otimes I_{2} \otimes \cdots \otimes I_{2}, \quad P \in \{X,Y,Z\}, $$

where P is in the ith position. For example, for three qubits, \(X_{1} Z_{3}=X \otimes I_{2} \otimes Z\). We use the common convention for qubit ordering, namely the top qubit in a quantum circuit is the first qubit, which is opposite to the Qiskit convention [7]. We assign qubit numbers 1 to 5 as shown in Fig. 1, although Qiskit assigns 0 to 4. Which convention is employed should be obvious from the context.
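The embedding of a single-qubit Pauli matrix into the multi-qubit space can be checked numerically. The following is a minimal sketch using NumPy Kronecker products; the helper name `pauli_on` and the dictionary `PAULI` are illustrative choices made here, not part of any software described in this paper. Qubit 1 is taken as the leftmost tensor factor, in line with the ordering convention stated above.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def pauli_on(label, i, n):
    """Return P_i acting on n qubits (hypothetical helper).

    Qubits are numbered 1..n and qubit 1 is the leftmost tensor factor.
    """
    factors = [PAULI[label] if k == i else I2 for k in range(1, n + 1)]
    return reduce(np.kron, factors)


# X_1 Z_3 on three qubits should equal X (x) I_2 (x) Z
x1z3 = pauli_on("X", 1, 3) @ pauli_on("Z", 3, 3)
expected = reduce(np.kron, [PAULI["X"], I2, PAULI["Z"]])
print(np.allclose(x1z3, expected))
```

Under the Qiskit convention the order of the tensor factors would be reversed, so the same product would be built as Z ⊗ I ⊗ X.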

Figure 1

Design of a 5-qubit superconducting quantum processing unit employed in this paper, showing 5 qubits (QB) connected by 4 tunable couplers (TC). Black, apart from explanatory text, indicates areas where superconducting film is etched exposing the substrate. Flux lines are in red while drive lines are in blue

Estimating expectation values of observables

A quantum computer estimates the probabilities of the Z-basis states of the measured qubits by repeating the same multiqubit Z-basis measurement many times and computing the relative frequencies of the outcomes. The number of repetitions is referred to as the number of “shots”. For example, for a single-qubit state \(| \psi \rangle =a| 0 \rangle +b| 1 \rangle \) we may estimate the probabilities \(p_{0}=|a|^{2}\) and \(p_{1}=|b|^{2}\). Using these values we may estimate the expectation value of \(Z_{i}\) as \(\langle \psi | Z_{i} | \psi \rangle =p_{0} - p_{1}\). By using the identities

$$ X = H\,Z\,H \quad \text{and} \quad Y = S\,H\, Z \,H\,S^{\dagger}, $$

we may estimate the expectation value of X or Y by first rotating the state \(| \psi \rangle \) by H or \(H\,S^{\dagger}\), respectively, and then estimate the expectation value of Z. For execution on our hardware, all circuits are transpiled into single-qubit \(R(\theta , \phi )=\exp [-i\theta (\cos \phi \, X + \sin \phi \,Y )/2 ]\) gates and two-qubit CZ gates. In this process the extra gates used for estimating X- or Y-expectation values are often merged with adjacent single-qubit gates in the circuit.
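The basis-rotation procedure can be illustrated with a small classical simulation. The sketch below is not the transpilation or execution path of the actual system: it applies the pre-measurement rotation to a single-qubit state vector and draws simulated Z-basis shots. The function name `estimate_expectation` and the use of a binomial sampler are assumptions made for illustration.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)

# Rotation applied to the state before the Z-basis measurement:
# identity for Z, H for X, and H S^dagger (S^dagger acts first) for Y.
PRE_ROTATION = {"Z": np.eye(2, dtype=complex), "X": H, "Y": H @ S_DAG}


def estimate_expectation(psi, observable, shots, rng):
    """Estimate <psi|P|psi> for P in {X, Y, Z} from simulated Z-basis shots."""
    rotated = PRE_ROTATION[observable] @ psi
    p0 = min(1.0, float(abs(rotated[0]) ** 2))  # guard against rounding above 1
    n0 = rng.binomial(shots, p0)                # shots that returned outcome 0
    return (2 * n0 - shots) / shots             # estimate of p0 - p1


rng = np.random.default_rng(seed=0)
plus_i = np.array([1, 1j], dtype=complex) / np.sqrt(2)  # +1 eigenstate of Y
for label in ("X", "Y", "Z"):
    print(label, estimate_expectation(plus_i, label, shots=10_000, rng=rng))
```

For this state the Y estimate comes out at the eigenvalue +1, while the X and Z estimates fluctuate around zero with a statistical spread of order one over the square root of the number of shots.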

2 Hardware

Our superconducting quantum computer is a full-stack system consisting of a 5-qubit superconducting QPU, dilution refrigerator, optimized cryogenic microwave and DC lines, control electronics, and appropriate classical computing hardware to run the control software. The details of our quantum computer used in the experiments of this paper are described in the following subsections.

2.1 Quantum processing unit

2.1.1 Overview

The core of any quantum computing system is the quantum processing unit (QPU), comprising qubits, qubit couplers, and control and readout lines. The QPU of our quantum computer features five data qubits with a single central qubit connected to the four peripheral qubits through tunable couplers [8, 9] in a star topology as depicted in Fig. 1. The qubits and tunable couplers are described below in more detail.

The chip layout is drawn using KQCircuits [10], a free open-source add-on for KLayout [11], which is a widely used, open-source, operating-system-independent layout viewer and editor for integrated circuits. KQCircuits adds the necessary functions to KLayout to draw superconducting circuits, either programmatically or through a graphical user interface, and to export the designs for fabrication or for standard microwave simulation software. KQCircuits also exports netlists, which enable SPICE-like quasi-lumped-element simulations for fast validation and optimization of geometrical parameters to achieve the target coupling strengths between the circuit elements discussed below.

2.1.2 Qubit type

The states of computational qubits in our QPU are physically stored in non-linear oscillators referred to as transmon qubits. The transmon is a modified charge qubit, where a Josephson junction or a Superconducting QUantum Interference Device (SQUID) is shunted by a large capacitor such that the Josephson energy exceeds the charging energy of the capacitor by a factor of a few tens [12]. The transmon is a prevalent qubit type in superconducting quantum computation due to its stability against charge and flux noise, and simplicity of operation. In our qubit circuit, sometimes referred to as a grounded transmon [13], the qubit capacitor is formed by a thin metal film island separated from the coplanar ground plane by a gap where the metal has been etched and the underlying dielectric is exposed. In addition to the mutual capacitance, the central island is connected to the ground via the SQUID consisting of parallel-connected Josephson junctions.

The shape of the qubit charge island has six-fold rotational symmetry. Each sector features a capacitor island. Each of the islands has its own size, which allows the coupling capacitance to be individually tuned to achieve target coupling to the neighboring qubits, qubit state readout resonator and extra shunt to the ground for precise targeting of total shunt capacitance. In between two coupling islands, there is a narrow strip of charge island to reduce coupling between the neighboring couplers.

2.1.3 Qubit control

Each of the qubits is individually addressed by two control lines. Control lines are implemented as coplanar waveguides.

The center conductor of the flux line is shorted to the ground in the vicinity of the qubit SQUID, creating an effective mutual inductance between the center conductor and the SQUID loop. When an electrical current is applied through the flux line, the magnetic flux threading the SQUID loop creates a phase bias across the Josephson junctions, reducing the effective Josephson energy of the SQUID and hence the qubit frequency. At the maximum qubit frequency, referred to as the sweet spot, the frequency is insensitive to external flux to first order, which maximizes the coherence time of the qubit. Qubit frequency changes are used to find overall optimal operation frequencies, change dispersive coupling rates to the other elements, and implement physical Z and CZ gates; two-qubit gates are discussed further below.
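The flux dependence of the qubit frequency can be sketched with the standard transmon approximation \(f_{01} \approx \sqrt{8 E_{J} E_{C}} - E_{C}\) (energies expressed as frequencies) together with \(E_{J}(\Phi ) = E_{J,\max} |\cos (\pi \Phi /\Phi _{0})|\) for a symmetric SQUID. The parameter values below are illustrative assumptions and not the parameters of the device described here.

```python
import numpy as np


def transmon_frequency(flux, ej_max=15.0, ec=0.25):
    """Approximate 0-1 transition frequency in GHz of a symmetric-SQUID transmon.

    flux   : external flux in units of the flux quantum
    ej_max : maximum Josephson energy / h in GHz (assumed value)
    ec     : charging energy / h in GHz (assumed value)
    """
    ej = ej_max * np.abs(np.cos(np.pi * flux))
    return np.sqrt(8.0 * ej * ec) - ec


flux = np.linspace(-0.3, 0.3, 7)
for phi, f in zip(flux, transmon_frequency(flux)):
    print(f"flux = {phi:+.1f} Phi_0  ->  f01 = {f:.3f} GHz")

# Near zero flux the frequency changes only to second order (sweet spot)
step = 1e-3
slope = (transmon_frequency(step) - transmon_frequency(-step)) / (2 * step)
print("slope at the sweet spot:", slope)
```

With these assumed values the ratio of Josephson to charging energy is 60 at zero flux, and the frequency decreases symmetrically as the flux moves away from the sweet spot.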

For drive lines, the center conductor is left open-circuited and, due to its proximity, has a mutual capacitance to the charge island of the qubit. By applying microwave signals through the drive line, the qubit can be driven between internal states whose energy difference corresponds to the signal frequency. The capacitance value is carefully chosen to ensure sufficiently low qubit coupling to the \(50\, \Omega \) environment to minimize Purcell losses [14] whilst keeping the coupling to the target qubit strong compared to the spectator qubits. A resonant qubit drive is used to implement X, Y or any \(R(\theta , \phi )\) gate. By choosing the corresponding drive frequency, gates between higher energy levels of the qubit can also be achieved, giving access to a Hilbert space of larger dimension.

2.1.4 Readout

To infer the state of a superconducting qubit, so-called dispersive readout is employed [15, 16]. This is a widely used method which employs transverse coupling between the qubit and resonator based on a dipole-dipole interaction. Due to the transverse coupling term in the Jaynes-Cummings Hamiltonian [17], a qubit-state-dependent frequency shift of the resonator, known as the dispersive shift, is observed. Every qubit has a dedicated readout resonator connected to it and each readout resonator has a different resonance frequency.

To suppress the Purcell decay rate of the qubits through the readout resonators, the resonator is not directly coupled to the 50 Ω environment. Instead, each readout resonator couples to an individual Purcell filter – a bandpass filter which reduces the transmission at the qubit frequency.

As shown in Fig. 2, the Purcell filters are in turn all coupled to a common probe line. The state of the qubit register is inferred by probing the transmission of the probe line with a frequency comb and comparing the phase and amplitude of the transmitted signal at the frequency of each readout resonator individually to a set threshold. The readout cross-talk is reduced thanks to the individual Purcell filters [18]. The amount of coupling and the frequency detuning between the elements are optimised to balance readout speed and Purcell relaxation rate. The input capacitor and the shunt at the output of the probe line form another resonator, where the total length defines the frequency and the location of the output tap defines the coupling strength to the output port, see the right side of Fig. 2.

Figure 2

Quasi lumped element circuit diagram of readout circuit, including readout- and Purcell resonators connected to probe line, which consists of a distributed Purcell filter. Qubits are depicted as circles with two horizontal lines

The coupling strengths and detunings are chosen such that the Purcell effect would not limit the intrinsic \(T_{1}\).

2.1.5 Tunable couplers

Tunable couplers are utilized in order to perform two-qubit gates between the above-mentioned qubits. Tunable couplers are circuit components based on transmon qubits, which enable us to perform two-qubit gates with state-of-the-art fidelities above 99% [9]. The main benefit of using tunable couplers is the possibility to compensate the native ZZ-interaction between qubits, which enables high-fidelity identity gates [8]. In our design, see Fig. 3, the interaction between the tunable coupler and the qubits is mediated by waveguide extenders [9]. This feature allows us to place a significant distance between the qubits to avoid inter-qubit cross-talk, while keeping the switchable ZZ-coupling large enough and, in larger devices, to fit the readout resonators into the qubit lattice unit cell.

Figure 3

A quasi-lumped element circuit diagram of two transmon qubits (blue and orange) coupled by a tunable coupling structure consisting of waveguide extenders (turquoise) and a floating coupler qubit (red) [9]. Electrical nodes are marked with capital letters. Grey color elements represent the effective couplings implemented by the waveguide extenders

By applying an external magnetic flux to the SQUID loop of a tunable coupler, one can change the coupler frequency and thus the effective strength of the ZZ-interaction \(g_{zz}\) between the relevant pair of qubits. The effective value of \(g_{zz}\) can be tuned over a wide range including both positive and negative values. Consequently, there exists a point where this interaction is equal to zero. This point is used while the QPU is idling so that all qubit pairs have negligible interaction and all the native couplings between pairs of qubits are compensated. To perform a two-qubit gate between neighboring qubits, the coupler is flux-tuned by a square-like baseband pulse that changes its frequency, causing an interaction between the qubits for a certain amount of time, see also Sect. 2.5.

By choosing the detuning between qubits during a gate operation, one can implement CZ, iSWAP gates or any general fermionic simulation gate [19] with the same hardware.
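To make the relation between these gates concrete, the sketch below writes the fermionic simulation gate as a two-parameter matrix and checks that CZ and iSWAP appear as special cases. The parametrization and sign convention used here are an assumption (a commonly used one, in which θ sets the swap angle and φ the conditional phase); other references may define the angles with opposite signs.

```python
import numpy as np


def fsim(theta, phi):
    """fSim gate matrix in the basis |00>, |01>, |10>, |11> (assumed convention)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array(
        [
            [1, 0, 0, 0],
            [0, c, -1j * s, 0],
            [0, -1j * s, c, 0],
            [0, 0, 0, np.exp(-1j * phi)],
        ],
        dtype=complex,
    )


CZ = np.diag([1, 1, 1, -1]).astype(complex)
ISWAP = np.array(
    [[1, 0, 0, 0], [0, 0, 1j, 0], [0, 1j, 0, 0], [0, 0, 0, 1]], dtype=complex
)

# No swap and a conditional phase of pi reproduces CZ
print(np.allclose(fsim(0.0, np.pi), CZ))
# A full swap angle with no conditional phase reproduces iSWAP
print(np.allclose(fsim(-np.pi / 2, 0.0), ISWAP))
```

In this picture the swap-like part corresponds to exchange between \(| 01 \rangle \) and \(| 10 \rangle \), and the conditional phase is acquired by \(| 11 \rangle \); which of the two dominates on hardware depends on the qubit detuning chosen during the gate.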

2.2 QPU package

The QPU is mounted inside a carrier for handling, shielding, mounting, and signal connection purposes. The shape of the carrier is optimized so that all standing wave modes are far detuned from the operational frequencies. The QPU is wire-bonded to a printed circuit board (PCB) with coplanar waveguides that is also attached to the carrier. The PCB serves to transmit signals between QPU launchpads and external microwave connections. The carrier is manufactured out of high-conductivity copper to improve thermalisation and reduce potential interactions with impurities in the metal. This carrier has a gold-plated finish for better thermal contact between mating surfaces, to reduce losses of signals in the exposed transmission lines, and to ensure high quality factors of all the standing wave modes mentioned above. The particular gold plating process used is non-magnetic to maintain a clean magnetic field environment. The sample carrier securely holds the QPU, protects the chip from stray radiation, and minimizes microwave interference and cross-talk between the signals.

To thermally attach the QPU carrier to the refrigerator, it is mounted to a copper cold finger, which is situated inside a multilayer magnetic shielding assembly. These shields are required to minimize the interaction with external magnetic fields and to suppress environmental radiation.

2.3 Refrigerator

The cold finger with the chip carrier is attached to the mixing chamber plate of a commercial Bluefors™ refrigerator in order to cool the QPU to the operating temperature of a few tens of mK. These machines use a pulse tube refrigerator to cool the first stages to a few kelvin and then use a dilution refrigerator to reach the base temperature. The experiments presented in this paper were performed with components attached to the experimental stage with a temperature of 30 mK or cooler.

2.4 Signal inputs and outputs

The microwave signals for the qubit drive, tunable coupler flux, parametric amplifier pump, and readout probe are routed from room temperature to the QPU by coaxial SCuNi wires that include an appropriate attenuation cascade at the different temperature stages as well as low-pass filtering at the base temperature. DC signals for qubit flux are routed using twisted pair wiring with low-pass filtering at room temperature, at the 3 K stage, and at the base temperature stage.

The readout response is amplified by a Traveling Wave Parametric Amplifier (TWPA), routed via appropriate isolation upwards by superconducting coaxial NbTi wiring to a High-Electron-Mobility Transistor (HEMT) amplifier operating at a nominal 3 K temperature. After the HEMT the signal is carried by silver plated copper-nickel coaxial wiring to reach the top plate of the cryostat.

2.4.1 Traveling wave parametric amplifier

Qubit readout relies on readout pulses with as little energy as tens of microwave photons. To detect such weak signals, quantum-limited amplifiers are used, where the amount of added noise is limited by quantum mechanics [20]. The first amplifier in our readout chain is a TWPA. Due to its high gain and bandwidth, it enables frequency-multiplexed readout of all qubits within 100 ns and a readout fidelity limited only by the qubit decay time [21].

Parametric amplifiers include a nonlinear medium, in which propagating weak and strong tones exchange energy. In TWPAs, the nonlinear medium consists of a long series of Josephson junctions [22] forming an analogue of an optical Kerr medium. Since the Josephson potential is an even function of the superconducting phase difference ϕ, the lowest-order nonlinearity it provides is of the form \(\phi ^{4}\), which enables four-wave mixing (4WM). Here two photons from a strong pump tone generate one photon in phase with the weak input signal and another idler photon [23]. Pump, signal and idler tone frequencies are related by the conservation of energy [20], see Fig. 4.

Figure 4

Energy level diagrams for (a) 4WM and (b) 3WM processes. \(\omega _{p}\), \(\omega _{s}\) and \(\omega _{i}\) represent the angular frequencies of pump, signal and idler photons, respectively. Δω represents the frequency detuning from the degenerate mode of amplification

In the presence of an external flux, the lowest-order nonlinearity takes the form \(\phi ^{3}\). This term can facilitate a three-wave mixing (3WM) process where a single pump photon at roughly twice the signal frequency gives its energy to a pair of signal and idler photons [24–26]. Having the pump tone far from the signal frequency is beneficial, as then the strong tone at the output of the TWPA can be removed with a simple filter and one avoids any compression effects in later amplification stages.

If the signal and idler frequencies are the same, the resulting amplification is said to be degenerate and only one of the signal quadratures is amplified, but possibly without any added noise. In our system, we typically employ 3WM TWPAs in the non-degenerate regime such that we can frequency-multiplex the readout. However, by changing the flux bias current it is possible to tune the degeneracy to a frequency of interest without changes to the hardware.
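The energy-conservation relations for the two mixing processes can be written out explicitly: \(2\omega _{p} = \omega _{s} + \omega _{i}\) for 4WM and \(\omega _{p} = \omega _{s} + \omega _{i}\) for 3WM. The short sketch below evaluates the idler frequency for both cases; the function name and the example frequencies are illustrative assumptions, not operating points of the amplifier described here.

```python
def idler_frequency(pump_ghz, signal_ghz, process):
    """Idler frequency in GHz from energy conservation.

    4WM: two pump photons   -> one signal photon + one idler photon
    3WM: a single pump photon -> one signal photon + one idler photon
    """
    if process == "4WM":
        return 2.0 * pump_ghz - signal_ghz
    if process == "3WM":
        return pump_ghz - signal_ghz
    raise ValueError(f"unknown mixing process: {process!r}")


# Assumed example tones: the 4WM pump sits next to the signal band,
# whereas the 3WM pump sits at roughly twice the signal frequency.
print(idler_frequency(6.0, 5.9, "4WM"))
print(idler_frequency(12.0, 5.9, "3WM"))

# Degenerate 3WM case: signal and idler coincide at half the pump frequency
print(idler_frequency(12.0, 6.0, "3WM") == 6.0)
```

The example makes the filtering argument visible: in the 3WM case the pump lies far above both the signal and the idler, so it can be separated from them with a simple filter.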

2.5 QPU control electronics

The microwave drive pulses are generated by conventional AC-coupled microwave arbitrary waveform generators (AWGs) operating at the qubit frequency band. The tunable coupler flux pulses are generated by a DC-coupled baseband AWG. The readout probe signals are generated and acquired by a conventional quantum analyzer operating at the frequency band of the readout resonators. The readout and drive instruments also provide a combined functionality that enables fast feedback, i.e., driving signals dependent on the readout result at time scales shorter than the qubit coherence times. The qubit flux and TWPA bias currents are generated by a DC voltage source which is connected to the QPU and TWPA devices via a cascade of low pass filters that include inline resistance to convert voltage to a stable direct current.

The electronics racks of the system include all the required auxiliary electronics for operating a full-stack quantum computer: uninterruptible power supply (UPS) to provide regulation and filtering of the mains power to the measurement electronics rack, a network remote configurable mains power distribution unit (ePDU), power supplies for the various readout amplifiers, a reference clock, power supplies for the DC sources, a main Linux host for running the control software, a Windows host for running the Bluefors software that controls the dilution refrigerator, several smaller specialised Linux hosts for instrumentation interfaces, a dedicated firewall, and a network switch that provides Ethernet connectivity to the hosting facility.

3 Software

The software stack of our quantum computer is divided into different functional layers presented in Fig. 5. It can be interfaced in several ways based on the required level of access. The modules and interactions of the software stack are described in more detail in the following subsections.

Figure 5

The software layers and modules of our quantum computer control software stack

3.1 Cortex

Cortex is a set of software components for running quantum algorithms on our quantum computer. It is the highest level of abstraction in the control software stack of our on-premises quantum computers. Cortex focuses on enabling computation for the end user, rather than experimenting with the behaviour of the individual elements of the quantum computer.

Cortex allows users to define and execute quantum algorithms on the quantum computer, expressed as quantum circuits using high-level frameworks and description languages such as Cirq, Qiskit, and OpenQASM 2.0. The input to the server is a computation job containing one or more quantum circuits to be executed, the number of shots, and possibly some other parameters. The job is queued for execution on the quantum computer. The results of the measurements of the circuits are returned when the job is completed.

3.2 EXA

EXA is a Python-based framework for characterising, calibrating, and controlling our quantum computer. EXA supports the execution of pre-defined experiments as well as the definition and execution of new custom experiments. An experiment unit combines different functionalities such as execution flow, data manipulation, analysis, and presentation, and it can also be built in a modular way using other experiments.

Using the EXA experiment library, users can create Jupyter notebooks or standalone Python applications e.g. to implement macro-like capabilities to simplify the control and measurement processes, eliminate standard repetitive operations using automated procedures, and develop entirely new experiments.

3.3 IQM station control

IQM Station Control takes care of low-level functionality such as housing instrument parameters and hardware drivers. It hides low-level hardware details from the higher-level components, EXA and Cortex. Both Cortex and EXA communicate with Station Control service via its non-RESTful JSON (JavaScript Object Notation) HTTP (Hypertext Transfer Protocol) interface. The interface provides endpoints for performing various parameter sweeps and executing pulse schedules. In normal use, the user does not need to interact directly with the service.

Station Control uses device-specific drivers to further encapsulate the details of each instrument, including its low-level communication protocol.

4 Applications to education

A small-scale on-premises quantum computer is exceptionally useful for educational purposes. It facilitates hands-on experimentation, allowing students not only to run quantum circuits, but also to conduct pulse-level experiments, change the calibration, or connect external peripherals. In general, applications to education can be sorted into two categories: (1) experiments/lab sessions that involve accessing the hardware physically or through the pulse-level interface, and (2) accessing the quantum computer through the circuit-level interface.

4.1 Utilizing hardware access/pulse-level access for education

Lab sessions that involve accessing the hardware physically or utilizing the pulse-level interface and changing the configuration or calibration of the device help engage students and provide additional learning opportunities that are pivotal in cultivating the next generation of quantum scientists and engineers. Below we will illustrate four examples of how an on-premises quantum computer can be used for this purpose in education.

4.1.1 Exploring a quantum computer

Access to a physical quantum computer enables the investigation of the setup of these machines. Together with the appropriate exercises and learning materials this kind of physical access bridges the gap between abstract quantum circuit descriptions and the actual superconducting quantum computer that executes them. Students experience first hand how quantum computation is performed on qubits. The protection of qubits via cooling and magnetic shields becomes concrete as well as their operation via microwave pulses.

4.1.2 Calibrating a quantum computer

Calibration of a quantum computer involves e.g. fine-tuning the qubit frequencies and finding optimal parameters for gates and readout pulses. By utilizing the pulse-level interface, learners can explore the importance of calibration for high fidelity operations and create their own calibration sets. In a lab setting, the learners create and apply their calibration sets and compare outcomes to understand the impact of calibration on the results as shown in Fig. 6 (a). Figure 6 (b) shows the measurement outcomes of the 5-qubit GHZ state

$$ | \Psi \rangle =\frac{1}{\sqrt{2}}\bigl(| 00000 \rangle +| 11111 \rangle \bigr) $$

with two different calibration sets.
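As a reference for such a comparison, the ideal outcome distribution of the GHZ circuit can be computed with a few lines of state-vector simulation. The sketch below assumes a Hadamard gate on the central qubit followed by CNOT gates from the central qubit to each peripheral qubit, which matches the star topology but is not necessarily the circuit executed on the hardware. The index of the central qubit and the score `ghz_population` are likewise assumptions made for illustration.

```python
import numpy as np

N = 5
CENTER = 2  # 0-based index of the central qubit (assumed for this sketch)


def apply_h(state, q):
    """Apply a Hadamard gate to qubit q (qubit 0 is the leftmost bit)."""
    psi = np.moveaxis(state.reshape([2] * N), q, 0)
    psi = np.stack([psi[0] + psi[1], psi[0] - psi[1]]) / np.sqrt(2)
    return np.moveaxis(psi, 0, q).reshape(-1)


def apply_cnot(state, control, target):
    """Apply a CNOT gate by permuting basis-state amplitudes."""
    idx = np.arange(2**N)
    control_bit = (idx >> (N - 1 - control)) & 1
    return state[idx ^ (control_bit << (N - 1 - target))]


state = np.zeros(2**N, dtype=complex)
state[0] = 1.0  # start in |00000>
state = apply_h(state, CENTER)
for target in range(N):
    if target != CENTER:
        state = apply_cnot(state, CENTER, target)

probs = np.abs(state) ** 2
print({format(i, "05b"): round(float(p), 3) for i, p in enumerate(probs) if p > 1e-12})

# Simulated shots from the ideal distribution; on hardware the counts would
# come from the device, and the score below would drop under miscalibration.
rng = np.random.default_rng(seed=1)
counts = rng.multinomial(1000, probs)
ghz_population = (counts[0] + counts[-1]) / counts.sum()
print("population in 00000 and 11111:", ghz_population)
```

The fraction of shots that land in the two ideal bitstrings is only a simple indicator; it does not capture the coherence between the two components, so it should not be read as the state fidelity.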

Figure 6

(a) Learners may compare good calibration 1 with poor calibration 2. They notice the badly calibrated CZ gate in the second calibration set, indicated by the red arrow, and investigate the cause. (b) Comparison of the 5-qubit GHZ state preparation fidelity using different calibration sets shown in (a)

Furthermore, learners can investigate the fidelity of gates and the potential causes of discrepancies in calibration outcomes, encouraging students to critically analyze their results against provided benchmarks of our QPU. Benchmarking can also expand to aspects such as \(T_{1}\) and \(T_{2}\) times or different strategies for assessing performance.

4.1.3 Exploring control waveforms

With physical access to the device, students can also plug in selected peripheral devices such as oscilloscopes to further investigate the connection of software-defined operations and the physical implementation of quantum operations. Multiple smaller experiments guide the learners through investigating control pulse characteristics and qubit manipulation. For example, they execute multiple instructions that rotate the qubit state by different angles and measure the corresponding pulse shapes. As a more complicated example, learners can explore the control pulse schedules that result from multi-qubit circuits such as one that generates a Bell state \(| \Phi _{+} \rangle = (| 00 \rangle +| 11 \rangle )/\sqrt{2}\), as shown in Fig. 7 (a). The resulting control and readout pulses measured with an oscilloscope are depicted in Fig. 7 (b). Through this hands-on approach, learners will gain insight into the control electronics that enable superconducting quantum computing and the concepts of pulse control.

Figure 7

(a) Transpiled circuit that creates a Bell state using native operations of our quantum computer. Note that this circuit has been chosen for illustrative purposes and has not been fully optimized for the given architecture. (b) Resulting pulse shapes as displayed on the oscilloscope interface. The horizontal and vertical axes depict time and voltage, respectively. Curves are shifted vertically for better visibility

4.1.4 Multi-level quantum hardware

Direct hardware access allows one to investigate physical quantum systems beyond the two-level approximation defining a qubit. Here we demonstrate the state preparation and readout of the second excited state of a transmon. This example provides interested learners with a better understanding of the actual superconducting quantum hardware. Furthermore, it connects the educational value of the system with recent scientific results enabled by utilizing the multi-level nature of the transmon [27], such as three-level (qutrit) quantum processors [28], tunable coupler architectures using the second excited state [9], and fast single-qubit gates based on the shortcuts-to-adiabaticity version of stimulated Raman adiabatic passage (STIRAP) [29].

In Fig. 8, we display the results of an experiment addressing the relaxation dynamics of a transmon prepared in the \(| 2 \rangle \) state by a calibrated \(\pi _{0-2}\) pulse. The population then freely evolves and state discrimination is performed after a delay time (see inset). The observed evolution (dots) of the three states \(| 0 \rangle \), \(| 1 \rangle \) and \(| 2 \rangle \) is then fitted (solid lines) to extract the relaxation timescales. We find \(\Gamma ^{-1}_{10}=44.4\pm 1.3~\mu \text{s}\), \(\Gamma ^{-1}_{21}=35.0\pm 1.1~\mu \text{s}\) and \(\Gamma ^{-1}_{20}=69.2\pm 4.1~\mu \text{s}\), which together capture the expected dynamics of the qutrit, where the inequalities \(\Gamma ^{-1}_{10}>\Gamma ^{-1}_{21}\) and \(\Gamma ^{-1}_{10},\Gamma ^{-1}_{21}<\Gamma ^{-1}_{20}\) demonstrate the role of the transition matrix elements when describing the relaxation of a quantum system [30]. We conclude that pulse-level access enables the direct investigation and control of the superconducting quantum hardware beyond what higher abstraction layers can provide, making it a vital tool for educational programs targeting quantum hardware.

Figure 8

State preparation and relaxation dynamics of a qutrit. The inset displays the pulse sequence, which starts with preparing the qutrit in the second excited state, and the readout is performed after a variable delay time. The result of the single shot analysis is shown as dots, and the best fit to the Markovian model in the inset is displayed as solid curves with the relaxation rates \(\Gamma _{21}\), \(\Gamma _{20}\) and \(\Gamma _{10}\) given in the text
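The relaxation dynamics can be reproduced qualitatively with a simple rate-equation model. The sketch below assumes three decay channels (2→1, 2→0 and 1→0) with no thermal excitation, which is our reading of a Markovian model for this experiment and not necessarily the exact model used for the fit. The decay times are the values quoted in the text.

```python
import numpy as np

# Decay times quoted in the text, in microseconds
T10, T21, T20 = 44.4, 35.0, 69.2
G10, G21, G20 = 1.0 / T10, 1.0 / T21, 1.0 / T20

# Rate matrix for the populations p = (p0, p1, p2): dp/dt = M @ p.
# Each column sums to zero, so the total population is conserved.
M = np.array(
    [
        [0.0, G10, G20],
        [0.0, -G10, G21],
        [0.0, 0.0, -(G21 + G20)],
    ]
)


def populations(t, p_init=(0.0, 0.0, 1.0)):
    """Populations at time t (in microseconds), starting from p_init."""
    evals, evecs = np.linalg.eig(M)
    coeffs = np.linalg.solve(evecs, np.asarray(p_init, dtype=float))
    return (evecs @ (coeffs * np.exp(evals * t))).real


for t in (0.0, 20.0, 50.0, 100.0, 200.0):
    p0, p1, p2 = populations(t)
    print(f"t = {t:6.1f} us   p0 = {p0:.3f}   p1 = {p1:.3f}   p2 = {p2:.3f}")
```

In this model the population of \(| 2 \rangle \) decays exponentially with the combined rate \(\Gamma _{21}+\Gamma _{20}\), the population of \(| 1 \rangle \) first rises and then decays, and the population of \(| 0 \rangle \) approaches one at long times.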

4.2 Utilizing circuit-level access for education

A lot of interest goes into investigating, creating and improving NISQ algorithms. This subsection will demonstrate the use of gate-based access in educational settings by providing different examples that can be practiced by learners. The output of each quantum circuit is illustrated by executing it with our superconducting quantum computer.

Because of the unavoidable presence of errors during algorithm execution on NISQ devices such as our quantum computer, the results are expected to differ from the ideal, noiseless ones. Importantly, there are strategies, often collectively referred to as “quantum error mitigation”, that can be employed, either individually or in combination with each other, to reduce the effects of errors on the execution of a desired algorithm [31]. In particular, in the examples that will follow, we make use of techniques belonging to three different classes.

  • “Error suppression” techniques aim to modify and reduce the effects of errors at the level of each single circuit run. A prominent example is the randomized compiling (RC) technique [32, 33] that, through random Pauli twirling, effectively converts problematic coherent errors (e.g. systematic overrotations associated with a given gate) to stochastic errors, which add up more favourably and whose effects are easier to mitigate further.

  • “Readout error mitigation” (REM) techniques target errors that occur during the measurement of the qubits [31, 34]. In general, they consist of two steps. First, a set of simple and shallow characterization circuits is executed on the hardware in order to determine the properties and the magnitude of the readout errors. As a simple example, one might want to measure the probability that a given qubit prepared, say, in the \(|0\rangle \) state is incorrectly measured to be in \(|1\rangle \). Once this information is gathered, the desired algorithm is executed and its raw results are post-processed in order to compensate for the readout errors. In this paper, we have mitigated readout errors using correlated readout error mitigation calibrated with 10,000 shots per basis state.

  • A third class of “error mitigation” techniques mainly targets errors happening at the gate level, i.e. during the execution of the bulk of the circuits. This is generally achieved by executing different variants of the desired quantum circuit and by combining their output via classical post-processing, leading to a (potentially significant) run time overhead [31]. Despite this trade-off between quality and speed, the implementation of error mitigation strategies is a key element for successful algorithm execution in the NISQ era. One simple-yet-effective technique to mitigate gate errors is known as zero noise extrapolation (ZNE) [35], which is based on the idea of artificially increasing the noise level and then making use of noisier results to extrapolate back to the noiseless limit. Open source and educational implementations of several other techniques can be found, for example, in [36].
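Two of the ideas above fit in a few lines of classical post-processing. The sketch below is a deliberately simplified illustration: it inverts an uncorrelated single-qubit confusion matrix (the experiments in this paper use correlated REM, which is not reproduced here), and it performs ZNE with a straight-line fit, which is only one of several possible extrapolation models. All function and parameter names are hypothetical.

```python
def mitigate_single_qubit_readout(p_measured, e01, e10):
    """Invert a single-qubit confusion matrix.

    e01: probability of reading 1 when |0> was prepared.
    e10: probability of reading 0 when |1> was prepared.
    p_measured: observed probabilities (p0, p1).
    """
    # Confusion matrix M = [[1 - e01, e10], [e01, 1 - e10]], p_measured = M p_true.
    det = 1.0 - e01 - e10
    p0, p1 = p_measured
    q0 = ((1.0 - e10) * p0 - e10 * p1) / det
    q1 = (-e01 * p0 + (1.0 - e01) * p1) / det
    return q0, q1


def zero_noise_extrapolation(scale_factors, values):
    """Least-squares straight line through (scale, value), evaluated at scale 0."""
    n = len(scale_factors)
    mean_x = sum(scale_factors) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in scale_factors)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scale_factors, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Made-up numbers for illustration only.
print(mitigate_single_qubit_readout((0.701, 0.299), e01=0.02, e10=0.05))
print(zero_noise_extrapolation([1, 2, 3], [0.8, 0.7, 0.6]))
```

The matrix inversion can return slightly negative probabilities when the raw counts are noisy, so practical implementations usually add a projection or least-squares step.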

4.2.1 Violation of the CHSH inequality

The Nobel Prize in Physics 2022 was awarded jointly to Alain Aspect, John F. Clauser and Anton Zeilinger “for experiments with entangled photons, establishing the violation of Bell inequalities and pioneering quantum information science” [37]. Bell’s theorem states that the correlations between the measurement outcomes of two spatially separated experimenters have an upper bound if nature follows the principle of local realism [38]. Suppose two parties, often called Alice and Bob, are located far away from each other in the experiment. Each of them can measure one of two observables on the signal they receive, but not both simultaneously. Bell-type inequalities define correlators over the possible measurement settings and their outcomes, assuming that the measurements happen so fast that no communication is possible between the measurement devices of Alice and Bob due to the finite speed of light [39]. A Bell inequality then separates so-called local probability distributions from non-local distributions, which have stronger correlations than the local ones. The most famous Bell-type inequality is the CHSH inequality [40], which is straightforward to test with two qubits on a quantum computer, although the measurements of the qubits are then not space-like separated events. The inequality is given by the formula:

$$ -2 \leq E(QS)+E(RS)+E(RT)-E(QT) \leq 2, $$

if local realism holds. Here the observables are

$$ Q=X_{1}, \qquad R=Z_{1},\qquad S=X_{2},\qquad T=Z_{2}, $$

where index 1 refers to Alice’s and 2 to Bob’s qubit and \(E(A)\) stands for the expectation value of A.

Let us now consider how these correlations behave in quantum mechanics, and evaluate the above expectation values in the state

$$ \big| \Psi (\theta ) \big\rangle = R_{y}(\theta )_{1} \bigl(| 00 \rangle + | 11 \rangle \bigr)/\sqrt{2}, $$

where \(R_{y}(\theta )= \exp (-i \theta Y/2)\). Calculating the expectation values as described at the end of Sect. 1, we obtain

$$ E(QS)=E(RT)= \cos \theta ,\qquad E(QT)=-E(RS) = \sin \theta , $$

and thus

$$ E(QS)+E(RS)+E(RT)-E(QT) = 2\sqrt{2} \cos (\theta +\pi /4). $$

Clearly the CHSH inequality in Eq. (3) is violated in the state \(| \Psi (\theta ) \rangle \) iff

$$ \theta \in \biggl(\frac{\pi}{2}, \pi \biggr) \sqcup \biggl( \frac{3\pi}{2},2\pi \biggr), $$

which demonstrates that quantum mechanics is not compatible with local realism.
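The expectation values above can be checked numerically with a small state-vector calculation. The sketch below takes \(Q=X_{1}\), \(R=Z_{1}\), \(S=X_{2}\), \(T=Z_{2}\), the choice that reproduces the quoted expectation values, and evaluates the CHSH combination directly from the amplitudes of \(| \Psi (\theta ) \rangle \). The function name `chsh_value` is illustrative.

```python
import math


def chsh_value(theta):
    """E(QS)+E(RS)+E(RT)-E(QT) in the state R_y(theta)_1 (|00>+|11>)/sqrt(2).

    Observables assumed here: Q = X_1, R = Z_1, S = X_2, T = Z_2.
    """
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    r = 1 / math.sqrt(2)
    # Real amplitudes a[q1][q2] after rotating qubit 1 of the Bell state.
    a = [[c * r, -s * r], [s * r, c * r]]

    def expect(op1, op2):
        total = 0.0
        for i in (0, 1):
            for j in (0, 1):
                # X flips the bit; Z keeps it and contributes a sign (-1)^bit.
                ii = 1 - i if op1 == "X" else i
                jj = 1 - j if op2 == "X" else j
                sign = 1
                if op1 == "Z" and i == 1:
                    sign = -sign
                if op2 == "Z" and j == 1:
                    sign = -sign
                total += sign * a[i][j] * a[ii][jj]
        return total

    e_qs, e_rs = expect("X", "X"), expect("Z", "X")
    e_rt, e_qt = expect("Z", "Z"), expect("X", "Z")
    return e_qs + e_rs + e_rt - e_qt


for k in range(9):
    theta = k * math.pi / 4
    print(f"theta = {theta:.3f}  CHSH = {chsh_value(theta):+.4f}")
```

Values with magnitude above 2 appear only for angles in the intervals given above, in agreement with the closed form \(2\sqrt{2}\cos (\theta +\pi /4)\).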

To experimentally test this prediction, we execute the parameterised quantum circuit given in Fig. 9 (a) to create the state \(| \Psi (\theta ) \rangle \) and measure the relevant expectation values for multiple values of θ. Figure 9 (b) shows how the CHSH observable oscillates as we rotate the state, and violates the inequality as predicted. The statistical uncertainty due to the finite number of shots is shown by the error bars, which correspond to one standard deviation. They have been determined by means of bootstrapping, i.e. by classically resampling several times the probability distribution obtained from each circuit run, thus obtaining a set of reprocessed results that can be used to estimate confidence intervals. We use this method throughout the paper.

Figure 9

(a) Quantum circuit that prepares \(| \Psi (\theta ) \rangle \). (b) Expectation value of the CHSH observable Eq. (6) in the state \(| \Psi (\theta ) \rangle \), plotted over the y rotation angle θ. The raw experimental data points are shown with blue dots; the implementation of readout error mitigation (green crosses) brings them closer to the ideal noiseless result (red curve). There are two regions outside the black horizontal lines, where the CHSH inequality is violated and hence non-locality is demonstrated. The statistical uncertainty is so small that the error bars remain within the markers
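The bootstrapping procedure used for the error bars can be sketched as follows. This is a minimal illustration that assumes a simple multinomial resampling of the measured counts and a parity (ZZ-type) observable; the number of resamples, the seed and the function names are illustrative choices rather than the settings used for the figures.

```python
import random
import statistics


def parity_expectation(counts):
    """Expectation value of the parity observable from a {bitstring: count} dictionary."""
    total = sum(counts.values())
    return sum((-1) ** bitstring.count("1") * n for bitstring, n in counts.items()) / total


def bootstrap_std(counts, n_resamples=200, seed=0):
    """One-standard-deviation error bar obtained by resampling the empirical distribution."""
    rng = random.Random(seed)
    outcomes = list(counts)
    weights = [counts[o] for o in outcomes]
    shots = sum(weights)
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(outcomes, weights=weights, k=shots)
        resampled = {o: 0 for o in outcomes}
        for o in sample:
            resampled[o] += 1
        estimates.append(parity_expectation(resampled))
    return statistics.stdev(estimates)


# Made-up counts for illustration only.
counts = {"00": 450, "11": 450, "01": 50, "10": 50}
print(parity_expectation(counts), bootstrap_std(counts))
```

For a two-outcome observable the result should be close to the binomial estimate \(\sqrt{(1-E^{2})/N}\), which gives a quick sanity check of the resampling.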

4.2.2 5-qubit GHZ state, decoherence and Mermin’s inequality

5-qubit GHZ state

Let us prepare a maximally entangled 5-qubit state and see what we can do with it. The 5-qubit GHZ state [41]

$$ | \Psi _{5} \rangle = \frac{1}{\sqrt{2}}\bigl(| 00000 \rangle +| 11111 \rangle \bigr) $$

is one of the maximally entangled 5-qubit states. This state is obtained by applying the quantum circuit in Fig. 10 (a) on \(| 00000 \rangle \). Figure 10 (b) shows the output histogram of our quantum computer obtained with 5000 shots, while Fig. 10 (c) shows the output after readout error mitigation is applied.

Figure 10

(a) Quantum circuit to implement the 5-qubit GHZ state. (b) Output of 5-qubit quantum computer. (c) Output after readout error mitigation is applied

Entanglement, mixed state and decoherence

Let us separate the 5-qubit system into subsystems made of qubits 12 and qubits 345. Suppose one measures an observable O associated with subsystem 12. The expectation value of O with respect to \(| \Psi \rangle \) is

$$ \langle O \rangle = \langle \Psi |(O \otimes I_{8})| \Psi \rangle = \frac{1}{2} \langle 00 |O| 00 \rangle + \frac{1}{2} \langle 11 |O| 11 \rangle , $$

which is an expectation value with respect to a mixed state, even though the total system is in a pure state. This is directly demonstrated by evaluating the density matrices of (a) the 5-qubit GHZ state, (b) the subsystem 12 and (c) the subsystem 345 as

$$\begin{aligned}& \rho _{\mathrm{GHZ}} = \frac{1}{2} \begin{pmatrix} 1&0&\ldots &0&1 \\ 0&0&\ldots &0&0 \\ \vdots & &\vdots & &\vdots \\ 0&0&\ldots &0&0 \\ 1&0&\ldots &0&1 \end{pmatrix} , \end{aligned}$$
$$\begin{aligned}& \rho _{{12}} = \sum_{i,j,k \in \{0,1\}} \bigl(I_{2} \otimes I_{2} \otimes \langle ijk |\bigr)\rho _{\mathrm{GHZ}} \bigl(I_{2} \otimes I_{2} \otimes | ijk \rangle \bigr)=\frac{1}{2} \operatorname{diag}(1,0,0,1), \end{aligned}$$
$$\begin{aligned}& \begin{aligned}[b] \rho _{{345}} &= \sum_{i,j \in \{0,1\}} \bigl(\langle ij | \otimes I_{2} \otimes I_{2} \otimes I_{2}\bigr)\rho _{\mathrm{GHZ}} \bigl(| ij \rangle \otimes I_{2} \otimes I_{2} \otimes I_{2}\bigr) \\ &=\frac{1}{2}\operatorname{diag}(1,0,0,0,0,0,0,1), \end{aligned} \end{aligned}$$

respectively. Figure 11 shows the result of quantum state tomography for (a) the GHZ state (b) subsystem 12 and (c) subsystem 345, which are obtained experimentally with the Qiskit state tomography algorithm. The subsystem 12 is in a mixed state since it cannot access the information of the subsystem 345 and vice versa.

Figure 11

State tomography of (a) 5-qubit GHZ state, (b) qubits 12 (principal system) and (c) qubits 345 (environment). Theoretically (a) is a pure state with rank 1 while (b) and (c) are mixed states with rank 2. We used local readout error mitigation and 3500 shots per measurement basis for the tomography

This situation models decoherence. It is often said that the interaction of a quantum system with its environment causes entanglement between the two, by which an initial pure state of the system becomes a mixed state. This explanation of decoherence is often difficult to understand for beginners. Let us call subsystem 12 the principal system and subsystem 345 the environment. The total system, started in a pure tensor product state \(| 00 \rangle \otimes | 000 \rangle \), evolves to the entangled GHZ state under the unitary time evolution given by Fig. 10 (a). The total system is still in a pure state. Although the principal system was in a pure state \(| 00 \rangle \) in the beginning, it is now entangled with the environment, and thus in a mixed state \(\rho _{12}\) if the environment is ignored (i.e. traced out). In other words, the GHZ state is a purification of \(\rho _{12}\).

It is interesting to evaluate the von Neumann entropy \(S(\rho ) =-\operatorname{tr} \rho \log _{2} \rho \) of these states. Table 1 shows both the theoretical predictions and the experimental results along with the upper bound saturated by the uniformly mixed state. Observe that the entropy, which is called the entanglement entropy in this context, is theoretically the same for \(\rho _{12}\) and \(\rho _{345}\). Usually, entropy is proportional to the system size but entanglement entropies derived from a pure state \(\rho _{\mathrm{GHZ}}\) are identical even though the subsystem sizes are different.

Table 1 Theoretical and experimental values of the von Neumann entropy of the 5-qubit GHZ state, the subsystem 12 and the subsystem 345. Experimental data in Fig. 11 has been employed. The right column shows the upper bound of entropy, which is saturated by the uniformly mixed state
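The theoretical entries can be reproduced with a short partial-trace calculation. The sketch below works with the ideal GHZ amplitudes only and uses a shortcut that is valid in this special case: because the reduced states are diagonal, their eigenvalues can be read from the diagonal. The function names are illustrative, and a general state would require a proper diagonalisation.

```python
import math


def ghz_amplitudes(n_qubits):
    """Real amplitudes of (|0...0> + |1...1>)/sqrt(2), indexed by the bitstring as an integer."""
    amps = [0.0] * (2 ** n_qubits)
    amps[0] = amps[-1] = 1 / math.sqrt(2)
    return amps


def reduced_density_matrix(amps, n_first, n_last, keep):
    """Partial trace of a real pure state split into the first n_first and last n_last qubits.

    keep = "first" traces out the last block, keep = "last" traces out the first block.
    """
    dim_a, dim_b = 2 ** n_first, 2 ** n_last
    if keep == "first":
        return [[sum(amps[a * dim_b + b] * amps[a2 * dim_b + b] for b in range(dim_b))
                 for a2 in range(dim_a)] for a in range(dim_a)]
    return [[sum(amps[a * dim_b + b] * amps[a * dim_b + b2] for a in range(dim_a))
             for b2 in range(dim_b)] for b in range(dim_b)]


def entropy_if_diagonal(rho):
    """Von Neumann entropy in bits, valid only when rho is diagonal (checked below)."""
    size = len(rho)
    if any(abs(rho[i][j]) > 1e-12 for i in range(size) for j in range(size) if i != j):
        raise ValueError("rho is not diagonal; diagonalise it first")
    return -sum(rho[i][i] * math.log2(rho[i][i]) for i in range(size) if rho[i][i] > 0)


psi = ghz_amplitudes(5)
rho_12 = reduced_density_matrix(psi, 2, 3, keep="first")
rho_345 = reduced_density_matrix(psi, 2, 3, keep="last")
print([round(rho_12[i][i], 3) for i in range(4)])
print(entropy_if_diagonal(rho_12), entropy_if_diagonal(rho_345))
```

Both subsystems give the same entropy of one bit, even though their Hilbert spaces have dimensions 4 and 8.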

Violation of Mermin’s inequality

Let us show next that Mermin’s inequality is violated by the 5-qubit GHZ state. Mermin’s inequality is regarded as a generalization of the CHSH inequality to multi-qubit systems [42]. The Mermin polynomial for a 5-qubit system is defined as [43]

$$\begin{aligned} M_{5} =& X_{1}X_{2}X_{3}X_{4}X_{5} \\ &{} -(Y_{1}Y_{2}X_{3}X_{4}X_{5} + 9~\text{permutations}) \\ &{} +(Y_{1}Y_{2}Y_{3} Y_{4}X_{5}+4~\text{permutations}). \end{aligned}$$

It is known that Mermin’s inequality \(E(M_{5}) \leq 4\) is satisfied if local realism holds. On the other hand, quantum theory predicts \(E(M_{5}) \leq 4^{2}=16\), where the upper bound is saturated if the state is maximally entangled.

Let us evaluate \(E(M_{5})\) in the 5-qubit GHZ state. Since the GHZ state is symmetric with respect to permutations of qubits we only need to evaluate three monomials:

$$\begin{aligned}& \langle \Psi | X_{1} X_{2} X_{3} X_{4} X_{5} | \Psi \rangle =1, \\& \langle \Psi | Y_{1} Y_{2} X_{3} X_{4} X_{5} | \Psi \rangle =-1, \\& \langle \Psi | Y_{1} Y_{2} Y_{3} Y_{4} X_{5} | \Psi \rangle =1, \end{aligned}$$

theoretically. Then we find

$$ \langle \Psi | M_{5} | \Psi \rangle = 1- 10 \times (-1) + 5 \times 1=16, $$

and thus the GHZ state \(| \Psi \rangle \) saturates the upper bound. Let us now confirm this prediction with our 5-qubit quantum computer.
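The counting of terms and signs can be verified with a few lines of code. The sketch below enumerates the 16 monomials of \(M_{5}\) and evaluates each one exactly on the ideal GHZ state, using only the action of X and Y on computational basis states. The helper names are illustrative, and no hardware noise is modelled.

```python
from itertools import combinations


def apply_xy_string(pauli, state):
    """Apply a tensor product of X and Y operators to a state {bitstring: amplitude}."""
    result = {}
    for bits, amp in state.items():
        for op, bit in zip(pauli, bits):
            if op == "Y":
                # Y|0> = i|1>, Y|1> = -i|0>
                amp *= 1j if bit == "0" else -1j
        # Both X and Y flip every bit they act on.
        flipped = "".join("1" if bit == "0" else "0" for bit in bits)
        result[flipped] = result.get(flipped, 0) + amp
    return result


def expectation(pauli, state):
    transformed = apply_xy_string(pauli, state)
    return sum((state[bits].conjugate() * amp).real
               for bits, amp in transformed.items() if bits in state)


def mermin_terms():
    """Signed Pauli strings of M_5: an even number k of Y operators, with sign (-1)^(k/2)."""
    terms = []
    for k in (0, 2, 4):
        for y_positions in combinations(range(5), k):
            pauli = "".join("Y" if q in y_positions else "X" for q in range(5))
            terms.append(((-1) ** (k // 2), pauli))
    return terms


ghz = {"00000": complex(2 ** -0.5), "11111": complex(2 ** -0.5)}
terms = mermin_terms()
mermin_value = sum(sign * expectation(pauli, ghz) for sign, pauli in terms)
print(len(terms), round(mermin_value, 6))
```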

The monomials of the Mermin polynomial can be measured as expectation values as described in Sect. 1. Since the qubits and gate fidelities of a NISQ quantum computer are not homogeneous, we must measure all 16 monomials separately. The estimation of \(\langle \Psi | M_{5} | \Psi \rangle \) obtained with our quantum computer is presented in Table 2, and clearly rules out local realism, in favour of quantum theory.

Table 2 Estimated value of the Mermin polynomial in the prepared 5-qubit GHZ state. The 16 terms of the polynomial were estimated using 10,000 shots each. The three topmost rows in the table represent the average of all the permutations of the Pauli operators in the given observable

4.3 Maxcut problem

Many well-known quantum algorithms, such as Shor’s and Grover’s algorithms, require a large number of qubits and fault-tolerant error correction for useful quantum computation. This places their practical execution beyond the capabilities of currently available NISQ computers. In contrast, variational quantum algorithms are better suited to the NISQ computer at hand, in that they can be executed with the currently available number of qubits and without quantum error correction.

Variational quantum algorithms involve an optimization process in which a classical computer searches for the parameters of a quantum circuit that minimize the expectation value of a Hamiltonian, representing the cost function, evaluated on a quantum computer in the resulting state. The parameters of the circuit are updated iteratively until the expectation value reaches its minimum. Variational algorithms have many use cases, e.g. in mathematics, chemistry, finance and industrial optimization. In this subsection, we introduce an application of a variational algorithm called QAOA (Quantum Approximate Optimization Algorithm) to a combinatorial problem called the Maxcut problem. Another variational algorithm, VQE (Variational Quantum Eigensolver), will be introduced in Sect. 5.3.

Suppose there is a graph G with n nodes. There are edges between some pairs of nodes. In the Maxcut problem, one aims to partition the nodes of G into a disjoint union \(A \sqcup B\) such that the number of edges connecting nodes in A and B is maximized.

To solve this problem, we introduce an Ising Hamiltonian

$$ H=\sum_{i< j} J_{ij} Z_{i}Z_{j}, $$

where i and j denote the nodes. The coupling strength is \(J_{ij}=1\) if there is an edge between nodes i and j, while \(J_{ij}=0\) if there is no edge. Suppose \(| 0 \rangle \) is assigned to nodes in group A, while \(| 1 \rangle \) to nodes in B. If nodes i and j belong to different groups, the edge ij contributes −1 to the Hamiltonian, while if they belong to the same group, the edge contributes +1. There is no contribution if there is no edge connecting nodes i and j. Thus maximizing the number of edges connecting nodes in different groups reduces to minimizing the expectation value of the Hamiltonian H acting on an n-qubit system.

In fact, the above problem may be implemented with an \((n-1)\)-qubit system. Suppose we find a solution A and B of the Maxcut problem of a given graph. Then interchanging A and B gives another solution of the same problem. Accordingly, we are free to assign \(| 1 \rangle \) to the nth qubit, for example, without loss of generality. With this choice, \(J_{in}Z_{i} Z_{n}\) becomes \(-J_{in}Z_{i}\). The modified Hamiltonian

$$ H'= \sum_{1\leq i< j \leq n-1} J_{ij}Z_{i} Z_{j} - \sum_{i=1} ^{n-1} J_{in}Z_{i} $$

is implemented with an \((n-1)\)-qubit system.
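The equivalence between \(H\) with the last node fixed and \(H'\) can be checked by brute force on a small example. The sketch below uses a hypothetical 4-node cycle with 0-based node labels, not the graph of Fig. 12, and the function names are illustrative. It also uses the relation \(H = |E| - 2\,(\text{number of cut edges})\) on basis states to convert the minimum energy into a cut size.

```python
from itertools import product


def ising_energy(edges, bits):
    """Value of H = sum over edges of Z_i Z_j on a computational basis state.

    bits[i] = 0 puts node i in group A (Z = +1), bits[i] = 1 in group B (Z = -1).
    """
    z = [1 - 2 * b for b in bits]
    return sum(z[i] * z[j] for i, j in edges)


def reduced_energy(edges, bits, n):
    """Value of H' on n-1 qubits, with the last node (index n-1) fixed to |1>."""
    z = [1 - 2 * b for b in bits]
    last = n - 1
    pair_terms = sum(z[i] * z[j] for i, j in edges if last not in (i, j))
    field_terms = sum(z[i if j == last else j] for i, j in edges if last in (i, j))
    return pair_terms - field_terms


def brute_force_maxcut(edges, n):
    """Minimise H' over all assignments of the n-1 free nodes."""
    best = min(product((0, 1), repeat=n - 1), key=lambda bits: reduced_energy(edges, bits, n))
    energy = reduced_energy(edges, best, n)
    cut_size = (len(edges) - energy) // 2  # each cut edge lowers H by 2 relative to |E|
    return best + (1,), cut_size


# Hypothetical example graph: a 4-cycle.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(brute_force_maxcut(edges, 4))
```

Brute force scales exponentially with the number of nodes, so it serves only as a reference against which a variational result on a small graph can be compared.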

Let us consider a graph in Fig. 12 with 6 nodes for definiteness, where the node 6 is a virtual node in \(| 1 \rangle \) state while the rest are physical. By fixing the state of node 6 to \(| 1 \rangle \), the relevant Hilbert space is \(\operatorname{Span}(\{| i_{1}\, i_{2}\, i_{3}\, i_{4}\, i_{5} \rangle | 1 \rangle \})\), which may be implemented with our 5-qubit quantum computer.

Figure 12

Maxcut problem for six nodes. Node 6 is a fictitious qubit in the \(| 1 \rangle \) state while the rest are physical qubits. Solid lines represent physical couplings while dashed lines represent fictitious couplings

Figure 13 shows the solution of the Maxcut problem obtained experimentally for the graph in Fig. 12. The solution is read from the most probable state of the probability distribution. We used a single-layer QAOA ansatz with \(10{,}000\) measurements per optimization step and readout error mitigation. Using only a single layer of the QAOA ansatz means the algorithm has only a small number of gates and can be executed in a short time. It also means that the algorithm is approximate: while it will output the correct solution, there will also be sizable probabilities of wrong states in the distribution. The algorithm can be made more accurate and precise by increasing the layer depth, but then more errors accumulate during its execution, which again introduces erroneous solutions into the output distribution of the quantum computer. The optimal depth depends in general on the problem instance and the error levels of the quantum computer, and is a key implementation detail for the performance of quantum algorithms on practical problems.

Figure 13

(a) Measurement outcomes of our 5-qubit quantum computer executing QAOA for the Maxcut problem. (b) The solution of the Maxcut problem; we achieve eight cuts, which is the maximum for this graph. The solution \(| 00011 \rangle \) has divided the nodes to groups 0 (green) and 1 (red), such that the number of edges connecting nodes from different groups is maximized. Solid lines show edges connecting nodes in the same group, while eight dashed lines are edges connecting nodes in different groups
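The approximate nature of a single QAOA layer can be seen in a noiseless state-vector simulation. The sketch below applies one cost layer \(e^{-i\gamma H}\) and one mixer layer \(e^{-i\beta \sum _{i} X_{i}}\) to the uniform superposition and scans a coarse grid of angles. The graph is a hypothetical 4-cycle rather than the graph of Fig. 12, the grid search stands in for the classical optimizer, and the virtual-node reduction is not applied, so this illustrates the principle and is not the implementation used for Fig. 13.

```python
import cmath
import math


def qaoa_expected_cut(edges, n, gamma, beta):
    """Expected cut size after one QAOA layer applied to the uniform superposition."""
    dim = 2 ** n

    def energy(index):
        # H = sum Z_i Z_j is diagonal; bit q of index is the state of qubit q.
        z = [1 - 2 * ((index >> q) & 1) for q in range(n)]
        return sum(z[i] * z[j] for i, j in edges)

    energies = [energy(k) for k in range(dim)]
    amp0 = 1 / math.sqrt(dim)
    state = [amp0 * cmath.exp(-1j * gamma * e) for e in energies]
    # Mixer exp(-i beta X) = cos(beta) I - i sin(beta) X on every qubit.
    c, s = math.cos(beta), -1j * math.sin(beta)
    for q in range(n):
        mask = 1 << q
        for k in range(dim):
            if not k & mask:
                a0, a1 = state[k], state[k | mask]
                state[k] = c * a0 + s * a1
                state[k | mask] = s * a0 + c * a1
    mean_energy = sum(abs(a) ** 2 * e for a, e in zip(state, energies))
    return (len(edges) - mean_energy) / 2


def grid_search(edges, n, steps=16):
    """Best (expected cut, gamma, beta) on a uniform grid over [0, pi)."""
    grid = [k * math.pi / steps for k in range(steps)]
    return max((qaoa_expected_cut(edges, n, g, b), g, b) for g in grid for b in grid)


edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hypothetical 4-cycle, maximum cut 4
best_cut, best_gamma, best_beta = grid_search(edges, 4)
print(best_cut, best_gamma, best_beta)
```

Even without noise, the best expected cut of the single-layer ansatz stays below the maximum cut of this graph, although it clearly exceeds the random-guess value of half the number of edges.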


Solving the Maxcut problem has recently been adopted as a benchmark for the practical capabilities of a quantum computer [44]. The Q-score of a quantum computer equals the size of the graphs whose Maxcut problem can be solved sufficiently well. The obtained cost of the solution, i.e. the average number of cut edges, has to be above a certain threshold. Specifically, the cost must exceed 0.2 on a scale where 0 corresponds to a random solution and 1 to the ideal solution. The graphs chosen for the benchmark are random Erdős–Rényi graphs with 50% edge probability between nodes.
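The rescaling described above can be written as a one-line formula. The sketch below is a generic illustration for a single graph, with hypothetical function names; how [44] fixes the random and ideal reference values over the ensemble of random graphs is not reproduced here.

```python
def approximation_ratio(average_cut, random_cut, optimal_cut):
    """Rescale a cut size so that 0 is a random partition and 1 is the optimum."""
    return (average_cut - random_cut) / (optimal_cut - random_cut)


def passes_threshold(ratio, threshold=0.2):
    """Benchmark criterion quoted in the text: the ratio must exceed 0.2."""
    return ratio > threshold


# Made-up example: 4 edges, so a random partition cuts 2 on average; the optimum is 4.
ratio = approximation_ratio(average_cut=3.0, random_cut=2.0, optimal_cut=4.0)
print(ratio, passes_threshold(ratio))
```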

We present a comprehensive Q-score benchmark on our 5-qubit quantum computer. By employing the virtual node technique, we can solve the n-node graph with \(n-1\) qubits. Figure 14 displays Q-score results up to five physical qubits.

Figure 14

Q-score approximation ratios \(\beta (n)\) for our 5-qubit quantum computer. Ratios above the threshold 0.2 pass the Q-score benchmark. Results of a noiseless simulator are shown to highlight the approximate nature of QAOA; with a one-layer ansatz, the limited expressivity leads to results well below the optimal ratio of 1.0 even without noise. We employ the virtual node technique and readout error mitigation. The Q-score ratios are averages over 100 random Maxcut problems. We used 2048 shots per optimization step in QAOA

5 Applications to research

Small-scale quantum computers have been used in many different areas of scientific research, including but not limited to physics, chemistry and mathematics. The current trend in NISQ computer research is, no doubt, toward scaling up the number of physical qubits for commercial use of quantum computers. Nonetheless, [45] reports that many research papers published in scientific journals demonstrate “proofs of principle” of scientific ideas with a small-scale quantum computer.

We illustrate three such examples from physics, mathematics and chemistry in this section, and demonstrate them with our 5-qubit superconducting quantum computer.

5.1 Simulating neutrino oscillations

It is known that there are at least three types (i.e. flavors) of neutrinos in Nature. They are called \(\nu _{e}\), \(\nu _{\mu}\) and \(\nu _{\tau}\). The mass matrix of these neutrinos is not diagonal in the flavor basis \(\{| \nu _{e} \rangle , | \nu _{\mu} \rangle , | \nu _{\tau} \rangle \}\). Let us denote the eigenvectors of the mass matrix by \(\{| \nu _{1} \rangle , | \nu _{2} \rangle , | \nu _{3} \rangle \}\) with masses (eigenvalues) \(m_{1} < m_{2}<m_{3}\). Let H be the Hamiltonian that describes neutrinos. The mass eigenstates satisfy

$$ H| \nu _{k} \rangle =E_{k}| \nu _{k} \rangle \quad (k=1,2,3), $$

where \(E_{k} =\sqrt{p^{2} c^{2}+m_{k}^{2} c^{4}}\), p is the momentum of the neutrino, and c is the speed of light.

These two bases are related by a unitary matrix called the Pontecorvo–Maki–Nakagawa–Sakata matrix \(U_{\mathrm{PMNS}}=(\langle \nu _{\alpha}|\nu _{j}\rangle )\) [46–48] as

$$ \begin{pmatrix} \nu _{e} \\ \nu _{\mu} \\ \nu _{\tau} \\ \nu _{X} \end{pmatrix} = U_{\mathrm{PMNS}} \begin{pmatrix} \nu _{1} \\ \nu _{2} \\ \nu _{3} \\ \nu _{4} \end{pmatrix}, $$


$$ U_{\mathrm{PMNS}} = \begin{pmatrix} 0.8255 & 0.5445 & -0.142+0.0434 i & 0 \\ -0.2709+0.02739 i & 0.6057 +0.0181 i & 0.7475 & 0 \\ 0.4938 +0.0237 i & -0.5798+0.0157 i & 0.6475 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . $$

Here fictitious neutrinos \(\nu _{X}\) and \(\nu _{4}\) are introduced so that this system can be simulated with a two-qubit system. \(\nu _{X}\) and \(\nu _{4}\) are decoupled from the physical neutrinos and have no physical significance. This fourth neutrino may be utilized in a theory with an exotic neutrino which is yet to be discovered.

We closely follow [49] in the following. We keep the CP-violating phase \(\delta _{\mathrm{CP}}\) in \(U_{\mathrm{PMNS}}\) while [49] ignored this phase. The phase was taken into account in [50], which also generalizes the simulation to arbitrarily many neutrino species. We employ parameters announced in November 2022 [48] in Eq. (19).

Suppose \(| \nu _{\mu} \rangle =(0,1,0,0)^{t}\) is created at \((x,t) = (0,0)\). The probability of detecting \(| \nu _{\alpha} \rangle \) \((\alpha =e, \mu , \tau )\) at \(t>0\) is

$$\begin{aligned} p_{\alpha}(t) =&\bigl|\langle \nu _{\alpha} | \nu _{\mu}(t) \rangle \bigr|^{2} = \bigl| \langle \nu _{\alpha} | e^{-i H t/\hbar}|\nu _{\mu}\rangle \bigr|^{2} \\ =& \Biggl\vert \sum_{k=1}^{4} \langle \nu _{\alpha} |\nu _{k} \rangle e^{-i E_{k} t/\hbar} \langle \nu _{k} |\nu _{\mu}\rangle \Biggr\vert ^{2} \\ =&\bigl|\langle \nu _{\alpha} | U_{\mathrm{PMNS}} \operatorname{diag} \bigl(e^{-i E_{1} t/ \hbar}, e^{-i E_{2} t/\hbar},e^{-i E_{3} t/\hbar},e^{-i \phi} \bigr) U_{ \mathrm{PMNS}}^{\dagger }| \nu _{\mu} \rangle \bigr|^{2}, \end{aligned}$$

where ϕ is an unphysical phase.

Let us simulate this system with a quantum computer. There are three steps to take.

  • Express \(| \nu _{\mu} \rangle =| 01 \rangle \) in terms of the mass eigenstates \(| \nu _{k} \rangle \), which is done by applying \(U_{\mathrm{PMNS}}^{\dagger}\) on \(| \nu _{\mu} \rangle \).

  • Apply the time-evolution operator \(V(t) = e^{-iH t/\hbar}\) on this state to find \(| \nu _{\mu}(t) \rangle = V(t) U_{\mathrm{PMNS}}^{\dagger }| \nu _{\mu} \rangle \). \(V(t)\) is decomposed into two one-qubit gates as \(V(t)=S_{1}(t) \otimes S_{2}(t)\), whose explicit forms are given below.

  • Measure \(| \nu _{\mu}(t) \rangle \) in the basis \(\{| \nu _{e} \rangle , | \nu _{\mu} \rangle , | \nu _{\tau} \rangle \}\). This is done by applying \(U_{\mathrm{PMNS}}\) on \(| \nu _{\mu}(t) \rangle \) and measuring the output in the binary basis.

Let us analyze these steps in depth. We prepare the initial state \(| \nu _{\mu} \rangle =| 01 \rangle \) in the \(\{| \nu _{k} \rangle \}\) basis as

$$ U_{\mathrm{PMNS}}^{\dagger }| 01 \rangle = u_{\mu 1}^{*} | \nu _{1} \rangle + u_{ \mu 2}^{*}| \nu _{2} \rangle +u_{\mu 3}^{*}| \nu _{3} \rangle , $$

where \(u_{\mu k}\) are the matrix elements of \(U_{\mathrm{PMNS}}\).

The time-evolution operator \(V(t)\) is diagonal in this basis and takes the form

$$ V(t)=\operatorname{diag}\bigl(e^{-i E_{1} t/\hbar}, e^{-i E_{2} t/\hbar},e^{-i E_{3} t/\hbar},e^{-i \phi} \bigr). $$

Since the overall phase has no physical significance, we may factor out \(e^{-i E_{1} t/\hbar}\) so that

$$ V(t)=\operatorname{diag}\bigl(1, e^{-i E_{21} t/\hbar},e^{-i E_{31} t/\hbar},e^{-i \phi '} \bigr) =S_{1}(t) \otimes S_{2}(t), $$

where \(E_{21}=E_{2}-E_{1}\), \(E_{31}=E_{3}-E_{1}\) and \(\phi '\) is another unphysical phase. Here

$$ S_{1}(t) = \begin{pmatrix} 1&0 \\ 0&e^{-i E_{31} t/\hbar} \end{pmatrix} , \qquad S_{2}(t) = \begin{pmatrix} 1&0 \\ 0&e^{-i E_{21} t/\hbar} \end{pmatrix} , $$

where we took advantage of the fact that \(\phi '\) has no physical meaning. Now we have \(| \nu _{\mu}(t) \rangle \), the state at time t of the neutrino that was \(| \nu _{\mu} \rangle \) at \(t=0\), as

$$ | \nu _{\mu}(t) \rangle = \bigl(S_{1}(t) \otimes S_{2}(t)\bigr)U_{\mathrm{PMNS}}^{ \dagger }| 01 \rangle . $$
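Writing out the tensor product makes the role of the unphysical phase explicit. Assuming the basis assignment \(| 00 \rangle , | 01 \rangle , | 10 \rangle , | 11 \rangle \leftrightarrow \nu _{1}, \nu _{2}, \nu _{3}, \nu _{4}\), which is our reading of the conventions used above, we have

$$ S_{1}(t) \otimes S_{2}(t) = \operatorname{diag}\bigl(1, e^{-i E_{21} t/\hbar}, e^{-i E_{31} t/\hbar}, e^{-i (E_{21}+E_{31}) t/\hbar} \bigr), $$

so the product form corresponds to the choice \(\phi ' = (E_{21}+E_{31})t/\hbar \). This choice is allowed because the fourth state is decoupled from the physical neutrinos, and it is what reduces the time evolution to two single-qubit phase gates.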

Since the neutrino masses are very small and the velocity is very close to the speed of light c, we may approximate \(E_{k}=\sqrt{p^{2}c^{2}+m_{k}^{2} c^{4}}\) as \(E_{k} \simeq pc + m_{k}^{2} c^{3}/2p\). Then

$$ E_{k1} \simeq \frac{\Delta m_{k1}^{2} c^{3}}{2p} \quad (k=2,3), $$

where \(\Delta m_{k1}^{2}=m_{k}^{2}-m_{1}^{2}\). We employ \(\Delta m_{21}^{2}= 7.39 \times 10^{-5}~\mathrm{eV}^{2}\) and \(\Delta m_{31}^{2}= 2.45 \times 10^{-3}~\mathrm{eV}^{2}\) in our analysis [51].

By approximating \(E \simeq pc\) and \(L \simeq ct\), we have

$$ e^{-i E_{k1}t/\hbar} \simeq \exp \biggl(- i \frac{\Delta m_{k1}^{2} c^{3} }{2\hbar} \frac{L}{E} \biggr). $$

The exponent is expressed numerically with physical units as

$$ \frac{\Delta m_{k1}^{2} c^{3} }{2\hbar} \frac{L}{E} \simeq 2.534\, \Delta m_{k1}^{2}\, \bigl[\mathrm{eV}^{2}\bigr] \times \frac{L\,[\mathrm{km}]}{E\,[\mathrm{GeV}]}. $$
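The theoretical prediction can be evaluated classically from the quantities given above. The sketch below uses the upper-left \(3\times 3\) block of \(U_{\mathrm{PMNS}}\) and the mass-squared differences quoted in the text, together with the numerical factor 2.534, to compute the detection probabilities as a function of \(L/E\). The function and variable names are illustrative, and because the matrix entries are rounded to four digits the probabilities sum to 1 only up to rounding errors.

```python
import cmath

# Upper-left 3x3 block of U_PMNS as quoted in the text
# (rows: e, mu, tau; columns: mass eigenstates 1, 2, 3).
U = [
    [0.8255, 0.5445, -0.142 + 0.0434j],
    [-0.2709 + 0.02739j, 0.6057 + 0.0181j, 0.7475],
    [0.4938 + 0.0237j, -0.5798 + 0.0157j, 0.6475],
]
DM21, DM31 = 7.39e-5, 2.45e-3  # eV^2, values used in the text
FLAVOURS = ("e", "mu", "tau")


def oscillation_probabilities(l_over_e, source="mu"):
    """Detection probabilities for a neutrino created in flavour `source`; L/E in km/GeV."""
    src = FLAVOURS.index(source)
    # Phases exp(-i E_k1 t / hbar) with the overall phase of nu_1 factored out.
    phases = [
        1.0,
        cmath.exp(-1j * 2.534 * DM21 * l_over_e),
        cmath.exp(-1j * 2.534 * DM31 * l_over_e),
    ]
    probs = {}
    for a, name in enumerate(FLAVOURS):
        amplitude = sum(U[a][k] * phases[k] * U[src][k].conjugate() for k in range(3))
        probs[name] = abs(amplitude) ** 2
    return probs


for l_over_e in (0, 500, 4000):
    print(l_over_e, oscillation_probabilities(l_over_e))
```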

These steps are implemented on a quantum computer for different values of t, namely different \(L/E\), and the results are compared with the theoretical prediction. Figure 15 (a) shows the quantum circuit for this scheme while (b) shows the gate decomposition of \(U_{\mathrm{PMNS}}\). The overall gate decomposition depends on \(L/E\). Figure 15 (c) shows the transpiled gate decomposition for \(L/E=4000\) [km/GeV].

Figure 15

(a) Quantum circuit to simulate neutrino oscillation among 3 generations. The input state is \(| 01 \rangle =| \nu _{\mu} \rangle \). (b) Implementation of \(U_{\mathrm{PMNS}}\) using our native gates. The first number in the green box represents θ, while the second number represents ϕ in \(R(\theta , \phi )=\exp [-i\theta (\cos \phi \, X + \sin\phi \,Y )/2 ]\). The parameter ϕ in R is defined \(\operatorname{mod} 2\pi \). (c) Circuit to simulate neutrino oscillations, transpiled to our native gates at \(L/E=4000\). The number of gates is reduced via circuit optimization during the transpilation. Note that in (b) and (c) the order of qubits is reversed according to the Qiskit convention

Figure 16 shows the theoretical curves predicting the detection probabilities of the three neutrino flavors and our quantum computer output as functions of \(L/E\). The probabilities oscillate as a function of \(L/E\) while their sum is always 1. This oscillation is possible only when different types of neutrinos have different masses (\(\Delta m_{k1}^{2}\neq 0\)). Thus neutrino oscillation is a smoking gun of massive neutrinos (Footnote 2). Neutrino oscillation was the subject of the Nobel Prize in Physics in 2015 [52].

Figure 16

Predicted measurement probabilities of neutrino species (solid curves) and measured probabilities (markers) as functions of \(L/E~[\mathrm{km/GeV}]\), where \(\nu _{\mu}\) is created at \(L=0\) at \(t=0\). We used 5000 shots per data point and applied readout error mitigation

5.2 Estimation of the Jones polynomials

Knots, links and braids are fascinating subjects of topology. An unexpected encounter between mathematics (topology) and physics (statistical physics) was discovered by Vaughan Jones in 1984. He discovered knot and link invariants, later called the Jones polynomials, that characterize oriented knots and links. His work was inspired by statistical mechanics of generalized spin models defined on a lattice. Jones was awarded the Fields medal in 1990 for his achievements including the discovery of the Jones polynomials.

Estimation of the Jones polynomials by employing a quantum computer was proposed in [53, 54] and demonstrated with an NMR quantum computer [55]. We reproduce here the results of [55] using our superconducting quantum computer.

We first introduce a braid b associated with a link L. A link is an embedding of a set of loops in \(\mathbb{R}^{3}\) or the 3-dimensional sphere \(S^{3}\). If a link is made of one component, it is called a knot K. Let \((x,y,t)\) be a coordinate of 3-dimensional space-time. A braid b is a set of strings that connect n points \((1,0,0), (2,0,0), \ldots, (n,0,0)\) on \(\mathbb{R}^{2}\) at \(t=0\) and \((1,0,1), (2,0,1), \ldots, (n,0,1)\) at \(t=1\) without intersecting or going backwards in time. Alexander’s theorem states that every knot and link can be expressed as a braid whose end points are closed, see Fig. 17. The set of braids with n strands has a group structure called the Artin group \(B_{n}\), whose generators are denoted as \(\sigma _{k}\). The generator \(\sigma _{k}\) twists the kth strand and \((k+1)\)st strand as shown in Fig. 18 for \(n=3\). The inverse \(\sigma _{k}^{-1}\) twists them in the opposite direction. The generators satisfy the following relations:

$$ \begin{aligned} &\sigma _{k} \sigma _{k} ^{-1}=1, \quad k =1,2, \ldots , n-1, \\ &\sigma _{k} \sigma _{k+1} \sigma _{k}= \sigma _{k+1}\sigma _{k} \sigma _{k+1},\quad k=1, 2, \ldots , n-2, \\ &\sigma _{j} \sigma _{k}=\sigma _{k} \sigma _{j},\quad |j-k|\geq 2. \end{aligned} $$

An arbitrary braid b can be expressed in terms of successive applications of these generators and their inverses as

$$ \sigma _{j_{p}}^{s_{p}} \sigma _{j_{p-1}}^{s_{p-1}} \ldots \sigma _{j_{2}}^{s_{2}} \sigma _{j_{1}}^{s_{1}}, $$

where \(s_{k} \in \{1, -1\}\), \(j_{k} \in \{1, 2, \ldots , n-1\}\) and \(\sigma _{j_{1}}^{s_{1}}\) is applied first and \(\sigma _{j_{p}}^{s_{p}}\) last. This is called the braid word of b. Braid words are not unique and there are infinitely many braid words corresponding to the same braid.

Figure 17

(a) Hopf link and its corresponding braid. (b) Trefoil knot and its corresponding braid. Braids are closed with dotted lines to form the link and the knot

Figure 18

Generators \(\sigma _{1}\), \(\sigma _{2}\) of three-strand braids

In the following, we are concerned with three-strand braids, namely \(n=3\). The group \(B_{3}\) has two generators \(\sigma _{1}\) and \(\sigma _{2}\) as shown in Fig. 18. One possible braid word of the trefoil knot (Fig. 17 (b)) is \(\sigma _{1}^{3}\). Closure of the three-strand braid \(\sigma _{1}^{3}\) results in the trefoil and the trivial knot, as shown in Fig. 19 (a).

Figure 19

(a) Simplest representation \(\sigma _{1}^{3}\) of the trefoil knot. (b) Equivalent but more knotted representation of the trefoil knot

It is possible to “unwind” the crossings of a braid diagram by introducing the generators \(\{U_{i}\}\) of the Temperley-Lieb algebra \(\text{TL}_{n}\). There are two generators \(U_{1}\), \(U_{2}\) for \(n=3\) and they satisfy

$$ U_{i}^{2}=\delta U_{i},\qquad U_{1} U_{2} U_{1}= U_{1},\qquad U_{2} U_{1} U_{2}= U_{2}, $$

where \(\delta =-A^{2}-A^{-2}\) with \(A=e^{i \theta}\) a complex number with unit modulus. The map \(\rho : B_{3} \to \mathrm{GL}_{2}(\mathbb{C})\) defined as

$$ \rho (\sigma _{i})= A I + A^{-1} U_{i} $$

is a representation of \(B_{3}\). We take the simplest braid word of the trefoil \(\sigma _{1}^{3}\) and its representation \(\rho (\sigma _{1}^{3})\) here. Note that the relations of the braid generators (27) are satisfied if \(U_{i}\) satisfies Eq. (29). The representation ρ is unitary whenever \(U_{i}\) is a real symmetric matrix satisfying \(\delta ^{2} \geq 1\). The last condition is satisfied if

$$ \theta \in [0,\pi /6] \sqcup [\pi /3, 2\pi /3] \sqcup [5\pi /6, 7\pi /6] \sqcup [4\pi /3,5\pi /3] \sqcup [11\pi /6, 2\pi ]. $$

Explicitly, the generators \(U_{i}\) are given as

$$ U_{1}= \begin{pmatrix} \delta &0 \\ 0&0 \end{pmatrix} \quad \text{and}\quad U_{2}= \begin{pmatrix} \delta ^{-1}&\sqrt{1-\delta ^{-2}} \\ \sqrt{1-\delta ^{-2}}&\delta -\delta ^{-1} \end{pmatrix} . $$
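The relations above can be checked numerically. The following is a minimal sketch, assuming NumPy and a value of θ inside one of the allowed intervals; the function names are illustrative and not part of the original work. It builds \(U_{1}\), \(U_{2}\) and \(\rho (\sigma _{i})\) from the explicit matrices and tests the Temperley-Lieb relations, the braid relation and unitarity.

```python
import numpy as np

def tl_generators(theta):
    """Return (A, delta, U1, U2) for A = exp(i*theta), assuming delta**2 > 1."""
    A = np.exp(1j * theta)
    delta = -2.0 * np.cos(2.0 * theta)  # real form of -A**2 - A**-2 for |A| = 1
    s = np.sqrt(1.0 - delta**-2)
    U1 = np.array([[delta, 0.0], [0.0, 0.0]])
    U2 = np.array([[1.0 / delta, s], [s, delta - 1.0 / delta]])
    return A, delta, U1, U2

def braid_generators(theta):
    """rho(sigma_i) = A*I + U_i/A for i = 1, 2."""
    A, _, U1, U2 = tl_generators(theta)
    identity = np.eye(2)
    return A * identity + U1 / A, A * identity + U2 / A

def check_relations(theta):
    """Test the algebraic relations quoted in the text for a given theta."""
    _, delta, U1, U2 = tl_generators(theta)
    s1, s2 = braid_generators(theta)
    identity = np.eye(2)
    return {
        "U1^2 = delta U1": np.allclose(U1 @ U1, delta * U1),
        "U2^2 = delta U2": np.allclose(U2 @ U2, delta * U2),
        "U1 U2 U1 = U1": np.allclose(U1 @ U2 @ U1, U1),
        "U2 U1 U2 = U2": np.allclose(U2 @ U1 @ U2, U2),
        "braid relation": np.allclose(s1 @ s2 @ s1, s2 @ s1 @ s2),
        "rho(sigma_1) unitary": np.allclose(s1 @ s1.conj().T, identity),
        "rho(sigma_2) unitary": np.allclose(s2 @ s2.conj().T, identity),
    }

# theta = 0.4 lies in [0, pi/6], so delta**2 >= 1 holds.
for name, ok in check_relations(0.4).items():
    print(f"{name}: {ok}")
```

Values of θ at the interval boundaries give \(\delta ^{2}=1\), where the square root in \(U_{2}\) vanishes; the sketch avoids these points only to stay clear of floating-point rounding.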

Before we introduce the Kauffman bracket and the Jones polynomial, we need to define the writhe \(w(b) \in \mathbb{Z}\) of a braid b. Suppose a braid word of b is given as Eq. (28). Then the writhe of b is defined as the sum of the exponents,

$$ w(b) = \sum_{i=1}^{p} s_{i}. $$

For the Hopf link and the trefoil knot we find \(w(\text{Hopf link})=2\) and \(w(\text{trefoil})=3\), respectively.

The Kauffman bracket of the trefoil is obtained as

$$ \langle \text{trefoil} \rangle =\frac{\langle \bar{b} \rangle}{\delta}= \frac{1}{\delta}\bigl(\operatorname{tr} \rho \bigl(\sigma _{1}^{3}\bigr)+A^{w(b)}\bigl(\delta ^{2}-2\bigr)\bigr) =-A^{5} - A^{-3} + A^{-7}, $$

where b is the braid word given in Fig. 17 (b) and \(\bar{b}\) stands for the closure of b. Note that \(\bar{b}\) is made of the trefoil knot and the trivial knot. The factor \(1/\delta \) in Eq. (33) removes the contribution of the trivial knot.

The Kauffman bracket of the Hopf link is obtained in a similar way as

$$ \langle \text{Hopf link} \rangle = \frac{1}{\delta} \bigl( \operatorname{tr}\rho \bigl( \sigma _{1}^{2}\bigr) + A^{2}\bigl(\delta ^{2}-2\bigr)\bigr) =-A^{4}-A^{-4}. $$

The Kauffman bracket is invariant under the Reidemeister moves II and III but not under the Reidemeister move I, and hence is not a knot invariant. The Jones polynomial is obtained by multiplying the Kauffman bracket by \((-A^{3})^{- w(b)}\) to make it invariant under all three Reidemeister moves. For the trefoil knot, we obtain the Jones polynomial

$$ V_{\mathrm{trefoil}}(A)= \bigl(-A^{3}\bigr)^{-3} \bigl(-A^{5} - A^{-3} + A^{-7}\bigr) = A^{-4}+ A^{-12}- A^{-16}. $$

It is common to introduce \(t=A^{-4}\) so that

$$ V_{\mathrm{trefoil}}(t) = - t^{4} + t^{3} + t. $$

The Jones polynomial of the Hopf link is

$$ V_{\text{Hopf link}} = -A^{-10}-A^{-2} = -\sqrt{t} \bigl(1+t^{2}\bigr). $$

The Jones polynomial is a Laurent polynomial in \(\sqrt{t}\) in general.
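The closed-form expressions above can be cross-checked numerically from the trace formula. The sketch below assumes NumPy and uses \(\rho (\sigma _{1}) = \operatorname{diag}(A + \delta /A,\, A)\), which follows from \(U_{1}=\operatorname{diag}(\delta ,0)\); the function names and the test value of θ are illustrative choices.

```python
import numpy as np

def kauffman_bracket(theta, k):
    """Kauffman bracket of the closure of sigma_1^k (k = 2: Hopf link, k = 3: trefoil)."""
    A = np.exp(1j * theta)
    delta = -A**2 - A**-2
    rho_sigma1 = np.diag([A + delta / A, A])  # A*I + U1/A with U1 = diag(delta, 0)
    trace = np.trace(np.linalg.matrix_power(rho_sigma1, k))
    writhe = k  # all k exponents in sigma_1^k are +1
    return (trace + A**writhe * (delta**2 - 2)) / delta

def jones(theta, k):
    """Jones polynomial evaluated at A = exp(i*theta)."""
    A = np.exp(1j * theta)
    return (-A**3) ** (-k) * kauffman_bracket(theta, k)

theta = 0.4  # arbitrary test point with delta != 0
A = np.exp(1j * theta)
t = A**-4

trefoil_bracket_ok = np.isclose(kauffman_bracket(theta, 3), -A**5 - A**-3 + A**-7)
hopf_bracket_ok = np.isclose(kauffman_bracket(theta, 2), -A**4 - A**-4)
trefoil_jones_ok = np.isclose(jones(theta, 3), -t**4 + t**3 + t)
hopf_jones_ok = np.isclose(jones(theta, 2), -A**-10 - A**-2)

print(trefoil_bracket_ok, hopf_bracket_ok, trefoil_jones_ok, hopf_jones_ok)
```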

Let us use a quantum computer to estimate the trace \(\operatorname{tr} \rho (\sigma _{1}^{k})\) in the Kauffman brackets (33) and (34). Consider the quantum circuit Fig. 20 (a) with \(U= \rho (\sigma _{1}^{k})\) and the input state

$$ \rho _{0} = | 0 \rangle \langle 0 | \otimes \frac{1}{2}I_{2} =\frac{1}{2} \begin{pmatrix} I_{2}&0 \\ 0&0 \end{pmatrix} , $$

where \(I_{2}/2\) is the maximally mixed state. The state after the quantum circuit is applied is

$$ \rho _{1}= \begin{pmatrix} I_{2}&0 \\ 0&U \end{pmatrix} \frac{1}{4} \begin{pmatrix} I_{2}&I_{2} \\ I_{2}&I_{2} \end{pmatrix} \begin{pmatrix} I_{2}&0 \\ 0&U^{\dagger }\end{pmatrix} =\frac{1}{4} \begin{pmatrix} I_{2}&U^{\dagger } \\ U&I_{2} \end{pmatrix} . $$

The expectation value of \(X_{1}\) with respect to \(\rho _{1}\) is

$$ E(X_{1}) = \operatorname{tr}(X_{1} \rho _{1}) = \frac{1}{2} \operatorname{Re} \operatorname{tr} U $$

while the expectation value of \(Y_{1}\) is

$$ E(Y_{1}) = \operatorname{tr}(Y_{1} \rho _{1}) = \frac{1}{2} \operatorname{Im} \operatorname{tr} U. $$

Hence, \(\operatorname{tr} U\) is found by estimating these two expectation values.
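As an illustration, the density-matrix calculation above can be reproduced in a few lines. This is a sketch assuming NumPy, with the first qubit taken as the most significant one and the circuit taken to be a Hadamard on the first qubit followed by a controlled-U, as implied by the expression for \(\rho _{1}\); the helper name is hypothetical.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])

def trace_from_mixed_input(U):
    """Recover tr(U) from E(X_1) and E(Y_1) for the input |0><0| (x) I/2."""
    rho0 = np.kron(np.diag([1.0, 0.0]), I2 / 2)
    zeros = np.zeros((2, 2))
    controlled_u = np.block([[I2, zeros], [zeros, U]])
    circuit = controlled_u @ np.kron(H, I2)
    rho1 = circuit @ rho0 @ circuit.conj().T
    e_x = np.trace(np.kron(X, I2) @ rho1).real  # equals Re tr(U) / 2
    e_y = np.trace(np.kron(Y, I2) @ rho1).real  # equals Im tr(U) / 2
    return 2 * (e_x + 1j * e_y)

# Example: U = rho(sigma_1^3) = diag(-A^-3, A)^3 at an arbitrary theta.
A = np.exp(1j * 0.4)
U = np.diag([(-A**-3) ** 3, A**3])
estimate = trace_from_mixed_input(U)
print(np.isclose(estimate, np.trace(U)))
```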

Figure 20. (a) Quantum circuit to estimate the Jones polynomial with input \(| 0 \rangle \langle 0 | \otimes I_{2}/2\). (b) Quantum circuit to estimate the Jones polynomial using an ancilla qubit to synthesize the mixed state. The input state is \(| 000 \rangle \). To estimate the Jones polynomial, the expectation values of \(X_{1}\) and \(Y_{1}\) are estimated for the output state. (c) Circuit for the trefoil knot, appended with an H gate on the first qubit for measuring \(E(X_{1})\) and transpiled to our native gates and connectivity at \(\theta =\pi /6\). The second parameter of R is defined \(\operatorname{mod} 2\pi \).

The above scheme fits well with an NMR quantum computer, in which the system is, to a good approximation, in a maximally mixed state. A superconducting quantum computer is ideally in a pure state, so the above scheme is not applicable in its original form. We therefore use purification to “synthesize” a maximally mixed state from a pure state. Let us consider the circuit in Fig. 20 (b). The bottom two qubits are in the Bell state

$$ | \Phi _{+} \rangle =\frac{1}{\sqrt{2}}\bigl(| 00 \rangle +| 11 \rangle \bigr) $$

after application of the Hadamard gate and the CNOT gate. The middle qubit is in a maximally mixed state if the bottom qubit is ignored (i.e. partially traced out). We have already seen this in Sect. 4.2.2. Observe that

$$ \operatorname{tr}_{2} | \Phi _{+} \rangle \langle \Phi _{+} |=\frac{1}{2} I_{2}. $$

Then the output of the top two qubits is the same as that of Fig. 20 (a).

Suppose the quantum circuit Fig. 20 (b) is applied on \(| 000 \rangle \). Then the output state is

$$ | \Psi \rangle =\frac{1}{2}\bigl(| 000 \rangle + | 011 \rangle +| 1 \rangle \bigl(U| 0 \rangle \bigr) | 0 \rangle +| 1 \rangle \bigl(U| 1 \rangle \bigr)| 1 \rangle \bigr). $$

We estimate the expectation value of \(X_{1}\) with respect to \(| \Psi \rangle \) to get

$$ E(X_{1})= \langle \Psi | X_{1} | \Psi \rangle = \frac{1}{4}\bigl(\operatorname{tr}U+ \operatorname{tr}U^{\dagger} \bigr)=\frac{1}{2}\operatorname{Re}(\operatorname{tr} U), $$

similarly for \(Y_{1}\) we get

$$ E(Y_{1})=\langle \Psi | Y_{1} | \Psi \rangle = \frac{i}{4}\bigl(- \operatorname{tr}U+ \operatorname{tr}U^{\dagger} \bigr)= \frac{1}{2}\operatorname{Im}(\operatorname{tr} U), $$

from which we estimate \(\operatorname{tr} U\). Figure 20 (c) shows the transpiled circuit to estimate \(E(X_{1})\) for \(\theta =\pi /6\).
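The purified version of the scheme can be illustrated with a small state-vector simulation. This is a sketch assuming NumPy, ideal gates, and the qubit ordering top, middle, bottom from most to least significant; it is not the transpiled circuit of Fig. 20 (c), and the function name is hypothetical.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])

def purified_output_state(U):
    """Output of the three-qubit circuit applied to |000>."""
    psi = np.zeros(8, dtype=complex)
    psi[0] = 1.0
    psi = np.kron(np.kron(H, H), I2) @ psi  # Hadamards on top and middle qubits
    psi = np.kron(I2, CNOT) @ psi           # Bell pair on middle and bottom qubits
    zeros = np.zeros((2, 2))
    controlled_u = np.block([[I2, zeros], [zeros, U]])
    return np.kron(controlled_u, I2) @ psi  # top qubit controls U on the middle qubit

A = np.exp(1j * 0.4)
U = np.diag([(-A**-3) ** 3, A**3])  # rho(sigma_1^3) at an arbitrary theta
psi = purified_output_state(U)

# Expectation values of X and Y on the top qubit.
e_x = np.vdot(psi, np.kron(X, np.eye(4)) @ psi).real
e_y = np.vdot(psi, np.kron(Y, np.eye(4)) @ psi).real
trace_estimate = 2 * (e_x + 1j * e_y)
print(np.isclose(trace_estimate, np.trace(U)))
```

On hardware the two expectation values are estimated from repeated measurements, so the recovered trace carries statistical and device noise that this ideal simulation does not model.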

Figure 21 shows the real and the imaginary parts of \(\operatorname{tr} \rho (\sigma _{1}^{k})\) obtained using \(| \Psi \rangle \), with \(k=2\) for the Hopf link and \(k=3\) for the trefoil knot. The readout error mitigated results are marked by the blue circles.

Figure 21. Trace in the Jones polynomial of the Hopf link and the trefoil knot. (a) Real part and (b) imaginary part of the trace in the Hopf link, estimated as the expectation values of X and Y on the first qubit, respectively. Similarly, (c) and (d) show the real and the imaginary parts of the trace for the trefoil knot. Solid curves are theoretical results while the markers show experimental results evaluated on our quantum computer. Circles are from readout error mitigated execution and diamonds also include randomized compiling (RC) and zero-noise extrapolation (ZNE).

When quantum circuits get deeper, consisting of several layers of gates, errors that occur in the bulk of the circuit accumulate and can significantly affect the results. This is the case for the transpiled circuits considered here, see Fig. 20 (c), and we therefore also implemented error suppression and additional error mitigation techniques, namely randomized compiling and zero-noise extrapolation. The mitigated expectation values are shown with green diamonds and, in general, are indeed closer to the ideal noiseless results (red). In more detail, we generate 30 different randomized compilations of the original circuit and measure each 20,000 times. The higher shot count is employed to combat the increased variance resulting from zero-noise extrapolation, where we scale the noise by factors of 3 and 5 using global folding. Polynomial fitting is used to extrapolate to the limit of zero noise.
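The extrapolation step can be sketched as follows. This is a minimal illustration assuming NumPy, not the implementation used for Fig. 21: the expectation values are synthetic, generated from a hypothetical exponential decay model, and the unscaled circuit (scale factor 1) is assumed to be included alongside the factors 3 and 5.

```python
import numpy as np

def zero_noise_extrapolate(scale_factors, values, degree=None):
    """Fit a polynomial in the noise scale factor and evaluate it at zero noise."""
    scale_factors = np.asarray(scale_factors, dtype=float)
    values = np.asarray(values, dtype=float)
    if degree is None:
        degree = len(scale_factors) - 1  # interpolating polynomial by default
    coefficients = np.polyfit(scale_factors, values, degree)
    return np.polyval(coefficients, 0.0)

# Synthetic data only: a hypothetical ideal value damped exponentially with noise scale.
ideal_value = 0.8
decay_rate = 0.05  # hypothetical noise strength per unit scale factor
scale_factors = [1, 3, 5]
noisy_values = [ideal_value * np.exp(-decay_rate * s) for s in scale_factors]

mitigated = zero_noise_extrapolate(scale_factors, noisy_values)
print(f"unmitigated error: {abs(noisy_values[0] - ideal_value):.4f}")
print(f"mitigated error:   {abs(mitigated - ideal_value):.4f}")
```

With real data each noisy value has shot noise, and the extrapolation amplifies it, which is why a higher shot count is used in the experiment.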

The Kauffman and the Jones polynomials may be obtained in many different ways. All of them are easy if the trefoil is represented as in Fig. 19 (a), for example. But it will be more demanding if the representation is more knotted as in Fig. 19 (b). A typical classical evaluation of these polynomials involves a sum over “states” obtained by splitting each crossing in two different ways. There are \(2^{m}\) states if there are m crossings and the task is exponentially hard as m increases. In contrast, in quantum computing, a controlled unitary gate is assigned to each crossing, i.e., the braid group generator, which requires merely m controlled unitary gates.

5.3 Embedding techniques for quantum chemistry

Embedding techniques are theoretical frameworks used in quantum chemistry and condensed matter physics to study the electronic structure of strongly correlated materials. These methods are particularly valuable for systems where the effects of electron-electron interactions are significant and where traditional methods such as density functional theory (DFT) [56, 57] fail to provide accurate descriptions. They represent a powerful tool to study the electronic properties of materials such as transition-metal oxides and rare-earth compounds. Embedding techniques have been successfully applied to further the understanding of complex phenomena such as metal-insulator transitions, magnetically ordered states and high-temperature superconductivity.

The central idea of these techniques is to map a complex quantum many-body problem to a self-consistent Anderson impurity model (AIM) [58], which consists of a strongly correlated subsystem (impurity) and a weakly correlated or non-correlated subsystem (bath). The impurity is treated more accurately using a method capable of handling strong correlations, while the bath is treated at a lower level of theory, often using mean-field approximations. These descriptions of the impurity and the bath are combined in a self-consistent loop over either single-particle quantities (i.e. density matrix embedding theory (DMET) [59, 60] and rotationally invariant slave-boson (RISB) techniques [61–63]) or two-particle quantities (i.e. self-energy embedding theory (SEET) [64, 65], dynamical mean-field theory (DMFT) [66–68] and its cluster extensions [67, 69]) in order to provide a more accurate description of the entire system. Figure 22 is a schematic of the self-consistent loop in embedding-based techniques.

Figure 22. Schematic diagram of the embedding loop showcasing the part of the loop that is to be performed on the quantum computer.

Consider the electronic structure (ES) Hamiltonian in second-quantized form:

$$ \mathcal{H}_{\text{ES}} = \sum_{pq}h^{pq} a^{\dagger}_{p} a_{q} + \sum _{pqrs}h^{pqrs} a^{\dagger}_{p} a^{\dagger}_{q} a_{r} a_{s}, $$

where p, q, r, s are indices of a given basis set and include the respective spin indices \(p \equiv p(\sigma )\). Further, \(a^{\dagger}\) and a are the fermionic creation and annihilation operators and \(h^{pq}\), \(h^{pqrs}\) are constants called the one- and two-electron integrals, respectively. Such a Hamiltonian is then mapped to an Anderson impurity-bath model given by:

$$\begin{aligned} \mathcal{H}_{\text{AIM}} = \sum_{k,\sigma} \varepsilon _{k} c_{k \sigma}^{\dagger }c_{k\sigma} + \sum_{\sigma }(\varepsilon _{d} - \mu ) d_{\sigma}^{\dagger }d_{ \sigma} + U n_{d\uparrow} n_{d\downarrow} + \sum _{k,\sigma} \bigl(V_{k} c_{k\sigma}^{\dagger }d_{\sigma} + \text{H.c.}\bigr), \end{aligned}$$

where k indexes the bath sites, σ labels the spin, \(d_{\sigma}\) is the impurity operator with \(n_{d\sigma} = d^{\dagger}_{\sigma }d_{\sigma}\), and \(c_{k\sigma}\) are the operators of the non-interacting bath. The parameters \(\varepsilon _{d}\) and \(\varepsilon _{k}\) are the on-site energies of the impurity and the bath, respectively, and μ is the chemical potential. In spite of giving a simple description of the lattice problem, this model is by itself challenging to solve for state-of-the-art numerical techniques such as density matrix renormalization group (DMRG) [70–73], quantum Monte Carlo (QMC) [74–77] and others. It has recently been shown that a promising alternative approach for computing the ground states of such systems comes from using variational algorithms implemented on quantum devices [78, 79], one of which, QAOA, has already been discussed in Sect. 4.3.

The variational quantum eigensolver (VQE) [80–82] is another quantum algorithm designed for finding the ground state energy, which is a fundamental quantity for quantum systems of interest in condensed matter physics and quantum chemistry. The basic idea behind VQE is to use a hybrid approach where the optimization of variational parameters performed on a classical computer is combined with measurements on the quantum computer to find an approximation to the ground state energy. The algorithm involves mainly a trial wavefunction in the form of a parameterized quantum circuit, chosen to represent a guess for the ground state of the quantum system. This circuit, controlled by a set of variational parameters, is executed on a quantum computer. Measurements are made on the final prepared state to calculate the expectation value of the Hamiltonian of the system. Then classical optimization algorithms are employed to tune the variational parameters in order to minimize the expectation value of the Hamiltonian. This process is repeated iteratively until the ground state energy is sufficiently minimized and the solution converges.

On our 5-qubit quantum computer (see Fig. 23) we aim to perform a VQE calculation in order to approximate the ground state of the one-impurity-site and one-bath-site AIM (\(k = 1\) in Eq. (44)). The fermionic Hamiltonian of the AIM has to be mapped to a qubit Hamiltonian using the following Jordan-Wigner transformation [83]:

$$\begin{aligned}& d_{\uparrow}^{\dagger } = \sigma _{0}^{-} = \frac{1}{2} (X_{0} - i Y_{0}), \end{aligned}$$
$$\begin{aligned}& c_{1\uparrow}^{\dagger } = Z_{0} \sigma _{1}^{-} = \frac{1}{2} Z_{0} (X_{1} - i Y_{1}), \end{aligned}$$
$$\begin{aligned}& d_{\downarrow}^{\dagger } = Z_{0} Z_{1} \sigma _{2}^{-} = \frac{1}{2} Z_{0} Z_{1} (X_{2} - i Y_{2}), \end{aligned}$$
$$\begin{aligned}& c_{1\downarrow}^{\dagger } = Z_{0} Z_{1} Z_{2} \sigma _{3}^{-} = \frac{1}{2} Z_{0} Z_{1} Z_{2} (X_{3} - i Y_{3}), \end{aligned}$$

which leads to the qubit Hamiltonian

$$\begin{aligned} \mathcal{H}_{\text{AIM}}^{\text{JW}} & = (\varepsilon _{d} + \varepsilon _{1} - 2\mu ) - \frac{1}{2}(\varepsilon _{d} -\mu +2U) (Z_{0} + Z_{2}) - \frac{1}{2}\varepsilon _{1}(Z_{1} + Z_{3}) \\ &\quad {} + \frac{1}{4} U Z_{0}Z_{2} + \frac{1}{2} V_{1} (X_{0}X_{1} + Y_{0}Y_{1} + X_{2}X_{3} + Y_{2}Y_{3}). \end{aligned}$$
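For a classical reference, the four-qubit problem is small enough to diagonalize exactly. The sketch below assumes NumPy and builds the \(16\times 16\) matrix directly from the fermionic form of the Anderson impurity model with one bath site, using the Jordan-Wigner operators listed above, rather than from an expanded Pauli sum. Qubit 0 is taken as the leftmost tensor factor, and the function names are illustrative.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def jw_creation(j, n_qubits=4):
    """Jordan-Wigner creation operator Z_0 ... Z_{j-1} sigma_j^- on n_qubits qubits."""
    sigma_minus = (X - 1j * Y) / 2
    factors = [Z] * j + [sigma_minus] + [I2] * (n_qubits - j - 1)
    return reduce(np.kron, factors)

def aim_hamiltonian(eps_d, eps_1, mu, U, V):
    """One-impurity, one-bath-site AIM as a 16x16 matrix.

    Mode order follows the mapping in the text: d_up, c_up, d_dn, c_dn.
    """
    d_up, c_up, d_dn, c_dn = (jw_creation(j) for j in range(4))

    def number(creation):
        return creation @ creation.conj().T  # a^dagger a

    hopping = c_up @ d_up.conj().T + c_dn @ d_dn.conj().T
    return (eps_1 * (number(c_up) + number(c_dn))
            + (eps_d - mu) * (number(d_up) + number(d_dn))
            + U * number(d_up) @ number(d_dn)
            + V * (hopping + hopping.conj().T))

# Parameters quoted in the text: eps_d = mu, eps_1 = 0, V = 1, U = 2.
hamiltonian = aim_hamiltonian(eps_d=0.0, eps_1=0.0, mu=0.0, U=2.0, V=1.0)
ground_energy = np.linalg.eigvalsh(hamiltonian)[0]
print(f"exact ground state energy: {ground_energy:.6f}")
```

The lowest eigenvalue obtained this way plays the role of the exact energy against which a VQE run can be compared.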

Grouping Hamiltonian terms that have support on different qubits allows us to reduce the number of measurements performed. For example, a single \(X_{0}X_{1}X_{2}X_{3}\) measurement on the Ansatz state can be used to compute both the \(X_{0}X_{1}\) and \(X_{2}X_{3}\) terms of the Hamiltonian. A hardware-efficient Ansatz, shown in Fig. 23 (a), is chosen. Since the problem requires only four qubits, the best qubits in terms of fidelity are chosen from the hardware to perform the calculations, with the 0th qubit of the spin Hamiltonian kept fixed at the center of the star topology. Parameterized single-qubit \(R_{y}\) gates are introduced and a total of seven parameters are tuned by the L-BFGS-B classical optimizer from the SciPy optimization package [84]. This is a gradient-based optimizer, and the derivative is computed using the parameter shift rule [85, 86]:

$$\begin{aligned} \frac{\partial E_{{\boldsymbol {\theta}}} }{\partial \theta _{i}} = \frac{E_{\theta _{i} + \frac{\pi}{2}} - E_{\theta _{i} - \frac{\pi}{2}}}{2}, \end{aligned}$$

where \(E_{{\boldsymbol {\theta}}} = \langle \psi ({\boldsymbol {\theta}}) | \mathcal{H}_{ \text{AIM}}^{\text{JW}} | \psi (\boldsymbol {\theta}) \rangle \) with \({\boldsymbol {\theta}} = \{\theta _{1},\theta _{2}, \ldots \}\), and \(E_{\theta _{i} \pm \frac{\pi}{2}}\) denotes the energy with only the parameter \(\theta _{i}\) shifted by \(\pm \pi /2\). Figure 23 (b) shows the energy measurements for different iterations computed without and with readout error mitigation at \(\varepsilon _{d} = \mu \), \(\varepsilon _{1} = 0\), \(V_{1}=1\) and \(U = 2\). We have used 5000 shots for each measurement. The results can be further improved using advanced error mitigation strategies, such as zero-noise extrapolation (ZNE) [87–89] or probabilistic error cancellation (PEC) [90–92]. The converged parameterized state can be used to compute quantities such as the density matrix or the Green’s function, which are necessary components of some of the aforementioned embedding techniques.
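To illustrate the parameter shift rule, the sketch below evaluates it on a toy two-qubit Ansatz and compares it with a finite-difference derivative. It assumes NumPy and exact state-vector expectation values. The Ansatz, the toy Hamiltonian and the plain gradient-descent loop are illustrative stand-ins: they are not the seven-parameter circuit of Fig. 23 (a), the AIM Hamiltonian, or the L-BFGS-B optimizer.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)

# Toy Hamiltonian chosen for illustration only.
TOY_HAMILTONIAN = np.kron(Z, Z) + 0.5 * np.kron(X, I2)

def ry(angle):
    """Single-qubit rotation R_y(angle) = exp(-i * angle * Y / 2)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def energy(params):
    """Expectation value for the Ansatz CNOT (R_y(p0) x R_y(p1)) |00>."""
    psi = CNOT @ np.kron(ry(params[0]), ry(params[1])) @ np.array([1.0, 0.0, 0.0, 0.0])
    return float(psi @ TOY_HAMILTONIAN @ psi)

def parameter_shift_gradient(params):
    """Gradient from two energy evaluations per parameter, shifted by +/- pi/2."""
    params = np.asarray(params, dtype=float)
    gradient = np.zeros_like(params)
    for i in range(len(params)):
        shift = np.zeros_like(params)
        shift[i] = np.pi / 2
        gradient[i] = 0.5 * (energy(params + shift) - energy(params - shift))
    return gradient

params = np.array([0.3, 1.1])
step = 1e-5
finite_difference = np.array([
    (energy(params + step * e) - energy(params - step * e)) / (2 * step)
    for e in np.eye(2)
])
print(np.allclose(parameter_shift_gradient(params), finite_difference, atol=1e-6))

# Plain gradient descent as a simple stand-in for the optimizer.
initial_energy = energy(params)
for _ in range(200):
    params = params - 0.1 * parameter_shift_gradient(params)
final_energy = energy(params)
print(initial_energy, final_energy)
```

Unlike finite differences, the shifted evaluations are separated by a large angle, so on hardware the gradient estimate is less sensitive to shot noise than a small-step difference quotient.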

Figure 23. (a) Hardware-efficient Ansatz used for the VQE. (b) Comparison of the VQE energy as a function of the iterations for convergence of an Anderson impurity model at \(U = 2\) on our 5-qubit quantum computer with and without readout error mitigation. The exact energy is also shown for comparison.

VQE has been proposed as a promising algorithm for near-term quantum computers to solve certain models from quantum chemistry and condensed matter physics. It is important to note though, that practical implementations of VQE are still limited by the capabilities of available quantum hardware, and there is active ongoing work on improving and refining these algorithms. Nevertheless, it serves as a gateway to explore and investigate advanced embedding techniques on the currently available quantum hardware.

6 Summary

An on-site quantum computer can be utilized in education and research for quantum computing, quantum information and quantum theory. We have demonstrated some of these topics with the 5-qubit superconducting IQM Spark™ prototype in this paper.

First, we presented the tools and programs used. Then, we showed how they can be used in education and research. Certain demonstrations, such as calibration and working with qutrits, are only possible with an on-site quantum computer. It is also used to explore the complex quantum realm and reproduce recent breakthroughs in mathematics, physics, and chemistry for a better understanding.

7 Discussion

This physical on-site quantum computer is vital for making quantum computing accessible to more people and talents, teaching quantum concepts, and enhancing our understanding of quantum theory and computing. It can also be used for research, as we have shown in Sect. 5. A recent survey reveals that many research papers demonstrating “proofs of principle” used a quantum computer with five qubits or fewer [45].

We anticipate that in the near future every leading university and research institute wanting to stay competitive in quantum computing education and research will have physical access to affordable on-site quantum computers, such as the IQM Spark™ [6].