1 Introduction

Secure attestation of cloud computing resources has been a focus of research on establishing trust in the cloud, since it allows cloud computing customers to make sure that they are provided access to the correct hardware platform (Feng 2017). Trusted hardware is especially important for quantum computing (QC) since the high sensitivity to noise in low-grade machines can have detrimental effects on vital calculations (Regev and Schiff 2008; Preskill 2018).

In the current quantum computing ecosystem, quantum computers of competing hardware manufacturers are aggregated by third-party cloud providers, giving customers access to a multitude of QC hardware platforms. However, a malicious party acting as a legitimate quantum cloud provider might try to reroute customer circuits to cheaper lower-grade hardware, to either save money, censor certain customers, or damage the reputation of selected QC platforms. It is therefore an important endeavour to ensure that quantum cloud customers can authenticate the accessed computing devices: from the customers’ side to lower the risk of making extensive business decisions based on potentially corrupted results, and from the platform manufacturers’ side, to safeguard against third parties trying to damage their reputation.

In the classical world, physical unclonable functions (PUFs) have been proposed as a way to identify and authenticate electronic devices (Pappu et al. 2002; Brzuska et al. 2011; Maes 2013). Recently, quantum PUFs (QPUFs) have emerged with the aim of achieving the same goals for quantum devices. The primer of Škorić (2012) introduced the concept of quantum readout PUFs (QR-PUFs), which were later generalized and formalized in the framework of Doosti et al. (2021) and rigorously analyzed by Arapinis et al. (2021). However, the authentication protocols using QPUFs require a quantum memory and a quantum communication channel between the verifier and prover to exchange quantum states. Other studies propose to use QPUFs as generators for cryptographic keys to establish information-theoretically secure communication (Horstmeyer et al. 2013; Nikolopoulos 2021); however, they are built with special-purpose hardware that does not fit the gate-based computing model of QC cloud platforms. The recent work by Phalak et al. (2021) proposes a QPUF whose authentication protocol requires only classical communication and no quantum memory, and which can be implemented on gate-based quantum computers to fingerprint these devices.

In this work, we adapt the QPUF framework of Doosti et al. (2021) to address the case of classical readout quantum PUFs (CR-QPUFs), where verifier and prover communicate classically to authenticate a quantum device. CR-QPUF constructions in essence rely on classically communicating a unitary transformation, the challenge, which is performed on the quantum device. The device-specific imperfections distort the unitary challenge and result in a unique output state, which is measured to obtain the response. Interestingly, due to the prevalent noise in noisy intermediate-scale quantum (NISQ) devices, we find that in order to build a robust CR-QPUF, responses should be designed as statistical queries to the QPUF. In fact, the so-called Hadamard CR-QPUF (Phalak et al. 2021) intrinsically uses the statistical query (SQ) model, which we will present in this paper. Even though it has been shown that restrictions to the SQ model can be used to obtain unlearnability results (Hinsche et al. 2021; Gollakota and Liang 2021; Kearns 1998), we present a successful learning attack on the Hadamard CR-QPUF in the SQ model. We thereby show that the remote authentication scheme using the Hadamard CR-QPUF is not secure against learning attacks: an attacker is able to model and predict the Hadamard CR-QPUF characteristics and hence forge the quantum device fingerprint using machine learning. Additionally, we investigate natural extensions of the Hadamard CR-QPUF and observe similar security flaws.

This work is concluded with an in-depth discussion of the prospects and drawbacks of CR-QPUFs. It is possible that in the NISQ era, constructing CR-QPUFs in the SQ model might provide security guarantees against learning attacks. However, there are many open questions that need to be addressed. How can imperfections be modelled? Are entangled resources necessary? What security guarantees can we get from the SQ model? We discuss these questions based on our findings regarding the insecurity of Hadamard CR-QPUFs. Additionally, we discuss the security of CR-QPUFs in light of the opposing views on the power of NISQ devices, i.e. whether or not output distributions of NISQ devices can in general be modelled using low-degree polynomials (Kalai 2020). Finally, we propose future work and possible next steps for investigating CR-QPUFs that are secure against machine learning attacks.

2 Classical-readout QPUFs

In general, PUFs exhibit unique input-output relations which depend on the physical properties of the device on which the PUF is implemented. These unique relations are commonly referred to as challenge-response pairs (CRPs) and constitute a fingerprint of the physical device.

Fig. 1

Schematic illustration of the authentication protocol and attacker model considered. In the trusted setup phase, Alice will create a CRP database by repeatedly invoking the QPUF with random challenges. This database will serve as a unique fingerprint of the QPUF. In the authentication phase, Bob (who claims to be in possession of the QPUF) is required to answer with the correct responses to challenges in Alice’s CRP database. In the modelling attack, Eve has access to the QPUF and aims to learn a model of the input-output behaviour in order to predict responses to future challenges, thus succeeding in the authentication phase using solely the model of the QPUF

Authentication schemes based on PUFs typically use a challenge and response protocol between a verifier and a prover. The goal of the prover (here called Bob) is to prove to the verifier (here called Alice) that he is in possession of the PUF, by demonstrating the ability to query the PUF. In Fig. 1, we schematically depict the authentication protocol of a QPUF that we consider in this work. Before the authentication can take place, a trusted setup phase is required, during which Alice can directly interact with the QPUF. This enables her to build a secret CRP database by repeatedly querying the QPUF with randomly chosen challenges and storing the challenges and respective responses in the database. At a later stage, when Bob wants to prove possession of the QPUF to Alice, he will need to provide the correct corresponding response to a CRP chosen randomly by Alice from her database. If the response matches the CRP entry in the secret database of Alice, Alice authenticates that Bob is in possession of the QPUF. An honest Bob will query the QPUF with the challenge and send the respective response back. A dishonest Bob can try to implement an attack, for example a modelling attack, where by figuring out the QPUF’s input-output characteristics he can predict the correct response to arbitrary challenges without possessing the QPUF.

The protocol described above can be used to authenticate QCs provided by third-party cloud platforms. During the trusted setup phase, the user (Alice) obtains a (certified) CRP database of a QPUF that is implemented on the quantum device of a certain manufacturer. If Alice wants to authenticate the QC at a later point when she no longer has physical access to the hardware, she queries the QPUF again with challenges from her CRP database and compares the respective responses.
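The two phases of the protocol can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from the literature: the names `enroll`, `authenticate` and `toy_qpuf`, the per-component tolerance check, and the toy device model are all hypothetical.

```python
import math
import random


def toy_qpuf(challenge):
    """Hypothetical stand-in for a device: maps rotation angles to per-qubit means."""
    return tuple(0.5 - 0.45 * math.sin(t + 0.1 * i) for i, t in enumerate(challenge))


def enroll(qpuf, challenges):
    """Trusted setup: Alice queries the device directly and stores the CRPs."""
    return {c: qpuf(c) for c in challenges}


def authenticate(crp_db, prover, tolerance, rng):
    """Authentication: Alice picks a stored challenge and checks the prover's answer."""
    challenge = rng.choice(sorted(crp_db))
    expected = crp_db[challenge]
    claimed = prover(challenge)
    if len(claimed) != len(expected):
        return False
    # Assumption: a response is accepted if every component is within `tolerance`.
    return all(abs(a - b) <= tolerance for a, b in zip(claimed, expected))


challenges = [(1.0, 2.0), (1.5, 1.2)]
crp_db = enroll(toy_qpuf, challenges)
rng = random.Random(0)
honest_ok = authenticate(crp_db, toy_qpuf, 0.05, rng)
impostor_ok = authenticate(crp_db, lambda c: (0.5,) * len(c), 0.05, rng)
```

A modelling attack in this picture is any `prover` that reproduces the stored responses closely enough without calling the device.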

Prior work of Škorić (2012) introduced the concept of a quantum-readout PUF (QR-PUF), where challenges and responses are communicated using a quantum channel, thereby presenting a protocol to authenticate a QR-PUF without the need to rely on a trusted readout device. Doosti et al. (2021) proposed a generalised framework for QPUFs, where a secret unitary transformation is performed on challenge states. Since the challenges and responses are quantum states, they need to be stored in a quantum memory; unfortunately, this is not easy to achieve in practice.

A recent proposal by Phalak et al. (2021) introduced the concept of a classical-readout QPUF (CR-QPUF) where a quantum device is queried classically, removing the requirement for a quantum memory. In the proposed protocol, the challenge is a classical description of a parameterized unitary that is run on the quantum computer. The response is the mean of multiple measurement outcomes of the qubits in the computational basis. Both the challenge unitary and the mean value with finite samples are communicated classically between verifier and prover. The goal of the protocol is to leverage imperfections in the quantum computer as hidden parameters that identify the quantum device uniquely. Table 1 summarises the different categories of QPUF protocols at the quantum-classical intersection.

Table 1 Classification of (quantum) PUFs

Here, we introduce the formalism of CR-QPUFs for NISQ devices according to the QPUF framework of Doosti et al. (2021). NISQ devices are subject to noise, which consists of systematic noise and non-systematic noise. In the following, we refer to systematic noise as device imperfections and to non-systematic noise as white noise. We will generalize the concept of CR-QPUFs envisioned by Phalak et al. (2021) and accurately describe the challenge-response space to enable further rigorous analysis.

We will denote with \(qPUF_{id}\) the unique identifying properties of a quantum device with identity \(\mathrm {id}\), generated by the process \(QGen(\lambda )\), where \(\lambda\) is the security parameter:

$$\begin{aligned} qPUF_{id} \leftarrow QGen(\lambda ) \end{aligned}$$

Let \(\mathcal {U}\) be a subspace of unitary transformations, called the challenge space. Given a challenge \(U_{in} \in \mathcal {U}\), the response \(r_{out}\) is a set of i.i.d. samples from running \(U_{in}\) on the quantum computer identified by \(qPUF_{id}\) and measuring in the computational basis. Thus, the completely positive trace-preserving map \(\Lambda _{\mathrm {id}}\) that is performed by the quantum device \(\mathrm {id}\) with properties \(qPUF_{id}\) is a function of the challenge unitary run on the device:

$$\begin{aligned} \Lambda _{\mathrm {id}}(U_{in}) =: \Lambda ^{\mathrm {id}}_{\mathrm {in}} \end{aligned}$$

The density matrix \(\rho _{\mathrm {in}}^{\mathrm {id}}\) is then defined as the result of applying the quantum operation to the \(|0\rangle\) state, i.e.

$$\begin{aligned} \rho _{\mathrm {in}}^{\mathrm {id}} = \Lambda ^{\mathrm {id}}_{\mathrm {in}} |0\rangle \langle 0| {\Lambda ^{\mathrm {id}\dagger }_{\mathrm {in}}}\text {.} \end{aligned}$$

Note that \(\rho _{\mathrm {in}}^{\mathrm {id}}\) does not equal the state after perfectly applying \(U_{in}\), but a mutated version thereof, where the state has been altered depending on the device imperfections. Furthermore, we define the Born distribution as

$$\begin{aligned} \mathcal {P}_{U_{in},\mathrm {id}}(x) = Tr(M_{x} \rho _{\mathrm {in}}^{\mathrm {id}} M_{x}^{\dagger }) \end{aligned}$$

which is the distribution over n-bit strings, induced by measuring the qubits in the computational basis. Here \(M_{x}\) corresponds to the measurement operator of \(|x\rangle\). The generic evaluation \(\mathrm {QEval}\) of the QPUF is then simply sampling from the Born distribution. That is,

$$\begin{aligned} r_{out} \leftarrow QEval({qPUF_{id}, U_{in}}) \end{aligned}$$

and for noise-free devices we obtain

$$\begin{aligned} r_{out} = \left\{ x : x \sim \mathcal {P}_{U_{in},\mathrm {id}} \right\} {.} \end{aligned}$$

However, in NISQ devices, white noise is prevalent and the samples in \(r_{out}\) are corrupted by random noise. We model the random noise as random bit flips. Analogously to noisy Boolean functions (Benjamini et al. 1999), given \(x, y \in \{0,1\}^n\) and \(\epsilon \in (0, 1)\), let y be a random perturbation of x, i.e. \(x_i = y_i\) with probability \(1-\epsilon\), independently for distinct i’s. We denote the random perturbation by \(\mathrm {N}_{\epsilon }(x) = y\). Thus, for NISQ devices, the response \(\hat{r}_\mathrm {out}\) for a given challenge \(U_{in}\) is the finite set of noisy samples,

$$\begin{aligned} \hat{r}_\mathrm {out} = \left\{ \mathrm {N}_{\epsilon }(x) : x \sim \mathcal {P}_{U_{in},\mathrm {id}} \right\} {.} \end{aligned}$$
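The noise model above can be illustrated with a short sampling sketch. It assumes, for illustration only, that the Born distribution factorises into independent per-qubit probabilities (which holds for unentangled circuits but not in general); the function names and the parameter `p_one` are hypothetical.

```python
import random


def bit_flip(x, eps, rng):
    """Return N_eps(x): each bit of x is flipped independently with probability eps."""
    return tuple(b ^ 1 if rng.random() < eps else b for b in x)


def noisy_samples(p_one, shots, eps, rng):
    """Draw `shots` noisy bit strings.

    p_one[i] is the probability of measuring 1 on qubit i.
    Assumption: the qubits are independent (product distribution).
    """
    samples = []
    for _ in range(shots):
        x = tuple(1 if rng.random() < p else 0 for p in p_one)
        samples.append(bit_flip(x, eps, rng))
    return samples


def empirical_mean(samples):
    """Component-wise mean of a non-empty list of equal-length bit strings."""
    n = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(n)]


rng = random.Random(42)
r_hat = noisy_samples([0.2, 0.7, 0.5], shots=2000, eps=0.1, rng=rng)
mean_response = empirical_mean(r_hat)
```

Each entry of `mean_response` is an average over many noisy shots, which is the kind of global statistic discussed next.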

Thus, in order to construct a robust CR-QPUF response on such noisy devices, it is intuitive to use global statistical properties across multiple samples (Benjamini et al. 1999). The statistical query (SQ) model (Kearns 1998) provides an excellent formalism to describe this mechanism. In fact, the Hadamard CR-QPUF, which is described in the next section, naturally uses the SQ model to build a robust QPUF authentication scheme.

In the SQ model for CR-QPUFs, given an efficiently computable function \(\phi : \{ 0,1 \}^{n} \rightarrow \{ 0,1 \}^{n}\), the QPUF evaluation \(\mathrm {QEval}\) responds with some \(\mathbf {R}_{\mathrm {out}}\) that is \(\tau\)-close to the expectation value of \(\phi\). Thus, in the SQ model, we have that

$$\begin{aligned} \mathbf {R}_{\mathrm {out}} \leftarrow QEval({qPUF_{id}, U_{in}}) \end{aligned}$$

where

$$\begin{aligned} \left| { \underset{x \sim \mathcal {P}_{U_{in}, \mathrm {id}}}{\mathbb {E}} [ \phi (x) ] - \mathbf {R}_{\mathrm {out}} }\right| \le \tau {.} \end{aligned}$$

Note that behind the scenes, the noise-robust response \(\mathbf {R}_{\mathrm {out}}\) is constructed from noisy samples in \(\hat{r}_\mathrm {out}\). In fact, Chernoff-Hoeffding bounds imply that for noise rate \(\epsilon < 1/2\) and any efficiently computable function \(\phi\), the SQ model can be simulated using noisy samples \(\hat{r}_\mathrm {out}\), where \(\left| {\hat{r}_\mathrm {out}}\right|\) is polynomial in n, \(\tau ^{-2}\) and \(\epsilon ^{-2}\). It follows that any algorithm that tries to learn the distribution \(\mathcal {P}_{U_{in}, \mathrm {id}}\), while being restricted to using the SQ model, is less (or equally) powerful than an algorithm that has direct access to \(\hat{r}_\mathrm {out}\). In the next section, we will cast the Hadamard CR-QPUF in this framework.
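As a rough illustration of the sample complexity, the following sketch applies the standard Hoeffding inequality with a union bound over n qubits. It only bounds the statistical fluctuation of the empirical mean around the expectation of the noisy samples and does not correct for the systematic shift caused by bit flips; the function names and the failure probability `delta` are assumptions introduced for this example.

```python
import math


def hoeffding_shots(tau, delta, n_qubits=1):
    """Shots so that each of n_qubits empirical means is within tau of its
    expectation with probability at least 1 - delta (Hoeffding + union bound)."""
    if not (0.0 < tau < 1.0 and 0.0 < delta < 1.0 and n_qubits >= 1):
        raise ValueError("need 0 < tau < 1, 0 < delta < 1 and n_qubits >= 1")
    return math.ceil(math.log(2 * n_qubits / delta) / (2 * tau ** 2))


def hoeffding_tau(shots, delta, n_qubits=1):
    """Inverse direction: the tolerance tau guaranteed by a given number of shots."""
    if shots < 1 or not (0.0 < delta < 1.0) or n_qubits < 1:
        raise ValueError("need shots >= 1, 0 < delta < 1 and n_qubits >= 1")
    return math.sqrt(math.log(2 * n_qubits / delta) / (2 * shots))


shots_needed = hoeffding_shots(tau=0.05, delta=0.05, n_qubits=1)
```

The number of shots grows with \(\tau ^{-2}\) and only logarithmically with n, in line with the polynomial dependence stated above.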

3 The Hadamard CR-QPUF

The Hadamard CR-QPUF introduced by Phalak et al. (2021) aims at robustly authenticating a quantum computer using device-specific qubit imperfections via a classical communication channel. The CR-QPUF challenges are given by a parameterized quantum circuit that includes parameterized single-qubit rotations and Hadamard gates. The responses are the empirical mean of projective measurements of the qubits in the computational basis. In the following, we will define the Hadamard CR-QPUF in the framework described above using the SQ model.

The class of challenge unitaries \(\mathcal {U}\) that is used in the Hadamard CR-QPUF is depicted in Fig. 2 and given by:

$$\begin{aligned} \mathcal {U} := \left\{ \bigotimes _{i=1}^{n} H R_{Y}(\theta _i) : \theta _i \in [0, 2\pi ) \ \forall i \in \{1,...,n\} \right\} \end{aligned}$$

Let \(qPUF_{id}\) be the properties of the quantum computer \(\mathrm {id}\) and let \(U_{in}(\mathbf {\theta }) \in \mathcal {U}\) be a chosen challenge unitary acting on n qubits that is parameterized by rotations \(\mathbf {\theta }\). The response \(\mathbf {R}_{\mathrm {out}}\) is given by the \(\tau\)-close approximation to the expectation value of the qubits, more precisely:

$$\begin{aligned} \left| \underset{x \sim \mathcal {P}_{U_{in}, \mathrm {id}}}{\mathbb {E}} [ x ] - \mathbf {R}_{\mathrm {out}} \right| \le \tau { .} \end{aligned}$$

Thus, the response \(\mathbf {R}_{\mathrm {out}}\) is the result of a statistical query as introduced in Section 2, where the function \(\phi\) is the identity. By the law of large numbers, the response \(\mathbf {R}_{\mathrm {out}}\) can be computed using noisy samples \(\hat{r}_\mathrm {out}\) that are obtained by sampling from \(\mathcal {P}_{U_{in}, \mathrm {id}}\). Thus, the QPUF response \(\mathbf {R}_{\mathrm {out}}\) is given by:

$$\begin{aligned} \mathbf {R}_{\mathrm {out}} = \frac{1}{\left| {{\hat{r}_{\mathrm {out}}}}\right| }\sum _{r \in {\hat{r}_{\mathrm {out}}}} r \text {, subject to} \end{aligned}$$
$$\begin{aligned} \epsilon \le \tau \left| { \underset{x \sim \mathcal {P}_{U_{in}, \mathrm {id}}}{\mathbb {E}}[1-2x] }\right| { .} \end{aligned}$$

Hence, we can obtain a \(\tau\)-close approximation of the expectation value if the white noise probability \(\epsilon\) is upper-bounded as shown above. Note that we chose a high-level noise model where the white noise consists of i.i.d. bit flips. We leave the investigation of other quantum noise models to obtain bounds for \(\tau\) to future work.
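To see where a bound of this kind comes from, consider a single qubit i under the i.i.d. bit-flip model of Section 2. The following is a short sketch, where the shorthand \(p_i = \mathbb {E}[x_i]\) is introduced here for illustration only. A flip with probability \(\epsilon\) shifts the expectation of the measured bit as follows:

$$\begin{aligned} \mathbb {E}\left[ \mathrm {N}_{\epsilon }(x)_i \right] = (1-\epsilon )\, p_i + \epsilon \, (1 - p_i) = p_i + \epsilon \, (1 - 2 p_i) \end{aligned}$$

The systematic offset of the noisy mean is therefore \(\epsilon \left| 1 - 2p_i \right| = \epsilon \left| \mathbb {E}[1-2x_i] \right|\), which stays below \(\tau\) whenever \(\epsilon \left| \mathbb {E}[1-2x_i] \right| \le \tau\). In particular, the condition stated above suffices, since \(\left| \mathbb {E}[1-2x_i] \right| \le 1\). Finite-sample fluctuations of the empirical mean come on top of this offset.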

Fig. 2

Circuit of the parameterized challenge unitary family in the Hadamard CR-QPUF. The challenges are run on an n-qubit quantum computer, returning multiple shots per challenge

Fig. 3

Circuit of the parameterized challenge unitary family as a natural extension to Hadamard CR-QPUF. The circuit consists of k layers of parameterized \(R_{Y}\) and \(R_{X}\) rotations and a Hadamard gate that are performed on n qubits

Phalak et al. (2021) propose to obtain multiple responses for every challenge and to map every component in each response \(\mathbf {R}_{\mathrm {out}}\) to a 5-bit string, resulting in a \((5\cdot n)\)-bit string \(S_\mathrm {out}\) per response. These strings make up the CRP database. The protocol then accepts the respective string \(S^{\prime }_\mathrm {out}\) of a response if the average Hamming distance between \(S^{\prime }_\mathrm {out}\) and all strings for the same challenge in the CRP database is within the variance among the strings in the CRP database. The authors chose the 5-bit representation as a favourable trade-off between uniqueness of the signature and robustness to white noise.
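The bit-string mapping and the Hamming-distance check can be sketched as follows. The exact encoding and acceptance threshold are not specified here, so this example makes two loudly labelled assumptions: each mean in [0, 1] is uniformly quantized into 32 levels and written in plain binary, and a candidate is accepted if its average Hamming distance to the enrolled strings does not exceed the largest pairwise distance among them. All function names are hypothetical.

```python
from itertools import combinations


def quantize(value, bits=5):
    """Assumed encoding: uniform quantization of [0, 1] into 2**bits levels."""
    level = min(2 ** bits - 1, max(0, int(value * 2 ** bits)))
    return format(level, "0{}b".format(bits))


def signature(response, bits=5):
    """Concatenate the per-qubit codes into one (bits * n)-bit string."""
    return "".join(quantize(v, bits) for v in response)


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def accepts(candidate, enrolled):
    """Assumed rule: average HD to the enrolled strings must not exceed the
    largest pairwise HD observed among the enrolled strings themselves."""
    avg_hd = sum(hamming(candidate, s) for s in enrolled) / len(enrolled)
    spread = max((hamming(a, b) for a, b in combinations(enrolled, 2)), default=0)
    return avg_hd <= spread


enrolled = [signature([0.5, 0.25]), signature([0.54, 0.25])]
```

A plain binary code can change in several bits between neighbouring levels, so a different encoding may be preferable in practice; the sketch is only meant to make the acceptance logic concrete.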

3.1 Extension to multiple rotations

A natural extension of the Hadamard CR-QPUF as proposed by Phalak et al. (2021) is to increase the number of rotations applied to the qubits to rotation chains. For example, consider challenges of the form

$$\begin{aligned} U_{\mathrm {in}}(\mathbf {\theta _{1}}, ..., \mathbf {\theta _{t}}) = \bigotimes _{j=1}^{n} H R_{Y}(\theta _{1,j}) R_{X}(\theta _{2,j})...R_{Y}(\theta _{t,j}) \end{aligned}$$

that consists of a chain of alternating \(R_{Y}\) and \(R_{X}\) rotations (Fig. 3). In our modelling attack on the Hadamard CR-QPUF, we also consider such natural extensions and show that the resulting responses can be learned using a simplified model. The simplified model is used to predict the responses of CR-QPUFs based on arbitrary rotation chains consisting of \(R_{Y}\) and \(R_{X}\) rotations. To this end, a model using only two rotations, i.e. one \(R_Y\) and one \(R_X\) rotation, is learned.

4 Attacker capabilities

In this section, we distil the capabilities an attacker must have to perform the modelling attack. One application of CR-QPUFs is to protect quantum cloud customers from fraudulent resource allocation on the provider side. One such example is when a malicious actor assigns a customer to cheaper and lower-grade quantum computing hardware than what was originally agreed upon. In particular, this includes the case where the cloud provider itself is malicious and aims at assigning the customer to lower-grade hardware. In the attacker model we consider, the malicious actor knows the challenge space \(\mathcal {U}\) and is able to run chosen challenges \(U_{in} \in \mathcal {U}\) on the quantum computer that she aims to model. Additionally, she is able to detect when a user tries to run a Hadamard CR-QPUF challenge and to extract the challenge rotations \(\mathbf {\theta }\). This is well within the threat model of a malicious quantum cloud provider. It is usually assumed that an attacker of PUF authentication schemes knows the challenge space and tries to predict the response to an unknown challenge. In our case, the attacker will predict the correct response according to \(\mathbf {\theta }\) using a model that was learned in a learning phase.

As we will show in the following section, such a malicious actor, who can reroute user circuits to lower-grade hardware, remains completely undetected by the Hadamard CR-QPUF secure provisioning protocol.

5 The modelling attack

The attack on the Hadamard CR-QPUF secure provisioning scheme is carried out in two phases, the learning phase and the attack phase. One important step in the attack is based on the observation that the Hadamard CR-QPUF does not entangle the single qubits and thus creates the opportunity to learn the characteristics of the qubits individually. The second essential observation is the fact that the challenge unitaries are fixed in their circuit structure; hence, the challenge space can be reduced to the parametric rotations \(\mathbf {\theta }\). In the following, we describe the learning phase, where the design shortcomings of the Hadamard CR-QPUF are exploited to learn a model of the CR-QPUF, and the attack phase, where the actual attack using the learned models is carried out.

5.1 Learning phase

In the learning phase, the attacker Eve gathers L CRPs \(\{ ( U_{in}(\mathbf {\theta _l}), \mathbf {R}_\mathrm {out}) : \mathbf {\theta _l} \in [0,2\pi )^n \text {, } l=1,...,L\}\), where the chosen rotations \(\mathbf {\theta _l}\) cover the space \([0,2\pi )^n\) equidistantly, i.e. in a grid. Here, all components of a given \(\mathbf {\theta _l}\) are equal, i.e. \(\theta _{l,i} = \theta _{l,j}\) for all \(i,j \in \{1,...,n\}\), which means that all n qubits are rotated by the same angle.

Having obtained the responses for the chosen rotations, she then fits a bounded-degree polynomial \(f^{(j)}\) to the obtained data points \(\{ ( \theta _{l,j}, \mathbf {R}_{\mathrm {out},l,j}) : l=1,...,L \}\) for each qubit j. The degrees of the polynomials \(f^{(j)}\) are upper bounded by \(1/\epsilon\), where \(\epsilon\) is the error probability of the qubit (Benjamini et al. 1999). We estimate the error probability of the qubits to be 0.1 and thus choose a degree of 10 for the polynomials \(f^{(j)}(\theta _j)\).

The polynomials are fitted with a textbook least-squares regression using the scikit-learn library (Pedregosa et al. 2011). Using this regression, Eve learns a model function \(f^{(j)}(\theta _j)\) for each qubit \(j\in \{1,...,n\}\) that predicts the \(j^{th}\) component of \(\mathbf {R}_{\mathrm {out}}\). Given unknown challenge rotations \(\mathbf {\theta }\), the learned model functions will then be used in the attack phase to calculate the predicted response \(\mathbf {R}^{\prime }_\mathrm {out}\), where:

$$\begin{aligned} \mathbf {R}^{\prime }_\mathrm {out} = f(\mathbf {\theta }) = \begin{pmatrix} f^{(1)}(\theta _1) \\ \vdots \\ f^{(n)}(\theta _n) \end{pmatrix}. \end{aligned}$$
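The learning phase can be sketched as a per-qubit least-squares polynomial fit. For a dependency-light illustration, this sketch swaps in NumPy's `Polynomial.fit` instead of scikit-learn, and it uses a synthetic, noise-free stand-in for the device (`toy_device` and its `offsets` are hypothetical), so it shows the shape of the procedure rather than the experiment itself.

```python
import numpy as np
from numpy.polynomial import Polynomial

TWO_PI = 2 * np.pi


def toy_device(thetas, offsets):
    """Hypothetical device: a distorted sine response per qubit, shape (L, n)."""
    return 0.5 - 0.45 * np.sin(thetas[:, None] + offsets[None, :])


def fit_qubit_models(thetas, responses, degree=10):
    """Fit one least-squares polynomial f^(j) per qubit j.

    thetas: shape (L,), the common rotation angle of each learning challenge.
    responses: shape (L, n), the SQ responses per challenge and qubit.
    """
    thetas = np.asarray(thetas, dtype=float)
    responses = np.asarray(responses, dtype=float)
    return [
        Polynomial.fit(thetas, responses[:, j], degree, domain=[0.0, TWO_PI])
        for j in range(responses.shape[1])
    ]


def predict_response(models, theta):
    """Predicted response R'_out for challenge angles theta (one per qubit)."""
    theta = np.mod(np.asarray(theta, dtype=float), TWO_PI)
    return np.array([np.clip(m(t), 0.0, 1.0) for m, t in zip(models, theta)])


L = 30
grid = np.arange(L) * TWO_PI / L           # equidistant learning angles
offsets = np.array([0.0, 0.2])             # assumed per-qubit distortion
models = fit_qubit_models(grid, toy_device(grid, offsets), degree=10)
prediction = predict_response(models, [1.0, 4.0])
```

Because each qubit is modelled independently, the cost of the fit grows only linearly with the number of qubits.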

5.2 Attack phase

In the attack phase, Eve intercepts the requests where Alice wants to run a circuit on \(QC_{1}\) and runs them on a lower-grade machine \(QC_{2}\). Whenever Alice runs a Hadamard CR-QPUF challenge \(U_{in}\) to detect mal-provisioning, Eve responds with the prediction \(\mathbf {R}^{\prime }_\mathrm {out}\) using the regression models \(f^{(j)}(\theta _j)\) that have been learned in the learning phase. The attack layout is illustrated in Fig. 4.

Fig. 4

The malicious actor Eve on the QC cloud provider side runs all circuits of Alice on lower-grade hardware \(QC_{2}\) and intercepts queries to the Hadamard CR-QPUF. Eve will use the model function \(f(\mathbf {\theta }) = \mathbf {R}^{\prime }_\mathrm {out}\) to predict \(\mathbf {R}_{\mathrm {out}}\) and send the prediction back to Alice

If Alice accepts the response \(\mathbf {R}^{\prime }_\mathrm {out}\), Eve has successfully tricked Alice into falsely authenticating \(QC_{1}\), while in fact all circuits of Alice have been run on \(QC_{2}\). In what follows, we will show how this attack performs using a real-world commercial quantum cloud computer provided by IBM.

6 Results

We carried out the learning and attack phases in a real-world scenario using the 27-qubit IBM quantum cloud computer \(\texttt {{ibmq\_mumbai}}\). Firstly, we learned the model function \(f^{(j)}\) for each qubit \(j=1,...,27\). To this end, we ran the Hadamard CR-QPUF challenge unitary \(U_{\mathrm {in}}(\mathbf {\theta _l})\) for \(L=30\) different rotations \(\mathbf {\theta _l}\) on \(\texttt {{ibmq\_mumbai}}\) and obtained the SQ responses \(\mathbf {R}_{\mathrm {out},l}\) by averaging over \(\left| {\hat{r}_{\mathrm {out},l}}\right| = 2000\) shots, where \(l = {1,...,L}\). The rotation angles \(\mathbf {\theta _{l}}\) were chosen such that the challenge input space \([0,2\pi )\) is split into L equally spaced angles.

Fig. 5

Measured responses along the 30 rotation angles between 0 and \(2\pi\) for two different qubits of the \(\texttt {{ibmq\_mumbai}}\) commercial quantum computer. The interpolated lines are the fitted prediction models for the respective qubits. The two qubits exhibit different characteristics depending on \(\theta\). The dependence of the responses on the challenges can be accurately modelled using the low-degree polynomials

Figure 5 shows the \(L=30\) measured responses for two different qubits and the respective regression models \(f^{(j)}(\theta _j)\) that are learnt from the data samples \(\{ ( \theta _{l,j}, \mathbf {R}_{\mathrm {out},l,j}) : l = 1,...,30 \}\) for qubits \(j=1\) and \(j=23\). The plot depicts how the responses of individual qubits deviate from a perfect sine function, which is the ideal analytical solution for the responses. This deviation is due to imperfections of the quantum computer.
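The ideal sine-shaped response can be checked directly for a single noise-free qubit. The sketch below applies \(R_Y(\theta )\) and then the Hadamard gate to \(|0\rangle\) and returns the probability of measuring 1, which works out to \((1 - \sin \theta )/2\). The function name is hypothetical, and a perfect device is assumed.

```python
import math


def ideal_response(theta):
    """Probability of measuring 1 after H R_Y(theta) acts on |0> (noise-free)."""
    # R_Y(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
    a0, a1 = math.cos(theta / 2), math.sin(theta / 2)
    # The Hadamard gate maps the |1> amplitude to (a0 - a1) / sqrt(2).
    b1 = (a0 - a1) / math.sqrt(2)
    return b1 ** 2


angles = [2 * math.pi * l / 30 for l in range(30)]
ideal_curve = [ideal_response(t) for t in angles]
```

Deviations of measured responses from `ideal_curve` are what the learned per-qubit models capture.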

In the attack phase, we randomly sampled 15 challenges \(\{ U_{in}(\mathbf {\theta _k}) : k=1,...,15\}\) (each 5 times) to build a holdout CRP database. We then calculated the respective predicted responses \(\{ f(\mathbf {\theta _k}) : k=1,...,15 \}\) using the learned regression models. After mapping the predicted responses to 25 bits, as suggested by Phalak et al. (2021), we calculated their average Hamming distance (HD) to the holdout CRP database. Per the protocol, if the average HD is within the variance interval of the HDs of the respective responses in the database, the predicted response gets accepted.

In Fig. 6, we plot the intra HD of the (holdout) CRP database and the respective predicted responses which we obtained in the attack. As we can see, almost all of the predicted responses get accepted. Hence, we were able to model the Hadamard CR-QPUF behaviour and successfully predicted responses to unknown challenges. This showcases the predictability and thus the insecurity of the Hadamard CR-QPUF.

Fig. 6

Box plot showing the variance of the Hamming distances of responses in the holdout CRP database and their mean. The inner boxes represent the middle two quartiles of the data. The red dots show the average HD of the \(k=15\) predicted responses (obtained through the model functions \(\{f(\mathbf {\theta }_k):k=1,...,15\}\)) to the respective responses in the CRP database. Almost every predicted response will get accepted by the authentication protocol since their average HDs to the respective responses in the CRP database are within the variance intervals

An attacker on the QC cloud provider side who is able to detect when the Hadamard CR-QPUF challenge is requested to run can successfully trick the authentication scheme. This enables the attacker to reroute client circuits to lower-grade hardware without detection.

6.1 Extension to multiple rotations

In this subsection, we present the results of our attack on natural extensions of the Hadamard CR-QPUF as described in Section 3.1. Here, the challenge unitaries do not consist of only one, but of multiple X and Y rotations per qubit, forming a chain of rotations. The idea of the attack is again to learn a simplified polynomial that only depends on two rotations \(\theta _X\) and \(\theta _Y\) for each qubit.

In the learning phase, we learn a polynomial \(f^{(j)}(\theta _{X,j}, \theta _{Y,j})\) of degree 10 for each qubit j that depends on only X and Y rotations. To learn the models, we obtained training samples by running the challenge

$$\begin{aligned} U_{\mathrm {in}}(\mathbf {\theta }_{X}, \mathbf {\theta }_{Y}) = \bigotimes _{j=1}^{n} H R_{Y}(\theta _{Y,j}) R_{X}(\theta _{X,j}) \end{aligned}$$

using rotation angles \(\mathbf {\theta }_{X}\) and \(\mathbf {\theta }_{Y}\) that equidistantly cover the space \([0,2\pi ) \times [0,2\pi )\) in a discretized \(30 \times 30\) grid. Figure 7 shows the training samples obtained from ibmq_mumbai and the landscape of one of the trained models \(f^{(j)}(\theta _X, \theta _Y)\).

Fig. 7

Training samples (black dots) and the landscape of \(f^{(j)}(\theta _X, \theta _Y)\) (surface) that has been learned for one qubit. The landscape exposes a mixture of a cosine and a sine function in the \(\theta _X\) and \(\theta _Y\) directions, respectively. The learned model will be used to imitate the characteristics of the qubit for multiple X and Y rotations

In the attack phase, we aim at predicting responses to unknown challenges comprising chains of rotations. In order to predict the response to a challenge \(U_{\mathrm {in}}(\mathbf {\sigma }_{\rho _1}, ..., \mathbf {\sigma }_{\rho _t})\), where \(\rho _1,...,\rho _t \in \{X,Y\}\) are the rotation directions and the \(\mathbf {\sigma }\)’s are the rotation angles for all qubits, we sum over the \(\mathbf {\sigma }_X\)’s and \(\mathbf {\sigma }_Y\)’s in the challenge and pass them into the learned qubit models \(f^{(j)}(\theta _{X,j}, \theta _{Y,j})\). More precisely:

$$\begin{aligned} \mathbf {\theta }_{X}&= \sum \limits _{i=1,\, \rho _i = X}^{t} {\mathbf {\sigma }_{\rho _i}} \bmod 2\pi \\ \mathbf {\theta }_{Y}&= \sum \limits _{i=1,\, \rho _i = Y}^{t} {\mathbf {\sigma }_{\rho _i}} \bmod 2\pi \end{aligned}$$
$$\begin{aligned} \mathbf {R}^{\prime }_\mathrm {out} = f(\mathbf {\theta }_{X}, \mathbf {\theta }_{Y}) = \begin{pmatrix} f^{(1)}(\theta _{X,1}, \theta _{Y,1}) \\ \vdots \\ f^{(n)}(\theta _{X,n}, \theta _{Y,n}) \end{pmatrix} \end{aligned}$$
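The angle reduction above can be sketched for a single qubit as follows. The function name `collapse_chain` and the string encoding of the rotation directions are hypothetical; the sketch only computes the two effective angles that would then be passed to a learned model \(f^{(j)}(\theta _{X,j}, \theta _{Y,j})\).

```python
import math


def collapse_chain(axes, angles):
    """Reduce a chain of X/Y rotations on one qubit to (theta_X, theta_Y).

    axes: sequence of 'X' or 'Y', one entry per rotation in the chain.
    angles: the corresponding rotation angles for this qubit.
    """
    if len(axes) != len(angles):
        raise ValueError("axes and angles must have the same length")
    if any(ax not in ("X", "Y") for ax in axes):
        raise ValueError("rotation directions must be 'X' or 'Y'")
    two_pi = 2 * math.pi
    theta_x = sum(a for ax, a in zip(axes, angles) if ax == "X") % two_pi
    theta_y = sum(a for ax, a in zip(axes, angles) if ax == "Y") % two_pi
    return theta_x, theta_y


theta_x, theta_y = collapse_chain("YYXX", [1.0, 2.0, 3.0, 4.0])
```

Note that this reduction ignores the order of the rotations in the chain, which is part of the simplification made by the attack model.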

Analogously to Section 6, we carried out the attack on a real-world quantum computer. In Fig. 8, we present the results of this modelling attack on two different kinds of extensions to the Hadamard CR-QPUF. The attack was executed again on the 27-qubit commercial quantum computer ibmq_mumbai. As one can see, again almost all of the predicted responses will get accepted by the protocol, showcasing the insecurity of these extensions of the Hadamard CR-QPUF in the SQ model.

Fig. 8

Box plot showing the variance of the Hamming distances of responses in the holdout CRP database and their mean. The inner boxes represent the middle two quartiles of the data. The red dots show the average HD of the \(k=15\) predicted responses (obtained through the model functions \(\{f(\mathbf {\theta }_k):k=1,...,15\}\)) to the respective responses in the CRP database. Almost every predicted response will get accepted by the authentication protocol since their average HDs to the respective responses in the CRP database are within the variance intervals. Challenges of the form \(H R_{Y}R_{Y}R_{X}R_{X}\) (left) and \(H R_{Y}R_{X}R_{Y}R_{X}R_{Y}R_{X}R_{Y}R_{X}\) (right) were employed

7 Discussion

In this work, we have examined the class of classical readout quantum PUFs. When designing CR-QPUF schemes in the presence of noisy devices, it is natural to construct robust responses by using the expectation value of a random variable, which is then approximated using multiple noisy samples. This formalism is known as the statistical query model (Kearns 1998). Recently, Hinsche et al. (2021) found that output distributions of local quantum circuits cannot be learned efficiently when the learner is restricted to the SQ model (both for classical and quantum learners). Other results (Gollakota and Liang 2021) show that learning stabilizer states in the presence of noise and in the SQ model is intractable.

In the context of noisy quantum devices, this situation opens an interesting perspective: since NISQ devices are inherently noisy, can one leverage the restriction to the SQ model to derive provable security against learning attacks? For instance, can one construct a CR-QPUF such that learning the CR-QPUF in the SQ model implies learning an unknown stabilizer state or an unknown local quantum circuit in the SQ model? We believe this is a very interesting research question. If the answer were yes, then provable security guarantees against learning attacks could be derived for the CR-QPUF on imperfect noisy devices.

Nevertheless, there are substantial uncertainties that need to be addressed. Firstly, since the secret in CR-QPUFs is the set of device imperfections, can the device-specific imperfections be modelled and learned? This corresponds to learning \(qPUF_{id}\), which would mean that the underlying security parameter \(\lambda\) can be reconstructed, rendering any QPUF based on \(\lambda\) useless. As of now, to the best of our knowledge, there are no results showing that the device imperfections cannot be accurately modelled. We believe that results in the field of quantum process tomography (Altepeter et al. 2003; Mohseni et al. 2008) might help to resolve this question, that is, to better understand whether device imperfections can be generally modelled or whether, given \(U_{in} \in \mathcal {U}\), the device-specific Born distribution \(\mathcal {P}_{U_{in}t, \mathrm {id}}\) cannot be learned in the average case. Secondly, since the challenge unitary \(U_{in}\) is known to the attacker, do the device imperfections alter the output distribution such that knowing \(U_{in}\) does not constitute a significant attacker advantage in learning the output distribution? Can the influence of the imperfections be predicted knowing the challenge unitary? These questions need to be clarified before one can be confident about the security against learning attacks in the SQ model.

Another important aspect to keep in mind is that an attacker with direct access to the QC cloud provider machine (in contrast to the purely communication-intercepting attacker considered above) is generally able to act outside of the SQ model. Even though there are learning problems where learning with noise is a hard task (even for quantum computers; Gollakota and Liang 2021), the extended attacker is clearly not limited to the SQ model. In this less restricted learning scenario, direct usage of the noisy samples \({\hat{r}_{\mathrm {out}}}\) is clearly possible.

However, studies have also shown that for the only known separation result between the SQ model and the PAC model (for the learning parity with noise problem), there is only a tiny advantage in using noisy samples over the SQ model (Blum et al. 2003). This gives strong reason to believe that for hard learning problems in the presence of noise, acting outside of the SQ model does not yield a significant advantage.

If one wants to take a very pessimistic stance toward quantum computing, then, according to Kalai (2020) and others, the excessive noise in NISQ devices means that the complexity of these devices does not exceed the class of low-degree polynomials at all. This means that NISQ machines are computationally not stronger than classical computers and that classical computers are in theory able to simulate calculations on NISQ devices. This would of course be detrimental to CR-QPUFs, since any CR-QPUF would be efficiently learnable using low-degree polynomials. While this argument is heavily debated in the quantum computing community at this time, designing CR-QPUFs and testing their security provides an excellent playground to check “the argument against quantum computers” and challenge the NISQ expressivity. CR-QPUF proposals could be checked against classical and quantum learning attacks, providing evidence on the question of whether NISQ computations form a low-complexity class of algorithms whose output can be learned using low-degree polynomials.

As our analysis of the Hadamard CR-QPUF has shown, its responses can be learned using low-degree polynomials, which shows that the class of challenge unitaries needs to be designed very carefully. Since the Hadamard CR-QPUF does not use entanglement, the QPUF security reduces to that of a single qubit, and an attacker only needs to learn the characteristics of one qubit at a time. We therefore argue that entanglement is strictly required in CR-QPUF challenge choices. Additionally, when designing a CR-QPUF, one can leverage the fact that the class of challenge unitaries is not fixed in its circuit structure. In the Hadamard CR-QPUF proposal, this property is not leveraged and the challenge circuit is structurally fixed, which reduces the challenge space to the single-qubit rotation angles \(\mathbf {\theta }\). Finally, we want to mention that while the idea of using imperfections in quantum computers to create an unforgeable fingerprint of the devices could potentially be used in QPUF schemes, the opposing incentives of quantum computer manufacturers, who want to eliminate imperfections, and QPUF users, who require imperfections to identify devices, might hinder their application to industrial cloud providers.
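The reduction to single qubits noted in this paragraph can be made explicit. As an idealized sketch, assume a product input state, only single-qubit gates, and no crosstalk or correlated readout errors (assumptions that real devices satisfy only approximately). The output distribution over bit strings \(x = (x_1, \dots , x_n)\) then factorizes into single-qubit marginals, written here with the illustrative symbol \(p_i\):

$$\begin{aligned} P(x) = \prod _{i=1}^{n} p_i(x_i), \qquad x_i \in \{0, 1\} \end{aligned}$$

Under these assumptions, an attacker has to estimate only \(n\) numbers per challenge (one probability \(p_i(1)\) per qubit), rather than the up to \(2^n - 1\) independent probabilities of a generic \(n\)-qubit distribution.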

8 Future work

Going forward, analyzing and categorizing the device imperfections and their influence on degrading the Born distribution output would contribute to a better understanding of CR-QPUFs. In particular, can device imperfections be leveraged such that the mutation of the output distribution of local quantum circuits is not learnable in the SQ model? And, equally importantly, can the device imperfections be leveraged such that their influence remains secret and unpredictable for future challenge unitaries? We believe that proposing justified CR-QPUF designs and testing their security, using tools developed for quantum process tomography and machine learning, would help gather evidence on the prospects and feasibility of QPUFs. This offers an excellent playground to gain further insight into CR-QPUFs based on quantum device imperfections.