1 Introduction

Efficiently extracting information from a quantum mechanical system is a task of both theoretical and practical importance. As quantum information technology develops, we may wish to use it to characterise unknown quantum states and processes. Even when we possess a complete description of a quantum system, such as instructions for preparing a quantum state via a quantum circuit, we may need to compute properties of the system that are not efficiently obtainable from this description using classical computation. The formalism of “classical shadows” introduced by Huang et al. [1] provides us with a rigorous approach to solving this problem in some cases.

A classical shadow of a quantum state \(\rho \) is an unbiased estimator of \(\rho \), constructed from the outcomes of random measurements on copies of \(\rho \). As its name suggests, samples of this estimator are stored classically, and can be used to estimate properties of \(\rho \), such as the expectation values of a set of observables. A classical shadows protocol is specified by choosing a distribution over unitaries; measurements are performed by sampling a unitary from this distribution, applying it to \(\rho \), and measuring in the computational basis. The efficiency of any classical shadows scheme depends on both the choice of unitary distribution and the particular observables of interest. Ideally, a scheme is efficient both in terms of quantum resources (the circuit depth and the number of random measurements required), and classical resources (the complexity of the post-processing for obtaining estimates from the classical shadow samples).

Recently, Huggins et al. [2] used classical shadows to implement a new hybrid algorithm, called “quantum-classical hybrid quantum Monte Carlo,” on a near-term quantum processor. Their approach is based on classical computational techniques, known as projector quantum Monte Carlo (QMC) methods, that approximate the ground state of a quantum Hamiltonian by stochastically implementing the imaginary time-evolution operator. Generically, QMC algorithms for fermionic systems are made to scale polynomially by imposing constraints derived from a “trial wavefunction” \(| \Psi _{\text {trial}} \rangle \), an ansatz for the ground state that is provided as an input to the algorithm. This polynomial scaling comes at the expense of introducing a bias that depends on the quality of the ansatz. The use of richer families of trial wavefunctions to increase the accuracy of these constrained QMC calculations is an active area of research. Quantum computing offers the ability to efficiently prepare a new class of trial wavefunctions that are difficult to access classically, which motivated the development of a hybrid quantum-classical algorithm for QMC.

The details vary between specific methods for constrained QMC, but generally the constraints can naturally be cast in terms of inner products with \(| \Psi _{\text {trial}} \rangle \). In Ref. [2], the authors implemented a quantum-classical hybrid algorithm for auxiliary-field quantum Monte Carlo (QC-AFQMC), by collecting classical shadow samples of \(| \Psi _{\text {trial}} \rangle \) and using these to estimate the inner products \(\langle \Psi _{\text {trial}}|\varphi _i \rangle \) between \(| \Psi _{\text {trial}} \rangle \) and Slater determinants \(| \varphi _i \rangle \). The particular classical shadows protocol implemented in Ref. [2] is the one based on random Clifford circuits, proposed and analysed in Ref. [1]. The use of classical shadows substantially reduced the number of circuit repetitions required compared to the alternative approach of using Hadamard tests. However, the proposed scheme for classically post-processing the Clifford-based classical shadows to obtain the requisite inner product estimates is inefficient, with a runtime that scales exponentially with the number of qubits.

To remedy this exponential bottleneck in the QC-AFQMC algorithm, and motivated by the importance of fermionic quantum simulation in general, we develop new tomographic protocols based on classical shadows from random matchgate circuits. We refer to these classical shadows as “matchgate shadows” for brevity. Matchgate circuits, which we formally define in Sect. 2.1.2, are generated by a certain set of two-qubit Pauli rotations, and are equivalent to fermionic Gaussian unitaries under the Jordan-Wigner transformation [3]. We consider two distributions over matchgate circuits: the Haar-uniform distribution over the continuous group of all matchgate circuits, and the uniform distribution over the discrete subset of matchgate circuits that are also members of the Clifford group. In Theorem 1, we establish that the first three moments of the two distributions are the same, by finding explicit expressions for the corresponding twirl channels and showing that they are equal. Thus, in the same way the uniform distribution over the Clifford group is a unitary 3-design [4], we can colloquially describe our discrete distribution over Clifford matchgate circuits as a “matchgate 3-design.” In the context of classical shadows, the form of the estimators as well as their variance depend only on the first three moments, so it follows that the two distributions lead to the same results. In addition to potential practical implications, the 3-design property is useful theoretically, as it allows us to exploit the more explicit symmetry of the full matchgate group when analysing the discrete ensemble.

Crucially, we also show how to efficiently post-process our matchgate shadows to estimate three kinds of quantities: (i) expectation values of local fermionic operators, (ii) fidelities \(\textrm{tr}(\varrho \rho )\) between unknown quantum states \(\rho \) and fermionic Gaussian states \(\varrho \), and (iii) inner products \(\langle \psi |\varphi \rangle \) between any pure state \(| \psi \rangle \) (accessed via a state preparation circuit) and arbitrary Slater determinants \(| \varphi \rangle \). We also analyse the variances of the resulting estimates, proving explicit polynomial upper bounds in cases (i) and (ii). In case (iii), we derive an efficiently computable bound, which we evaluate for system sizes up to \(1000\) qubits, finding a modest growth rate consistent with a sublinear scaling in the system size. Beyond these three classes of observables, we provide a general framework for efficiently estimating the expectation values of arbitrary products of local fermionic operators, fermionic Gaussian density operators, and fermionic Gaussian unitaries, though we do not address the task of bounding the variance in this more general situation. We apply this framework to obtain an efficient procedure for estimating inner products between any pure state and arbitrary pure fermionic Gaussian states (not necessarily Slater determinants) using our matchgate shadows. Our post-processing procedures are based on novel methods for classically evaluating free-fermion quantities by exploiting their underlying Clifford algebra structure, which may be of independent interest in both classical and quantum computation.

Our work builds on the classical shadows formalism of Huang et al.  [1], but several differences arise when considering random matchgate circuits rather than random Clifford circuits. First, the fact that the \(n\)-qubit Clifford group has only one non-trivial irreducible representation leads to a particularly simple form for the corresponding measurement channel [5]. In contrast, the group of n-qubit matchgate circuits has \(2n + 1\) inequivalent irreducible representations [6], complicating the analysis. Another interesting difference is that when using classical shadows based on the Clifford group [1], the choice of ensemble, between single- and n-qubit random Clifford circuits, allows one to efficiently estimate either local qubit observables, or low-rank observables such as fidelities. In contrast, our work shows that matchgate shadows are capable of simultaneously estimating both local fermionic observables as well as certain global properties (e.g., the fidelities with fermionic Gaussian states).

The idea of using classical shadows from random matchgate circuits was also explored by Zhao et al. in Ref. [7]. We make a brief comparison here and contrast our work with theirs more thoroughly in Sect. 3.3.1. Zhao et al.  analyse a discrete ensemble of matchgate circuits that is a subset of the discrete ensemble considered in this paper. They apply the resulting shadows to estimate the expectation values of local fermionic observables in a particular basis. We obtain the same scaling as their approach for these local observables. Our practical results go considerably beyond theirs, however, by developing efficient methods and bounding the variances for estimating the nonlocal properties described above (including those required for QC-AFQMC), in addition to the expectation values of local fermionic observables in arbitrary bases. We thus broaden the scope of applicability of the classical shadows formalism to the quantum simulation of fermionic systems.

We refer the reader to Sect. 3 for a summary of our main results; the relevant background material is reviewed in Sect. 2. In Sect. 3.1, we characterise the moments of the two distributions over matchgate circuits we consider. Then, in Sect. 3.2, we provide an expression for the corresponding measurement channel (necessary for classically constructing the matchgate shadows), as well as a general formula for the variance of expectation value estimates. We consider several applications in Sect. 3.3, giving an overview of our methods for efficiently extracting estimates of various quantities from matchgate shadows via classical post-processing, along with bounds on the variances of these estimates. As a specific example, we present a concise description of our protocol for efficiently estimating the inner products required for QC-AFQMC in Algorithm 1. In Sect. 4, we supply the proofs of our results on the ensembles of random matchgate circuits and the classical shadows they generate. Section 5 provides details and proofs of correctness for our efficient post-processing methods, while Sect. 6 gives the proofs of our variance bounds. In Sect. 7, we discuss the context of our work and some directions for future exploration. The appendices contain a mixture of generalisations and more technical details related to the results of the main text.

2 Background

In this section, we provide the background material required for developing our results and putting them into context. In Sect. 2.1, we summarise basic concepts and introduce some notational conventions that will be used throughout the paper. Table 1 contains an abbreviated list of commonly used notation. We then review the classical shadows framework of Ref. [1] in Sect. 2.2, and describe the application of classical shadows to the QC-AFQMC algorithm of Ref. [2] in Sect. 2.3. Readers familiar with this background material can skip to the summary of our main results in Sect. 3, after skimming Sect. 2.1 or Table 1 to gain familiarity with some of the notation.

Table 1 Some notation used throughout the paper

2.1 Preliminaries and notation

Throughout, we use \(\mathcal {H}_n\) to denote the space of n-qubit states, and \(\mathcal {L}(\mathcal {V})\) the space of linear operators on a vector space \(\mathcal {V}\). Thus, \(\mathcal {L}(\mathcal {H}_n)\) is the space of n-qubit operators, and \(\mathcal {L}(\mathcal {L}(\mathcal {H}_n))\) the space of superoperators. All vector spaces we consider are over the field \(\mathbb {C}\).

2.1.1 Majorana operators

For a system of n fermionic modes corresponding to creation operators \(a_1^\dagger ,\dots , a_n^\dagger \), we define 2n Majorana operators \(\gamma _1, \dots , \gamma _{2n}\) by

$$\begin{aligned} \gamma _{2j-1} :=a_j + a_j^\dagger , \qquad \gamma _{2j} :=-i(a_j - a_j^\dagger ) \end{aligned}$$
(1)

for \(j \in [n]:=\{1,\dots , n\}\). The Majorana operators are Hermitian and satisfy the anticommutation relations

$$\begin{aligned} \{\gamma _\mu , \gamma _\nu \} = 2\delta _{\mu ,\nu }I \end{aligned}$$
(2)

for all \(\mu ,\nu \in [2n]\). For a subset of indices \(S \subseteq [2n]\), we denote by \(\gamma _S\) the product of the Majorana operators indexed by the elements in S in increasing order. That is,

$$\begin{aligned}&\gamma _{S} :=\gamma _{\mu _1}\dots \gamma _{\mu _{|S|}} \quad \text {for}\, S = \{\mu _1,\dots \mu _{|S|}\} \subseteq [2n]\, \text {with}\, \mu _1< \dots < \mu _{|S|}, \\&\gamma _{\varnothing } \equiv I, \end{aligned}$$

where I denotes the identity operator in \(\mathcal {L}(\mathcal {H}_n)\). It will be clear from context whether the subscript of \(\gamma \) is a single index (usually represented in terms of lowercase Greek or Roman letters) or a subset of indices (usually represented by an uppercase Roman letter). We further define the independent subspaces

$$\begin{aligned} \Gamma _k :=\textrm{span}\left\{ \gamma _S: S \in {[2n]\atopwithdelims ()k}\right\} \end{aligned}$$
(3)

for \(k \in \{0,\dots , 2n\}\), where \({[2n]\atopwithdelims ()k}\) denotes the set of subsets of [2n] of cardinality k. Many of the operators we will be considering are even operators, i.e., they are in the subspace \(\Gamma _{\text {even}}\) spanned by products of an even number of Majorana operators:

$$\begin{aligned} \Gamma _{\text {even}} :=\bigoplus _{\ell = 0}^{n} \Gamma _{2\ell }. \end{aligned}$$
(4)

We represent n-mode fermionic operators by n-qubit operators via the Jordan-Wigner transformation [3] (and generally, we will not distinguish between a fermionic operator and its qubit representation\(^{1}\)):

$$\begin{aligned} \gamma _{2j-1} = \left( \prod _{i=1}^{j-1} Z_i\right) X_j, \qquad \gamma _{2j} = \left( \prod _{i=1}^{j-1} Z_i\right) Y_j, \end{aligned}$$
(5)

where \(Z_i\) denotes the n-qubit operator that acts as Pauli Z on the ith qubit and as the identity on the rest of the qubits, and similarly for \(X_i\) and \(Y_i\). Then, the space \(\mathcal {L}(\mathcal {H}_n)\) of n-qubit operators is spanned by products of Majorana operators, i.e., \(\mathcal {L}(\mathcal {H}_n)= \bigoplus _{k=0}^{2n} \Gamma _k\). Denoting the eigenstates of Z as \(| 0 \rangle \) and \(| 1 \rangle \), the simultaneous eigenstates of \(\{Z_j\}_{j\in [n]}\) are \(\bigotimes _{j \in [n]}| b_j \rangle =:| b \rangle \) for \(b = (b_1,\dots , b_n) \in \{0,1\}^n\). We refer to \(\{| b \rangle \}_{b\in \{0,1\}^n}\) as the computational basis. This corresponds under the Jordan-Wigner transformation to the occupation-number basis with respect to our fermionic modes \(\{a_j^\dagger \}_{j\in [n]}\): we have \(| b \rangle = (a_1^\dagger )^{b_1} \dots (a_n^\dagger )^{b_n}| \textbf{0} \rangle \), where \(| \textbf{0} \rangle \equiv | 0 \rangle ^{\otimes n}\) is the vacuum state, and

$$\begin{aligned} | b \rangle \langle b | = \prod _{j=1}^n\frac{1}{2}\left( I-i(-1)^{b_j}\gamma _{2j-1}\gamma _{2j}\right) . \end{aligned}$$
(6)
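As a sanity check on the conventions above, the following minimal NumPy sketch builds the Jordan-Wigner Majorana operators of Eq. (5) as explicit matrices and tests Eq. (2) and Eq. (6) numerically. The function names (`jw_majoranas`, `satisfies_anticommutation`, `projector_from_majoranas`, `projector_direct`) are our own illustrative choices, and we assume the first qubit is the most significant bit of the computational basis index.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def jw_majoranas(n):
    """Return [gamma_1, ..., gamma_2n] as 2^n x 2^n matrices, following Eq. (5)."""
    gammas = []
    for j in range(n):
        z_string = [Z] * j              # Z on every qubit before qubit j (0-based)
        padding = [I2] * (n - j - 1)    # identity on every qubit after qubit j
        gammas.append(reduce(np.kron, z_string + [X] + padding))
        gammas.append(reduce(np.kron, z_string + [Y] + padding))
    return gammas

def satisfies_anticommutation(gammas):
    """Check Eq. (2): {gamma_mu, gamma_nu} = 2 delta_{mu,nu} I."""
    dim = gammas[0].shape[0]
    return all(
        np.allclose(g @ h + h @ g, 2 * (mu == nu) * np.eye(dim))
        for mu, g in enumerate(gammas)
        for nu, h in enumerate(gammas)
    )

def projector_from_majoranas(b, gammas):
    """Right-hand side of Eq. (6) for the bit string b."""
    dim = gammas[0].shape[0]
    proj = np.eye(dim, dtype=complex)
    for j, bit in enumerate(b):
        pair = gammas[2 * j] @ gammas[2 * j + 1]  # gamma_{2j-1} gamma_{2j} in 1-based labels
        proj = proj @ (0.5 * (np.eye(dim) - 1j * (-1) ** bit * pair))
    return proj

def projector_direct(b):
    """|b><b| written directly in the computational basis."""
    index = int("".join(str(bit) for bit in b), 2)
    proj = np.zeros((2 ** len(b), 2 ** len(b)), dtype=complex)
    proj[index, index] = 1.0
    return proj
```

For instance, with `gammas = jw_majoranas(3)`, `satisfies_anticommutation(gammas)` should hold, and `projector_from_majoranas(b, gammas)` should agree with `projector_direct(b)` for any 3-bit string `b`.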

More generally, we will refer to any set of Hermitian operators satisfying the anticommutation relations in Eq. (2) as a set of Majorana operators. We can form different sets of Majorana operators by taking appropriate linear combinations of \(\gamma _1,\dots ,\gamma _{2n}\). Specifically, if

$$\begin{aligned} \widetilde{\gamma }_\mu = \sum _{\nu =1}^{2n} Q_{\mu \nu } \gamma _\nu \end{aligned}$$
(7)

for each \(\mu \in [2n]\), then \(\{\widetilde{\gamma }_\mu \}_{\mu \in [2n]}\) are self-adjoint and satisfy \(\{\widetilde{\gamma }_\mu , \widetilde{\gamma }_\nu \} = 2\delta _{\mu ,\nu }\), if and only if the \(2n \times 2n\) matrix Q is real and orthogonal. Having picked out a special basis—the computational basis—for the space of n-qubit states, we will typically use \(\gamma _\mu \) (without tildes) to denote the particular set of Majorana operators satisfying Eq. (6), and sometimes refer to this set as the “canonical” basis of Majorana operators.
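The orthogonality condition can be seen by a short calculation, which we sketch here for real Q. Substituting Eq. (7) into the anticommutator and using Eq. (2) gives

$$\begin{aligned} \{\widetilde{\gamma }_\mu , \widetilde{\gamma }_\nu \} = \sum _{\alpha ,\beta =1}^{2n} Q_{\mu \alpha } Q_{\nu \beta } \{\gamma _\alpha , \gamma _\beta \} = 2\sum _{\alpha =1}^{2n} Q_{\mu \alpha } Q_{\nu \alpha } I = 2 (QQ^{\textrm{T}})_{\mu \nu } I, \end{aligned}$$

which equals \(2\delta _{\mu ,\nu }I\) for all \(\mu ,\nu \) exactly when \(QQ^{\textrm{T}}\) is the identity matrix. Reality of Q follows separately from self-adjointness: since the \(\gamma _\nu \) are Hermitian and linearly independent, \(\widetilde{\gamma }_\mu ^\dagger = \sum _\nu Q_{\mu \nu }^* \gamma _\nu \) coincides with \(\widetilde{\gamma }_\mu \) only if every \(Q_{\mu \nu }\) is real.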

2.1.2 Matchgate circuits/fermionic Gaussian unitaries

Let \(\textrm{O}(2n)\) be the group of real orthogonal \(2n \times 2n\) matrices. For any \(Q \in \textrm{O}(2n)\), we use \(U_Q\) to denote any unitary (acting on \(\mathcal {H}_n\)) such that

$$\begin{aligned} U_Q^\dagger \gamma _\mu U_Q = \sum _{\nu =1}^{2n}Q_{\mu \nu }\gamma _\nu \end{aligned}$$
(8)

for all \(\mu \in [2n]\). We call any such unitary a (fermionic) Gaussian unitary. Gaussian unitaries transform between valid sets of Majorana operators [cf. Eq. (7)]. It is easily verified using Eq. (8) that for any \(S \subseteq [2n]\),

$$\begin{aligned} U_Q^\dagger \gamma _S U_Q = \sum _{S' \in {[2n] \atopwithdelims ()|S|}} \det (Q\big |_{S,S'})\gamma _{S'}, \end{aligned}$$
(9)

where \(M\big |_{S,S'}\) denotes the restriction of the matrix M to rows indexed by S and columns indexed by \(S'\). Since \(\mathcal {L}(\mathcal {H}_n)\) is spanned by \(\{\gamma _S: S \subseteq [2n]\}\), \(U_Q\) is fully determined by Q up to an irrelevant global phase. It follows from Eq. (9) that for each \(k \in \{0,\dots , 2n\}\), the subspace \(\Gamma _k\) spanned by products of k Majorana operators [Eq. (3)] is invariant under conjugation by any \(U_Q\). The set of all Gaussian unitaries on n modes forms a group, which we denote by \(\textrm{M}_n\).
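Equation (9) can be checked numerically without constructing \(U_Q\) explicitly: by Eq. (8), the left-hand side is the ordered product of the rotated operators \(\sum _\nu Q_{\mu \nu }\gamma _\nu \). The sketch below does this for a random orthogonal matrix. It is only an illustration; the helper names are hypothetical, indices are 0-based, and the random Q (obtained from a QR decomposition) may have determinant \(\pm 1\).

```python
import numpy as np
from functools import reduce
from itertools import combinations

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def jw_majoranas(n):
    """Jordan-Wigner Majorana operators of Eq. (5) as dense matrices."""
    gammas = []
    for j in range(n):
        gammas.append(reduce(np.kron, [Z] * j + [X] + [I2] * (n - j - 1)))
        gammas.append(reduce(np.kron, [Z] * j + [Y] + [I2] * (n - j - 1)))
    return gammas

def ordered_product(ops, subset):
    """Product of ops[m] for m in subset, taken in the given (increasing) order."""
    dim = ops[0].shape[0]
    return reduce(np.matmul, [ops[m] for m in subset], np.eye(dim, dtype=complex))

def random_orthogonal(dim, seed):
    """A random real orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def check_eq9(n, subset, seed=0):
    """Compare both sides of Eq. (9) for one subset S (tuple of increasing 0-based indices)."""
    gammas = jw_majoranas(n)
    Q = random_orthogonal(2 * n, seed)
    # U_Q^dagger gamma_mu U_Q, written out via Eq. (8)
    rotated = [sum(Q[mu, nu] * gammas[nu] for nu in range(2 * n)) for mu in range(2 * n)]
    lhs = ordered_product(rotated, subset)
    rhs = sum(
        np.linalg.det(Q[np.ix_(subset, s_prime)]) * ordered_product(gammas, s_prime)
        for s_prime in combinations(range(2 * n), len(subset))
    )
    return np.allclose(lhs, rhs)
```

Calling, e.g., `check_eq9(2, (0, 2))` compares the two sides for \(S = \{1,3\}\) on two modes.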

Matchgate circuits are qubit representations of fermionic Gaussian unitaries under the Jordan-Wigner transformation. We will generally use the terms matchgate circuit and Gaussian unitary interchangeably, and call \(\textrm{M}_n\) the “matchgate group.”\(^{2}\) Matchgates are a particular class of two-qubit gates, generated by two-qubit X rotations of the form \(\exp (i\theta X\otimes X)\) and single-qubit Z rotations \(\exp (i\theta Z\otimes I)\) and \(\exp (i\theta I\otimes Z)\). Matchgate circuits can then be defined as the unitaries generated by nearest-neighbour matchgates (where the n qubits are placed on a line) along with the Pauli X operator \(X_n\) on the last qubit. To see why this definition of matchgate circuits corresponds to that of Gaussian unitaries, note that

$$\begin{aligned} iX_j X_{j+1} = \gamma _{2j}\gamma _{2j+1},\qquad iZ_j = \gamma _{2j-1}\gamma _{2j} \end{aligned}$$

from Eq. (5), and that for \(\mu \ne \nu \),

$$\begin{aligned} \exp (\theta \gamma _\mu \gamma _\nu )^\dagger \gamma _\xi \exp (\theta \gamma _\mu \gamma _\nu ) = \left\{ \begin{array}{ll} \cos (2\theta ) \gamma _\mu + \sin (2\theta ) \gamma _\nu \qquad &{}\xi = \mu , \\ -\sin (2\theta ) \gamma _\mu + \cos (2\theta )\gamma _\nu \qquad &{}\xi = \nu , \\ \gamma _\xi , \qquad &{}\xi \not \in \{\mu ,\nu \}.\end{array}\right. \end{aligned}$$

Thus, the nearest neighbour \(X \otimes X\) rotations and single-qubit Z rotations implement Gaussian unitaries corresponding to Givens rotations in planes spanned by the \(\gamma _\mu \), \(\gamma _{\mu +1}\) axes for every \(\mu \in [2n-1]\); these generate all rotations in \(\textrm{SO}(2n)\). Adding in the X operator on the nth qubit then generates \(\textrm{O}(2n)\), since \(X_n\) implements the reflection that takes \(\gamma _{2n} \mapsto -\gamma _{2n}\) and leaves all the other \(\gamma _\mu \) unchanged, as can be seen from Eq. (5).
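The correspondence between matchgates and Givens rotations can be illustrated numerically. The following sketch (with hypothetical helper names and 0-based Majorana indices) uses the fact that \((\gamma _\mu \gamma _\nu )^2 = -I\) for \(\mu \ne \nu \) to write \(\exp (\theta \gamma _\mu \gamma _\nu ) = \cos (\theta ) I + \sin (\theta )\gamma _\mu \gamma _\nu \), and then checks Eq. (8) against the rotation matrix read off from the conjugation formula above.

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def jw_majoranas(n):
    """Jordan-Wigner Majorana operators of Eq. (5) as dense matrices."""
    gammas = []
    for j in range(n):
        gammas.append(reduce(np.kron, [Z] * j + [X] + [I2] * (n - j - 1)))
        gammas.append(reduce(np.kron, [Z] * j + [Y] + [I2] * (n - j - 1)))
    return gammas

def majorana_rotation(gammas, mu, nu, theta):
    """exp(theta * gamma_mu gamma_nu) for mu != nu, using (gamma_mu gamma_nu)^2 = -I."""
    generator = gammas[mu] @ gammas[nu]
    return np.cos(theta) * np.eye(generator.shape[0]) + np.sin(theta) * generator

def givens_matrix(dim, mu, nu, theta):
    """Rotation by angle 2*theta in the (mu, nu) plane, as in the conjugation formula."""
    Q = np.eye(dim)
    Q[mu, mu] = Q[nu, nu] = np.cos(2 * theta)
    Q[mu, nu] = np.sin(2 * theta)
    Q[nu, mu] = -np.sin(2 * theta)
    return Q

def implements(U, Q, gammas):
    """Check Eq. (8): U^dagger gamma_mu U == sum_nu Q[mu, nu] gamma_nu for every mu."""
    return all(
        np.allclose(U.conj().T @ g @ U, sum(Q[mu, nu] * h for nu, h in enumerate(gammas)))
        for mu, g in enumerate(gammas)
    )
```

On two qubits, `majorana_rotation(gammas, 1, 2, theta)` is the \(X \otimes X\) rotation \(\exp (i\theta X\otimes X)\), and `implements(np.kron(I2, X), np.diag([1, 1, 1, -1]), gammas)` tests the reflection implemented by \(X_n\).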

2.1.3 Fermionic Gaussian states and Slater determinants

There are several equivalent ways of defining (fermionic) Gaussian states. Physically speaking, they are the ground states and thermal states of non-interacting fermionic Hamiltonians. For our purposes, an n-mode Gaussian state is any state whose density operator \(\varrho \) can be written as

$$\begin{aligned} \varrho = \prod _{j=1}^n \frac{1}{2}(I - i\lambda _j \widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j}) \end{aligned}$$
(10)

for some coefficients \(\lambda _j \in [-1,1]\) and Majorana operators \(\widetilde{\gamma }_\mu = U_Q^\dagger \gamma _\mu U_Q = \sum _{\nu =1}^{2n} Q_{\mu \nu }\gamma _\nu \) with \(Q \in \textrm{O}(2n)\). If \(\lambda _j \in \{-1,1\}\) for all \(j \in [n]\), then \(\varrho \) is a pure Gaussian state; otherwise, \(\varrho \) is a mixed Gaussian state. Gaussian unitaries map Gaussian states to Gaussian states. In particular, from Eq. (6), we see that computational basis states are all Gaussian states, and that any pure Gaussian state can be prepared from the vacuum state \(| \textbf{0} \rangle \) by a Gaussian unitary \(U_Q \in \textrm{M}_n\).

The density operator of any Gaussian state is in \(\Gamma _{\text {even}}\), and (as the name suggests) a Gaussian state is fully determined by its two-point correlations \(\textrm{tr}(\varrho \gamma _\mu \gamma _\nu )\), which form its covariance matrix. Specifically, for any n-qubit state \(\rho \), the associated covariance matrix \(C_\rho \) is an antisymmetric \(2n\times 2n\) matrix with entries

$$\begin{aligned} (C_\rho )_{\mu \nu } :=-\frac{i}{2}\textrm{tr}([\gamma _\mu ,\gamma _\nu ] \rho ) \end{aligned}$$
(11)

for \(\mu ,\nu \in [2n]\). For example, by Eq. (6), the covariance matrix of a computational basis state \(| b \rangle \langle b |\) is

$$\begin{aligned} C_{| b \rangle } :=\bigoplus _{j=1}^n \begin{pmatrix} 0 &{} (-1)^{b_j} \\ (-1)^{b_j + 1} &{}0 \end{pmatrix}. \end{aligned}$$
(12)

For a general Gaussian state \(\varrho \) specified as in Eq. (10), the covariance matrix is

$$\begin{aligned} C_\varrho = Q^{\textrm{T}} \bigoplus _{j=1}^n \begin{pmatrix} 0 &{} \lambda _j \\ -\lambda _j &{}0 \end{pmatrix} Q, \end{aligned}$$
(13)

and a useful relation between the covariance matrices of \(\varrho \) and \(U_Q^\dagger \varrho U_Q\) for some \(U_Q \in \textrm{M}_n\) is \(C_{U_Q^\dagger \varrho U_Q} = Q^{\textrm{T}} C_\varrho Q\).
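The covariance matrix formulas can be verified on small examples. The sketch below constructs a Gaussian state directly from Eq. (10), evaluates Eq. (11) entry by entry, and compares the result with Eq. (13); choosing \(Q\) to be the identity and \(\lambda _j = (-1)^{b_j}\) recovers Eq. (12). All function names are illustrative assumptions, and the dense-matrix approach is only practical for a handful of qubits.

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def jw_majoranas(n):
    """Jordan-Wigner Majorana operators of Eq. (5) as dense matrices."""
    gammas = []
    for j in range(n):
        gammas.append(reduce(np.kron, [Z] * j + [X] + [I2] * (n - j - 1)))
        gammas.append(reduce(np.kron, [Z] * j + [Y] + [I2] * (n - j - 1)))
    return gammas

def gaussian_state(lambdas, Q, gammas):
    """Density operator of Eq. (10), with tilde-gammas defined by the orthogonal matrix Q."""
    dim = gammas[0].shape[0]
    rotated = [sum(Q[mu, nu] * g for nu, g in enumerate(gammas)) for mu in range(len(gammas))]
    rho = np.eye(dim, dtype=complex)
    for j, lam in enumerate(lambdas):
        rho = rho @ (0.5 * (np.eye(dim) - 1j * lam * rotated[2 * j] @ rotated[2 * j + 1]))
    return rho

def covariance_matrix(rho, gammas):
    """Entries -(i/2) tr([gamma_mu, gamma_nu] rho) of Eq. (11)."""
    m = len(gammas)
    C = np.zeros((m, m))
    for mu in range(m):
        for nu in range(m):
            commutator = gammas[mu] @ gammas[nu] - gammas[nu] @ gammas[mu]
            C[mu, nu] = np.real(-0.5j * np.trace(commutator @ rho))
    return C

def lambda_blocks(lambdas):
    """Direct sum of the 2x2 blocks [[0, lam], [-lam, 0]] appearing in Eqs. (12) and (13)."""
    C = np.zeros((2 * len(lambdas), 2 * len(lambdas)))
    for j, lam in enumerate(lambdas):
        C[2 * j, 2 * j + 1] = lam
        C[2 * j + 1, 2 * j] = -lam
    return C

def random_orthogonal(dim, seed=0):
    """A random real orthogonal matrix from a QR decomposition."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q
```

With `Q = random_orthogonal(4)` and `lambdas = [0.7, -0.2]`, the matrix `covariance_matrix(gaussian_state(lambdas, Q, gammas), gammas)` should match `Q.T @ lambda_blocks(lambdas) @ Q`.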

For \(\zeta \in \mathbb {Z}_{\ge 0}\), a \(\zeta \)-fermion Slater determinant is a Gaussian state that is also an eigenstate of the number operator \(\sum _{j =1}^n a_j^\dagger a_j\). Not every Gaussian state is a Slater determinant. Indeed, the definition of a Slater determinant depends on the choice of fermionic modes \(\{a_j^\dagger \}_{j \in [n]}\); as discussed in Sect. 2.1.1, throughout this paper, \(\{a_j^\dagger \}_{j \in [n]}\) are the “canonical” modes whose occupation-number states correspond to computational basis states under the Jordan-Wigner transformation. Hence, computational basis states are all Slater determinants, and any \(\zeta \)-fermion Slater determinant can be prepared from a computational basis state \(| x \rangle \) of Hamming weight \(|x| = \zeta \) by a Gaussian unitary that commutes with the number operator.

Fermionic Gaussian states, including Slater determinants, can be efficiently described classically via their covariance matrices (Eq. (11)). Any \(\zeta \)-fermion Slater determinant \(| \varphi \rangle \) can also be written as

$$\begin{aligned} | \varphi \rangle = \widetilde{a}_1^\dagger \dots \widetilde{a}_{\zeta }^\dagger | \textbf{0} \rangle , \qquad \text {where }\widetilde{a}_j = \sum _{k=1}^n V_{jk} a_k \text { for each}\, j \in [n], \end{aligned}$$
(14)

for some \(n \times n\) unitary matrix V. Hence, we can also specify a Slater determinant by specifying V (or the first \(\zeta \) rows of V). For reference, the Majorana operators \(\{\widetilde{\gamma }_\mu \}_{\mu \in [2n]}\) corresponding to these \(\{\widetilde{a}_j\}_{j \in [n]}\) are related to \(\{\gamma _\mu \}_{\mu \in [2n]}\) by \(\widetilde{\gamma }_{2j-1} = \sum _k (\textrm{Re}(V_{jk})\gamma _{2k-1} -\textrm{Im}(V_{jk})\gamma _{2k})\) and \(\widetilde{\gamma }_{2j} = \sum _{k}(\textrm{Im}(V_{jk})\gamma _{2k-1} + \textrm{Re}(V_{jk})\gamma _{2k})\), so the fermionic Gaussian unitary \(U_{\widetilde{Q}}\) that implements the transformation in Eq. (14) is given by the (special) orthogonal matrix

$$\begin{aligned} \widetilde{Q} = \begin{pmatrix} R_{11} &{}\dots &{}R_{1n} \\ \vdots &{}\ddots &{}\vdots \\ R_{n1} &{}\dots &{}R_{nn} \end{pmatrix}, \qquad \text {with blocks } R_{jk} :=\begin{pmatrix} \textrm{Re}(V_{jk}) &{}-\textrm{Im}(V_{jk}) \\ \textrm{Im}(V_{jk}) &{}\textrm{Re}(V_{jk}) \end{pmatrix}. \end{aligned}$$
(15)
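The block structure of Eq. (15) is straightforward to assemble in code. The sketch below (hypothetical function names, dense matrices, small n only) builds \(\widetilde{Q}\) from a unitary V using Kronecker products, and provides what is needed to check that it is special orthogonal and that the rotated Majorana operators reproduce the modes \(\widetilde{a}_j = \sum _k V_{jk}a_k\) of Eq. (14).

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def jw_majoranas(n):
    """Jordan-Wigner Majorana operators of Eq. (5) as dense matrices."""
    gammas = []
    for j in range(n):
        gammas.append(reduce(np.kron, [Z] * j + [X] + [I2] * (n - j - 1)))
        gammas.append(reduce(np.kron, [Z] * j + [Y] + [I2] * (n - j - 1)))
    return gammas

def orthogonal_from_unitary(V):
    """Eq. (15): replace each entry V_jk by the block [[Re, -Im], [Im, Re]]."""
    J = np.array([[0.0, -1.0], [1.0, 0.0]])
    return np.kron(V.real, np.eye(2)) + np.kron(V.imag, J)

def annihilation_ops(gammas):
    """Invert Eq. (1): a_j = (gamma_{2j-1} + i gamma_{2j}) / 2."""
    return [(gammas[2 * k] + 1j * gammas[2 * k + 1]) / 2 for k in range(len(gammas) // 2)]

def random_unitary(n, seed=0):
    """A random unitary from the QR decomposition of a complex Gaussian matrix."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    V, _ = np.linalg.qr(A)
    return V
```

Given `V = random_unitary(2)` and `Q = orthogonal_from_unitary(V)`, one can form the rotated operators \(\sum _\nu Q_{\mu \nu }\gamma _\nu \), pass them to `annihilation_ops`, and compare the result with \(\sum _k V_{jk} a_k\).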

2.1.4 Liouville representation

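In some parts of this paper, predominantly in Sect. 4, we will use the Liouville representation (or Pauli-transfer matrix representation) for operators and superoperators, for the purpose of making certain expressions clearer. In this representation, operators are denoted using “double” kets, and a “double” braket is used to represent the Hilbert-Schmidt inner product. By convention, we take all double kets to be normalised with respect to the Hilbert-Schmidt norm. Thus, for any nonzero operators \(A,B \in \mathcal {L}(\mathcal {H}_n)\),

$$\begin{aligned} | A \rangle \!\rangle :=\frac{1}{\sqrt{\textrm{tr}(A^\dagger A)}} A, \qquad \langle \!\langle A | B \rangle \!\rangle = \frac{\textrm{tr}(A^\dagger B)}{\sqrt{\textrm{tr}(A^\dagger A)\,\textrm{tr}(B^\dagger B)}} \end{aligned}$$

(and set \(| 0 \rangle \!\rangle :=0\) for the zero operator). In particular, since Majorana operators square to the identity, and their products are Hilbert-Schmidt orthogonal, we have

$$\begin{aligned} | \gamma _S \rangle \!\rangle = \frac{1}{\sqrt{2^n}}\gamma _S, \qquad \langle \!\langle \gamma _S | \gamma _{S'} \rangle \!\rangle = \delta _{S,S'}. \end{aligned}$$
(16)

As with usual (state) kets, we will freely write e.g., \(| A \rangle \!\rangle | B \rangle \!\rangle \) as shorthand for the tensor product \(| A \rangle \!\rangle \otimes | B \rangle \!\rangle \). A superoperator acting on an operator is represented by placing the superoperator to the left of the operator’s double ket, i.e., for \(\mathcal {E} \in \mathcal {L}(\mathcal {L}(\mathcal {H}_n))\) and \(A \in \mathcal {L}(\mathcal {H}_n)\),

$$\begin{aligned} \mathcal {E}| A \rangle \!\rangle :=\frac{1}{\sqrt{\textrm{tr}(A^\dagger A)}} \mathcal {E}(A). \end{aligned}$$

We will also write \(\mathcal {E}\mathcal {E}'\) in place of \(\mathcal {E}\circ \mathcal {E}'\). The fact that the \(2^{2n}\) ordered products \(\gamma _S\) of Majorana operators form an orthogonal basis for \(\mathcal {L}(\mathcal {H}_n)\) can be expressed as a resolution of the identity superoperator \(\mathcal {I} \in \mathcal {L}(\mathcal {L}(\mathcal {H}_n))\):

$$\begin{aligned} \mathcal {I} = \sum _{S \subseteq [2n]} | \gamma _S \rangle \!\rangle \langle \!\langle \gamma _S |. \end{aligned}$$
(17)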

2.2 Review of the classical shadows framework

In this subsection, we review the classical shadows framework of Ref. [1],\(^{3}\) introducing generalisations where necessary. The objective is to estimate the expectation values \(\textrm{tr}(O_1 \rho ),\dots , \textrm{tr}(O_M\rho )\) of M “observables”\(^{4}\) \(O_1,\dots , O_M \in \mathcal {L}(\mathcal {H}_n)\) with respect to an unknown n-qubit quantum state \(\rho \), assuming that we are given copies of the state and some classical description of the observables. To apply the classical shadows protocol, we first choose a distribution D over some set of unitaries. For each copy of \(\rho \), we 1) randomly draw a unitary U from this distribution, 2) apply U to \(\rho \), and 3) measure in the computational basis \(\{| b \rangle \}_{b\in \{0,1\}^n}\). (Steps 2 and 3 are equivalent to measuring in the basis \(\{U^\dagger | b \rangle \}_{b \in \{0,1\}^n}\).) Then, consider applying \(U^\dagger \) to the post-measurement state. The quantum channel \(\mathcal {M} \in \mathcal {L}(\mathcal {L}(\mathcal {H}_n))\) describing the overall process is given by

$$\begin{aligned} \mathcal {M}(\rho )&= \mathop {{}\mathbb {E}}_{U \sim D}\sum _{b \in \{0,1\}^n} \langle b |U \rho U^\dagger | b \rangle U^\dagger | b \rangle \langle b |U \nonumber \\&= \textrm{tr}_1\left[ \sum _{b \in \{0,1\}^n} \mathop {{}\mathbb {E}}_{U \sim D} \mathcal {U}^{\otimes 2} (| b \rangle \langle b |^{\otimes 2}) (\rho \otimes I) \right] , \end{aligned}$$
(18)

where \(\textrm{tr}_1\) denotes the partial trace over the first tensor component, and \(\mathcal {U}\) denotes the unitary channel corresponding to \(U^\dagger \), i.e., \(\mathcal {U}(\,\cdot \,) = U^\dagger (\,\cdot \,)U\). Now, suppose that \(\mathcal {M}\) is invertible on some subspace \(\mathcal {X}\) of \(\mathcal {L}(\mathcal {H}_n)\), and assume that \(\mathcal {X}\) contains \(\rho \) and \(U^\dagger | b \rangle \langle b |U\) for all \(U \in D\) and \(b \in \{0,1\}^n\). Let \(\mathcal {M}^{-1}: \mathcal {X} \rightarrow \mathcal {X}\) denote the inverse of \(\mathcal {M}\) restricted to this subspace. Then, define the random operator \(\hat{\rho }\) by

$$\begin{aligned} \hat{\rho } :=\mathcal {M}^{-1}(\hat{U}^\dagger | \hat{b} \rangle \langle \hat{b} |\hat{U}), \end{aligned}$$
(19)

where \(\hat{U}\) is distributed according to D and \(\mathbb {P}[| \hat{b} \rangle = | b \rangle \, |\, \hat{U} = U] = \langle b |U\rho U^\dagger | b \rangle \). By construction, \(\hat{\rho }\) is an unbiased estimator for \(\rho \):

$$\begin{aligned} \mathop {{}\mathbb {E}}[\hat{\rho }] = \rho , \end{aligned}$$

and in the literature, the term “classical shadow” is often used to refer to either \(\hat{\rho }\) (a random variable) or a sample of it obtained in one realisation of the above procedure (i.e., \(\mathcal {M}^{-1} (U^\dagger | b \rangle \langle b |U)\) for some outcomes U and \(| b \rangle \)). These samples can be used to estimate the expectation values \(\textrm{tr}(O_i\rho )\), since

$$\begin{aligned} \mathop {{}\mathbb {E}}[\hat{o}_i] = \textrm{tr}(O_i \rho ), \quad \text {where}\, \hat{o}_i :=\textrm{tr}(O_i \hat{\rho }) = \textrm{tr}\left( O_i \mathcal {M}^{-1}(\hat{U}^\dagger | \hat{b} \rangle \langle \hat{b} |\hat{U})\right) \, \text {for}\, i \in [M]. \end{aligned}$$

Note that only steps 1)–3) are performed on the quantum computer; the remaining computations (constructing the classical shadow sample \(\mathcal {M}^{-1}(U^\dagger | b \rangle \langle b |U)\) and calculating expectation values with respect to it) are classical.
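To make the general procedure concrete, the sketch below works through the simplest instance we are aware of: a single qubit measured in a uniformly random Pauli basis, for which the measurement channel is \(\mathcal {M}(A) = (A + \textrm{tr}(A)I)/3\) and hence \(\mathcal {M}^{-1}(A) = 3A - \textrm{tr}(A)I\). This ensemble is a stand-in chosen for brevity and is not the matchgate ensemble studied in this paper; the function names are illustrative. Rather than sampling, `mean_shadow` enumerates all outcomes \((U,b)\) with their probabilities, so that unbiasedness can be checked exactly.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)
# Applying U then measuring in the computational basis measures Z, X, Y respectively.
UNITARIES = [np.eye(2, dtype=complex), H, H @ S_DAG]

def inverse_channel(op):
    """M^{-1} for uniformly random single-qubit Pauli-basis measurements."""
    return 3 * op - np.trace(op) * np.eye(2)

def shadow_outcomes(rho):
    """Return a list of (probability, shadow sample) over all unitaries U and outcomes b."""
    outcomes = []
    for U in UNITARIES:
        for b in range(2):
            ket = np.zeros(2, dtype=complex)
            ket[b] = 1.0
            # Born probability <b|U rho U^dagger|b>, times the probability 1/3 of drawing U
            prob = np.real(ket.conj() @ U @ rho @ U.conj().T @ ket) / len(UNITARIES)
            rotated = U.conj().T @ np.outer(ket, ket.conj()) @ U  # U^dagger |b><b| U
            outcomes.append((prob, inverse_channel(rotated)))
    return outcomes

def mean_shadow(rho):
    """Exact expectation of the classical shadow estimator."""
    return sum(p * sample for p, sample in shadow_outcomes(rho))

def sample_shadow(rho, rng):
    """Draw one classical shadow sample, simulating steps 1)-3) classically."""
    outcomes = shadow_outcomes(rho)
    probs = np.array([p for p, _ in outcomes])
    k = rng.choice(len(outcomes), p=probs / probs.sum())
    return outcomes[k][1]
```

For any single-qubit density matrix `rho`, `mean_shadow(rho)` should reproduce `rho`, while an individual `sample_shadow(rho, np.random.default_rng(0))` has unit trace but is generally not positive semidefinite.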

To bound the number of samples of the classical shadow estimator \(\hat{\rho }\) (and hence the number of copies of \(\rho \)) required to estimate the expectation values to within some desired precision with high probability, we can consider the variances of the estimators \(\hat{o}_i\):

$$\begin{aligned} \textrm{Var}[\hat{o}_i]&= \mathop {{}\mathbb {E}}[|\hat{o}_i|^2] - |\mathop {{}\mathbb {E}}[\hat{o}_i]|^2 \\&= \mathop {{}\mathbb {E}}_{U \sim D} \sum _{b \in \{0,1\}^n} \langle b | U\rho U^\dagger | b \rangle \left| \textrm{tr}\left( O_i \mathcal {M}^{-1}(U^\dagger | b \rangle \langle b |U)\right) \right| ^2 - |\textrm{tr}(O_i\rho )|^2 \\&= \mathop {{}\mathbb {E}}_{U \sim D} \sum _{b \in \{0,1\}^n} \textrm{tr}\left[ U^\dagger | b \rangle \langle b | U \rho \otimes \mathcal {M}^{-1}(U^\dagger | b \rangle \langle b |U)O_i \otimes \mathcal {M}^{-1}(U^\dagger | b \rangle \langle b |U) O_i^\dagger \right] \\&\quad -|\textrm{tr}(O_i\rho )|^2. \end{aligned}$$

If we further assume that \(O_i,O_i^\dagger \in \mathcal {X}\), then we can use the fact that \(\textrm{tr}(\mathcal {M}^{-1}(A) B) = \textrm{tr}(A \mathcal {M}^{-1}(B))\) for \(A,B \in \mathcal {X}\) to rewrite this as

$$\begin{aligned} \textrm{Var}[\hat{o}_i] = \textrm{tr}\left[ \sum _{b \in \{0,1\}^n} \mathop {{}\mathbb {E}}_{U \sim D} \mathcal {U}^{\otimes 3}(| b \rangle \langle b |^{\otimes 3}) \left( \rho \otimes \mathcal {M}^{-1}(O_i) \otimes \mathcal {M}^{-1}(O_i^\dagger )\right) \right] - |\textrm{tr}(O_i\rho )|^2, \end{aligned}$$
(20)

which can be upper bounded by the first term. Thus, we see from Eqs. (18) and (20) that while the measurement channel \(\mathcal {M}\) in the classical shadows protocol depends on the chosen unitary distribution D through the 2-fold twirl \(\mathop {{}\mathbb {E}}_{U \sim D}\mathcal {U}^{\otimes 2}\), the variances of the resulting estimates depend on D through the 3-fold twirl \(\mathop {{}\mathbb {E}}_{U \sim D}\mathcal {U}^{\otimes 3}\). If median-of-means estimators (see e.g., [8]) are used, it follows straightforwardly from Chebyshev’s and Hoeffding’s inequalities that

$$\begin{aligned} N_{\textrm{sample}} = \mathcal {O}\left( \frac{\log (M/\delta )}{\varepsilon ^2} \max \limits _{i \in [M]} \textrm{Var}[\hat{o}_i] \right) \end{aligned}$$
(21)

classical shadow samples suffice to estimate every \(\textrm{tr}(O_i\rho )\) to within additive error \(\varepsilon \) with probability at least \(1-\delta \).\(^{5}\)
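For reference, a median-of-means estimator can be written in a few lines. The sketch below assumes real-valued single-shot estimates and an illustrative function name; the number of groups would in practice be chosen to scale as \(\log (M/\delta )\), with constants we do not specify here. Samples that do not fit evenly into the groups are discarded.

```python
import numpy as np

def median_of_means(samples, num_groups):
    """Split samples into num_groups equal batches, average each, and return the median."""
    samples = np.asarray(samples, dtype=float)
    usable = (len(samples) // num_groups) * num_groups  # drop the remainder
    if usable == 0:
        raise ValueError("need at least num_groups samples")
    groups = samples[:usable].reshape(num_groups, -1)
    return float(np.median(groups.mean(axis=1)))
```

The median makes the estimate robust to a minority of batches containing outliers: for `samples = [1, 2, 3, 4, 5, 600]` and `num_groups = 3`, the batch means are 1.5, 3.5 and 302.5, and the estimator returns 3.5.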

It is important to note that the classical shadows framework does not in general provide a protocol that is efficient in terms of quantum and classical resources. Even in cases where there is an efficient procedure to sample unitaries U from D and implement them on a quantum computer, the variance of the estimates \(\hat{o}_i\) may be large, necessitating a large number of copies of \(\rho \) (if the number of samples is chosen according to Eq. (21)). In addition, one needs to be able to somehow classically compute \(\textrm{tr}(O_i \mathcal {M}^{-1}(U^\dagger | b \rangle \langle b |U))\) for all i. The particular classical shadows approach taken in Ref. [2] to implement QC-AFQMC, described in the following subsection, provides an example of a situation where the variance is small (in fact, bounded by a constant in the system size n), but the classical post-processing is inefficient (having a runtime exponential in n). One of the motivations behind the present work is to provide a protocol that has \(\textrm{poly}(n)\) quantum and classical complexity for estimating expectation values of a large class of fermionic observables, including (but not limited to) those required for QC-AFQMC.

2.3 Classical shadows applied to quantum-classical auxiliary-field quantum Monte Carlo (QC-AFQMC)

In this subsection, we review one of the motivating use cases for the protocols we develop in this paper. We begin by briefly discussing projector QMC techniques. Formally, we can find the ground state of a Hamiltonian \(H\) by applying the imaginary time-evolution operator \(e^{-\tau H}\) to some initial state \(| \psi _\text {init} \rangle \) that has non-vanishing overlap with the true ground state \(| \psi _\text {ground} \rangle \):

$$\begin{aligned} | \psi _\text {ground} \rangle \propto \lim _{\tau \rightarrow \infty } e^{-\tau H} | \psi _\text {init} \rangle . \end{aligned}$$
(22)

In order to avoid explicitly storing and manipulating exponentially large objects, projector QMC methods approximate this projection onto the ground state by implementing it stochastically. In some cases, such as for unfrustrated systems of bosons, this yields a polynomially scaling procedure for computing the ground state energy and other properties.
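As a toy illustration of Eq. (22), the sketch below applies \(e^{-\tau H}\) exactly to a two-level system using dense linear algebra. This is precisely the kind of explicit manipulation that projector QMC avoids for large systems; the Hamiltonian, initial state, and function name are arbitrary choices made for this example.

```python
import numpy as np

def imaginary_time_projection(H, psi_init, tau):
    """Return the normalised state proportional to exp(-tau * H) |psi_init>."""
    energies, vecs = np.linalg.eigh(H)
    # Shifting by the lowest energy only rescales the state and avoids overflow.
    weights = np.exp(-tau * (energies - energies[0]))
    psi = vecs @ (weights * (vecs.conj().T @ psi_init))
    return psi / np.linalg.norm(psi)

H = np.array([[0.0, 1.0], [1.0, 0.0]])       # toy Hamiltonian with energies -1 and +1
psi_init = np.array([1.0, 0.0])              # overlaps with both eigenvectors
psi = imaginary_time_projection(H, psi_init, tau=10.0)

ground = np.array([1.0, -1.0]) / np.sqrt(2)  # exact ground state of H
overlap = abs(np.vdot(ground, psi))          # approaches 1 as tau grows
```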

However, when we consider systems containing multiple identical fermions, projector QMC methods are typically faced with the fermion sign problem. In projector QMC methods based on second quantisation, such as auxiliary-field quantum Monte Carlo (AFQMC) [9], the sign problem manifests as an exponentially large variance in the estimator of the energy [9]. When a polynomially scaling approach is desired, the fermion sign problem is usually controlled by applying constraints to the statistical samples of the RHS of Eq. (22) using an approximation to the ground state referred to as the “trial wavefunction” \(| \Psi _\text {trial} \rangle \). Details vary between methods, but the basic idea is to modify the statistical samples so as to constrain their overlaps with the trial wavefunction to be positive. Implementing this constraint correctly for a statistical sample \(| \varphi \rangle \) requires calculating \(\langle \Psi _\text {trial}|\varphi \rangle \).

In Ref. [2], Huggins et al. proposed and implemented a quantum-classical hybrid quantum Monte Carlo algorithm, which involved preparing the trial wavefunction \(| \Psi _{\text {trial}} \rangle \) on a quantum computer, and using a classical shadows protocol to estimate its overlaps with the statistical samples. Ref. [2] focused on AFQMC, where the statistical samples are Slater determinants \(| \varphi _i \rangle \) in general single-particle bases. Given a quantum circuit \(U_\Psi \) that prepares \(| \Psi _\text {trial} \rangle \) from the vacuum state \(| \textbf{0} \rangle \), it is straightforward to use Hadamard tests to estimate the overlap between \(| \Psi _\text {trial} \rangle \) and any Slater determinant \(| \varphi _i \rangle \). However, evaluating these overlaps one at a time (and preparing \(| \Psi _{\text {trial}} \rangle \) for each of the Hadamard tests performed for each overlap) for the large number of Slater determinants that arise in a typical AFQMC calculation could be prohibitively expensive.Footnote 6

In order to make an experiment on a near-term quantum computer feasible, Huggins et al. designed a protocol that involves first collecting a pre-determined number of classical shadow samples using the quantum device. Then, as the QMC calculation is run on the classical computer, the necessary overlaps can be evaluated classically using the stored shadow samples. More specifically, the protocol entails preparing the state

$$\begin{aligned} \rho = \frac{1}{2} \left( | \textbf{0} \rangle + | \Psi _\text {trial} \rangle \right) \left( \langle \textbf{0} | + \langle \Psi _\text {trial} |\right) , \end{aligned}$$

and collecting classical shadow samples of it. On the classical computer, one specifies the operator

$$\begin{aligned} | \varphi _i \rangle \langle \textbf{0} | = \widetilde{a}_1^\dagger \dots \widetilde{a}_{\zeta }^\dagger | \textbf{0} \rangle \langle \textbf{0} |, \end{aligned}$$

where the free parameters of \(| \varphi _i \rangle \) are contained in the choice of basis for the \(\widetilde{a}_k^\dagger \) operators (see Sect. 2.1.3). Ref. [2] considers trial wavefunctions \(| \Psi _{\text {trial}} \rangle \) that are eigenstates of the number operator, with eigenvalue \(\zeta > 0\), and Slater determinants \(| \varphi _i \rangle \) with the same number of particles \(\zeta \). (In Appendix A.1 we discuss how to relax this requirement and allow for \(| \Psi _\text {trial} \rangle \) that does not have a fixed particle number.) In this case, we have \(\langle \Psi _{\text {trial}}|\textbf{0} \rangle = \langle \varphi _i|\textbf{0} \rangle = 0\), so the overlap of interest can be expressed as

$$\begin{aligned} \langle \Psi _{\text {trial}}|\varphi _i \rangle = 2 \textrm{tr}\left( | \varphi _i \rangle \langle \textbf{0} |\rho \right) . \end{aligned}$$
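The identity above can be checked numerically on a small example. The sketch below uses generic random vectors orthogonal to the vacuum in place of an actual trial wavefunction and Slater determinant (only the orthogonality to \(| \textbf{0} \rangle \) enters the identity); the dimension, the seed, and the convention that index 0 labels the vacuum are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # Hilbert-space dimension of a 3-qubit toy system

def random_state_orthogonal_to_vacuum(rng, dim):
    """Random normalised vector with no support on basis index 0 (the vacuum)."""
    v = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    v[0] = 0.0
    return v / np.linalg.norm(v)

vac = np.zeros(dim, dtype=complex)
vac[0] = 1.0
psi_trial = random_state_orthogonal_to_vacuum(rng, dim)
phi = random_state_orthogonal_to_vacuum(rng, dim)

superposition = (vac + psi_trial) / np.sqrt(2)
rho = np.outer(superposition, superposition.conj())  # the state prepared on the device
op = np.outer(phi, vac.conj())                       # the operator |phi><0|

lhs = np.vdot(psi_trial, phi)                        # <Psi_trial|phi>
rhs = 2 * np.trace(op @ rho)                         # 2 tr(|phi><0| rho)
```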

Hence, the overlap \(\langle \Psi _\text {trial}|\varphi _i \rangle \) can be estimated by estimating the expectation value of \(| \varphi _i \rangle \langle \textbf{0} |\) with respect to the state \(\rho \). This can in principle be done using the classical shadow samples of \(\rho \). In the particular classical shadows protocol used by Ref. [2], the distribution D (see Sect. 2.2) is the uniform distribution over the n-qubit Clifford group. The variances of the resulting estimators were analysed in [1], and it is straightforward to show that the variance for estimating the expectation value of \(| \varphi _i \rangle \langle \textbf{0} |\) is bounded above by a constant. However, the cost of classically post-processing the classical shadow samples to obtain these estimates appears to scale exponentially with the system size n. Explicitly, Clifford classical shadow estimates \(o_i\) of the expectation value of \(| \varphi _i \rangle \langle \textbf{0} |\) are of the form

$$\begin{aligned} o_i = \left( 2^{n} + 1\right) \langle b |U | \varphi _i \rangle \langle \textbf{0} |U^\dagger | b \rangle , \end{aligned}$$

where U is a Clifford unitary and \(| b \rangle \) is a computational basis state. In order to evaluate \(o_i\) to within a constant additive error, we would therefore need to calculate \(\langle b |U | \varphi _i \rangle \langle \textbf{0} |U^\dagger | b \rangle \) up to an additive error that is exponentially small in \(n\). It is not clear how to evaluate the \(\langle b |U | \varphi _i \rangle \) component of this expression with the necessary degree of precision without resorting to methods that scale exponentially with \(n\) in the general case.

In the experimental implementation of QC-AFQMC in Ref. [2], Clifford classical shadows were used despite this exponential complexity of the classical post-processing, because the system sizes considered were sufficiently small. The techniques we present in this work, based on classical shadows from different ensembles of unitaries, will allow this exponentially costly step to be removed in future implementations of QC-AFQMC.

3 Summary

In this section, we present an overview of the key results of this paper. We provide references to sections that contain the proofs and additional details. All of the notation and background concepts we use in this section are explained in Sect. 2.

3.1 Random matchgate circuits

In this work, we consider the classical shadows resulting from two different distributions over matchgate circuits. As discussed in Sect. 2.1.2, matchgate circuits correspond to fermionic Gaussian unitaries via the Jordan-Wigner transformation, and form a continuous group \(\textrm{M}_n\) which is in one-to-one correspondence with the orthogonal group \(\textrm{O}(2n)\) (if we ignore global phases):

$$\begin{aligned} \textrm{M}_n = \{U_Q: Q \in \textrm{O}(2n)\}. \end{aligned}$$
(23)

The first distribution we study is the “uniform” distribution over \(\textrm{M}_n\), where uniformity is more precisely given by the normalised Haar measure \(\mu \) on \(\textrm{O}(2n)\). The second distribution is the uniform distribution over the discrete subgroup of \(\textrm{M}_n\) consisting only of matchgate circuits that are also in the n-qubit Clifford group \(\textrm{Cl}_n\). From Eq. (8), these coincide with the group \(\textrm{B}(2n)\) of \(2n \times 2n\) signed permutation matrices:

$$\begin{aligned} \textrm{M}_n \cap \textrm{Cl}_n = \{U_Q: Q \in \textrm{B}(2n)\}. \end{aligned}$$
(24)

We explain how to efficiently sample from these two distributions in Appendix B.
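For orientation, the sketch below samples the matrices \(Q\) labelling the two ensembles, using standard techniques that are not necessarily those of Appendix B: a QR decomposition of a Gaussian matrix with a sign correction for the Haar measure on \(\textrm{O}(2n)\), and a random permutation with random signs for \(\textrm{B}(2n)\). The function names are hypothetical, and compiling \(Q\) into a matchgate circuit \(U_Q\) is not shown.

```python
import numpy as np

def sample_haar_orthogonal(m, rng):
    """Haar-random m x m orthogonal matrix via QR of a Gaussian matrix."""
    z = rng.standard_normal((m, m))
    q, r = np.linalg.qr(z)
    # Rescale column j by sign(r_jj) to remove the sign ambiguity of the QR factors.
    return q * np.sign(np.diag(r))

def sample_signed_permutation(m, rng):
    """Uniformly random m x m signed permutation matrix."""
    perm = rng.permutation(m)
    signs = rng.choice([-1.0, 1.0], size=m)
    q = np.zeros((m, m))
    q[np.arange(m), perm] = signs
    return q
```

For an n-mode system, both functions would be called with `m = 2 * n`.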

For \(j \in \mathbb {Z}_{>0}\), we use \(\mathcal {E}^{(j)}_{\textrm{M}_n}\) and \(\mathcal {E}^{(j)}_{\textrm{M}_n \cap \textrm{Cl}_n}\) to denote the j-fold twirl channels corresponding to the distributions over \(\textrm{M}_n\) and \(\textrm{M}_n \cap \textrm{Cl}_n\), respectively:

$$\begin{aligned}{} & {} \mathcal {E}_{\textrm{M}_n}^{(j)} :=\int _{\textrm{O}(2n)} d\mu (Q)\, \mathcal {U}_Q^{\otimes j} \end{aligned}$$
(25)
$$\begin{aligned}{} & {} \mathcal {E}_{\textrm{M}_n \cap \textrm{Cl}_n}^{(j)} :=\frac{1}{|\textrm{B}(2n)|} \sum _{Q \in \textrm{B}(2n)} \mathcal {U}_Q^{\otimes j}, \end{aligned}$$
(26)

where

$$\begin{aligned} \mathcal {U}_Q(\, \cdot \,) :=U_Q^\dagger (\,\cdot \,)U_Q \end{aligned}$$
(27)

denotes the unitary channel for the Gaussian unitary \(U_Q^\dagger \). Since the measurement channel \(\mathcal {M}\) in the classical shadows procedure and the variance of the estimates obtained from the classical shadows are determined by the 2- and 3-fold twirls [see Eqs. (18) and (20)], our first step is to evaluate \(\mathcal {E}^{(j)}_{\textrm{M}_n}\) and \(\mathcal {E}^{(j)}_{\textrm{M}_n \cap \textrm{Cl}_n}\) for j up to 3:

Theorem 1

(First three moments of uniform distributions over \(\textrm{M}_n\) and \(\textrm{M}_n \cap \textrm{Cl}_n\)). Let \(\mathcal {E}^{(j)}_{\textrm{M}_n}, \mathcal {E}^{(j)}_{\textrm{M}_n \cap \textrm{Cl}_n} \in \mathcal {L}(\mathcal {L}(\mathcal {H}_n))^{\otimes j}\) be defined as in Eqs. (25) and (26). Then, we have

(28)
(29)

We prove Theorem 1 in Sect. 4.1 (see also Sect. 2.1.4 for an explanation of the Liouville representation notation used here). In addition to providing explicit expressions for the relevant twirl channels, Theorem 1 shows in particular that the third moments of the two distributions are equal. Thus, the discrete ensemble of Clifford matchgate circuits is a 3-design for the continuous Haar-uniform distribution over all matchgate circuits, in the same way that the Clifford group is a unitary 3-design [4]. We can state this result informally as:

Corollary 1

The group of matchgate circuits that are also Clifford unitaries forms a “matchgate 3-design.”

This is a general result that can be applied in any context that involves up to the third moment of uniformly random matchgate circuits. In the specific context of classical shadows, it implies that the discrete and continuous ensembles we defined above lead to the same measurement channel and variances, so we can in principle use either ensemble and obtain the same results. In addition to possible practical implications (e.g., it may be easier to sample and implement unitaries from one distribution than the other, depending on hardware capabilities), from a mathematical perspective, the 3-design property is useful in that it allows us to use the additional symmetry of the full matchgate group to more easily analyse the discrete ensemble. We will see explicit examples of this in the following subsections.

3.2 Matchgate shadows

Theorem 1 allows us to characterise the classical shadows associated with the two distributions over matchgate circuits described above. We will refer to these colloquially as “matchgate classical shadows” or more simply “matchgate shadows.” Substituting the expression for \(\mathcal {E}_{\textrm{M}_n}^{(2)} = \mathcal {E}_{\textrm{M}_n \cap \textrm{Cl}_n}^{(2)}\) from Theorem 1(ii) for \(\mathop {{}\mathbb {E}}_{U \sim D} \mathcal {U}^{\otimes 2}\) in Eq. (18), we show in Sect. 4.2.1 that the classical shadows measurement channel \(\mathcal {M}\) (for both distributions) is given byFootnote 7

$$\begin{aligned} \mathcal {M} = \sum _{\ell =0}^{n}{n\atopwithdelims ()\ell } {2n \atopwithdelims ()2\ell }^{-1} \mathcal {P}_{2\ell }, \end{aligned}$$
(30)

where for \(k \in \{0,\dots , 2n\}\), \(\mathcal {P}_k \in \mathcal {L}(\mathcal {L}(\mathcal {H}_n))\) denotes the projector onto the subspace \(\Gamma _k\) of \(\mathcal {L}(\mathcal {H}_n)\) spanned by all products of k Majorana operators, i.e., in Liouville representation,

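$$\begin{aligned} \mathcal {P}_k :=\sum _{\begin{array}{c} S \subseteq [2n] \\ |S| = k \end{array}} | \gamma _S \rangle \!\rangle \langle \!\langle \gamma _S |. \end{aligned}$$
(31)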

Consequently, the image of \(\mathcal {M}\) is the subspace \(\Gamma _{\text {even}} :=\bigoplus _{\ell =0}^{n} \Gamma _{2\ell }\) of \(\mathcal {L}(\mathcal {H}_n)\) consisting of even operators, and the (pseudo)inverse \(\mathcal {M}^{-1}: \Gamma _{\text {even}} \rightarrow \Gamma _{\text {even}}\) on this subspace is

$$\begin{aligned} \mathcal {M}^{-1} = \sum _{\ell = 0}^{n} {2n\atopwithdelims ()2\ell } {n\atopwithdelims ()\ell }^{-1} \mathcal {P}_{2\ell }. \end{aligned}$$
(32)
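Since \(\mathcal {M}\) is diagonal in the decomposition into the subspaces \(\Gamma _{2\ell }\), Eqs. (30) and (32) amount to a table of \(n+1\) eigenvalues and their reciprocals. The short sketch below tabulates them with exact rational arithmetic; the function names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def channel_eigenvalues(n):
    """Eigenvalue of M on Gamma_{2l} for l = 0..n, as in Eq. (30)."""
    return [Fraction(comb(n, l), comb(2 * n, 2 * l)) for l in range(n + 1)]

def inverse_channel_eigenvalues(n):
    """Eigenvalue of the (pseudo)inverse of M on Gamma_{2l}, as in Eq. (32)."""
    return [1 / lam for lam in channel_eigenvalues(n)]
```

For example, for \(n = 2\) the eigenvalues of \(\mathcal {M}\) are \(1, 1/3, 1\), so the inverse channel rescales the quadratic Majorana terms by a factor of 3 and leaves the identity and parity components unchanged.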

Since \(U_Q^\dagger | b \rangle \langle b |U_Q \in \Gamma _{\text {even}}\) for any \(Q \in \textrm{O}(2n)\) and computational basis state \(| b \rangle \) (see Eqs. (6) and (9)), our matchgate shadow samples \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\) are well-defined. Furthermore, it follows from the fact that \(\Gamma _{\text {even}}\) is Hilbert-Schmidt orthogonal to its complement \(\Gamma _{\text {odd}} :=\bigoplus _{\ell = 1}^n \Gamma _{2\ell -1}\) that these classical shadows produce unbiased estimates of the expectation values \(\textrm{tr}( O_1\rho ),\dots ,\textrm{tr}( O_M\rho )\) provided that either the state \(\rho \) is in \(\Gamma _{\text {even}}\), or the observables \(O_1, \dots , O_M\) are in \(\Gamma _{\text {even}}\). Indeed, for many physical problems, the observables of interest are even operators, due to fermionic parity conservation. In our particular application to QC-AFQMC (see Sects. 2.3 and 3.3.3 below), the starting state \(\rho \) and relevant observables are all even operators or can all be made even, as shown in Appendix A.

Supposing that \(O \in \Gamma _{\text {even}}\) (and hence \(O^\dagger \in \Gamma _{\text {even}}\)), we can substitute the expression for \(\mathcal {E}_{\textrm{M}_n}^{(3)} = \mathcal {E}_{\textrm{M}_n \cap \textrm{Cl}_n}^{(3)}\) from Theorem 1(iii) for \(\mathop {{}\mathbb {E}}_{U \sim D} \mathcal {U}^{\otimes 3}\) in Eq. (20), leading to

$$\begin{aligned}&\textrm{Var}[\hat{o}] \le \mathop {{}\mathbb {E}}[|\hat{o}|^2] = \frac{1}{2^{2n}}\sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1+ \ell _2 + \ell _3 \le n \end{array}} \alpha _{\ell _1,\ell _2,\ell _3}\nonumber \\&\quad \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint} \\ |A_1| = 2\ell _1, |A_2| = 2\ell _2, |A_3| = 2\ell _3 \end{array}} \textrm{tr}\left( \rho \gamma _{A_1}\gamma _{A_2}\right) \textrm{tr}\left( O\gamma _{A_2}\gamma _{A_3}\right) \textrm{tr}\left( O^\dagger \gamma _{A_3}\gamma _{A_1}\right) , \end{aligned}$$
(33)

with

$$\begin{aligned} \alpha _{\ell _1,\ell _2,\ell _3} :=\frac{{n\atopwithdelims ()\ell _1,\ell _2,\ell _3, n-\ell _1-\ell _2-\ell _3}}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3, 2(n-\ell _1-\ell _2-\ell _3)}} \frac{{2n\atopwithdelims ()2(\ell _1 + \ell _3)}}{{n\atopwithdelims ()\ell _1 + \ell _3}} \frac{{2n\atopwithdelims ()2(\ell _2 + \ell _3)}}{{n\atopwithdelims ()\ell _2 + \ell _3}}, \end{aligned}$$
(34)

for the variance of the estimator \(\hat{o}\) for \(\textrm{tr}(O\rho )\) that we obtain from (a single sample of) our matchgate shadows. Equation (33), proven in Sect. 4.2.2, is expressed in terms of a particular set of Majorana operators \(\gamma _{\mu }\). However, note from Eq. (25) that \(\mathcal {E}^{(3)}_{\textrm{M}_n}\) is invariant under composition with any Gaussian unitary channel \(\mathcal {U}_Q^{\otimes 3}\), and by Corollary 1, so is \(\mathcal {E}^{(3)}_{\textrm{M}_n \cap \textrm{Cl}_n}\). This symmetry can be used to show that, for both the continuous ensemble \(\textrm{M}_n\) and the discrete ensemble \(\textrm{M}_n \cap \textrm{Cl}_n\), we can in fact replace the \(\gamma _\mu \)’s in Eq. (33) with any other basis of Majorana operators, i.e., for any \(Q \in \textrm{O}(2n)\), we can write

$$\begin{aligned} \mathop {{}\mathbb {E}}[|\hat{o}|^2]&= \frac{1}{2^{2n}}\sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1+ \ell _2 + \ell _3 \le n \end{array}} \alpha _{\ell _1,\ell _2,\ell _3}\sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint} \\ |A_1| = 2\ell _1, |A_2| = 2\ell _2, |A_3| = 2\ell _3 \end{array}} \nonumber \\&\quad \textrm{tr}\left( \rho \widetilde{\gamma }_{A_1}\widetilde{\gamma }_{A_2}\right) \textrm{tr}\left( O\widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3}\right) \textrm{tr}\left( O^\dagger \widetilde{\gamma }_{A_3}\widetilde{\gamma }_{A_1}\right) \end{aligned}$$
(35)

where \(\widetilde{\gamma }_\mu = \sum _{\nu = 1}^{2n} Q_{\mu \nu }\gamma _\nu \). The freedom to choose the Majorana basis in Eq. (35) (which is not a priori obvious for \(\textrm{M}_n \cap \textrm{Cl}_n\), without Corollary 1) will allow us to more easily bound the variance for large classes of fermionic observables. We can also obtain a bound that does not depend on our unknown state \(\rho \) from Eq. (35) by applying the triangle inequality and noting that \(|\textrm{tr}(\rho \widetilde{\gamma }_{A_1}\widetilde{\gamma }_{A_2})| \le \Vert \widetilde{\gamma }_{A_1}\widetilde{\gamma }_{A_2}\Vert \le 1\) for any \(A_1, A_2 \subseteq [2n]\):

$$\begin{aligned} \textrm{Var}[\hat{o}]&\le \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}}\alpha _{\ell _1,\ell _2,\ell _3} \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint} \\ |A_1| = 2\ell _1, |A_2|=2\ell _2,|A_3|=2\ell _3 \end{array}} \big |\textrm{tr}(O\widetilde{\gamma }_{A_2\cup A_3})\textrm{tr}(O \widetilde{\gamma }_{A_3 \cup A_1})\big | \nonumber \\&= \frac{1}{2^{2n}} \sum _{\begin{array}{c} S_1, S_2 \subseteq [2n] \\ |S_1|, |S_2|, |S_1 \cap S_2| \text { even} \end{array}} \alpha _{\frac{1}{2}|S_2\setminus S_1|, \frac{1}{2}|S_1\setminus S_2|,\frac{1}{2}|S_1 \cap S_2|} \big |\textrm{tr}(O \widetilde{\gamma }_{S_1}) \textrm{tr}(O\widetilde{\gamma }_{S_2})\big |. \end{aligned}$$
(36)

Equations (32) and (36) characterise the classical shadows obtained from performing random measurements corresponding to either of our two distributions over matchgate circuits (\(\textrm{M}_n\) or \(\textrm{M}_n \cap \textrm{Cl}_n\)), but they should be viewed only as a starting point. To provide a viable protocol for estimating the expectation values of some observables \(O_1,\dots , O_M\), we must also be able to 1) efficiently compute (on a classical computer) the expectation values \(\textrm{tr}(O_i \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q))\) of each \(O_i\) with respect to any classical shadow sample \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\), and 2) efficiently compute a bound on the variance for each \(O_i\), so as to determine the number of samples needed to achieve a given precision. In the following subsection, we describe efficient computation schemes and analyse the variance for three general classes of observables.

3.3 Applications

3.3.1 Local fermionic observables

First, as a simple example, we consider observables that are products of an even number of Majorana operators, i.e., \(O = \widetilde{\gamma }_S\) for \(S \subseteq [2n]\) with |S| even, where \(\widetilde{\gamma }_\mu = \sum _{\nu = 1}^{2n} Q'_{\mu \nu }\gamma _\nu \) for some \(Q'\in \textrm{O}(2n)\). The expectation value of \(\widetilde{\gamma }_S\) with respect to a classical shadow sample \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\) can be computed as

$$\begin{aligned} \textrm{tr}\left( \widetilde{\gamma }_S \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\right) = {2n\atopwithdelims ()|S|}{n\atopwithdelims ()|S|/2}^{-1} \textrm{pf}\left( i (Q' Q^{\textrm{T}} C_{| b \rangle } Q {Q'}^{\textrm{T}})\big |_{S}\right) , \end{aligned}$$
(37)

where \(\textrm{pf}\) denotes the Pfaffian, \(C_{| b \rangle }\) is the covariance matrix of the computational basis state \(| b \rangle \), given in Eq. (12), and \(M\big |_S\) denotes the restriction of a matrix M to rows and columns indexed by S. Equation (37) follows directly from the form of our inverse channel \(\mathcal {M}^{-1}\) [Eq. (32)] together with Wick’s theorem, and is efficiently computable since the Pfaffian of a \(2n\times 2n\) matrix can be computed in \(\mathcal {O}(n^3)\) time (see e.g., Ref. [10]). From Eq. (35) or (36), the variance of the estimates for \(\textrm{tr}(\widetilde{\gamma }_S\rho )\) is bounded by \({2n\atopwithdelims ()|S|}{n\atopwithdelims ()|S|/2}^{-1}\) (to see this, observe that the only non-vanishing term in the sum corresponds to \(A_1 = A_2 = \varnothing \) and \(A_3 = S\)), which scales as \(n^{|S|/2}\) for constant |S|. Thus, the expectation values of local fermionic observables can be efficiently estimated using the classical shadows we obtain from either \(\textrm{M}_n\) or \(\textrm{M}_n \cap \textrm{Cl}_n\).
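A minimal numerical sketch of Eq. (37) is given below. The Pfaffian routine uses a standard skew-symmetric elimination with pivoting, which runs in cubic time but is not necessarily the algorithm of Ref. [10]. The function names, the 0-based indexing of \(S\), and the block form used for \(C_{| b \rangle }\) in the usage note are assumptions of this example; the covariance matrix should be taken from Eq. (12).

```python
from math import comb
import numpy as np

def pfaffian(a):
    """Pfaffian of a skew-symmetric matrix by elimination with pivoting (cubic time)."""
    a = np.array(a, dtype=complex)  # work on a copy
    m = a.shape[0]
    if m % 2 == 1:
        return 0j
    pf = 1.0 + 0j
    for k in range(0, m - 1, 2):
        # Bring the largest entry of column k (below row k) to position (k+1, k).
        kp = k + 1 + int(np.abs(a[k + 1:, k]).argmax())
        if kp != k + 1:
            a[[k + 1, kp], k:] = a[[kp, k + 1], k:]
            a[k:, [k + 1, kp]] = a[k:, [kp, k + 1]]
            pf = -pf  # a simultaneous row/column swap flips the sign
        if a[k + 1, k] == 0:
            return 0j
        pf *= a[k, k + 1]
        if k + 2 < m:
            tau = a[k, k + 2:] / a[k, k + 1]
            col = a[k + 2:, k + 1].copy()
            a[k + 2:, k + 2:] += np.outer(tau, col) - np.outer(col, tau)
    return pf

def majorana_product_estimate(S, Q_prime, Q, C_b):
    """Single-sample estimate of Eq. (37); S lists 0-based indices, with len(S) even."""
    n = Q.shape[0] // 2
    rotated = Q_prime @ Q.T @ C_b @ Q @ Q_prime.T
    prefactor = comb(2 * n, len(S)) / comb(n, len(S) // 2)
    return prefactor * pfaffian(1j * rotated[np.ix_(S, S)])
```

As a usage example under the stated assumption on \(C_{| b \rangle }\): for \(n = 2\), \(Q = Q' = I\), and `C_b = np.kron(np.eye(2), [[0, 1], [-1, 0]])`, the call `majorana_product_estimate([0, 1], I, I, C_b)` evaluates the prefactor \({4\atopwithdelims ()2}{2\atopwithdelims ()1}^{-1} = 3\) times the Pfaffian of a \(2 \times 2\) block.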

3.3.2 Gaussian density matrices

Next, we consider observables that are density operators of fermionic Gaussian states, i.e., \(O = \varrho \), where \(\varrho \) has the form in Eq. (10). In the case where \(\varrho \) or the unknown state \(\rho \) is a pure state, the expectation value \(\textrm{tr}(\varrho \rho )\) of \(\varrho \) with respect to \(\rho \) gives the fidelity between \(\rho \) and \(\varrho \). From Eq. (32), the expectation value of \(\varrho \) with respect to a classical shadow sample \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\), which gives an unbiased estimate for \(\textrm{tr}(\varrho \rho )\), is

$$\begin{aligned} \textrm{tr}\left( \varrho \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q) \right) = \sum _{\ell = 0}^{n} {2n\atopwithdelims ()2\ell }{n\atopwithdelims ()\ell }^{-1} \textrm{tr}\left( \varrho \mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q)\right) . \end{aligned}$$
(38)

Theorem 2, stated for a special case below and in full generality in Sect. 5.1, allows us to efficiently compute \(\textrm{tr}(\varrho \mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q))\) for every \(\ell \), and hence \(\textrm{tr}(\varrho \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q) )\), for any Gaussian state \(\varrho \). Note that \(U_Q^\dagger | b \rangle \langle b |U_Q\) is also a Gaussian state, for any \(U_Q \in \textrm{M}_n\) and computational basis state \(| b \rangle \).

Theorem 2* (specialised to invertible \(C_{\varrho _1}\)). For any \(n \in \mathbb {Z}_{>0}\), let \(\varrho _1\) and \(\varrho _2\) be density operators of n-mode fermionic Gaussian states (Eq. (10)), with covariance matrices \(C_{\varrho _1}\) and \(C_{\varrho _2}\) (Eq. (11)). Then, for each \(\ell \in \{0,\dots , n\}\), \(\textrm{tr}(\varrho _1\mathcal {P}_{2\ell }(\varrho _2))\) is the coefficient of \(z^\ell \) in the polynomial \(p_{\varrho _1,\varrho _2}(z)\), where

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z) = \frac{1}{2^n} \textrm{pf}\big (C_{\varrho _1}\big ) \textrm{pf}\left( -C_{\varrho _1}^{-1} + zC_{\varrho _2}\right) \end{aligned}$$
(39)

if \(C_{\varrho _1}\) is invertible.

We give the form of \(p_{\varrho _1,\varrho _2}(z)\) for the general case, where \(C_{\varrho _1}\) is not necessarily invertible, in Sect. 5.1, where we also provide the proof. This polynomial has degree at most r in general, where \(2r \le 2n\) is the rank of \(C_{\varrho _1}\), so its coefficients could be computed using polynomial interpolationFootnote 8 in \(\mathcal {O}(r^4)\) time. In Appendix D, we describe a different strategy that computes all of the coefficients in \(\mathcal {O}(r^3)\) time. Thus, by taking \(\varrho _1 = \varrho \) and \(\varrho _2 = U_Q^\dagger | b \rangle \langle b |U_Q\) in Theorem 2, we can compute our classical shadows estimate, Eq. (38), in at most \(\mathcal {O}(n^3)\) time. For instance, in the case where \(C_\varrho \) is invertible, the terms \(\textrm{tr}(\varrho \mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q))\) in Eq. (38) are the coefficients of the polynomial \(2^{-n}\textrm{pf}(C_{\varrho })\textrm{pf}(-C_\varrho ^{-1} + z Q^{\textrm{T}} C_{| b \rangle }Q)\), with \(C_{\varrho }\) given as in Eq. (13) and \(C_{| b \rangle }\) in Eq. (12).
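The interpolation route can be sketched as follows for the invertible case of Eq. (39). The polynomial is evaluated at the \((n+1)\)-th roots of unity and its coefficients are recovered with a discrete Fourier transform, which is one standard way to interpolate and costs \(\mathcal {O}(n^4)\) overall; the \(\mathcal {O}(r^3)\) strategy of Appendix D is not reproduced here. The function names and the vacuum covariance matrix used in the usage note are assumptions of this example.

```python
from math import comb
import numpy as np

def pfaffian(a):
    """Pfaffian of a skew-symmetric matrix by elimination with pivoting (cubic time)."""
    a = np.array(a, dtype=complex)  # work on a copy
    m = a.shape[0]
    if m % 2 == 1:
        return 0j
    pf = 1.0 + 0j
    for k in range(0, m - 1, 2):
        kp = k + 1 + int(np.abs(a[k + 1:, k]).argmax())
        if kp != k + 1:
            a[[k + 1, kp], k:] = a[[kp, k + 1], k:]
            a[k:, [k + 1, kp]] = a[k:, [kp, k + 1]]
            pf = -pf
        if a[k + 1, k] == 0:
            return 0j
        pf *= a[k, k + 1]
        if k + 2 < m:
            tau = a[k, k + 2:] / a[k, k + 1]
            col = a[k + 2:, k + 1].copy()
            a[k + 2:, k + 2:] += np.outer(tau, col) - np.outer(col, tau)
    return pf

def projected_overlaps(C1, C2):
    """Coefficients of z^l, l = 0..n, of the polynomial in Eq. (39); C1 must be invertible."""
    n = C1.shape[0] // 2
    nodes = np.exp(2j * np.pi * np.arange(n + 1) / (n + 1))  # roots of unity
    prefactor = pfaffian(C1) / 2**n
    C1_inv = np.linalg.inv(C1)
    values = np.array([prefactor * pfaffian(-C1_inv + z * C2) for z in nodes])
    return np.fft.fft(values) / (n + 1)  # recovers the coefficients from the samples

def gaussian_shadow_estimate(C_rho, Q, C_b):
    """Eq. (38), using Q^T C_b Q as the covariance matrix of the rotated basis state."""
    n = C_rho.shape[0] // 2
    coeffs = projected_overlaps(C_rho, Q.T @ C_b @ Q)
    weights = [comb(2 * n, 2 * l) / comb(n, l) for l in range(n + 1)]
    return float(np.real(np.dot(weights, coeffs)))
```

As a sanity check, if both covariance matrices are `np.kron(np.eye(n), [[0, 1], [-1, 0]])` (assumed here to represent the vacuum state), then \(-C^{-1} + zC = (1+z)C\) and the polynomial reduces to \(2^{-n}(1+z)^n\).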

A result that is essentially a special case of Theorem 2 (where \(\varrho _1\) is a computational basis state) was derived in Ref. [6], by combining Wick’s theorem with a minor summation formula for Pfaffians. We use a different, more elementary approach (directly exploiting the structure of the Clifford algebra generated by the Majorana operators) to prove Theorem 2, because it can be extended to the more complicated case of estimating overlaps, discussed in the following subsection (whereas the proof strategy of Ref. [6] would require some other summation formula, tailored to the overlap case; as far as we are aware, such a formula is not available in the literature).Footnote 9

As for the variance, we show in Sect. 6.1 that for any Gaussian state \(\varrho \), Eq. (36) becomes

$$\begin{aligned} \textrm{Var}[\hat{o}]\Big |_{O = \varrho } \le \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} \frac{{n\atopwithdelims ()\ell _1,\ell _2,\ell _3,n-\ell _1-\ell _2-\ell _3}^2}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3, 2(n-\ell _1-\ell _2-\ell _3)}}\frac{{2n\atopwithdelims ()2(\ell _1+\ell _3)}}{{n\atopwithdelims ()\ell _1 + \ell _3}}\frac{{2n\atopwithdelims ()2(\ell _2 + \ell _3)}}{{n\atopwithdelims ()\ell _2 + \ell _3}} \end{aligned}$$
(40)

for the variance of the classical shadows estimator for the expectation value of \(\varrho \). This can be straightforwardly upper bounded by \(\mathcal {O}(n^3)\); we provide a more refined argument in Appendix F that \(\textrm{Var}[\hat{o}]\big |_{O = \varrho } = \mathcal {O}(\sqrt{n} \log n)\). The RHS of Eq. (40) is also plotted as the red line in Fig. 1.

3.3.3 Overlaps with Slater determinants

We can also efficiently estimate the overlap between a pure state \(| \psi \rangle \) and an arbitrary Slater determinant \(| \varphi \rangle \) using our matchgate shadows, provided that we can prepare \(| \psi \rangle \) using a quantum circuit.Footnote 10 As explained in Sect. 2.3, assuming \(| \psi \rangle \) has no support on the vacuum state \(| \textbf{0} \rangle \) (see Appendix A for modified protocols that remove this assumption), the overlap \(\langle \psi |\varphi \rangle \) can be obtained by evaluating the expectation value of \(| \varphi \rangle \langle \textbf{0} |\) with respect to the initial state \(\rho = \frac{1}{2}(| \textbf{0} \rangle + | \psi \rangle )(\langle \textbf{0} | + \langle \psi |)\). Note that \(| \varphi \rangle \langle \textbf{0} |\) is an even operator if and only if the number of electrons \(\zeta \) in \(| \varphi \rangle \) is even. (One way to see this is by using the fact that an operator is in \(\Gamma _{\text {even}}\) if and only if it commutes with the parity operator \(P = \prod _{j = 1}^n (I - 2a_j^\dagger a_j) = (-i)^n \gamma _1 \dots \gamma _{2n}\).) Since our analysis in Sect. 3.2 applies directly only to even operators, we show in Appendix A.1 that in the case where \(| \varphi \rangle \) has an odd number of fermions, we can reduce the problem of evaluating \(\langle \varphi |\psi \rangle \) to evaluating the expectation value of \(| \varphi ' \rangle \langle \textbf{0} |\) for a Slater determinant \(| \varphi ' \rangle \) with an even number of electrons, by introducing an extra qubit and making a simple modification to the initial state \(\rho \).
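The parity criterion in the paragraph above can be checked on a small example. The sketch below builds Majorana operators with a standard Jordan-Wigner convention (an assumption of this example, which may differ from the convention of Sect. 2 by ordering or signs), forms \(P = (-i)^n \gamma _1 \dots \gamma _{2n}\), and tests whether \(| \varphi \rangle \langle \textbf{0} |\) commutes with it, using computational basis states as the simplest Slater determinants.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

def majoranas(n):
    """gamma_1..gamma_2n as Z..Z X I..I and Z..Z Y I..I (assumed Jordan-Wigner convention)."""
    gammas = []
    for j in range(n):
        prefix, suffix = [Z] * j, [I2] * (n - j - 1)
        gammas.append(reduce(np.kron, prefix + [X] + suffix))
        gammas.append(reduce(np.kron, prefix + [Y] + suffix))
    return gammas

def basis_state(bits):
    """Computational basis state with the given occupation numbers."""
    vec = np.zeros(2 ** len(bits), dtype=complex)
    vec[int("".join(str(b) for b in bits), 2)] = 1.0
    return vec

def is_even(op, parity):
    """An operator is even if and only if it commutes with the parity operator."""
    return np.allclose(parity @ op, op @ parity)

n = 3
parity = (-1j) ** n * reduce(np.matmul, majoranas(n))
vac = basis_state([0] * n)
op_two_fermions = np.outer(basis_state([1, 1, 0]), vac.conj())  # zeta = 2
op_one_fermion = np.outer(basis_state([1, 0, 0]), vac.conj())   # zeta = 1
```

With these definitions, `is_even(op_two_fermions, parity)` holds while `is_even(op_one_fermion, parity)` does not.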

Hence, it suffices to show how to estimate the expectation value of \(| \varphi \rangle \langle \textbf{0} | \in \Gamma _{\text {even}}\) for an arbitrary Slater determinant \(| \varphi \rangle \) with an even number of fermions \(\zeta \). Taking \(| \varphi \rangle \langle \textbf{0} |\) to be an observable in the context of the classical shadows protocol, its expectation values with respect to classical shadow samples of any initial state \(\rho \) are unbiased estimates of \(\textrm{tr}(| \varphi \rangle \langle \textbf{0} |\rho )\). By Eq. (32), for our matchgate shadows, these estimates have the form

$$\begin{aligned} \textrm{tr}\left( | \varphi \rangle \langle \textbf{0} | \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\right) = \sum _{\ell =0}^n {2n\atopwithdelims ()2\ell }{n\atopwithdelims ()\ell }^{-1} \textrm{tr}\left( | \varphi \rangle \langle \textbf{0} |\mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q) \right) \end{aligned}$$
(41)

where \(U_Q \in \textrm{M}_n\) and \(| b \rangle \) is a computational basis state. Equation (41) can be efficiently computed using the following result, which we prove in Sect. 5.2.

Theorem 3

For any \(n \in \mathbb {Z}_{>0}\) and even integer \(0 \le \zeta \le n\), let \(| \varphi \rangle \) be an n-mode, \(\zeta \)-fermion Slater determinant specified as in Eq. (14). Let \(\varrho \) be the density operator of any n-mode fermionic Gaussian state (Eq. (10)), with covariance matrix \(C_{\varrho }\) (Eq. (13)). Then, for each \(\ell \in \{0,\dots , n\}\), \(\textrm{tr}(| \varphi \rangle \langle \textbf{0} | \mathcal {P}_{2\ell }(\varrho ))\) is the coefficient of \(z^\ell \) in the polynomial

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z) = \frac{1}{2^{n - \zeta /2}} i^{\zeta /2} \textrm{pf}\left( \left( C_{| \textbf{0} \rangle } + zW^* \widetilde{Q} C_{\varrho } \widetilde{Q}^{\textrm{T}} W^\dagger \right) \Big |_{\overline{S}_\zeta } \right) , \end{aligned}$$

where \(C_{| \textbf{0} \rangle }\) is the covariance matrix of the vacuum state \(| \textbf{0} \rangle \) (Eq. (12)), \(\widetilde{Q}\) is the orthogonal matrix defined in Eq. (15),Footnote 11

$$\begin{aligned} W :=\bigoplus _{j = 1}^\zeta \frac{1}{\sqrt{2}}\begin{pmatrix} 1 &{}-i \\ 1 &{}i \end{pmatrix} \oplus \bigoplus _{j = \zeta + 1}^n \begin{pmatrix} 1 &{} 0 \\ 0 &{}1 \end{pmatrix}, \end{aligned}$$
(42)

and \(\overline{S}_\zeta :=[2n] {\setminus } \{1,3,\dots ,2\zeta - 1\}\).

The matrix \((C_{| \textbf{0} \rangle } + zW^* \widetilde{Q} C_{\varrho } \widetilde{Q}^{\textrm{T}} W^\dagger ) \big |_{\overline{S}_\zeta }\) has size \(2n - \zeta \), so the polynomial \(q_{| \varphi \rangle , \varrho }(z)\) has degree at most \(n - \zeta /2\) and all of its coefficients can be found using polynomial interpolation in \(\mathcal {O}((n - \zeta /2)^4)\) time. Thus, taking \(\varrho = U_Q^\dagger | b \rangle \langle b |U_Q\) in Theorem 3 gives an efficient way of computing our classical shadows estimate, Eq. (41).

As we prove in Sect. 6.2, for any initial state \(\rho \) and Slater determinant \(| \varphi \rangle \) with an even number \(\zeta \) of fermions, the variance of our classical shadows estimator for \(\textrm{tr}(| \varphi \rangle \langle \textbf{0} | \rho )\) is bounded as

$$\begin{aligned} \textrm{Var}[\hat{o}]\Big |_{O = | \varphi \rangle \langle \textbf{0} |} \le b(n, \zeta ) :=\frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}}\alpha _{\ell _1,\ell _2,\ell _3}\, \kappa (n, \zeta , \ell _1,\ell _2,\ell _3), \end{aligned}$$
(43)

where \(\alpha _{\ell _1,\ell _2,\ell _3}\) is given by Eq. (34) andFootnote 12

$$\begin{aligned}{} & {} \kappa (n,\zeta , \ell _1,\ell _2,\ell _3) :=2^{\zeta }\sum _{j=0}^{\zeta /2} {\zeta \atopwithdelims ()2j}\nonumber \\{} & {} \quad \times {n-\zeta \atopwithdelims ()\ell _1 - \zeta /2 +j,\, \ell _2 - \zeta /2 + j, \, \ell _3 - j, \,n - \ell _1 - \ell _2 - \ell _3 -j}. \end{aligned}$$
(44)

Note that for \(\zeta = 0\), the RHS of Eq. (43) reduces to the RHS of Eq. (40), which is our variance bound for estimating expectation values of Gaussian density operators. This is consistent, because \(| \varphi \rangle = | \textbf{0} \rangle \) if \(\zeta = 0\), so \(| \varphi \rangle \langle \textbf{0} | = | \textbf{0} \rangle \langle \textbf{0} |\) is a Gaussian density operator. While we do not provide an asymptotic bound on the variance for \(\zeta >0\), Eq. (43) is an explicit upper bound that can be computed in \(\textrm{poly}(n)\) time (and in order to use the classical shadows procedure, it suffices to be able to efficiently compute an upper bound on the variance, in order to choose the number of samples). We plot the bound \(b(n,\zeta )\) in Eq. (43) in Fig. 1 for n up to 1000 and various values of \(\zeta \). The plot strongly suggests that the variance bound for \(\zeta > 0\) is always less than the bound for \(\zeta = 0\), which scales as \(\mathcal {O}(\sqrt{n} \log n)\) (see Appendix F). This would imply that the number of matchgate shadow samples required to estimate overlaps with arbitrary Slater determinants is sublinear in the number of fermionic modes n.
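The bound \(b(n,\zeta )\) can be evaluated directly from Eqs. (34), (43) and (44). The sketch below does so with exact rational arithmetic; the function names are chosen for illustration, multinomial coefficients with a negative entry are taken to vanish (an assumption needed to make sense of the sum in Eq. (44)), and the plain triple loop is meant for small \(n\) rather than for reproducing Fig. 1.

```python
from fractions import Fraction
from math import comb, factorial

def multinomial(total, parts):
    """Multinomial coefficient; zero if a part is negative or the parts do not sum to total."""
    if any(p < 0 for p in parts) or sum(parts) != total:
        return 0
    out = factorial(total)
    for p in parts:
        out //= factorial(p)
    return out

def alpha(n, l1, l2, l3):
    """The coefficient defined in Eq. (34)."""
    rest = n - l1 - l2 - l3
    ratio = Fraction(multinomial(n, (l1, l2, l3, rest)),
                     multinomial(2 * n, (2 * l1, 2 * l2, 2 * l3, 2 * rest)))
    f13 = Fraction(comb(2 * n, 2 * (l1 + l3)), comb(n, l1 + l3))
    f23 = Fraction(comb(2 * n, 2 * (l2 + l3)), comb(n, l2 + l3))
    return ratio * f13 * f23

def kappa(n, zeta, l1, l2, l3):
    """The combinatorial factor defined in Eq. (44), for even zeta."""
    half = zeta // 2
    total = 0
    for j in range(half + 1):
        parts = (l1 - half + j, l2 - half + j, l3 - j, n - l1 - l2 - l3 - j)
        total += comb(zeta, 2 * j) * multinomial(n - zeta, parts)
    return 2 ** zeta * total

def variance_bound(n, zeta):
    """b(n, zeta) of Eq. (43), for even zeta with 0 <= zeta <= n."""
    total = Fraction(0)
    for l1 in range(n + 1):
        for l2 in range(n + 1 - l1):
            for l3 in range(n + 1 - l1 - l2):
                total += alpha(n, l1, l2, l3) * kappa(n, zeta, l1, l2, l3)
    return total / 4 ** n
```

Setting `zeta = 0` makes `kappa` collapse to a single multinomial coefficient, so `variance_bound(n, 0)` evaluates the RHS of Eq. (40).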

Fig. 1

Linear and log-log plots of \(b(n,\zeta )\) vs. n, for \(\zeta \in \{0,2,10,50,100,200,500\}\) (evaluated at integer values of n and joined). We also plot \(y = \sqrt{n} \ln (n)\) for comparison (in Appendix F, we place an \(\mathcal {O}(\sqrt{n}\ln (n))\) bound on b(n, 0), plotted in red). \(b(n,\zeta )\), defined in Eq. (43), is our bound on the variance for estimating the expectation value of \(| \varphi \rangle \langle \textbf{0} |\) using our matchgate classical shadows, where \(| \varphi \rangle \) is any n-mode, \(\zeta \)-fermion Slater determinant. As shown in Appendix A.1, estimating the expectation value of \(| \varphi \rangle \langle \textbf{0} |\) allows us to estimate the overlap between any pure state and \(| \varphi \rangle \). Note that b(n, 0) is also equal to the RHS of Eq. (40), which is our variance bound for estimating the expectation values of arbitrary Gaussian density operators.

[Algorithm 1: matchgate shadows protocol for estimating overlaps with Slater determinants]

For reference, we summarise our matchgate shadows protocol applied to estimating overlaps with Slater determinants in Algorithm 1. Using this protocol (with \(| \psi \rangle = | \Psi _{\text {trial}} \rangle \)) in place of the Clifford-based shadows protocol implemented in Ref. [2] removes the exponential classical post-processing cost incurred in QC-AFQMC.

3.3.4 More general fermionic observables

While the first three types of applications discussed above likely cover many cases of interest, we also develop an explicit framework for efficiently evaluating expectation values of a much broader class of observables using our matchgate classical shadows. This broader class includes products of operators of the form \(A^{(1)}\dots A^{(m)}\), where each \(A^{(i)}\) is an arbitrary linear combination of Majorana operators \(\{\gamma _\mu \}_{\mu \in [2n]}\), a fermionic Gaussian unitary, or the density operator of a fermionic Gaussian state. As a specific application of this general framework, we show how to use it to estimate the inner product between an arbitrary pure state \(| \psi \rangle \) and an arbitrary pure fermionic Gaussian state (not restricted to be a Slater determinant), thereby extending the results in Sect. 3.3.3. We describe the framework in detail in Sect. 5.3, which concludes with our procedure for inner product estimation in Sect. 5.3.4.

Our post-processing procedures are built upon new classical simulation results and proof techniques that may find application in other contexts, beyond their use in the specific classical shadows protocols we consider here. In particular, we present a method for efficiently evaluating any expression that can be written in the form \(\textrm{tr}(A^{(1)}\dots A^{(m)})\), which encompasses a wide range of free-fermion quantities of interest. At a high level, the general method consists of three main steps. First, we give a general recipe for recasting the trace of a product of arbitrary operators as a Grassmann integral (see Theorem 4), in a Grassmann algebra that is related to the Clifford algebra generated by the Majorana operators. Then, we show that if each operator in the product falls into one of the three categories described above, the Grassmann integral can be massaged into a particular form. Finally, we develop an algorithm (Algorithm 2) for efficiently evaluating any integral of this form.

3.4 Comparison to related work

3.4.1 Prior work

We now compare our results to those of Zhao et al. [7], who consider classical shadows resulting from the discrete uniform distribution over matchgate circuits \(U_Q\) such that Q is in the alternating group \(\textrm{A}(2n)\); these constitute a proper subset of \(\textrm{M}_n \cap \textrm{Cl}_n\). We note that the measurement channel \(\mathcal {M}\) for this distribution is the same as ours in Eq. (30), even though its corresponding 2-fold twirl \(\mathop {{}\mathbb {E}}_{Q \in \textrm{A}(2n)} \mathcal {U}_Q^{\otimes 2}\) is different from the 2-fold twirl \(\mathcal {E}_{\textrm{M}_n}^{(2)} = \mathcal {E}_{\textrm{M}_n \cap \textrm{Cl}_n}^{(2)}\) for our distributions. (In fact, the j-fold twirl channels differ, i.e., \(\mathop {{}\mathbb {E}}_{Q \in \textrm{A}(2n)} \mathcal {U}_Q^{\otimes j} \ne \mathcal {E}_{\textrm{M}_n}^{(j)} = \mathcal {E}_{\textrm{M}_n \cap \textrm{Cl}_n}^{(j)}\), for all \(j \in \mathbb {Z}_{>0}\); see Footnote 13.) Since the measurement channels are the same, the expectation values of \(\widetilde{\gamma }_S\) can likewise be computed using Eq. (37) for the distribution in Ref. [7]. However, the authors only consider and bound the variance for products \(\gamma _S\) of the canonical Majorana operators \(\gamma _\mu \) (also finding a variance bound of \({2n\atopwithdelims ()|S|}{n\atopwithdelims ()|S|/2}^{-1}\)). In the absence of an analogue of Corollary 1 for their discrete distribution (which naturally picks out \(\{\gamma _\mu \}_{\mu \in [2n]}\) as a preferred basis), the variance for arbitrary Majorana products \(\widetilde{\gamma }_S\) is more difficult to bound tightly using their basis-dependent expression for the variance.

More importantly, compared to Ref. [7], we provide methods for efficiently computing estimates of the expectation values of more families of observables, beyond single products of Majorana operators, as discussed in Sect. 3.3. In particular, one of these is the set of \(| \varphi \rangle \langle \textbf{0} |\) operators which allow us to obtain the overlap estimates in the QC-AFQMC algorithm (see Sect. 2.3). Thus, we obtain a more generally applicable shadows protocol for estimating fermionic observables.

3.4.2 Subsequent work

Shortly after the preprint of this manuscript was published, two related papers, Refs. [13, 14], were posted.

O’Gorman [13] considers the same problem as we do in Sect. 3.3.3, that is, of estimating the overlaps between an unknown pure state and arbitrary Slater determinants (likewise motivated by the application to QC-AFQMC), using classical shadows associated with the same discrete distribution analysed by Zhao et al. in Ref. [7]. Hence, as discussed above, the measurement channel \(\mathcal {M}\) for this distribution is the same as ours (Eq. (30)). However, without having proven an analogue of our “matchgate 3-design” result (Corollary 1), the proof of their variance bounds is incomplete. Indeed, there is a gap between Lemma 2 and Theorem 4 of Ref. [13], as it is not proved that the variance bound for \(| \textbf{0} \rangle \langle \textbf{0} |\) also applies to any arbitrary fermionic Gaussian state, nor that the variance bound for \(| 1 \rangle ^{\otimes \zeta } | 0 \rangle ^{\otimes n- \zeta }\langle \textbf{0} |\) also applies to \(| \varphi \rangle \langle \textbf{0} |\) for any \(\zeta \)-fermion Slater determinant. In our paper, the variance analysis (in the case of the discrete distribution over \(\textrm{M}_n \cap \textrm{Cl}_n\)) for arbitrary Gaussian states and Slater determinants relies on the matchgate 3-design result. This is similar to how the fact that the Clifford group forms a unitary 3-design is a key ingredient in the analysis of the Clifford-based classical shadows of Ref. [1]. 
Reference [13] places a bound of \(\textrm{Var}[\hat{o}]\big |_{O = | \textbf{0} \rangle \langle \textbf{0} |} = \mathcal {O}(n)\) on the variance for \(| \textbf{0} \rangle \langle \textbf{0} |\), which is consistent with (though looser than) our bound of \(\textrm{Var}[\hat{o}]\big |_{O = \varrho } = \mathcal {O}(\sqrt{n}\log n)\) for an arbitrary Gaussian density matrix \(\varrho \), and also presents numerical evidence that \(\textrm{Var}[\hat{o}]\big |_{O = | \varphi \rangle \langle \textbf{0} |}\) scales sublinearly with n for Slater determinants \(| \varphi \rangle \). In addition, Ref. [13] applies the variance bounds from Ref. [7] to show how to learn a Slater determinant from copies thereof.

For the classical post-processing required to extract estimates of \(\textrm{tr}(| \varphi \rangle \langle \textbf{0} |\rho )\) from the classical shadow samples, Ref. [13] shows that \(\textrm{tr}(| \varphi \rangle \langle \textbf{0} |\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q))\) can be decomposed into \(n + 1\) matchgate tensor networks, then appeals to the fact that certain classes of such tensor networks can be contracted efficiently (see references therein). In contrast, we give an explicit expression for this quantity (Eq. (41) and Theorem 3), and provide a self-contained algorithm for efficiently computing more general fermionic observables as well (see Sects. 3.3.4 and 5.3).

Low [14] considers the uniform distribution over number-conserving fermionic Gaussian unitaries, i.e., \(\{U \in \textrm{M}_n: [U, \sum _j a_j^\dagger a_j] = 0\}\), and shows how the classical shadows corresponding to this distribution can be used to estimate k-fermion reduced density matrices (k-RDMs) \(\textrm{tr}(a_{p_1}^\dagger \dots a_{p_k}^\dagger a_{q_1}\dots a_{q_k} \rho _\zeta )\) of a state \(\rho _\zeta \) with fixed particle number \(\zeta \), with an average-case variance that is asymptotically better than the worst-case variance resulting from the shadows considered in the present paper and Ref. [7]. In particular, for \(k = \mathcal {O}(1)\), the variance averaged over all k-RDMs is \(\mathcal {O}(\zeta ^k)\) for the shadows of Ref. [14] (which can be much smaller than the variance bound of \(\mathcal {O}(n^k)\), derived in Ref. [7] and Sect. 3.3.1), while for \(k = \zeta \), the average variance is \(\mathcal {O}(1)\).

Reference [14] then reduces the estimation of the overlap with an arbitrary n-mode, \(\zeta \)-fermion Slater determinant to the estimation of a \(\zeta \)-RDM of an \((n + \zeta )\)-mode state with \(\zeta \) fermions, for which the “average” variance is \(\mathcal {O}(1)\). However, there is an important subtlety: this average variance is taken over all of the \(\zeta \)-RDMs, whereas only a subset of the \(\zeta \)-RDMs correspond to the estimation of a Slater determinant overlap. Therefore, the fact that this average variance is \(\mathcal {O}(1)\) does not imply that the variance for estimating Slater determinant overlaps is \(\mathcal {O}(1)\) when averaged over \(\zeta \)-fermion Slater determinants. Without further analysis, it remains unclear what the worst-case or average-case variance would be for overlap estimation using these classical shadows. On the other hand, Eq. (43) provides a guarantee on the worst-case variance for overlap estimation using our matchgate shadows, though this bound is likely not independent of n. (Moreover, the protocol of Ref. [14] involves adding \(\zeta \) ancillary fermionic modes, which may be prohibitive for near-term quantum computers, especially when \(\zeta \) is comparable to n, e.g., for systems at half-filling.)

For the classical post-processing, Ref. [14] employs our proof techniques in Sects. 5.1 and 5.2, extending and adapting them to obtain efficiently computable expressions for the k-RDM estimators obtained from their classical shadows. In the same vein as Theorems 2 and 3, the classical post-processing procedure of Ref. [14] involves finding the coefficients of a certain polynomial that can be evaluated in terms of Pfaffians.

4 Ensembles of Matchgate Circuits

In this section, we analyse the two distributions over matchgate circuits defined in Sect. 3 (see Eqs. (23)–(26)). We begin by proving Theorem 1 in Sect. 4.1, then use it in Sect. 4.2 to characterise the classical shadows resulting from the distributions.

4.1 Moments of the distributions

We prove Theorem 1 by explicitly evaluating the twirl channels \(\mathcal {E}^{(j)}_{\textrm{M}_n}\) and \(\mathcal {E}^{(j)}_{\textrm{M}_n \cap \textrm{Cl}_n}\) for \(j \in \{1,2,3\}\). For convenience, we first collect some basic facts about the Gaussian unitary channels \(\mathcal {U}_Q\).

Fact 1

Let \(\mathcal {U}_Q \in \mathcal {L}(\mathcal {L}(\mathcal {H}_n))\) be defined by Eqs. (27) and (8). For any \(Q, Q' \in \textrm{O}(2n)\),

\(\mathrm {(a)}\) \(\mathcal {U}_Q(ABC\dots ) = \mathcal {U}_Q(A) \mathcal {U}_Q(B)\mathcal {U}_Q(C)\dots \) for any operators \(A,B,C, \dots \in \mathcal {L}(\mathcal {H}_n)\),

\(\mathrm {(b)}\) \(\mathcal {U}_{Q Q'} = \mathcal {U}_{Q'} \circ \mathcal {U}_{Q}\),

\(\mathrm {(c)}\) \(\mathcal {U}_Q^\dagger = \mathcal {U}_Q^{-1} = \mathcal {U}_{Q^{\textrm{T}}}\), where the adjoint is with respect to the Hilbert-Schmidt inner product,

\(\mathrm {(d)}\) \(\mathcal {U}_Q (\Gamma _k) = \Gamma _k\) for all \(k \in \{0,\dots , 2n\}\), where \(\Gamma _k\) is defined in Eq. (3).

Proof

(a) is a simple consequence of the unitarity of \(U_Q\). Using (a) in conjunction with Eq. (8) gives \((\mathcal {U}_{Q'} \circ \mathcal {U}_Q) (\gamma _S) = \mathcal {U}_{QQ'}(\gamma _S)\) for all \(S \subseteq [2n]\), which implies (b) since \(\{\gamma _S: S \subseteq [2n]\}\) spans \(\mathcal {L}(\mathcal {H}_n)\). For (c), it follows from (b) that \(\mathcal {U}_Q \circ \mathcal {U}_{Q^{\textrm{T}}} = \mathcal {U}_{Q^{\textrm{T}} Q} = \mathcal {U}_I = \mathcal {I}\), so \(\mathcal {U}_{Q}^{-1} = \mathcal {U}_{Q^{\textrm{T}}}\). Also, clearly \(\mathcal {U}_Q^{-1}(\,\cdot \,) = U_Q(\,\cdot \,)U_Q^\dagger \). Hence, for all \(A,B \in \mathcal {L}(\mathcal {H}_n)\),

$$\begin{aligned} \textrm{tr}(A^\dagger \mathcal {U}_Q(B)) = \textrm{tr}(U_Q A^\dagger U_Q^\dagger B) = \textrm{tr}( \mathcal {U}_Q^{-1}(A)^\dagger B), \end{aligned}$$

so \(\mathcal {U}_Q^{-1} = \mathcal {U}_Q^\dagger \). (d) follows directly from Eq. (9) and the fact that \(\mathcal {U}_Q\) is invertible. \(\square \)

It follows from Fact 1(b) that the map \(\mathcal {U}: \textrm{O}(2n)\rightarrow \mathcal {L}(\mathcal {L}(\mathcal {H}_n))\) with \(\mathcal {U}(Q) = \mathcal {U}_Q\) is a faithful representation of the orthogonal group \(\textrm{O}(2n)\). The following fact follows straightforwardly from the group properties of \(\textrm{O}(2n)\) and \(\textrm{B}(2n)\).

Fact 2

For any \(j \in \mathbb {Z}_{>0}\), \(\mathcal {E}_{\textrm{M}_n}^{(j)}\) is the orthogonal projector onto the subspace \(\mathcal {X}^{(j)}_{\textrm{M}_n} :=\{A \in \mathcal {L}(\mathcal {H}_n)^{\otimes j}: \mathcal {U}_Q^{\otimes j}(A) = A\hspace{5.0pt}\forall \, Q \in \textrm{O}(2n)\}\) of \(\mathcal {L}(\mathcal {H}_n)^{\otimes j}\), and \(\mathcal {E}_{\textrm{M}_n \cap \textrm{Cl}_n}^{(j)}\) is the orthogonal projector onto the subspace \(\mathcal {X}^{(j)}_{\textrm{M}_n \cap \textrm{Cl}_n} :=\{A \in \mathcal {L}(\mathcal {H}_n)^{\otimes j}: \mathcal {U}_Q^{\otimes j}(A) = A\hspace{5.0pt}\forall \, Q \in \textrm{B}(2n)\}\).

Proof

For any \(Q \in \textrm{O}(2n)\),

$$\begin{aligned} \mathcal {U}_Q^{\otimes j} \circ \mathcal {E}_{\textrm{M}_n}^{(j)} = \mathcal {E}_{\textrm{M}_n}^{(j)} = \mathcal {E}_{\textrm{M}_n}^{(j)} \circ \mathcal {U}_Q^{\otimes j} \end{aligned}$$
(45)

using Fact 1(b) in conjunction with the left- and right-invariance of the Haar measure on \(\textrm{O}(2n)\). It follows that

$$\begin{aligned} (\mathcal {E}_{\textrm{M}_n}^{(j)})^2 = \mathcal {E}_{\textrm{M}_n}^{(j)}. \end{aligned}$$

Also,

$$\begin{aligned} (\mathcal {E}_{\textrm{M}_n}^{(j)})^\dagger = \int _{\textrm{O}(2n)} d\mu (Q)\, (\mathcal {U}_Q^\dagger )^{\otimes j} = \int _{\textrm{O}(2n)} d\mu (Q)\, \mathcal {U}_{Q^{-1}}^{\otimes j} = \int _{\textrm{O}(2n)} d\mu (Q)\, \mathcal {U}_{Q}^{\otimes j} = \mathcal {E}_{\textrm{M}_n}^{(j)}, \end{aligned}$$

where the second equality is Fact 1(c), while the third follows from the fact that \(\textrm{O}(2n)\) is unimodular. Thus, \(\mathcal {E}_{\textrm{M}_n}^{(j)}\) is an orthogonal projector. For \(A \in \mathcal {X}^{(j)}_{\textrm{M}_n}\), clearly \(\mathcal {E}^{(j)}_{\textrm{M}_n}(A) = A\), so \(\mathcal {X}_{\textrm{M}_n}^{(j)} \subseteq \textrm{im}(\mathcal {E}^{(j)}_{\textrm{M}_n})\). Conversely, if \(A \in \textrm{im}(\mathcal {E}^{(j)}_{\textrm{M}_n})\), then \(A = \mathcal {E}^{(j)}_{\textrm{M}_n}(A)\), so \(\mathcal {U}_Q^{\otimes j}(A) = (\mathcal {U}_Q^{\otimes j} \circ \mathcal {E}^{(j)}_{\textrm{M}_n})(A) = \mathcal {E}^{(j)}_{\textrm{M}_n}(A) = A\) for all \(Q \in \textrm{O}(2n)\) by Eq. (45), i.e., \(A \in \mathcal {X}^{(j)}_{\textrm{M}_n}\).

The proof for \(\mathcal {E}^{(j)}_{\textrm{M}_n \cap \textrm{Cl}_n}\) is analogous, with

$$\begin{aligned} \mathcal {U}_Q^{\otimes j} \circ \mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)} = \mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)} = \mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)} \circ \mathcal {U}_Q^{\otimes j} \end{aligned}$$
(46)

for any \(Q \in \textrm{B}(2n)\) following from the fact that \(\textrm{B}(2n)\) is closed under left- and right-multiplication by any group element, and \((\mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)})^\dagger = \mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)}\) from the fact that \(\textrm{B}(2n)\) is closed under inverse. \(\square \)
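The mechanism behind Fact 2 (averaging a representation over a group yields the orthogonal projector onto the invariant subspace) can be checked numerically on a toy example. The sketch below is only an illustration under a simplifying assumption: it averages \(Q^{\otimes j}\) in the defining representation of the signed permutation group \(\textrm{B}(d)\), rather than the channels \(\mathcal {U}_Q^{\otimes j}\) themselves. The helper names are hypothetical.

```python
import itertools
import numpy as np


def signed_permutation_matrices(d):
    """All d x d signed permutation matrices (the group B(d))."""
    mats = []
    for perm in itertools.permutations(range(d)):
        for signs in itertools.product((1.0, -1.0), repeat=d):
            Q = np.zeros((d, d))
            for row, (col, s) in enumerate(zip(perm, signs)):
                Q[row, col] = s
            mats.append(Q)
    return mats


def twirl(mats, j):
    """Uniform average of the j-fold Kronecker power of Q over the group."""
    total = 0
    for Q in mats:
        term = Q
        for _ in range(j - 1):
            term = np.kron(term, Q)
        total = total + term
    return total / len(mats)


group = signed_permutation_matrices(2)
E = twirl(group, 2)
# Idempotent and symmetric, hence an orthogonal projector.
print(np.allclose(E @ E, E), np.allclose(E, E.T))
```

In this toy case the projector has trace 1: its image is spanned by \(e_1 \otimes e_1 + e_2 \otimes e_2\), which loosely mirrors the role of the vectors built from \(\gamma _S \otimes \gamma _S\) in the 2-fold twirl.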

Throughout the remainder of this section, we use \(\mathcal {E}^{(j)}\) to denote either \(\mathcal {E}_{\textrm{M}_n}^{(j)}\) or \(\mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)}\), for \(j \in \{1,2,3\}\). We have not yet proven that \(\mathcal {E}_{\textrm{M}_n}^{(j)} = \mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)}\) for \(j \in \{1,2,3\}\), but this notational simplification will be valid since any statement we make while evaluating \(\mathcal {E}^{(j)}\) will be patently true for both \(\mathcal {E}_{\textrm{M}_n}^{(j)}\) and \(\mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)}\). The expression we arrive at for \(\mathcal {E}^{(j)}\) will therefore be equal to both \(\mathcal {E}_{\textrm{M}_n}^{(j)}\) and \(\mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(j)}\). Equations (45) and (46) will be particularly useful; we subsume these as

$$\begin{aligned} \mathcal {U}_Q^{\otimes j} \circ \mathcal {E}^{(j)} = \mathcal {E}^{(j)} = \mathcal {E}^{(j)} \circ \mathcal {U}_Q^{\otimes j} \end{aligned}$$
(47)

for all \(Q \in \textrm{B}(2n)\).

We start by calculating the 2-fold twirl \(\mathcal {E}^{(2)}\), which can then be used to calculate \(\mathcal {E}^{(1)}\), since for any \(j > 1\),

$$\begin{aligned} \mathcal {E}^{(j-1)}(A) = \textrm{tr}_1\left[ \mathcal {E}^{(j)}\left( \frac{I}{2^n} \otimes A\right) \right] \end{aligned}$$
(48)

for all \(A \in \mathcal {L}(\mathcal {H}_n)^{\otimes (j-1)}\). We then sketch the proof for \(\mathcal {E}^{(3)}\), deferring the more technical parts to the appendix. Note that \(\mathcal {E}^{(2)}\) could be derived from \(\mathcal {E}^{(3)}\) using Eq. (48). However, we present a direct, self-contained proof for \(\mathcal {E}^{(2)}\), because the proof for \(\mathcal {E}^{(3)}\) uses similar ideas, but is a bit more technically involved.
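As a brief check of Eq. (48), write \(\mathcal {E}^{(j)}\) as the average of \(\mathcal {U}_Q^{\otimes j}\) over the relevant distribution (denoted \(\mathop {{}\mathbb {E}}_{Q}\) below, standing for either the Haar integral over \(\textrm{O}(2n)\) or the uniform average over \(\textrm{B}(2n)\)), and note that \(\mathcal {U}_Q(I) = U_Q^\dagger I U_Q = I\). Then, for any operator A on \(j-1\) copies,

$$\begin{aligned} \textrm{tr}_1\left[ \mathcal {E}^{(j)}\left( \frac{I}{2^n} \otimes A\right) \right]&= \textrm{tr}_1\left[ \mathop {{}\mathbb {E}}_{Q}\, \mathcal {U}_Q\left( \frac{I}{2^n}\right) \otimes \mathcal {U}_Q^{\otimes (j-1)}(A) \right] \\&= \textrm{tr}\left( \frac{I}{2^n}\right) \mathop {{}\mathbb {E}}_{Q}\, \mathcal {U}_Q^{\otimes (j-1)}(A) = \mathcal {E}^{(j-1)}(A). \end{aligned}$$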

4.1.1 The 2-fold twirl \(\mathcal {E}^{(2)}\)

As discussed above, we let \(\mathcal {E}^{(2)}\) denote \(\mathcal {E}_{\textrm{M}_n}^{(2)}\) or \(\mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(2)}\). By Fact 2, \(\mathcal {E}^{(2)}\) is an orthogonal projector, so we can determine it by finding its image. We consider the basis \(\{| \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle : S_1, S_2 \subseteq [2n]\}\) for \(\mathcal {L}(\mathcal {H}_n)^{\otimes 2}\). We start with a simple lemma that uses symmetry to preclude certain basis states from being in the image of \(\mathcal {E}^{(2)}\), and partly characterise the action of \(\mathcal {E}^{(2)}\) on the remaining basis states.

Lemma 1

Let \(\mathcal {E}^{(2)}\) be \(\mathcal {E}_{\textrm{M}_n}^{(2)}\) or \(\mathcal {E}_{\textrm{M}_n\cap \textrm{Cl}_n}^{(2)}\), defined as in Eqs. (25) and (26).

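\(\mathrm {(a)}\) For \(S_1, S_2 \subseteq [2n]\), \(\mathcal {E}^{(2)} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle \ne 0\) only if \(S_1 = S_2\).

\(\mathrm {(b)}\) \(\mathcal {E}^{(2)} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle = \mathcal {E}^{(2)} | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle \) for any \(S, S' \subseteq [2n]\) such that \(|S| = |S'|\).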

Proof

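Let \(S_1,S_2,S,S' \subseteq [2n]\).

\(\mathrm {(a)}\) If \(S_1 \ne S_2\), there must exist some index \(\mu \in [2n]\) such that \(\mu \in S_1\) and \(\mu \not \in S_2\), or some index \(\mu \in [2n]\) such that \(\mu \in S_2\) and \(\mu \not \in S_1\). In either case, let \(Q \in \textrm{B}(2n) \subset \textrm{O}(2n)\) be the reflection matrix such that \(\mathcal {U}_Q(\gamma _\mu ) = -\gamma _\mu \) and \(\mathcal {U}_Q(\gamma _\nu ) = \gamma _\nu \) for all \(\nu \ne \mu \). Then, \(\mathcal {U}_Q^{\otimes 2} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle = - | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle \) (using Fact 1(a)), since \(\mu \) belongs to exactly one of \(S_1\) and \(S_2\). Hence, by Eq. (47),

$$\begin{aligned} \mathcal {E}^{(2)} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle = \mathcal {E}^{(2)}\, \mathcal {U}_Q^{\otimes 2} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle = - \mathcal {E}^{(2)} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle , \end{aligned}$$

so \(\mathcal {E}^{(2)} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle = 0\).

\(\mathrm {(b)}\) Suppose \(|S| = |S'|\). Let \(Q' \in \textrm{B}(2n) \subset \textrm{O}(2n)\) be any permutation matrix such that \(\mathcal {U}_{Q'}(\gamma _S) = \gamma _{S'}\). (Specifically, if \(S = \{\mu _1,\dots , \mu _{|S|}\}\) with \(\mu _1< \dots <\mu _{|S|}\) and \(S' = \{\mu _1',\dots , \mu _{|S|}'\}\) with \(\mu _1'< \dots <\mu _{|S|}'\), take any permutation \(Q'\) that maps \(\mu _i \mapsto \mu _i'\) for each \(i \in [|S|]\); it is clear that such a permutation always exists.) Then, using Eq. (47),

$$\begin{aligned} \mathcal {E}^{(2)} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle = \mathcal {E}^{(2)}\, \mathcal {U}_{Q'}^{\otimes 2} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle = \mathcal {E}^{(2)} | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle . \end{aligned}$$

\(\square \)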

We now prove Theorem 1(ii).

Proof of Theorem 1(ii)

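Inserting resolutions of the identity [Eq. (17)] and using Lemma 1(a), we have

$$\begin{aligned} \mathcal {E}^{(2)} = \sum _{S, S' \subseteq [2n]} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle \, \langle \!\langle \gamma _{S} | \langle \!\langle \gamma _{S} |\, \mathcal {E}^{(2)}\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle \, \langle \!\langle \gamma _{S'} | \langle \!\langle \gamma _{S'} | \end{aligned}$$

(note that since \((\mathcal {E}^{(2)})^\dagger = \mathcal {E}^{(2)}\) by Fact 2, Lemma 1(a) also implies that \(\langle \!\langle \gamma _{S_1} | \langle \!\langle \gamma _{S_2} |\, \mathcal {E}^{(2)} = 0\) if \(S_1\ne S_2\)). It follows from Fact 1(d) that \(\mathcal {E}^{(2)}(\Gamma _k \otimes \Gamma _k) = \Gamma _k \otimes \Gamma _k\) for all \(k \in \{0,\dots , 2n\}\), so \(\langle \!\langle \gamma _{S} | \langle \!\langle \gamma _{S} |\, \mathcal {E}^{(2)}\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle \ne 0\) only if \(|S| = |S'|\). Hence, we can write

$$\begin{aligned} \mathcal {E}^{(2)} = \sum _{k=0}^{2n} \sum _{S, S' \in \binom{[2n]}{k}} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle \, \langle \!\langle \gamma _{S} | \langle \!\langle \gamma _{S} |\, \mathcal {E}^{(2)}\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle \, \langle \!\langle \gamma _{S'} | \langle \!\langle \gamma _{S'} |. \end{aligned}$$

Now, by Lemma 1(b), the coefficient \(\langle \!\langle \gamma _{S} | \langle \!\langle \gamma _{S} |\, \mathcal {E}^{(2)}\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle \) is the same for all pairs of subsets \(S,S'\) of the same cardinality k, i.e., for some number \(b'_k \in \mathbb {C}\), we have

$$\begin{aligned} \langle \!\langle \gamma _{S} | \langle \!\langle \gamma _{S} |\, \mathcal {E}^{(2)}\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle = b_k' \quad \text {for all } S, S' \in \binom{[2n]}{k}. \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {E}^{(2)} = \sum _{k=0}^{2n} b_k' \sum _{S, S' \in \binom{[2n]}{k}} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle \, \langle \!\langle \gamma _{S'} | \langle \!\langle \gamma _{S'} | = \sum _{k=0}^{2n} b_k\, | \Upsilon _k^{(2)} \rangle \!\rangle \langle \!\langle \Upsilon _k^{(2)} |, \end{aligned}$$

where \(| \Upsilon _k^{(2)} \rangle \!\rangle = \binom{2n}{k}^{-1/2} \sum _{S \in \binom{[2n]}{k}} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle \) is defined as in Eq. (28) and we rescale \(b_k'\) to \(b_k :=b_k'{2n\atopwithdelims ()k}\) to account for the normalisation of \(| \Upsilon _k^{(2)} \rangle \!\rangle \).

Since \(\mathcal {E}^{(2)}\) is a projector (Fact 2), each \(b_k\) must equal 0 or 1. To complete the proof, we show that \(b_k = 1\) for all \(k \in \{0,\dots , 2n\}\) by showing that \(\mathcal {U}_Q^{\otimes 2} | \Upsilon _k^{(2)} \rangle \!\rangle = | \Upsilon _k^{(2)} \rangle \!\rangle \) for all \(Q \in \textrm{O}(2n)\), so \(\mathcal {E}^{(2)} | \Upsilon _k^{(2)} \rangle \!\rangle = | \Upsilon _k^{(2)} \rangle \!\rangle \):

$$\begin{aligned} \mathcal {U}_Q^{\otimes 2} | \Upsilon _k^{(2)} \rangle \!\rangle&= \binom{2n}{k}^{-1/2} \sum _{S \in \binom{[2n]}{k}} \mathcal {U}_Q | \gamma _{S} \rangle \!\rangle \, \mathcal {U}_Q | \gamma _{S} \rangle \!\rangle \\&= \binom{2n}{k}^{-1/2} \sum _{S, S', S'' \in \binom{[2n]}{k}} \det (Q_{S,S'}) \det (Q_{S,S''})\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S''} \rangle \!\rangle \\&= \binom{2n}{k}^{-1/2} \sum _{S', S'' \in \binom{[2n]}{k}} \Bigg ( \sum _{S \in \binom{[2n]}{k}} \det ((Q^{\textrm{T}})_{S',S}) \det (Q_{S,S''}) \Bigg ) | \gamma _{S'} \rangle \!\rangle | \gamma _{S''} \rangle \!\rangle \\&= \binom{2n}{k}^{-1/2} \sum _{S', S'' \in \binom{[2n]}{k}} \det ((Q^{\textrm{T}} Q)_{S',S''})\, | \gamma _{S'} \rangle \!\rangle | \gamma _{S''} \rangle \!\rangle \\&= \binom{2n}{k}^{-1/2} \sum _{S' \in \binom{[2n]}{k}} | \gamma _{S'} \rangle \!\rangle | \gamma _{S'} \rangle \!\rangle = | \Upsilon _k^{(2)} \rangle \!\rangle , \end{aligned}$$

where we use Eq. (8) in the second line and the Cauchy-Binet formula in the fourth, noting that \(\det (Q_{S,S'}) = \det ((Q_{S,S'})^{\textrm{T}}) = \det ((Q^{\textrm{T}})_{S',S})\). \(\square \)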
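The Cauchy-Binet step can be checked numerically. The following sketch (with hypothetical helper names, using 0-based index sets) builds the matrix of all \(k \times k\) minors \(\det (Q_{S,S'})\) of a random orthogonal matrix and confirms that it is itself orthogonal, which is the identity \(\sum _S \det ((Q^{\textrm{T}})_{S',S})\det (Q_{S,S''}) = \delta _{S',S''}\) used above.

```python
import itertools
import numpy as np


def compound_matrix(Q, k):
    """Matrix of k x k minors det(Q[S, S']), with S, S' over all k-subsets."""
    d = Q.shape[0]
    subsets = list(itertools.combinations(range(d), k))
    C = np.empty((len(subsets), len(subsets)))
    for a, S in enumerate(subsets):
        for b, T in enumerate(subsets):
            C[a, b] = np.linalg.det(Q[np.ix_(S, T)])
    return C


def random_orthogonal(d, seed=0):
    """A random d x d orthogonal matrix from a QR decomposition."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return Q


Q = random_orthogonal(4)
for k in range(1, 5):
    C = compound_matrix(Q, k)
    # Cauchy-Binet: C_k(Q^T) C_k(Q) = C_k(Q^T Q) = identity
    print(k, np.allclose(C.T @ C, np.eye(C.shape[0])))
```

This brute-force construction takes time exponential in the matrix dimension and is meant only as a small-scale consistency check, not as part of the post-processing.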

4.1.2 The 1-fold twirl \(\mathcal {E}^{(1)}\)

Proof of Theorem 1(i)

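From Eq. (48),

$$\begin{aligned} \mathcal {E}^{(1)}(A) = \textrm{tr}_1\left[ \mathcal {E}^{(2)}\left( \frac{I}{2^n} \otimes A\right) \right] \end{aligned}$$

for any \(A \in \mathcal {L}(\mathcal {H}_n)\). Substituting in the expression for \(\mathcal {E}^{(2)}\) from Theorem 1(ii) and using Hilbert-Schmidt orthogonality of the \(\gamma _S\) (Eq. (16)), this evaluates to \(\textrm{tr}(A)\, I/2^n\), so \(\mathcal {E}^{(1)} = | \gamma _\varnothing \rangle \!\rangle \langle \!\langle \gamma _\varnothing |\), where \(\gamma _\varnothing = I\).

Alternatively, it is easily seen that \(\mathcal {E}^{(1)} | \gamma _S \rangle \!\rangle = 0\) for any \(S \ne \varnothing \) (take any \(\mu \in S\), and use Eq. (47) with Q the reflection that maps \(\gamma _\mu \mapsto -\gamma _\mu \) and \(\gamma _\nu \mapsto \gamma _\nu \) for all \(\nu \ne \mu \) to obtain \(\mathcal {E}^{(1)} | \gamma _S \rangle \!\rangle = \mathcal {E}^{(1)}\, \mathcal {U}_Q | \gamma _S \rangle \!\rangle = -\mathcal {E}^{(1)} | \gamma _S \rangle \!\rangle \)), while \(\mathcal {E}^{(1)} | \gamma _\varnothing \rangle \!\rangle = | \gamma _\varnothing \rangle \!\rangle \). \(\square \)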

4.1.3 The 3-fold twirl \(\mathcal {E}^{(3)}\)

The 3-fold twirl channel \(\mathcal {E}^{(3)}\) (which, as discussed above, represents \(\mathcal {E}^{(3)}_{\textrm{M}_n}\) or \(\mathcal {E}^{(3)}_{\textrm{M}_n \cap \textrm{Cl}_n}\)) can be calculated along the same lines as \(\mathcal {E}^{(2)}\). The following lemma is the analogue of Lemma 1, for \(\mathcal {E}^{(3)}\).

Lemma 2

Let \(\mathcal {E}^{(3)}\) be \(\mathcal {E}^{(3)}_{\textrm{M}_n}\) or \(\mathcal {E}^{(3)}_{\textrm{M}_n \cap \textrm{Cl}_n}\), defined as in Eqs. (25) and (26).

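\(\mathrm {(a)}\) For \(S_1,S_2,S_3 \subseteq [2n]\), \(\mathcal {E}^{(3)} | \gamma _{S_1} \rangle \!\rangle | \gamma _{S_2} \rangle \!\rangle | \gamma _{S_3} \rangle \!\rangle \ne 0\) only if \(S_1, S_2, S_3\) are of the form

$$\begin{aligned} S_1 = A_1 \cup A_2, \quad S_2 = A_2 \cup A_3, \quad S_3 = A_3 \cup A_1 \end{aligned}$$

for some mutually disjoint subsets \(A_1,A_2,A_3 \subseteq [2n]\).

\(\mathrm {(b)}\) \(\mathcal {E}^{(3)} | \gamma _{A_1}\gamma _{A_2} \rangle \!\rangle | \gamma _{A_2}\gamma _{A_3} \rangle \!\rangle | \gamma _{A_3}\gamma _{A_1} \rangle \!\rangle = \mathcal {E}^{(3)} | \gamma _{A_1'}\gamma _{A_2'} \rangle \!\rangle | \gamma _{A_2'}\gamma _{A_3'} \rangle \!\rangle | \gamma _{A_3'}\gamma _{A_1'} \rangle \!\rangle \) for any subsets \(A_1,A_2,A_3,A_1',A_2',A_3' \subseteq [2n]\) such that \(A_1,A_2,A_3\) are mutually disjoint, \(A_1',A_2',A_3'\) are mutually disjoint, and \(|A_i| = |A_i'|\) for all \(i \in \{1,2,3\}\).

Note that since Majorana operators anticommute, \(| \gamma _{A_1}\gamma _{A_2} \rangle \!\rangle \) may differ from \(| \gamma _{A_1 \cup A_2} \rangle \!\rangle \) by a minus sign. We provide the proof of Lemma 2, which uses symmetry arguments similar to those in Lemma 1, in Appendix C. This lemma should perhaps make the form of \(| \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle \) (defined in Eq. (29)) in the expression for \(\mathcal {E}^{(3)}\) somewhat more intuitive; it allows us to prove Theorem 1(iii) as follows.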

Proof sketch for Theorem 1(iii)

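Lemma 2(b) (together with \((\mathcal {E}^{(3)})^\dagger = \mathcal {E}^{(3)}\), from Fact 2) implies that for each triplet of integers \(k_1,k_2,k_3 \in \{0,\dots , 2n\}\) such that \(k_1 + k_2 + k_3 \le 2n\), there exists some number \(c_{k_1,k_2,k_3}' \in \mathbb {C}\) such that

$$\begin{aligned} \langle \!\langle \gamma _{A_1}\gamma _{A_2} | \langle \!\langle \gamma _{A_2}\gamma _{A_3} | \langle \!\langle \gamma _{A_3}\gamma _{A_1} |\, \mathcal {E}^{(3)}\, | \gamma _{A_1'}\gamma _{A_2'} \rangle \!\rangle | \gamma _{A_2'}\gamma _{A_3'} \rangle \!\rangle | \gamma _{A_3'}\gamma _{A_1'} \rangle \!\rangle = c_{k_1,k_2,k_3}' \end{aligned}$$
(49)

for any subsets \(A_1,A_2, A_3, A_1', A_2',A_3' \subseteq [2n]\) such that \(A_1, A_2, A_3\) are mutually disjoint, \(A_1', A_2', A_3'\) are mutually disjoint, and \(|A_i| = |A_i'| = k_i\) for all \(i \in \{1,2,3\}\). Inserting resolutions of the identity to the left and right of \(\mathcal {E}^{(3)}\), then using Lemma 2(a) and Eq. (49) gives

$$\begin{aligned} \mathcal {E}^{(3)} = \sum _{\begin{array}{c} k_1,k_2,k_3 \ge 0\\ k_1 + k_2 + k_3 \le 2n \end{array}} c_{k_1,k_2,k_3}' \sum _{\begin{array}{c} A_1,A_2,A_3 \text { disjoint}\\ |A_i| = k_i \end{array}}\; \sum _{\begin{array}{c} A_1',A_2',A_3' \text { disjoint}\\ |A_i'| = k_i \end{array}} | \gamma _{A_1}\gamma _{A_2} \rangle \!\rangle | \gamma _{A_2}\gamma _{A_3} \rangle \!\rangle | \gamma _{A_3}\gamma _{A_1} \rangle \!\rangle \, \langle \!\langle \gamma _{A_1'}\gamma _{A_2'} | \langle \!\langle \gamma _{A_2'}\gamma _{A_3'} | \langle \!\langle \gamma _{A_3'}\gamma _{A_1'} |, \end{aligned}$$

which we can rewrite as

$$\begin{aligned} \mathcal {E}^{(3)} = \sum _{\begin{array}{c} k_1,k_2,k_3 \ge 0\\ k_1 + k_2 + k_3 \le 2n \end{array}} c_{k_1,k_2,k_3}\, | \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle \langle \!\langle \Upsilon ^{(3)}_{k_1,k_2,k_3} | \end{aligned}$$

(with \(| \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle \) the normalised sum defined in Eq. (29)) by letting \(c_{k_1,k_2,k_3} :=c'_{k_1,k_2,k_3}{2n \atopwithdelims ()k_1,k_2,k_3, 2n-k_1-k_2-k_3}\). The fact that \(\mathcal {E}^{(3)}\) is a projector (Fact 2) implies that each \(c_{k_1,k_2,k_3}\) is either 0 or 1. In Appendix C, we show that \(\mathcal {E}^{(3)} | \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle = | \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle \) for each \(k_1,k_2, k_3\) appearing in the sum, by proving that \(\mathcal {U}_Q^{\otimes 3} | \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle = | \Upsilon ^{(3)}_{k_1,k_2,k_3} \rangle \!\rangle \) for all \(Q \in \textrm{O}(2n)\); it then follows that \(c_{k_1,k_2,k_3} = 1\). \(\square \)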

4.2 Classical shadows via matchgate circuits

In this subsection, we characterise the classical shadows arising from the uniform distribution over the continuous group \(\textrm{M}_n\) [Eq. (23)] of all matchgate circuits, and from the uniform distribution over the discrete group \(\textrm{M}_n \cap \textrm{Cl}_n\) [Eq. (24)] of matchgate circuits that are also in the Clifford group \(\textrm{Cl}_n\). To implement the classical shadows protocol, we must be able to sample unitaries from these distributions and implement them on a quantum computer; we discuss efficient methods for doing so in Appendix B.

With Theorem 1 in hand, we can straightforwardly find explicit expressions for the measurement channel \(\mathcal {M}\) [Eq. (18)] in the protocol as well as the variance \(\textrm{Var}[\hat{o}_i]\) [Eq. (20)] of the expectation value estimators \(\hat{o}_i\) obtained from the classical shadows, when either of the two distributions is used. Since Eqs. (18) and (20) depend on the distribution only through the 2- and 3-fold twirl channels, and Theorem 1 shows that the j-fold twirl channels for \(\textrm{M}_n\) and \(\textrm{M}_n \cap \textrm{Cl}_n\) coincide for \(j \in \{1,2,3\}\), it follows that the classical shadows measurement channel and the variances are exactly the same for the two distributions. Hence, we will use the same notation (\(\mathcal {M}\) and \(\textrm{Var}[\hat{o}_i]\)) for both distributions.

4.2.1 Measurement channel

First, we calculate the classical shadows measurement channel \(\mathcal {M}\) by substituting the expression for the 2-fold twirl from Theorem 1(ii) into Eq. (18).

Proof of Eq. (30)

By Eq. (18), the measurement channel \(\mathcal {M}\) associated with the uniform distribution over \(\textrm{M}_n\) or over \(\textrm{M}_n\cap \textrm{Cl}_n\) is given by

$$\begin{aligned} \mathcal {M}(A) = \textrm{tr}_1\left[ \sum _{b \in \{0,1\}^n} \mathcal {E}^{(2)}(| b \rangle \langle b |^{\otimes 2})(A \otimes I) \right] \end{aligned}$$

for all \(A \in \mathcal {L}(\mathcal {H}_n)\), where \(\mathcal {E}^{(2)} = \sum _{k=0}^{2n} | \Upsilon _k^{(2)} \rangle \!\rangle \langle \!\langle \Upsilon _k^{(2)} |\) with \(| \Upsilon _k^{(2)} \rangle \!\rangle = \binom{2n}{k}^{-1/2} \sum _{S \in \binom{[2n]}{k}} | \gamma _{S} \rangle \!\rangle | \gamma _{S} \rangle \!\rangle \), by Theorem 1(ii). We can simplify this by noting that \(\mathcal {E}^{(2)}(| b \rangle \langle b |^{\otimes 2}) = \mathcal {E}^{(2)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2})\) for all \(b\in \{0,1\}^n\), which follows from the fact that computational basis states \(| b \rangle \langle b |\) are all Gaussian states (see Sect. 2.1.3), and \(\mathcal {E}^{(2)} = \mathcal {E}^{(2)}_{\textrm{M}_n}\) is invariant under composition with \(\mathcal {U}_Q^{\otimes 2}\) for any Gaussian unitary channel \(\mathcal {U}_Q\). More explicitly, for each \(b \in \{0,1\}^n\), let \(Q_b \in \textrm{B}(2n)\) be the matrix such that \(\mathcal {U}_{Q_b}\) maps \(\gamma _{2j-1} \mapsto -\gamma _{2j-1}\) for every \(j \in [n]\) such that \(b_j = 1\), and leaves all the other \(\gamma _\mu \) unchanged. Then, \(\mathcal {U}_{Q_b}(| b \rangle \langle b |) = | \textbf{0} \rangle \langle \textbf{0} |\) from Eq. (6), so \(\mathcal {E}^{(2)}(| b \rangle \langle b |^{\otimes 2}) = (\mathcal {E}^{(2)}\circ \mathcal {U}_{Q_b}^{\otimes 2})(| b \rangle \langle b |^{\otimes 2}) = \mathcal {E}^{(2)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2})\) using Eq. (47). Hence,

$$\begin{aligned} \mathcal {M}(A) = 2^n \textrm{tr}_1\left[ \mathcal {E}^{(2)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2})(A\otimes I) \right] , \end{aligned}$$
(50)

so it remains to calculate \(\mathcal {E}^{(2)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2})\).

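For convenience, let \(| \textbf{0}^{\otimes 2} \rangle \!\rangle \) denote \(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2}\) in Liouville representation, so \(\mathcal {E}^{(2)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2})\) is represented by \(\sum _{k=0}^{2n} | \Upsilon _k^{(2)} \rangle \!\rangle \langle \!\langle \Upsilon _k^{(2)} | \textbf{0}^{\otimes 2} \rangle \!\rangle \), with \(| \Upsilon _k^{(2)} \rangle \!\rangle \) as in Eq. (28). From Eq. (6), we have \(| \textbf{0} \rangle \langle \textbf{0} | = \frac{1}{2^n}\sum _{T \subseteq [n]}\prod _{j \in T} (-i\gamma _{2j-1}\gamma _{2j})\), so we can expand \(| \textbf{0}^{\otimes 2} \rangle \!\rangle \) in the \(\gamma _S\) basis as

$$\begin{aligned} | \textbf{0}^{\otimes 2} \rangle \!\rangle = \frac{1}{2^n} \sum _{T, T' \subseteq [n]} (-i)^{|T| + |T'|}\, | \gamma _{\text {pairs}(T)} \rangle \!\rangle | \gamma _{\text {pairs}(T')} \rangle \!\rangle , \end{aligned}$$
(51)

where \(\text {pairs}(T) :=\bigcup _{j \in T}\{2j-1,2j\}\). From this, it is clear that \(\langle \!\langle \gamma _S | \langle \!\langle \gamma _S | \textbf{0}^{\otimes 2} \rangle \!\rangle \) is only nonzero for even-cardinality subsets S of the form \(\text {pairs}(T)\) for some \(T \subseteq [n]\); in such cases, \(\langle \!\langle \gamma _S | \langle \!\langle \gamma _S | \textbf{0}^{\otimes 2} \rangle \!\rangle = (-1)^{|S|/2}/2^n\). Thus, since exactly \({n\atopwithdelims ()\ell }\) subsets of cardinality \(2\ell \) are of this form,

$$\begin{aligned} \langle \!\langle \Upsilon _k^{(2)} | \textbf{0}^{\otimes 2} \rangle \!\rangle = {\left\{ \begin{array}{ll} \dfrac{(-1)^{\ell }}{2^n} \dbinom{2n}{2\ell }^{-1/2} \dbinom{n}{\ell } &{} \text {if } k = 2\ell \text { is even}, \\ 0 &{} \text {if } k \text { is odd}, \end{array}\right. } \end{aligned}$$

so we have

$$\begin{aligned} \mathcal {E}^{(2)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 2})&= \frac{1}{2^n} \sum _{\ell = 0}^n (-1)^{\ell } \binom{2n}{2\ell }^{-1} \binom{n}{\ell } \sum _{S \in \binom{[2n]}{2\ell }} | \gamma _S \rangle \!\rangle | \gamma _S \rangle \!\rangle \\&= \frac{1}{2^{2n}} \sum _{\ell = 0}^n \binom{2n}{2\ell }^{-1} \binom{n}{\ell } \sum _{S \in \binom{[2n]}{2\ell }} \gamma _S^\dagger \otimes \gamma _S, \end{aligned}$$

using \(\gamma _{S}^\dagger = (-1)^{|S|(|S|-1)/2}\gamma _S\) in the last line, which is a simple consequence of the fact that Majorana operators anticommute (and recalling that \(| \gamma _S \rangle \!\rangle = \gamma _S/\sqrt{2^n}\) due to normalisation). Inserting this into Eq. (50) gives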

$$\begin{aligned} \mathcal {M}(A)&= \frac{1}{2^n} \sum _{\ell = 0}^n {2n\atopwithdelims ()2\ell }^{-1} {n\atopwithdelims ()\ell }\sum _{S \in {[2n]\atopwithdelims ()2\ell }}\textrm{tr}(\gamma _S^\dagger A)\gamma _S \\&= \sum _{\ell = 0}^n {2n\atopwithdelims ()2\ell }^{-1} {n\atopwithdelims ()\ell }\mathcal {P}_{2\ell }(A), \end{aligned}$$

by definition of the projectors \(\mathcal {P}_k\) [Eq. (31)]. \(\square \)
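The ingredients used in this proof can be verified numerically for small n. The sketch below assumes the standard Jordan-Wigner convention \(\gamma _{2j-1} = Z^{\otimes (j-1)} X\), \(\gamma _{2j} = Z^{\otimes (j-1)} Y\) (an assumption here, since the representation of the Majorana operators is fixed elsewhere in the paper), and checks the expansion of \(| \textbf{0} \rangle \langle \textbf{0} |\) from Eq. (6), the adjoint rule for \(\gamma _S\), and Hilbert-Schmidt orthogonality. Function names are illustrative.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def kron_all(factors):
    """Kronecker product of a list of matrices, left to right."""
    out = np.eye(1, dtype=complex)
    for f in factors:
        out = np.kron(out, f)
    return out


def majoranas(n):
    """Jordan-Wigner Majoranas [gamma_1, ..., gamma_2n] (assumed convention)."""
    gammas = []
    for q in range(n):  # q is the 0-based qubit index
        prefix, suffix = [Z] * q, [I2] * (n - q - 1)
        gammas.append(kron_all(prefix + [X] + suffix))  # odd label 2q+1
        gammas.append(kron_all(prefix + [Y] + suffix))  # even label 2q+2
    return gammas


def gamma_S(gammas, S):
    """Product of gamma_mu over mu in S (1-based labels), in increasing order."""
    out = np.eye(gammas[0].shape[0], dtype=complex)
    for mu in sorted(S):
        out = out @ gammas[mu - 1]
    return out


def zero_projector_from_majoranas(n):
    """(1/2^n) * sum over T of prod_{j in T} (-i gamma_{2j-1} gamma_{2j})."""
    g = majoranas(n)
    total = np.zeros((2**n, 2**n), dtype=complex)
    for r in range(n + 1):
        for T in itertools.combinations(range(1, n + 1), r):
            pairs = [mu for j in T for mu in (2 * j - 1, 2 * j)]
            total += (-1j) ** len(T) * gamma_S(g, pairs)
    return total / 2**n
```

For example, `zero_projector_from_majoranas(2)` should agree with the \(4 \times 4\) matrix whose only nonzero entry is a 1 in the top-left corner, under the assumed convention.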

From Eq. (30), we see that \(\mathcal {M}\) maps \(\mathcal {L}(\mathcal {H}_n)\) onto the subspace \(\Gamma _{\text {even}} = \oplus _{\ell =0}^{n}\Gamma _{2\ell }\) of even operators. We denote the (pseudo)inverse of \(\mathcal {M}\) on this subspace by \(\mathcal {M}^{-1}: \Gamma _{\text {even}} \rightarrow \Gamma _{\text {even}}\), which clearly has the form given in Eq. (32).
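For concreteness, the eigenvalues of \(\mathcal {M}\) on each subspace \(\Gamma _{2\ell }\), read off from Eq. (30), and those of its pseudoinverse can be tabulated exactly. This is a minimal sketch with hypothetical function names; it only evaluates the binomial ratios and does not implement the channel itself.

```python
from fractions import Fraction
from math import comb


def channel_eigenvalues(n):
    """Eigenvalue of M on Gamma_{2l} for l = 0..n: C(n, l) / C(2n, 2l)."""
    return [Fraction(comb(n, l), comb(2 * n, 2 * l)) for l in range(n + 1)]


def inverse_channel_eigenvalues(n):
    """Eigenvalues of the pseudoinverse of M on the even subspace."""
    return [1 / lam for lam in channel_eigenvalues(n)]


print(channel_eigenvalues(2))          # eigenvalues 1, 1/3, 1
print(inverse_channel_eigenvalues(3))  # eigenvalues 1, 5, 5, 1
```

The inverse eigenvalue \({2n\atopwithdelims ()2\ell }{n\atopwithdelims ()\ell }^{-1}\) is the same quantity that appears as the variance bound for a product of \(2\ell \) Majorana operators quoted in Sect. 3.4.1.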

We now consider the consequences of the fact that the image of \(\mathcal {M}\) is \(\Gamma _{\text {even}}\) for the classical shadows protocol. Observe from Eq. (6) that for any computational basis state \(| b \rangle \), the corresponding density operator \(| b \rangle \langle b |\) is in \(\Gamma _{\text {even}}\). Then, since conjugation by any matchgate circuit leaves \(\Gamma _{\text {even}}\) invariant, the post-measurement state \(U_Q^\dagger | b \rangle \langle b |U_Q\) is an even operator for any \(b \in \{0,1\}^n\) and \(Q \in \textrm{O}(2n)\). Thus, for both distributions (\(\textrm{M}_n\) and \(\textrm{M}_n \cap \textrm{Cl}_n\)), the classical shadows \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\) are well-defined. As noted in Sect. 2.2, these classical shadows constitute unbiased estimates for the unknown state \(\rho \) if \(\rho \) is in the image of \(\mathcal {M}\); in our case, if \(\rho \in \Gamma _{\text {even}}\). More explicitly, defining the random variable \(\hat{\rho } = \mathcal {M}^{-1}(\hat{U}_Q^\dagger | \hat{b} \rangle \langle \hat{b} |\hat{U}_Q)\) as in Eq. (19) (with D taken to be either of our distributions over matchgate circuits), we have

$$\begin{aligned} \mathop {{}\mathbb {E}}[\hat{\rho }] = \mathcal {M}^{-1}(\mathcal {M}(\rho )) = \mathcal {P}_{\text {even}}(\rho ), \end{aligned}$$

where \(\mathcal {P}_{\text {even}} :=\sum _{\ell = 0}^{n} \mathcal {P}_{2\ell }\) is the projector onto \(\Gamma _{\text {even}}\). Thus, to obtain unbiased estimators \(\hat{o}_i :=\textrm{tr}(O_i\hat{\rho })\) for the expectation values of arbitrary observables \(O_i\) with respect to \(\rho \) using this classical shadows protocol, it suffices for \(\rho \) to be an even operator. However, it would also suffice for all of the observables \(O_i\) to be even (and \(\rho \) to be arbitrary), due to the Hilbert-Schmidt orthogonality of Majorana operators. In particular, \(\textrm{tr}(\mathcal {P}_{\textrm{even}}(A)B) = \textrm{tr}(A\mathcal {P}_{\textrm{even}}(B))\) for any \(A,B \in \mathcal {L}(\mathcal {H}_n)\), so if \(O_i \in \Gamma _{\text {even}}\), we have

$$\begin{aligned} \mathop {{}\mathbb {E}}[\hat{o}_i] = \textrm{tr}(O_i\mathcal {P}_{\text {even}}(\rho )) = \textrm{tr}( \mathcal {P}_{\text {even}}(O_i)\rho )= \textrm{tr}(O_i\rho ). \end{aligned}$$

Therefore, we require that either \(\rho \in \Gamma _{\text {even}}\), or \(O_i \in \Gamma _{\text {even}}\) for all i.
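The two sufficient conditions above can be illustrated numerically. The following is a minimal sketch, not part of the protocol itself: it assumes the Jordan-Wigner representation, in which the fermionic parity operator is \(Z^{\otimes n}\), and uses the standard identity that conjugation by parity flips the sign of odd Majorana products, so that \(\mathcal {P}_{\text {even}}(A) = \frac{1}{2}(A + PAP)\). The helper names (`parity_operator`, `project_even`, `random_state`) are our own illustrative choices.

```python
import numpy as np

def parity_operator(n):
    """Z on every qubit; under Jordan-Wigner (assumed here) this is fermionic parity."""
    Z = np.diag([1.0, -1.0]).astype(complex)
    P = np.array([[1.0 + 0j]])
    for _ in range(n):
        P = np.kron(P, Z)
    return P

def project_even(A, P):
    """Even part of A: conjugation by parity negates odd Majorana products."""
    return 0.5 * (A + P @ A @ P)

def random_state(dim, rng):
    """Random density matrix, generally with both even and odd components."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho)

n = 3
rng = np.random.default_rng(0)
P = parity_operator(n)
rho = random_state(2**n, rng)
H = rng.normal(size=(2**n, 2**n)) + 1j * rng.normal(size=(2**n, 2**n))
O_generic = H + H.conj().T            # arbitrary Hermitian observable
O_even = project_even(O_generic, P)   # its even part

# Self-adjointness of the projector: tr(P_even(A) B) = tr(A P_even(B)).
lhs = np.trace(project_even(rho, P) @ O_generic)
rhs = np.trace(rho @ project_even(O_generic, P))

# The shadow estimator averages to P_even(rho); the bias vanishes for even O.
expected_shadow = project_even(rho, P)
bias_even = np.trace(O_even @ expected_shadow) - np.trace(O_even @ rho)
bias_generic = np.trace(O_generic @ expected_shadow) - np.trace(O_generic @ rho)
```

Here `bias_even` vanishes up to rounding, while `bias_generic` is in general nonzero because neither \(\rho \) nor the observable is even.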

4.2.2 Variance

Having calculated \(\mathcal {M}^{-1}\), we now substitute it along with the expression for the 3-fold twirl from Theorem 1(iii) into Eq. (20) to obtain the variance bound Eq. (35), which holds for any \(O \in \Gamma _{\text {even}}\). Note that it suffices to consider the variance for even observables—even if we are in the case where \(\rho \in \Gamma _{\text {even}}\) while the observables \(O_i\) can be arbitrary, we have \(\hat{o}_i = \textrm{tr}(O_i \hat{\rho }) = \textrm{tr}( \mathcal {P}_{\text {even}}(O_i)\hat{\rho })\) since our classical shadow samples \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b | U_Q)\) are all even operators, so the variance for \(O_i\) is equal to the variance for its projection \(\mathcal {P}_{\text {even}}(O_i) \in \Gamma _{\text {even}}\) onto the even subspace.

Proof of Eq. (35)

The variance of the unbiased estimator \(\hat{o}\) for \(\textrm{tr}(O\rho )\) can be upper bounded as \(\textrm{Var}[\hat{o}] \le \mathop {{}\mathbb {E}}[|\hat{o}|^2]\), and from Eq. (20), we have

$$\begin{aligned} \mathop {{}\mathbb {E}}[|\hat{o}|^2] = \textrm{tr}\left[ \sum _{b \in \{0,1\}^n} \mathcal {E}^{(3)}(| b \rangle \langle b |^{\otimes 3}) \left( \rho \otimes \mathcal {M}^{-1}(O) \otimes \mathcal {M}^{-1}(O^\dagger )\right) \right] \end{aligned}$$

for \(O \in \Gamma _{\text {even}}\), where \(\mathcal {E}^{(3)} :=\mathcal {E}^{(3)}_{\textrm{M}_n} = \mathcal {E}^{(3)}_{\textrm{M}_n \cap \textrm{Cl}_n}\). Theorem 1(iii) expresses \(\mathcal {E}^{(3)}\) in terms of the quantities given by Eq. (29). Just as in the proof of Eq. (30), we first use Eq. (47) to infer that \(\mathcal {E}^{(3)}(| b \rangle \langle b |^{\otimes 3}) = \mathcal {E}^{(3)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 3})\) for all \(b \in \{0,1\}^n\), leading to

$$\begin{aligned} \mathop {{}\mathbb {E}}[|\hat{o}|^2] = 2^{n} \textrm{tr}\left[ \mathcal {E}^{(3)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 3})\left( \rho \otimes \mathcal {M}^{-1}(O) \otimes \mathcal {M}^{-1}(O^\dagger )\right) \right] . \end{aligned}$$
(52)

Next, we calculate , where . By Eq. (29),

where for brevity we write

$$\begin{aligned} {n\atopwithdelims ()k_1,k_2,k_3}' \equiv {n\atopwithdelims ()k_1,k_2,k_3, n-k_1-k_2-k_3} \end{aligned}$$
(53)

for the multinomial coefficient. By Eq. (51), is a linear combination of Majorana products \(\gamma _{\text {pairs}(T)}\) corresponding to subsets \(\text {pairs}(T) :=\bigcup _{j \in T}\{2j-1,2j\}\) consisting only of pairs of indices \(2j-1\) and 2j, so we see that can only be nonzero if \(A_1 \cup A_2 = \text {pairs}(T_1')\), \(A_2 \cup A_3 = \text {pairs}(T_2')\), and \(A_3 \cup A_1 = \text {pairs}(T_3')\) for some \(T_1',T_2', T_3' \subseteq [n]\). For mutually disjoint \(A_1,A_2,A_3\subseteq [2n]\), this condition is equivalent to \(A_1 = \text {pairs}(T_1)\), \(A_2 = \text {pairs}(T_2)\), and \(A_3 = \text {pairs}(T_3)\) for some mutually disjoint \(T_1,T_2, T_3 \subseteq [n]\), in which case (for \(i,j \in \{1,2,3\}\), \(i \ne j\)) by Eq. (51). Hence,

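from which we obtain

$$\begin{aligned} \mathcal {E}^{(3)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 3})&= \frac{1}{2^{3n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} (-1)^{\ell _1 + \ell _2 + \ell _3} \frac{{n\atopwithdelims ()\ell _1, \ell _2, \ell _3}'}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3}'} \\&\quad \times \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint}\\ |A_1| =2\ell _1, |A_2| = 2\ell _2, |A_3|= 2\ell _3 \end{array}} \gamma _{A_1}\gamma _{A_2}\otimes \gamma _{A_2}\gamma _{A_3} \otimes \gamma _{A_3}\gamma _{A_1}. \end{aligned}$$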

Now, note that we can change to any other Majorana basis \(\widetilde{\gamma }_\mu = \sum _{\nu =1}^{2n} Q_{\mu \nu }\gamma _\nu = \mathcal {U}_Q(\gamma _\mu )\) for \(Q \in \textrm{O}(2n)\), by applying \(\mathcal {U}_Q^{\otimes 3}\) to both sides and using \(\mathcal {U}_Q^{\otimes 3} \circ \mathcal {E}^{(3)} = \mathcal {E}^{(3)}\) from Eq. (47), yielding

$$\begin{aligned} \mathcal {E}^{(3)}(| \textbf{0} \rangle \langle \textbf{0} |^{\otimes 3})&= \frac{1}{2^{3n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} (-1)^{\ell _1 + \ell _2 + \ell _3} \frac{{n\atopwithdelims ()\ell _1, \ell _2, \ell _3}'}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3}'} \\&\quad \times \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint}\\ |A_1| =2\ell _1, |A_2| = 2\ell _2, |A_3|= 2\ell _3 \end{array}} \widetilde{\gamma }_{A_1}\widetilde{\gamma }_{A_2}\otimes \widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3} \otimes \widetilde{\gamma }_{A_3}\widetilde{\gamma }_{A_1}. \end{aligned}$$

Inserting this into Eq. (52) gives

$$\begin{aligned} \mathop {{}\mathbb {E}}[|\hat{o}|^2]&= \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} (-1)^{\ell _1 + \ell _2 + \ell _3}\frac{{n\atopwithdelims ()\ell _1, \ell _2, \ell _3}'}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3}'} \\&\quad \times \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint}\\ |A_1| =2\ell _1, |A_2| = 2\ell _2, |A_3|= 2\ell _3 \end{array}}\textrm{tr}\left( \widetilde{\gamma }_{A_1}\widetilde{\gamma }_{A_2}\rho \right) \textrm{tr}\left( \widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3}\mathcal {M}^{-1}(O)\right) \\&\quad \times \textrm{tr}\left( \widetilde{\gamma }_{A_3}\widetilde{\gamma }_{A_1}\mathcal {M}^{-1}(O^\dagger )\right) . \end{aligned}$$

Finally, we use Eq. (32) to write, for \(|A_2| = 2\ell _2\) and \(|A_3| = 2\ell _3\),

$$\begin{aligned} \textrm{tr}\left( \widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3}\mathcal {M}^{-1}(O)\right)&= \textrm{tr}\left( \mathcal {M}^{-1}(\widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3}) O\right) \\ {}&= {2n\atopwithdelims ()2\ell _2 + 2\ell _3}{n\atopwithdelims ()\ell _2 + \ell _3}^{-1}\textrm{tr}(\widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3}O), \end{aligned}$$

and similarly for \(\textrm{tr}(\widetilde{\gamma }_{A_3}\widetilde{\gamma }_{A_1}\mathcal {M}^{-1}(O^\dagger ))\). We thus arrive at Eq. (35):

$$\begin{aligned} \textrm{Var}[\hat{o}] \le \mathop {{}\mathbb {E}}[|\hat{o}|^2]&= \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} (-1)^{\ell _1 + \ell _2 + \ell _3}\frac{{n\atopwithdelims ()\ell _1, \ell _2, \ell _3}'}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3}'}\frac{{2n\atopwithdelims ()2(\ell _2 + \ell _3)}}{{n\atopwithdelims ()\ell _2 + \ell _3}}\frac{{2n\atopwithdelims ()2(\ell _3 + \ell _1)}}{{n\atopwithdelims ()\ell _3 + \ell _1}} \\&\qquad \times \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint}\\ |A_1| =2\ell _1, |A_2| = 2\ell _2, |A_3|= 2\ell _3 \end{array}}\textrm{tr}\left( \widetilde{\gamma }_{A_1}\widetilde{\gamma }_{A_2}\rho \right) \textrm{tr}\left( \widetilde{\gamma }_{A_2}\widetilde{\gamma }_{A_3}O\right) \textrm{tr}\left( \widetilde{\gamma }_{A_3}\widetilde{\gamma }_{A_1}O^\dagger \right) . \end{aligned}$$

\(\square \)

We further analyse this variance bound for particular observables of interest in Sect. 6.
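To get a feel for the combinatorial prefactor multiplying each term of the variance bound, it can be evaluated exactly for small n. The sketch below is illustrative only. It assumes that the primed multinomial symbol with upper entry m denotes the four-part multinomial coefficient whose last lower entry is m minus the sum of the other three, and the function names `multinomial_primed` and `variance_prefactor` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def multinomial_primed(m, k1, k2, k3):
    """Four-part multinomial m! / (k1! k2! k3! (m-k1-k2-k3)!) (assumed reading of the primed symbol)."""
    rest = m - k1 - k2 - k3
    if min(k1, k2, k3, rest) < 0:
        return 0
    return factorial(m) // (factorial(k1) * factorial(k2) * factorial(k3) * factorial(rest))

def variance_prefactor(n, l1, l2, l3):
    """Coefficient of the (l1, l2, l3) term in the variance bound, as an exact fraction.

    Requires l1 + l2 + l3 <= n, matching the range of the outer sum.
    """
    sign = (-1) ** (l1 + l2 + l3)
    twirl = Fraction(multinomial_primed(n, l1, l2, l3),
                     multinomial_primed(2 * n, 2 * l1, 2 * l2, 2 * l3))
    # The two factors contributed by the inverse measurement channel.
    inv_a = Fraction(comb(2 * n, 2 * (l2 + l3)), comb(n, l2 + l3))
    inv_b = Fraction(comb(2 * n, 2 * (l3 + l1)), comb(n, l3 + l1))
    return sign * twirl * inv_a * inv_b / 4**n

# Tabulate all prefactors for a two-mode system.
n = 2
table = {(l1, l2, l3): variance_prefactor(n, l1, l2, l3)
         for l1 in range(n + 1) for l2 in range(n + 1) for l3 in range(n + 1)
         if l1 + l2 + l3 <= n}
```

Exact rational arithmetic is used so that the alternating signs do not introduce rounding error when the terms are later combined.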

5 Efficient Post-processing of Matchgate Shadows

In this section, we start by proving Theorems 2 and 3 in Sects. 5.1 and 5.2, which give explicit expressions that can be used to efficiently evaluate fidelities with Gaussian states as well as overlaps with Slater determinants via matchgate shadows. In Sect. 5.3, we then present a general procedure for efficiently estimating the expectation values of a broader family of observables, including those that yield overlaps with arbitrary pure Gaussian states.

5.1 Estimating fidelities with fermionic Gaussian states

In this subsection, we state and prove the general form of Theorem 2 (stated for the special case of invertible \(C_{\varrho _1}\) in Sect. 3.3.2), which allows us to efficiently compute the expectation value \(\textrm{tr}(\varrho \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q))\) of any fermionic Gaussian density operator \(\varrho \) with respect to any matchgate shadow sample \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\) (see Eq. (38)). Recall that \(\textrm{tr}(\varrho \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q))\) gives an unbiased estimate of \(\textrm{tr}(\varrho \rho )\), allowing us to use our classical shadows to efficiently estimate the fidelities between the unknown state \(\rho \) and any Gaussian states. We will later bound the variance of these estimates in Sect. 6.1.

Theorem 2

(For arbitrary \(C_{\varrho _1}\)). Let \(\varrho _1\) and \(\varrho _2\) be density operators of n-mode fermionic Gaussian states (Eq. (10)), with covariance matrices \(C_{\varrho _1}\) and \(C_{\varrho _2}\) (Eq. (11)). Let 2r be the rank of \(C_{\varrho _1}\), and let \(Q_1\in \textrm{O}(2n)\) be any orthogonal matrix and \(C_{\varrho _1}'\) any invertible \(2r\times 2r\) matrix such that

$$\begin{aligned} C_{\varrho _1} = Q_1^{\textrm{T}} \begin{pmatrix} C_{\varrho _1}' &{}{0} \\ {0} &{}0 \end{pmatrix} Q_1. \end{aligned}$$
(54)

Then, for any \(\ell \in \{0,\dots , n\}\), \(\textrm{tr}(\varrho _1\mathcal {P}_{2\ell }(\varrho _2))\) is the coefficient of \(z^\ell \) in the polynomial

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z) = \frac{1}{2^{n}}\textrm{pf}\left( C_{\varrho _1}'\right) \textrm{pf}\left( -C_{\varrho _1}'^{-1} + z(Q_1 C_{\varrho _2} Q_1^{\textrm{T}})\Big |_{[2r]} \right) \end{aligned}$$
(55)

(where \(M\big |_{[2r]}\) denotes the matrix M restricted to the first 2r rows and columns).

Note from Eq. (13) that \(Q_1\) and \(C_{\varrho _1}'\) always exist. In the case where \(C_{\varrho _1}\) is invertible, we can take \(Q_1 = I\) and \(C_{\varrho _1}' = C_{\varrho _1}\), and Eq. (55) reduces to Eq. (39).

The polynomial \(p_{\varrho _1,\varrho _2}(z)\) has degree at most r. Its coefficients can be calculated via polynomial interpolation, which entails evaluating \(p_{\varrho _1,\varrho _2}(z)\) at \(r+1\) values of z. From Eq. (55), each evaluation involves computing the Pfaffians of two \(2r\times 2r\) matrices, which takes \(\mathcal {O}(r^3)\) time. Thus, the total runtime of this approach is \(\mathcal {O}(r^4)\). In Appendix D, we give an alternative procedure for computing all of the coefficients of \(p_{\varrho _1,\varrho _2}(z)\) in \(\mathcal {O}(r^3)\) time. Therefore, for any Gaussian state \(\varrho \) with covariance matrix \(C_{\varrho }\) of rank 2r, we can compute \(\textrm{tr}(\varrho \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b | U_Q))\) (for any matchgate circuit \(U_Q\) and computational basis state \(| b \rangle \)) in \(\mathcal {O}(r^3)\) time, by finding the coefficients in the polynomial

$$\begin{aligned} p_{\varrho , U_Q^\dagger | b \rangle \langle b |U_Q}(z) = \frac{1}{2^n}\textrm{pf}\left( C_{\varrho }'\right) \textrm{pf}\left( -C_{\varrho }'^{-1} + z (Q_1 Q^{\textrm{T}} C_{| b \rangle }Q Q_1^{\textrm{T}}) \Big |_{[2r]} \right) , \end{aligned}$$

then substituting the coefficient of \(z^\ell \) for \(\textrm{tr}(\varrho \mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q))\), for each \(\ell \in \{0,\dots , r\}\), into Eq. (38).
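The interpolation route described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the \(\mathcal {O}(r^3)\) procedure of Appendix D: it treats the invertible case (\(Q_1 = I\)), uses a naive recursive Pfaffian that is only suitable for small matrices, and picks the interpolation nodes \(0,1,\dots ,r\) arbitrarily. The helper names are our own.

```python
import numpy as np

def pfaffian(M):
    """Pfaffian by expansion along the first row (adequate for small matrices only)."""
    m = M.shape[0]
    if m == 0:
        return 1.0
    if m == 2:
        return M[0, 1]
    total = 0.0
    for j in range(1, m):
        keep = [k for k in range(m) if k not in (0, j)]
        total += (-1) ** (j + 1) * M[0, j] * pfaffian(M[np.ix_(keep, keep)])
    return total

def block_cov(lams):
    """Direct sum of 2x2 blocks [[0, lam], [-lam, 0]]."""
    n = len(lams)
    C = np.zeros((2 * n, 2 * n))
    for j, lam in enumerate(lams):
        C[2 * j, 2 * j + 1] = lam
        C[2 * j + 1, 2 * j] = -lam
    return C

def p_poly(z, C1, C2):
    """Evaluate the Pfaffian expression for invertible C1 (Q_1 = I)."""
    n = C1.shape[0] // 2
    return pfaffian(C1) * pfaffian(-np.linalg.inv(C1) + z * C2) / 2**n

def interpolate_coefficients(f, degree):
    """Recover the coefficients of a polynomial of known degree from degree+1 samples."""
    zs = np.arange(degree + 1, dtype=float)
    V = np.vander(zs, degree + 1, increasing=True)
    return np.linalg.solve(V, np.array([f(z) for z in zs]))

# Two modes, both states diagonal in the same Majorana basis.
C1 = block_cov([1.0, 0.5])
C2 = block_cov([1.0, -1.0])
coeffs = interpolate_coefficients(lambda z: p_poly(z, C1, C2), degree=2)
```

For this example the polynomial factorises as \(\frac{1}{4}(1+z)(1-\frac{1}{2}z)\), so the recovered coefficients are 0.25, 0.125 and -0.125, and their sum, 0.25, is the overlap \(\textrm{tr}(\varrho _1\varrho _2)\).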

We prove Theorem 2 (as well as Theorem 3, in the next subsection) by interpreting Majorana operators as generators of a Clifford algebra and using basic Clifford algebra properties, which we briefly summarise here. We refer the reader to Ref. [15] for the definition of a Clifford algebra as well as the proofs of the identities we use. An important property of a Clifford algebra is that it is graded: any element A of the algebra can be written as

$$\begin{aligned} A = \left\langle A \right\rangle _0 + \left\langle A \right\rangle _1 + \left\langle A \right\rangle _2 + \dots = \sum _{k \ge 0}\left\langle A \right\rangle _k, \end{aligned}$$

where \(\left\langle A \right\rangle _k\) is the grade-k or k-vector part of A. If \(A = \left\langle A \right\rangle _k\) for some nonnegative integer k, then A is said to be homogeneous of grade k and is called a k-vector. 0-vectors, 1-vectors, and 2-vectors are referred to as scalars, vectors, and bivectors, respectively. Scalars are identified with elements of the underlying field (in our case, \(\mathbb {C}\)), while for \(k \ge 1\), a k-vector is the sum of k-blades (or simple k-vectors), where A is a k-blade if and only if \(A = a_1a_2\dots a_k\) for some vectors \(a_1,a_2,\dots , a_k\) such that \(a_i a_j = -a_j a_i\) for all \(i,j \in [k]\) with \(i \ne j\). Every vector squares to a scalar, i.e., \(a^2 = \left\langle a^2 \right\rangle _0\) for any 1-vector a. The grade operation \(\left\langle \, \cdot \, \right\rangle _k\) has the properties \(\left\langle A + B \right\rangle _k = \left\langle A \right\rangle _k + \left\langle B \right\rangle _k\), \(\left\langle c A \right\rangle _k = c\left\langle A \right\rangle _k = \left\langle A \right\rangle _k c\) if \(c = \left\langle c \right\rangle _0\), and \(\left\langle \left\langle A \right\rangle _k \right\rangle _l = \delta _{kl} \left\langle A \right\rangle _k\). We will often make use of the following identity: for arbitrary elements A and B,

$$\begin{aligned} \left\langle AB \right\rangle _0 = \sum _{k}\left\langle \left\langle A \right\rangle _k\left\langle B \right\rangle _k \right\rangle _0. \end{aligned}$$
(56)

In our setting, the space \(\mathcal {L}(\mathcal {H}_n)\) of n-qubit operators can be viewed as a representation of a complex \(2^{2n}\)-dimensional Clifford algebra \(\mathcal {C}_{2n}\) (with the addition and multiplication operations of the algebra corresponding to operator addition and multiplication). For any set of Majorana operators \(\{\widetilde{\gamma }_\mu \}_{\mu \in [2n]}\), each \(\widetilde{\gamma }_\mu \) represents a 1-vector; then, since Majorana operators mutually anticommute, products of k Majorana operators, i.e., \(\widetilde{\gamma }_S\) for some \(S \in {[2n]\atopwithdelims ()k}\), represent k-blades. Thus, the grade-k part of any operator is its projection onto the subspace \(\Gamma _k\), and k-vectors are the elements of \(\Gamma _k\). In particular, any scalar c in the Clifford algebra is represented by a multiple of the identity operator: \(c \equiv cI\). Hence, using the same notation for Clifford algebra elements and their representations, we have

$$\begin{aligned} \textrm{tr}(A) = 2^{n} \left\langle A \right\rangle _0 \end{aligned}$$
(57)

for \(A \in \mathcal {L}(\mathcal {H}_n)\cong \mathcal {C}_{2n}\). Another useful observation is that for any k-vector \(A_k\), \(\gamma _{[2n]}A_k\) is of grade \(2n-k\), so for any element A, we have \(\left\langle \gamma _{[2n]}A \right\rangle _{2n} = \gamma _{[2n]}\left\langle A \right\rangle _0\). Since \(\gamma _{[2n]}^\dagger \gamma _{[2n]} = I \equiv 1\), we can multiply both sides by \(\gamma _{[2n]}^\dagger \) to obtain

$$\begin{aligned} \left\langle A \right\rangle _0 = \gamma _{[2n]}^\dagger \left\langle \gamma _{[2n]}A \right\rangle _{2n}. \end{aligned}$$
(58)

It will also be convenient to introduce notation for the wedge product on the Clifford algebra, which is defined in terms of the multiplication operation of the algebra and the grade operation as follows. For homogeneous elements \(A_k = \left\langle A_k \right\rangle _k\) and \(B_l = \left\langle B_l \right\rangle _l\), the wedge product is defined by

$$\begin{aligned} A_k \wedge B_l :=\left\langle A_k B_l \right\rangle _{k+l} \end{aligned}$$
(59)

Then, for arbitrary elements A and B,

$$\begin{aligned} A \wedge B :=\sum _k \sum _l \left\langle A \right\rangle _k \wedge \left\langle B \right\rangle _l. \end{aligned}$$
(60)

The wedge is associative, distributive with respect to addition, and antisymmetric under exchange of vectors. From its definition, we see for instance that \(\widetilde{\gamma }_{\mu _1} \dots \widetilde{\gamma }_{\mu _k} = \widetilde{\gamma }_{\mu _1} \wedge \dots \wedge \widetilde{\gamma }_{\mu _k}\), for any set of Majorana operators \(\{\widetilde{\gamma }_\mu \}_{\mu \in [2n]}\). We adopt the convention that in the absence of brackets, the wedge product is performed first (i.e., \(AB \wedge C = A (B \wedge C)\)).

Finally, we state the following fact, which expresses the coefficient of \(\gamma _{[2n]} = \gamma _1 \wedge \dots \wedge \gamma _{2n}\) in the nth wedge power of an arbitrary bivector \(B \in \Gamma _2\) (or more generally, the coefficient of \(e_1 \wedge \dots \wedge e_{2k} \) in \(B^{\wedge k}\), where B is a bivector in the subalgebra generated by arbitrary 1-vectors \(e_1,\dots , e_{2k}\)) in terms of a Pfaffian. This is easily proven using the definition of the Pfaffian and the antisymmetry of the wedge product.

Fact 3

Let \(e_1,\dots , e_{2k}\) be arbitrary 1-vectors. For any bivector \(B = \frac{1}{2} \sum _{\mu ,\nu =1}^{2k} M_{\mu \nu }e_\mu \wedge e_\nu \), where M is an antisymmetric \(2k \times 2k\) matrix, we have

$$\begin{aligned} B^{\wedge k} = k!\, \textrm{pf}(M) e_1 \wedge \dots \wedge e_{2k}. \end{aligned}$$
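As a quick illustration of Fact 3 (a worked example of our own, not part of the original argument), take \(k = 2\). Antisymmetry of both M and the wedge product lets us write \(B = \sum _{\mu < \nu } M_{\mu \nu } e_\mu \wedge e_\nu \), and only the three pairings of \(\{1,2,3,4\}\) into two disjoint pairs survive in \(B \wedge B\):

$$\begin{aligned} B \wedge B&= 2\left( M_{12}M_{34}\, e_1 \wedge e_2 \wedge e_3 \wedge e_4 + M_{13}M_{24}\, e_1 \wedge e_3 \wedge e_2 \wedge e_4 + M_{14}M_{23}\, e_1 \wedge e_4 \wedge e_2 \wedge e_3 \right) \\&= 2\left( M_{12}M_{34} - M_{13}M_{24} + M_{14}M_{23}\right) e_1 \wedge e_2 \wedge e_3 \wedge e_4 \\&= 2!\, \textrm{pf}(M)\, e_1 \wedge e_2 \wedge e_3 \wedge e_4. \end{aligned}$$

The sign of each term is fixed by the number of transpositions needed to reorder the vectors: one for the second term and two for the third.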

We are now ready to prove Theorem 2.

Proof of Theorem 2

Let \(\varrho _1 = \prod _{j=1}^n \frac{1}{2}\left( I - i\lambda _j\gamma '_{2j-1}\gamma '_{2j}\right) \) and \(\varrho _2 = \prod _{j=1}^n \frac{1}{2}\left( I - i\lambda _j''\gamma ''_{2j-1}\gamma ''_{2j}\right) \), where \(\lambda _j,\lambda _j'' \in [-1,1]\), and \(\{\gamma '_\mu \}_{\mu \in [2n]}\) and \(\{\gamma ''_{\mu }\}_{\mu \in [2n]}\) are sets of Majorana operators: \(\gamma _{\mu }' = U_{Q'}^\dagger \gamma _{\mu }U_{Q'}\) and \(\gamma _{\mu }'' = U_{Q''}^\dagger \gamma _{\mu } U_{Q''}\) for some \(Q',Q'' \in \textrm{O}(2n)\). To find \(\textrm{tr}(\varrho _1 \mathcal {P}_{2\ell }(\varrho _2))\), observe that if we expand the expression for \(\varrho _2\), the projector \(\mathcal {P}_{2\ell }\) picks out products of precisely \(\ell \) of the bivectors \(\lambda _j''\gamma _{2j-1}''\gamma _{2j}''\). Therefore, \(\textrm{tr}(\varrho _1\mathcal {P}_{2\ell }(\varrho _2))\) is the coefficient of \(z^\ell \) in

$$\begin{aligned} \textrm{tr}\left( \left[ \prod _{j=1}^n \frac{1}{2}\left( I - i\lambda _j\gamma '_{2j-1}\gamma '_{2j}\right) \right] \left[ \prod _{k=1}^n \frac{1}{2}\left( I - iz\lambda _k''\gamma ''_{2k-1}\gamma ''_{2k}\right) \right] \right) =:p_{\varrho _1,\varrho _2}(z). \end{aligned}$$
(61)

For convenience, we define \(\widetilde{\gamma }_\mu :=U_{Q'}\gamma _\mu '' U_{Q'}^\dagger \) and let \(\widetilde{\lambda }_j :=z \lambda _j''\), so that we can use the cyclic property of the trace to write \(p_{\varrho _1,\varrho _2}(z) = \textrm{tr}([\prod _j \frac{1}{2}(I - i\lambda _j\gamma _{2j-1}\gamma _{2j})][\prod _k \frac{1}{2} (I - i \widetilde{\lambda }_k \widetilde{\gamma }_{2k-1}\widetilde{\gamma }_{2k})])\). We will also use the following notation for the simple bivectors \(\gamma _{2j-1}\gamma _{2j}\) and \(\widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j}\), and products thereof: for any \(j \in [n]\) and \(T \subseteq [n]\),

$$\begin{aligned} \begin{aligned}&\beta _j :=\gamma _{2j-1}\gamma _{2j}, \qquad{} & {} \widetilde{\beta }_j :=\widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j}, \\&\beta _T :=\prod _{j\in T} \beta _j, \qquad{} & {} \widetilde{\beta }_T :=\prod _{j \in T} \widetilde{\beta }_j. \end{aligned} \end{aligned}$$
(62)

Note that \(\beta _T\) and \(\widetilde{\beta }_T\) are blades of grade 2|T|. Analogously, denote products of \(\lambda _j\) and \(\widetilde{\lambda }_j\) by

$$\begin{aligned} \lambda _T :=\prod _{j \in T} \lambda _j, \quad \widetilde{\lambda }_T :=\prod _{j \in T}\widetilde{\lambda }_j. \end{aligned}$$

Thus, we can rewrite Eq. (61) as

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \textrm{tr}\left( \left[ \prod _{j = 1}^n \frac{1}{2}(I - i\lambda _j \beta _j)\right] \left[ \prod _{j' = 1}^n\frac{1}{2} (I - i\widetilde{\lambda }_{j'}\widetilde{\beta }_{j'}) \right] \right) \\&= 2^n \left\langle \frac{1}{2^{2n}}\sum _{T \subseteq [n]}(-i)^{|T|} \lambda _T \beta _T \sum _{T'\subseteq [n]}(-i)^{|T'|}\widetilde{\lambda }_{T'}\widetilde{\beta }_{T'} \right\rangle _0, \end{aligned}$$

using Eq. (57) in the second equality.

To illustrate the main ideas, we first consider the case where \(\lambda _j \ne 0\) for all \(j \in [n]\) (i.e., \(C_{\varrho _1}\) is invertible) for simplicity, before extending the proof to the general case. Using the identities in Eqs. (56) and (58), we have

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \frac{1}{2^n} \sum _k \left\langle \left\langle \sum _{T \subseteq [n]}(-i)^{|T|} \lambda _T \beta _T \right\rangle _k \left\langle \sum _{T' \subseteq [n]}(-i)^{|T'|}\widetilde{\lambda }_{T'}\widetilde{\beta }_{T'} \right\rangle _k \right\rangle _0 \nonumber \\&= \frac{1}{2^n} \sum _{\ell =0}^n \left\langle \sum _{T \in {[n]\atopwithdelims ()\ell }} (-i)^\ell \lambda _T \beta _T \sum _{T' \in {[n]\atopwithdelims ()\ell }} (-i)^{\ell }\widetilde{\lambda }_{T'}\widetilde{\beta }_{T'} \right\rangle _0 \nonumber \\&= \frac{1}{2^n}\sum _{\ell =0}^n \gamma _{[2n]}^\dagger \left\langle \gamma _{[2n]}(-1)^\ell \sum _{T \in {[n]\atopwithdelims ()\ell }} \lambda _T \beta _T \sum _{T'\in {[n]\atopwithdelims ()\ell }}\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'} \right\rangle _{2n}, \end{aligned}$$
(63)

where the second line follows from the fact that \(\beta _T\) and \(\widetilde{\beta }_T\) are of grade 2|T|. Noting that \(\gamma _{[2n]} = \beta _{[n]}\), we have

$$\begin{aligned} \gamma _{[2n]} \beta _T = (-1)^{|T|} \beta _{[n] \setminus T}, \end{aligned}$$
(64)

since the \(\beta _j\) commute and \(\beta _j^2 = -1\). Also, \(\lambda _{T} = \lambda _{[n]}/\lambda _{[n]{\setminus } T}\), so relabelling \(T \rightarrow [n] {\setminus } T\) in the second sum gives

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \frac{1}{2^n} \lambda _{[n]} \gamma _{[2n]}^\dagger \sum _{\ell = 0}^n \left\langle \sum _{T \in {[n]\atopwithdelims ()n -\ell }} \frac{1}{\lambda _T} \beta _T \sum _{T' \in {[n]\atopwithdelims ()\ell }}\widetilde{\lambda }_{T'}\widetilde{\beta }_{T'} \right\rangle _{2n} \end{aligned}$$
(65)
$$\begin{aligned}&= \frac{1}{2^n}\lambda _{[n]}\gamma _{[2n]}^\dagger \sum _{\ell =0}^n \sum _{T \in {[n]\atopwithdelims ()n-\ell }} \sum _{T' \in {[n]\atopwithdelims ()\ell }} \left( \frac{1}{\lambda _T} \beta _T\right) \wedge (\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'}) \end{aligned}$$
(66)

by definition of the wedge product (Eq. (59)). Now, since any two bivectors commute with respect to the wedge product and \(\beta _j \wedge \beta _j = \widetilde{\beta }_j \wedge \widetilde{\beta }_j = 0\), it follows from the multinomial formula that

$$\begin{aligned} \left( \sum _{j =1}^n \frac{1}{\lambda _j} \beta _j + \sum _{j = 1}^{n} \widetilde{\lambda }_j \widetilde{\beta }_j \right) ^{\wedge n} =n! \sum _{\ell =0}^n \sum _{T \in {[n]\atopwithdelims ()n-\ell }} \sum _{T' \in {[n]\atopwithdelims ()\ell }} \left( \frac{1}{\lambda _T} \beta _T\right) \wedge (\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'}). \end{aligned}$$
(67)

Finally, we write the bivector on the LHS as a linear combination of \(\gamma _\mu \gamma _\nu \), so that we can apply Fact 3. Letting

$$\begin{aligned} \Lambda _1 :=\bigoplus _{j=1}^n \begin{pmatrix} 0 &{}\lambda _j \\ -\lambda _j &{}0 \end{pmatrix}, \qquad \Lambda _2 :=\bigoplus _{j=1}^n \begin{pmatrix} 0 &{}\lambda _j'' \\ -\lambda _j'' &{}0 \end{pmatrix}, \end{aligned}$$

we have (recalling \(\widetilde{\lambda }_j = z\lambda _j''\))

$$\begin{aligned} \left( \sum _{j =1}^n \frac{1}{\lambda _j} \beta _j + \sum _{j = 1}^{n} \widetilde{\lambda }_j \widetilde{\beta }_j\right) ^{\wedge n}&= \left( \frac{1}{2}\sum _{\mu , \nu \in [2n]} (-\Lambda _1^{-1} + zQ' Q''^{\textrm{T}} \Lambda _2 Q'' Q'^{\textrm{T}})_{\mu \nu } \gamma _\mu \wedge \gamma _\nu \right) ^{\wedge n} \\&= n! \textrm{pf}(-\Lambda _1^{-1} + zQ'Q''^{\textrm{T}} \Lambda _2 Q''Q'^{\textrm{T}}) \gamma _{[2n]} \end{aligned}$$

using Fact 3. We also have \(\lambda _{[n]} = \textrm{pf}(\Lambda _1)\), where \(\lambda _{[n]} = \prod _{j=1}^n \lambda _j\) is the prefactor appearing in Eq. (66). Inserting these into Eq. (66) and noting that \(Q'^{\textrm{T}} \Lambda _1 Q' = C_{\varrho _1}\) and \(Q''^{\textrm{T}} \Lambda _2 Q'' = C_{\varrho _2}\), we arrive at

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \frac{1}{2^n} \textrm{pf}\left( \Lambda _1\right) \textrm{pf}\left( -\Lambda _1^{-1} + zQ' Q''^{\textrm{T}} \Lambda _2 Q'' Q'^{\textrm{T}}\right) \gamma _{[2n]}^\dagger \gamma _{[2n]} \\&= \frac{1}{2^n} \textrm{pf}\left( Q'C_{\varrho _1}Q'^{\textrm{T}}\right) \textrm{pf}\left( Q'(-C_{\varrho _1}^{-1} + z C_{\varrho _2})Q'^{\textrm{T}} \right) \end{aligned}$$

for invertible \(C_{\varrho _1}\). We can then write this as \(p_{\varrho _1,\varrho _2}(z) = 2^{-n}\textrm{pf}(C_{\varrho _1}') \textrm{pf}(-C_{\varrho _1}'^{-1} + zQ_1 C_{\varrho _2}Q_1^{\textrm{T}})\) for any \(C_{\varrho _1}' = Q_1 C_{\varrho _1} Q_1^{\textrm{T}}\) with \(Q_1 \in \textrm{O}(2n)\) using the Pfaffian identity

$$\begin{aligned} \textrm{pf}(BAB^{\textrm{T}}) = \det (B) \textrm{pf}(A). \end{aligned}$$
(68)

For the general case, let \(J :=\{j\in [n]: \lambda _j \ne 0\}\). From Eq. (13), \(|J| = r\), where 2r is the rank of \(C_{\varrho _1}\). The equations up to Eq. (63) still hold, except we can sum over only subsets T such that \(\lambda _T \ne 0\)—that is, subsets \(T \subseteq J\):

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z) = \frac{1}{2^n} \sum _{\ell =0}^{|J|} \gamma _{[2n]}^\dagger \left\langle \gamma _{[2n]}(-1)^\ell \sum _{T \in {J\atopwithdelims ()\ell }} \lambda _T \beta _T \sum _{T' \in {[n]\atopwithdelims ()\ell }} \widetilde{\lambda }_{T'} \widetilde{\beta }_{T'} \right\rangle _{2n}. \end{aligned}$$

Next, instead of using Eq. (64), we write \(\gamma _{[2n]} = \beta _{[n] \setminus J}\beta _J\) and observe that \(\beta _J \beta _T = (-1)^{|T|} \beta _{J{\setminus } T}\) for any \(T \subseteq J\). Also, \(\lambda _T = \lambda _J/\lambda _{J{\setminus } T}\) for \(T \subseteq J\), so Eqs. (65) and (66) change to

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \frac{1}{2^n} \lambda _J \gamma _{[2n]}^\dagger \sum _{\ell = 0}^{|J|} \left\langle \beta _{[n] \setminus J}\sum _{T \in {J\atopwithdelims ()|J| -\ell }} \frac{1}{\lambda _T} \beta _T \sum _{T' \in {[n]\atopwithdelims ()\ell }}\widetilde{\lambda }_{T'}\widetilde{\beta }_{T'} \right\rangle _{2n} \nonumber \\&= \frac{1}{2^n}\lambda _J\gamma _{[2n]}^\dagger \beta _{[n]\setminus J} \wedge \sum _{\ell =0}^{|J|} \sum _{T \in {J\atopwithdelims ()|J|-\ell }} \sum _{T' \in {[n]\atopwithdelims ()\ell }} \left( \frac{1}{\lambda _T} \beta _T\right) \wedge (\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'}), \end{aligned}$$
(69)

again using Eq. (59) to obtain the second equality. The generalisation of Eq. (67) is

$$\begin{aligned} \left( \sum _{j \in J} \frac{1}{\lambda _j} \beta _j + \sum _{j = 1}^{n} \widetilde{\lambda }_j \widetilde{\beta }_j \right) ^{\wedge |J|} =(|J|)! \sum _{\ell =0}^{|J|} \sum _{T \in {J\atopwithdelims ()|J|-\ell }} \sum _{T' \in {[n]\atopwithdelims ()\ell }} \left( \frac{1}{\lambda _T} \beta _T\right) \wedge (\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'}), \end{aligned}$$

and we can write the bivector on the LHS as \(\frac{1}{2}\sum _{\mu ,\nu \in \text {pairs}(J)} (-\Lambda _{1}^{-1})_{\mu \nu } \gamma _\mu \gamma _\nu + \frac{1}{2}\sum _{\mu ,\nu \in [2n]} z(Q'Q''^{\textrm{T}}\Lambda _2 Q''Q'^{\textrm{T}})_{\mu \nu }\gamma _\mu \gamma _\nu \) (where \(\text {pairs}(J) :=\bigcup _{j \in J}\{2j-1,2j\}\)). Substituting this into Eq. (69), we have

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \frac{1}{2^n} \lambda _J \gamma _{[2n]}^\dagger \beta _{[n] \setminus J} \wedge \frac{1}{(|J|)!} \\&\quad \left( -\frac{1}{2}\sum _{\mu ,\nu \in \text {pairs}(J)} (\Lambda _{1}^{-1})_{\mu \nu } \gamma _\mu \gamma _\nu + \frac{1}{2}z \sum _{\mu ,\nu \in [2n]} (Q'Q''^{\textrm{T}}\Lambda _2 Q''Q'^{\textrm{T}})_{\mu \nu }\gamma _\mu \gamma _\nu \right) ^{\wedge |J|}. \end{aligned}$$

Now, the key additional step for this general case is to observe that \(\beta _{[n]{\setminus } J} \wedge (\gamma _\mu \gamma _\nu ) = \beta _{[n]{\setminus } J} \wedge \gamma _\mu \wedge \gamma _\nu \) is equal to zero whenever \(\mu \) and/or \(\nu \) are in \(\text {pairs}([n]\setminus J)\), due to the antisymmetry of the wedge product. This implies that we can restrict the sum in the second term to \(\mu , \nu \in \text {pairs}(J)\) as well. We can then apply Fact 3 to obtain

$$\begin{aligned} p_{\varrho _1,\varrho _2}(z)&= \frac{1}{2^n} \lambda _J \gamma _{[2n]}^\dagger \beta _{[n]\setminus J} \wedge \left[ \textrm{pf}\left( -\Lambda _1^{-1} \Big |_{\text {pairs}(J)} + zQ'Q''^{\textrm{T}} \Lambda _2 Q''Q'^{\textrm{T}}\Big |_{\text {pairs}(J)}\right) \gamma _{\text {pairs}(J)} \right] \\&= \frac{1}{2^n} \textrm{pf}\left( \Lambda _1\Big |_{\text {pairs}(J)}\right) \textrm{pf}\left( -\Lambda _1^{-1} \Big |_{\text {pairs}(J)} + zQ'C_{\varrho _2}Q'^{\textrm{T}}\Big |_{\text {pairs}(J)}\right) , \end{aligned}$$

since \(\beta _{[n] {\setminus } J} \wedge \gamma _{\text {pairs}(J)} = \beta _{[n]{\setminus } J}\wedge \beta _J = \gamma _{[2n]}\). Theorem 2 follows by noting that any \(C_{\varrho _1}'\) satisfying Eq. (54) is related to \(\Lambda _1\big |_{\text {pairs}(J)}\) by an orthogonal basis change, and using the Pfaffian identity in Eq. (68). \(\square \)
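Theorem 2 can be checked numerically for small n by comparing the trace in Eq. (61) with the Pfaffian expression in Eq. (55). The sketch below is our own illustration and makes several assumptions: it uses a Jordan-Wigner representation of the Majorana operators, builds the rotated operators directly as \(\sum _\nu Q_{\mu \nu }\gamma _\nu \) instead of constructing \(U_Q\), orders the eigenvalues of \(\varrho _1\) so that the vanishing one comes last (so that \(Q_1 = Q'\) and \(C_{\varrho _1}' = \Lambda _1\big |_{[2r]}\) satisfy Eq. (54)), and relies on a naive recursive Pfaffian. All function names are hypothetical.

```python
import numpy as np

def pfaffian(M):
    """Pfaffian by expansion along the first row (adequate for small matrices only)."""
    m = M.shape[0]
    if m == 0:
        return 1.0
    if m == 2:
        return M[0, 1]
    total = 0.0
    for j in range(1, m):
        keep = [k for k in range(m) if k not in (0, j)]
        total += (-1) ** (j + 1) * M[0, j] * pfaffian(M[np.ix_(keep, keep)])
    return total

def majoranas(n):
    """Jordan-Wigner Majorana operators gamma_1, ..., gamma_2n (assumed convention)."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.diag([1, -1]).astype(complex)
    I2 = np.eye(2, dtype=complex)
    out = []
    for j in range(n):
        for pauli in (X, Y):
            op = np.eye(1, dtype=complex)
            for factor in [Z] * j + [pauli] + [I2] * (n - j - 1):
                op = np.kron(op, factor)
            out.append(op)
    return out

def block_cov(lams):
    """Direct sum of 2x2 blocks [[0, lam], [-lam, 0]]."""
    C = np.zeros((2 * len(lams), 2 * len(lams)))
    for j, lam in enumerate(lams):
        C[2 * j, 2 * j + 1], C[2 * j + 1, 2 * j] = lam, -lam
    return C

def gaussian_state(lams, Q, gammas, z=1.0):
    """prod_j (I - i z lam_j g_{2j-1} g_{2j}) / 2 with g_mu = sum_nu Q[mu, nu] gamma_nu."""
    dim = gammas[0].shape[0]
    g = [sum(Q[mu, nu] * gammas[nu] for nu in range(len(gammas))) for mu in range(len(gammas))]
    rho = np.eye(dim, dtype=complex)
    for j, lam in enumerate(lams):
        rho = rho @ (0.5 * (np.eye(dim) - 1j * z * lam * g[2 * j] @ g[2 * j + 1]))
    return rho

n, r = 3, 2
rng = np.random.default_rng(1)
gammas = majoranas(n)
Q1, _ = np.linalg.qr(rng.normal(size=(2 * n, 2 * n)))   # random element of O(2n)
Q2, _ = np.linalg.qr(rng.normal(size=(2 * n, 2 * n)))
lams1, lams2 = [0.7, -0.4, 0.0], [0.9, 0.3, -0.6]        # rank of C_1 is 2r = 4

rho1 = gaussian_state(lams1, Q1, gammas)
C2 = Q2.T @ block_cov(lams2) @ Q2
C1_prime = block_cov(lams1)[:2 * r, :2 * r]
C2_restricted = (Q1 @ C2 @ Q1.T)[:2 * r, :2 * r]

def p_trace(z):
    """Left-hand side: the trace defining the polynomial."""
    return np.trace(rho1 @ gaussian_state(lams2, Q2, gammas, z)).real

def p_pfaffian(z):
    """Right-hand side: the Pfaffian expression restricted to the first 2r rows and columns."""
    return pfaffian(C1_prime) * pfaffian(-np.linalg.inv(C1_prime) + z * C2_restricted) / 2**n

max_gap = max(abs(p_trace(z) - p_pfaffian(z)) for z in (-1.0, 0.0, 0.5, 1.0, 2.0))
```

Agreement at more than \(r+1\) distinct values of z implies that the two degree-r polynomials coincide, so `max_gap` being at the level of rounding error is consistent with the theorem for this instance; setting \(z = 1\) gives the overlap \(\textrm{tr}(\varrho _1\varrho _2)\).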

As a side remark, taking \(z = 1\) in Theorem 2 yields the general formula

$$\begin{aligned} \textrm{tr}(\varrho _1 \varrho _2) = \frac{1}{2^n} \textrm{pf}(C_{\varrho _1}')\textrm{pf}\left( -C_{\varrho _1}'^{-1} + (Q_1 C_{\varrho _2}Q_1^{\textrm{T}})\Big |_{[2r]}\right) \end{aligned}$$
(70)

for the overlap between two arbitrary Gaussian states (since \(p_{\varrho _1,\varrho _2}(1) = \sum _{\ell =0}^{n} \textrm{tr}(\varrho _1\mathcal {P}_{2\ell }(\varrho _2))\) and \(\sum _{\ell =0}^{n}\mathcal {P}_{2\ell }(\varrho _2) = \mathcal {P}_{\text {even}}(\varrho _2) = \varrho _2\)). A special case of this formula, where \(\varrho _1\) and \(\varrho _2\) are both pure, is given in Ref. [16].

5.2 Estimating overlaps with Slater determinants

In this subsection, we prove Theorem 3. As discussed in Sect. 3.3.3, Theorem 3 provides a method for efficiently estimating the expectation value of \(| \varphi \rangle \langle \textbf{0} |\), where \(| \varphi \rangle \) is an arbitrary Slater determinant, which in turn allows us to estimate the overlap of a pure state with \(| \varphi \rangle \).

The proof of Theorem 3 uses some of the ideas in the proof of Theorem 2 (in the case where \(C_{\varrho _1}\) is not invertible). The basic Clifford algebra properties we use in the proof are summarised in the previous subsection.

Proof of Theorem 3

Let \(\varrho = \prod _{j=1}^n \frac{1}{2}(I - i\lambda _j\gamma '_{2j-1}\gamma '_{2j})\), where \(\lambda _j \in [-1,1]\) and \(\gamma _{\mu }' = U_{Q'}^\dagger \gamma _\mu U_{Q'}\) for some \(Q' \in \textrm{O}(2n)\). By the same reasoning as in the proof of Theorem 2, \(\textrm{tr}(| \varphi \rangle \langle \textbf{0} |\mathcal {P}_{2\ell }(\varrho ))\) is the coefficient of \(z^\ell \) in

$$\begin{aligned} \textrm{tr}\left( | \varphi \rangle \langle \textbf{0} | \left[ \prod _{j=1}^n \frac{1}{2}(I - iz\lambda _j \gamma _{2j-1}'\gamma _{2j}') \right] \right) =:q_{| \varphi \rangle ,\varrho }(z). \end{aligned}$$

With \(\widetilde{Q}\) the orthogonal matrix defined in Eq. (15), we can write \(| \varphi \rangle \langle \textbf{0} |\) as

$$\begin{aligned} | \varphi \rangle \langle \textbf{0} |&= \widetilde{a}_1^\dagger \dots \widetilde{a}_\zeta ^\dagger | \textbf{0} \rangle \langle \textbf{0} | = U_{\widetilde{Q}}^\dagger a_1^\dagger \dots a_\zeta ^\dagger U_{\widetilde{Q}} | \textbf{0} \rangle \langle \textbf{0} | = U_{\widetilde{Q}}^\dagger a_1^\dagger \dots a_\zeta ^\dagger | \textbf{0} \rangle \langle \textbf{0} |U_{\widetilde{Q}}, \end{aligned}$$
(71)

where in the last equality, we use the fact that \(U_{\widetilde{Q}}\) conserves the number operator (\(\sum _{j=1}^n \widetilde{a}_j^\dagger \widetilde{a}_j = \sum _{j=1}^n a_j^\dagger a_j\)), so \(U_{\widetilde{Q}}| \textbf{0} \rangle \propto | \textbf{0} \rangle \). Hence, defining \(\widetilde{\gamma }_\mu :=U_{\widetilde{Q}}\gamma _{\mu }' U_{\widetilde{Q}}^\dagger \) and \(\widetilde{\lambda }_j :=z\lambda _j\), we can use the cyclic property of the trace along with Eq. (6) to obtain

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z) = \textrm{tr}\left( a_1^\dagger \dots a_\zeta ^\dagger \left[ \prod _{j=1}^n\frac{1}{2}(I - i\gamma _{2j-1}\gamma _{2j}) \right] \left[ \prod _{j'=1}^n \frac{1}{2}(I - i\widetilde{\lambda }_{j'} \widetilde{\gamma }_{2j'-1}\widetilde{\gamma }_{2j'})\right] \right) . \end{aligned}$$

We will use the notation \(\beta _j\), \(\widetilde{\beta }_j\), \(\beta _T\), \(\widetilde{\beta }_T\) defined in Eq. (62), as well as \(\widetilde{\lambda }_T :=\prod _{j \in T}\widetilde{\lambda }_j\). Note that \(a_j^\dagger \) commutes with \(\frac{1}{2}(I- i\beta _k)\) for all \(k \ne j\) while \(a_j^\dagger \frac{1}{2}(I - i\beta _j) = a_j^\dagger (I - a_j^\dagger a_j) = a_j^\dagger \). Thus,

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z)&= \textrm{tr}\left( a_1^\dagger \dots a_\zeta ^\dagger \left[ \prod _{j=\zeta + 1}^n \frac{1}{2}(I - i\gamma _{2j-1}\gamma _{2j}) \right] \left[ \prod _{j'=1}^n \frac{1}{2}(I - i\widetilde{\lambda }_{j'} \widetilde{\gamma }_{2j'-1}\widetilde{\gamma }_{2j'})\right] \right) \\&= \frac{1}{2^{n-\zeta }}\sum _k \left\langle \left\langle a_1^\dagger \dots a_\zeta ^\dagger \sum _{T \subseteq [\zeta + 1:n]} (-i)^{|T|} \beta _T \right\rangle _{k} \left\langle \sum _{T' \subseteq [n]}(-i)^{|T'|}\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'} \right\rangle _k \right\rangle _0, \end{aligned}$$

where \([\zeta + 1:n] :=\{\zeta + 1, \dots , n\}\) and we apply Eqs. (57) and (56) in the second line. Since \(a_1^\dagger , \dots , a_\zeta ^\dagger , \gamma _{2\zeta + 1},\dots , \gamma _{2n}\) mutually anticommute, \(a_1^\dagger \dots a_\zeta ^\dagger \beta _T\) is a blade of grade \(\zeta + 2|T|\), so \(\left\langle a_1^\dagger \dots a_\zeta ^\dagger \beta _T \right\rangle _k = \delta _{k, \zeta + 2|T|} a_1^\dagger \dots a_\zeta ^\dagger \beta _T\). Using this along with Eq. (58) gives

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z) = \frac{1}{2^{n-\zeta }} \sum _{\ell = \zeta /2}^n \gamma _{[2n]}^\dagger \left\langle \gamma _{[2n]} a_1^\dagger \dots a_\zeta ^\dagger \sum _{T \in {[\zeta +1:n] \atopwithdelims ()\ell - \zeta /2}} (-i)^{\ell - \zeta /2} \beta _T \sum _{T' \in {[n]\atopwithdelims ()\ell }}(-i)^\ell \widetilde{\lambda }_{T'} \widetilde{\beta }_{T'} \right\rangle _{2n}. \end{aligned}$$

Next, \(\gamma _{[2n]}a_1^\dagger \dots a_\zeta ^\dagger = (-ia_1)^\dagger \dots (-ia_\zeta )^\dagger \gamma _{[2\zeta + 1:2n]}\) and \(\gamma _{[2\zeta + 1:2n]} \beta _T = (-1)^{|T|} \beta _{[\zeta + 1:n] {\setminus } T}\) for any \(T \subseteq [\zeta + 1:n]\), so

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z)&= \frac{1}{2^{n-\zeta }} \gamma _{[2n]}^\dagger \sum _{\ell =\zeta /2}^n \left\langle (-i)^\zeta a_1^\dagger \dots a_\zeta ^\dagger \sum _{T \in {[\zeta + 1:n] \atopwithdelims ()\ell - \zeta /2}}(-i)^{\ell - \zeta /2} (-1)^{\ell - \zeta /2}\beta _{[\zeta + 1:n]\setminus T} \right. \\&\left. \sum _{T' \in {[n]\atopwithdelims ()\ell }} (-i)^\ell \widetilde{\lambda }_{T'} \widetilde{\beta }_{T'} \right\rangle _{2n} \\&= \frac{i^{\zeta /2}}{2^{n-\zeta }} \gamma _{[2n]}^\dagger \sum _{\ell = \zeta /2}^n \left\langle a_1^\dagger \dots a_\zeta ^\dagger \sum _{T\in {[\zeta + 1:n] \atopwithdelims ()n - \zeta /2 - \ell }} \beta _{T} \sum _{T' \in {[n]\atopwithdelims ()\ell }}\widetilde{\lambda }_{T'}\widetilde{\beta }_{T'} \right\rangle _{2n} \\&= \frac{i^{\zeta /2}}{2^{n-\zeta }} \gamma _{[2n]}^\dagger (a_1^\dagger \dots a_\zeta ^\dagger ) \wedge \sum _{\ell = \zeta /2}^n \sum _{T\in {[\zeta + 1:n] \atopwithdelims ()n - \zeta /2 - \ell }}\sum _{T' \in {[n]\atopwithdelims ()\ell }} \beta _T \wedge (\widetilde{\lambda }_{T'} \widetilde{\beta }_{T'}) \\&= \frac{i^{\zeta /2}}{2^{n-\zeta }} \gamma _{[2n]}^\dagger (a_1^\dagger \dots a_\zeta ^\dagger ) \wedge \frac{1}{(n -\zeta /2)!}\left( \sum _{j=\zeta + 1}^n \beta _j + \sum _{j=1}^n \widetilde{\lambda }_j \widetilde{\beta }_j \right) ^{\wedge (n - \zeta /2)}, \end{aligned}$$

where we have proceeded along the same lines as in the proof of Theorem 2—we change variables \(T \rightarrow [\zeta + 1:n] {\setminus } T\) in the second line, apply the definition of the wedge product [Eq. (59)] in the third, and use the multinomial theorem in the fourth.

Now, consider the basis of 1-vectors

$$\begin{aligned} \{e_1, \dots , e_{2n}\} :=\{\sqrt{2}a_1^\dagger , \sqrt{2} a_1, \dots , \sqrt{2}a_\zeta ^\dagger , \sqrt{2}a_\zeta , \gamma _{2\zeta + 1},\dots , \gamma _{2n} \}. \end{aligned}$$

It can be seen that the unitary matrix \(W\) defined in Eq. (42) changes between \(\{e_\mu \}_{\mu \in [2n]}\) and \(\{\gamma _\mu \}_{\mu \in [2n]}\): we have \(e_\mu = \sum _{\nu =1}^{2n} W_{\mu \nu }\gamma _\nu \), so \(\gamma _\mu = \sum _{\nu =1}^{2n} W^\dagger _{\mu \nu } e_\nu \). Writing \(q_{| \varphi \rangle ,\varrho }(z)\) in terms of \(\{e_\mu \}_{\mu \in [2n]}\), we have

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z)&= \frac{i^{\zeta /2}}{2^{n - \zeta }} \gamma _{[2n]}^\dagger \left( \frac{e_1}{\sqrt{2}} \frac{e_3}{\sqrt{2}} \dots \frac{e_{2\zeta -1}}{\sqrt{2}}\right) \\&\quad \wedge \frac{1}{(n-\zeta /2)!}\left( \frac{1}{2} \sum _{\mu ,\nu \in \overline{S}_\zeta } (C_{| \textbf{0} \rangle })_{\mu \nu } e_\mu e_\nu + \frac{1}{2} \sum _{\mu ,\nu \in [2n]} z(W^* \widetilde{Q} C_{\varrho } \widetilde{Q}^{\textrm{T}} W^\dagger )_{\mu \nu } e_\mu e_\nu \right) ^{\wedge (n - \zeta /2)}, \end{aligned}$$

where \(\overline{S}_\zeta :=[2n] {\setminus } \{1,3,\dots , 2\zeta - 1\}\). Finally, by the antisymmetry of the wedge product, \((e_1 e_3 \dots e_{2\zeta -1}) \wedge (e_\mu e_\nu ) = e_1 \wedge e_3 \wedge \dots \wedge e_{2\zeta - 1} \wedge e_\mu \wedge e_\nu \) vanishes whenever \(\mu \) and/or \(\nu \) are in \(\{1,3,\dots , 2\zeta - 1\}\), so we can restrict the second sum to \(\mu , \nu \in \overline{S}_\zeta \). We can then apply Fact 3, obtaining

$$\begin{aligned} q_{| \varphi \rangle ,\varrho }(z)&= \frac{i^{\zeta /2}}{2^{n - \zeta /2}} \gamma _{[2n]}^\dagger (e_1 e_3 \dots e_{2\zeta - 1}) \wedge \textrm{pf}\left( \left( C_{| \textbf{0} \rangle } + zW^* \widetilde{Q} C_{\varrho }\widetilde{Q}^{\textrm{T}} W^\dagger \right) \Big |_{\overline{S}_\zeta } \right) \bigwedge _{\mu \in \overline{S}_\zeta } e_\mu \\&= \frac{i^{\zeta /2}}{2^{n - \zeta /2}} \textrm{pf}\left( \left( C_{| \textbf{0} \rangle } + zW^* \widetilde{Q} C_{\varrho }\widetilde{Q}^{\textrm{T}} W^\dagger \right) \Big |_{\overline{S}_\zeta } \right) \gamma _{[2n]}^\dagger (-1)^{\zeta (\zeta - 1)/2} \bigwedge _{\mu \in [2n]} e_\mu . \end{aligned}$$

Theorem 3 follows by noting that \(\bigwedge _{\mu \in [2n]} e_\mu = (-i)^{\zeta } \gamma _{[2n]}\) and \((-1)^{\zeta (\zeta -1)/2} (-i)^\zeta = (-1)^{\zeta ^2/2} = 1\) since \(\zeta \) is even. \(\square \)

5.3 Efficient estimation of more general observables

In this subsection, we give a method for efficiently estimating the expectation values of a larger class of observables using our matchgate shadows, extending beyond the Gaussian density operators treated in Theorem 2 and the \(| \varphi \rangle \langle \textbf{0} |\) operators in Theorem 3, which allow us to measure overlaps with Slater determinants. This class comprises arbitrary products of certain kinds of operators, including Gaussian density operators, Gaussian unitaries, and linear combinations of Majorana operators (i.e., \(\sum _{\mu =1}^{2n} c_\mu \gamma _\mu \) for any \(c_\mu \in \mathbb {C}\)). A useful example of such a product is any operator of the form \(| \phi \rangle \langle \textbf{0} | = U_{Q_\phi }| \textbf{0} \rangle \langle \textbf{0} |\), where \(| \phi \rangle \) is an arbitrary pure Gaussian state (not necessarily a Slater determinant) and \(U_{Q_\phi }\) is a Gaussian unitary that prepares \(| \phi \rangle \). The ability to efficiently estimate the expectation value of such operators allows us to efficiently estimate the overlap between any pure state and any pure Gaussian state (see Appendix A.2).

Our method builds on ideas in Refs. [16, 17], which use Grassmann integration (also known as Berezin integration) to evaluate certain quantities involving fermionic operations, as well as identities in Ref. [18] for evaluating Grassmann integrals. Specifically, we generalise results in Appendix A of Ref. [16], and consider what is essentially a special case of the problem in Ref. [17], and are therefore able to give a more explicit and streamlined protocol (see Footnote 14). We also extend Theorem A.15(d) of Ref. [18] to obtain an efficient method for evaluating Grassmann integrals of a certain form.

We begin by reviewing the basics of Grassmann algebra and Grassmann integration (Sect. 5.3.1). We then present our method, which can be outlined at a high level as follows. The objective is to evaluate \(\textrm{tr}(A^{(1)}\dots A^{(m)})\) where \(A^{(i)}\) are elements of \(\mathcal {L}(\mathcal {H}_n)\) (equivalently, the Clifford algebra \(\mathcal {C}_{2n}\)). The first main step is to express \(\textrm{tr}(A^{(1)}\dots A^{(m)})\) (for arbitrary \(A^{(1)}, \dots , A^{(m)}\)), as a Grassmann integral by associating to each Clifford algebra element \(A^{(i)}\) a particular element of a \(2^{2mn}\)-dimensional Grassmann algebra (Sect. 5.3.2). We then show that the integral reduces to a fairly simple form when each \(A^{(i)}\) belongs to a certain class of operators; importantly, operators that allow us to evaluate expectation values with respect to matchgate shadow samples fall into this class (Sect. 5.3.3). Finally, we construct a procedure for efficiently evaluating any integral of this form (Algorithm 2).

After presenting the general method, we work through an example of applying it to efficiently evaluate the expectation value of \(| \phi \rangle \langle \textbf{0} |\), where \(| \phi \rangle \) is a Gaussian state, with respect to any matchgate shadow sample (Sect. 5.3.4). In conjunction with the procedure in Appendix A.2, this enables us to efficiently estimate the overlap between any pure state and \(| \phi \rangle \).

5.3.1 Review of Grassmann algebra and Grassmann integration

We give a brief overview of Grassmann algebras and Grassmann integration, and refer the reader to Ref. [19] for a more detailed treatment. Our definitions and notational conventions are mainly based on Refs. [18, 20].

For any \(n \in \mathbb {Z}_{\ge 0}\), a \(2^{2n}\)-dimensional Grassmann algebra \(\mathcal {G}_{2n}\) (over the field \(\mathbb {C}\)) is generated by 2n Grassmann variables \(\theta _1,\dots , \theta _{2n}\) (for our purposes, it will suffice to consider Grassmann algebras with an even number of generators, although the same definitions hold for odd numbers of generators). The multiplication operation of the Grassmann algebra is associative and antisymmetric under the exchange of generators:

$$\begin{aligned} \theta _\mu \theta _\nu = -\theta _\nu \theta _\mu \end{aligned}$$
(72)

for any \(\mu , \nu \in [2n]\); this implies that \(\theta _\mu ^2 = 0\). Any element \(f(\theta ) \in \mathcal {G}_{2n}\) can be written as

$$\begin{aligned} f(\theta ) = c_0 + \sum _{k =1}^{2n} \sum _{1 \le \mu _1< \dots < \mu _k \le 2n} c_{\mu _1,\dots , \mu _k} \theta _{\mu _1}\dots \theta _{\mu _k} \end{aligned}$$
(73)

for some coefficients \(c_0, c_{\mu _1,\dots , \mu _k} \in \mathbb {C}\). We say that \(f(\theta )\) is even if \(c_{\mu _1,\dots , \mu _k} = 0\) for all odd \(k\). Even elements commute with all other elements in \(\mathcal {G}_{2n}\). From these definitions, it can be observed that \(\mathcal {G}_{2n}\) is equivalent to a \(2^{2n}\)-dimensional Clifford algebra \(\mathcal {C}_{2n}\), except that the multiplication operation is taken to be the wedge product, defined by Eqs. (59) and (60). (Indeed, the properties of Grassmann integration, some of which we review below and in Appendix E, can all be interpreted in terms of Clifford algebra operations, and proved using Clifford algebra identities.)

The “integral” \(\int d\theta _\mu \) is defined by its action

$$\begin{aligned} \int d\theta _\mu \, 1 = 0,\qquad \int d\theta _\mu \, \theta _\nu = \delta _{\mu \nu }, \end{aligned}$$
(74)

on scalars and generators, together with linearity over \(\mathbb {C}\) and the rule

$$\begin{aligned} \int d\theta _\mu \left( \theta _\nu f(\theta )\right) = \delta _{\mu \nu } f(\theta ) - \theta _\nu \int d\theta _\mu \, f(\theta ) \end{aligned}$$
(75)

for any \(f(\theta ) \in \mathcal {G}_{2n}\). We will use the notation \(\int d\theta _{\mu _1} \dots d\theta _{\mu _k} \equiv \int d\theta _{\mu _1} \dots \int d\theta _{\mu _k}\) for any \(\mu _1,\dots , \mu _k \in [2n]\), and \(D\theta \equiv d\theta _{2n} \dots d{\theta }_{1}\), so that \(\int D\theta \equiv \int d\theta _{2n} \dots d\theta _1\). Then, note that \(\int D\theta \, \theta _{\mu _1}\dots \theta _{\mu _k}\) is only nonzero if \(k = 2n\) and \(\{\mu _1,\dots , \mu _k\} = [2n]\), in which case it is equal to \(\textrm{sgn}(\pi )\) where \(\pi \) is the permutation that maps \((\mu _1,\dots , \mu _k)\) to \((1,\dots , 2n)\). Thus, when applied to an arbitrary element of \(\mathcal {G}_{2n}\), \(\int D\theta \) picks out the coefficient of \(\theta _1\dots \theta _{2n}\), i.e., for \(f(\theta )\) expanded as in Eq. (73),

$$\begin{aligned} \int D\theta \, f(\theta ) = c_{1,\dots , 2n}. \end{aligned}$$
(76)
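To make Eqs. (72)–(76) concrete, the following minimal Python sketch stores an element of a Grassmann algebra as a dictionary mapping sorted index tuples to coefficients. The dictionary encoding and the function names `gmul` and `berezin` are choices made for this illustration only; they are not part of the formalism.

```python
from itertools import product

def gmul(f, g):
    """Multiply two Grassmann elements given as {sorted index tuple: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(f.items(), g.items()):
        if set(a) & set(b):
            continue  # a repeated generator gives zero, since theta_mu^2 = 0
        merged = a + b
        # The parity of the number of inversions is the sign picked up when the
        # generators are reordered into increasing order using Eq. (72).
        inversions = sum(1 for i in range(len(merged))
                         for j in range(i + 1, len(merged))
                         if merged[i] > merged[j])
        key = tuple(sorted(merged))
        out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return out

def berezin(f, num_generators):
    """Integrate with D(theta) = d(theta_2n)...d(theta_1): the top coefficient, Eq. (76)."""
    return f.get(tuple(range(1, num_generators + 1)), 0)

theta1, theta2 = {(1,): 1}, {(2,): 1}
print(gmul(theta1, theta2))               # theta_1 theta_2
print(gmul(theta2, theta1))               # the same monomial with the opposite sign
print(gmul(theta1, theta1))               # the zero element (empty dictionary)
print(berezin(gmul(theta2, theta1), 2))   # sign of the permutation (2, 1) -> (1, 2)
```

With this encoding, the Berezin integral is a dictionary lookup, which mirrors the statement that \(\int D\theta \) simply picks out the coefficient of \(\theta _1\dots \theta _{2n}\).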

Also, for any Grassmann variables \(\theta _1,\dots , \theta _{2n}, \eta _1,\dots , \eta _{2n}\), we use \(\theta \equiv \begin{pmatrix} \theta _1&\dots&\theta _{2n}\end{pmatrix}^{\textrm{T}}\) to denote the vector with \(\theta _1, \dots , \theta _{2n}\) as its components, and likewise, \(\eta \equiv \begin{pmatrix} \eta _1&\dots&\eta _{2n}\end{pmatrix}^{\textrm{T}}\). Then, we have, for instance,

$$\begin{aligned} \theta ^{\textrm{T}} \eta \equiv \sum _{\mu =1}^{2n}\theta _\mu \eta _\mu , \qquad B\theta \equiv \begin{pmatrix} \sum \limits _{\mu = 1}^{2n} B_{1\mu } \theta _\mu \\ \vdots \\ \sum \limits _{\mu = 1}^{2n} B_{k\mu } \theta _\mu \end{pmatrix}, \qquad \theta ^{\textrm{T}} M\theta \equiv \sum _{\mu ,\nu =1}^{2n} M_{\mu \nu }\theta _{\mu }\theta _\nu \end{aligned}$$

for any \(k \times 2n\) matrix B and \(2n\times 2n\) matrix M. Note from Eq. (72) that \(\theta ^{\textrm{T}} \eta = -\eta ^{\textrm{T}} \theta \).

Finally, following Ref. [20], we define the “Grassmann representation” of \(\mathcal {L}(\mathcal {H}_n)\cong \mathcal {C}_{2n}\) as follows. Recall that \(A \in \mathcal {L}(\mathcal {H}_n)\) can be expanded uniquely in terms of the (canonical) Majorana operators \(\{\gamma _\mu \}_{\mu \in [2n]}\) as

$$\begin{aligned} A = c_0 I + \sum _{k =1}^{2n} \sum _{1 \le \mu _1< \dots < \mu _k \le 2n} c_{\mu _1,\dots , \mu _k} \gamma _{\mu _1}\dots \gamma _{\mu _k} \end{aligned}$$
(77)

for \(c_0, c_{\mu _1,\dots , \mu _k} \in \mathbb {C}\) (by Hilbert-Schmidt orthogonality and \(\textrm{tr}(I) = 2^n\), \(c_0 = 2^{-n}\textrm{tr}(A)\) and \(c_{\mu _1,\dots , \mu _k} = 2^{-n}\textrm{tr}(\gamma _{\mu _k}\dots \gamma _{\mu _1}A)\)). The Grassmann representation \(\omega (A; \theta )\) of \(A\) in terms of Grassmann variables \(\theta _1,\dots , \theta _{2n}\) is then given by

$$\begin{aligned} \omega (A; \theta ) = c_0 + \sum _{k =1}^{2n} \sum _{1 \le \mu _1< \dots < \mu _k \le 2n} c_{\mu _1,\dots , \mu _k} \theta _{\mu _1}\dots \theta _{\mu _k}, \end{aligned}$$
(78)

which we can regard as an element of \(\mathcal {G}_{2n}\), or of any Grassmann algebra whose generators include \(\theta _1,\dots , \theta _{2n}\). Note that for \(A, B \in \mathcal {L}(\mathcal {H}_n)\), we have \(\omega (AB; \theta ) = \omega (A;\theta )\omega (B;\theta )\) if and only if \(AB = A \wedge B\) (where \(\wedge \) denotes the wedge product in \(\mathcal {C}_{2n})\).
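As an illustration of Eqs. (77) and (78), the sketch below extracts the coefficients \(c_{\mu _1,\dots , \mu _k}\) of a small operator numerically. It assumes a Jordan-Wigner matrix representation of the Majorana operators on \(n = 2\) modes and takes \(\textrm{tr}\) to be the ordinary matrix trace, so that \(\textrm{tr}(I) = 2^n\) and the coefficients carry a factor \(2^{-n}\). The function name `grassmann_representation` and the test operator are hypothetical choices for this example.

```python
import numpy as np
from functools import reduce
from itertools import combinations

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0 + 0j, -1.0])
I2 = np.eye(2, dtype=complex)

# Jordan-Wigner Majorana operators for n = 2 modes (an assumed concrete representation).
gammas = [np.kron(X, I2), np.kron(Y, I2), np.kron(Z, X), np.kron(Z, Y)]

def grassmann_representation(A, gammas, tol=1e-12):
    """Return {(mu_1 < ... < mu_k): c_{mu_1..mu_k}} for A, using 1-based Majorana labels."""
    dim = A.shape[0]  # dim = 2^n
    coeffs = {}
    for k in range(len(gammas) + 1):
        for subset in combinations(range(len(gammas)), k):
            # gamma_{mu_k} ... gamma_{mu_1} is the inverse of gamma_{mu_1} ... gamma_{mu_k}.
            reversed_product = reduce(np.matmul,
                                      [gammas[mu] for mu in reversed(subset)],
                                      np.eye(dim))
            c = np.trace(reversed_product @ A) / dim
            if abs(c) > tol:
                coeffs[tuple(mu + 1 for mu in subset)] = c
    return coeffs

# Test operator: A = 2 I + 0.5 gamma_1 gamma_3 - i gamma_4.
A = 2.0 * np.eye(4) + 0.5 * gammas[0] @ gammas[2] - 1j * gammas[3]
omega = grassmann_representation(A, gammas)
print(omega)
```

The returned dictionary can be read directly as \(\omega (A;\theta )\): each key \((\mu _1,\dots ,\mu _k)\) labels the monomial \(\theta _{\mu _1}\dots \theta _{\mu _k}\).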

5.3.2 Expressing the trace of any product of operators as a Grassmann integral

Let \(A^{(1)},\dots , A^{(m)}\) be arbitrary operators in \(\mathcal {L}(\mathcal {H}_n)\cong \mathcal {C}_{2n}\). To express the scalar quantity \(\textrm{tr}(A^{(1)}\dots A^{(m)})\) in terms of a Grassmann integral, we represent each \(A^{(i)}\) in terms of a different set of 2n independent Grassmann variables \(\theta _1^{(i)},\dots , \theta _{2n}^{(i)}\), then consider a Grassmann algebra that includes all of the \(\{\theta _{\mu }^{(i)}\}_{\mu \in [2n],i \in [m]}\) as generators. We prove the following identity in Appendix E1.

Theorem 4

For any even integer \(m \in \mathbb {Z}_{> 0}\) and any \(A^{(1)}, \dots , A^{(m)} \in \mathcal {L}(\mathcal {H}_n)\), let \(\theta _1^{(1)}, \dots , \theta _{2n}^{(1)},\dots , \theta _1^{(m)}, \dots , \theta _{2n}^{(m)}\) be generators of a \(2^{2mn}\)-dimensional Grassmann algebra \(\mathcal {G}_{2mn}\). Then, we have

$$\begin{aligned} \textrm{tr}\left( A^{(1)}\dots A^{(m)}\right)&= 2^{n} (-1)^{n m(m-1)/2} \int D\theta ^{(m)}\dots D\theta ^{(1)}\, \omega (A^{(1)};\theta ^{(1)})\dots \omega (A^{(m)}; \theta ^{(m)}) \nonumber \\&\quad \exp \Bigg (\sum _{\begin{array}{c} i,j \in [m]\\ i <j \end{array}} s_{ij} \theta ^{(i) T} \theta ^{(j)} \Bigg ), \end{aligned}$$
(79)

where \(\omega (A^{(i)}; \theta ^{(i)}) \in \mathcal {G}_{2mn}\) is the Grassmann representation of \(A^{(i)}\) in terms of \(\theta _1^{(i)},\dots , \theta _{2n}^{(i)}\) (Eq. (78)), and \(s_{ij} :=(-1)^{i+j+1}\) for \(i < j\).

Here, we use the same notational conventions as those defined in Sect. 5.3.1, e.g., \(D\theta ^{(i)} \equiv d\theta _{2n}^{(i)}\dots d\theta _1^{(i)}\) and \(\theta ^{(i)}\) denotes the vector \(\begin{pmatrix} \theta _1^{(i)}&\dots&\theta _{2n}^{(i)} \end{pmatrix}^{\textrm{T}}\). To apply Theorem 4 to products of an odd number m of operators, we can, for instance, express one of the operators as a product of two operators, or add in \(A^{(m+1)} = I\).
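As a sanity check of Eq. (79), the following sketch compares both sides numerically in the smallest case \(n = 1\), \(m = 2\). It assumes the matrix representation \(\gamma _1 = X\), \(\gamma _2 = Y\), and encodes Grassmann elements as dictionaries from sorted index tuples to coefficients; the helper `gmul` and all variable names are illustrative assumptions, not part of the theorem.

```python
import numpy as np
from itertools import product

def gmul(f, g):
    """Multiply two Grassmann elements given as {sorted index tuple: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(f.items(), g.items()):
        if set(a) & set(b):
            continue  # repeated generator: theta_mu^2 = 0
        merged = a + b
        inversions = sum(1 for i in range(len(merged))
                         for j in range(i + 1, len(merged))
                         if merged[i] > merged[j])
        key = tuple(sorted(merged))
        out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return out

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
# Basis of Eq. (77) for n = 1, with gamma_1 = X and gamma_2 = Y (assumed representation).
basis = {(): np.eye(2), (1,): X, (2,): Y, (1, 2): X @ Y}

rng = np.random.default_rng(1)
coeffs_A = dict(zip(basis, rng.normal(size=4)))  # coefficients of A^(1)
coeffs_B = dict(zip(basis, rng.normal(size=4)))  # coefficients of A^(2)
A = sum(c * basis[k] for k, c in coeffs_A.items())
B = sum(c * basis[k] for k, c in coeffs_B.items())
lhs = np.trace(A @ B)

# Generators 1, 2 play the role of theta^(1); generators 3, 4 that of theta^(2).
omega_A = dict(coeffs_A)
omega_B = {tuple(mu + 2 for mu in k): c for k, c in coeffs_B.items()}
# exp(s_12 theta^(1)T theta^(2)) with s_12 = +1 factorises, since each bilinear squares to zero.
exponential = gmul({(): 1.0, (1, 3): 1.0}, {(): 1.0, (2, 4): 1.0})
integrand = gmul(gmul(omega_A, omega_B), exponential)

n, m = 1, 2
prefactor = 2 ** n * (-1) ** (n * m * (m - 1) // 2)
rhs = prefactor * integrand.get((1, 2, 3, 4), 0.0)  # Berezin integral = top coefficient
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point error. The check only covers \(n = 1\) and \(m = 2\); it is meant to make the sign and normalisation conventions tangible rather than to test the theorem in general.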

Theorem 4 generalises Equations (140) and (144) of Ref. [16], which give the expressions for \(m = 2\) and for \(m=4\) in the special case where one of the operators is \((-i)^{n} \gamma _{[2n]}\) (see Footnote 15).

The purpose of Theorem 4 is to translate the trace of any product of operators into a Grassmann integral. The basic reason for doing so is that the multiplication operation of the Grassmann algebra, which is equivalent to the wedge product in the Clifford algebra, is easier to work with than the multiplication operation of the Clifford algebra. In this translation, we go from a \(2^{2n}\)-dimensional Clifford algebra to a larger Grassmann algebra, with 2n separate generators for each of the m operators \(A^{(i)}\) appearing in the trace. To see why, consider expanding each \(A^{(i)}\) on the left-hand side of Eq. (79) into products of Majorana operators, as in Eq. (77), and observe that the terms that end up contributing to the trace are those for which each \(\gamma _\mu \) appears in an even number of the \(A^{(i)}\). Now, if we were to use only 2n generators for all of the different operators (i.e., work with an expression like \(\omega (A^{(1)};\theta )\dots \omega (A^{(m)};\theta )\) rather than have a different \(\theta ^{(i)}\) for each \(A^{(i)}\)), these terms would not be properly accounted for in the Grassmann algebra, as the wedge product is antisymmetric—note \(\gamma _\mu ^2 = I\), whereas \(\theta _\mu ^2 = 0\). Using independent sets of generators instead preserves all of these terms when we move to the Grassmann algebra. Then, the exponential on the right-hand side of Eq. (79) correctly compensates for the fact that the Grassmann integral picks out only terms that contain all of the generators (as shown by the proof in Appendix E1).

5.3.3 Efficiently evaluating certain Grassmann integrals

We now consider Grassmann integrals of a certain, rather general form that can be efficiently evaluated, and show that for many operators of potential interest, Theorem 4 yields integrals that can be written in this form.

Specifically, for any \(N,K \in \mathbb {Z}_{\ge 0}\), any \(K \times 2N\) matrix B, and any antisymmetric \(2N \times 2N\) matrix M, we define the Grassmann integral

$$\begin{aligned} g(B, M) :=\int D\chi \, (B\chi )_1\dots (B\chi )_K \exp \left( \frac{1}{2}\chi ^{\textrm{T}} M\chi \right) , \end{aligned}$$
(80)

where \(\chi _1,\dots , \chi _{2N}\) are generators of a \(2^{2N}\)-dimensional Grassmann algebra, \(D\chi \equiv d\chi _{2N}\dots d\chi _1\), and \(\chi \equiv \begin{pmatrix} \chi _1&\dots&\chi _{2N}\end{pmatrix}^{\textrm{T}}\) (so \((B\chi )_j \equiv \sum _{\mu =1}^{2N}B_{j\mu }\chi _\mu \), \(\chi ^{\textrm{T}} M\chi = \sum _{\mu ,\nu =1}^{2N}M_{\mu \nu }\chi _\mu \chi _\nu \)). In the case where \(M\) is invertible, Theorem A.15(d) of Ref. [18] gives an efficiently computable expression for \(g(B, M)\), namely,

$$\begin{aligned} g(B, M) = \textrm{pf}(M) \textrm{pf}(-B M^{-1} B^{\textrm{T}}) \end{aligned}$$
(81)

where \(\textrm{pf}(-BM^{-1}B^{\textrm{T}}) \equiv 1\) if \(K = 0\) (and the Pfaffian for any matrix of odd dimension is zero by definition, so \(\textrm{pf}(-BM^{-1}B^{\textrm{T}}) \equiv 0\) if \(K\) is odd). We provide a recursive algorithm that efficiently evaluates \(g(B, M)\) in the general case, where \(M\) is not necessarily invertible, later in this subsection. We first give several broad examples where \(\textrm{tr}(A^{(1)}\dots A^{(m)})\) can be expressed as such an integral, via Theorem 4.
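To illustrate Eq. (81), the following sketch evaluates the integral in Eq. (80) by brute-force expansion in a small Grassmann algebra and compares it with the Pfaffian formula for a random invertible \(M\). The dictionary encoding of Grassmann elements, the helper names `gmul`, `pfaffian` and `g_brute_force`, and the first-row Pfaffian expansion are assumptions made for this check. Both routines have exponential cost, so this is not the efficient procedure referred to in the text.

```python
import numpy as np
from itertools import product

def gmul(f, g):
    """Multiply two Grassmann elements given as {sorted index tuple: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(f.items(), g.items()):
        if set(a) & set(b):
            continue  # repeated generator: chi_mu^2 = 0
        merged = a + b
        inversions = sum(1 for i in range(len(merged))
                         for j in range(i + 1, len(merged))
                         if merged[i] > merged[j])
        key = tuple(sorted(merged))
        out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return out

def pfaffian(A):
    """Pfaffian by expansion along the first row (for small matrices only)."""
    A = np.asarray(A)
    size = A.shape[0]
    if size == 0:
        return 1.0
    if size % 2 == 1:
        return 0.0
    total = 0.0
    for j in range(1, size):
        minor = np.delete(np.delete(A, [0, j], axis=0), [0, j], axis=1)
        total += (-1) ** (j + 1) * A[0, j] * pfaffian(minor)
    return total

def g_brute_force(B, M):
    """Expand (B chi)_1 ... (B chi)_K exp(chi^T M chi / 2) and read off the top coefficient."""
    K, dim = B.shape
    f = {(): 1.0}
    for k in range(K):
        f = gmul(f, {(mu + 1,): B[k, mu] for mu in range(dim)})
    # For antisymmetric M, the exponential equals prod_{mu < nu} (1 + M_{mu nu} chi_mu chi_nu).
    for mu in range(dim):
        for nu in range(mu + 1, dim):
            f = gmul(f, {(): 1.0, (mu + 1, nu + 1): M[mu, nu]})
    return f.get(tuple(range(1, dim + 1)), 0.0)

rng = np.random.default_rng(0)
N, K = 2, 2
R = rng.normal(size=(2 * N, 2 * N))
M = R - R.T                      # random antisymmetric matrix, invertible with probability 1
B = rng.normal(size=(K, 2 * N))
lhs = g_brute_force(B, M)
rhs = pfaffian(M) * pfaffian(-B @ np.linalg.inv(M) @ B.T)
print(lhs, rhs)
```

For \(K = 0\) the same functions reduce to the familiar statement that the integral of \(\exp (\frac{1}{2}\chi ^{\textrm{T}} M\chi )\) equals \(\textrm{pf}(M)\).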

Consider the integral on the RHS of Eq. (79) for some \(A^{(1)},\dots , A^{(m)} \in \mathcal {L}(\mathcal {H}_n)\). We take \(N = mn\) and identify \(\theta ^{(i)}_{\mu }\) with \(\chi _{2n(i-1) + \mu }\) for \(\mu \in [2n], i \in [m]\), so that \(\chi = \begin{pmatrix} \theta ^{(1)T}&\dots&\theta ^{(m)T}\end{pmatrix}^{\textrm{T}} = \begin{pmatrix} \theta _1^{(1)}&\dots&\theta _{2n}^{(1)}&\dots&\theta _1^{(m)}&\dots&\theta _{2n}^{(m)}\end{pmatrix}^{\textrm{T}}.\) Then, note that we can write the term \(\exp (\sum _{i<j} s_{ij}\theta ^{(i)T}\theta ^{(j)})\) in the form

$$\begin{aligned} \exp \Bigg (\sum _{i,j \in [m]:i <j} s_{ij} \theta ^{(i) T} \theta ^{(j)} \Bigg )= \exp \left( \frac{1}{2}\chi ^{\textrm{T}} S \chi \right) \end{aligned}$$
(82)

for an antisymmetric \(2N \times 2N\) matrix \(S\) (more explicitly, \(S\) has a simple block form, with \(2n\times 2n\) zero blocks along its diagonal, blocks of the form \(s_{ij}\mathbb {1}\) above the diagonal, and appropriately matching blocks below the diagonal, where \(\mathbb {1}\) is the \(2n\times 2n\) identity matrix). It remains to consider the Grassmann representations \(\omega (A^{(i)};\theta ^{(i)})\) and show how to encode them appropriately into an integral of the form \(g(B, M)\).
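The block structure of \(S\) in Eq. (82) can be written down directly. The sketch below is one possible construction with NumPy; the function name `build_S` is hypothetical. The blocks below the diagonal are taken to be \(-s_{ij}\mathbb {1}\), which is what antisymmetry requires and which reproduces \(\sum _{i<j} s_{ij}\theta ^{(i)T}\theta ^{(j)}\) because \(\theta ^{\textrm{T}}\eta = -\eta ^{\textrm{T}}\theta \).

```python
import numpy as np

def build_S(m, n):
    """Antisymmetric 2mn x 2mn matrix with blocks s_ij * identity above the diagonal."""
    S = np.zeros((2 * m * n, 2 * m * n))
    identity = np.eye(2 * n)
    for i in range(1, m + 1):
        for j in range(i + 1, m + 1):
            s_ij = (-1) ** (i + j + 1)
            rows = slice(2 * n * (i - 1), 2 * n * i)   # indices of theta^(i) within chi
            cols = slice(2 * n * (j - 1), 2 * n * j)   # indices of theta^(j) within chi
            S[rows, cols] = s_ij * identity
            S[cols, rows] = -s_ij * identity           # matching block below the diagonal
    return S

S = build_S(m=4, n=1)
print(S)
```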

The simplest example is where \(A^{(i)}\) is an arbitrary linear combination of Majorana operators: \(A^{(i)} = \sum _{\mu = 1}^{2n} c_{\mu } \gamma _\mu \) for some \(c_{\mu } \in \mathbb {C}\). Then, by Eq. (78), \(\omega (A^{(i)}; \theta ^{(i)}) = \sum _{\mu =1}^{2n}c_\mu \theta _\mu ^{(i)}\). By putting the \(2n\) coefficients \(c_\mu \) in columns \(2n(i-1) + 1\) through \(2ni\) of the \(k\)th row of \(B\) for some \(k\), we then have \(\omega (A^{(i)};\theta ^{(i)}) = (B\chi )_k\), which clearly fits into the form of \(g(B, M)\). Similarly, if \(A^{(i)}\) is a product of linear combinations of Majorana operators that mutually anticommute, i.e., if \(A^{(i)} = L_1\dots L_\ell \) where \(L_p = \sum _{\mu =1}^{2n}c_{p\mu } \gamma _\mu \) and \(L_pL_q = -L_q L_p\) for all \(p \ne q\) (for instance, this arises when \(A^{(i)} = \widetilde{\gamma }_{\mu _1}\dots \widetilde{\gamma }_{\mu _\ell }\) for some Majorana operators \(\widetilde{\gamma }_{\mu }\), or \(A^{(i)} = \widetilde{a}^\dagger _{j_1}\dots \widetilde{a}^\dagger _{j_\ell }\) for some creation operators \(\widetilde{a}^\dagger _j\)), then \(A^{(i)} = L_1 \wedge \dots \wedge L_\ell \), so \(\omega (A^{(i)}; \theta ^{(i)}) = (\sum _{\mu _1} c_{1\mu _1} \theta _{\mu _1}^{(i)})\dots (\sum _{\mu _\ell } c_{\ell \mu _\ell } \theta _{\mu _\ell }^{(i)})\). Hence, by putting the coefficients \(c_{p\mu }\) in appropriate positions in some rows \(k,\dots ,k + \ell -1\) of the matrix \(B\), we have \(\omega (A^{(i)};\theta ^{(i)}) = (B\chi )_{k}\dots (B\chi )_{k+ \ell -1}\).

Next, suppose \(A^{(i)}\) is a fermionic Gaussian density operator \(\varrho = \prod _{j=1}^n \frac{1}{2}(I - i\lambda _{j} \widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j})\), where \(\widetilde{\gamma }_\mu = \sum _{\nu =1}^{2n} Q_{\mu \nu }\gamma _\nu \) for some \(Q \in \textrm{O}(2n)\) (see Eq. (10)). Since the \(\widetilde{\gamma }_\mu \) mutually anticommute, \(\varrho = 2^{-n}\bigwedge _{j \in [n]}(I - i \lambda _j \widetilde{\gamma }_{2j-1} \wedge \widetilde{\gamma }_{2j})\), so

$$\begin{aligned} \omega (\varrho ; \theta ^{(i)})&= \frac{1}{2^n} \prod _{j=1}^{n} \left( 1 - i \lambda _j \widetilde{\theta }_{2j-1}^{(i)} \widetilde{\theta }_{2j}^{(i)}\right) = \frac{1}{2^n} \prod _{j=1}^n \exp \left( -i \lambda _j \widetilde{\theta }_{2j-1}^{(i)} \widetilde{\theta }_{2j}^{(i)}\right) \nonumber \\&= \frac{1}{2^n}\exp \left( -\frac{i}{2}\theta ^{(i)T} C_{\varrho } \theta ^{(i)} \right) , \end{aligned}$$
(83)

where we let \(\widetilde{\theta }^{(i)}_\mu :=\sum _{\nu =1}^{2n}Q_{\mu \nu } \theta ^{(i)}_\nu \). The second equality follows from the fact that \((\widetilde{\theta }^{(i)}_{2j-1}\widetilde{\theta }_{2j}^{(i)})^k = 0\) for any \(k > 1\), so \(\exp (-i \lambda _j \widetilde{\theta }_{2j-1}^{(i)} \widetilde{\theta }_{2j}^{(i)}) = 1 - i\lambda _j \widetilde{\theta }_{2j-1}^{(i)} \widetilde{\theta }_{2j}^{(i)}\), and the third uses the fact that the bilinears \(\widetilde{\theta }_{2j-1}^{(i)} \widetilde{\theta }_{2j}^{(i)}\) are even and hence mutually commute, along with the definition of the covariance matrix \(C_\varrho \) of \(\varrho \) (Eq. (11)). Thus, we can write \(\omega (\varrho ;\theta ^{(i)}) = 2^{-n}\exp (\frac{1}{2} \chi ^{\textrm{T}} C^{(i)} \chi )\) where \(C^{(i)}\) is a \(2N \times 2N\) matrix with \(-iC_\varrho \) as one of its blocks (rows and columns \(2n(i-1) + 1\) through \(2ni\)) and all other entries 0. Furthermore, observe that Eq. (83) does not require \(\lambda _j \in [-1,1]\) (as is the case for a Gaussian density operator), so we can still write \(\omega (A^{(i)};\theta ^{(i)}) = 2^{-n}\exp (\frac{1}{2}\chi ^{\textrm{T}} C^{(i)}\chi )\) for some \(C^{(i)}\) whenever \(A^{(i)}\) has the form \(\prod _{j=1}^n \frac{1}{2}(I - i\lambda _j \widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j})\) for some \(\lambda _j \in \mathbb {C}\), even if it is not a Gaussian density operator per se. This observation will be useful, for instance, for evaluating expectation values with respect to matchgate shadow samples (see the following subsection).

A final example we consider is where one or more of the operators in the expression \(\textrm{tr}(A^{(1)}\dots A^{(m)})\) we want to evaluate is a fermionic Gaussian unitary \(U_Q \in \textrm{M}_n\). We suppose that \(U_Q\) is specified by the orthogonal matrix \(Q \in \textrm{O}(2n)\) that it corresponds to via Eq. (8), along with \(\det (U_Q)\). Hence, if \(\det (Q) = 1\), i.e., \(Q \in \textrm{SO}(2n)\), we can write \(U_Q = e^{i\alpha } U_R\), where \(\det (U_R) = 1\), \(R = Q\), and \(e^{i\alpha }\) is a global phase, determined by \(\det (U_Q)\). If \(\det (Q) = -1\), we can write \(U_Q = e^{i\alpha }\gamma _1 U_R\), where \(R = R_1 Q\) with \(R_1 = \text {diag}(1,-1,\dots , -1)\) the \(2n \times 2n\) reflection matrix that \(\gamma _1\) corresponds to via Eq. (8), \(\det (U_R) = 1\), and \(e^{i\alpha }\) is a global phase. In both cases, \(R \in \textrm{SO}(2n)\), and it can be shown (see e.g., Ref. [21]) that \(U_R\) is generated by a quadratic Hamiltonian: \(U_R = \exp (-iH)\) with

$$\begin{aligned} H = \frac{i}{2}\sum _{\mu ,\nu =1}^{2n} h_{\mu \nu } \gamma _\mu \gamma _\nu , \end{aligned}$$
(84)

where h is the \(2n\times 2n\) antisymmetric matrix such that \(R = e^{2\,h}\), i.e., \(h = \ln (R)/2\). Block-diagonalising h using an orthogonal matrix \(Q' \in \textrm{O}(2n)\) as \(h = Q'^{\textrm{T}}\bigoplus _{j=1}^n \begin{pmatrix} 0 &{}\sigma _j \\ -\sigma _j &{}0 \end{pmatrix}Q'\) for some \(\sigma _j \in \mathbb {R}\), and defining the Majorana operators \(\gamma _\mu ' :=\sum _{\nu =1}^{2n} Q_{\mu \nu }' \gamma _\nu \), we then have

$$\begin{aligned} H = i \sum _{j=1}^n \sigma _j \gamma _{2j-1}' \gamma _{2j}', \end{aligned}$$

so

$$\begin{aligned} U_R&= \exp \left( \sum _j \sigma _j \gamma _{2j-1}'\gamma _{2j}'\right) = \prod _{j=1}^n \exp \left( \sigma _j \gamma _{2j-1}'\gamma _{2j}'\right) = \prod _{j=1}^n \left( \cos (\sigma _j) I + \sin (\sigma _j)\gamma _{2j-1}'\gamma _{2j}' \right) , \end{aligned}$$

where the second equality follows from the fact that the \(\gamma _{2j-1}'\gamma _{2j}'\) mutually commute, and the third from \((\gamma _{2j-1}'\gamma _{2j}')^2 = -I\). Since the \(\gamma _\mu '\) anticommute, we can also write this as \(U_R = \bigwedge _{j=1}^n [\cos (\sigma _j)I + \sin (\sigma _j)\gamma _{2j-1}'\gamma _{2j}']\), from which we see that \(\omega (U_R; \theta )\) has the form

$$\begin{aligned} \omega (U_R; \theta ) = \prod _{j=1}^n \left( \cos (\sigma _j) + \sin (\sigma _j) \theta _{2j-1}'\theta _{2j}'\right) , \end{aligned}$$

with \(\theta '_\mu :=\sum _{\nu = 1}^{2n} Q'_{\mu \nu } \theta _\nu \). Thus, if \(\cos (\sigma _j) \ne 0\) for all \(j \in [n]\), we have

$$\begin{aligned} \omega (U_R;\theta )&= c \prod _{j=1}^n \left[ 1 + \tan (\sigma _j) \theta '_{2j-1}\theta '_{2j}\right] = c\prod _{j=1}^n \exp \left( \tan (\sigma _j)\theta '_{2j-1}\theta '_{2j} \right) \nonumber \\&=c\exp \left( \frac{1}{2}\theta ^{\textrm{T}} T_R \theta \right) , \end{aligned}$$
(85)

where \(c :=\prod _{j=1}^n \cos (\sigma _j)\) and \(T_R:=Q'^{\textrm{T}} \bigoplus _{j=1}^n \begin{pmatrix} 0 &{}\tan (\sigma _j) \\ -\tan (\sigma _j) &{}0\end{pmatrix}Q'\). If, on the other hand, \(\cos (\sigma _j) = 0\) for some j, say, for \(j \in J \subseteq [n]\), then we have

$$\begin{aligned} \omega (U_R; \theta ) = c_J \left[ \prod _{j \in J}\left( Q'\theta \right) _{2j-1}\left( Q'\theta \right) _{2j} \right] \exp \left( \frac{1}{2}\theta ^{\textrm{T}} T_{R,J} \theta \right) , \end{aligned}$$

where \(c_J :=\prod _{j \in J}\sin (\sigma _j)\prod _{j \in [n] {\setminus } J} \cos (\sigma _j)\) (for \(j \in J\), the corresponding factor of \(\omega (U_R;\theta )\) reduces to \(\sin (\sigma _j)\theta '_{2j-1}\theta '_{2j}\)) and \(T_{R,J}\) is the same as \(T_R\) except with \(\tan (\sigma _j)\) replaced by 0 for all \(j \in J\). Hence, by placing \(T_R\) or \(T_{R,J}\) in the appropriate block of the larger matrix \(M\) in Eq. (80), and some of the rows of \(Q'\) in the appropriate positions of the larger matrix \(B\) if \(J \ne \varnothing \), we can incorporate \(\omega (U_R; \theta )\) into a Grassmann integral of the form of \(g(B, M)\) (Eq. (80)). When applying Theorem 4 to \(U_Q\), we take \(A^{(i)} = U_Q = e^{i\alpha }U_R\) for some \(i\) if \(\det (Q) = 1\), and if \(\det (Q) = -1\) we take \(A^{(i)} = \gamma _1\) (then \(\omega (A^{(i)};\theta ^{(i)})\) is simply \(\theta ^{(i)}_1\)) and \(A^{(i+1)} = e^{i\alpha }U_R\).
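Eq. (85) can be checked numerically for a small system. The sketch below builds \(U_R = \prod _j [\cos (\sigma _j) I + \sin (\sigma _j)\gamma '_{2j-1}\gamma '_{2j}]\) for \(n = 2\), extracts its Grassmann representation from Hilbert-Schmidt inner products, and compares it with \(c\exp (\frac{1}{2}\theta ^{\textrm{T}} T_R\theta )\). It assumes a Jordan-Wigner representation of the Majorana operators, the ordinary matrix trace with \(\textrm{tr}(I) = 2^n\), and arbitrary values of \(\sigma _j\) and \(Q'\); all function and variable names are illustrative.

```python
import numpy as np
from functools import reduce
from itertools import combinations, product

def gmul(f, g):
    """Multiply two Grassmann elements given as {sorted index tuple: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(f.items(), g.items()):
        if set(a) & set(b):
            continue  # repeated generator: theta_mu^2 = 0
        merged = a + b
        inversions = sum(1 for i in range(len(merged))
                         for j in range(i + 1, len(merged))
                         if merged[i] > merged[j])
        key = tuple(sorted(merged))
        out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return out

def grassmann_representation(A, gammas, tol=1e-12):
    """Coefficients 2^{-n} tr(gamma_{mu_k} ... gamma_{mu_1} A), keyed by 1-based labels."""
    dim = A.shape[0]
    coeffs = {}
    for k in range(len(gammas) + 1):
        for subset in combinations(range(len(gammas)), k):
            rev = reduce(np.matmul, [gammas[mu] for mu in reversed(subset)], np.eye(dim))
            c = np.trace(rev @ A) / dim
            if abs(c) > tol:
                coeffs[tuple(mu + 1 for mu in subset)] = c
    return coeffs

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0 + 0j, -1.0])
I2 = np.eye(2, dtype=complex)
gammas = [np.kron(X, I2), np.kron(Y, I2), np.kron(Z, X), np.kron(Z, Y)]  # n = 2

rng = np.random.default_rng(2)
Q_prime, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # an arbitrary orthogonal Q'
sigma = np.array([0.3, -1.1])                        # arbitrary angles with cos(sigma_j) != 0
rotated = [sum(Q_prime[mu, nu] * gammas[nu] for nu in range(4)) for mu in range(4)]

U_R = np.eye(4, dtype=complex)
blocks = np.zeros((4, 4))
for j in range(2):
    bivector = rotated[2 * j] @ rotated[2 * j + 1]
    U_R = U_R @ (np.cos(sigma[j]) * np.eye(4) + np.sin(sigma[j]) * bivector)
    blocks[2 * j, 2 * j + 1] = np.tan(sigma[j])
    blocks[2 * j + 1, 2 * j] = -np.tan(sigma[j])
T_R = Q_prime.T @ blocks @ Q_prime
c = np.prod(np.cos(sigma))

# c * exp(theta^T T_R theta / 2), expanded as c * prod_{mu < nu} (1 + (T_R)_{mu nu} theta_mu theta_nu).
rhs = {(): c}
for mu in range(4):
    for nu in range(mu + 1, 4):
        rhs = gmul(rhs, {(): 1.0, (mu + 1, nu + 1): T_R[mu, nu]})
lhs = grassmann_representation(U_R, gammas)
max_diff = max(abs(lhs.get(k, 0) - rhs.get(k, 0)) for k in set(lhs) | set(rhs))
print(max_diff)
```

The printed maximum deviation should be at the level of floating-point error. The case \(J \ne \varnothing \) is not covered by this sketch, since \(\tan (\sigma _j)\) is undefined there.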

Therefore, when each \(A^{(i)}\) in \(\textrm{tr}(A^{(1)}\dots A^{(m)})\) belongs to one of the three example categories we have considered above (products of linear combinations of Majorana operators, Gaussian density operators, and Gaussian unitaries), we can write the RHS of Eq. (79) as a Grassmann integral over a product of some linear combinations of the Grassmann variables \(\chi _\mu \), as well as some exponentials of the form \(\exp (\frac{1}{2}\chi ^{\textrm{T}} M^{(j)} \chi )\) for some antisymmetric \(2N \times 2N\) matrices \(M^{(j)}\). Since these exponentials are even elements, we can commute them past any of the linear combinations of \(\chi _\mu \) to move them all to the right, and then combine them with the \(\exp (\frac{1}{2}\chi ^{\textrm{T}} S \chi )\) from Eq. (82) as \(\exp (\frac{1}{2}\chi ^{\textrm{T}} (\sum _j M^{(j)} + S)\chi )\). Then, setting \(M = \sum _j M^{(j)} +S\) and constructing the matrix \(B\) appropriately, we obtain a Grassmann integral \(g(B, M)\) of the form in Eq. (80). We work through an explicit example of this procedure in the following subsection.

It remains to show how to efficiently evaluate the Grassmann integral \(g(B, M)\) for arbitrary \(K \times 2N\) matrices \(B\) and antisymmetric \(2N \times 2N\) matrices \(M\). We present an algorithm for doing so as Algorithm 2, and prove its correctness in Appendix E2. The algorithm builds on ideas in Ref. [16], and extends Theorem A.15(d) in Ref. [18] (see Eq. (81)) to an efficient procedure for evaluating \(g(B, M)\) even when \(M\) is not invertible.

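[Algorithm 2 (figure b): evaluation of \(g(B, M)\) for an arbitrary \(K \times 2N\) matrix \(B\) and antisymmetric \(2N \times 2N\) matrix \(M\)]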

5.3.4 Worked example: Overlaps with pure Gaussian states via matchgate shadows

Having presented our general method, we now apply it to show how to use our matchgate shadows to estimate the overlaps between a pure state (accessed via a state preparation circuit) and arbitrary pure Gaussian states. As shown in Appendix A.2, we can evaluate any such overlap by evaluating the expectation value of \(| \phi \rangle \langle \textbf{0} |\) for a Gaussian state \(| \phi \rangle \) of even parity (\(P| \phi \rangle = | \phi \rangle \) where \(P = (-i)^n \gamma _{[2n]}\)). To illustrate the general applicability of the method, we will actually consider evaluating the expectation value of \(\widetilde{\gamma }_S| \phi \rangle \langle \textbf{0} |\) for some Majorana operators \(\{\widetilde{\gamma }_\mu \}_{\mu \in [2n]}\) and \(S \subseteq [2n]\) (see Footnote 16). This reduces to \(| \phi \rangle \langle \textbf{0} |\) in the special case where \(S = \varnothing \), so \(\widetilde{\gamma }_S = I\).

By Eq. (32), the expectation value of \(\widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} |\) with respect to a matchgate shadow sample \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\) is

$$\begin{aligned} \textrm{tr}\left( \widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} | \mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\right) = \sum _{\ell =0}^n {2n\atopwithdelims ()2\ell }{n\atopwithdelims ()\ell }^{-1} \textrm{tr}\left( \widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} | \mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q) \right) , \end{aligned}$$
(86)

so it suffices to efficiently evaluate the terms \(\textrm{tr}(\widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} | \mathcal {P}_{2\ell }(U_Q^\dagger | b \rangle \langle b |U_Q))\) for each \(\ell \in \{0,\dots , n\}\). We consider more generally \(\textrm{tr}(\widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} | \mathcal {P}_{2\ell }(\varrho ))\) for an arbitrary Gaussian density operator \(\varrho = \prod _{j=1}^n \frac{1}{2}(I - i\lambda _j \gamma '_{2j-1}\gamma '_{2j})\), with \(\lambda _j \in [-1,1]\) and \(\gamma _{\mu }' :=U_{Q}^\dagger \gamma _\mu U_{Q}\) for any \(Q \in \textrm{O}(2n)\). We use the same generating function trick as in Theorems 2 and 3, putting the variable z in front of the bivectors \(\gamma _{2j-1}'\gamma _{2j}'\) so that \(\textrm{tr}(\widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} | \mathcal {P}_{2\ell }(\varrho ))\) is the coefficient of \(z^\ell \) in the polynomial

$$\begin{aligned} p(z) :=\textrm{tr}\left( \widetilde{\gamma }_S | \phi \rangle \langle \textbf{0} | \prod _{j=1}^n \frac{1}{2}\left( I - iz \lambda _j \gamma _{2j-1}'\gamma _{2j}'\right) \right) . \end{aligned}$$

p(z) has degree at most n, so we can find its coefficients via polynomial interpolation if we are able to evaluate p(z) at arbitrary values of z. We can use the method described above to do this efficiently.
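As a minimal illustration of the interpolation step, the following Python sketch recovers the coefficients of a polynomial of degree at most n from n + 1 evaluations. Taking the evaluation points to be roots of unity, so that the coefficients follow from a discrete Fourier transform, is our assumption for numerical convenience; any n + 1 distinct points would do. The function name `poly_coeffs_from_evals` and the callable `p` (standing in for an evaluation of p(z) via Algorithm 2) are hypothetical.

```python
import numpy as np

def poly_coeffs_from_evals(p, degree):
    """Return [c_0, ..., c_degree] with p(z) = sum_l c_l z^l.

    `p` is any callable that evaluates the polynomial at a complex point
    (hypothetical stand-in for one run of Algorithm 2).
    """
    m = degree + 1
    # Evaluation points: the m-th roots of unity (an assumption of this sketch).
    nodes = np.exp(2j * np.pi * np.arange(m) / m)
    values = np.array([p(z) for z in nodes], dtype=complex)
    # values[k] = sum_l c_l * exp(2*pi*i*k*l/m), so the forward DFT
    # divided by m returns the coefficients.
    return np.fft.fft(values) / m
```

For example, `poly_coeffs_from_evals(lambda z: 1 + 2*z + 3*z**2, 2)` returns an array numerically close to `[1, 2, 3]`.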

First, we fix some notation. Let \(\widetilde{Q} \in \textrm{O}(2n)\) be the orthogonal matrix such that \(\widetilde{\gamma }_\mu = \sum _{\nu =1}^{2n} \widetilde{Q}_{\mu \nu }\gamma _\nu \), and let \(S = \{\mu _1,\dots , \mu _{|S|}\}\) with \(\mu _1< \dots < \mu _{|S|}\). We assume that \(| \phi \rangle \) is specified as \(| \phi \rangle = e^{i\alpha } U_R| \textbf{0} \rangle \), where \(e^{i\alpha }\) is a global phase, \(\det (U_R) = 1\), and \(R \in \textrm{SO}(2n)\) (we know \(\det (R) = 1\) because \(| \phi \rangle \) and \(| \textbf{0} \rangle \) have the same parity). For convenience, we denote \(\varrho (z) :=\prod _{j=1}^{n} \frac{1}{2}(I - iz\lambda _j \gamma _{2j-1}'\gamma _{2j}')\).

Hence, we want to evaluate \(p(z) = e^{i\alpha } \textrm{tr}(\widetilde{\gamma }_S U_R | \textbf{0} \rangle \langle \textbf{0} | \varrho (z))\). We start by writing down the Grassmann representations of \(\widetilde{\gamma }_S\), \(U_R\), \(| \textbf{0} \rangle \langle \textbf{0} |\), and \(\varrho (z)\), using Sect. 5.3.3. Next, we apply Theorem 4 from Sect. 5.3.2 to express p(z) as a Grassmann integral. Finally, we rewrite this integral in the form of Eq. (80), by assembling the matrices B and M. The integral can then be evaluated by putting B and M into Algorithm 2.

As shown in Sect. 5.3.3, the Grassmann representation of \(\widetilde{\gamma }_S = \widetilde{\gamma }_{\mu _1}\dots \widetilde{\gamma }_{\mu _{|S|}}\) in terms of \(\theta _1,\dots , \theta _{2n}\) is simply

$$\begin{aligned} \omega (\widetilde{\gamma }_S; \theta ) = (\widetilde{Q}\theta )_{\mu _1}\dots (\widetilde{Q}\theta )_{\mu _{|S|}} \end{aligned}$$

(recalling that \(\theta \) represents the vector \(\begin{pmatrix} \theta _1&\dots&\theta _{2n}\end{pmatrix}^{\textrm{T}}\)). We find the Hamiltonian that generates \(U_R\) (Eq. (84)), and block-diagonalise its antisymmetric coefficient matrix h as \(h = Q'^{\textrm{T}} \bigoplus _{j=1}^n \begin{pmatrix} 0&{}\sigma _j \\ -\sigma _j &{}0 \end{pmatrix} Q'\). Then, considering for simplicity the generic case where \(\cos (\sigma _j) \ne 0\) for all \(j\in [n]\), we have

$$\begin{aligned}{} & {} \omega (U_R;\theta ) = \Bigg (\prod _{j=1}^n \cos (\sigma _j)\Bigg ) \exp \left( \frac{1}{2}\theta ^{\textrm{T}} T_R\theta \right) , \qquad \text {where } \\{} & {} T_R:=Q'^{\textrm{T}} \bigoplus _{j=1}^n \begin{pmatrix} 0 &{}\tan (\sigma _j) \\ -\tan (\sigma _j) &{}0\end{pmatrix}Q' \end{aligned}$$

by Eq. (85). Using Eq. (83) (noting that it holds for arbitrary \(\lambda _j \in \mathbb {C}\)), we obtain

$$\begin{aligned} \omega (| \textbf{0} \rangle \langle \textbf{0} |; \theta ) = \frac{1}{2^n}\exp \left( -\frac{i}{2} \theta ^{\textrm{T}} C_{| \textbf{0} \rangle } \theta \right) , \qquad \omega (\varrho (z); \theta ) = \frac{1}{2^n}\exp \left( -\frac{i}{2} \theta ^{\textrm{T}} (zC_{\varrho }) \theta \right) \end{aligned}$$

for any z, where \(C_{| \textbf{0} \rangle }\) is the covariance matrix of the vacuum state \(| \textbf{0} \rangle \langle \textbf{0} |\) (see Eq. (12)) and \(C_{\varrho }\) is the covariance matrix of \(\varrho \) (see Eq. (13)).

Then, applying Theorem 4 with \(m = 4\), \(A^{(1)} = \widetilde{\gamma }_S\), \(A^{(2)} = U_R\), \(A^{(3)} = | \textbf{0} \rangle \langle \textbf{0} |\), and \(A^{(4)} = \varrho (z)\) yields

$$\begin{aligned} p(z)&= e^{i\alpha }\textrm{tr}\left( \widetilde{\gamma }_S U_R | \textbf{0} \rangle \langle \textbf{0} |\varrho (z)\right) \nonumber \\&= e^{i\alpha } 2^n \int D\theta ^{(1)}\,D\theta ^{(2)}\,D\theta ^{(3)}\,D\theta ^{(4)}\, (\widetilde{Q}\theta ^{(1)})_{\mu _1}\dots (\widetilde{Q}\theta ^{(1)})_{\mu _{|S|}}\nonumber \\&\quad \times \Bigg (\prod _{j=1}^n \cos (\sigma _j)\Bigg ) \exp \left( \frac{1}{2}\theta ^{(2)T} T_R\theta ^{(2)}\right) \nonumber \\&\quad \times \frac{1}{2^n} \exp \left( -\frac{i}{2}\theta ^{(3)T} C_{| \textbf{0} \rangle }\theta ^{(3)}\right) \frac{1}{2^n} \exp \left( -\frac{i}{2}\theta ^{(4)T} (zC_{\varrho })\theta ^{(4)} \right) \nonumber \\&\quad \times \exp \left( \theta ^{(1)T}\theta ^{(2)} -\theta ^{(1)T}\theta ^{(3)} +\theta ^{(1)T}\theta ^{(4)} + \theta ^{(2) T}\theta ^{(3)} - \theta ^{(2)T}\theta ^{(4)} + \theta ^{(3)T}\theta ^{(4)}\right) . \end{aligned}$$
(87)

Since the arguments of the exponentials are all even elements of \(\mathcal {G}_{8n}\), they mutually commute, so we can rewrite the product of exponentials as a single exponential.

Now, we relabel \(\theta _1^{(1)},\dots , \theta ^{(1)}_{2n}, \dots , \theta _1^{(4)}, \dots , \theta _{2n}^{(4)}\) as \(\chi _1, \dots , \chi _{8n}\), so

$$\begin{aligned} \chi \equiv \begin{pmatrix} \theta ^{(1)} \\ \theta ^{(2)} \\ \theta ^{(3)} \\ \theta ^{(4)}\end{pmatrix}. \end{aligned}$$

Let \(\widetilde{Q}\big |_{S*}\) denote the \(|S| \times 2n\) matrix formed by restricting \(\widetilde{Q}\) to rows in S, and define the \(|S| \times 8n\) matrix B by

$$\begin{aligned} B = \begin{pmatrix} \widetilde{Q}\big |_{S*}&0&0&0 \end{pmatrix} \end{aligned}$$

(where each 0 denotes an \(|S| \times 2n\) block). B is constructed such that the kth entry of \(B\chi \) is \((\widetilde{Q}\theta ^{(1)})_{\mu _k}\). Also define the \(8n \times 8n\) matrix M(z) by

$$\begin{aligned} M(z) = \begin{pmatrix} 0 &{}\mathbb {1} &{}-\mathbb {1} &{}\mathbb {1} \\ -\mathbb {1} &{}T_R &{}\mathbb {1} &{}-\mathbb {1} \\ \mathbb {1} &{} -\mathbb {1} &{}-iC_{| \textbf{0} \rangle } &{}\mathbb {1} \\ -\mathbb {1} &{}\mathbb {1} &{}-\mathbb {1} &{}-izC_{\varrho } \end{pmatrix} \end{aligned}$$
(88)

(where \(\mathbb {1}\) denotes the \(2n\times 2n\) identity matrix). M(z) is constructed such that the product of exponentials in Eq. (87) is equal to \(\exp (\frac{1}{2}\chi ^{\textrm{T}} M(z)\chi )\). Therefore,

$$\begin{aligned} p(z) = e^{i\alpha } \frac{1}{2^n}\Bigg (\prod _{j=1}^n \cos (\sigma _j) \Bigg )\int D\chi \, (B\chi )_{1} \dots (B\chi )_{|S|} \exp \left( \frac{1}{2}\chi ^{\textrm{T}} M(z)\chi \right) , \end{aligned}$$

and the Grassmann integral is equal to g(BM(z)) in the notation of Eq. (80). This can be evaluated for any fixed value of z in at most \(\mathcal {O}(n^4)\) time using Algorithm 2.
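The assembly of B and M(z) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the function names `assemble_M` and `assemble_B` are hypothetical, the inputs \(T_R\), \(C_{| \textbf{0} \rangle }\), \(C_\varrho \) and \(\widetilde{Q}\) are assumed to be available as \(2n \times 2n\) arrays, and S is passed as a collection of 1-indexed labels as in the text. The evaluation of g(BM(z)) itself (Algorithm 2) is not reproduced here.

```python
import numpy as np

def assemble_M(T_R, C_vac, C_rho, z):
    """Build the 8n x 8n block matrix M(z) of Eq. (88) from 2n x 2n blocks."""
    d = T_R.shape[0]
    one = np.eye(d)
    zero = np.zeros((d, d))
    return np.block([
        [zero,  one, -one,              one],
        [-one,  T_R,  one,             -one],
        [one,  -one, -1j * C_vac,       one],
        [-one,  one, -one, -1j * z * C_rho],
    ])

def assemble_B(Q_tilde, S):
    """Build the |S| x 8n matrix B = (Q_tilde restricted to rows in S, 0, 0, 0)."""
    rows = np.array([mu - 1 for mu in sorted(S)], dtype=int)  # 1-indexed -> 0-indexed
    d = Q_tilde.shape[1]
    return np.hstack([Q_tilde[rows, :], np.zeros((len(rows), 3 * d))])
```

Since \(T_R\), \(C_{| \textbf{0} \rangle }\) and \(C_\varrho \) are antisymmetric, the assembled M(z) is antisymmetric for every z, which gives a quick sanity check on the block signs.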

To evaluate the expectation value of \(\widetilde{\gamma }_S| \phi \rangle \langle \textbf{0} |\) with respect to any classical shadow sample \(\mathcal {M}^{-1}(U_Q^\dagger | b \rangle \langle b |U_Q)\) (taking \(S =\varnothing \) if we wish to estimate the overlap with \(| \phi \rangle \)), we substitute \(C_{U_Q^\dagger | b \rangle \langle b |U_Q} = Q^{\textrm{T}} C_{| b \rangle }Q\) for \(C_\varrho \) in Eq. (88), and find the coefficients of p(z) via polynomial interpolation, evaluating p(z) at \(n + 1\) values of z using Algorithm 2. We then substitute the coefficients into Eq. (86).
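The final substitution into Eq. (86) is a weighted sum of the interpolated coefficients. A minimal sketch, assuming the coefficients \(c_0,\dots ,c_n\) of p(z) have already been obtained (the function names are hypothetical):

```python
from math import comb

def shadow_weights(n):
    """Weights binom(2n, 2l) / binom(n, l) for l = 0..n, as in Eq. (86)."""
    return [comb(2 * n, 2 * l) / comb(n, l) for l in range(n + 1)]

def estimate_from_coeffs(coeffs):
    """Combine the coefficients [c_0, ..., c_n] of p(z) according to Eq. (86)."""
    n = len(coeffs) - 1
    return sum(w * c for w, c in zip(shadow_weights(n), coeffs))
```

For n = 2 the weights are 1, 3, 1, so the coefficient of \(z^1\) is amplified by a factor of 3 relative to the others.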

6 Variance Bounds

In Sects. 5.1 and 5.2, we presented efficient methods for extracting unbiased estimates of expectation values of fermionic Gaussian density operators and of overlaps with Slater determinants from our matchgate shadows. In this section, we analyse the variances of these estimates. We leave investigating the variance of the more general classes of observables considered in Sect. 5.3 to future work.

6.1 Variance for expectation values of Gaussian density operators

We begin in this subsection by proving Eq. (40), which is an efficiently computable upper bound on the variance for estimating the expectation value of any Gaussian density operator \(\varrho \). We then show that this bound scales at most polynomially with the number of fermionic modes n (using a straightforward analysis to bound it loosely by \(\mathcal {O}(n^3)\)). We provide a more careful analysis in Appendix F to demonstrate that the variance is \(\mathcal {O}(\sqrt{n}\log n)\).

Proof of Eq. (40)

Let \(\varrho \) be the density operator of any Gaussian state. By Eq. (10), \(\varrho = \prod _{j =1}^n \frac{1}{2}(I - i\lambda _j \widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j})\) for some \(\lambda _j \in [-1,1]\) and Majorana operators \(\widetilde{\gamma }_\mu \). We apply Eq. (36) with this set of Majorana operators (recall we are free to choose the Majorana basis in Eq. (36)) to \(\varrho \), yielding

$$\begin{aligned} \textrm{Var}[\hat{o}]\Big |_{O = \varrho } \le \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}}\alpha _{\ell _1,\ell _2,\ell _3} \sum _{\begin{array}{c} A_1, A_2, A_3 \subseteq [2n] \text { disjoint} \\ |A_1| = 2\ell _1, |A_2|=2\ell _2,|A_3|=2\ell _3 \end{array}} \big |\textrm{tr}(\varrho \widetilde{\gamma }_{A_2\cup A_3})\textrm{tr}(\varrho \widetilde{\gamma }_{A_3 \cup A_1})\big |, \end{aligned}$$

with \(\alpha _{\ell _1,\ell _2,\ell _3}\) given in Eq. (34). We can write \(\varrho \) as

$$\begin{aligned} \varrho = \frac{1}{2^n}\sum _{T \subseteq [n]} (-i)^{|T|} \lambda _T \widetilde{\gamma }_{\text {pairs}(T)} \end{aligned}$$

where \(\lambda _T :=\prod _{j \in T} \lambda _j\) and \(\text {pairs}(T) :=\bigcup _{j \in T} \{2j-1,2j\}\). Thus, we see that \(\textrm{tr}(\varrho \widetilde{\gamma }_{A_2 \cup A_3})\textrm{tr}(\varrho \widetilde{\gamma }_{A_3 \cup A_1})\) is nonzero only if \(A_2 \cup A_3 = \text {pairs}(T_1')\) and \(A_3 \cup A_1 = \text {pairs}(T_2')\) for some \(T_1',T_2' \subseteq [n]\). For mutually disjoint \(A_1, A_2, A_3\), this condition is equivalent to \(A_1 = \text {pairs}(T_1)\), \(A_2 = \text {pairs}(T_2)\), and \(A_3 = \text {pairs}(T_3)\) for some mutually disjoint subsets \(T_1,T_2, T_3 \subseteq [n]\), in which case \(|\textrm{tr}(\varrho \widetilde{\gamma }_{A_2 \cup A_3})\textrm{tr}(\varrho \widetilde{\gamma }_{A_3 \cup A_1})| = |2^{-n}\lambda _{T_2 \cup T_3}\textrm{tr}(\widetilde{\gamma }_{A_2\cup A_3}^2) \cdot 2^{-n} \lambda _{T_3 \cup T_1}\textrm{tr}(\widetilde{\gamma }_{A_3 \cup A_1}^2)| = |\lambda _{T_2\cup T_3}\lambda _{T_3 \cup T_1}|\). Hence, using Eq. (34) and \(|\lambda _T| \le 1\) for any \(T \subseteq [n]\),

$$\begin{aligned} \textrm{Var}[\hat{o}]\Big |_{O = \varrho }&\le \frac{1}{2^{2n}}\sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} \alpha _{\ell _1,\ell _2,\ell _3} \sum _{\begin{array}{c} T_1,T_2,T_3 \subseteq [n] \text { disjoint} \\ |T_1| = \ell _1, |T_2| = \ell _2, |T_3| = \ell _3 \end{array}} |\lambda _{T_2 \cup T_3}\lambda _{T_3 \cup T_1}| \\&\le \frac{1}{2^{2n}}\sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3\ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} \frac{{n\atopwithdelims ()\ell _1,\ell _2,\ell _3, n-\ell _1-\ell _2-\ell _3}}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3, 2(n-\ell _1-\ell _2-\ell _3)}}\frac{{2n\atopwithdelims ()2(\ell _1 + \ell _3)}}{{n\atopwithdelims ()\ell _1 + \ell _3}} \frac{{2n\atopwithdelims ()2(\ell _2 + \ell _3)}}{{n\atopwithdelims ()\ell _2 + \ell _3}} \\&\quad \times \sum _{\begin{array}{c} T_1,T_2,T_3 \subseteq [n] \text { disjoint} \\ |T_1| = \ell _1, |T_2| = \ell _2, |T_3| = \ell _3 \end{array}}1 \\&= \frac{1}{2^{2n}}\sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3\ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} \frac{{n\atopwithdelims ()\ell _1,\ell _2,\ell _3, n-\ell _1-\ell _2-\ell _3}^2}{{2n\atopwithdelims ()2\ell _1,2\ell _2,2\ell _3, 2(n-\ell _1-\ell _2-\ell _3)}}\frac{{2n\atopwithdelims ()2(\ell _1 + \ell _3)}}{{n\atopwithdelims ()\ell _1 + \ell _3}} \frac{{2n\atopwithdelims ()2(\ell _2 + \ell _3)}}{{n\atopwithdelims ()\ell _2 + \ell _3}}. \end{aligned}$$

\(\square \)
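The final expression in the proof above is a triple sum over \(\ell _1, \ell _2, \ell _3\) and is therefore straightforward to evaluate. The following Python sketch does so in exact rational arithmetic; the function names are hypothetical, and this direct \(\mathcal {O}(n^3)\)-term summation is only an illustration, not necessarily how the plotted values were produced.

```python
from fractions import Fraction
from math import comb, factorial

def multinomial(total, parts):
    """total! / (parts[0]! * parts[1]! * ...), assuming sum(parts) == total."""
    out = factorial(total)
    for p in parts:
        out //= factorial(p)
    return out

def gaussian_variance_bound(n):
    """Evaluate the last line of the proof of Eq. (40) exactly."""
    def ratio(k):
        # binom(2n, 2k) / binom(n, k)
        return Fraction(comb(2 * n, 2 * k), comb(n, k))

    total = Fraction(0)
    for l1 in range(n + 1):
        for l2 in range(n + 1 - l1):
            for l3 in range(n + 1 - l1 - l2):
                parts = (l1, l2, l3, n - l1 - l2 - l3)
                num = multinomial(n, parts) ** 2
                den = multinomial(2 * n, tuple(2 * p for p in parts))
                total += Fraction(num, den) * ratio(l1 + l3) * ratio(l2 + l3)
    return total / 4 ** n
```

For instance, the sum evaluates to 1 for n = 1 and to 3/2 for n = 2.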

We further analyse the bound in Eq. (40) using Stirling’s approximation—more precisely, we consider the bounds

$$\begin{aligned} \sqrt{2\pi }\sqrt{k} \left( \frac{k}{e}\right) ^k < k! \le e\sqrt{k}\left( \frac{k}{e}\right) ^k, \end{aligned}$$
(89)

which hold for all \(k \in \mathbb {Z}_{>0}\) [22]. By writing the multinomial coefficients in terms of factorials then applying Eq. (89), it is straightforward to show that

$$\begin{aligned} \frac{{2n\atopwithdelims ()2k}}{{n\atopwithdelims ()k}} \le 2^{nH_{\textrm{b}}(k/n)}, \end{aligned}$$
(90)

for any \(k \in \{0,\dots , n\}\), where \(H_{\textrm{b}}\) denotes the binary entropy function \(H_{\textrm{b}}(x) = -x\log _2 x -(1-x) \log _2(1-x)\), and that for integers \(k_1 + k_2 + k_3 + k_4 = n\),

$$\begin{aligned} \frac{{n\atopwithdelims ()k_1,k_2,k_3,k_4}^2}{{2n\atopwithdelims ()2k_1,2k_2,2k_3,2k_4}} \le \left\{ \begin{array}{ll} \sqrt{\frac{n}{k_1k_2k_3k_4}} &{} \quad k_1,k_2,k_3,k_4> 0\\ \sqrt{\frac{n}{k_1k_2k_3}} &{} \quad k_1,k_2,k_3> 0, k_4 = 0 \\ \sqrt{\frac{n}{k_1k_2}} &{} \quad k_1,k_2>0, k_3= k_4 = 0 \\ 1 &{} \quad k_1 > 0, k_2=k_3=k_4 = 0 \end{array}\right. \end{aligned}$$
(91)

(where all other cases follow by symmetry between \(k_1, k_2, k_3, k_4\)). We can obtain a loose bound by upper-bounding \({2n\atopwithdelims ()2k}{n\atopwithdelims ()k}^{-1}\) by \(2^n\) for all k, using \(\max _x H_\textrm{b}(x) = 1\), and noting that the RHS of Eq. (91) is upper bounded by a constant in every case, so \({n\atopwithdelims ()k_1,k_2,k_3,k_4}^2 {2n\atopwithdelims ()2k_1,2k_2,2k_3,2k_4}^{-1} \le c\) for some constant c. Thus,

$$\begin{aligned} \textrm{Var}[\hat{o}]\Big |_{O = \varrho } \le \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0\\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} c (2^n)^2 = \mathcal {O}(n^3). \end{aligned}$$

In Appendix F, we tighten this to \(\mathcal {O}(\sqrt{n}\log n)\), using Eqs. (90) and (91). In Fig. 1, the bound in Eq. (40) is plotted in red, while the dashed black line is \(y = \sqrt{n}\ln n\).
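As a sanity check on Eq. (90), note that \(2^{nH_{\textrm{b}}(k/n)} = n^n / (k^k (n-k)^{n-k})\), so the inequality can be tested in exact integer arithmetic without evaluating logarithms. The sketch below is an illustrative numerical check for small n, not a substitute for the Stirling argument; the function name is hypothetical.

```python
from math import comb

def eq90_holds(n, k):
    """Check binom(2n,2k)/binom(n,k) <= 2^{n H_b(k/n)} exactly.

    Both sides are multiplied by binom(n,k) * k^k * (n-k)^(n-k) so that only
    integers appear. Python evaluates 0**0 as 1, which matches the convention
    H_b(0) = H_b(1) = 0.
    """
    lhs = comb(2 * n, 2 * k) * k ** k * (n - k) ** (n - k)
    rhs = comb(n, k) * n ** n
    return lhs <= rhs
```

At the endpoints k = 0 and k = n both sides equal 1, so the bound is tight there.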

6.2 Variance for overlaps with Slater determinants

In this subsection, we prove Eqs. (43) and (44), which constitute an efficiently computable upper bound on the variance for estimating the expectation value of \(| \varphi \rangle \langle \textbf{0} |\), for any Slater determinant \(| \varphi \rangle \) with an even number \(\zeta \) of fermions. As explained in Appendix A, the ability to estimate these expectation values allows us to estimate the overlaps between a pure state and arbitrary Slater determinants (with any number of fermions).

Note that for \(\zeta = 0\), we have \(| \varphi \rangle \langle \textbf{0} | = | \textbf{0} \rangle \langle \textbf{0} |\), which is a Gaussian density operator, and indeed Eqs. (43) and (44) reduce in this case to our variance bound for Gaussian density operators, Eq. (40). The key observation in the proof of Eq. (40), given in the previous subsection, is that for a Gaussian density operator of the form \(\varrho = \prod _{j=1}^n \frac{1}{2}(I -i\lambda _j \widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j})\), the terms \(\textrm{tr}(\varrho \widetilde{\gamma }_{A_2 \cup A_3})\textrm{tr}(\varrho \widetilde{\gamma }_{A_3 \cup A_1})\) (in the inner sum of Eq. (36)) are nonzero only for certain combinations of disjoint subsets \(A_1,A_2, A_3 \subseteq [2n]\). It was then straightforward to see that the number of such combinations is at most \({n \atopwithdelims ()\ell _1,\ell _2,\ell _3, n -\ell _1-\ell _2-\ell _3}\). We use a similar approach to prove Eqs. (43) and (44), expanding \(| \varphi \rangle \langle \textbf{0} |\) in a particular basis of Majorana operators, then counting the number of combinations \((A_1,A_2,A_3)\) that contribute to the sum in Eq. (36). For \(\zeta > 0\), this requires a more involved combinatorial argument.

Proof of Eqs. (43) and (44)

Let \(| \varphi \rangle = \widetilde{a}_1^\dagger \dots \widetilde{a}_\zeta ^\dagger | \textbf{0} \rangle \) be any \(\zeta \)-fermion Slater determinant with \(\zeta \) even, and let \(\{\widetilde{\gamma }_\mu \}_{\mu \in [2n]}\) be the Majorana operators corresponding to \(\{\widetilde{a}_j\}_{j \in [n]}\) (via Eq. (1)). Then, \(\widetilde{\gamma }_{2j-1}| \textbf{0} \rangle = (\widetilde{a}_j + \widetilde{a}_j^\dagger )| \textbf{0} \rangle = \widetilde{a}_j^\dagger | \textbf{0} \rangle \), which together with the anticommutation relations implies that \(\widetilde{a}_1^\dagger \dots \widetilde{a}_\zeta ^\dagger | \textbf{0} \rangle = \widetilde{\gamma }_1\widetilde{\gamma }_3 \dots \widetilde{\gamma }_{2\zeta - 1}| \textbf{0} \rangle \). Hence, defining \(S_\zeta :=\{1,3,\dots , 2\zeta - 1\}\),

$$\begin{aligned} | \varphi \rangle \langle \textbf{0} | = \widetilde{\gamma }_1\widetilde{\gamma }_3 \dots \widetilde{\gamma }_{2\zeta - 1}| \textbf{0} \rangle \langle \textbf{0} | = \widetilde{\gamma }_{S_\zeta } | \textbf{0} \rangle \langle \textbf{0} |. \end{aligned}$$
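The identity \(\widetilde{a}_1^\dagger \dots \widetilde{a}_\zeta ^\dagger | \textbf{0} \rangle = \widetilde{\gamma }_{S_\zeta }| \textbf{0} \rangle \) can be checked numerically for small systems. The sketch below is illustrative only and rests on assumptions not fixed by the text above: it uses an explicit Jordan-Wigner representation, takes \(\widetilde{a}_j = a_j\), and assumes the convention \(\gamma _{2j} = -i(a_j - a_j^\dagger )\), so that \(a_j^\dagger = (\gamma _{2j-1} - i\gamma _{2j})/2\). All function names are hypothetical.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def majoranas(n):
    """Jordan-Wigner Majorana operators; list index 2j holds gamma_{2j+1}."""
    gammas = []
    for j in range(n):
        prefix, suffix = [Z] * j, [I2] * (n - j - 1)
        gammas.append(reduce(np.kron, prefix + [X] + suffix))
        gammas.append(reduce(np.kron, prefix + [Y] + suffix))
    return gammas

def slater_identity_holds(n, zeta):
    """Compare a_1^dag ... a_zeta^dag |0> with gamma_1 gamma_3 ... gamma_{2 zeta - 1} |0>."""
    g = majoranas(n)
    vac = np.zeros(2 ** n, dtype=complex)
    vac[0] = 1.0  # |0...0> is the vacuum in this representation
    lhs, rhs = vac.copy(), vac.copy()
    for j in reversed(range(zeta)):  # rightmost operator acts first
        a_dag = (g[2 * j] - 1j * g[2 * j + 1]) / 2
        rhs = a_dag @ rhs
        lhs = g[2 * j] @ lhs
    return bool(np.allclose(lhs, rhs) and np.isclose(np.linalg.norm(lhs), 1.0))
```

Calling `slater_identity_holds(3, 2)` compares the two states on three modes with two fermions.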

Note that since the basis transformation from \(a_j\) to \(\widetilde{a}_j\) (Eq. (14)) is number-conserving, we can also express \(| \textbf{0} \rangle \langle \textbf{0} |\) in terms of \(\widetilde{\gamma }_{\mu }\) as \(| \textbf{0} \rangle \langle \textbf{0} | = \prod _{j=1}^n \frac{1}{2}(I - i\widetilde{\gamma }_{2j-1}\widetilde{\gamma }_{2j})\). Applying Eq. (36) with \(\widetilde{\gamma }_\mu \) as the Majorana basis to \(| \varphi \rangle \langle \textbf{0} |\), we have

$$\begin{aligned}&\textrm{Var}[\hat{o}]\Big |_{O = | \varphi \rangle \langle \textbf{0} |} \le \frac{1}{2^{2n}}\sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} \alpha _{\ell _1,\ell _2,\ell _3} \sum _{\begin{array}{c} A_1,A_2, A_3 \subseteq [2n] \text { disjoint} \\ |A_1| = 2\ell _1, |A_2| = 2\ell _2, |A_3| = 2\ell _3 \end{array}}\\&\big |\textrm{tr}\left( | \varphi \rangle \langle \textbf{0} | \widetilde{\gamma }_{A_2 \cup A_3}\right) \textrm{tr}\left( | \varphi \rangle \langle \textbf{0} |\widetilde{\gamma }_{A_3 \cup A_1}\right) \big |, \end{aligned}$$

with

$$\begin{aligned} \big |\textrm{tr}\left( | \varphi \rangle \langle \textbf{0} | \widetilde{\gamma }_{A_2 \cup A_3}\right) \textrm{tr}\left( | \varphi \rangle \langle \textbf{0} |\widetilde{\gamma }_{A_3 \cup A_1}\right) \big |&= \big |\textrm{tr}\left( | \textbf{0} \rangle \langle \textbf{0} | \widetilde{\gamma }_{A_2 \cup A_3} \widetilde{\gamma }_{S_\zeta } \right) \textrm{tr}\left( | \textbf{0} \rangle \langle \textbf{0} |\widetilde{\gamma }_{A_3 \cup A_1}\widetilde{\gamma }_{S_\zeta }\right) \big | \\&= \big |\textrm{tr}\left( | \textbf{0} \rangle \langle \textbf{0} | \widetilde{\gamma }_{(A_2 \cup A_3) \triangle S_\zeta }\right) \textrm{tr}\left( | \textbf{0} \rangle \langle \textbf{0} |\widetilde{\gamma }_{(A_3 \cup A_1) \triangle S_\zeta }\right) \big |, \end{aligned}$$

where \(A \triangle B\) denotes the symmetric difference of the sets A and B. Writing

$$\begin{aligned} | \textbf{0} \rangle \langle \textbf{0} | = \frac{1}{2^n} \sum _{T \subseteq [n]} (-i)^{|T|}\, \widetilde{\gamma }_{\text {pairs}(T)}, \end{aligned}$$

where \(\text {pairs}(T) :=\bigcup _{j \in T} \{2j-1,2j\}\), we see that \(|\textrm{tr}\left( | \varphi \rangle \langle \textbf{0} | \widetilde{\gamma }_{A_2 \cup A_3}\right) \textrm{tr}\left( | \varphi \rangle \langle \textbf{0} |\widetilde{\gamma }_{A_3 \cup A_1}\right) |\) is nonzero only if \((A_2 \cup A_3) \triangle S_\zeta = \text {pairs}(T_1)\) and \((A_3 \cup A_1)\triangle S_\zeta = \text {pairs}(T_2)\) for some \(T_1, T_2 \subseteq [n]\), in which case it takes on the value 1. Thus,

$$\begin{aligned} \textrm{Var}[\hat{o}] \Big |_{O = | \varphi \rangle \langle \textbf{0} |} \le \frac{1}{2^{2n}} \sum _{\begin{array}{c} \ell _1,\ell _2,\ell _3 \ge 0 \\ \ell _1 + \ell _2 + \ell _3 \le n \end{array}} \alpha _{\ell _1,\ell _2,\ell _3} \, \kappa (n, \zeta , \ell _1,\ell _2,\ell _3), \end{aligned}$$

where \(\kappa (n,\zeta ,\ell _1,\ell _2,\ell _3)\) is the number of tuples \((A_1,A_2, A_3)\) such that \(A_1, A_2, A_3\) are mutually disjoint subsets of [2n] of cardinalities \(2\ell _1\), \(2\ell _2\), and \(2\ell _3\), respectively, and \((A_2 \cup A_3) \triangle S_\zeta = \text {pairs}(T_1)\) and \((A_3 \cup A_1)\triangle S_\zeta = \text {pairs}(T_2)\) for some \(T_1, T_2 \subseteq [n]\). This last condition means that for each \(j \in [n]\), \(2j-1 \in (A_2 \cup A_3) \triangle S_\zeta \) if and only if \(2j \in (A_2\cup A_3) \triangle S_\zeta \), and likewise for \((A_3 \cup A_1) \triangle S_\zeta \).

For mutually disjoint \(A_1, A_2, A_3\), \((A_2 \cup A_3) \triangle S_\zeta = A_2 \triangle (A_3 \triangle S_\zeta )\) and \((A_3 \cup A_1) \triangle S_\zeta = A_1 \triangle (A_3 \triangle S_\zeta )\). For convenience, let \(A_3' :=A_3 \triangle S_\zeta \), and for each \(i \in \{1,2,3\}\) and \(j \in [n]\), we write

$$\begin{aligned} A_i^{(j)} :=\left\{ \begin{array}{ll} 00 \qquad &{}\text {if } 2j - 1, 2j \not \in A_i \\ 01 \qquad &{}\text {if } 2j-1 \not \in A_i \text { and } 2j \in A_i \\ 10 \qquad &{}\text {if } 2j-1 \in A_i \text { and } 2j\not \in A_i \\ 11 \qquad &{}\text {if } 2j-1,2j \in A_i, \end{array}\right. \end{aligned}$$

and define \(A_3'^{(j)}\) analogously. The above condition then translates to \(A_2^{(j)} \oplus A_3'^{(j)} = 00\) or 11, and \(A_1^{(j)} \oplus A_3'^{(j)} = 00\) or 11, for every \(j \in [n]\), where \(\oplus \) denotes the bitwise XOR. Equivalently,

$$\begin{aligned} A_1^{(j)} = A_3'^{(j)} \oplus 00 \text { or } A_3'^{(j)} \oplus 11, \qquad A_2^{(j)} = A_3'^{(j)} \oplus 00 \text { or } A_3'^{(j)} \oplus 11. \end{aligned}$$

We assume that \(A_1,A_2,A_3\) satisfy this condition and are mutually disjoint, and consider the different possibilities. For \(j \in \{1,\dots , \zeta \}\),

  • if \(A_3^{(j)} = 00\), then \(A_3'^{(j)} = 10\), so either \(A_1^{(j)} = 01\) and \(A_2^{(j)} = 10\), or \(A_1^{(j)} = 10\) and \(A_2^{(j)} = 01\) (note that, e.g., \(A_1^{(j)}\) and \(A_2^{(j)}\) cannot both be 01, since this would mean that 2j is in both \(A_1\) and \(A_2\), contradicting the assumption that they are disjoint);

  • if \(A_3^{(j)} = 01\), then \(A_3'^{(j)} = 11\), so \(A_1^{(j)} = 00\) and \(A_2^{(j)} = 00\) (note that we cannot have \(A_1^{(j)} = 11\) since then \(A_1\) and \(A_3\) would not be disjoint, and likewise for \(A_2\));

  • if \(A_3^{(j)} = 10\), then \(A_3'^{(j)} = 00\), so \(A_1^{(j)} = 00\) and \(A_2^{(j)} = 00\);

  • if \(A_3^{(j)} = 11\), then \(A_3'^{(j)} = 01\), and there are no possibilities for \(A_1\) and \(A_2\) that satisfy the disjointness assumption.

For \(j \in \{\zeta + 1,\dots , n\}\), we have \(A_3'^{(j)} = A_3^{(j)}\), so

  • if \(A_3^{(j)} = 00\), then \(A_1^{(j)} = 00\) and \(A_2^{(j)} = 00\), or \(A_1^{(j)} = 00\) and \(A_2^{(j)} = 11\), or \(A_1^{(j)} = 11\) and \(A_2^{(j)} = 00\);

  • if \(A_3^{(j)} = 01\) or \(A_3^{(j)} = 10\), there are no possibilities for \(A_1\) and \(A_2\);

  • if \(A_3^{(j)} = 11\), then \(A_1^{(j)} = 00\) and \(A_2^{(j)} = 00\).

Now, let us count the number of such combinations for which \(|A_i| = 2\ell _i\) for \(i \in \{1,2,3\}\). We imagine building up the sets \(A_1,A_2, A_3\), starting with empty sets and then adding to them by choosing one of the viable cases above for each \(j \in [n]\). Note that for \(j \in \{\zeta + 1,\dots , n\}\), we either add 0 or 2 elements to \(A_3\) (choosing \(A_3^{(j)} = 00\) adds no elements, whereas choosing \(A_3^{(j)} = 11\) adds two). Hence, since we ultimately want an even number of elements in \(A_3\), the number of elements we add to \(A_3\) from the \(j \in \{1,\dots , \zeta \}\) cases must be even. Let this number be 2k. Then, k can be any integer between 0 and \(\ell _3\). For each \(j \in \{1,\dots , \zeta \}\), there are two ways to add 0 elements to \(A_3\) (\(A_3^{(j)} = 00\), then \(A_1^{(j)} = 01\) and \(A_2^{(j)} = 10\) or vice versa), and both of these ways add 1 element each to \(A_1\) and \(A_2\); there are also two ways to add 1 element to \(A_3\) (\(A_3^{(j)} = 01\) or \(A_3^{(j)} = 10\)), and both of these ways add 0 elements to \(A_1\) and \(A_2\). Thus, for a fixed k, there are \({\zeta \atopwithdelims ()2k} 2^{2k} 2^{\zeta - 2k} = {\zeta \atopwithdelims ()2k} 2^\zeta \) ways to add 2k elements to \(A_3\) from the \(j \in \{1,\dots ,\zeta \}\) cases, and all of these ways add \(\zeta - 2k\) elements to each of \(A_1\) and \(A_2\). Thus, it remains to add \(2\ell _3 - 2k\) elements to \(A_3\), \(2\ell _1 - \zeta + 2k\) elements to \(A_1\), and \(2 \ell _2 - \zeta +2k\) elements to \(A_2\) from the \(j \in \{\zeta + 1,\dots , n\}\) cases. For each \(j \in \{\zeta + 1, \dots , n\}\), we can add no elements to \(A_1,A_2,A_3\), or 2 elements to \(A_1\) and no elements to \(A_2, A_3\), or 2 elements to \(A_2\) and no elements to \(A_1,A_3\), or 2 elements to \(A_3\) and no elements to \(A_1, A_2\). Thus, there are \({n - \zeta \atopwithdelims ()(2\ell _1 - \zeta + 2k)/2, \, (2\ell _2 - \zeta + 2k)/2, (2\ell _3 - 2k)/2}'\) ways to add the remaining elements to \(A_1, A_2, A_3\) (see Eq. (53)). Summing over the possible values of k gives

$$\begin{aligned} \kappa (n,\zeta ,\ell _1,\ell _2,\ell _3) = \sum _{k = 0}^{\ell _3} {\zeta \atopwithdelims ()2k} 2^{\zeta } {n-\zeta \atopwithdelims ()\ell _1 - \zeta /2 + k,\, \ell _2 -\zeta /2+k, \, \ell _3 -k}'. \end{aligned}$$

This is the same as Eq. (44) since \({\zeta \atopwithdelims ()2k} = 0\) if \(2k > \zeta \). \(\square \)
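The counting argument can be cross-checked by brute force for small n. The sketch below enumerates all assignments of the elements of [2n] to \(A_1\), \(A_2\), \(A_3\) or none of them, and compares the resulting counts with the closed-form sum over k. It assumes that the primed multinomial coefficient of Eq. (53) is the ordinary multinomial coefficient (with the remaining entries forming a fourth part), set to zero whenever any part is negative; this definition is not reproduced above, so it is an assumption of the sketch. Elements and modes are 0-indexed in the code, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb, factorial

def multinomial_or_zero(total, parts):
    """Assumed form of the primed multinomial: zero if any part is negative."""
    rest = total - sum(parts)
    if rest < 0 or any(p < 0 for p in parts):
        return 0
    out = factorial(total)
    for p in (*parts, rest):
        out //= factorial(p)
    return out

def kappa_formula(n, zeta, l1, l2, l3):
    """Closed-form count: sum over k of binom(zeta, 2k) 2^zeta multinomial'."""
    assert zeta % 2 == 0 and zeta <= n
    h = zeta // 2
    return sum(
        comb(zeta, 2 * k) * 2 ** zeta
        * multinomial_or_zero(n - zeta, (l1 - h + k, l2 - h + k, l3 - k))
        for k in range(l3 + 1)
    )

def is_union_of_pairs(s, n):
    """True if s contains 2j exactly when it contains 2j+1, for every mode j."""
    return all((2 * j in s) == (2 * j + 1 in s) for j in range(n))

def kappa_brute_table(n, zeta):
    """Map (l1, l2, l3) to the number of valid disjoint triples, by enumeration."""
    S = {2 * j for j in range(zeta)}  # 0-indexed version of S_zeta
    table = Counter()
    for labels in product(range(4), repeat=2 * n):  # label 0 means "in no set"
        A1, A2, A3 = ({i for i, lab in enumerate(labels) if lab == t} for t in (1, 2, 3))
        if any(len(A) % 2 for A in (A1, A2, A3)):
            continue
        if is_union_of_pairs((A2 | A3) ^ S, n) and is_union_of_pairs((A3 | A1) ^ S, n):
            table[(len(A1) // 2, len(A2) // 2, len(A3) // 2)] += 1
    return table
```

The enumeration visits \(4^{2n}\) assignments, so it is only practical for very small n; its purpose is to confirm the case analysis rather than to compute the bound.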

The bound given by Eqs. (43) and (44) is computable in \(\text {poly}(n)\) time, and we plot it for values of n up to 1000 and various values of \(\zeta \le n\) in Fig. 1. The plot strongly suggests that the bound for every \(\zeta \) scales sublinearly for all n. We know that the bound for \(\zeta = 0\) is \(\mathcal {O}(\sqrt{n}\log n)\) from Appendix F, and from the plot, the bounds for \(\zeta > 0\) seem to all be lower than that for \(\zeta = 0\).

7 Conclusion

In this paper, we investigated the classical shadows obtained via two different ensembles of random matchgate circuits, one continuous and one discrete. In Theorem 1, we analysed the first three moments of the uniform distribution over all matchgate circuits (\(\textrm{M}_n\)) and of the uniform distribution over Clifford matchgate circuits (\(\textrm{M}_n\cap \textrm{Cl}_n\)), and found that they match, establishing that Clifford matchgate circuits form a “matchgate 3-design.” We then used these results to derive expressions for the classical measurement channel corresponding to these distributions (Eq. (30)) and its pseudo-inverse (Eq. (32)), and to establish bounds on the variance of our matchgate shadows estimator for the expectation value of an arbitrary observable (Eq. (36)). Importantly, the 3-design property allowed us to easily bound the variance of estimators arising from the discrete ensemble for general classes of fermionic observables. (This is reminiscent of how Ref. [1] uses the result that the Clifford group forms a unitary 3-design [4] as a key step in analysing their Clifford classical shadows).

We then developed techniques to efficiently extract various kinds of information about a quantum state of interest from its matchgate shadow, and placed bounds on the variance of the associated estimators in some cases. For local fermionic operators, we showed that our matchgate shadows straightforwardly lead to efficient measurement schemes, matching the performance of prior work (Ref. [7]) and generalising it to handle local operators in arbitrary single-particle bases. We then demonstrated that our matchgate shadows can efficiently estimate not only local observables, but quantities like \(\langle \varrho \rangle \), where \(\varrho \) is the density operator of an arbitrary fermionic Gaussian state. For \(\varrho \) pure, this gives the fidelity between an arbitrary unknown quantum state and the fermionic Gaussian state. We showed that these estimates can be obtained from the matchgate shadows in cubic time, and bounded the variance in terms of the system size n as \(\mathcal {O}(\sqrt{n} \log n)\). This provides an interesting contrast with classical shadows derived from the Clifford group, where one is forced to choose between being able to efficiently (in terms of sample complexity) treat local qubit observables (using random single-qubit Clifford circuits) and being able to efficiently treat global properties such as fidelities and the expectation values of low-rank operators (using random n-qubit Clifford circuits), but not both simultaneously [1]. Our results show that by randomising over a certain strict subset of the n-qubit Clifford group (that is, \(\textrm{M}_n \cap \textrm{Cl}_n\)), we can efficiently handle both local fermionic observables and global properties.

One of the original motivations of our work was the need to efficiently estimate quantities of the form \(\langle \psi | \varphi \rangle \), where \(| \psi \rangle \) is an arbitrary pure state (accessed via a state preparation circuit) and \(| \varphi \rangle \) is an arbitrary Slater determinant. These overlaps are required in, for instance, auxiliary-field quantum Monte Carlo (AFQMC) methods, and a protocol for estimating them using Clifford shadows was recently implemented as a core subroutine in the quantum-classical hybrid AFQMC algorithm of Ref. [2]. However, this protocol involved a classical post-processing step whose complexity scales exponentially with the system size n. Our matchgate shadows approach (explicitly described in Algorithm 1) to this problem removes this exponential bottleneck, enabling us to process each sample in \(\mathcal {O}(n^4)\) time. We also bounded the variance of the estimates (and hence the number of samples needed) by an expression that we can efficiently evaluate. This bound is plotted in Fig. 1 for values of n up to 1000, showing that the variance is reasonable at these values, with a growth rate that suggests sublinear scaling.

In addition, we constructed a more general framework for efficiently evaluating the expectation values of arbitrary products of certain commonly encountered fermionic operators with respect to matchgate shadows. We applied this framework to generalise our overlap estimation procedure to arbitrary fermionic Gaussian states, and we expect it will be a useful tool in the development of further applications of our matchgate shadows in the future.

Before concluding, we highlight some open questions raised by our results.

We were able to improve the naive quartic-time algorithm for classically evaluating the expectation value of \(\varrho \) with respect to a matchgate shadow sample (see Sect. 3.3.2 and Appendix D), by exploiting the structure of the Pfaffians of certain linear matrix functions. Can a similar improvement be achieved for our overlap estimation algorithms? Due to the large number of overlap computations in QC-AFQMC, even shaving off this one factor of n would be valuable for this application. More generally, we did not delve into the optimisations that may be available for our methods for evaluating other quantities, and these may manifest themselves when the methods are applied to concrete use cases.

As for the quantum sample complexity, we were able to provide closed-form expressions for bounds on the variance of our matchgate shadow estimators in some cases, and efficiently computable bounds in others. Is it possible to provide a simpler or more intuitive characterisation of the variance for arbitrary operators than our Eqs. (35) and (36)? How would our approach perform when used to estimate quantities that are nonlinear in the density operator of the unknown state?

More fundamentally, we proved that \(\textrm{M}_n \cap \textrm{Cl}_n\) is a 3-design for \(\textrm{M}_n\). Is there an analogous result for the subgroup \(\textrm{M}_n^*\) of \(\textrm{M}_n\) consisting only of fermionic parity-conserving Gaussian unitaries, i.e., \(U_Q\) such that \(Q \in \textrm{SO}(2n)\)? We know that the k-th moment of the (Haar-)uniform distribution over \(\textrm{M}_n^*\) differs from that over \(\textrm{M}_n\), for any \(k \ge 1\). To see this, observe that for any \(Q \in \textrm{SO}(2n)\), \(\mathcal {U}_Q(\gamma _{[2n]}) = \gamma _{[2n]}\) (recall \(\gamma _{[2n]}\) is proportional to the parity operator), so the 1-fold twirl for \(\textrm{M}_n^*\) maps \(\gamma _{[2n]}\) to itself, whereas the 1-fold twirl for \(\textrm{M}_n\) maps \(\gamma _{[2n]}\) to 0 by Theorem 1(i). Thus, the 1-fold twirls are different, which implies that the k-fold twirls are different for all k.
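The observation about \(\gamma _{[2n]}\) can be seen from a standard property of the Clifford algebra, which we sketch here under the assumption that \(\mathcal {U}_Q\) maps each \(\gamma _\mu \) to a linear combination of Majorana operators with coefficients given by the entries of Q. Since the transformed operators still mutually anticommute, their ordered product picks up only the determinant of Q:

$$\begin{aligned} \mathcal {U}_Q(\gamma _{[2n]}) = \det (Q)\, \gamma _{[2n]} \qquad \text {for all } Q \in \textrm{O}(2n). \end{aligned}$$

Hence \(\gamma _{[2n]}\) is fixed whenever \(\det (Q) = 1\), whereas the contributions with \(\det (Q) = 1\) and \(\det (Q) = -1\) cancel when averaging uniformly over all of \(\textrm{O}(2n)\).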

Finally, for Clifford shadows, Ref. [5] drew on the connection between classical shadows and randomised benchmarking to design a noise-robust classical shadow protocol. Can we make use of this connection and the prior work of Ref. [6] on the randomised benchmarking of matchgates to design a similarly robust version of our matchgate shadows?

We leave these explorations to future work.