Estimating Quantum Mutual Information Through a Quantum Neural Network

We propose a quantum machine learning method called quantum mutual information neural estimation (QMINE) for estimating the von Neumann entropy and quantum mutual information, which are fundamental quantities in quantum information theory. QMINE employs quantum neural networks (QNNs) to minimize a loss function that determines the von Neumann entropy, and hence the quantum mutual information; QNNs are believed to be more powerful for processing quantum datasets than conventional neural networks because they exploit quantum superposition and entanglement. To construct a precise loss function, we propose a quantum Donsker-Varadhan representation (QDVR), a quantum analog of the classical Donsker-Varadhan representation. By exploiting the parameter shift rule on parameterized quantum circuits, we can efficiently implement and optimize the QNN and estimate the quantum entropies using the QMINE technique. Furthermore, numerical observations support our predictions for QDVR and demonstrate the good performance of QMINE.


I. INTRODUCTION
The concept of quantum mutual information (QMI) in quantum information theory quantifies the amount of information shared between two quantum systems. This extends the classical notion of mutual information to the quantum regime [1][2][3]. This information measure is fundamental because it determines the quantum correlation, or entanglement, between two quantum systems. The information obtained from quantum mutual information can be applied to various fields of quantum information processing such as quantum computation, quantum cryptography, and quantum communication [2,3] (particularly in quantum channel capacity problems [4,5]). Quantum mutual information also plays a crucial role in quantum machine learning [6,7], where it measures the information shared between different representations of quantum datasets. Moreover, this information can be used to enhance the efficiency and effectiveness of quantum algorithms in processing quantum data.
Quantum mutual information is expressed as a sum of von Neumann entropies, denoted by S(ρ) = −Tr(ρ ln ρ) for a quantum state ρ, making the determination of the von Neumann entropy [8] essential for calculating quantum mutual information. In recent years, the estimation of the von Neumann entropy has garnered significant attention in the field of quantum information theory. Various methods have been proposed to estimate the von Neumann entropy, including those exploiting quantum state tomography [9], Monte Carlo sampling [10], and entanglement entropy [11][12][13][14][15][16][17][18]. Several studies [16][17][18] have utilized quantum query models for entropy estimation and have demonstrated promising quantum speedups. Specifically, Wang et al. [16] showed that the von Neumann entropy can be estimated with accuracy ε using O(r²/ε²) queries. However, these query-model-based algorithms have practical limitations because a quantum circuit that generates the quantum state must be prepared, and the effectiveness of constructing a quantum query model for the input state remains an open question [15]. Thus, we focus on estimating the von Neumann entropy of an unknown quantum state using only identical copies of the state. To the best of our knowledge, no existing quantum algorithm estimates the von Neumann entropy using O(poly(r), poly(1/ε)) copies of the quantum state, where r represents the rank of the state.
Mutual information neural estimation (MINE) is a novel technique that utilizes neural networks to calculate the classical mutual information between two random variables. More precisely, this method optimizes a neural network to estimate mutual information by minimizing a loss function. The loss function is based on the Donsker-Varadhan representation [19], which provides a lower bound for the well-known Kullback-Leibler (KL) divergence.
Quantum neural networks (QNNs) [20,21], which are among the most powerful quantum machine learning methods, serve as quantum counterparts to conventional neural networks and offer several advantages. One notable advantage is the ability to use a quantum state as an input, which is particularly advantageous when calculating quantum mutual information or the von Neumann entropy. We identified two types of QNNs in the literature [20,22] that possess a neural network structure and leverage quantum advantages, accompanied by well-defined training procedures. In this study, we employed a parameterized quantum circuit [22], which is known for its quantum advantages, despite the presence of the barren plateau problem, which requires further investigation [23].
As a quantum analog of MINE, we propose quantum mutual information neural estimation (QMINE), a method for determining the von Neumann entropy and quantum mutual information through quantum neural network techniques. Similar to the classical case, QMINE uses a quantum neural network to minimize a loss function that evaluates the von Neumann entropy. However, we acknowledge that further investigation is required owing to the challenging and well-known barren plateau problem, as well as the need for efficient quantum training methods.
The remainder of this paper is organized as follows. In Sec. II, we briefly introduce the basic notions of quantum mutual information, MINE, and parametrized quantum circuits. In Sec. III, we generalize the Donsker-Varadhan representation to the QDVR, which is the main component of QMINE. In Sec. IV, we propose a method for estimating the von Neumann entropy, and thus the quantum mutual information, using quantum neural networks. Sec. V presents numerical simulations under the framework of QMINE. Finally, a discussion and remarks are presented in Sec. VI, where open questions and possibilities are raised for future research.
Note on concurrent work. The independent and concurrent work [24] appeared on the arXiv a few days after our preprint was uploaded. It introduced a method for estimating the von Neumann entropy reminiscent of ours, together with the Rényi entropy, measured relative (Rényi) entropy, and fidelity. Our work focuses on estimating the von Neumann entropy with low copy complexity. We reduce the domain in the variational formula, whereas Ref. [24] does not; we believe that limiting the trace and rank in the variational formula is crucial for effective estimation.

II. PRELIMINARIES

A. Quantum Mutual Information and von Neumann Entropy

Quantum mutual information, also known as von Neumann mutual information, quantifies the relationship between two quantum states. It can be calculated using the formula (see Fig. 1): I(A : B) = S(ρ_A) + S(ρ_B) − S(ρ_AB). Here, S(ρ) represents the von Neumann entropy [8] of a quantum state ρ in a d-dimensional Hilbert space, given by S(ρ) = −Tr(ρ log ρ). Therefore, estimating the von Neumann entropy enables the estimation of the quantum mutual information. The von Neumann entropy, which is an extension of the Shannon entropy [25] to the quantum domain, can be estimated using quantum circuits and measurements. It is defined as the entropy of the density matrix associated with a quantum state, where the density matrix is a positive semi-definite matrix of unit trace that represents the state. To estimate the von Neumann entropy, measurements can be performed on multiple copies of the quantum state, and the outcomes of these measurements can be utilized. The most straightforward approach is to estimate the density matrix directly and calculate the entropy from its definition. However, estimating the von Neumann entropy can be challenging, particularly for large quantum systems, owing to the difficulty of accurately estimating the density matrix. Furthermore, the estimation accuracy is influenced by the number of measurements conducted and the quality of the quantum hardware employed. Ongoing research is therefore focused on developing more efficient and precise methods for estimating the von Neumann entropy.
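As a concrete illustration of the direct approach described above, the following numpy sketch (a classical stand-in for the quantum procedure; the function name and test states are illustrative) computes S(ρ) from an explicit density matrix via its eigenvalues.

```python
import numpy as np

def von_neumann_entropy(rho: np.ndarray) -> float:
    """Von Neumann entropy S(rho) = -Tr(rho ln rho), in nats."""
    # Eigenvalues of a density matrix are real and non-negative.
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > 1e-12]  # convention: 0 log 0 = 0
    return float(-np.sum(eigvals * np.log(eigvals)))

# Maximally mixed qubit: S = ln 2
rho_mixed = np.eye(2) / 2
print(von_neumann_entropy(rho_mixed))  # ≈ 0.6931

# Pure state |0><0|: S = 0
rho_pure = np.array([[1.0, 0.0], [0.0, 0.0]])
print(von_neumann_entropy(rho_pure))  # ≈ 0.0
```

This direct route presumes full knowledge of the density matrix, which is exactly what tomography makes expensive for large systems.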
Several methods have been employed to estimate the von Neumann entropy, particularly those utilizing the quantum query model [15,17,18]. In the quantum query model, if a quantum circuit U produces the quantum state ρ, the algorithm utilizes unitary gates such as U, U†, and CU (controlled-U). However, the quantum circuit must be known in order to use the query model. The effectiveness of constructing a quantum query model for a given input state remains uncertain [15], prompting us to explore von Neumann entropy estimation without relying on the quantum query model. In the absence of a query model, our approach exploits only identical copies of the quantum state. Previous studies, such as Acharya et al. [13], employed O(d²) copies of the quantum state ρ, where d denotes the dimension, whereas Wang et al. [15] used O(1/(ε⁵λ²)) copies of ρ, where λ is a lower bound on the nonzero eigenvalues. To the best of our knowledge, no existing algorithm provides a high-accuracy estimation of the von Neumann entropy using only O(poly(r)) copies of ρ with rank r.

B. Mutual Information Neural Estimator
The mutual information neural estimator (MINE) [26] is a method for estimating the mutual information of two random variables using neural networks. This approach selects functions T_θ : X × Y → R parameterized by neural networks with parameter θ ∈ Θ. Given n samples, we denote the empirical joint and product probability distributions by p^(n)_XY and p^(n)_X ⊗ p^(n)_Y, respectively. The MINE strategy is given by

I(X : Y)_n = sup_{θ∈Θ} ( E_{p^(n)_XY}[T_θ] − log E_{p^(n)_X ⊗ p^(n)_Y}[e^{T_θ}] ),

where E denotes the expected value. Additionally, the Donsker-Varadhan representation states that, for any probability distribution functions p and q,

D_KL(p||q) = sup_T ( E_p[T] − log E_q[e^T] ),

where the supremum is taken over all functions T. Using the Donsker-Varadhan representation [27], it can be proven that I(X : Y) ≥ I(X : Y)_n and that MINE is strongly consistent, meaning that there exist a positive integer N and a choice of neural network parameters θ ∈ Θ such that for all n ≥ N, |I(X : Y) − I(X : Y)_n| ≤ ε. By applying gradient descent to the neural network T_θ to maximize the objective above, we can obtain I(X : Y)_n and estimate the mutual information I(X : Y).
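The Donsker-Varadhan representation can be checked numerically for discrete distributions, where the supremum is attained at T* = log(p/q). The following numpy sketch is purely illustrative (MINE replaces this exact optimization with a neural network T_θ; the alphabet size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two discrete distributions on a 4-letter alphabet.
p = rng.random(4); p /= p.sum()
q = rng.random(4); q /= q.sum()

kl = float(np.sum(p * np.log(p / q)))  # exact KL divergence

def dv_objective(T):
    """Donsker-Varadhan lower bound: E_p[T] - log E_q[e^T]."""
    return float(np.sum(p * T) - np.log(np.sum(q * np.exp(T))))

# Any T gives a lower bound on D_KL(p||q) ...
T_random = rng.normal(size=4)
assert dv_objective(T_random) <= kl + 1e-12

# ... and the optimizer T* = log(p/q) attains it.
T_star = np.log(p / q)
assert abs(dv_objective(T_star) - kl) < 1e-12
```

MINE's consistency rests on exactly this structure: as T_θ approaches log(p/q), the bound tightens to the true divergence.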
The MINE technique has found applications in various areas of artificial intelligence, such as feature selection, representation learning, and unsupervised learning, using information-theoretic methods. Compared to previous approaches, it provides more accurate and robust estimates of mutual information, leading to significant advancements in the field of artificial intelligence (AI). It is important to recognize that MINE is a relatively new and rapidly evolving field, with ongoing research focused on enhancing and broadening its capabilities. Nonetheless, the MINE technique is widely regarded as a valuable tool in AI and information theory, offering a powerful and flexible approach for estimating the mutual information between variables.

C. Parametrized Quantum Circuits
Parameterized quantum circuits (PQCs) [22] are quantum circuits that incorporate adjustable parameters, typically represented as real numbers. These parameters can be fine-tuned to control the behavior of the quantum circuit, thereby providing increased flexibility and optimization potential. Parameterized quantum circuits have extensive applications in quantum machine learning and optimization algorithms, enabling computations that are challenging or even infeasible using classical methods. The key concept is to employ a parameterized quantum circuit as a feature extractor or waveform generator, followed by classical optimization algorithms that iteratively adjust the circuit parameters to minimize the objective function.
By manipulating circuit parameters, one can efficiently learn and represent quantum systems in a compact and adaptable manner. In quantum optimization, parameterized quantum circuits play a crucial role in global minimum search. By encoding the objective function into the circuit parameters, quantum effects such as quantum parallelism and quantum tunneling can be harnessed. The objective function is represented as a measurement outcome of the quantum circuit, and the circuit can exploit superposition and entanglement to explore the search space more effectively than classical optimization algorithms.
One of the core techniques used in quantum optimization procedures for parameterized quantum circuits is the parameter shift rule [28]. The parameter shift rule is a powerful tool in quantum machine learning that enables efficient computation of gradients with respect to the parameters of a quantum circuit.
The fundamental concept behind the parameter shift rule is to employ a quantum circuit with adjustable parameters to perform the measurements. By utilizing the measurement outcomes, it is possible to estimate the gradient of a cost function with respect to the circuit parameters. This rule capitalizes on the notion that small variations in the parameters of a quantum circuit can be used to calculate the derivative of the cost function with respect to these parameters.
The underlying principle involves preparing identical copies of a quantum state and evaluating the circuit at slightly shifted parameter values. By comparing the measurement outcomes of these shifted evaluations, it is possible to estimate the gradient. Importantly, this method computes gradients using only executions of the original circuit at shifted parameters, obviating the need for ancillary circuits or additional constructions; by collecting multiple samples via measurement, the gradient can be estimated.
If a parameterized quantum circuit is represented as a sequence of unitary gates, it is denoted as U(θ) = U_L(θ_L) ⋯ U_2(θ_2)U_1(θ_1). The output of the circuit can then be observed using an observable Ô, and the measurement outcome becomes a quantum circuit function, expressed in simplified form as f(x; θ) = ⟨0|U†(θ) Ô U(θ)|0⟩. The gradient of the quantum circuit function can then be calculated using the parameter shift rule: for gates generated by Pauli operators, ∂f/∂θ_i = [f(x; θ_i + π/2) − f(x; θ_i − π/2)]/2. The parameter shift rule has been successfully employed in various quantum machine learning algorithms, including quantum neural networks [20,22] and quantum support vector machines [29,30], for optimization and training purposes. It is regarded as a valuable tool for developing efficient quantum machine learning algorithms, as it enables the efficient computation of gradients in quantum systems, which is often a challenging task. It is important to note that, for such gates, the parameter shift rule yields the exact analytic gradient; in practice, its accuracy is limited by measurement shot noise and depends on the choice of cost function and the specific quantum circuit. Nevertheless, it has proven to be a useful and efficient technique in the emerging field of quantum machine learning, and our ongoing research focuses on enhancing and expanding its potential capabilities.
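As a minimal sketch of the parameter shift rule, consider a single-qubit circuit f(θ) = ⟨0|RY(θ)† Z RY(θ)|0⟩, simulated here as a statevector toy model in numpy (not hardware; the example gate and angle are illustrative). For this Pauli rotation the shifted-evaluation formula reproduces the analytic derivative exactly:

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)

def ry(theta):
    """Single-qubit rotation RY(theta) = exp(-i theta Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def f(theta):
    """Circuit function f(theta) = <0| RY(theta)^dag Z RY(theta) |0> = cos(theta)."""
    psi = ry(theta) @ np.array([1, 0], dtype=complex)
    return float(np.real(psi.conj() @ Z @ psi))

def parameter_shift_grad(theta):
    """df/dtheta = [f(theta + pi/2) - f(theta - pi/2)] / 2."""
    return (f(theta + np.pi / 2) - f(theta - np.pi / 2)) / 2

theta = 0.7
print(parameter_shift_grad(theta))  # matches the analytic value -sin(theta)
print(-np.sin(theta))
```

On a real device each f(θ ± π/2) would itself be a sample mean over measurement shots, which is where the statistical error enters.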

III. QUANTUM DONSKER-VARADHAN REPRESENTATION
The quantum Donsker-Varadhan representation is a mathematical framework that enables quantum neural networks to estimate the von Neumann entropy. It is a quantum counterpart of the original Donsker-Varadhan representation, with the distinction that QDVR focuses solely on the quantum entropy rather than on the relative entropy. QDVR can be considered a modified version of the Gibbs variational principle [31], which restricts the domain to density matrices.
As mentioned previously, MINE [26] exploits the original Donsker-Varadhan representation to estimate classical mutual information using a (classical) neural network. In the context of estimating quantum mutual information, it is natural to consider a quantum version of the Donsker-Varadhan representation. Notably, we need only estimate the von Neumann entropies S(ρ_A), S(ρ_B), and S(ρ_AB) to determine the quantum mutual information I(A : B). A variational formula for the von Neumann entropy exists as follows:

Theorem 1 (Gibbs Variational Principle [31]). Let f : H^{d×d} → R be the function defined on d-dimensional Hermitian matrices T by

f(T) = −Tr(ρT) + log Tr(e^T),   (5)

where ρ is a density matrix. Then, the von Neumann entropy is given by S(ρ) = inf_T f(T), where the infimum is taken over all Hermitian matrices T.
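Theorem 1 can be verified numerically on a small example. The numpy sketch below (an illustrative check, not part of the original work; dimensions and seed are arbitrary) confirms that f(T) ≥ S(ρ) for a random Hermitian T and that the infimum is attained at T = log ρ:

```python
import numpy as np

rng = np.random.default_rng(1)

def herm_fn(M, fn):
    """Apply a scalar function to a Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.conj().T

# Random full-rank 3x3 density matrix rho = A A^dag / Tr(A A^dag).
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = A @ A.conj().T
rho /= np.trace(rho).real

def f(T):
    """Gibbs functional f(T) = -Tr(rho T) + log Tr(e^T) for Hermitian T."""
    return float(-np.trace(rho @ T).real
                 + np.log(np.trace(herm_fn(T, np.exp)).real))

S = float(-np.trace(rho @ herm_fn(rho, np.log)).real)  # S(rho)

# f upper-bounds the entropy for every Hermitian T ...
B = rng.normal(size=(3, 3))
assert f((B + B.T) / 2) >= S - 1e-10

# ... and the infimum is attained at T* = log(rho).
assert abs(f(herm_fn(rho, np.log)) - S) < 1e-10
```

The optimizer T* = log ρ is exactly what motivates restricting the search domain: near the optimum, T carries the same rank structure as ρ.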
Our objective is to determine the Hermitian matrix T that minimizes f(T).
We parameterize T using real numbers t_i ∈ R and vectors |ψ_i⟩ as T = Σ_i t_i |ψ_i⟩⟨ψ_i|. To compute f(T), we must measure the quantum state ρ in the basis {|ψ_i⟩}_{i=1}^d. Achieving this with an error smaller than ε requires O(σ²/ε²) samples of ρ, where σ² := Var({t_i}). However, the number of required samples of ρ can become substantial because of the broad domain of T, which encompasses all Hermitian matrices. Therefore, reducing the size of this domain is imperative.
Lemma 2. For all Hermitian matrices T and any constant c, the function f satisfies f(T) = f(T + cI).

Hence, we only need to search the space of positive Hermitian matrices to find the optimal T. The number of copies of ρ required to calculate f(T) depends on T; to reduce this complexity, we must also specify and limit the trace of T.

Lemma 3. Let ρ be an r-rank density matrix on n qubits. Then there exists a positive Hermitian matrix T_0 with rank r satisfying Tr(T_0) ≤ 2rn + r log(1/ε) such that f(T_0) ≤ S(ρ) + ε.
Proof.See the details of the proof in Appendix VII A.
Proposition 2 (Quantum Donsker-Varadhan Representation). Let g be the function defined on d-dimensional density matrices T by

g(T) = −c Tr(ρT) + log Tr(e^{cT}),

where ρ is an r-rank density matrix and c ≥ 2rn + r log(1/ε). Then,

inf_T g(T) − ε ≤ S(ρ) ≤ inf_T g(T),

where the infimum is taken over density matrices T of rank at most r. According to the quantum Donsker-Varadhan representation in Proposition 2, we only need to search within the space of density matrices. Calculating g(T) with an error of ε requires O(c²/ε²) copies of ρ. Next, we plan to determine the optimal density matrix T that minimizes g(T). In the next section, we will use quantum neural networks to determine the optimal T.

IV. VON NEUMANN ENTROPY ESTIMATION WITH QUANTUM NEURAL NETWORKS
We now explain the estimation of the von Neumann entropy using quantum neural networks, focusing on parameterized quantum circuits as an example. Our approach is inspired by the work of Liu et al. [32], who utilized variational autoregressive networks and quantum circuits to address problems in quantum statistical mechanics. To achieve this, we assign specific values to the variables in T by taking a set of real numbers {t_i | t_i ∈ R} and complex vectors |ψ_i⟩ ∈ C^d. Assuming that ρ has rank r, we define T = Σ_{i=1}^r t_i |ψ_i⟩⟨ψ_i|. Consequently, the function g(T) becomes

g(T) = −c Σ_{i=1}^r t_i ⟨ψ_i|ρ|ψ_i⟩ + log(d − r + Σ_{i=1}^r e^{c t_i}).

We can introduce a unitary operator U that transforms |ψ_i⟩ into |i⟩ and represent this unitary operator using a set of parameters θ as U(θ). By considering U(θ) as a quantum neural network and ρ as its input, we obtain the network output by computing U(θ)ρU†(θ); the quantity ⟨ψ_i|ρ|ψ_i⟩ is then the probability of outcome |i⟩ when this output is measured in the computational basis. To accurately calculate g(T) with an error smaller than ε, it is necessary to measure the output of the quantum neural network O(Var({c t_i})/ε²) times. Our objective is to optimize the parameters to determine the infimum of g(T). For example, let us consider a parameterized quantum circuit [22] with Pauli gates as the quantum neural network.
By applying the parameter shift rule [28], we can compute the gradients of g with respect to both the circuit parameters θ_i and the weight parameters φ_j. To satisfy the conditions t_i ≥ 0 and Σ_{i=1}^r t_i = 1, we choose t_i = (Π_{j=1}^{i−1} sin²φ_j) cos²φ_i. We can then apply gradient descent to φ_j and θ_i to optimize the quantum circuit. Calculating the gradient requires O(c²/ε² × (number of parameters in the QNN)) copies of ρ. Therefore, to obtain inf g(T) and estimate S(ρ) with an error smaller than ε, we require O(c²/ε² × n_params × n_train) copies of ρ. Analytic gradient measurements for convex loss functions require O(n³/ε²) copies of ρ to converge to a solution within O(ε) of the optimum [33]. In general, situations that involve parameterized quantum circuits may have nonconvex loss functions, but many algorithms still utilize parameterized quantum circuits and achieve quantum speedups. We anticipate that quantum speedup can be achieved by employing parameterized quantum circuits with analytic gradient measurements in QMINE and estimating the von Neumann entropy using O(poly(r)) copies of ρ. In future research, we will investigate the relationships between n_train, n_params, and the performance of this approach. The key point is to transform the quantum mutual information estimation problem into a quantum neural network problem.
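Putting the pieces together, the following numpy sketch mimics the QMINE optimization classically under simplifying assumptions: ρ is diagonal with a known eigenbasis, so the basis of T is fixed and only the weights t_i (parameterized on the simplex) are trained by gradient descent. The values of c, the learning rate, and the iteration count are illustrative choices, not those of the paper.

```python
import numpy as np

# Toy setting: n = 2 qubits (d = 4), rank-2 diagonal rho with
# eigenvalues p = (0.7, 0.3); the basis of T is fixed to the
# eigenbasis, so only the weights t_i are trained.
n, d, r = 2, 4, 2
p = np.array([0.7, 0.3])
S_true = float(-np.sum(p * np.log(p)))   # exact S(rho), ~0.611 nats

eps = 0.01
c = 2 * r * n + r * np.log(1 / eps)      # QDVR constant, ~17.2

def g(phi):
    """QDVR loss with t_1 = cos^2(phi), t_2 = sin^2(phi), so sum(t) = 1."""
    t = np.array([np.cos(phi) ** 2, np.sin(phi) ** 2])
    return float(-c * np.sum(t * p)
                 + np.log((d - r) + np.sum(np.exp(c * t))))

# Plain gradient descent with a central-difference gradient
# (a classical stand-in for the parameter shift rule on hardware).
phi, lr, h = np.pi / 4, 0.002, 1e-5
for _ in range(1000):
    grad = (g(phi + h) - g(phi - h)) / (2 * h)
    phi -= lr * grad

print(g(phi), S_true)  # the minimized loss approximates S(rho)
```

The minimized loss stays above S(ρ) by construction and closes to within the QDVR error budget, mirroring the behavior observed in the simulations of Sec. V.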

V. NUMERICAL SIMULATIONS
We demonstrate the performance of QMINE in estimating the quantum mutual information of random density matrices through numerical simulations of a quantum circuit. Our goal is to show that QMINE can estimate quantum mutual information with low error. We also analyze the effects of the rank and the number of trainable parameters, and conduct simulations that support the results on QDVR.

A. Rank Analysis
Based on QDVR, we established that if the rank of the density matrix ρ is r, then setting the rank of the parameter matrix T to r is sufficient. Thus, we aim to determine the optimal T that estimates the von Neumann entropy. To investigate the effect of rank, we experimented with the rank of T by letting r = rank(ρ) and k = rank(T). In this analysis, we simulated the scenario with N = 5, D = 30, r = 8, and c ≤ 80, where N is the number of qubits, D is the circuit depth, r is the rank of the density matrix, and c is calculated using QDVR (details are provided in Appendix VII B). Figure 2 shows that when k ≥ r, the result of QMINE converges to the correct value, whereas when k < r, it converges with a high error. This phenomenon was also observed in other cases. These results support the QDVR claim that the rank of the optimal solution T is r. Because convergence is faster when k = r than when k > r, it is best to run QMINE with k = r.

B. Number of Trainable Parameters on Quantum Circuit Analysis
We analyzed the performance of QMINE by varying the depth of the quantum circuit. In our simulations, we used N = 5, r = k = 8, and c ≤ 80 while varying the circuit depth D. The experimental results confirmed that increasing the circuit depth and the number of parameters improves the estimation accuracy of QMINE up to a point. Fig. 3 illustrates the results: a circuit depth of 20 achieved the best performance, converging rapidly with a low error, whereas a depth of 30 converged at a slower rate despite a similar error. These findings emphasize the importance of choosing an appropriate circuit depth (i.e., number of parameters) in QMINE. The copy complexity is determined by the number of parameters (n_params) and the number of training iterations (n_train). Therefore, when applying QMINE in various situations, it is crucial to select the correct circuit depth. We plan to investigate this aspect in future studies.
FIG. 3: The green line in the graph, representing a circuit depth of 20 with 400 parameters, exhibits the best performance, converging quickly with a low error rate. The red line, representing a depth of 10 with 200 parameters, converges with a high error rate. The blue line, corresponding to a depth of 30 with 600 parameters, achieves a low error but takes longer to converge.

C. Estimating Quantum Mutual Information
We estimated the quantum mutual information of a random density matrix using simulations with N = 4 qubits. For each tested random density matrix, we achieved error rates ranging from 0.1% to 1%. Additional details can be found in Appendix VII B.

VI. CONCLUSIONS
We have addressed the quantum Donsker-Varadhan representation, which is a mathematical framework for estimating the von Neumann entropy. The QDVR allows us to find the optimal T by searching only within the density matrices, resulting in a low copy complexity for the calculations. By optimizing the quantum neural network using QDVR and the parameter shift rule, we can estimate the von Neumann entropy and subsequently the quantum mutual information. The number of copies of ρ required is approximately O(c²/ε²) × n_params × n_train. Through the numerical simulations, we demonstrated that quantum mutual information neural estimation (QMINE) performs well and aligns with the results of the quantum Donsker-Varadhan representation. The rank analysis supported the results of QDVR, whereas the circuit depth analysis emphasized the importance of selecting an appropriate circuit depth. In addition, we estimated the quantum mutual information and achieved a low error rate. The key finding of this study is the conversion of the quantum mutual information and von Neumann entropy estimation problem into a quantum neural network problem. In future studies, we suggest investigating the specifics of n_params and n_train pertaining to the quantum neural network problem. To obtain the quantum mutual information, we adopted an alternative and simple strategy: by exploiting QMINE (suggested in Sec. IV), we directly estimate S(ρ_A ⊗ ρ_B) and S(ρ_AB). That is, we address S(ρ_A ⊗ ρ_B) rather than estimating S(ρ_A) and S(ρ_B) separately; since S(ρ_A ⊗ ρ_B) = S(ρ_A) + S(ρ_B), this method reduces the number of resource copies required for the simulations.

We used four qubits for this simulation, and the results of our experiment are summarized in Table I. To show that QMINE can estimate the quantum mutual information for various density matrices, we present estimation results for different ranks of ρ_AB.

FIG. 1: Schematic diagram of the quantum mutual information, I(A : B), between two quantum states ρ_A and ρ_B.

FIG. 2: Comparison of the performance of different approaches. The green curve represents QMINE with the exact rank, which exhibits the best performance, converging rapidly with low error. The red curve represents QMINE with a lower rank, which converges with a high error. The blue curve represents QMINE with a higher rank, which converges with low error but at a slower pace.
To generate a loss function that estimates the von Neumann entropy, we present the quantum Donsker-Varadhan representation (QDVR), which is a quantum version of the Donsker-Varadhan representation. QMINE offers the potential for a quantum advantage in estimating the von Neumann entropy, facilitated by QDVR. By converting the problem of von Neumann entropy estimation into a quantum machine learning regime, QMINE opens new possibilities. There is also the potential to estimate the von Neumann entropy using only O(poly(r), poly(1/ε)) copies of the quantum state.

TABLE I: Estimations of quantum mutual information using the QMINE method.