1 Introduction and related work

Quantum computing (QC) is one of the most promising fields in computation and has taken an important place in international research (Ying 2010). QC builds on the quantum-mechanical principle of wave–particle duality, by which a quantum object such as an electron behaves simultaneously as a wave and a particle (Robertson 1943). However, building quantum computers and maintaining superposition remain difficult because such machines are highly sensitive to noise and decoherence (Bennett et al. 1997; De Wolf 2019). Several technology companies operate quantum computers and have invested in developing this field (Zeng et al. 2017). Over the years, arguments about the advantages and disadvantages of QC have been raised and are still discussed today (Boyer et al. 1998; De Wolf 2019).

A few QC algorithms have been developed over the years to answer different problems. Grover’s algorithm is one example; it was developed to find a value in an unsorted array (Lavor et al. 2003; Leuenberger and Loss 2003). Another significant algorithm is Shor’s algorithm, discovered in 1994, which proposes a solution for integer factorization. Shor exploited the advantages of QC to solve the problem in polynomial time, rather than the super-polynomial time required by the best known classical algorithms (Hayward 2008). Theoretically, Shor’s algorithm breaks public-key cryptography schemes such as the widely used Rivest–Shamir–Adleman (RSA) scheme. RSA is a public-key cryptosystem used for secure data transmission and rests on the assumption that factoring large integers is computationally intractable (Milanov 2009). Therefore, constructing and maintaining a sufficiently large quantum computer may make it feasible to defeat RSA (Thombre and Jajodia 2021).

As mentioned above, quantum computers can significantly decrease computational complexity for certain problems by performing operations over exponentially large state spaces in parallel (Biamonte et al. 2017; Wiebe 2020). However, algorithms can combine classical and quantum computers and need not be exclusive to one type of computer or the other (Buffoni and Caruso 2021). This combination of QC and classical computing yields a young but rapidly growing field: quantum machine learning (QML). Generally, transforming a classical machine learning algorithm to QC requires implementing the logic of the classical algorithm with circuits composed of quantum gates (Benedetti et al. 2019; Alchieri et al. 2021). Recent studies have presented quantum algorithms for learning random variables (González et al. 2022; Pirhooshyaran and Terlaky 2021), quantum convolutional networks for learning images (Hur et al. 2022; Tüysüz et al. 2021), generative adversarial networks (GANs) and transfer learning (Assouel et al. 2022; Azevedo et al. 2022; Zoufal et al. 2021), and reinforcement learning implementations (Dalla et al. 2022).

In physics, entropy is essential for describing uncertainty in the state of matter (Bein 2006). In recent years, the concept of entropy has become increasingly important in information theory alongside the development of information technology: information can be quantified by measuring the amount of data in events, random variables, and distributions (Wehrl 1978). Because probabilities are used to quantify information, information theory is closely related to probability theory. Furthermore, information measures are widely used in artificial intelligence and machine learning, for example in constructing decision trees and optimizing classifier models (Kapur and Kesavan 1992). There is thus a significant relationship between information theory and machine learning, and a practitioner must be familiar with the basic concepts of the field (Huang et al. 2022; Liu et al. 2022). In data mining and machine learning, entropy represents a model’s degree of unpredictability or impurity: the easier it is to draw a valuable conclusion from a piece of information, the lower its entropy; the higher the entropy, the more challenging it is to draw conclusions from that information (Kaufmann et al. 2020; Kaufmann and Vecchio et al. 2020).
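To make the notion concrete, the short Python sketch below (our illustration, not part of any cited work) contrasts a low-entropy distribution, from which conclusions are easy to draw, with a high-entropy one:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (base 2) of a probability vector; 0*log2(0) is treated as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(shannon_entropy([0.95, 0.05]))  # ~0.29 bits: nearly pure, easy to predict
print(shannon_entropy([0.50, 0.50]))  # 1.0 bit: maximally impure/unpredictable
```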

This work presents a quantum “black box” for entropy calculation. The procedure is generic, regardless of data type, and can be applied to information analysis, machine learning algorithms, and more. Section 2 describes the procedure, its general implementation using quantum logic circuits, and its correctness. The central innovation of this method is that it allows users without any background in QC to use the capabilities of a quantum computer for their specific needs, without having to build the quantum circuits or transform the problem from classical to quantum computation. This “black box” is therefore accessible to those without preliminary knowledge of QC. Moreover, our quantum “black box” has a fixed depth of three, where depth counts the number of steps performed by the quantum computer that runs the circuit. In contrast to classical computation, this depth does not depend on the input size, because the method is based on amplitude encoding, which encodes the entire input as a single state of the quantum circuit. Section 3 presents a case study that compares our method to classical computer results. Section 4 compares it to other entropy calculation methods, and Section 5 presents the main conclusions and suggestions for future research.

2 Quantum entropy “black box”

This section presents and describes a new method for quantum entropy calculation, aimed at making QC accessible by enabling entropy calculation on quantum computers. The method assumes that the state vector (at a single time phase) includes the probability that the circuit will end in a specific state. First, we describe the method and its procedure; then, we present its implementation and correctness.

2.1 Quantum logic and gates

Let v = (v1, v2, …, vn) be the input vector that represents the occurrences of each item (i.e., each vi ∈ ℕ ∪ {0} is the number of occurrences of the ith item). To begin, the algorithm transforms v into an amplitude encoding by concatenating all n items into a single amplitude vector. Let \(\overset{\sim }{v}\) be the amplitude vector, such that \({\left|\overset{\sim }{v}\right|}^2=1\). The normalization constant, denoted as \(\widetilde A\), satisfies

$$\widetilde A=\frac1{\sqrt{\sum_{i=1}^nv_i}}$$

The input vector can be represented in the computational basis as \(\overset{\sim }{v}={\widetilde A}\sum_{i=1}^n\sqrt{v_i}\mid i>\). Since a quantum system of n qubits provides 2^n amplitudes, encoding \(\overset{\sim }{v}\) requires ⌈log2n⌉ qubits. It is important to note that when the length of \(\overset{\sim }{v}\) is not a power of two, zeros are appended; these do not change the entropy calculation (by convention, 0 · log 0 = 0).
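As a minimal NumPy sketch of this encoding step (our illustration; the function and variable names are ours):

```python
import numpy as np

def amplitude_encode(v):
    """Map an occurrence vector v to amplitudes sqrt(v_i)/sqrt(sum(v)),
    zero-padded to the next power of two (padding does not affect the entropy)."""
    v = np.asarray(v, dtype=float)
    amps = np.sqrt(v / v.sum())
    dim = 1 << max(1, int(np.ceil(np.log2(len(v)))))  # next power of two
    return np.pad(amps, (0, dim - len(v)))

v_tilde = amplitude_encode([4, 3, 1, 6])
print(v_tilde, np.isclose(np.sum(v_tilde ** 2), 1.0))  # unit norm, as required
```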

Next, the algorithm creates a quantum circuit with ⌈log2n⌉ qubits, initializes the state with the elements of \(\overset{\sim }{v}\), and applies the unitary gate \(U\left(\frac{\pi }{2},0,\pi \right)\) on each qubit (equivalent to the Hadamard gate) to move it into superposition. Because the vector is given in amplitude encoding, each state holds the probability of the corresponding input item. Let ∣ψ> be the state vector at this time phase; ∣ψ> is a vector of size n whose entries are the complex-form probabilities of the original vector items. Next, the algorithm creates PE, a parameterized vector of size n (i.e., a vector whose parameters are assigned values while the quantum circuit runs), such that the items of PE are the coefficients of H ∣ ψ> multiplied by log2(e) ≈ 1.4427. As PE is a parameterized vector, we apply a logarithm rotation and obtain the following:

$$\mathrm{PE}={\log}_2(e)\left[\begin{array}{c}\vdots \\ {}{\log}_e\left(\frac{\sqrt{v_i}}{\sqrt{\sum_{i=1}^n{v}_i}}\right)\\ {}\vdots \end{array}\right]$$

Note that PE cannot be applied as a stand-alone operation since it does not represent a state vector. For that reason, let W be a square, invertible, diagonal matrix of size n, where each Wii holds the squared coefficient of ∣ψ> (i.e., a state probability), and Wij = 0 for each i ≠ j. The method applies W to PE, multiplying each state probability by the logarithm rotation of that probability. At the end of the quantum circuit, the method applies the unitary gate \(U\left(\frac{\pi }{2},0,\pi \right)\) on each qubit and returns the output vector. Last, the method uses classical computation to sum the vector’s entries (i.e., the total entropy of \(\overset{\sim }{v}\)).

Notes

  1. The PE and W gates are described earlier in this section; the proof of their correctness is detailed in Section 2.2. Note that PE and W are circled in Fig. 1 and defined as a single operation to satisfy the invertibility conditions.

     Fig. 1 The quantum circuit of entropy calculation

  2. The dashed lines describe the qubits’ entry into and exit from superposition.

  3. We used the IBM simulators (with the Qiskit library for Python; Cross 2018) to avoid noise and to be able to sample the state vector at each time phase in the circuit. This differs from real quantum computers, where each observation/measurement collapses the quantum state.

  4. Fig. 1 describes the quantum circuit over three qubits, although generalization to a higher dimension can be done with tensor products.

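As a minimal sketch of the circuit-construction step, assuming Qiskit’s standard API (the PE and W operations are emulated classically here, since they are parameterized operations rather than library gates; the values follow the case study of Section 3.1):

```python
import numpy as np
from math import pi
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

v = np.array([4, 3, 1, 6], dtype=float)      # occurrence vector (as in Section 3.1)
probs = v / v.sum()
amps = np.sqrt(probs)                        # amplitude encoding of v

n_qubits = int(np.ceil(np.log2(len(amps))))
qc = QuantumCircuit(n_qubits)
for q in range(n_qubits):
    qc.u(pi / 2, 0, pi, q)                   # U(pi/2, 0, pi), equivalent to Hadamard

state = Statevector(amps).evolve(qc)         # noiseless statevector simulation

# Classical emulation of the PE (logarithm rotation) and W (diagonal) steps:
PE = np.log2(np.e) * np.log(probs)           # base-2 logarithms of the probabilities
W = np.diag(probs)
print(-np.sum(W @ PE))                       # ~1.788, the entropy of v
```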

2.2 Correctness

Let v = (v1, v2, …, vn) be the input vector that represents the occurrences of each item (i.e., each vi ∈ ℕ ∪ {0} is the number of occurrences of the ith item). Let \(\widetilde A\) be the normalization constant. Taking the square root of each item in v and requiring the normalized vector to have unit norm, we get

$$\left(\sum_{i=1}^n\left(\sqrt{v_i}\right)^2\right)\widetilde A^2=1$$
$$\widetilde A^2=\frac{1}{\sum_{i=1}^n{v}_i},\kern2em \widetilde A=\frac{1}{\sqrt{\sum_{i=1}^n{v}_i}}$$

The method transforms v to an amplitude vector, denoted as \(\overset{\sim }{v}\), such that each vi ∈ v is converted to \(\frac{\sqrt{v_i}}{\sqrt{\sum_{i=1}^n{v}_i}}\). Therefore, it satisfies the following:

$$\widetilde v=\widetilde A\cdot\sum_{i=1}^n\sqrt{v_i}\mid i>$$
$${\left|\overset{\sim }{v}\right|}^2=\sum_{i=1}^n{\left(\frac{\sqrt{v_i}}{\sqrt{\sum_{i=1}^n{v}_i}}\right)}^2=\sum_{i=1}^n\frac{v_i}{\sum_{i=1}^n{v}_i}=\frac{\sum_{i=1}^n{v}_i}{\sum_{i=1}^n{v}_i}=1$$

Let ∣ψ> be the initialized state vector. The method sets \(\overset{\sim }{v}\) as the initial state and applies the U gate with the parameters \(\theta =\frac{\pi }{2},\phi =0,\lambda =\pi\), which is equivalent to applying the Hadamard gate, to move the states into superposition. Thus, the current quantum state is H ∣ ψ>. Since the squared amplitudes sum to one, the coefficients of H ∣ ψ> can describe the probability of each state and have the form \(\sqrt{p_i}\mid i>\), where pi is the probability of the ith item in the computational basis.

Let PE be a parameterized vector of size n (i.e., a vector whose parameters are assigned values while the quantum circuit runs), such that each item PEi is the corresponding item of H ∣ ψ> multiplied by log2(e) ≈ 1.4427. Thus,

$$\mathrm{PE}={\log}_2(e)\cdot H\mid \psi >$$

In quantum computing, logarithms are taken in the natural base (i.e., base e), since all basic computations can be described in polar coordinates. Multiplying PE by log2(e) converts the calculation to base two (i.e., the binary base):

$${\log}_{\mathrm{e}}(a)=\frac{\log_2(a)}{\log_2(e)}\kern0.5em \Longleftrightarrow {\log}_2(a)={\log}_2(e)\cdot {\log}_{\mathrm{e}}(a)$$
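This identity is straightforward to confirm numerically, for example:

```python
import numpy as np

a = 0.3
print(np.isclose(np.log2(a), np.log2(np.e) * np.log(a)))  # True
```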

Therefore, PE is a vector that represents the logarithm of the state coefficients (i.e., the probabilities):

$$\mathrm{PE}={\log}_2(e)\left[\begin{array}{c}\vdots \\ {}{\log}_e\left(\frac{\sqrt{v_i}}{\sqrt{\sum_{i=1}^n{v}_i}}\right)\\ {}\vdots \end{array}\right]$$

Let W be a diagonal gate of size n that holds all the pi elements (i.e., the squared coefficients of ∣ψ>):

$$W=\left(\begin{array}{cccc}\frac{v_1}{\sum_{i=1}^n{v}_i}& 0& \cdots & 0\\ {}0& \frac{v_2}{\sum_{i=1}^n{v}_i}& \cdots & 0\\ {}\vdots & \vdots & \ddots & \vdots \\ {}0& 0& \cdots & \frac{v_n}{\sum_{i=1}^n{v}_i}\end{array}\right)$$

Applying W · (PE) returns a vector of size n in which each element is the product of a diagonal entry of W (a state probability) and the corresponding entry of the logarithm parameterized vector. Last, applying H again yields

$$W\cdot \left(\mathrm{PE}\right)\cdot H=W\cdot {\log}_2(e)\cdot H\mid \psi >H$$

Given that HH = I, we get an output of

$$W\cdot {\log}_2(e)\cdot H\mid \psi >H=W\cdot {\log}_2(e)\cdot \mid \psi >={\log}_2(e)\cdot W\cdot \mid \psi >$$

where W ·  ∣ ψ> multiplies each state probability by its logarithm rotation, and log2(e) converts the result to base two.
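A small NumPy check of this derivation (our illustration): HH = I holds for the tensor-product Hadamard, and summing W · PE recovers the Shannon entropy of the input.

```python
import numpy as np

H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # single-qubit Hadamard
H = np.kron(H1, H1)                             # two-qubit Hadamard via tensor product
print(np.allclose(H @ H, np.eye(4)))            # True: HH = I, so the final H cancels

v = np.random.randint(1, 10, size=4).astype(float)  # arbitrary occurrence vector
p = v / v.sum()                                 # squared coefficients of |psi>
PE = np.log2(np.e) * np.log(p)                  # logarithm rotation, base 2 via log2(e)
W = np.diag(p)                                  # diagonal gate of probabilities
print(np.isclose(-np.sum(W @ PE), -np.sum(p * np.log2(p))))  # True: Shannon entropy
```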

3 Case study

This section presents a case study and experiments comparing the entropy calculation of our method to classical computer computation. Each experiment was simulated on an IBM simulator with 1024 shots. In cases based on the state vector of the quantum circuit, we used the state achieved by the majority of the shots. To simplify the demonstration, we first describe a simple use case of entropy calculation for a numerical vector and detail each state and operation in the quantum circuit; we then present an entropy calculation of a given text.

3.1 Simple occurrences vector

Let v = [4, 3, 1, 6] be the occurrence vector of size four, such that the first item appeared four times, the second item three times, and so on. The classical computation of entropy yielded

$$-\frac{2}{7}{\log}_2\left(\frac{2}{7}\right)-\frac{3}{14}{\log}_2\left(\frac{3}{14}\right)-\frac{1}{14}{\log}_2\left(\frac{1}{14}\right)-\frac{3}{7}{\log}_2\left(\frac{3}{7}\right)=1.788$$

The quantum circuit converted v into an amplitude vector \(\overset{\sim }{v}\), such that each vi ∈ v was assigned to \(\frac{\sqrt{v_i}}{\sqrt{\sum_{i=1}^n{v}_i}}\). Applying the Hadamard gate and pushing \(\overset{\sim }{v}\) into the superposition yielded a state vector ∣ψ> of

$$\mid \psi >=\frac{\sqrt{14}}{7}\mid 00>+\frac{\sqrt{42}}{14}\mid 01>+\frac{\sqrt{14}}{14}\mid 10>+\frac{\sqrt{21}}{7}\mid 11>$$

Next, the method created PE, the parameterized vector equal to the logarithm rotation of ∣ψ> multiplied by log2(e):

$$\mathrm{PE}\left(|\psi >\right)=\left(\begin{array}{c}-1.807i\\ {}-2.222i\\ {}-3.807i\\ {}-1.222i\end{array}\right)$$

Since PE(| ψ>) is a parameterized vector, we multiplied it by the diagonal matrix W to ensure an invertible state operation:

$$W=\left(\begin{array}{cccc}0.285& 0& 0& 0\\ {}0& 0.215& 0& 0\\ {}0& 0& 0.072& 0\\ {}0& 0& 0& 0.428\end{array}\right)$$
$$\left(\begin{array}{cccc}0.285& 0& 0& 0\\ {}0& 0.215& 0& 0\\ {}0& 0& 0.072& 0\\ {}0& 0& 0& 0.428\end{array}\right)\cdot \left(\begin{array}{c}-1.807i\\ {}-2.222i\\ {}-3.807i\\ {}-1.222i\end{array}\right)=\left(\begin{array}{c}-0.285\cdot 1.807i\\ {}-0.215\cdot 2.222i\\ {}-0.072\cdot 3.807i\\ {}-0.428\cdot 1.222i\end{array}\right)=\left(\begin{array}{c}-0.515i\\ {}-0.477i\\ {}-0.275i\\ {}-0.523i\end{array}\right)$$

Last, the classical computer computed the sum of the absolute values of the vector, which represents the total entropy observed:

$$0.515+0.477+0.275+0.523=1.788$$

The error (i.e., difference) between the two outputs is 3.2e−21, indicating a high level of agreement between the classical and quantum computations.
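The walk-through above can be reproduced step by step with a few lines of NumPy (our illustration; the imaginary unit mirrors the phase form shown above, and small discrepancies against the displayed values are rounding artifacts):

```python
import numpy as np

v = np.array([4, 3, 1, 6], dtype=float)
p = v / v.sum()

PE = np.log2(np.e) * np.log(p) * 1j       # (-1.807i, -2.222i, -3.807i, -1.222i)
W = np.diag(p)                            # diag(0.286, 0.214, 0.071, 0.429)
out = W @ PE                              # (-0.516i, -0.476i, -0.272i, -0.524i)

quantum_style = np.abs(out).sum()         # ~1.7885
classical = -np.sum(p * np.log2(p))       # ~1.7885
print(quantum_style, abs(quantum_style - classical))  # difference ~0 (float precision)
```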

3.2 Entropy of text

To demonstrate our method on text, we used the text of Section 1 of this work (i.e., “Introduction and related work”). We preprocessed the text and converted it into an array of occurrences, where the first item was the number of “a” occurrences, the second the number of “b” occurrences, and so on. As the input vector, we obtained a vector of size 24 with an entropy of 4.155, calculated by classical computation.
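A possible preprocessing sketch (our illustration; the file name and the exact normalization, lowercase letters only, are assumptions):

```python
from collections import Counter
import numpy as np

text = open("introduction.txt").read().lower()        # hypothetical source file
counts = Counter(ch for ch in text if ch.isalpha())   # letter occurrences only
v = np.array([counts[ch] for ch in sorted(counts)], dtype=float)

p = v / v.sum()
print(len(v), -np.sum(p * np.log2(p)))  # vector size and classical entropy (here: 24, ~4.155)
```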

First, for the quantum circuit, we appended eight zero values so that the input size was a power of two (i.e., 32), then applied the Hadamard gate to push it into superposition. This yielded a state vector ∣ψ> of size 32 whose coefficients represented the probabilities of each input element. Next, the quantum circuit used the parameterized vector and the diagonal gate to perform a logarithm rotation of the input vector. In this case, the output of the quantum circuit was a vector of size 32, representing all products of each probability with its logarithm. The quantum output matched the classical computation, yielding an entropy of 4.155 after 1024 shots.

To understand the level of agreement between the two methods, we examined the difference between the computed values. The error (i.e., the difference) was 8.8e−16, which is relatively low for such a long input text. The significant difference was that the quantum circuit performed only three operations to achieve the desired entropy, fewer than the classical computation requires.

4 Analysis

This section provides a comparison and analysis between the proposed method and other existing methods for calculating entropy. First, we describe the methods we used for the comparison. Then, we present the results of our method over four types of input.

4.1 Entropy calculation methods

We examined the following methods to demonstrate the results:

  1. Shannon entropy (Shannon 1948)—given a random variable X, which takes values {x1, x2, …, xm} under sample space Ω, the Shannon entropy, denoted as HS(X), is defined by

    $${H}_S(X)=-\sum_{x_i\in X}p\left({x}_i\right)\cdot {\log}_2\left(p\left({x}_i\right)\right)$$
  2. von Neumann entropy (Von Neumann 1955; Nielsen and Chuang 2010)—let ρ be the density matrix of a quantum state. The following are two equivalent approaches to calculating the von Neumann entropy, denoted as S(ρ):

     a. Assuming that ‖ρ − I‖ < 1, where I is the identity matrix, the following power series converges and defines the logarithm of ρ, which is then inserted into the trace formula for S(ρ):

       $$\log\left(\rho\right)=\sum_{k=1}^\infty\left(-1\right)^{k+1}\frac{\left(\rho-I\right)^k}{k},\kern2em S\left(\rho\right)=-\mathrm{Tr}\left(\rho\cdot\log_2\left(\rho\right)\right)$$
     b. Let {λ1, …, λk} be the set of eigenvalues of the matrix ρ. Then, the von Neumann entropy can also be defined as

    $$S\left(\rho \right)=-\sum_{i=1}^k{\lambda}_i\cdot {\log}_2\left({\lambda}_i\right)$$

To demonstrate the equivalence of the two approaches, let ρ be the density matrix given by

$$\rho =\frac{2}{3}\left.|0\right\rangle \left\langle 0|\right.+\frac{1}{3}\left.|+\right\rangle \left\langle +|\right.=\left(\begin{array}{cc}\frac{5}{6}& \frac{1}{6}\\ {}\frac{1}{6}& \frac{1}{6}\end{array}\right)$$

where \(\left.|+\right\rangle =\frac{1}{\sqrt{2}}\left(\left.|0\right\rangle +\left.|1\right\rangle \right)\).

On the one hand, the eigenvalues of ρ are λ1 = 0.872, λ2 = 0.127. By applying the second approach to calculate the von Neumann entropy, the following is obtained:

$$S\left(\rho \right)=-\sum_{i=1}^k{\lambda}_i\cdot {\log}_2\left({\lambda}_i\right)=0.55$$

On the other hand, for the first approach, a logarithm base conversion must be used to obtain the appropriate base:

$${\log}_2\left(\rho \right)=\frac{\log_e\left(\rho \right)}{\log_e(2)}=\frac{\left(\begin{array}{cc}-0.237& 0.430\\ {}0.430& -1.959\end{array}\right)}{\log_e(2)}=\left(\begin{array}{cc}-0.343& 0.620\\ {}0.620& -2.826\end{array}\right)$$

Thus, the von Neumann entropy is

$$S\left(\rho \right)=- Tr\left(\rho \cdot {\log}_2\left(\rho \right)\right)=- Tr\left(\begin{array}{cc}-0.182& 0.046\\ {}0.046& -0.367\end{array}\right)=0.55$$
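Both routes can be verified numerically; a short sketch using NumPy and SciPy (our illustration):

```python
import numpy as np
from scipy.linalg import logm

rho = np.array([[5 / 6, 1 / 6],
                [1 / 6, 1 / 6]])

# Approach (a): matrix logarithm (natural base via logm, converted to base 2)
S_a = -np.trace(rho @ (logm(rho) / np.log(2))).real

# Approach (b): eigenvalues of the density matrix
lam = np.linalg.eigvalsh(rho)
S_b = -np.sum(lam * np.log2(lam))

print(S_a, S_b)  # both ~0.55
```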

4.2 Results

For the analysis of our method, Table 1 presents the comparison to other existing methods over the following inputs:

  1. Input A—a simple occurrence vector, as described in Section 3.1

  2. Input B—a text, as described in Section 3.2

  3. Input C—a randomized text of size 5000 consisting of uppercase and lowercase letters

  4. Input D—a randomized text of size 1000 consisting of digits only

Table 1 Comparison of the proposed method to the Shannon entropy and von Neumann entropy

As Table 1 demonstrates, all the entropy calculation methods obtained results in an (almost) identical range, without extreme anomalies or noise. The differences between our method and the Shannon entropy were minor; it can therefore be concluded that the two agree. Comparing these results to the von Neumann entropy yields slightly larger deviations (e.g., a difference of 0.268 on input B). This difference might be due to the different calculation methods or to the noise accumulated over the 1024 shots of the quantum simulation.

5 Conclusions and discussion

This study proposes a novel quantum “black box” for entropy calculation. The presented procedure is generic and can be applied to information analyses, machine learning algorithms, and more. The method relies on amplitude encoding, a key component of quantum computing that represents a vector as probability amplitudes. Its main innovation is the use of quantum computers to calculate entropy as a “black box”, without having to build quantum circuits or transform the problem from classical to quantum computation. As a result, this “black box” is accessible to those without prior understanding of quantum computing. The following are the main conclusions:

  1. Our method calculated the entropy of different data types with the same precision as a classical computer. The key difference was the amplitude encoding, which represents the dataset as a complete vector that yields the probabilities of the input. Our method used amplitude encoding for entropy estimation, although this can be generalized to any measure based on stochastic elements.

  2. Our quantum “black box” has a fixed depth of three, where depth counts the number of steps performed by the quantum computer that runs the circuit. Circuit depth matters because qubits have finite coherence time. Depth complexity is not independent of gate complexity, because a circuit with many gates is also likely to have considerable depth; thus, circuit depth can increase due to both the algorithm’s structure and the physical limitations of the hardware. In contrast to classical computation, our method’s depth does not depend on the input size, since it is based on amplitude encoding, which encodes the input as a single state of the quantum circuit.

This study raised two main issues that should be addressed in future research. First, we tested our method using the IBM simulator to avoid noise and to be able to sample a state vector without collapsing the circuit. Since this study was designed to establish a method for entropy calculation in quantum computing, future studies should examine the implementation and evaluation of our method on a real quantum computer. Second, beyond entropy, many other “black boxes” could help build and maintain learning algorithms, such as information gain, weighted and conditional entropy, and distance metrics. Future studies should focus on developing those “boxes” to make quantum computing more accessible.