1 Introduction

The field of quantum computing (QC) is considered one of the most promising in computational science (Ying 2010) and has gained importance in international research. QC builds on the principles of quantum mechanics, under which an electron can behave simultaneously as a wave and a particle (Robertson 1943). Because quantum computers are sensitive to noise and decoherence (Bennett et al. 1997), it can be challenging to prepare and maintain qubits in superposition. There have been many investigations and discussions regarding quantum computers (Zeng et al. 2017), including arguments concerning their advantages and disadvantages (Boyer et al. 1998). Quantum computers have the potential to revolutionize the field of computing, as they can perform certain tasks much faster than classical computers and can also process large amounts of data in a short amount of time. For example, quantum computers can perform calculations that are infeasible for classical computers, such as solving certain types of problems. Consequently, they have the potential to transform many industries, such as finance, healthcare, and artificial intelligence (Piattini et al. 2021).

Quantum computers can significantly reduce the computational complexity of some tasks. Due to their parallel processing capabilities, they can perform a greater variety of operations than classical computers (Biamonte et al. 2017; Wiebe 2020). Currently, many algorithms are not tied to one paradigm and can be implemented on both classical and quantum computers (Buffoni and Caruso 2021). Hence, quantum machine learning (QML) is a young but rapidly growing field alongside QC. By using quantum gates, it is possible to transform classical machine learning algorithms into quantum circuits (Benedetti et al. 2019; Alchieri et al. 2021). Ultimately, combining classical computing and QC provides a powerful tool for solving complex problems.

A parameterized quantum circuit (PQC) is a quantum circuit with one or more parameters that can be changed according to the user’s requirements (Hubregtsen et al. 2021). The versatility of PQCs makes them useful for implementing machine learning algorithms, variational quantum algorithms, and quantum simulators (Benedetti et al. 2019; Du et al. 2020). A parameterized gate differs from a fixed gate in that its operation depends on variables. PQCs are a rapidly developing field, and new advances are being made constantly (Peham et al. 2023). Recent studies have described the implementation of PQCs and algorithms for the learning of random variables (González et al. 2022; Pirhooshyaran and Terlaky 2021) and entropy estimation (Koren et al. 2023), as well as building a quantum convolutional network to learn images (Hur et al. 2022; Tüysüz et al. 2021), implementing reinforcement learning (Dalla Pozza et al. 2022), and developing generative adversarial networks (GANs) and transfer learning (Assouel et al. 2022; Azevedo et al. 2022).

For machine learning models to be successful, data representation is crucial. Classical machine learning relies on numerical representations of data so that it can be processed effectively by a classical algorithm. Quantum machine learning poses the same fundamental question: how to represent and efficiently input data into quantum systems to be analyzed by quantum algorithms (Li et al. 2022; Weigold et al. 2021a, b). As a result, the encoding process directly affects the computational capacity of quantum machine learning algorithms (Dilip et al. 2022; LaRose and Coyle 2020; Sierra-Sosa et al. 2023). There are three primary encoding methods: (1) Basis encoding, which associates a classical string with a computational basis state. It is the simplest method to understand, although the state vectors quickly become sparse (i.e., vectors that have mostly zero values). (2) Amplitude encoding encodes data into the amplitudes of a quantum state. As a system of \(n\) qubits provides \({2}^{n}\) amplitudes, it can encode a dataset of \(N\) records over \(M\) features using \({{\text{log}}}_{2}(N\cdot M)+1\) qubits. (3) Angle encoding encodes \(M\) features into the rotation angles of \(M\) qubits. Angle encoding differs slightly from the other encoding techniques, as it encodes only one data point at a time rather than an entire dataset. This method requires, at most, \(M\) qubits. All of the mentioned methods have been found to produce enhanced results in quantum autoencoders and image processing (Bravo-Prieto 2021; Majji et al. 2023; Romero et al. 2017; Shin et al. 2023).
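To make the three encodings concrete, the following is a minimal sketch in plain Python that builds the corresponding state vectors classically. The function names are hypothetical, the zero-padding to a power-of-two length is an assumption, and the angle encoding assumes an RY-style rotation of \(|0\rangle\), which is one common convention among several.

```python
import math

def basis_encode(bits):
    """State vector of the computational basis state labelled by a bit string."""
    state = [0.0] * (2 ** len(bits))
    state[int(bits, 2)] = 1.0  # a single non-zero entry, hence the sparsity
    return state

def amplitude_encode(values):
    """Scale values to unit Euclidean norm and zero-pad to a power-of-two length."""
    norm = math.sqrt(sum(v * v for v in values))
    size = 1 << max(1, (len(values) - 1).bit_length())
    padded = list(values) + [0.0] * (size - len(values))
    return [v / norm for v in padded]

def angle_encode(features):
    """One qubit per feature: x -> (cos(x/2), sin(x/2)) (assumed RY-style rotation)."""
    return [(math.cos(x / 2), math.sin(x / 2)) for x in features]

print(basis_encode("10"))                   # four amplitudes, one of them non-zero
print(amplitude_encode([3.0, 4.0, 0.0]))    # three values padded to four amplitudes
print(len(angle_encode([0.1, 0.2, 0.3])))   # one single-qubit state per feature
```

The sketch illustrates the trade-off described above: basis encoding stores one string per state, amplitude encoding packs a whole vector into the amplitudes, and angle encoding spends one qubit per feature.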

Exploratory data analysis (EDA) is a set of techniques developed in the 1970s that aims to examine the data before building a model (Tukey 1977). It explores data for patterns, trends, underlying structures, anomalies, and more (Chatfield 1986; Leinhardt and Wasserman 1979). The main goal of EDA is to develop valid models based on data insights (Komorowski et al. 2016; Morgenthaler 2009). This process can be divided into three main categories: (1) Univariate analysis refers to a single variable (in a dataset, it explores each variable separately). It can be performed using statistical measures, such as mean, median, and variance (Behrens and Yu 2003; Vigni et al. 2013). (2) Bivariate analysis examines the relationship between two variables. These can be two numerical variables, two categorical variables, or mixed variables (Cleff 2014; Jebb et al. 2017). (3) Multivariate analysis refers to at least three variables (Gelman 2004; Wang et al. 2023; Wongsuphasawat et al. 2019). Notably, the distribution of a dataset significantly impacts the results of the analysis. Furthermore, understanding the distribution can assist in choosing the appropriate statistical test, identifying anomalies, examining normality, and more, leading to results that are more accurate, reliable, and valid.

This study presents and describes a quantum-parametrized gate and its circuit implementation to estimate the distribution measures for discrete value vectors. It can be applied in information analysis, exploratory data analysis, and machine learning algorithms. There are two central innovative aspects of this proposed method: (1) a new quantum method for estimating distribution measures, such as expectation and variance, and (2) the accessibility of QC and the creation of a method that can be run without any prior knowledge of implementing quantum circuits. Section 2 will present the definition of the new quantum gate, the procedure, its implementation, and the mathematical justifications. Section 3 will describe the empirical study of a dataset and a detailed scenario of variance estimation. The method was tested and compared over five discrete value distributions, the results of which are presented in Section 4. Lastly, Section 5 will discuss the main conclusions and suggestions for future directions.

2 Quantum distribution measures

This section presents and describes a new parametrized quantum gate for statistical measure estimations. First, the general parametrized gate will be described, and its unitarity will be proven. Then, the quantum method will be presented, including the relevant logic and gates. Last, its implementation and correctness will be presented in detail.

2.1 Parameterized quantum gate

This study defines a new parameterized quantum gate that inputs \(a,r\in {\mathbb{N}}\), creating a parameterized and diagonal square matrix of size \(a\) as follows:

$$M\left(a,r\right)=\left[\begin{array}{cccc}\sqrt{{1}^{ri}}& 0& \cdots & 0\\ 0& \sqrt{{2}^{ri}}& \cdots & 0\\ \vdots & \vdots & \ddots & \vdots \\ 0& 0& \cdots & \sqrt{{a}^{ri}}\end{array}\right]$$

A new operation defined by a quantum circuit is required to be unitary (Footnote 1), since it describes a physical operation that evolves a quantum state (Bennett et al. 1997). Since \(M\left(a,r\right)\) is a square and diagonal matrix, its inverse is also diagonal and defined as follows:

$$M{\left(a,r\right)}^{-1}=\left[\begin{array}{cccc}\sqrt{{1}^{-ri}}& 0& \cdots & 0\\ 0& \sqrt{{2}^{-ri}}& \cdots & 0\\ \vdots & \vdots & \ddots & \vdots \\ 0& 0& \cdots & \sqrt{{a}^{-ri}}\end{array}\right]$$

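Each diagonal entry \(\sqrt{{j}^{ri}}={e}^{\frac{ri}{2}\mathrm{ln}j}\) has modulus one, so its complex conjugate equals its reciprocal. Thus, the inverse matrix is also its conjugate transpose, and the new quantum operation is unitary since: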

$${M\left(a,r\right)}^{*}M\left(a,r\right)=M{\left(a,r\right)}^{-1}M\left(a,r\right)=I$$
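The unitarity argument can be checked numerically with a minimal sketch in plain Python. It assumes that \(\sqrt{{j}^{ri}}\) is read as \({j}^{ri/2}={e}^{ir\mathrm{ln}(j)/2}\); the helper names `m_diagonal` and `is_unitary_diagonal` are hypothetical and are not part of any library.

```python
import cmath
import math

def m_diagonal(a, r):
    """Diagonal of M(a, r); sqrt(j**(r*i)) is read as exp(i * r * ln(j) / 2)."""
    return [cmath.exp(1j * r * math.log(j) / 2) for j in range(1, a + 1)]

def is_unitary_diagonal(diag, tol=1e-12):
    """A diagonal matrix is unitary iff conj(d) * d == 1 for every diagonal entry d."""
    return all(abs(d.conjugate() * d - 1) < tol for d in diag)

for r in (1, 2):
    print(r, is_unitary_diagonal(m_diagonal(4, r)))
```

Because the matrix is diagonal, checking each entry separately is equivalent to checking the full product \({M}^{*}M=I\).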

2.2 Quantum logic and gates

Let \(A=({a}_{1},{a}_{2},\dots ,{a}_{n})\) be the input vector, such that \(\forall {a}_{i}\in A\) it holds that \({a}_{i}\in {\mathbb{N}}\cup \{0\}\). The proposed method consists of two sub-procedures, as follows:

  1. Classical computer preprocessing: Given the input vector \(A\), the method creates \({f}_{A}=({f}_{1},\dots ,{f}_{k})\), where each \({f}_{i}\in {f}_{A}\) represents the number of occurrences of the \({i}^{th}\) item for \(k\le n\). Next, it transforms \({f}_{A}\) to an amplitude vector, such that each \({f}_{i}\in {f}_{A}\) is converted to \(\frac{\sqrt{{f}_{i}}}{\sqrt{\sum_{i=1}^{k}{f}_{i}}}\), and returns the normalized vector.

  2. Quantum variance estimation: Using the output of the classical computer preprocessing, the proposed method creates a quantum circuit and allocates \(\left\lfloor{\text{log}}_{2}k\right\rfloor+1\) qubits. Once the circuit is ready, it initializes the state of the system with the normalized amplitude vector, denoted as \(|\psi \rangle\), and an additional state vector, denoted as \(|\psi {\prime}\rangle\), as follows (Footnote 2):

    $$|\psi \rangle =\frac{1}{\sqrt{\sum {f}_{A}}}\sum_{j=1}^{k}\sqrt{{f}_{j}}|j\rangle$$
    $$|\psi {\prime}\rangle =\frac{1}{\sqrt{\sum \left|{\sqrt{j}}^{{2}^{r}-ri}\sqrt{{f}_{j}}\right|}}\sum_{j=1}^{k}{\sqrt{j}}^{{2}^{r}-ri}\sqrt{{f}_{j}}|j\rangle$$
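The classical preprocessing step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the items are taken to be the integers \(1,\dots ,k\) with \(k\) the largest value in the input, and the function names are hypothetical. The helper `qubits_needed` returns the smallest number of qubits whose amplitudes can hold \(k\) values, which never exceeds the \(\left\lfloor{\text{log}}_{2}k\right\rfloor+1\) qubits allocated in the text.

```python
import math
from collections import Counter

def frequency_vector(values):
    """Occurrences f_1..f_k of the items 1..k, where k is the largest value seen (assumed)."""
    counts = Counter(values)
    return [counts.get(item, 0) for item in range(1, max(values) + 1)]

def amplitude_vector(freqs):
    """Map each f_i to sqrt(f_i) / sqrt(sum of all f), so squared amplitudes sum to one."""
    total = sum(freqs)
    return [math.sqrt(f / total) for f in freqs]

def qubits_needed(k):
    """Smallest n >= 1 such that 2**n >= k."""
    return max(1, (k - 1).bit_length())

f_A = frequency_vector([1, 2, 2, 2, 1, 4, 1, 4])
print(f_A)                      # [3, 3, 0, 2]
print(amplitude_vector(f_A))    # squared entries sum to one
print(qubits_needed(len(f_A)))  # 2
```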

Next, the method applies the Hadamard gate to move the state into superposition. Thus, the current state can be presented as:

$${H}^{\otimes k}|\psi \rangle =\frac{{\sum }_{i}{\left(-1\right)}^{\psi \cdot i}|i\rangle }{\sqrt{{2}^{k}}}$$

Once the state is in superposition, the method uses the parametrized gate (described in Section 2.1) to estimate the expected value of \(A\) and \({A}^{2}\), denoted as \({\varphi }_{1}\), \({\varphi }_{2}\), respectively. Thus, it calculates \({\varphi }_{1}\) as \(\langle \psi^{\prime}|M(k,1)|\psi \rangle\) and \({\varphi }_{2}\) as \(\langle \psi^{\prime}|M(k,2)|\psi \rangle\). Last, the method returns the value of \({\varphi }_{2}-{\varphi }_{1}^{2}\) using simple classical computer computation. Figure 1 shows the quantum circuit of the variance calculation.

Fig. 1 The quantum circuit of variance calculation. Notes: (1) The dashed lines describe the sampling of the state in superposition. (2) IBM simulators were used (with the Qiskit Library for Python; Cross 2018) to avoid noise and sample the state vector in each time phase in the circuit. (3) Figure 1 describes the quantum circuit over several qubits, although generalization to a higher dimension can be done with tensor products. (4) The output represents an approximation of the variance value. The analysis of the approximation ratio is detailed in Section 2.3. (5) Amplitude encoding was chosen due to the ability of this method to encode an entire dataset using a logarithmic number of qubits. Despite the number of gates required for this coding, this method can form the base for many distribution measures (e.g., expectation, variance, skewness, kurtosis). Future studies are necessary to explore a more efficient representation of these gates.

2.3 Correctness

Let \(A=({a}_{1},{a}_{2},\dots ,{a}_{n})\) be the input vector, and let \({f}_{A}=\left({f}_{1},\dots ,{f}_{k}\right)\) be the occurrences of each item in A (i.e., each \({f}_{i}\in {f}_{A}\) represents the number of occurrences of the \({i}^{th}\) item for \(k\le n\)). The method transforms \({f}_{A}\) to an amplitude vector as \(|\psi \rangle =\frac{1}{\sqrt{\sum {f}_{A}}}\sum_{i=1}^{k}\sqrt{{f}_{i}}|i\rangle\) that satisfies:

$${\Vert |\psi \rangle \Vert }^{2}=\sum_{i=1}^{k}{\left(\frac{\sqrt{{f}_{i}}}{\sqrt{\sum_{j=1}^{k}{f}_{j}}}\right)}^{2}=\sum_{i=1}^{k}\frac{{f}_{i}}{\sum_{j=1}^{k}{f}_{j}}=\frac{{\sum }_{i=1}^{k}{f}_{i}}{\sum_{j=1}^{k}{f}_{j}}=1$$

Since a quantum system of \(k\) qubits provides \({2}^{k}\) amplitudes, encoding \({f}_{A}\) requires the use of \(\left\lfloor{\text{log}}_2k\right\rfloor+1\) qubits. Notably, in cases where the length of \({f}_{A}\) is not a power of two, zeros are added, as their values do not change the calculation. Given the initialized state vector, \(|\psi \rangle\), the method applies the \(U\) gate with the parameters \(\theta =\frac{\pi }{2},\phi =0,\lambda =\pi\), which is equivalent to applying the Hadamard gate to move the state into superposition (Wijesekera et al. 2009):

$${H}^{\otimes k}|\psi \rangle =\frac{{\sum }_{i}{\left(-1\right)}^{\psi \cdot i}|i\rangle }{\sqrt{{2}^{k}}}$$
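The stated equivalence between the \(U\) gate with \(\theta =\frac{\pi }{2},\phi =0,\lambda =\pi\) and the Hadamard gate can be verified numerically. The sketch below assumes the common single-qubit parameterization \(U(\theta ,\phi ,\lambda )\) with rows \((\mathrm{cos}\frac{\theta }{2},\,-{e}^{i\lambda }\mathrm{sin}\frac{\theta }{2})\) and \(({e}^{i\phi }\mathrm{sin}\frac{\theta }{2},\,{e}^{i(\phi +\lambda )}\mathrm{cos}\frac{\theta }{2})\); whether this is exactly the convention used in the study is an assumption. The helper names are hypothetical.

```python
import cmath
import math

def u_gate(theta, phi, lam):
    """2x2 matrix of the generic single-qubit U(theta, phi, lambda) gate (assumed convention)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [c, -cmath.exp(1j * lam) * s],
        [cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c],
    ]

HADAMARD = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
            [1 / math.sqrt(2), -1 / math.sqrt(2)]]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

print(close(u_gate(math.pi / 2, 0.0, math.pi), HADAMARD))
```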

Next, the algorithm creates \(M(a,r)\), a parameterized and diagonal square matrix of size \(a\) as described in Section 2.1. The algorithm uses \(M(k,1)\) and \(M\left(k,2\right)\) to estimate the values of the first and second moments of \(A\). Let \({\varphi }_{1}\) be the expected value of applying \(M(k,1)\) on \(|\psi \rangle\), denoted as \(\langle \psi {\prime}|M(k,1)|\psi \rangle\) (Bakshi and Mahanthappa 1963). Given that \(|\psi \rangle =\frac{1}{\sqrt{\sum {f}_{A}}}\sum_{i=1}^{k}\sqrt{{f}_{i}}|i\rangle\), the following proof of correctness describes a system with two qubits, although it can be generalized using the tensor product:

$$|\psi \rangle =\frac{1}{\sqrt{\sum_{j=1}^{k}{f}_{j}}}\sum_{i=1}^{k}\sqrt{{f}_{i}}|i\rangle =\frac{\sqrt{{f}_{1}}}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}|00\rangle +\frac{\sqrt{{f}_{2}}}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}|01\rangle +\frac{\sqrt{{f}_{3}}}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}|10\rangle +\frac{\sqrt{{f}_{4}}}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}|11\rangle$$
$$M(k,1)|\psi \rangle =\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\left[\begin{array}{cccc}\sqrt{{1}^{i}}& 0& 0& 0\\ 0& \sqrt{{2}^{i}}& 0& 0\\ 0& 0& \sqrt{{3}^{i}}& 0\\ 0& 0& 0& \sqrt{{4}^{i}}\end{array}\right]\left(\begin{array}{c}\sqrt{{f}_{1}}\\ \sqrt{{f}_{2}}\\ \sqrt{{f}_{3}}\\ \sqrt{{f}_{4}}\end{array}\right)=\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\left(\begin{array}{c}\sqrt{{1}^{i}}\sqrt{{f}_{1}}\\ \sqrt{{2}^{i}}\sqrt{{f}_{2}}\\ \sqrt{{3}^{i}}\sqrt{{f}_{3}}\\ \sqrt{{4}^{i}}\sqrt{{f}_{4}}\end{array}\right)$$
$$\langle \psi {\prime}|M(k,1)|\psi \rangle =\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\cdot {\left(\begin{array}{c}{\sqrt{1}}^{{2}^{1}-i}\sqrt{{f}_{1}}\\ {\sqrt{2}}^{{2}^{1}-i}\sqrt{{f}_{2}}\\ {\sqrt{3}}^{{2}^{1}-i}\sqrt{{f}_{3}}\\ {\sqrt{4}}^{{2}^{1}-i}\sqrt{{f}_{4}}\end{array}\right)}^{T}\left(\begin{array}{c}\sqrt{{1}^{i}}\sqrt{{f}_{1}}\\ \sqrt{{2}^{i}}\sqrt{{f}_{2}}\\ \sqrt{{3}^{i}}\sqrt{{f}_{3}}\\ \sqrt{{4}^{i}}\sqrt{{f}_{4}}\end{array}\right)=\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\left({f}_{1}+2{f}_{2}+3{f}_{3}+4{f}_{4}\right)$$

It is important to note that since the norm of a complex number is a real number, the state can easily be normalized so that the squared amplitudes sum to one. Given that each \({f}_{i}\) represents the normalized frequency of the \({i}^{th}\) item, the expected value of the operator is equal to the first moment of \(A\), i.e., its expectation. Similarly, let \({\varphi }_{2}\) be the expected value of applying \(M(k,2)\) on \(|\psi \rangle\), denoted as \(\langle \psi {\prime}|M(k,2)|\psi \rangle\):

$$M\left(k,2\right)|\psi \rangle =\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\left[\begin{array}{cccc}\sqrt{{1}^{2i}}& 0& 0& 0\\ 0& \sqrt{{2}^{2i}}& 0& 0\\ 0& 0& \sqrt{{3}^{2i}}& 0\\ 0& 0& 0& \sqrt{{4}^{2i}}\end{array}\right]\left(\begin{array}{c}\sqrt{{f}_{1}}\\ \sqrt{{f}_{2}}\\ \sqrt{{f}_{3}}\\ \sqrt{{f}_{4}}\end{array}\right)=\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\left(\begin{array}{c}\sqrt{{1}^{2i}}\sqrt{{f}_{1}}\\ \sqrt{{2}^{2i}}\sqrt{{f}_{2}}\\ \sqrt{{3}^{2i}}\sqrt{{f}_{3}}\\ \sqrt{{4}^{2i}}\sqrt{{f}_{4}}\end{array}\right)$$
$$\langle \psi {\prime}|M(k,2)|\psi \rangle =\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}{\left(\begin{array}{c}{\sqrt{1}}^{{2}^{2}-2i}\sqrt{{f}_{1}}\\ {\sqrt{2}}^{{2}^{2}-2i}\sqrt{{f}_{2}}\\ {\sqrt{3}}^{{2}^{2}-2i}\sqrt{{f}_{3}}\\ {\sqrt{4}}^{{2}^{2}-2i}\sqrt{{f}_{4}}\end{array}\right)}^{T}\left(\begin{array}{c}\sqrt{{1}^{2i}}\sqrt{{f}_{1}}\\ \sqrt{{2}^{2i}}\sqrt{{f}_{2}}\\ \sqrt{{3}^{2i}}\sqrt{{f}_{3}}\\ \sqrt{{4}^{2i}}\sqrt{{f}_{4}}\end{array}\right)=\frac{1}{\sqrt{\sum_{j=1}^{4}{f}_{j}}}\left({f}_{1}+4{f}_{2}+9{f}_{3}+16{f}_{4}\right)$$

Thus, the value of \({\varphi }_{2}-{\varphi }_{1}^{2}\) is equal to \({\mathbb{E}}\left[{A}^{2}\right]-{\left({\mathbb{E}}\left[A\right]\right)}^{2}\) and represents the variance of \(A\) (Footnote 3).
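The term-by-term cancellation behind this result, \({\sqrt{j}}^{{2}^{r}-ri}\cdot \sqrt{{j}^{ri}}={j}^{{2}^{r-1}}\), can be illustrated with a short numerical sketch in plain Python. Several choices here are assumptions made for illustration only: \(\sqrt{{j}^{ri}}\) is read as \({e}^{ir\mathrm{ln}(j)/2}\), the sum is divided by the total count so that the frequencies act as relative frequencies, and the frequency vector is hypothetical. The function names are not taken from the study.

```python
import cmath
import math

def gate_entry(j, r):
    """Diagonal entry of M(k, r), read as exp(i * r * ln(j) / 2)."""
    return cmath.exp(1j * r * math.log(j) / 2)

def aux_weight(j, r):
    """(sqrt(j)) ** (2**r - r*i), the factor attached to sqrt(f_j) in the auxiliary state."""
    return cmath.exp((2 ** r - 1j * r) * math.log(j) / 2)

def moment(freqs, r):
    """Sum over j of aux_weight * gate_entry * f_j, divided by the total count (assumed)."""
    total = sum(freqs)
    acc = sum(aux_weight(j, r) * gate_entry(j, r) * f
              for j, f in enumerate(freqs, start=1))
    return (acc / total).real

freqs = [2, 1, 0, 1]  # hypothetical occurrences of the items 1..4
phi1, phi2 = moment(freqs, 1), moment(freqs, 2)
print(phi1, phi2, phi2 - phi1 ** 2)  # approximately 2.0, 5.5 and 1.5
```

For \(r=1\) and \(r=2\), the product of the two complex factors is \(j\) and \({j}^{2}\), respectively, which is why the two inner products recover the first and second moments.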

3 Case Study

This section describes a case study and tests comparing the proposed method to a classical computer calculation. An IBM simulator with 1024 shots was used to simulate the trials. To simplify the illustration, a basic use case is described first: the variance of a simple occurrences vector is calculated, and each operation in the quantum circuit is detailed. Then, the results of the proposed method are presented for the features of the diabetes dataset (Kahn 1994).

3.1 Simple occurrences vector

Let \(A=(1,2,2,2,1,4,1,4)\) be the input vector and let \({f}_{A}=(3,3,0,2)\) be the occurrences vector of size four, such that the first item appeared three times, the second item appeared three times, and so on. Since \({f}_{A}\) has four values, it requires two qubits, and the initialized state is:

$$|\psi \rangle =\frac{1}{\sqrt{\sum_{j=1}^{k}{f}_{j}}}\sum_{i=1}^{k}\sqrt{{f}_{i}}|i\rangle =\frac{1}{\sqrt{8}}\left(\sqrt{3}|00\rangle +\sqrt{3}|01\rangle +0|10\rangle +\sqrt{2}|11\rangle \right)$$

Figure 2 presents the initialized state, \(|\psi \rangle\), over two qubits in a Bloch sphere (Footnote 4) representation. Using the initialized state vector, the method used the \(M(a,r)\) parametrized gate and estimated its expected value in two manners:

Fig. 2 Initialized qubits in Bloch sphere representation

$$M\left(k,1\right)|\psi \rangle =\frac{1}{\sqrt{8}}\left[\begin{array}{cccc}\sqrt{{1}^{i}}& 0& 0& 0\\ 0& \sqrt{{2}^{i}}& 0& 0\\ 0& 0& \sqrt{{3}^{i}}& 0\\ 0& 0& 0& \sqrt{{4}^{i}}\end{array}\right]\left(\begin{array}{c}\sqrt{3}\\ \sqrt{3}\\ \sqrt{0}\\ \sqrt{2}\end{array}\right)=\frac{1}{\sqrt{8}}\left(\begin{array}{c}\sqrt{{1}^{i}}\sqrt{3}\\ \sqrt{{2}^{i}}\sqrt{3}\\ 0\\ \sqrt{{4}^{i}}\sqrt{2}\end{array}\right)$$
$${\varphi }_{1}=\langle \psi {\prime}|M(k,1)|\psi \rangle =\frac{1}{\sqrt{8}}{\left(\begin{array}{c}{\sqrt{1}}^{2-i}\sqrt{3}\\ {\sqrt{2}}^{2-i}\sqrt{3}\\ {\sqrt{3}}^{2-i}\sqrt{0}\\ {\sqrt{4}}^{2-i}\sqrt{2}\end{array}\right)}^{T}\left(\begin{array}{c}\sqrt{{1}^{i}}\sqrt{3}\\ \sqrt{{2}^{i}}\sqrt{3}\\ 0\\ \sqrt{{4}^{i}}\sqrt{2}\end{array}\right)=\frac{1}{\sqrt{8}}\left(1\cdot 3+2\cdot 3+0+4\cdot 2\right)=\frac{2.125}{\sqrt{8}}$$
$$M(k,2)|\psi \rangle =\frac{1}{\sqrt{8}}\left[\begin{array}{cccc}\sqrt{{1}^{2i}}& 0& 0& 0\\ 0& \sqrt{{2}^{2i}}& 0& 0\\ 0& 0& \sqrt{{3}^{2i}}& 0\\ 0& 0& 0& \sqrt{{4}^{2i}}\end{array}\right]\left(\begin{array}{c}\sqrt{3}\\ \sqrt{3}\\ \sqrt{0}\\ \sqrt{2}\end{array}\right)=\frac{1}{\sqrt{8}}\left(\begin{array}{c}\sqrt{{1}^{2i}}\sqrt{3}\\ \sqrt{{2}^{2i}}\sqrt{3}\\ 0\\ \sqrt{{4}^{2i}}\sqrt{2}\end{array}\right)$$
$${\varphi }_{2}=\langle \psi {\prime}|M(k,2)|\psi \rangle =\frac{1}{\sqrt{8}}{\left(\begin{array}{c}{\sqrt{1}}^{{2}^{2}-2i}\sqrt{3}\\ {\sqrt{2}}^{{2}^{2}-2i}\sqrt{3}\\ {\sqrt{3}}^{{2}^{2}-2i}\sqrt{0}\\ {\sqrt{4}}^{{2}^{2}-2i}\sqrt{2}\end{array}\right)}^{T}\left(\begin{array}{c}\sqrt{{1}^{2i}}\sqrt{3}\\ \sqrt{{2}^{2i}}\sqrt{3}\\ 0\\ \sqrt{{4}^{2i}}\sqrt{2}\end{array}\right)=\frac{1}{\sqrt{8}}\left(3+{2}^{2}\cdot 3+0+{4}^{2}\cdot 2\right)=\frac{5.875}{\sqrt{8}}$$

Lastly, using the classical computer computation, the method returned:

$${\varphi }_{2}-{\varphi }_{1}^{2}=\frac{5.875}{\sqrt{8}}-\frac{{2.125}^{2}}{8}=1.512$$

The classical computer calculation yielded a variance of 1.553, and the error between the classical and quantum computation was 0.021.
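As a cross-check of the classical figure, the variance of \(A\) can be computed with Python's standard `statistics` module. This is a minimal sketch chosen for illustration and is not necessarily the tool used in the study. The reported value of 1.553 appears to correspond to the sample variance (denominator \(n-1\)) rather than the population variance (denominator \(n\)).

```python
import statistics

A = [1, 2, 2, 2, 1, 4, 1, 4]

sample_var = statistics.variance(A)       # n - 1 in the denominator, about 1.5536
population_var = statistics.pvariance(A)  # n in the denominator, about 1.3594

print(sample_var, population_var)
```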

3.2 The diabetes dataset

This section compares the proposed method with the classical computer calculation on a real dataset. For the comparison, we used the diabetes dataset (Kahn 1994), which includes records of 768 diabetic and non-diabetic women. It consists of eight features and a Boolean target variable. The “BMI” and “DiabetesPedigreeFunction” features were removed since the proposed method is designed for discrete values. A classical computer was used to calculate each feature's variance, and an IBM simulator and the proposed method were used to calculate its quantum variance. Table 1 compares the variance values achieved for each feature and presents the error rate.

Table 1 A variance comparison of the diabetes dataset features

The results showed the consistency of the proposed method across the six features of the dataset. The minimal error occurred in the “Age” column, with a deviation of only 0.021% from the variance calculated using a classical computer. The maximum error was obtained in the “Blood Pressure” column, with an error of 5.996%. The calculation of the distribution measures of each feature in the dataset according to the proposed method was consistent and showed a high agreement with the original value. At the same time, further analysis is required to examine the proposed method, as will be presented in Section 4.

To understand the behavior of the “Blood Pressure” feature, which yielded the maximum error of 5.996%, Fig. 3 shows its distribution estimation. It is known that anomalous values in the distribution can cause biased results and a wide error range when calculating distribution measures, such as mean and variance. Approximately 40 records were defined as outliers, which may have caused a significant change in the variance estimation.

Fig. 3 The distribution of the “Blood Pressure” feature

4 Results

Five discrete value distributions were compared to assess and evaluate the results of the proposed method (Table 2). For each of the following distributions, 10,000 experiments were conducted:

  1. Binomial distribution, \(Bin(n,p)\), with a success probability of \(p\) in a total of \(n\) trials.

  2. Geometric distribution, \(G(p)\), with success probability \(p\).

  3. Uniform distribution between \(a,b\in {\mathbb{N}}\), denoted as \(U(a,b)\).

  4. Hypergeometric distribution, \(HG\left(N,D,n\right)\), with a total of \(N\) items, \(D\) specials, and \(n\) trials.

  5. Poisson distribution, \(Pois(\lambda )\), where \(\lambda\) is the expected value of events in an interval of time.
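For reference, the closed-form variances of these five distributions can serve as ground truth when comparing estimates. The sketch below lists the standard textbook formulas in plain Python; the function names are hypothetical, and the uniform case assumes that \(U(a,b)\) covers the integers from \(a\) to \(b\) with both endpoints included.

```python
def var_binomial(n, p):
    """Variance of Bin(n, p)."""
    return n * p * (1 - p)

def var_geometric(p):
    """Variance of G(p); the same for the 'trials' and 'failures' conventions."""
    return (1 - p) / p ** 2

def var_discrete_uniform(a, b):
    """Variance of the uniform distribution over the integers a..b (endpoints included)."""
    return ((b - a + 1) ** 2 - 1) / 12

def var_hypergeometric(N, D, n):
    """N items in total, D special items, n draws without replacement."""
    return n * (D / N) * (1 - D / N) * (N - n) / (N - 1)

def var_poisson(lam):
    """Variance of Pois(lambda) equals its mean."""
    return lam

print(var_binomial(10, 0.5), var_geometric(0.5), var_poisson(3.0))  # 2.5 2.0 3.0
```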

Table 2 Variance comparison over discrete value distributions

The results of the comparison between the distributions showed a stronger consistency than the results of the diabetes dataset (described in Section 3.2). The highest error had a value of 1.351, which occurred in the binomial distribution with a probability of success of 0.7 in a single trial. For all the examined distributions, the quantum method maintained low error values, which reinforced the implementation of the proposed method and its results.

To examine the effect of the input size (the number of qubits) on the performance of our method, the following experiments were conducted:

  1. Binomial distribution: Inputs of sizes \({2}^{i}\) for \(3\le i\le 8\) were created, which required \(i\) qubits, respectively. The binomial distribution was used due to the ability to control the input size (unlike geometric distribution). The variance obtained was compared using a classical and quantum calculation for each of the probabilities between 0.1 and 0.9 with a step of 0.1 (i.e., for each examined input, there was a total of nine pairs of variances). Figure 4 presents this comparison, where each dot in the figure represents a probability. For example, nine qubits have a total of 18 points (nine for the quantum calculation and nine for the classic calculation). Figure 4 examines the level of agreement between the quantum method and the classical calculation for different variance values. In cases without agreement between the quantum and classical calculation, Fig. 4 would present a grouping of red and black points separately.

  2. Hypergeometric distribution: Inputs of size 10,000 were created, with a total of \(N=500, n=200\), and the values of \({2}^{i}\) were examined, where \(2\le i\le 7\) for \(D\). Figure 5 shows the comparison between the calculated variance over different values of \(D\). Like Fig. 4, the x-axis represents the number of qubits required to encode the input, and the y-axis shows the variances obtained for different \(D\) values.

Fig. 4 A comparison between the binomial variance over different numbers of qubits

Fig. 5 A comparison between the hypergeometric variance over different numbers of qubits

According to the results of the binomial distribution, it can be concluded that the proposed method showed reliable performance, even among data with high variability (i.e., high number of qubits). This can be seen in the agreement between the red points (which represent quantum calculation) and the gray points (which represent classical calculation). Similar to the binomial distribution, the hypergeometric distribution showed consistency between the classical and quantum calculations. However, small inputs (which required 2 or 3 qubits) showed a wider range of answers and gaps between the classical and quantum computations. These were not gaps that presented an error; rather, they illustrated the difficulty in estimating the exact values on a quantum computer.

5 Conclusions and discussion

This study proposes a novel quantum-parametrized gate and circuit implementation for distribution measure estimations. The presented method can be implemented in data analysis processes, machine learning techniques, and more. Its main innovation is the use of a parametrized quantum circuit to calculate the expectation and variance of a given vector. As a result, this method is accessible to those without a previous understanding of QC. The following are the main conclusions:

  1. The parametrized quantum gate proposed in this study was found to be effective in estimating the distribution values (i.e., expectation and variance) of a discrete value vector. When comparing the proposed and classical methods using the diabetes dataset (Section 3.2), the error range of the obtained variance ranged between 0.021% and 5.996%. The feature for which the maximum error was obtained, “Blood Pressure,” was a noisy column, and therefore, further research is required.

  2. In testing the discrete distributions (Section 4), inputs of different sizes were compared to examine the effect of the number of required qubits and the obtained result. A wide agreement was found between the variance calculation using a classical computer and the proposed method in QC. In these cases, in contrast to the data set presented in Section 3.2, a minimal error was obtained, even for noisy distributions with inconsistent variance values.

The main limitation of this study is that the method can only handle discrete values, because the classical computer preprocessing includes the creation of a frequency vector. In addition, this study raised four main issues that should be addressed in future studies. First, the generalization of the parametric gate and its adaptation to calculate additional distribution measures, such as skewness, kurtosis, and more, should be further examined. Second, an in-depth study should be conducted on the parameter values presented in this study and their optimal values for minimizing the error value. Third, an adaptation of the method for continuous distributions such as normal, exponential, and more would be beneficial. Last, amplitude encoding is efficient in that it uses a logarithmic number of qubits, although it requires a large number of quantum gates. Future studies are encouraged to explore the relationship between the input encoding and the number of quantum gates required to represent it in order to optimize the trade-off between these variables.