Error probability in polarization sensitive communication systems in terms of moments of the channel’s rotation angle

Polarization effects of channels such as turbulent free space and optical fibers can deteriorate the quality of polarization-sensitive communication schemes, for example, quantum key distribution. The exact nature of the channel is problematic to determine, and most theoretical results are based on approximations. We introduce a general model with probabilistic polarization angle rotation and analyze the average error it causes in terms of the moments of the probability distribution. For a small rotation angle variance, an inequality between distributions’ errors is considered based on their kurtosis values and potentially higher-order central moments. An upper bound is determined for the error probability, holding for all practically interesting scenarios.


Introduction
The state of polarization is a property of electromagnetic radiation that can be used to carry information or to distinguish between different channels in some communication systems, such as polarization shift keying (PolSK) modulation (Ghassemlooy et al. 2012; Zhang et al. 2018), polarization domain multiplexing (PDM) (Zhang et al. 2018; Cvijetic et al. 2010) or discrete variable quantum key distribution (DV-QKD) protocols using single photon polarization qubits (Gobby et al. 2004; Schmitt-Manderbach et al. 2007). Such systems typically use the optical (infrared or visible) part of the spectrum and include both wireless (Ghassemlooy et al. 2012; Cvijetic et al. 2010; Schmitt-Manderbach et al. 2007; Zhang et al. 2018) and guided applications (Gobby et al. 2004).
Although polarization can be exploited to increase channel capacity, to counteract problems of more conventional modulation schemes or to provide a robust flying qubit implementation, all these systems suffer from the depolarizing and polarization altering phenomena in the respective propagation medium, such as polarization rotation, polarization-dependent losses, and polarization-dependent phase delays. These effects contribute to inter-symbol interference (ISI) in PolSK, to crosstalk between channels in PDM and to increased quantum bit error rate (QBER) in DV-QKD.
We want to determine the amount of error that polarization effects in a specific channel introduce into these applications, that is, how often the polarization is measured to be orthogonal to the input state. The main physical channel types for polarization-based communication schemes are turbulent atmospheric channels and optical fiber channels, each having different polarization effects, briefly summarized in the following subsections.

Turbulent atmospheric channels
The depolarizing nature of turbulent atmospheric channels on polarized laser beams was studied as early as the 1960s, both theoretically (Strohbehn and Clifford 1967), for the plane wave case and predicting very small changes in the polarization angle, and experimentally (Höhn 1969). The theory suggested that the variance of the fluctuations increases with the turbulent strength, which, in turn, increases with the scintillation index and propagation distance. Measured results for the variance of the polarization angle were found to be several orders of magnitude higher than those predicted by theory, attributed to the incompleteness of derivations only focusing on turbulence, neglecting other atmospheric phenomena. The case of collimated laser beams was theoretically analyzed by Collett and Alferness (1972), suggesting that the discrepancy arises from the fact that plane wave theory was compared to measurements conducted on Gaussian beams. A 2009 space-to-ground experiment using circular polarization confirmed that atmospheric effects only induce a minor change in the state of polarization, amounting to 2.8% of QBER in a QKD scheme; most of the differences can be attributed to device and setup imperfections (Toyoshima 2009). These studies focused on the variance of polarization angle fluctuations, but not on the underlying probability distribution. Zhang (2014) also noted the absence of discussion about the distribution of polarization angle fluctuation induced in turbulent atmospheric channels. Based on small-angle approximations and the theoretical model described in Strohbehn and Clifford (1967), they derived that the distribution is normal (Gaussian), with a variance that increases with the turbulent strength. As the theoretical and measured experimental distributions agree well within angles of ±0.05 radians, their model seems to be well suited to cases where the changes in polarization are small.
They also studied the changes in polarization parameters, e.g., the azimuth angle, the ellipticity, and the Stokes parameters (Zhang et al. 2016), comparing experimental results to theoretical predictions, finding that the induced fluctuations increase with increasing turbulent strength, and they are more prominent at 1550 nm than at 780 nm.
Further theoretical calculations and numerical simulations were also reported (Zhang et al. 2017) based on the extended Huygens-Fresnel (EHF) principle, regarding the first and second moments (mean and variance) of the Stokes parameters as a function of the turbulent strength. This description is almost complete since any other polarization parameter can be readily calculated from the Stokes parameters. The mean values decrease due to depolarization, but their effects can be easily compensated, knowing the channel's properties. The variances increase with increasing structure constant or length, but they ultimately saturate. In Zhang et al. (2018), the research group performed an extensive analysis of how the polarization effects (namely, polarization-dependent losses, polarization angle rotation, and the polarization-dependent phase shift) affect the bit error rate of polarization-based free-space optical (FSO) systems in turbulent channels. They showed that for polarization multiplexing, all three effects degrade the performance, while for noncoherent polarization shift keying (PolSK), only the rotation is problematic. Simulations show that the mean polarization angle rotation starts to increase in moderate turbulence, saturating in strong turbulence between 10° and 12°, depending on the channel length.

Optical fiber channels
Another important channel for optical communications is the optical fiber channel commonly used in modern high-speed communication systems. A series of measurements regarding polarization angle and ellipticity angle fluctuations conducted on buried fiber cables at 1546 nm found that the typical rate of change in these parameters is slow, on the order of several hours and days (typically 2°-10° a day). Sudden changes happen rarely, and they can be attributed to causes like the movement of cables due to maintenance (Nicholson and Temple 1989). Long-term rotations like this can be easily compensated through polarization control (Xavier et al. 2009). The angles, however, were measured only every 75 seconds; therefore, no information can be found about short-term polarization fluctuations that cannot be tracked. A later study in 2003 compared the polarization angle fluctuations in buried/duct and aerial cables at around 1550 nm, recording every 7 seconds (Ogaki et al. 2003). The research showed that fluctuations have higher amplitudes in aerial cables (on the order of 25° per minute, as opposed to 6°-10° in buried cables for the angle of polarization; changes in ellipticity showed even higher values), mostly attributed to environmental effects, such as wind. Environmental impacts on aerial fibers have been the topic of further research. Although not directly concerned with polarization angle changes, Waddy et al. (2001) and Bao et al. (2004) verified by measurements that temperature gradients and moderate to high wind speeds cause the shortest time-scale changes (30-60 ms, measured at a resolution of 10 ms) in the state of polarization.
Imai and Matsumoto modeled the optical fiber as a series of small sections with fluctuating linear birefringence axis angles and birefringence values and analyzed the variance of latitude and longitude fluctuations on the Poincaré sphere for the output light. These coordinates correspond to twice the ellipticity and orientation angles, respectively (Imai and Matsumoto 1988). They found that latitude (ellipticity) fluctuation variances were independent of the input light ellipticity, and always higher than the longitude (orientation angle) variances. The latter also showed a sinusoidal dependence on the input ellipticity. Their analysis only allowed for small birefringence values. Using other approximative restrictions, it was deduced that the distributions for longitude and latitude fluctuations are normal, while the two are jointly normal. In the paper, it was predicted that both variances are proportional to the fiber length $l$, but measurements on fibers wound around drums showed a proportionality to $l^{1.3}$, attributed to inefficient randomization of fiber segments. (Total randomization would cause direct proportionality, while the uniform distribution of the directions of birefringence axes would lead to an $l^2$ dependence.) The work introduced a more complicated model, adding fiber sections with large and steady birefringence. In this case, the longitude and latitude fluctuation distributions were also determined to be normal. An experiment was also conducted on a submarine and an underground cable, resulting in good agreement with predictions. Polarization fluctuations were found to be higher by several orders of magnitude in case of the submarine cable, attributed to mechanical and temperature changes (the former are results of water movement, such as waves and tides).
A later paper built on the results of Imai and Matsumoto, also allowing for circular birefringence fluctuations in the fiber sections, obtaining slightly different outcomes (Sheng and Ling 1990). Using these assumptions, the latitude variance also depends on the input ellipticity, and it can be smaller than the longitude variance. Czegledi et al. introduced an extensive fiber model for random polarization drifts (Czegledi et al. 2016) accounting for all polarization parameters. The model was derived in the Jones, the Stokes-Mueller, and the 4D formalisms, concentrating on the polarization drift's effects on coherent PDM systems. They found that without compensation, the distribution of the possible output polarization states converges to the uniform distribution on the Poincaré sphere, which was in good agreement with measurements. The time scale on which changes are happening depends on a polarization linewidth factor.
Certain studies have been specifically concerned with the polarization compensation of quantum key distribution systems with fiber channels, which are significantly more sensitive than classical communication networks. Using only quantum signals, including control qubits, Almeida et al. developed a continuous, non-interruptive algorithm that can decrease the QBER below the highest acceptable values even in the presence of strong perturbances, without knowledge of the nature of polarization rotation (Almeida et al. 2016). The impact of random birefringence changes on QKD systems was also investigated in Pinto et al.

Research goals
The changes in polarization due to channel effects have been widely discussed, including the mean and variance of the polarization angle rotation, but little attention was paid to the underlying distribution(s). The papers that addressed this question always restricted the possible fluctuations to very small values, with assumptions and approximations generally leading to normal distributions. Our research aims to examine and compare arbitrary channels with random polarization rotation, without knowing the exact physical nature of the phenomenon. We are interested in how different channels compare in terms of an error parameter relating to polarization measurement, and whether there is a universal upper bound for said parameter amongst all possible probability distributions.
The main motivation is to provide an understanding of how polarization-sensitive communication schemes are affected by a channel that rotates the polarization state of light stochastically, following a well-defined probability distribution. Special concern is placed on the QBER of quantum key distribution protocols, an emerging research topic of recent years (Gyöngyösi et al. 2019). The principle of eavesdropper detection in QKD protocols is based on QBER measurements since any observation changes a quantum system's state. Therefore, if a physical channel introduces too many bit errors on its own, it can prevent the key distribution from taking place even in the absence of eavesdropping.
In this paper, we define an error parameter e, a function depending on the distribution and its variance, to quantify the decrease in quality introduced by the channel to the scheme. It is impractical to estimate the underlying distribution in real-life scenarios. However, it is shown that by estimating the polarization rotation's variance, an upper bound for e can be found. For a given DV-QKD protocol and a specific eavesdropping strategy, the relation between the error parameter and the quantum bit error rate can be obtained; see our previous work (Schranz and Udvary 2019) for an example. The same applies to PolSK and PDM systems. Thus, it can be assessed whether the channel can potentially cause false alerts and block the key sharing process.
The paper is organized as follows. Section 2 describes the channel model and the restrictions we use in our investigations and outlines how channels with different polarization angle rotation distributions could be compared. Section 3 introduces and elaborates a moment-based approach of comparison, and formulates three conjectures about it. Sections 4 and 5 offer numerous examples for the study of these conjectures, while Sect. 6 concludes the analysis.

The channel model
We define the error parameter e as the following: e is the probability that measuring the polarization of a single photon in the basis corresponding to its intended polarization yields an erroneous result, owing to the channel's effects. This choice of the parameter is due to the fact that cross-polarization leakage is a primary factor in causing bit errors, quantum bit errors, or inter-channel interference in polarization-sensitive communication schemes.
In this paper, we restrict the possible channel polarization effects to polarization rotation, neglecting polarization-dependent losses, and polarization-dependent phase shifts, while background noise is not discussed either. Using these assumptions, we can deal with ratios (or probabilities), greatly simplifying all further calculations. In the following sections, we discuss the validity of this simplification, the exact model, and a base for comparing the error parameter of different channels.

Limits of validity
An arbitrary normalized polarization state $|\phi\rangle$ can be written as a superposition of linear polarization basis states with coordinates in the x-y basis, as

$$|\phi\rangle = \cos(\alpha)\,\mathrm{e}^{i\delta_x}\,|x\rangle + \sin(\alpha)\,\mathrm{e}^{i\delta_y}\,|y\rangle = \mathrm{e}^{i\delta_x}\left[\cos(\alpha)\,|x\rangle + \sin(\alpha)\,\mathrm{e}^{i\delta}\,|y\rangle\right],$$

where $\delta_x$ and $\delta_y$ are the starting phases of the x and y components of the electric field, $\delta = \delta_y - \delta_x$ is the phase difference between the x and y components, and $i$ is the imaginary unit. The auxiliary angle $\alpha$ can be expressed as

$$\alpha = \arctan\left(\frac{E_{y0}}{E_{x0}}\right),$$

$E_{x0}$ and $E_{y0}$ being the amplitudes of the x and y field components. $\alpha$ is closely related to the polarization measurement probabilities in the x-y basis ($p_x$ and $p_y$ for the x and y polarizations, respectively):

$$p_x = |\langle x|\phi\rangle|^2 = \cos^2(\alpha), \qquad p_y = |\langle y|\phi\rangle|^2 = \sin^2(\alpha),$$

since $|x\rangle$ and $|y\rangle$ are properly normalized and orthogonal to each other: $\langle x|x\rangle = \langle y|y\rangle = 1$ and $\langle x|y\rangle = \langle y|x\rangle = 0$. $\alpha$ and $\delta$ describe the state of polarization in terms of the electric field's properties. A more graphic representation is the polarization ellipse (Fig. 1), using two parameters: the orientation angle $\psi$ (also called the polarization angle), and the ellipticity angle $\chi$. Generally, polarization rotation means that the original orientation angle $\psi$ is changed, becoming $\psi + \theta$, where $\theta$ is the rotation angle. $\psi$ and $\chi$ are described by trigonometric equations in terms of $\alpha$ and $\delta$:

$$\tan(2\psi) = \tan(2\alpha)\cos(\delta), \qquad \sin(2\chi) = \sin(2\alpha)\sin(\delta). \tag{3}$$

From Eq. 3, it is clear that for a small phase difference $\delta$, the ellipticity is negligible, while the orientation and auxiliary angles are almost equal ($\psi \approx \alpha$); therefore the measurement probabilities only depend on the polarization angle (and its rotation). The measurement probabilities, written here in quantum mechanical terms, are well known from Malus' law, which states that if a polarizer or polarization beam splitter (PBS) is irradiated by linearly polarized light angled at $\theta$ relative to the polarizing/transmission axis, a proportion of $\cos^2(\theta)$ of the light's intensity will be transmitted (Collett 2005). It follows that a proportion of $\sin^2(\theta)$ will be either blocked or reflected, depending on the type of device (presuming a lossless polarizer or beam splitter). This can be translated to the single-photon level. A photon is prepared and sent as linearly polarized along a certain direction, and its polarization is measured in the basis formed by vectors parallel and orthogonal to that direction. However, if its polarization angle is rotated along the way by an angle $\theta$, the probability of measuring the state to be orthogonal to the original state is exactly $\sin^2(\theta)$. The corresponding error parameter is then, by our definition, $e = \sin^2(\theta)$.

Fig. 1 The polarization ellipse. $\psi$ is the orientation (polarization) angle, $\chi$ the ellipticity angle, $\alpha$ the auxiliary angle; $E_{x0}$ and $E_{y0}$ are the amplitudes of the x and y electric field components

Fundamental assumptions
In our analysis, we assume that the input polarization state $|\phi\rangle_{\mathrm{in}}$ is horizontal linear, easily achieved in practice by putting a linear polarizer in the path of the emitted light. The corresponding parameters are $\alpha = \psi = \chi = \delta = 0$. The channel effect is modelled as a probabilistic polarization angle rotation, where the rotation angle $\theta(t_n = nT) = \theta_{t_n}$ is a stochastic process (Fig. 2). If we assume independence of the input state, the unitary Jones matrix describing the time-dependent channel can be written as

$$\mathbf{U}(t_n) = \mathrm{e}^{i\delta_H}\begin{pmatrix} \cos(\theta_{t_n}) & -\sin(\theta_{t_n}) \\ \sin(\theta_{t_n}) & \cos(\theta_{t_n}) \end{pmatrix},$$

where $t_n$ is the label of time when the $n$th state is sent, and $T$ is the time between successive transmissions. $\delta_H$ is a polarization independent phase delay due to propagation. Attenuation factors are neglected, as we do not want to deal with states that are not measured eventually. It needs to be noted that this channel is noiseless for circularly polarized input states, $|\phi\rangle_{\mathrm{in}} = \frac{1}{\sqrt{2}}\left(|x\rangle \pm i\,|y\rangle\right)$, since rotating a circle's "polarization angle" only changes the starting phase, which is irrelevant in terms of measurement. However, semiconductor lasers are often intrinsically linearly polarized, and manipulating their light to be circularly polarized can prove to be costly; moreover, the devices (e.g. quarter-wave plates) may require special auxiliary equipment (Rauter 2014). For this reason, the assumption of linearly polarized input states is justifiable in practical scenarios.
The resulting output polarization state $|\phi_{t_n}\rangle_{\mathrm{out}}$ is also linearly polarized ($\alpha = \psi = \theta_{t_n}$, $\chi = \delta = 0$):

$$|\phi_{t_n}\rangle_{\mathrm{out}} = \mathrm{e}^{i(\delta_x + \delta_H)}\left[\cos(\theta_{t_n})\,|x\rangle + \sin(\theta_{t_n})\,|y\rangle\right].$$

Polarization is measured in the x-y basis, collapsing the state into either $|x\rangle$ or $|y\rangle$ with probabilities $p_x = \cos^2(\theta_{t_n})$ and $p_y = \sin^2(\theta_{t_n})$, respectively (Ding 2017; Pinto et al. 2019). Note that when $|\phi\rangle_{\mathrm{in}} = |x\rangle$, it follows from our definition that the error parameter e equals $p_y$. The global phase $\delta_x + \delta_H$ does not influence the measurements, and can be set to zero without loss of generality. $\{\theta_{t_n} \mid n \in \mathbb{Z}^+\}$ is a set of independent, identically distributed random variables that follow a probability distribution characterized by a non-negative probability density function (PDF) $f_\theta(\vartheta)$ symmetric around zero. The PDF is thus an even function of $\vartheta$ (Eq. 8):

$$f_\theta(\vartheta) = f_\theta(-\vartheta). \tag{8}$$

Zero mean is justified by the polarization rotation compensation methods mentioned above that can suppress the effects of slow polarization changes. These assumptions make the analysis easier at the expense of generality. Since all $\theta_{t_n}$ are assumed to be independent and identically distributed, we use the notation $\theta$ without subscript. Further assuming the existence and finiteness of all the moments of $\theta$ (later necessary for more straightforward comparison between distributions), the condition in Eq. 8 implies for the expectation value of the random variable that

$$\mathbb{E}[\theta] = \int_{-\infty}^{\infty} \vartheta\, f_\theta(\vartheta)\, \mathrm{d}\vartheta = 0. \tag{9}$$

If the rotation angle $\theta$ is not constant, but a random variable following a particular probability distribution, the average error parameter should be calculated as the expected value of the function $\sin^2(\theta)$ (Eq. 10):

$$e = \mathbb{E}\left[\sin^2(\theta)\right] = \int_{-\infty}^{\infty} \sin^2(\vartheta)\, f_\theta(\vartheta)\, \mathrm{d}\vartheta. \tag{10}$$
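As a minimal numerical sketch of the expectation in Eq. 10 (not the authors' code), the average error parameter can be approximated by integrating sin²(ϑ) against a chosen PDF. The zero-mean normal PDF, the value σ = 0.2 rad, the integration limits, and all function names below are illustrative assumptions.

```python
import math

def average_error(pdf, lower, upper, steps=20000):
    """Approximate e = E[sin^2(theta)] with the composite midpoint rule."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        theta = lower + (k + 0.5) * width
        total += math.sin(theta) ** 2 * pdf(theta)
    return total * width

def normal_pdf(sigma):
    """Zero-mean normal PDF with standard deviation sigma (illustrative choice)."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda theta: norm * math.exp(-theta ** 2 / (2.0 * sigma ** 2))

sigma = 0.2  # rad, hypothetical rotation-angle standard deviation
# Integrate over +/- 10 sigma; the tails beyond that are negligible here.
e_numeric = average_error(normal_pdf(sigma), -10.0 * sigma, 10.0 * sigma)
e_fixed = math.sin(sigma) ** 2  # error of a constant rotation by sigma, for comparison
print(f"average error (normal PDF): {e_numeric:.6f}")
print(f"fixed rotation by sigma:    {e_fixed:.6f}")
```

Any other symmetric, zero-mean PDF can be passed to `average_error` in the same way, provided the integration limits cover its support or its significant tails.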

Relevant measures for channel comparison, previous results
Given two channels with different polarization rotation angle distributions, there is a natural need to compare their error parameters. However, there is little point in contrasting drastically different distributions, e.g., a normal distribution with very small variance and some other distribution covering all angles uniformly; therefore, we need to find a relevant measure such that distributions sharing the same value of that measure can be compared.
We chose this measure to be the variance (or equivalently, the standard deviation) of the distributions for two main reasons: it is easy to measure or at least estimate, and in many cases, PDFs can be easily reparametrized using the variance or the standard deviation, instead of the natural parameters.
In our previous paper (Schranz and Udvary 2019), we calculated analytic formulae for the error parameters of three different distributions, as functions of the standard deviation. These are the symmetric two-point (a distribution with two possible, equiprobable outcomes $\theta_0$ and $-\theta_0$), the uniform, and the normal distributions. Their respective error parameters in terms of the standard deviation are given in Eqs. 11-13:

$$e_{\mathrm{tp}} = \sin^2(\sigma), \tag{11}$$

$$e_{\mathrm{uni}} = \frac{1}{2} - \frac{\sin\left(2\sqrt{3}\,\sigma\right)}{4\sqrt{3}\,\sigma}, \tag{12}$$

$$e_{\mathrm{norm}} = \frac{1}{2}\left(1 - \mathrm{e}^{-2\sigma^2}\right). \tag{13}$$

For small non-zero deviations, the inequality $e_{\mathrm{tp}} > e_{\mathrm{uni}} > e_{\mathrm{norm}}$ holds. However, $e_{\mathrm{tp}}$ is a periodic function oscillating between 0 and 1, whereas the latter two tend to 0.5 as $\sigma \to \infty$. Note that it is not necessarily possible to find an analytic solution for every probability distribution. Thus, numerical calculations are required in most cases. In the next section, we extend the investigation to further examples to provide a more conclusive comparison between distributions allowed by our previous restrictions.
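The ordering stated above can be checked with a short sketch. It assumes the closed forms sin²(σ), 1/2 − sin(2√3σ)/(4√3σ) and (1 − exp(−2σ²))/2, which follow from evaluating E[sin²(θ)] for the two-point, uniform and normal distributions; the function names and the sample σ values are illustrative.

```python
import math

def e_two_point(sigma):
    """Symmetric two-point distribution at +/- sigma."""
    return math.sin(sigma) ** 2

def e_uniform(sigma):
    """Uniform distribution on [-a, a] with a = sqrt(3) * sigma."""
    a = math.sqrt(3.0) * sigma
    return 0.5 - math.sin(2.0 * a) / (4.0 * a)

def e_normal(sigma):
    """Zero-mean normal distribution with standard deviation sigma."""
    return 0.5 * (1.0 - math.exp(-2.0 * sigma ** 2))

small_sigmas = (0.05, 0.1, 0.2, 0.5)  # rad, hypothetical sample points
for sigma in small_sigmas:
    print(f"sigma={sigma:.2f}  two-point={e_two_point(sigma):.6f}  "
          f"uniform={e_uniform(sigma):.6f}  normal={e_normal(sigma):.6f}")

# For large sigma the uniform and normal values approach 0.5.
print(e_uniform(100.0), e_normal(100.0))
```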

Error parameters in terms of moments of distributions
To offer a reasonable mathematical background for comparing the error caused by two different channels with the same rotation angle variance, we first lay down some necessary basics from probability theory, provide reasoning for the validity limits of the comparison, and introduce notations that shall be used later on in this paper.

Moment inequalities
Given a random variable $X$ following a distribution described by the properly normalized probability density function $f_X(x)$, we can define its $n$th raw ($\mathbb{E}[X^n]$, Eq. 14) and central ($\mu_n$, Eq. 15) moments as follows:

$$\mathbb{E}[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\, \mathrm{d}x, \tag{14}$$

$$\mu_n = \mathbb{E}\left[(X - \mu)^n\right] = \int_{-\infty}^{\infty} (x - \mu)^n f_X(x)\, \mathrm{d}x. \tag{15}$$

Here $\mu = \mathbb{E}[X]$ is the expectation value (first raw moment) of the distribution. We denote the second central moment, the variance, as $\sigma^2 = \mu_2$, where $\sigma$ is the standard deviation. For distributions with $\mu = 0$, all raw and central moments coincide ($\mathbb{E}[X^n] = \mu_n$, $\forall n$). Furthermore, the $2n$th standardized central moment (or moment coefficient) $\beta_{2n-2}$ can be defined by normalizing the central moment with the $2n$th power of $\sigma$:

$$\beta_{2n-2} = \frac{\mu_{2n}}{\sigma^{2n}}.$$

A similar definition exists for odd-index $\beta$s; however, these do not directly represent the standardized central moment of order $2n + 1$.
Since we normalize with the respective powers of the standard deviation, the second normalized central moment is, by definition, $\beta_0 = 1$. With this notation, the kurtosis (fourth normalized central moment) is denoted as $\beta_2$. Note that in the following, we exclusively use the above definition of kurtosis, not to be mistaken for the widely used excess kurtosis defined as $\gamma_2 = \beta_2 - 3$. Even-index normalized central moments are known to form a non-decreasing series (Shohat 1929). Moreover, the following is true for any three neighbouring moments: $\beta_{2n-2} \geq \beta_{2n-3} + \beta_{2n-4}$. For distributions symmetric around zero, the first fact originates simply from the second. Symmetric distributions have odd central moments that are equal to zero, provided that they exist, simplifying the expression to $\beta_{2n-2} \geq \beta_{2n-4}$. Since $\beta_0 = 1$, we can conclude that for such distributions, every even-index normalized central moment is greater than or equal to one (Pearson 1929).
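To illustrate the even-index standardized moments, the following sketch estimates β₀, β₂ and β₄ numerically for two zero-mean example PDFs. The midpoint-rule integrator, the choice of a uniform PDF on [−1, 1] and a unit-variance normal PDF, and all names are assumptions made for this example.

```python
import math

def even_moment(pdf, order, lower, upper, steps=20000):
    """Midpoint-rule estimate of the moment of given order about zero (zero-mean PDF assumed)."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width
        total += x ** order * pdf(x)
    return total * width

def beta(pdf, index, lower, upper):
    """Standardized central moment beta_index = mu_(index+2) / sigma^(index+2), index even."""
    order = index + 2
    variance = even_moment(pdf, 2, lower, upper)
    return even_moment(pdf, order, lower, upper) / variance ** (order / 2)

def uniform_pdf(x):
    """Uniform PDF on [-1, 1]; only evaluated inside its support here."""
    return 0.5

def normal_pdf(x):
    """Standard normal PDF."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

uniform_betas = [beta(uniform_pdf, i, -1.0, 1.0) for i in (0, 2, 4)]
normal_betas = [beta(normal_pdf, i, -12.0, 12.0) for i in (0, 2, 4)]
print("uniform beta_0, beta_2, beta_4:", uniform_betas)
print("normal  beta_0, beta_2, beta_4:", normal_betas)
```

Both sequences start at 1 and are non-decreasing, in line with the inequalities quoted above for symmetric distributions.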

Power series expansion of the error parameter
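To obtain an expression for e in terms of any distribution's moments, we start with the Taylor series of $\sin^2(\theta)$ about zero:

$$\sin^2(\theta) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{2^{2n-1}}{(2n)!}\, \theta^{2n}.$$

Inserting the series into the definition integral of the error parameter,

$$e = \int_{-\infty}^{\infty} \sum_{n=1}^{\infty} (-1)^{n+1} \frac{2^{2n-1}}{(2n)!}\, \vartheta^{2n} f_\theta(\vartheta)\, \mathrm{d}\vartheta,$$

we ultimately arrive at a power series representation of e in terms of the distribution's standard deviation $\sigma$.
After switching the order of the integral and the infinite sum, we can observe that the new integral is, by definition, the $2n$th raw moment of the distribution. Since $\mathbb{E}[\theta] = 0$ for the distributions we are interested in, the raw and central moments coincide, allowing us to replace the raw moments by the product of the $2n$th standardized central moments $\beta_{2n-2}$ and the $2n$th powers of $\sigma$. Thus, the new power series appears in terms of $\sigma$:

$$e = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{2^{2n-1}}{(2n)!}\, \beta_{2n-2}\, \sigma^{2n} = \sigma^2 - \frac{\beta_2}{3}\sigma^4 + \frac{2\beta_4}{45}\sigma^6 - \cdots$$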
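A short sketch can show how quickly the moment-based power series converges for small σ. It assumes a zero-mean normal distribution, for which the even standardized moments are the double factorials β₂ₙ₋₂ = (2n − 1)!! and the exact error parameter is (1 − exp(−2σ²))/2; the function names and σ = 0.3 rad are illustrative choices.

```python
import math

def series_error(betas, sigma):
    """Partial sum of the power series; betas[n-1] holds beta_(2n-2)."""
    total = 0.0
    for n, beta in enumerate(betas, start=1):
        coeff = (-1) ** (n + 1) * 2 ** (2 * n - 1) / math.factorial(2 * n)
        total += coeff * beta * sigma ** (2 * n)
    return total

def normal_betas(count):
    """First `count` even standardized moments of a normal PDF: 1, 3, 15, 105, ..."""
    betas, double_factorial = [], 1
    for n in range(1, count + 1):
        double_factorial *= 2 * n - 1
        betas.append(double_factorial)
    return betas

sigma = 0.3  # rad, hypothetical value below 1
exact = 0.5 * (1.0 - math.exp(-2.0 * sigma ** 2))
two_terms = series_error(normal_betas(2), sigma)   # sigma^2 - beta_2 * sigma^4 / 3
ten_terms = series_error(normal_betas(10), sigma)
print(f"exact={exact:.8f}  two terms={two_terms:.8f}  ten terms={ten_terms:.8f}")
```

The two-term truncation already lies close to the exact value at this σ, which is the regime in which comparing distributions through their kurtosis is meaningful.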
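Questions might be raised about the validity of interchanging the integral and the infinite sum. To see whether the two repeated integrals are equal, we need to apply the Fubini-Tonelli theorem and check for the absolute convergence of the integral (Tao 2011). The theorem can be applied, as real numbers $\mathbb{R}$ and natural numbers $\mathbb{N}$ form σ-finite measure spaces with the Lebesgue measure and the counting measure, respectively. Moreover, the function $g = (-1)^{n+1}\, 2^{2n-1}\, [(2n)!]^{-1} \cdot \vartheta^{2n} \cdot f_\theta(\vartheta)$ is measurable, since we assumed the existence and finiteness of all moments of the distribution, and integrability implies measurability (Cohn 2013). The integral of its absolute value is

$$I_{\mathrm{abs}} = \int_{-\infty}^{\infty} \sum_{n=1}^{\infty} |g|\, \mathrm{d}\vartheta = \int_{-\infty}^{\infty} \frac{\cosh(2\vartheta) - 1}{2}\, f_\theta(\vartheta)\, \mathrm{d}\vartheta = \frac{1}{4}\int_{-\infty}^{\infty} \mathrm{e}^{2\vartheta} f_\theta(\vartheta)\, \mathrm{d}\vartheta + \frac{1}{4}\int_{-\infty}^{\infty} \mathrm{e}^{-2\vartheta} f_\theta(\vartheta)\, \mathrm{d}\vartheta - \frac{1}{2}. \tag{29}$$

The integral $I_{\mathrm{abs}}$ is finite if and only if all the terms in Eq. 29 are finite. Since the third term is a constant, we only need to check the finiteness of the first and second terms. For this, it is necessary but not sufficient that $f_\theta(\vartheta)$ decays faster than $\mathrm{e}^{-2|\vartheta|}$ as $|\vartheta| \to \infty$ (Eq. 30):

$$\lim_{|\vartheta| \to \infty} \mathrm{e}^{2|\vartheta|} f_\theta(\vartheta) = 0. \tag{30}$$

Due to the symmetry constraints we imposed on the PDF, the first two terms are either both finite or both infinite; checking one of them is sufficient.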
There are distributions for which $I_{\mathrm{abs}}$ is unconditionally finite, such as those with finite support or the normal distribution; for some, it is finite only for a specific range of parameter or variance values (logistic, Laplace, etc.); while for others the integral is unconditionally infinite, e.g., the Cauchy distribution. The latter is an excellent example of a distribution we cannot compare to some of the others. All of its odd moments are undefined; all of its even moments are infinite (for every parameter value). Note, however, that even if $I_{\mathrm{abs}}$ fails to converge, interchanging the integral and the sum may yield the same result, as the theorem only provides a sufficient condition, not a necessary one. Due to this, some of the integrals performed in Sect. 4 might not yield the same result when the integral and the sum are interchanged, but this does not contradict the general idea of the power series representation of error parameters.
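The parameter-dependent case can be made concrete with the Laplace distribution. The sketch below assumes a zero-mean Laplace PDF with scale b (so that σ² = 2b²) and uses its moment generating function 1/(1 − b²t²), valid for |t| < 1/b, to evaluate the expectation of (cosh(2θ) − 1)/2; under these assumptions the integral is finite only for b < 1/2, i.e. σ < 1/√2. The function names, integration limits and sample values are illustrative.

```python
import math

def i_abs_laplace_closed(sigma):
    """E[(cosh(2 theta) - 1) / 2] for a zero-mean Laplace PDF with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)      # Laplace scale parameter, variance = 2 b^2
    if b >= 0.5:                    # PDF no longer decays faster than exp(-2|x|)
        return math.inf
    return 2.0 * b ** 2 / (1.0 - 4.0 * b ** 2)

def i_abs_laplace_numeric(sigma, upper=20.0, steps=100000):
    """Midpoint-rule estimate of the same integral on [0, upper], doubled by symmetry."""
    b = sigma / math.sqrt(2.0)
    width = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * width
        pdf = math.exp(-x / b) / (2.0 * b)
        total += 0.5 * (math.cosh(2.0 * x) - 1.0) * pdf
    return 2.0 * total * width

sigma_ok = 0.3   # rad, inside the convergent range (hypothetical value)
sigma_bad = 0.8  # rad, outside the convergent range (hypothetical value)
closed = i_abs_laplace_closed(sigma_ok)
numeric = i_abs_laplace_numeric(sigma_ok)
print(f"sigma={sigma_ok}: closed form={closed:.8f}, numeric={numeric:.8f}")
print(f"sigma={sigma_bad}: closed form={i_abs_laplace_closed(sigma_bad)}")
```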

Conjectures
Given a random distribution, we generally have no information about, and no upper limit on, the growth rate of $\beta_{2n-2}$ as $n \to \infty$. The values of two alternating series are difficult to compare unless further information is provided. As an approximation, we can examine only the first two terms.
We have also seen that the error parameter of the two-point distribution is periodic, starting from zero if $\sigma = 0$, while it can be seen by simple reasoning that for any continuous distribution $e \to 0.5$ as $\sigma \to \infty$. Therefore, the two-point distribution cannot provide an upper limit in terms of the error parameter for all values of $\sigma$. To counteract this, assume that the standard deviation of the distribution governing the polarization angle rotation is reasonably small ($\sigma < 1$), which coincides with practical situations. This further implies that

$$\sigma^2 > \sigma^4 > \sigma^6 > \cdots \tag{32}$$

Based on the two assumptions above, we formed the following conjectures, supported by our previous results (Eqs. 11-13).
Conjecture 1 Given two symmetric, zero-mean probability distributions with the same variance $\sigma^2$, and assuming that all their moments are finite, there exists a standard deviation $\sigma_0 > 0$ for which the following is true: the distribution with the higher kurtosis $\beta_2$ has a lower error parameter if $\sigma \in (0, \sigma_0)$.
This conjecture can be extended to cases where even $\beta_2$, $\beta_4$, etc. agree, taking into account one more term in the power series for every additional equal pair of moments, leading to two distinct cases due to the alternating nature of the series.

Conjecture 2
Given two symmetric, zero-mean probability distributions with equal central standardized moments up to $\beta_{2n-2}$, and assuming that all their moments are finite, there exists a standard deviation $\sigma_{2n-2} > 0$ for which the following is true: the distribution with the higher $\beta_{2n}$ has
• a lower error parameter if $\sigma \in (0, \sigma_{2n-2})$ and n is odd,
• a higher error parameter if $\sigma \in (0, \sigma_{2n-2})$ and n is even.
Since the distribution with the lowest central standardized moments is the symmetric two-point distribution, we can form another conjecture for the upper limit of the error parameter under the small-angle conditions. Note that this "strong form" of the conjecture is not directly a consequence of the previous statements.

Conjecture 3
Out of all symmetric, zero-mean probability distributions with variance $\sigma^2$ (assuming the finiteness of all moments), the distribution with the highest error parameter is the symmetric two-point distribution for $\sigma < 1$. The upper bound for the error parameter is then given by $e = \sin^2(\sigma)$.
In the following sections, we analyze the conjectures by comparing several different distributions and their error parameters.
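A quick spot check of the conjectured upper bound can be sketched as follows. It relies on the identity E[sin²(θ)] = (1 − E[cos(2θ)])/2, so that for a symmetric PDF the error parameter follows from the characteristic function evaluated at 2. The three example distributions, their characteristic functions, the σ grid and all names are assumptions of this sketch; passing the check on a few examples does not prove the conjecture.

```python
import math

def error_from_cf(cf_at_2):
    """e = (1 - E[cos(2 theta)]) / 2, with E[cos(2 theta)] the characteristic function at 2."""
    return 0.5 * (1.0 - cf_at_2)

def e_triangular(sigma):
    """Symmetric triangular PDF on [-a, a], variance a^2 / 6."""
    a = math.sqrt(6.0) * sigma
    return error_from_cf((math.sin(a) / a) ** 2)

def e_laplace(sigma):
    """Zero-mean Laplace PDF with scale b, variance 2 b^2."""
    b = sigma / math.sqrt(2.0)
    return error_from_cf(1.0 / (1.0 + 4.0 * b ** 2))

def e_normal(sigma):
    """Zero-mean normal PDF."""
    return error_from_cf(math.exp(-2.0 * sigma ** 2))

grid = [0.01 * k for k in range(1, 100)]  # sigma values in (0, 1) rad
candidates = {"triangular": e_triangular, "Laplace": e_laplace, "normal": e_normal}
worst_margin = {}
for name, fn in candidates.items():
    # Largest value of e - sin^2(sigma) on the grid; non-positive if the bound holds.
    worst_margin[name] = max(fn(s) - math.sin(s) ** 2 for s in grid)
    print(f"{name}: max(e - sin^2(sigma)) = {worst_margin[name]:.3e}")
```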

Comparing error parameters of different distributions
To see whether the first conjecture holds, we started by analyzing twelve symmetric zero-mean distributions, including those examined in our previous work. The list of distributions is given in the first 12 rows of Table 1, in increasing order of kurtosis. Every PDF was reparametrized with the standard deviation $\sigma$. The results presented here are numerical integral calculations performed by Matlab with absolute and relative error tolerances of $10^{-20}$, since analytic solutions can introduce errors near their singularities (mostly as $\sigma \to 0$) when evaluated by software with finite numerical precision. Figure 3 shows the error parameter graphs of seven selected distributions for a whole period of $\sin^2(\sigma)$. The two-point distribution is characterized by a periodic oscillation between 0 and 1, whereas the error parameter of every other distribution tends to 0.5 as $\sigma \to \infty$; some graphs show damped oscillations around the final value, while others asymptotically approach 0.5 from below. The triangular distribution is somewhat special, as its error parameter shows oscillatory behavior, but never actually overshoots 0.5. Figure 4 shows a magnified section of the graphs between standard deviations 0.17 and 0.2 rad, well above the small-variance approximation for the normality assumption in Zhang (2014). It becomes obvious that the absolute differences in the error parameter between different distributions are minor, the only exception being the exponential half-power distribution with an extremely high kurtosis value of 25.2. Furthermore, it is an indication that the two-point distribution's error parameter, $\sin^2(\sigma)$, is not only an upper bound for e at small $\sigma$ values but also an excellent approximation to most practically feasible scenarios.
Closer numerical inspection shows that none of the examples contradicts Conjecture 1. Most pairs of neighbouring distributions in Table 1 have a $\sigma_0$ larger than 1 rad. The only graphs that cross at smaller standard deviations are those of the triangular, raised cosine, and exponential cubic distributions, which have very similar kurtosis values of 2.4, 2.4062, and 2.4184, respectively. The triangular and raised cosine error parameters cross around $\sigma \approx 0.407$ rad, the raised cosine and exponential cubic ones around $\sigma \approx 0.465$ rad, and the triangular and exponential cubic ones around $\sigma \approx 0.443$ rad. Still, no disagreement with the conjectures was found.
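The first of these crossings can be located with a simple root search. The sketch below again assumes $e = (1 - \varphi(2))/2$ and uses the standard closed-form characteristic functions of the triangular and raised cosine distributions. The bisection routine and the bracket $[0.2, 0.5]$ rad are illustrative choices, not the procedure used to produce the values quoted above.

```python
import math

def e_triangular(sigma):
    # Symmetric triangular on [-a, a], a = sigma * sqrt(6)
    a = sigma * math.sqrt(6.0)
    return 0.5 * (1.0 - 2.0 * (1.0 - math.cos(2.0 * a)) / (2.0 * a) ** 2)

def e_raised_cosine(sigma):
    # PDF (1 + cos(pi x / s)) / (2 s) on [-s, s], variance s^2 (1/3 - 2/pi^2);
    # phi(t) = pi^2 sin(s t) / (s t (pi^2 - s^2 t^2)), evaluated at t = 2.
    # The removable singularity at 2 s = pi lies outside the bracket used below.
    s = sigma / math.sqrt(1.0 / 3.0 - 2.0 / math.pi ** 2)
    u = 2.0 * s
    phi = math.pi ** 2 * math.sin(u) / (u * (math.pi ** 2 - u ** 2))
    return 0.5 * (1.0 - phi)

def crossing(f, g, lo, hi, tol=1e-10):
    # Plain bisection on d = f - g; needs a sign change of d on [lo, hi]
    d_lo = f(lo) - g(lo)
    if d_lo * (f(hi) - g(hi)) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        d_mid = f(mid) - g(mid)
        if d_lo * d_mid <= 0:
            hi = mid
        else:
            lo, d_lo = mid, d_mid
    return 0.5 * (lo + hi)

sigma_0 = crossing(e_triangular, e_raised_cosine, 0.2, 0.5)
print("crossing at sigma =", sigma_0, "rad, e =", e_triangular(sigma_0))
```

Because the two curves differ only by a few parts in a million below the crossing, the bracket has to be chosen where the sign of the difference is known on both ends.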
It must be noted that the upper limit of QBER for the security of any two-level DV-QKD protocol is 0.11 against coherent attacks (Cerf et al. 2002) and $1/2 - \sqrt{1/8} \approx 0.1464$ against optimal individual attacks (Cerf et al. 2002; Fuchs et al. 1997; Bruß et al. 2000). Based on our definition of the error parameter $e$ and assuming that the channel errors are independent of the input state $|\phi\rangle_{\mathrm{in}}$, the QBER for an eavesdropping-free scenario would be exactly $e$ in protocols using both basis states, e.g., BB84 (Bennett and Brassard 1984).
In the presence of eavesdropping, however, the share of the QBER attributed to channel errors depends on the actual eavesdropping strategy and will be lower than $e$, since the two noise sources may counteract each other. As an example, this share is $e/2$ for the simple intercept-and-resend attack (Schranz and Udvary 2019). Altogether, for DV-QKD applications it suffices to examine whether the $e$ curves cross before they reach these security limits, since the protocol is aborted if the QBER exceeds them. All the $\sigma_0$ values found in our analysis are standard deviations at which the error parameters of the respective distributions exceed 0.11. Only the crossing with the lowest $\sigma_0$ value (0.407 rad) occurs before the error parameters of the triangular and raised cosine distributions reach 0.1464 (at $e \approx 0.14519$). Therefore, we can conclude that the inequalities stated in the conjectures seem to hold within practically acceptable levels of QBER.
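To relate these QBER limits to the rotation angle, one can invert the conjectured bound $e \le \sin^2(\sigma)$. The sketch below assumes that Conjecture 3 holds (it is unproven) and that no eavesdropping is present, so that the QBER equals $e$. The names `sigma_limit`, `QBER_COHERENT`, and `QBER_INDIVIDUAL` are illustrative.

```python
import math

QBER_COHERENT = 0.11                          # limit against coherent attacks
QBER_INDIVIDUAL = 0.5 - math.sqrt(1.0 / 8.0)  # limit against individual attacks

def sigma_limit(qber_limit):
    # Largest sigma (rad) for which sin^2(sigma) does not exceed qber_limit,
    # restricted to 0 <= sigma <= pi / 2
    return math.asin(math.sqrt(qber_limit))

print("coherent attacks:   sigma <=", sigma_limit(QBER_COHERENT), "rad")
print("individual attacks: sigma <=", sigma_limit(QBER_INDIVIDUAL), "rad")
print("pi / 8 =", math.pi / 8)
```

The second limit equals $\pi/8$ exactly, since $\sin^2(\pi/8) = (1 - \cos(\pi/4))/2 = 1/2 - \sqrt{1/8}$; the first is about 0.338 rad.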
For more in-depth examination, we extend the analysis by introducing two distribution scale families with easily controllable variance and kurtosis values.

Distribution scale families with simple kurtosis parametrization
Several well-known distributions have a fixed kurtosis that cannot be changed through their parameters, for example the normal distribution with $\beta_2 = 3$ or the Wigner semicircle distribution with $\beta_2 = 2$. To obtain data from distributions that are very close to each other in kurtosis, we are interested in scale families of distributions with at least two independent parameters, one of which sets the variance and another the value of $\beta_2$. The kurtosis should preferably be a continuous function of the respective parameter with an elementary inverse. Continuity is required for setting any desired value within an interval; the elementary inverse is necessary to find the exact parameter for a given kurtosis, not only an approximation.
There are infinitely many ways to construct distribution families suitable for this task. One is the Pearson type VII family, which can be freely parametrized for both variance and kurtosis, representing values of $\beta_2 \in [3, \infty)$. As another example, the family of exponential power distributions (containing the Laplace, normal, exponential cubic, quartic and half-power distributions, and also the uniform as the parameter goes to zero) could be parametrized for any $\beta_2 \in [1.8, \infty)$, but finding the right parameter would involve the non-elementary inverse of the gamma function (Forbes 2011), allowing only approximations. Moreover, neither family can produce kurtosis values smaller than 1.8, which seem to be responsible for the highest error probability at low variances. Therefore, they are not suitable for examining how distributions "close to" the two-point distribution behave. Also, distributions with very high kurtosis values would require high-precision computation.
We briefly describe two complementary scale families that both have the uniform distribution as a limiting case. Using these families, we can easily construct a distribution with any kurtosis value within the open interval $(1, 3.24)$. Thus, the kurtosis can be set arbitrarily close to 1, the value attained by the symmetric two-point distribution. Note that these families could be easily extended with a location parameter $\mu$ coinciding with the mean, shifting the support of the distribution away from zero. However, since we are only concerned with zero-mean distributions, this parameter can be omitted (set to zero) without any loss of generality.

U-polynomial scale family
A random variable $X$ follows a U-polynomial distribution with parameter $q$ if its probability density function is proportional to $|x|^q$ on a bounded interval $x \in [-a, a]$ and zero outside the interval. The name describes the shape of the distribution within its support for $q > 1$. The exact PDF is of the form
$$f_X(x) = \frac{q+1}{2a^{q+1}}\,|x|^q, \quad x \in [-a, a] \tag{33}$$
The variance and the fourth central moment $\mu_4$ are functions of both $q$ and $a$ (Eqs. 34, 35), but the kurtosis is a function of only $q$, monotonically decreasing for $q > 0$ (Eq. 36).
$$\sigma^2[X] = \frac{a^2 (q+1)}{q+3} \tag{34}$$
$$\mu_4[X] = \frac{a^4 (q+1)}{q+5} \tag{35}$$
$$\beta_2[X] = \frac{\mu_4}{\sigma^4} = \frac{(q+3)^2}{(q+5)(q+1)} \tag{36}$$
The kurtosis values lie in the half-open interval $(1, 1.8]$, since $\lim_{q \to 0} \beta_2 = 9/5 = 1.8$, $\lim_{q \to \infty} \beta_2 = 1$, and $\beta_2$ is a continuous function of $q$ at zero (Fig. 5). Since the kurtosis does not depend on the boundary parameter $a$, $q$ alone can be used to set its value. Afterwards, for a given $q$, we can set $a$ to control the variance independently of the kurtosis. The PDF can be easily reparametrized with the standard deviation $\sigma$ instead of the boundary parameter $a$ using a simple substitution originating from Eq. 34, allowing us to compare the error parameters with those of the previously mentioned distributions. The U-polynomial scale family includes the uniform ($q = 0$), the inverse triangular ($q = 1$) and the U-quadratic ($q = 2$) distributions, and also, as a limiting case, the symmetric two-point distribution ($q \to \infty$).
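The two-step parametrization (first $q$ from the kurtosis, then $a$ from $\sigma$) can be sketched as follows. The closed forms are the ones obtained by normalizing $|x|^q$ on $[-a, a]$, and the inverse $q = 2\sqrt{\beta_2/(\beta_2 - 1)} - 3$ follows from solving the kurtosis expression for $q$. All function names are illustrative.

```python
import math

def u_kurtosis(q):
    # Kurtosis of a PDF proportional to |x|^q on [-a, a]; independent of a
    return (q + 3.0) ** 2 / ((q + 5.0) * (q + 1.0))

def u_q_from_kurtosis(beta2):
    # Elementary inverse of u_kurtosis, valid for 1 < beta2 <= 1.8
    if not 1.0 < beta2 <= 1.8:
        raise ValueError("U-polynomial kurtosis must lie in (1, 1.8]")
    return 2.0 * math.sqrt(beta2 / (beta2 - 1.0)) - 3.0

def u_boundary_from_sigma(sigma, q):
    # Invert sigma^2 = a^2 (q + 1) / (q + 3) for the boundary parameter a
    return sigma * math.sqrt((q + 3.0) / (q + 1.0))

def u_pdf(x, sigma, q):
    # PDF reparametrized with sigma: (q + 1) |x|^q / (2 a^(q + 1)) on [-a, a]
    a = u_boundary_from_sigma(sigma, q)
    if abs(x) > a:
        return 0.0
    return (q + 1.0) * abs(x) ** q / (2.0 * a ** (q + 1.0))

# U-quadratic example: kurtosis 25/21, recovered parameter q = 2
beta2 = u_kurtosis(2.0)
print(beta2, u_q_from_kurtosis(beta2), u_boundary_from_sigma(0.3, 2.0))
```

As $q$ grows, `u_boundary_from_sigma` tends to $\sigma$ and the probability mass concentrates at $\pm a$, which is how the family approaches the symmetric two-point distribution.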

∩-polynomial scale family
A random variable $Y$ follows a ∩-polynomial (read as "cap-polynomial") distribution with parameter $q$ if its PDF is proportional to $a^{1/q} - |y|^{1/q}$ on a bounded interval $y \in [-a, a]$ and zero outside the interval. The name now describes the distribution's shape on its support for $q < 1$. In terms of attainable kurtosis values, this scale family is complementary to the U-polynomial family, since the two cover disjoint kurtosis intervals. The parameter $q$, as opposed to $1/q$, is chosen so that for $q \to 0$ the distribution converges pointwise to a uniform distribution, as does the U-polynomial scale family.
Similarly to the U-polynomial family, the variance and the fourth central moment are functions of both $q$ and $a$ (Eqs. 38, 39). The kurtosis is a function of only $q$, but in this case, monotonically increasing for $q > 0$ (Eq. 40).
The attainable kurtosis values lie in the open interval between $\lim_{q \to 0} \beta_2 = 9/5 = 1.8$ and $\lim_{q \to \infty} \beta_2 = 81/25 = 3.24$ (Fig. 5). As with the U-polynomial distributions, setting the kurtosis with $q$ and reparametrizing the probability density function with $\sigma$ instead of $a$ is a simple task. The ∩-polynomial scale family includes the triangular distribution ($q = 1$), and the uniform distribution as the limiting case $q \to 0$.
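The elementary inverse mentioned above can be written out explicitly. The following worked step is a sketch derived from the kurtosis expression of Eq. 40 and the variance of Eq. 38. Setting the kurtosis equal to a target value $\beta_2 \in (1.8, 3.24)$ and clearing denominators gives a quadratic in $q$ whose roots have a negative product, so exactly one root is positive.

```latex
\begin{align*}
\beta_2\,(25q+5)(q+1) &= (9q+3)^2 \\
(81-25\beta_2)\,q^2 + (54-30\beta_2)\,q + (9-5\beta_2) &= 0 \\
q &= \frac{15\beta_2 - 27 + 2\sqrt{5\beta_2\,(5\beta_2-9)}}{81-25\beta_2} \\
a &= \sigma\,\sqrt{\frac{9q+3}{q+1}}
\end{align*}
```

For example, $\beta_2 = 2.4$ gives $q = (9 + 2\sqrt{36})/21 = 1$, the triangular distribution, with $a = \sigma\sqrt{6}$.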

Testing the conjectures with the U-and ∩-polynomial families
We analyzed four additional distributions from each of the two families, each group serving a different purpose (see the final eight rows of Table 1 for their standardized central moments). The U-polynomials are meant to approximate the two-point distribution as the parameter increases ($q = 2, 4, 10, 20$). Two of the ∩-polynomials are chosen so that their kurtosis values match those of the Wigner semicircle ($\beta_2 = 2$, $q = (3 + 2\sqrt{10})/31$) and normal distributions ($\beta_2 = 3$, $q = 3 + \sqrt{10}$), the former having a higher and the latter a lower $\beta_4$ than its respective counterpart. These choices explicitly target the testing of Conjecture 2 in the case $n = 2$.
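The two matched parameter values can be checked numerically. The sketch below assumes that $\beta_4$ denotes the sixth standardized central moment $\mu_6/\sigma^6$. The expression for the ∩-polynomial sixth moment, $\mu_6 = a^6 (q+1)/(49q+7)$, is derived here in the same way as Eqs. 38 and 39 and is not quoted from the text. The function names are illustrative.

```python
import math

def cap_kurtosis(q):
    # Kurtosis of the cap-polynomial family (Eq. 40)
    return (9.0 * q + 3.0) ** 2 / ((25.0 * q + 5.0) * (q + 1.0))

def cap_sixth_standardized(q):
    # mu_6 / sigma^6 with mu_6 = a^6 (q + 1) / (49 q + 7) (derived, see lead-in)
    return (9.0 * q + 3.0) ** 3 / ((49.0 * q + 7.0) * (q + 1.0) ** 2)

q_semicircle = (3.0 + 2.0 * math.sqrt(10.0)) / 31.0  # kurtosis matched to 2
q_normal = 3.0 + math.sqrt(10.0)                     # kurtosis matched to 3

# Reference values of mu_6 / sigma^6 for the two counterparts
WIGNER_SIXTH = 5.0    # third Catalan number: semicircle moments are C_k (R/2)^(2k)
NORMAL_SIXTH = 15.0   # 5!! = 1 * 3 * 5 for the normal distribution

print(cap_kurtosis(q_semicircle), cap_sixth_standardized(q_semicircle), WIGNER_SIXTH)
print(cap_kurtosis(q_normal), cap_sixth_standardized(q_normal), NORMAL_SIXTH)
```

Under these assumptions, the first ∩-polynomial has a sixth standardized moment slightly above that of the Wigner semicircle, and the second one lies below that of the normal distribution, in line with the ordering described above.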
$$\sigma^2[Y] = \frac{a^2 (q+1)}{9q+3} \tag{38}$$
$$\mu_4[Y] = \frac{a^4 (q+1)}{25q+5} \tag{39}$$
$$\beta_2[Y] = \frac{\mu_4}{\sigma^4} = \frac{(9q+3)^2}{(25q+5)(q+1)} \tag{40}$$
Approximate $\sigma_0$ and $\sigma_2$ values found for successive distributions in the list are summarized in Table 2. Every value corresponds to the distribution in its row and the one directly below. Since calculations have only been carried out for $\sigma \in [0, 2\pi]$, $\sigma_0 = \infty$ is meant to represent the fact that both graphs seem to be monotonically increasing functions asymptotically tending to 0.5, with one of them converging more slowly than the other (thus potentially never rising above it). Based on Table 2, it can be seen that there is a non-zero $\sigma_0$ (or $\sigma_2$, for distributions with the same kurtosis) for every pair of distributions, even for those that are as close in the first few moments as the Wigner semicircle and the ∩-polynomial distribution with matching kurtosis.
Although the conjectures are mathematically unproven, the lack of counterexamples lets us suppose that the first conjecture and its extension are true. As a consequence, a weaker form of Conjecture 3 (a special case of the first) would also be true: $\sin^2(\sigma)$ is an upper bound for the error probability on an interval of non-zero measure. The strong form of Conjecture 3 possibly holds as well, even if its range of validity is extended to $\sigma \le \pi/2$ rad.

Conclusion
In our work, we created a simple channel model in which the input linear polarization is rotated by a stochastic process described by independent, identically distributed random variables $\theta_{t_n}$. We have shown that the error probability introduced by this channel (the error parameter $e$) can be expressed as a power series in terms of the rotation angle distribution's variance $\sigma^2$. Moreover, for small standard deviations ($\sigma \ll 1$), given two distributions with the same variance and different kurtosis $\beta_2$, the one with the higher kurtosis will cause a lower error probability. We did not find any counterexamples to this statement, outlined in the first conjecture, and the inequalities seem to hold for values of $\sigma$ such that $e < 0.11$, the upper limit of QBER for secure key distribution in QKD schemes. An upper bound for $e$ in terms of the rotation angle's standard deviation $\sigma$ was also found, attained by a symmetric two-point distribution, as $e_{\max} = \sin^2(\sigma)$. This function seems to provide the highest possible error probability for all $\sigma \le \pi/2$.
The importance of this comes from the fact that quantum bit errors may originate from two sources: polarization effects of the channel and eavesdropping. By measuring or estimating the rotation angle's variance $\sigma^2$ (e.g., by using multiphoton decoy states to measure the intensity ratio between the orthogonal polarizations) and modeling the channel with a symmetric two-point rotation angle distribution, we can derive the maximum amount of QBER that is due to the channel, and therefore the minimum amount due to eavesdropping. Further applications may include any communication system in which the state of polarization is essential, such as free-space optical links adopting polarization shift keying or polarization domain multiplexing.
The model is a simplification of real channels, not accounting for changes in the ellipticity of polarization and polarization-dependent losses. However, it can be expanded to incorporate all possible error sources and investigate how the error parameter behaves in a general case.