Goodness-of-fit test for the α-stable distribution based on the quantile conditional variance statistics

The class of α-stable distributions is ubiquitous in many areas including signal processing, finance, biology, physics, and condition monitoring. In particular, it allows efficient noise modeling and incorporates distributional properties such as asymmetry and heavy tails. Despite the popularity of this modeling choice, most statistical goodness-of-fit tests designed for α-stable distributions are based on generic distance measurement methods. To be efficient, those methods require large sample sizes and often do not efficiently discriminate distributions when the corresponding α-stable parameters are close to each other. In this paper, we propose a novel goodness-of-fit method based on quantile (trimmed) conditional variances that is designed to overcome these deficiencies and outperforms many benchmark testing procedures. The effectiveness of the proposed approach is illustrated using an extensive simulation study with focus set on the symmetric case. For completeness, an empirical example linked to plasma physics is provided.


Introduction
The α-stable distributions were first introduced in the 1920s in Lévy (1924); see also Khinchine and Lévy (1936). They are a natural extension of the Gaussian distribution which allows more sophisticated noise modelling. Indeed, due to the Generalized Central Limit Theorem, one can show that α-stable distributions attract distributions of sums of random variables with diverging variance, in the same sense that the Gaussian law attracts distributions with finite variance; see Jakubowski and Kobus (1989).
While α-stable distributions are characterized by four parameters, the most important one, arguably, is the stability index α ∈ (0, 2], which is responsible for the heavy-tailed behavior. In a nutshell, the smaller the α, the higher the probability that the corresponding random variable takes extreme values. For α < 2 the α-stable distributions belong to the wide class of heavy-tailed distributions and in this case the corresponding random variable has infinite variance. On the other hand, the α-stable distribution can be considered as an extension of the Gaussian one: indeed, for α = 2 it reduces to the normal distribution.
However, under the assumption that the random sample comes from the α-stable distribution, the fitting algorithms require a very large sample size to be effective and do not efficiently discriminate α-stable distributions when the parameters are close to each other; see Burnecki et al. (2012, 2015). In particular, most goodness-of-fit procedures for the α-stable distribution are based on generic distance measurement statistics, where the whole empirical and theoretical cumulative distribution functions (or characteristic functions) are compared with each other. Consequently, the computational time needed for comprehensive data assessment is rather high, at least compared to assessments based on simple statistics (like moments for finite-variance distributions). To sum up, there is a need for further development of fast and efficient α-stable distribution testing methods that are also effective for a small number of observations.
In this paper, we propose a new goodness-of-fit method based on the quantile conditional variance (QCV) class of statistics that is designed to overcome the aforementioned problems; these conditional variances are obtained by applying the standard variance function to doubly quantile-trimmed (sample) data; see Sect. 3 for details. Although the theoretical variance of an α-stable distribution with α < 2 is infinite, the QCVs always exist and can be used to fully characterize the α-stable distribution. Our approach is the basis of a novel testing procedure which seems to be superior in comparison to benchmark approaches based on the distance between theoretical and empirical distributions. Moreover, the test statistics we propose are very easy to implement as they are based on simple characteristics. As we demonstrate in this paper, the new testing procedure is very effective, also for small samples, and the proposed approach can be effectively used to discriminate light- and heavy-tailed distributions, even if they are close to each other, e.g. when α is close to 2. While in this paper we focus on tail-impact assessment tests for symmetric α-stable distributions, our approach could be generalised to tackle generic α-stable distribution fit assessment; potential generalisations are discussed in Sects. 6 and 7.
A similar approach based on the QCV statistics was recently proposed in Jelito and Pitera (2020) to assess tail heaviness in reference to the Gaussian distribution. It has been shown that an exemplary normality test based on the QCV statistic and the 20/60/20 Rule outperforms many benchmark frameworks including the Jarque-Bera, Anderson-Darling, and Shapiro-Wilk tests; see Jaworski and Pitera (2016), where the 20/60/20 Rule in the context of the Gaussian distribution is studied. Also, we refer to Hebda-Sobkowicz et al. (2020a, 2020b), where a similar approach was used in signal analysis for fault diagnostics. Moreover, it has been recently shown in Jaworski and Pitera (2020) that QCVs could be used to characterise the underlying distribution up to an additive constant; this lays out a theoretical basis for efficient testing procedures based on QCV statistics. While the QCV statistic is a natural (local) extension of the standard variance, so far it has not been considered in the literature apart from the mentioned papers. This might be surprising, as conditional second moments seem to be a very natural tool, e.g. for engineering and financial applications. In particular, this refers to the flexibility coming from using conditional second moments instead of higher-order unconditional moments; see e.g. Cioczek-Georges and Taqqu (1995). Our paper builds upon those remarks and exploits the interconnection between Jaworski and Pitera (2020) and Jaworski and Pitera (2016) in reference to α-stable distributions. While we focus on goodness-of-fit testing, our approach is in fact quite general and could be applied in other contexts, e.g. for parameter fitting.
The rest of the paper is organized as follows: in Sect. 2 we define α-stable distributions and indicate their main properties which are useful in the further analysis. In Sect. 3, we introduce the conditional variance characterisation of the α-stable distribution. Next, in Sect. 4 we analyze the sample quantile conditional variance for the α-stable distribution and the properties of the QCV estimator. Next, in Sect. 5, we propose an exemplary statistical test for the symmetric α-stable distribution and check its efficiency on simulated datasets. This is the core part of the paper as our focus is set on the symmetric case. In particular, we compare the effectiveness of the test with known algorithms used for α-stable distribution testing. Moreover, we demonstrate the efficiency of the new algorithm for the case when the stability index is close to 2. In Sect. 6, we comment on more generic statistical testing for the α-stable distribution. In particular, we show a possible approach to non-symmetric goodness-of-fit testing. Then, in Sect. 7, we introduce a novel generic α-stable distribution goodness-of-fit visual test based on the QCV statistics. Finally, in Sect. 8, we provide an empirical analysis which shows how the results presented in this paper could be applied. The datasets analyzed in this paper were studied already in Burnecki et al. (2015), where the problem of discriminating between light- and heavy-tailed distributions was discussed; see also Burnecki et al. (2012). Our results provide further (statistically significant) quantification of claims made in the past and point to additional features embedded in the empirical datasets that were not discussed before. The last section concludes the paper.

The α-stable distribution
The α-stable random variable X is typically characterized by four real parameters: the stability index α ∈ (0, 2], the scale parameter γ > 0, the skewness parameter β ∈ [−1, 1], and the location parameter μ ∈ ℝ. One of the most common definitions of the α-stable distribution is through its characteristic function; see e.g. Samorodnitsky and Taqqu (1994); Nolan (2020).
Definition 1 We say that X follows the α-stable distribution if its characteristic function is given by

E[exp(iθX)] = exp( −γ^α |θ|^α (1 − iβ sign(θ) tan(πα/2)) + iμθ ),  for α ≠ 1,
E[exp(iθX)] = exp( −γ |θ| (1 + iβ (2/π) sign(θ) ln|θ|) + iμθ ),  for α = 1,

where α ∈ (0, 2], β ∈ [−1, 1], γ > 0, and μ ∈ ℝ. For brevity, we write X ∼ S(α, γ, β, μ). Also, in the case β = 0, we say that X follows the symmetric (around the median μ) α-stable distribution and write X ∼ SαS(α, γ, μ).
The α-stable distributions are considered as a generalisation of the Gaussian distribution. In fact, for α = 2 the α-stable distribution is the Gaussian distribution with mean equal to μ and variance equal to 2γ²; the β parameter is irrelevant here. If α < 2, then the second moment (and thus the variance) of an α-stable distributed random variable X does not exist. Also, if α < 1, then the expected value of X is not finite. Let us now briefly discuss each parameter's role.
First, the stability index parameter α ∈ (0, 2] is responsible for the so-called heavy-tail property of the α-stable distribution. This property is strongly related to power-law behavior, which means that for α < 2, the distribution tail of X ∼ S(α, γ, β, μ) is a power function, namely

P[X > x] ∼ D_α (1 + β) γ^α x^{−α}, as x → ∞,

where D_α = (1/2) (∫₀^∞ x^{−α} sin(x) dx)^{−1} = (1/π) Γ(α) sin(πα/2) and Γ(·) is the Gamma function; see Samorodnitsky and Taqqu (1994).
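The closed form of D_α can be recovered from the classical integral ∫₀^∞ x^{s−1} sin(x) dx = Γ(s) sin(πs/2), valid for 0 < |s| < 1, combined with Euler's reflection formula; a short derivation (added here for completeness):

```latex
\int_0^\infty x^{-\alpha}\sin(x)\,dx
  = \Gamma(1-\alpha)\sin\!\Big(\tfrac{\pi(1-\alpha)}{2}\Big)
  = \Gamma(1-\alpha)\cos\!\Big(\tfrac{\pi\alpha}{2}\Big),
\qquad\text{so}\qquad
D_\alpha
  = \frac{1}{2}\Big(\Gamma(1-\alpha)\cos\big(\tfrac{\pi\alpha}{2}\big)\Big)^{-1}
  = \frac{\Gamma(\alpha)\sin(\pi\alpha)}{2\pi\cos(\pi\alpha/2)}
  = \frac{1}{\pi}\,\Gamma(\alpha)\sin\!\Big(\tfrac{\pi\alpha}{2}\Big),
```

where we used Γ(α)Γ(1 − α) = π/sin(πα) and sin(πα) = 2 sin(πα/2) cos(πα/2).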
Second, the skewness parameter β ∈ [−1, 1] gives information about the symmetry of the distribution. Note that it cannot be directly linked to the usual skewness statistic, as the third central moment of the α-stable distribution does not exist when α < 2.
Finally, the scale and location parameters γ > 0 and μ ∈ ℝ refer to the standard distributional properties related to affine transformations. Following Samorodnitsky and Taqqu (1994), we know that if X ∼ S(α, γ, β, μ), then for any b ∈ ℝ and a > 0 we get aX + b ∼ S(α, aγ, β, μ̃), where

μ̃ = aμ + b,  for α ≠ 1,
μ̃ = aμ + b − (2/π) a β γ ln(a),  for α = 1.

While α-stable random variables are absolutely continuous for any set of parameters, their probability density function (PDF) in general has no closed (analytic) form. However, there are some instances where it could be given explicitly. For example, this refers to the Gaussian distribution family SαS(2, γ, μ) with the PDF given by

f(x) = (1/(2γ√π)) exp(−(x − μ)²/(4γ²)),  x ∈ ℝ,

the Cauchy distribution family SαS(1, γ, μ) with the PDF defined as

f(x) = γ / (π((x − μ)² + γ²)),  x ∈ ℝ,

and the Lévy distribution family S(1/2, γ, 1, μ) with the PDF given by

f(x) = √(γ/(2π)) (x − μ)^{−3/2} exp(−γ/(2(x − μ))),  x > μ.

In the general case, the PDF admits an integral or series form. For instance, following Shao and Nikias (1993) and assuming that μ = 0, γ = 1, and α > 1, we can represent the PDF of X ∼ SαS(α, 1, 0) as

f(x) = (1/(πα)) Σ_{k=0}^∞ ((−1)^k / (2k)!) Γ((2k + 1)/α) x^{2k}.    (6)

The series given in (6) is absolutely convergent for any x ∈ ℝ; see Bergström (1952); Feller (1966). Although series expansions of the PDF of symmetric α-stable random variables are known, the asymptotic (tail) expansions are accurate only in the tails and deviate from the theoretical PDF for intermediate values of the argument; see Tsihrintzis and Nikias (1993). For other representations, we refer the reader for instance to Nolan (2020, 1997); Cizek et al. (2005).
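The closed-form special cases, together with a Bergström-type series of the form f(x) = (1/(πα)) Σ_k (−1)^k Γ((2k+1)/α) x^{2k}/(2k)! (our reconstruction of (6)), can be checked numerically. Below is a small sketch (our own illustration, not from the paper) using SciPy's levy_stable implementation; since the density is evaluated numerically, small discrepancies are expected:

```python
import math

import numpy as np
from scipy.special import gamma as gamma_fn
from scipy.stats import cauchy, levy_stable, norm

x = np.linspace(-3.0, 3.0, 7)

# SaS(2, 1, 0) is Gaussian with mean 0 and variance 2 * 1^2 = 2.
gauss_err = np.max(np.abs(levy_stable.pdf(x, 2.0, 0.0) - norm.pdf(x, scale=np.sqrt(2.0))))

# SaS(1, 1, 0) is the standard Cauchy distribution.
cauchy_err = np.max(np.abs(levy_stable.pdf(x, 1.0, 0.0) - cauchy.pdf(x)))

# Truncated series for alpha = 1 at x = 0.5; it should match the Cauchy PDF,
# since Gamma((2k+1)/1) / (2k)! = 1 and the series becomes geometric.
alpha, x0 = 1.0, 0.5
series = sum(
    (-1) ** k * gamma_fn((2 * k + 1) / alpha) * x0 ** (2 * k) / math.factorial(2 * k)
    for k in range(40)
) / (math.pi * alpha)

print(gauss_err, cauchy_err, abs(series - cauchy.pdf(x0)))
```
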

Conditional variance characterisation of the α-stable distributions
For now, let us assume that X is a generic absolutely continuous random variable.
For any event A such that P[A] ≠ 0, we define the conditional variance of X on A by

Var[X | A] := E[(X − E[X | A])² | A],    (7)

where E[· | A] is the standard conditional (set) expectation operator; note that (7) might be non-finite. For brevity, for any 0 ≤ a < b ≤ 1, we also define the quantile trimmed subdomain of X by A_X(a, b) := {X ∈ (F_X^{−1}(a), F_X^{−1}(b))} and the related quantile conditional variance (QCV) of X given by

σ²_X(a, b) := Var[X | A_X(a, b)].    (8)

For completeness, note that: (a) P[A_X(a, b)] = b − a > 0, so the definition is well posed; (b) for 0 < a < b < 1 the value of (8) is finite; (c) for any γ > 0 and μ ∈ ℝ we get σ²_{γX+μ}(·, ·) = γ² σ²_X(·, ·); (d) Var[X] = σ²_X(0, 1). Recently, it has been shown in Jaworski and Pitera (2020) that the family of QCVs could act as law classifiers; see Theorem 2.

Theorem 2 Let X, Y be (absolutely continuous) random variables such that σ²_X(a, b) = σ²_Y(a, b) for 0 ≤ a < b ≤ 1. Then, there exists μ ∈ ℝ such that F_X(t) = F_{Y+μ}(t), t ∈ ℝ, i.e. the laws of X and Y coincide up to an additive constant.
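Property (c) above follows from the fact that the conditioning event itself is invariant under increasing affine transformations; a one-line derivation (for γ > 0), using F_{γX+μ}^{−1}(p) = γF_X^{−1}(p) + μ:

```latex
A_{\gamma X+\mu}(a,b)
  = \big\{\gamma X+\mu \in \big(\gamma F_X^{-1}(a)+\mu,\ \gamma F_X^{-1}(b)+\mu\big)\big\}
  = A_X(a,b),
\quad\text{hence}\quad
\sigma^2_{\gamma X+\mu}(a,b)
  = \operatorname{Var}\!\big[\gamma X+\mu \,\big|\, A_X(a,b)\big]
  = \gamma^2\,\sigma^2_X(a,b).
```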
Note that in Theorem 2, using standard limit arguments, we get that consideration of the limit values a = 0 and b = 1 (for which the QCV might be non-finite) is in fact not required to get the full classification. Because of that, from now on, we assume that 0 < a < b < 1. In particular, this implies that all QCVs in scope are finite.
Based on Theorem 2, we can characterise the α-stable distribution by providing information about QCVs. While a general analytic formula for the QCV of the α-stable distribution is unknown, it could be efficiently computed using standard methods. In particular, for the SαS(α, γ, μ) class, we can use the series expansion given in (6) to get the formula for the QCV; see Proposition 3.
To conclude this section, we illustrate the dynamics of the QCV for exemplary parameter shifts of the α-stable distribution; see Fig. 1. The monotonic behaviour visible in the graphs indicates that the distribution parameters might be represented via QCVs. Nevertheless, a detailed analysis of this fact is out of the scope of the paper. In the middle panel, we demonstrate QCV values for five α parameter values and (a, b) = (0.2, 0.8), showing how the QCV changes with respect to the β parameter. In the right panel, we assume that b = 1 − a and β = 0, and demonstrate QCV values for a ∈ [0.1, 0.5]. The calculations are performed based on 100 000 Monte Carlo simulations. To increase transparency, we restrict the α parameter values to (1, 2].

The sample quantile conditional variance estimator and its properties for the α-stable distributions

In this section we introduce the sample QCV estimator and comment on its basic properties. This will be a core object used for goodness-of-fit testing. While in this section we focus on a generic choice of quantile thresholds 0 < a < b < 1 for a single QCV, in practical applications one chooses specific linear combinations (or ratios) of QCVs with different predefined thresholds in order to allow efficient statistical testing of certain distributional assumptions such as heavy tails or asymmetry. We refer to Sects. 5 and 6 for details. Given an independent and identically distributed (i.i.d.) sample (X₁, …, X_n) and quantile values 0 < a < b < 1, the sample estimator of σ²_X(a, b) is given by

σ̂²_X(a, b) := (1/([nb] − [na])) Σ_{i=[na]+1}^{[nb]} (X_(i) − μ̂_X(a, b))²,    (13)

where μ̂_X(a, b) := (1/([nb] − [na])) Σ_{i=[na]+1}^{[nb]} X_(i) is the conditional sample mean, X_(k) is the kth order statistic of the sample, and [x] := max{k ∈ ℤ : k ≤ x} denotes the integer part of x ∈ ℝ.
Note that while (13) might look complicated, it could be computed by applying the following straightforward logic: first, sort the sample; second, take the subset of observations induced by the empirical quantiles linked to a and b; third, compute the standard sample variance on this subset of observations. Using standard arguments one can check that for any 0 < a < b < 1 the QCV estimator σ̂²_X(a, b) is consistent and follows the usual CLT dynamics; these facts are summarised in Proposition 4.
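The sort-trim-variance logic can be sketched in a few lines; the snippet below is our own illustration (the helper name qcv is not from the paper) and checks consistency on a Gaussian sample, for which the theoretical QCV is the variance of a truncated normal distribution:

```python
import numpy as np
from scipy.stats import norm, truncnorm

def qcv(x, a, b):
    """Sample quantile conditional variance (13): sort the sample, keep the
    order statistics with indices [na]+1, ..., [nb], and take their variance."""
    x = np.sort(np.asarray(x))          # step 1: sort the sample
    n = len(x)
    lo, hi = int(np.floor(n * a)), int(np.floor(n * b))
    return x[lo:hi].var()               # steps 2-3; ddof=0 matches 1/([nb]-[na])

rng = np.random.default_rng(0)
sample = rng.standard_normal(200_000)

# For X ~ N(0, 1), sigma^2_X(0.2, 0.8) equals the variance of the standard
# normal truncated to (Phi^{-1}(0.2), Phi^{-1}(0.8)).
est = qcv(sample, 0.2, 0.8)
theo = truncnorm(norm.ppf(0.2), norm.ppf(0.8)).var()
print(est, theo)
```
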
Proposition 4 Let X be absolutely continuous and let 0 < a < b < 1. Then, as n → ∞, σ̂²_X(a, b) →^P σ²_X(a, b) and √n(σ̂²_X(a, b) − σ²_X(a, b)) →^d N(0, s) for some constant s > 0.

Proof The proof is based on the standard trimmed-mean arguments introduced e.g. in Stigler (1973). For completeness, let us show an outline of the second part of the proof; see Jelito and Pitera (2020) for details. Let 0 < a < b < 1 and let (X_i) be a sample from the same distribution as X. First, noting that X is absolutely continuous, using (Jelito and Pitera 2020, Lemma 1) and (Jelito and Pitera 2020, Lemma 2) we know that μ̂_X(a, b) →^P μ_X(a, b) and √n[μ̂_X(a, b) − μ_X(a, b)] →^d N(0, η), as n → ∞, where μ_X(a, b) is the theoretical conditional mean of X and η is some constant. Also, using (Jelito and Pitera 2020, Lemma 3), we get √n(σ̂²_X(a, b) − ŝ²_X(a, b)) →^P 0, as n → ∞, where

ŝ²_X(a, b) := (1/m_n) Σ_{i=[na]+1}^{[nb]} (X_(i) − μ_X(a, b))²,  with m_n := [nb] − [na].

Consequently, it is enough to show that

Z_n := √n (ŝ²_X(a, b) − σ²_X(a, b))    (14)

converges to a centered normal distribution. For brevity, we introduce additional notation and define the directed sum Σ→, which, for reversed summation limits, counts the corresponding terms with a negative sign, for any sequence of numbers (a_i). The first step of the proof is to split Z_n into two parts: one representing the deterministic trimmed-mean component and one representing the residual component coming from the quantile estimated thresholds. Namely, denoting by A_n and B_n the numbers of sample points not exceeding F_X^{−1}(a) and F_X^{−1}(b), respectively, setting S_(i) := (X_(i) − μ_X(a, b))², and introducing K(·) := (F_X^{−1}(·) − μ_X(a, b))², we can rewrite formula (14) as

Z_n = (√n/m_n) [ Σ→_{i=A_n+1}^{B_n} S_(i) + Σ→_{i=[na]}^{A_n} (S_(i) − K(a)) + Σ→_{i=B_n}^{[nb]} (S_(i) − K(b)) + (A_n − [na]) K(a) + ([nb] − B_n) K(b) − m_n σ²_X(a, b) ].    (15)

In the second step of the proof, we show that

(√n/m_n) Σ→_{i=[na]}^{A_n} (S_(i) − K(a)) →^P 0  and  (√n/m_n) Σ→_{i=B_n}^{[nb]} (S_(i) − K(b)) →^P 0.    (16)

For brevity, we only show the proof of the first convergence; the proof of the second one is analogous. First, note that

0 ≤ | (√n/m_n) Σ→_{i=[na]}^{A_n} (S_(i) − K(a)) | ≤ (|A_n − [na]| / (m_n/√n)) · max{|S_([na]) − K(a)|, |S_(A_n) − K(a)|}.

Second, due to the consistency of the quantile estimators X_([na]) and X_(A_n), we get S_([na]) − K(a) →^P 0 and S_(A_n) − K(a) →^P 0, so it is enough to show that (A_n − [na])/(m_n/√n) converges to a normal distribution. Noting that m_n/n → b − a and (na − [na])/√n → 0, as n → ∞, and applying the Central Limit Theorem combined with Slutsky's Theorem (see Ferguson (1996)) to A_n ∼ B(n, a), we get

(A_n − [na])/(m_n/√n) →^d N(0, a(1 − a)/(b − a)²).

This concludes the proof of (16). The third and final step of the proof is to apply the Central Limit Theorem to Z_n. Noting that (na − [na])/√n → 0 and (nb − [nb])/√n → 0, we can rewrite (15) as

Z_n = (√n/m_n) ( Σ→_{i=A_n+1}^{B_n} S_(i) + (A_n − na) K(a) + (nb − B_n) K(b) − m_n σ²_X(a, b) ) + r_n,

where r_n →^P 0. Thus, recalling the definitions of S_(i), A_n, and B_n, and writing q_a := F_X^{−1}(a), q_b := F_X^{−1}(b), we can express Z_n, up to a term vanishing in probability, as a normalised sum of i.i.d. random variables,

Z_n = (√n/m_n) Σ_{i=1}^n ξ_i + r̃_n,  ξ_i := (X_i − μ_X(a, b))² 1{q_a < X_i ≤ q_b} + K(a)(1{X_i ≤ q_a} − a) − K(b)(1{X_i ≤ q_b} − b) − (b − a) σ²_X(a, b),

and noting that for 1 ≤ i ≤ n we get E[ξ_i] = 0, applying the Central Limit Theorem combined with Slutsky's Theorem to this sum, we conclude the proof. ∎

Note that the constant s > 0 in Proposition 4 might be easily approximated using Monte Carlo simulations. Moreover, in some specific cases one might provide an explicit formula for s; see Appendix A in Jelito and Pitera (2020), where a closed-form formula for s in the case of the Gaussian distribution is provided. Also, it should be noted that using Proposition 4 and the (multivariate) Central Limit Theorem, we immediately get that any linear combination of QCV sample estimators is also asymptotically normal, i.e. we get

√n ( Σ_{i=1}^k d_i σ̂²_X(a_i, b_i) − Σ_{i=1}^k d_i σ²_X(a_i, b_i) ) →^d N(0, s),    (17)

where 0 < a_i < b_i < 1 and d_i ∈ ℝ for i = 1, 2, …, k, and s > 0 is some fixed constant. Dividing (17) by another (non-degenerate) linear combination of quantile conditional variances, we get a statistic that is a pivotal quantity with respect to both the location parameter μ ∈ ℝ and the scale parameter γ > 0. This could be easily shown using e.g. Slutsky's Theorem; see Ferguson (1996). For an outline of the proof of these facts, we refer to the proof of Theorem 6.1 in Jelito and Pitera (2020).
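The √n rate in Proposition 4 can be illustrated by a quick Monte Carlo experiment (our own sketch, not from the paper): quadrupling the sample size should roughly halve the standard deviation of the QCV estimator.

```python
import numpy as np

def qcv(x, a, b):
    # Sample quantile conditional variance: sort, trim to the (a, b) band,
    # then take the ordinary sample variance of the retained order statistics.
    x = np.sort(x)
    n = len(x)
    return x[int(n * a):int(n * b)].var()

rng = np.random.default_rng(1)

def estimator_sd(n, reps=2000):
    # Standard deviation of the QCV estimator over many Gaussian samples.
    return np.std([qcv(rng.standard_normal(n), 0.1, 0.9) for _ in range(reps)])

ratio = estimator_sd(500) / estimator_sd(2000)
print(ratio)  # should be close to sqrt(2000 / 500) = 2
```
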

Statistical test for the symmetric α-stable distribution based on the quantile conditional variance statistics
Based on (17), we know that one can use linear combinations of sample QCVs for efficient goodness-of-fit parameter testing. Arguably, the stability index α ∈ (0, 2] is the most important α-stable distribution shape parameter, as it controls tail-heaviness. In fact, most α-stable distribution goodness-of-fit tests focus on this parameter in the symmetric case; location and scale oriented testing is usually performed using more standard methods. We refer to Matsui and Takemura (2008) and Wilcox (2017) for details. In this section we present a test statistic family that might be used to test the stability index α parameter specification for SαS(α, γ, μ) distributions. We refer to Sect. 6 for a discussion of skewness parameter inclusion in the proposed goodness-of-fit method and to Sect. 7 for a more generic QCV overall distribution fit framework that takes into account the stability index, skewness, and scale parameters.
Recalling that the stability index α could be linked to tail behavior, we decided to follow an approach similar to the one introduced in Jelito and Pitera (2020). More explicitly, we decided to compare the tail QCVs with the central region QCV to assess how heavy the tails are.
The generic family of test statistics we consider, which could be used for goodness-of-fit testing for any distribution, is given by

N(x) := √n · (d₁ σ̂²_x(a₁, a₂) + d₂ σ̂²_x(a₂, a₃) + d₃ σ̂²_x(a₃, a₄)) / σ̂²_x(a₁, a₄),    (18)

where x = (x₁, …, x_n) is a sample from the same distribution as the random variable X, d₁, d₂, d₃ ∈ ℝ are fixed weight parameters, 0 < a₁ < a₂ < a₃ < a₄ < 1 correspond to quantile split parameters, and σ̂²_x(a, b) denotes the QCV empirical estimator for sample x on the quantile interval (a, b). Note that σ̂²_x(a₁, a₄) is used for normalisation purposes: it makes N invariant to affine transformations of X. In other words, the choice of the location parameter μ and the scale parameter γ does not impact the value of N, making it a pivotal quantity. Next, we find specific choices of parameters that could be used for generic α-stable goodness-of-fit testing.
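Assuming the form described above (a weighted combination of band QCVs normalised by σ̂²_x(a₁, a₄); the weights below are illustrative placeholders, not the calibrated values from the paper), statistic (18) can be sketched and its affine invariance checked directly:

```python
import numpy as np

def qcv(x, a, b):
    # Sample quantile conditional variance: sort, trim, variance.
    x = np.sort(np.asarray(x))
    n = len(x)
    return x[int(n * a):int(n * b)].var()

def N_stat(x, a1, a2, a3, a4, d1, d2, d3):
    """Sketch of the statistic family (18): weighted band QCVs normalised by
    the overall trimmed QCV, which removes location and scale dependence."""
    num = d1 * qcv(x, a1, a2) + d2 * qcv(x, a2, a3) + d3 * qcv(x, a3, a4)
    return np.sqrt(len(x)) * num / qcv(x, a1, a4)

rng = np.random.default_rng(2)
x = rng.standard_normal(1000)

# Pivotality: an affine transformation of the data leaves the value unchanged.
v1 = N_stat(x, 0.05, 0.25, 0.75, 0.95, 1.0, -1.0, 1.0)
v2 = N_stat(5.0 * x + 3.0, 0.05, 0.25, 0.75, 0.95, 1.0, -1.0, 1.0)
print(v1, v2)
```
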

Specific choice of parameters of the test statistic for the symmetric α-stable distribution
Taking into account that X is symmetric, we decided to set a₄ := 1 − a₁, a₃ := 1 − a₂, and d₁ := d₃, which reduces the number of input specification parameters to four, i.e. a₁, a₂, d₁, and d₂. For given a₁ and a₂, we fix the values of d₁ and d₂ in such a way that the resulting test statistic N is normalised when α = 2.
To be more precise, we assume that d₁ > 0 and find values of d₁ and d₂ in such a way that N is (asymptotically) close to standard normal (for α = 2); please recall that the theoretical QCV values could be obtained from Proposition 3. Consequently, we can focus on the choice of the quantile parameters (a₁, a₂). For transparency, we decided to take three (a₁, a₂) quantile splits. Namely, we take the values (5%, 25%), (0.5%, 25%), and (0.5%, 4%); they correspond to the quantile ratio data splits 5/20/50/20/5, 0.5/24.5/50/24.5/0.5, and 0.5/3.5/92/3.5/0.5. The first parameter set, (5%, 25%), is a generic choice which should be good for general testing for all sample sizes. We compare the 50% central region QCV with the trimmed tail QCVs; the top and bottom 5% trimming should ensure finiteness of the tail QCVs while not inducing severe (reduced conditional sample size) volatility. The second choice, (0.5%, 25%), might be seen as a modification of the first one for larger samples. We reduced the top and bottom trimming from 5% to 0.5% while maintaining the central 50% area. This should increase the sensitivity of the tail QCVs, allowing more accurate α testing when the underlying sample size is large. The third statistic, (0.5%, 4%), is designed to detect minor α changes for large sample sizes. It is focused on extreme tail QCVs and should be good for very small α change tests; note this is aligned with the results observed in the right plot in Fig. 6. To sum up, we introduce three versions of test statistic (18), denoted N₁, N₂, and N₃ and given in (19), (20), and (21), that correspond to the three quantile splits above. For completeness, in Fig. 2 we present plots of the limit values of N_i, i = 1, 2, 3, depending on α ∈ (1, 2], without the factor √n. Note that in the limit all test statistics are decreasing functions of α, so that their asymptotic power is equal to one (within the class of symmetric α-stable distributions).
Test statistics N₁, N₂, and N₃ establish a generic statistical framework that could be used for efficient goodness-of-fit testing for (symmetric) α-stable distributions for all possible choices of parameters and sample sizes. Essentially, the specific choice of an appropriate statistic depends on the underlying goal and sample size. If one is interested in a generic α-stable goodness-of-fit test for moderate sample sizes, then N₁ is the best choice. If the sample size is larger and we want to test α parameter values which are relatively close to 2, e.g. when 1.5 < α < 1.9, then one should use N₂. Finally, if we are interested in efficient discrimination of parameters for values of α very close to 2, e.g. when 1.9 < α, then we recommend using N₃. See Sects. 5.2.2 and 5.2.3, where a comprehensive comparison study with focus set on test power is made. Nevertheless, it should be noted that our choice of three conditioning sets and the related specifications are just exemplary choices; QCV-based tests allow very flexible statistic construction which could be tailored to particular testing needs. Still, given a specific sample at hand, one should avoid specific tuning of the parameters a_i and d_i, as it would effectively reduce the statistical power of the test; see Miller (2012) and references therein.

Power simulation study
In this section we check the effectiveness of the proposed testing procedure by Monte Carlo simulations. More precisely, we calculate the power of the introduced goodness-of-fit tests based on the N₁, N₂, and N₃ test statistics defined in (19), (20), and (21), respectively. We compare the powers of the tests based on the QCV statistics with benchmark tests for the α-stable distribution that are based on the distance between the empirical and theoretical distributions.

(Fig. 2: limit values of the test statistics (19)–(21) with respect to the α parameter. For transparency, we restrict the parameter set to α ∈ (1, 2]; similar behaviour is observed on the full parameter range.)

Comparison to other tests
In order to demonstrate how effective the test based on the QCV statistics is, we consider the power of the tests based on the N₁, N₂, and N₃ statistics defined in (19)–(21). In this subsection, for simplicity, we consider only the two-sided (TS) tests; in the further analysis we show results for one-sided tests as well. We perform the analysis under the H₀ hypothesis stating that the sample comes from the symmetric α-stable distribution SαS(α₀, 1, 0), where α₀ := 1.5; note that the scale and location parameters are not important here, as the test statistics N₁, N₂, and N₃ are invariant to affine transformations. The values of the N₁, N₂, and N₃ statistics under the H₀ hypothesis are calculated based on 100 000 Monte Carlo simulations for various sample sizes n ∈ {20, 50, 100, 500, 1000, 2000}. We check the powers of the tests for H₁ hypotheses stating that the random sample comes from the distribution SαS(α₁, 1, 0), where α₁ ∈ {1.1, 1.2, …, 2.0}. It should be mentioned that in real applications the α-stable distribution with α ≤ 1 is rarely applied, thus we decided to present the results for α > 1. For each case, the power is calculated using 10 000 simulations for the sample corresponding to the H₁ hypothesis.
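The procedure just described can be sketched as follows (our own simplified illustration: a QCV-ratio statistic with illustrative weights, far fewer Monte Carlo repetitions than in the paper, and an exaggerated alternative α₁ = 1.1 so that even this small experiment shows a clear effect):

```python
import numpy as np
from scipy.stats import levy_stable

def qcv(x, a, b):
    x = np.sort(x)
    n = len(x)
    return x[int(n * a):int(n * b)].var()

def stat(x):
    # QCV-ratio statistic: tail bands against the central band (illustrative
    # weights; the paper calibrates them so the null statistic is
    # asymptotically standard normal).
    return (qcv(x, 0.05, 0.25) + qcv(x, 0.75, 0.95)) / qcv(x, 0.25, 0.75)

rng = np.random.default_rng(3)
n, reps = 200, 400
draw = lambda alpha: levy_stable.rvs(alpha, 0.0, size=n, random_state=rng)

# Monte Carlo null distribution under H0: SaS(1.5, 1, 0), a two-sided 5%
# rejection region from its quantiles, then the power against H1: SaS(1.1, 1, 0).
null = np.array([stat(draw(1.5)) for _ in range(reps)])
lo, hi = np.quantile(null, [0.025, 0.975])
alt = np.array([stat(draw(1.1)) for _ in range(reps)])
power = np.mean((alt < lo) | (alt > hi))
print(power)
```
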
For comparison, the powers of the N₁, N₂, and N₃ tests are confronted with the powers of benchmark tests for the α-stable distribution. We decided to consider the standard goodness-of-fit tests that are based on measuring the distance between the empirical and theoretical distributions. Namely, we consider the Kolmogorov-Smirnov (KS) cumulative probability test, the Kuiper test, the Watson test, the Cramer-von Mises (CvM) test, and the Anderson-Darling (AD) test; see Chakravarti et al. (1967); Srinivasan (1971); Watson (1961); Anderson (1962); Anderson and Darling (1954). These statistical tests could be considered as benchmark choices that are commonly used for α-stable distribution testing and are considered to be very efficient in real data analysis; see Cizek et al. (2005) for details. For consistency, all benchmark test powers are calculated using a similar framework with the same number of Monte Carlo simulations for the H₀ and H₁ hypotheses, for n = 20, 50, 100, 500, 1000, 2000. In particular, please note that this gives us control over the Type I errors that are encoded into the underlying significance levels.
In Fig. 3, we present test powers for the significance level 5%. Results for significance level 1% are similar and deferred to the Appendix; see Fig. 13. Surprisingly, as one can see, the tests based on the QCV statistics outperform all other considered tests, even for small sample sizes (n ¼ 20 and n ¼ 50).
It should be highlighted that the N₁-based test seems to be the most effective for α₁ < 1.5, i.e. when the tested α corresponding to the H₁ hypothesis is smaller than the α corresponding to the H₀ hypothesis, while the N₂ test is the best one in the case α₁ > 1.5, i.e. when the α from the H₁ hypothesis is higher than the α for the H₀ hypothesis. This might be potentially traced back to the fact that the higher the α, the more stable (slimmer) the tails. In consequence, the standard error of the QCV estimator σ̂²_X(0.5%, 25%) in (20) decreases. This improves the performance of N₂ in reference to N₁.
The N₃ test is ineffective for the small sample size (n = 20) because of its extreme quantile specification: since [20 · 0.04] < 1, there are not enough observations in the extreme-tail conditioning sets and the statistic cannot be computed.

(Fig. 3: Powers are calculated on the basis of 10 000 Monte Carlo simulations for samples corresponding to the H₁ hypothesis, i.e. for SαS(α, 1, 0). The significance level is equal to 5%. For all sample sizes, the QCV-based tests outperform the benchmark frameworks; note that in the top left plot (n = 20) the values for N₃ are not available due to the low quantile specification for this test statistic. For transparency, we added the legend only to the top left plot. See Fig. 13 for the results for significance level 1%.)

For large sample sizes, all considered tests seem to be effective. However, the tests based on the QCV statistics are more restrictive. In Table 1, we demonstrate the details of the results presented in Fig. 3 for the exemplary sample size n = 500. One can see that the powers of the tests proposed in this paper (the N₁ test and the N₂ test) are clearly higher than the powers of the other considered tests. For completeness, the test powers for the other sample sizes are collected in Tables 4, 5, 6, 7, and 8 in Appendix B.
To provide further insight in reference to Type I and Type II errors, as well as a sample size impact analysis, we pick one representative H₁ alternative hypothesis, α₁ = 1.7, and study the corresponding p-value distributions. Also, we study the test powers for a more granular sample size specification. The results for other alternative hypotheses are essentially the same and are available from the authors upon request.
In Fig. 4 we present a comparison of the p-value distributions of the tests for the SαS(1.5, 1, 0) distribution (H₀ hypothesis) against the SαS(1.7, 1, 0) distribution (H₁ hypothesis) for various sample sizes n. In particular, one can observe stochastic dominance of the N₂ test p-values with respect to all benchmark tests. Overall, the results indicate that within the α-stable framework our test should give the best control over both Type I and Type II errors; please recall that all p-values are based on sample-size tailored Monte Carlo simulation, which links them to Type I errors.
Next, in Fig. 5 we present the test power as a function of the sample size n ∈ [10, 2000] for α₀ = 1.5 and α₁ = 1.7. For completeness, we include the results for significance levels 1% and 5%. The observed advantage consistently propagates to other sample sizes, and the trade-off between test power and sample size is optimal for the QCV-based tests. In particular, the N₂ test statistic outperforms all other benchmark tests.
To sum up, our analysis shows that the QCV-based test statistics outperform the benchmark frameworks. The superior performance of the introduced statistics N₁, N₂, and N₃ in reference to the other benchmark frameworks could be traced back to two main reasons. First, most benchmark methods are based on a generic whole-distribution fit rather than a local fit in the non-central part of the distribution; the introduction of QCV-based statistics allows more efficient local density discrimination. Second, our test statistics are tailored to measure fat-tail behaviour, which is very efficient when assessing the value of α. In fact, the value of N₁, N₂, or N₃ could be used as a measure of tail fatness; see Jelito and Pitera (2020), where a tail-fatness analysis in reference to a similar test statistic in the Gaussian context is performed.
For completeness, in Appendix Fig. 14, we present a simplified R source code that could be used to compute the N₁ statistic and estimate the (two-sided) test power for true α₀ = 1.5 and alternative α₁ = 1.7 for sample size n = 50 at the 5% significance level. One can easily modify the R code to get results for other QCV statistics and parameter sets.

Generic test power check
In Sect. 5.2.1, we have shown that the N1 statistic given in (19) has decent test power and outperforms all competing benchmark frameworks for multiple sample sizes and one exemplary choice of (true) α = 1.5. In this section, we fix the sample size to n = 500 and perform a more comprehensive power study covering all three statistics N1, N2, and N3, given in (19)-(21). Using the Monte Carlo method, for each α ∈ (0, 2] on a grid with span 0.025, we simulated m = 10 000 samples; each simulation is a random sample of size n from the SαS(α, 1, 0) distribution. Next, we calculate m values of N1, N2, and N3 for each α ∈ (0, 2] and use them to obtain Monte Carlo densities of all test statistics. Using the obtained densities, we calculate two-sided test powers. Note that due to the simple nature of the conditional variance estimators, this simulation is easy to implement and fast to compute. In this part, we decided to consider the whole range of the α parameter in order to assess the effectiveness of the proposed testing approach. Using the results, for the chosen 5% significance level, we approximate the test power of N1, N2, and N3 for all null and alternative (simple) hypotheses on the shape parameter α. The results (for two-sided tests) are presented in Fig. 6.

Fig. 5 Comparison of test powers for the SαS(1.5, 1, 0) distribution (H0 hypothesis) against the SαS(1.7, 1, 0) distribution (H1 hypothesis) for various sample sizes n. The following tests are taken under consideration: the two-sided N1, N2, and N3 tests proposed in this paper, the Kolmogorov-Smirnov (KS) two-sample test, the Kuiper test, the Watson test, the Cramér-von Mises (CvM) test, and the Anderson-Darling (AD) test. Results for significance levels 1% and 5% are presented. For all sample sizes, the QCV-based tests outperform the benchmark frameworks; note that for small sample sizes (n < 50) the values for N3 are not available due to the low quantile specification of this test statistic. For transparency, the legend is added only to the left plot.
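The Monte Carlo power computation described in this section can be sketched compactly. The snippet below is a simplified illustration, not the paper's exact procedure: it simulates symmetric α-stable variates via the classical Chambers-Mallows-Stuck representation and estimates two-sided test power for an arbitrary statistic `stat`; any of the N1-N3 statistics could be plugged in.

```python
import math
import random

def sas_sample(alpha, n, rng):
    """Simulate n symmetric alpha-stable variates (scale 1, location 0)
    via the Chambers-Mallows-Stuck representation."""
    out = []
    for _ in range(n):
        v = rng.uniform(-math.pi / 2, math.pi / 2)
        w = rng.expovariate(1.0)
        if abs(alpha - 1.0) < 1e-12:
            out.append(math.tan(v))  # standard Cauchy case
        else:
            x = (math.sin(alpha * v) / math.cos(v) ** (1 / alpha)
                 * (math.cos(v - alpha * v) / w) ** ((1 - alpha) / alpha))
            out.append(x)
    return out

def mc_power(stat, alpha0, alpha1, n, m=2000, level=0.05, seed=1):
    """Two-sided MC test power: critical values are the level/2 and
    1 - level/2 empirical quantiles of stat under H0 (alpha0); power is
    the rejection rate of stat under the alternative alpha1."""
    rng = random.Random(seed)
    null = sorted(stat(sas_sample(alpha0, n, rng)) for _ in range(m))
    lo = null[int(m * level / 2)]
    hi = null[int(m * (1 - level / 2))]
    alt = [stat(sas_sample(alpha1, n, rng)) for _ in range(m)]
    return sum(t < lo or t > hi for t in alt) / m
```

With a QCV-based statistic in place of `stat`, `mc_power(stat, 1.5, 1.7, 500)` would approximate the kind of power values reported in Fig. 5.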
From Fig. 6 we see that, apart from α close to 2, the power of the test statistic N1 seems to be the best among all considered statistics. On average, for sample size n = 500, if the difference between the true α0 and the alternative α1 is higher than 0.2, then the N1 test power is very close to 1 (for the 5% significance level). The good performance of N1 can be traced back to the heaviness of the tails. The smaller the α, the harder it is to estimate the extreme 0.5% quantile which is needed for N2 and N3. This impacts the standard error of the conditional variance estimator, reducing its test power. Also, note that the estimation on the small 3.5% probability set in N3 makes this statistic's test power smaller than that of N2. On the other hand, for α close to 2, the situation is entirely different. First, note that N2 outperforms N1 for α ≥ 1.5; see also Fig. 3. While N2 seems to be an adequate overall choice when the true α is in the range [1.5, 2], the closer to 2 we are, the better the performance of N3. In fact, for α very close to 2 it looks like N3 is outperforming N2. We investigate this in detail in Sect. 5.2.3.
Also, note that from Figs. 6 and 2 one can deduce that the test statistics N1, N2, and N3 could be used for more generic testing. Namely, since the true values of the test statistics are monotone with respect to α, given a null hypothesis H0 of the form α = α0, we can consider an alternative hypothesis H1 of the form α < α0 rather than α = α1.

Fig. 6 N1, N2, and N3 test power level plot (two-sided); n = 500; α ∈ (0, 2]. Apart from α close to 2, the statistic N1 seems like the best choice. This is due to the fact that the 0.5% quantile trimming introduced in N2 and N3 is too extreme. Nevertheless, for α = 2 the tails become more stable, which results in better performance of N2 and N3. Note that the relatively small sample size impacts the performance of N3 due to non-robust estimation of the conditional variance on the small 3.5% probability set.

Generic test power check for symmetric α-stable distribution with α close to 2
In this section we repeat the exercise from Sect. 5.2.2 assuming that α ∈ [1.8, 2.0] and n = 2000. In other words, we want to check the performance of our testing framework when one wants to investigate near-Gaussian tails for a relatively large sample. In fact, a similar case will be considered in the real data analysis presented in Sect. 8. To get better accuracy, we increased the strong Monte Carlo sample size to m = 50 000 and the grid density to 0.01. To sum up, for each α ∈ {1.80, 1.81, …, 1.99, 2.00}, we simulated a strong Monte Carlo sample of size m = 50 000, where each simulation is a random sample of size n from the SαS(α, 1, 0) distribution. Next, as in Sect. 5.2.2, using the obtained test statistic densities we calculate two-sided test powers for N1, N2, and N3. The results of the simulations for significance level 5% are presented in Fig. 7. First, we note that in the considered region of the α parameter, the performance of the N1 test is not satisfactory. On the other hand, the performance of the N2 and N3 tests is comparable; the higher the α, the better the performance of the N3 test in comparison to the N2 test. This is consistent with the results presented in Sect. 5.2.2. Note that the increased sample size (n = 2000) resulted in narrowed test power bounds. Now, if the difference between the true α0 and the alternative α1 is higher than approximately 0.1, then the N3 test power is very close to 1 (for the 5% significance level); for n = 500 this threshold was equal to 0.2. To illustrate this, let us present a table with one-sided and two-sided test powers for the N2 and N3 tests for the (true) parameter α = 1.90. For completeness, apart from significance level 5%, we also present test powers for significance level 1%; see Table 2.

Fig. 7 N1, N2, and N3 test power level plots (two-sided); n = 2000; α ∈ [1.8, 2]. Overall, the results for the N2 and N3 tests are comparable, but the closer to α = 2 we get, the better the performance of the N3 test.
Note that the outcomes for the N1 test are not satisfactory, i.e. while N1 is best for testing the overall α-stable fit, it might not discriminate near-Gaussian distributions as well as N2 or N3. In Table 2, the sample size is n = 2000 and the significance levels are set to 1% and 5%. The values in the Test type column refer to one-sided (OS) and two-sided (TS) statistical tests.
One-sided tests refer to the left-sided test for α1 < α and the right-sided test for α1 > α.
Overall, the performance of the N3 test is better than that of the N2 test, but the results are comparable.

Statistical tests for generic α-stable distribution based on quantile conditional variance estimators

While in this paper our focus is set on goodness-of-fit testing of the symmetric α-stable distribution, one could easily expand the framework presented in Sect. 5 to allow more generic tests. For completeness, in this section we provide a short comment on possible extensions of our testing framework that include analysis of the skewness parameter β ∈ [−1, 1]. Please recall that goodness-of-fit testing of the scale γ > 0 and location μ ∈ ℝ parameters is typically performed using more standard methods, and most α-stable goodness-of-fit tests are invariant to affine transformations; see e.g. Matsui and Takemura (2008). First, from Fig. 1 we see that QCV could be used to efficiently measure α-stable distribution skewness. Indeed, the central exhibit of Fig. 1 shows that the dynamics of the (theoretical) QCV strongly depend on the underlying choice of the skewness parameter β, so one should be able to develop a symmetry-oriented statistical test.
Second, we note that the test statistics N1, N2, and N3 introduced in Sect. 5.1 are tailored to symmetrically measure the tail-heaviness impact that is encoded in the parameter α, and one would expect them to be generally invariant to the choice of the skewness parameter β. This could be traced back to the specific constraints imposed on the quantile split parameters and weight parameters in (18). Namely, we assumed that a4 = 1 − a1, a3 = 1 − a2, d1 = d3, and d1 d2 < 0, which results in a test statistic that (symmetrically) measures the impact of tail-set QCVs on the central-set QCV. This observation is illustrated in the left exhibit of Fig. 8, where the two-sided test power level plot for the N1 test statistic for an exemplary set of null hypothesis parameters α0 = 1.5 and β0 = 0 is confronted with various alternative hypotheses for α1 ∈ [1.0, 2.0] and β1 ∈ [−1, 1]. As expected, the power of the test is almost invariant to the choice of the skewness parameter.
For brevity, we now focus on the test statistic N1 and show how it can be modified to allow better skewness impact measurement. We follow a setting similar to the one introduced in Sect. 5.2.1. Namely, we fix the sample size n = 500 and the significance level 5%, and assess the test power of the modified statistics in the same manner as in Fig. 6, but taking into account the impact of β. Two possible natural modifications of N1, denoted N1^a and N1^b, are obtained by changing the input weights d1, d2, d3 ∈ ℝ; see (18). In the first case, the weight change results in a statistic N1^a that measures the QCV difference between the left and right tail. Theoretically, the more symmetric the distribution, the closer to zero the values of N1^a. In the second case, we modify N1 in such a way that only the relation between the left-tail and central-set QCVs is taken into account. The statistic N1^b should be able to detect changes in both α and β along some fixed QCV-induced level set. The test power results are presented in Fig. 8.

Fig. 8 N1, N1^a, and N1^b test power level plots (two-sided) for α0 = 1.5 and β0 = 0; n = 500; α1 ∈ [1, 2]; β1 ∈ [−1, 1]. Overall, we note that the test statistic N1 is almost invariant to the choice of β, while the test statistics N1^a and N1^b could be used for skewness identification.
From the plot one can deduce that the distributions of N1^a and N1^b do depend on the parameter β, and these statistics could be used for skewness identification. In particular, note that N1^a seems to correctly capture the actual skewness effect that is partly encoded in β. Note that the bigger the value of α, the smaller the impact of β on the actual distribution skewness, which results in widening power level sets. In particular, for α = 2, the distribution reduces to the (symmetric) Gaussian one and the choice of β does not matter.
Of course, one could easily improve the obtained results by considering different test statistics that are focused on symmetry measurement. For instance, one could remove the central QCV in (18) by setting a2 = a3 = 0.5 and d2 = 0, or consider the maximum of N1^b and its analogue that takes into account the right tail. Nevertheless, we want to emphasize that the main focus of this article is on the symmetric case and the tail-heaviness assessment framework. The detailed derivation of an efficient β-sensitive statistic is beyond the scope of this article. That said, we refer to Sect. 7, where the overall distribution fit using the QCV framework is studied. In particular, it is shown how possible β parameter misspecification could impact the structure of sample QCVs; see e.g. Fig. 9 for details.

General α-stable distribution goodness-of-fit visual test using quantile conditional variance estimators
In this section we introduce a generic (visual) fit test based on quantile conditional variance that uses the results of Theorem 2. It should be noted that the procedure introduced in this section could also be embedded into a more formal statistical language, e.g. by introducing a multiple (composite) testing framework. This section complements the results from Sects. 5 and 6. Let us assume we have an i.i.d. sample from the same distribution as X, say x = (x_1, …, x_n), n ∈ ℕ, and we want to check whether this sample comes from a specific S(α, γ, β, μ) distribution. Furthermore, let us assume that we are not interested in the location parameter fit; note that there are standard methods to assess this based e.g. on trimmed mean analysis, see Wilcox (2017) and Ferguson (1978, 2001).
Due to Theorem 2 combined with Proposition 4, we know that X ~ S(α, γ, β, μ̃), for some μ̃ ∈ ℝ, if and only if for any 0 < a < b < 1 we get (22), where Z ~ S(α, γ, β, 0). Thus, to assess the adequacy of the distribution fit, we can check Property (22) for a representative set of quantile intervals (a_i, b_i), where i ∈ {1, 2, …, I} and I ∈ ℕ. The rational choice is to use non-overlapping quantile intervals with (a.s.) connected union, i.e. to set b_i = a_{i+1} for i ∈ {1, 2, …, I − 1}. For simplicity, we might also assume that all intervals are of equal length, i.e. that there exists ε > 0 such that b_i − a_i = ε for i ∈ {1, 2, …, I}. Also, we can normalise (22) by considering the set of convergence conditions σ̂²_X(a_i, b_i) / σ²_Z(a_i, b_i) − 1 → 0 as n → ∞. To control the error of the fit, given a set of pre-defined parameters (α, γ, β) and n ∈ ℕ, one could consider the series of statistics R̂_i given in (23), where Z ~ S(α, γ, β, 0); note that the (null-hypothesis) confidence intervals, for any fixed n ∈ ℕ, could easily be computed numerically using e.g. Monte Carlo simulations. If we are not interested in the scale parameter γ > 0 fit, we can further normalise R̂_i by multiplying the inner variance ratio e.g. by σ²_Z(a_1, b_I) / σ̂²_X(a_1, b_I); we will refer to such a statistic as the normalised R̂_i. Of course, the choice of the number of intervals I ∈ ℕ and the exterior points 0 < a_1 and b_I < 1 should depend on the sample size, so that the resulting QCV estimators are relatively dense and robust.
To illustrate how a sanity check based on the test statistics (23) might look, let us present a simple example.

Fig. 9 The plot illustrates how one can use the sample conditional variance ratios R̂_i given in (23) to check the overall parameter fit for an α-stable distribution. In each plot, R̂_i for an exemplary sample from X ~ S(1.85, 1, 0, 0) of size n = 2000 is confronted with the ratios corresponding to other α-stable distributions. In the left plot, we tested the fit for α = 2, which resulted in tail conditional variance misalignment. In the middle plot, we introduced skewness by setting β = 0.7, which made the outcome non-symmetric. In the right plot, we set γ = 0.9, which results in up-lifted values of R̂_i. The dashed lines correspond to extreme (point-wise) quantiles of R̂_i, i = 1, 2, …, 22, under the correct model specification; they were obtained using a 10 000 Monte Carlo run.

Example 5 Let us consider a 4.5% quantile grid with 0.5% cut-offs, i.e. we set I = 22 with quantile intervals a_i := 0.005 + 0.045·(i − 1) and b_i := 0.005 + 0.045·i, for i = 1, 2, …, I; note that b_I = 1 − a_1 = 0.995. Assume that we have an i.i.d. sample x = (x_1, …, x_2000) from the same distribution as the random variable X, where X ~ S(1.85, 1, 0, 0). For n = 2000, each conditional variance σ̂²_X(a_i, b_i) is estimated using ⌊0.045n⌋ = 90 observations from x; the 10 smallest and 10 biggest values of x are excluded due to trimming. Let us check whether the sample x comes from three other α-stable distributions given via the random variables X_1 ~ S(2, 1, 0, 0), X_2 ~ S(1.85, 1, 0.7, 0), X_3 ~ S(1.85, 0.9, 0, 0). To do so, we construct the test statistics R̂_i^1, R̂_i^2, and R̂_i^3, for i = 1, 2, …, 22, which correspond to the distributions of X_1, X_2, and X_3, respectively. In the first case, the sampled distribution has fatter tails than X_1, which should make R̂_i^1 positive for tail conditional variances.
In the second case, the non-symmetric nature of the distribution should result in non-symmetric behaviour of R̂_i^2. In the third case, the statistics R̂_i^3 should be positive (and not centred around 0) due to the smaller scale parameter value.
To verify this, we simulate 10 000 strong Monte Carlo samples from the distributions corresponding to X_1, X_2, and X_3, where each simulation is of size n = 2000.
Next, we use the simulated values to get the Monte Carlo densities of R̂_i^1, R̂_i^2, and R̂_i^3, for i = 1, 2, …, 22, under the correct model specification. Finally, we compare the values of R̂_i, computed from the sample of X, with the 1% and 5% confidence bounds obtained using the Monte Carlo analysis. The results of the exercise are illustrated in Fig. 9.
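A compact sketch of the above procedure may be helpful. The code below is an illustration under simplifying assumptions, not the paper's implementation: it computes R̂_i-type ratios on the quantile grid of Example 5, the theoretical QCVs of the reference distribution are approximated by averaging sample QCVs over Monte Carlo draws from a caller-supplied sampler, and the helper names (`qcv`, `ratio_stats`) are hypothetical.

```python
import random

def qcv(x, a, b):
    """Sample quantile conditional variance on the quantile interval (a, b)."""
    xs = sorted(x)
    n = len(xs)
    s = xs[int(n * a):int(n * b)]
    m = sum(s) / len(s)
    return sum((v - m) ** 2 for v in s) / (len(s) - 1)

def ratio_stats(x, ref_sampler, n_mc=200, I=22, a1=0.005, eps=0.045, seed=7):
    """R-hat-type ratios on the Example 5 grid a_i = a1 + eps*(i-1),
    b_i = a_i + eps, i = 1..I. The reference QCVs are approximated by
    averaging sample QCVs over n_mc Monte Carlo samples of size len(x)
    drawn with ref_sampler(rng)."""
    rng = random.Random(seed)
    n = len(x)
    grid = [(a1 + eps * i, a1 + eps * (i + 1)) for i in range(I)]
    ref = [[qcv([ref_sampler(rng) for _ in range(n)], a, b) for a, b in grid]
           for _ in range(n_mc)]
    ref_mean = [sum(row[i] for row in ref) / n_mc for i in range(I)]
    return [qcv(x, a, b) / ref_mean[i] - 1 for i, (a, b) in enumerate(grid)]
```

Under the correct specification the ratios should hover around zero; the point-wise bands shown in Fig. 9 are obtained from the quantiles of the same Monte Carlo draws.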

Statistics of plasma turbulence in a fusion device
In this section we illustrate how to use the statistical framework based on QCV statistics for goodness-of-fit testing of real data. We take the datasets examined in Burnecki et al. (2015); see also Burnecki et al. (2012). Specifically, we investigate the data obtained in experiments on the controlled thermonuclear fusion device at the ''Kharkov Institute of Physics and Technology'', Kharkov, Ukraine. Stellarators and similar devices, e.g. tokamaks and compact toroids, are used to study the properties of magnetically confined thermonuclear plasmas. They serve as smaller prototypes of the International Thermonuclear Experimental Reactor (ITER), the most expensive scientific endeavor in history, aimed at demonstrating the scientific and technological feasibility of fusion energy for peaceful use. It is a well-known fact that the magnetically confined plasma produced in such devices is always in a highly non-equilibrium state. This phenomenon, called plasma turbulence, is characterized by anomalously high levels of fluctuations of the electric field and particle density and plays a decisive role in the generation of anomalous particle and heat fluxes from the plasma confinement region; see Krasheninnikov et al. (2020). This circumstance constitutes one of the main obstacles on the way to the implementation of magnetic confinement, and this is the reason why the statistical properties of plasma turbulence are intensively investigated. In particular, during such studies a remarkable phenomenon called the L-H transition has been observed in many fusion devices. Namely, a sudden transition from the low confinement mode (L mode) to a high confinement mode (H mode) is accompanied by suppression of turbulence and a rapid drop of turbulent transport at the edge of the thermonuclear device; see Connor and Wilson (2000) and Wagner (2007).
The implementation of the H-mode regime, which is chosen as the operating mode for the future ITER device, requires detailed investigation of the physics of such a transition. Here we focus on the statistical properties of turbulent plasma fluctuations before and after the L-H transition phenomenon in the stellarator-torsatron URAGAN-3M.
We examine four datasets, which are denoted as Dataset 1, Dataset 2, Dataset 3, and Dataset 4. They were obtained by high-resolution measurements of the electric potential (floating potential) fluctuations with the help of movable Langmuir probe arrays. A detailed description of the experimental set-up and measurement procedure can be found in Beletskii et al. (2009). In a nutshell, Dataset 1 and Dataset 2 describe the floating potential fluctuations (in volts) in turbulent plasma, registered by a Langmuir probe for torus radial position r = 9.5 cm. While Dataset 1 is related to the fluctuations before the transition point, Dataset 2 describes the fluctuations after the transition. Dataset 3 and Dataset 4 describe the potential fluctuations for torus radial position r = 9.6 cm. As before, Dataset 3 is related to prior-transition fluctuations while Dataset 4 is linked to posterior-transition fluctuations. The considered datasets contain 2000 normalized observations each and are presented in Fig. 10.
Before we present our analysis, let us comment on the results from Burnecki et al. (2015). The authors introduced a visual test that pointed out differences between the prior-transition and posterior-transition data. Namely, while the (two-sample) Kolmogorov-Smirnov test did not reject the hypothesis stating that the distributions of Dataset 1 and Dataset 2, or of Dataset 3 and Dataset 4, respectively, are the same, the introduced visual framework indicated a slight regime change between the considered datasets. The authors demonstrated that the distribution corresponding to Dataset 1 is α-stable with α < 2, Dataset 2 and Dataset 3 can be modeled by the Gaussian distribution, while Dataset 4 belongs to the domain of attraction of the Gaussian law although it is non-Gaussian.
The results obtained with the QCV framework in general confirm the analysis from Burnecki et al. (2015). However, the results obtained here are more rigorous and statistically significant, since they are not based just on a visual inspection of the considered statistic. Since the sample size is relatively large (n = 2000) and we are interested in α values close to 2, we took into consideration only the test statistics N2 and N3, i.e. we excluded N1 from the analysis; see Sect. 5.2 for a discussion. Moreover, rather than performing a single statistical test oriented at verification of the non-normality hypothesis (e.g. by setting H1 to α < 1.98), we decided to present testing results for a wide range of parameters for both statistics. This is done to provide a more complete picture and for better illustration.
First, we calculated the values of N2 and N3 for all empirical datasets and confronted them with various theoretical N2 and N3 quantiles obtained for α-stable distributions. The 1% and 5% quantiles were constructed based on a strong Monte Carlo simulation of size 50 000 for samples (of length 2000) coming from SαS(α, 1, 0), where the α ∈ [1.8, 2.0] values are restricted to a grid with span 0.01. The results are illustrated in Fig. 11.
From the left panel of Fig. 11 we see that the N2 statistic rejects the Gaussian distribution hypothesis for Dataset 1 at the 1% significance level. Also, since the N2 test statistics for Dataset 2, Dataset 3, and Dataset 4 fall into the constructed confidence intervals for α = 2, one cannot reject normality for those samples. On the other hand, for N3, the hypothesis of the Gaussian distribution for Dataset 1 and Dataset 4 is rejected at the 1% significance level (see the right panel of Fig. 11). For Dataset 1, the values of the N2 and N3 statistics fall into the constructed confidence intervals for α ∈ [1.85, 1.93], which suggests that the SαS(α, 1, 0) distribution on this parameter range might be a suitable modelling choice. For Dataset 4, taking into account the N3 statistic values, we conclude that the desired α stability index range is [1.85, 1.97].

Fig. 11 The plots present tail quantiles for the N2 (left) and N3 (right) test statistics for α ∈ [1.8, 2] and sample size n = 2000. For each α, using Monte Carlo simulations we approximated the test statistic's median, 1% tail quantiles, and 5% tail quantiles; note that those could be linked to one-sided test significance levels. Horizontal lines indicate the values of N2 (left) and N3 (right) for the empirical datasets. Normality (α = 2) for Dataset 1 is rejected by both statistics, but only N3 rejects normality for Dataset 4. For both datasets, the intersection of the 1% upper quantiles and the horizontal lines indicates how heavy (in terms of α) the tail of the sample distribution might be.
For completeness, we also present p-values for the goodness-of-fit tests based on the N2 and N3 statistics for H0 hypotheses of the SαS(α, 1, 0) distribution for selected α parameters and for all datasets; see Table 3. The p-values presented in Table 3 are calculated based on 50 000 Monte Carlo simulations. More precisely, for each α we simulate 50 000 times a sample of length 2000 from the SαS(α, 1, 0) distribution in order to calculate 50 000 sample test statistics N2 and N3. Next, using the obtained results, we construct a Monte Carlo distribution of N2 and N3 and confront it with the N2 and N3 statistics for the given dataset, say N2^X and N3^X. Finally, denoting by q̂_i^X the value of the Monte Carlo CDF for the test statistic N_i (i = 2, 3) at the point N_i^X, the corresponding p-value is calculated as the minimum of q̂_i^X and 1 − q̂_i^X. This corresponds to the minimum of the left-sided and right-sided statistical test p-values for the N2 and N3 test statistics.
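The p-value construction described above reduces to evaluating the empirical Monte Carlo CDF at the observed statistic. A minimal sketch, with `mc_values` standing for the 50 000 simulated statistic values and `obs` for the dataset's statistic:

```python
def mc_p_value(mc_values, obs):
    """Two-sided Monte Carlo p-value as described in the text: evaluate the
    empirical CDF of the simulated statistics at the observed value and
    take the minimum of the left- and right-tail probabilities."""
    q = sum(v <= obs for v in mc_values) / len(mc_values)
    return min(q, 1.0 - q)
```

For instance, if 11% of the simulated values fall at or below the observed statistic, the reported p-value is 0.11.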
As in Fig. 11, the results from Table 3 clearly indicate that, based on the N2 and N3 statistics, the hypothesis of the Gaussian distribution for Dataset 1 and Dataset 4 should be rejected. Moreover, the corresponding (acceptable) SαS(α, 1, 0) distribution for Dataset 1 and Dataset 4 should have the stability index in the range α ∈ [1.84, 1.93] and α ∈ [1.88, 1.96], respectively; this is consistent with the results deduced from Fig. 11. For Dataset 2, we observe the highest p-values for α = 2 (in the case of N2) and α = 1.98 (in the case of N3), which may suggest the Gaussian (or very close to Gaussian) distribution. For Dataset 3, the highest p-values are observed for α = 1.98 for both considered statistics. This may also suggest an α-stable distribution with stability index very close to the Gaussian case.
To sum up, our analysis shows that the hypothesis of the Gaussian distribution for Dataset 2 and Dataset 3 cannot be rejected, while Dataset 1 and Dataset 4 can be modeled by a distribution from the domain of attraction of the α-stable law with the α parameter close to 2. Also, note that an analysis based only on one of the statistics (i.e. N2) does not give the whole picture of the distribution related to the data. The results presented in Table 3 confirm that N3 is the best choice when performing analysis of near-Gaussian data for relatively large sample sizes; this is consistent with the remarks made in Sect. 5.1, where each statistic's purpose was outlined.
Next, for better illustration and to get a clearer picture of the sample distribution properties, we decided to follow the approach introduced in Sect. 7. This should check the overall symmetric α-stable distribution parameter fit for the analyzed datasets. Taking into account the p-values presented in Table 3, we decided to perform a (visual) check of the following α parameter fits: (1) α = 1.93 for Dataset 1; (2) α = 2.0 for Dataset 2; (3) α = 2.0 for Dataset 3; (4) α = 1.96 for Dataset 4. We followed the approach introduced in Example 5 using the normalised version of the R̂_i statistics; see Sect. 7 for details. The equivalents of the plots presented in Fig. 9 for the empirical datasets and the chosen specifications are presented in Fig. 12.
The results point out further sample properties. We see that the (tail) Gaussian distribution hypothesis for Dataset 2 and Dataset 3 cannot be rejected. In fact, in both cases, the values of the normalised R̂_i statistics fall into the constructed confidence intervals for the SαS(2, 1, 0) distribution on almost every conditional variance subset. Next, for Dataset 1, we observe that the extreme left-tail QCV breaches the quantile threshold, which might suggest that the true left tail is heavier than the left tail of SαS(1.93, 1, 0). On the other hand, for Dataset 4, we observe a breach in the right-tail QCV combined with a slight increasing trend in the QCVs. This phenomenon may suggest non-symmetric behaviour of the analysed vector of observations and lead to the conclusion that the H0 hypothesis should be rejected.

Fig. 12 The plots illustrate the overall fit for all empirical datasets given a specific choice of SαS(α, 1, 0) distribution; the reference α levels are presented in the titles of the plots. The dashed lines correspond to extreme (point-wise) quantiles of R̂_i, i = 1, 2, …, 22, under the correct model specification; they were obtained using a 10 000 Monte Carlo run. One can see that the hypothesis of the Gaussian distribution cannot be rejected for Dataset 2 and Dataset 3 at the 1% as well as the 5% significance level. The results for Dataset 1 and Dataset 4 indicate that the H0 hypothesis of the tested distribution should be rejected: the realised values of the R̂_i statistics exceed the constructed confidence bounds on low (in the case of Dataset 1) and high (in the case of Dataset 4) conditional variance subsets.

Let us finish this section with a comment on the practical consequences of such a statistical analysis of plasma fluctuations. First, we note that non-Gaussian heavy-tailed distributions for low-frequency plasma turbulence have also been observed in a number of toroidal plasma confinement systems, such as the T-10 tokamak and the L-2M, TJ-II, and LHD stellarators, as well as in astrophysical plasmas; see Korolev and Skvortsova (2006), Dendy and Chapman (2006), and Watkins et al. (2009). Fluctuations with α-stable behaviour have been reported in experiments on the stellarators URAGAN-3M and Heliotron J, and on the tokamak ADITYA; see Gonchar et al. (2003), Mizuuchi et al. (2005), and Jha et al. (2003), respectively. This phenomenon is called ''Lévy turbulence'', and here we confirm its existence in the stellarator URAGAN-3M. Moreover, we conclude that not only the turbulence level, but the very statistics of the turbulence changes at the L-H transition. This observation brings the necessity to build adequate theoretical models of the plasma turbulence before and after the L-H transition. We hope that the change of statistics observed in the data taken from URAGAN-3M will inspire plasma experimental groups to check whether the change of statistics is also observed in other fusion devices where the L-H transition has been detected.

Conclusions
In this paper, we introduced a novel generic goodness-of-fit testing approach based on the quantile conditional variance (QCV) statistics, which builds upon recent results obtained in Jaworski and Pitera (2020) and Jelito and Pitera (2020). We studied the probabilistic properties of sample QCV statistics for α-stable distributed random variables and showed that QCV analysis can be used to efficiently characterise the α-stable distribution. Our Monte Carlo-based analysis indicates that the proposed goodness-of-fit test statistics outperform many benchmark goodness-of-fit tests that are typically used in reference to α-stable distributions. Although our focus was on the symmetric case, the simulation study indicates that the proposed test statistics might in fact be almost invariant to the skewness specification encoded in the β parameter; see Fig. 8. In other words, the same methodology can be applied to test the α parameter fit also for asymmetric α-stable distributions.
We want to emphasize that our paper is the first one that applies the quantile conditional variance statistical approach to α-stable distributed samples. The presented results indicate that the proposed methodology can be considered a universal one when α-stable distribution testing is considered. In fact, we believe that the approach based on quantile conditional moments might be used to effectively solve the challenge stated in Nolan (2020), where it was said that: ''In general, it appears to be challenging to find an omnibus test for stability''.
Finally, we have demonstrated how to utilize our framework in the analysis of real data from plasma physics and showed that our approach efficiently discriminates between light- and heavy-tailed distributions. This shows that the presented methodology can also be useful in practical problems that require efficient methods for specific data analysis.

Appendix B Numerical comparison of the tests' powers for significance level 5%
In Tables 4, 5, 6, 7, and 8 we present the powers of the tests considered in Sect. 5.2.1. In each case the H0 hypothesis is that the random sample comes from the SαS(1.5, 1, 0) distribution. The H1 hypothesis is that the vector of observations constitutes a random sample from the SαS(α1, 1, 0) distribution for α1 ∈ {1.1, 1.2, …, 2.0}. The number of

Declarations
Conflicts of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Fig. 14 A simplified R source code that can be used to compute N1. For completeness, we also show how to compute an exemplary (two-sided) test power for the true α0 = 1.5 and alternative α1 = 1.7 for a sample of size n = 50 at the 5% significance level. The calculations take 6 seconds on a 13'' MacBook Air (2017) with 8 GB RAM and a 1.8 GHz Dual-Core Intel i5 processor.