Inference for quantile measures of skewness

  • Original Paper
TEST

Abstract

Given a location-scale family generated by a distribution with smooth positive density, the aim is to provide distribution-free tests and confidence intervals for a skewness coefficient determined by three quantiles. It is the Bowley–Hinkley ratio \(S_r/R_r\), where \(S_r=x_r+x_{1-r}-2x_{0.5}\) is the sum of two symmetric quantiles minus twice the median, and \(R_r=x_{1-r}-x_r\) is the \(r\)th interquantile range. Here, \(0<r< 0.5\) is to be chosen and fixed. The sample version of this ratio depends only on three order statistics and is the basis for tests and confidence intervals. It is shown that the variance stabilized version of this statistic leads to more powerful tests than the Studentized version of the sample version of \(S_r\). Sample sizes required to obtain accurate coverage of confidence intervals with a prespecified width are provided.
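
As a minimal sketch of the sample version described above, the following Python function computes \(\hat{S}_r/\hat{R}_r\) from three order statistics. The quantile rule used here (the order statistic \(x_{(\lceil np\rceil )}\)) and the function names are assumptions made for illustration; the sample-quantile definition actually used in the paper is not stated in this abstract.

```python
import math


def order_stat_quantile(sorted_x, p):
    """Return x_(ceil(n*p)), one simple (assumed) sample-quantile rule."""
    n = len(sorted_x)
    # round() guards against floating-point noise in n * p before taking ceil
    k = math.ceil(round(n * p, 9))
    k = min(max(k, 1), n)  # clamp to a valid 1-based index
    return sorted_x[k - 1]


def bowley_hinkley(x, r=0.25):
    """Sample skewness ratio S_r / R_r for a fixed 0 < r < 0.5."""
    if not 0.0 < r < 0.5:
        raise ValueError("r must satisfy 0 < r < 0.5")
    xs = sorted(x)
    lo = order_stat_quantile(xs, r)
    med = order_stat_quantile(xs, 0.5)
    hi = order_stat_quantile(xs, 1.0 - r)
    s_r = lo + hi - 2.0 * med  # sum of symmetric quantiles minus twice the median
    r_r = hi - lo              # r-th interquantile range
    if r_r == 0:
        raise ValueError("interquantile range is zero; ratio undefined")
    return s_r / r_r


if __name__ == "__main__":
    print(bowley_hinkley([1, 2, 3, 4, 5, 6, 7, 8, 9]))     # symmetric sample
    print(bowley_hinkley([1, 1, 2, 2, 3, 5, 9, 20, 50]))   # right-skewed sample
```

Only three order statistics enter the ratio, so extreme observations beyond the chosen quantiles do not affect it.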

References

  • ABS (2011) Household data and income distribution, Australian Bureau of Statistics Report 6523.0. Canberra, ACT, Australia. www.ausstats.abs.gov.au

  • Bickel PJ, Doksum KA (1977) Mathematical statistics: basic ideas and selected topics. Holden-Day, San Francisco

  • Bloch DA, Gastwirth JL (1968) On a simple estimate of the reciprocal of the density function. Ann Math Stat 39(3):1083–1085

  • Bowley AL (1920) Elements of Statistics, 4th edn. Scribner’s, New York, 1st edn 1901

  • Brys G, Hubert M, Struyf A (2004) A robust measure of skewness. J Comp Graph Stat 13:996–1017

  • Critchley F, Jones MC (2008) Asymmetry and gradient asymmetry functions: density-based skewness and kurtosis. Scand J Stat 35(3):415–437

  • DasGupta A (2006) Asymptotic theory of statistics and probability. Springer, New York

  • David FN, Johnson NL (1956) Some tests of significance with ordered variables. J Roy Stat Soc Ser B 18:1–20

  • David HA (1981) Order statistics. Wiley, New York

  • Groeneveld RA (1991) An influence function approach to describing the skewness of a population. Am Stat 45:97–102

  • Groeneveld RA (1998) A class of quantile measures for kurtosis. Am Stat 52:325–329

  • Groeneveld RA, Meeden G (1984) Measuring skewness and kurtosis. Statistician 33:391–399

  • Gupta MK (1967) An asymptotically nonparametric test of symmetry. Ann Math Stat 38(3):849–866

  • Heritier S, Cantoni E, Copt S, Victoria-Feser M (2009) Robust methods in Biostatistics. Wiley, Chichester

  • Hinkley DV (1975) On power transformations to symmetry. Biometrika 62:101–111

  • Hyndman RJ, Fan Y (1996) Sample quantiles in statistical packages. Am Stat 50:361–365

  • Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1. Wiley, New York

  • Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2. Wiley, New York

  • Kulinskaya E, Morgenthaler S, Staudte RG (2008) Meta analysis: a guide to calibrating and combining statistical evidence. Wiley series in probability and statistics, Wiley, Chichester

  • Kulinskaya E, Morgenthaler S, Staudte RG (2010) Variance stabilizing the difference of two binomial proportions. Am Stat 64:350–356

  • MacGillivray HL (1986) Skewness and asymmetry: measures and orderings. Ann Stat 14:994–1011

  • Morgenthaler S, Staudte RG (2012) Advantages of variance stabilization. Scand J Stat 39:714–728

  • Morgenthaler S, Staudte RG (2013) Evidence for alternative hypotheses. In: Becker C, Fried R, Kuhnt S (eds) Robustness and complex data structures; a Festschrift in Honour of Ursula Gather. Springer, Berlin, pp 315–329

  • Ngatchou-Wandji J (2006) On testing for the nullity of some skewness coefficients. Int Stat Rev 74:47–65

  • Oja H (1981) On location, scale, skewness and kurtosis of univariate distributions. Scand J Stat 8:154–168

  • Prendergast LA, Staudte RG (2014) Better than you think: interval estimators of the difference of binomial proportions. J Stat Plan Inference 148:38–48

  • Rosco JF, Jones MC, Pewsey A (2011) Skew t distributions via the sinh-arcsinh transformation. Test 20(3):630–652

  • Ruppert D (1987) What is kurtosis? an influence function approach. Am Stat 41(1):1–5

  • Siddiqui MM (1960) Distribution of quantiles in samples from a bivariate population. J Res Nat Bureau Stds Ser B 64:1960

  • Staudte RG (2013a) Inference for the standardized median. In: Lahiri S, Schick A, Sengupta A, Sriram N (eds) Contemporary developments in statistical theory; a Festschrift for Hira Lal Koul, pp 353–363. Springer, Berlin

  • Staudte RG (2013b) Distribution-free confidence intervals for the standardized median. STAT 2(1):184–196

  • R Development Core Team (2008) R: a language and environment for statistical computing. R foundation for statistical computing, Vienna. http://www.R-project.org. ISBN 3-900051-07-0

  • Tukey JW (1965) Which part of the sample contains the information? Proc Nat Acad Sci USA 53:127–134

  • van Zwet WR (1964) Transformations of random variables. Math, Zentrum, Amsterdam

Acknowledgments

The author is indebted to the Editor and referees, whose recommendations led to substantial improvement in presentation of the text. The author thanks Dr. Luke Prendergast of La Trobe University for helpful discussions.

Author information

Corresponding author

Correspondence to Robert G. Staudte.

Appendices

1.1 A1: Frequency tables of visits to doctors data, by gender

http://www.unige.ch/ses/dsec/staff/faculty/Cantoni-Eva/Books/RobustBiostat.html

The data are found at the above website. After extracting Chapter5.rdata, loading it into R and employing the following R commands, one obtains the table of data listed below.

figure a

There were, for example, 141 females who visited the doctor twice during the observation period. The number of females is 2,079 and the number of males is 789.

figure b

1.2 A2: Asymptotic widths of confidence intervals for \(\gamma _r\)

We now give a somewhat heuristic derivation, corroborated by simulation studies, of the expressions (6) and (7) for \(W=U-L\), and also for \(W_\text {DF}=U_\text {DF}-L_\text {DF}\) of (9). We assume \(W_\text {DF} = W_{\text {DF},n,\alpha }(\hat{\gamma };\hat{a}_0,\hat{a}_1,\hat{a}_2)=W_{n,\alpha }(\hat{\gamma };a_0,a_1,a_2) +o_p(n^{-1/2})\). Let \(b=\sinh ^{-1}\{l(\hat{\gamma })/D\}\) and \(c=z_{1-\alpha /2}\sqrt{a_2/n}\,\). Then, using (2), (4), and (5), one can write, with the help of hyperbolic function relations:

$$\begin{aligned} W&= k^{-1}(De^{b+c})-k^{-1}(De^{b-c}) \\&= \frac{D}{2a_2}\left\{ \sinh (b+c)-\sinh (b-c)\right\} \\&= \frac{D}{a_2}\left\{ \cosh (b)\sinh (c)\right\} \\&= \frac{D}{a_2}\left\{ c\;\cosh (b)+o(c)\right\} \qquad \text {as } c\rightarrow 0 \\&= \frac{D}{a_2}\left[ c\;\cosh \left\{ \sinh ^{-1}\left( \frac{l(\hat{\gamma }) }{D}\right) \right\} \right] +o(c) \\&= \frac{\sqrt{l^2(\hat{\gamma })+D^2}\,}{\sqrt{a_2}\,}\;\frac{z_{1-\alpha /2}}{\sqrt{n}\,}+o(n^{-1/2}) \qquad \text {as } n\rightarrow \infty \\&= \frac{w_\text {asym}(\hat{\gamma })\;z_{1-\alpha /2}}{\sqrt{n}\,}+o(n^{-1/2}). \end{aligned}$$
(13)
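
The hyperbolic steps in (13) can be checked numerically. The sketch below starts from the second line of (13), since \(k^{-1}\), \(l\) and \(D\) are defined earlier in the paper and not in this appendix; the numerical values assigned to `ell`, `D`, `a2` and `z` are purely hypothetical and serve only to illustrate that the exact width and the leading term agree up to the factor \(\sinh (c)/c\rightarrow 1\).

```python
import math


def width_exact(ell, D, a2, z, n):
    """Second line of (13): D/(2*a2) * {sinh(b+c) - sinh(b-c)}."""
    b = math.asinh(ell / D)
    c = z * math.sqrt(a2 / n)
    return D / (2.0 * a2) * (math.sinh(b + c) - math.sinh(b - c))


def width_product_form(ell, D, a2, z, n):
    """Third line of (13): (D/a2) * cosh(b) * sinh(c)."""
    b = math.asinh(ell / D)
    c = z * math.sqrt(a2 / n)
    return D / a2 * math.cosh(b) * math.sinh(c)


def width_leading_term(ell, D, a2, z, n):
    """Leading term of (13): sqrt(ell^2 + D^2)/sqrt(a2) * z/sqrt(n)."""
    return math.sqrt(ell**2 + D**2) / math.sqrt(a2) * z / math.sqrt(n)


if __name__ == "__main__":
    ell, D, a2, z = 0.3, 1.2, 2.5, 1.96  # hypothetical illustration values
    for n in (10**2, 10**4, 10**6):
        ratio = width_exact(ell, D, a2, z, n) / width_leading_term(ell, D, a2, z, n)
        print(n, ratio)  # the ratio approaches 1 as n grows
```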

Elementary calculations show that \(w_\text {asym}(\gamma )=2\sqrt{q(\gamma )}=2\sqrt{a_0+a_1\gamma +a_2\gamma ^2}\,\) is convex with minimum at \(\gamma _\text {min}=-a_1/(2a_2)=\hbox {Cov}[\hat{R}_r,\hat{S}_r]/\hbox {Var}[\hat{R}_r].\) Further,

$$\begin{aligned} w_\text {asym}(\gamma _\text {min})=\frac{2\{(n\hbox {Var}[\hat{S}_r])\, (1-\hbox {Corr}^2[\hat{R}_r,\hat{S}_r])\}^{1/2}\,}{R_r}. \end{aligned}$$

In general, \(\gamma _\text {min}\ne S_r/R_r=\gamma _r=\lim \hat{\gamma }_r \), although they are both equal to 0 for symmetric \(F\).
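
A short sketch of the width function and its minimizer follows. The coefficient values passed in the example are hypothetical (the constants (8) are not reproduced in this appendix), and the function names are introduced here for illustration only. Substituting \(\gamma _\text {min}\) into the quadratic gives \(q(\gamma _\text {min})=a_0-a_1^2/(4a_2)\), which the code uses as a cross-check.

```python
import math


def w_asym(gamma, a0, a1, a2):
    """Asymptotic width factor 2*sqrt(a0 + a1*gamma + a2*gamma^2)."""
    q = a0 + a1 * gamma + a2 * gamma**2
    if q < 0:
        raise ValueError("quadratic q(gamma) must be non-negative")
    return 2.0 * math.sqrt(q)


def gamma_min(a1, a2):
    """Minimizer -a1/(2*a2) of the quadratic, assuming a2 > 0."""
    if a2 <= 0:
        raise ValueError("a2 must be positive")
    return -a1 / (2.0 * a2)


if __name__ == "__main__":
    a0, a1, a2 = 1.0, 0.4, 2.0  # hypothetical coefficients
    g = gamma_min(a1, a2)
    print(g, w_asym(g, a0, a1, a2))
    # cross-check against the closed form 2*sqrt(a0 - a1^2/(4*a2))
    print(2.0 * math.sqrt(a0 - a1**2 / (4.0 * a2)))
```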

The maximum of \(w_\text {asym}(\gamma )\) over \([-1,1]\) is the larger of \(2\sqrt{a_0-a_1+a_2}\,\) and \(2\sqrt{a_0+a_1+a_2}\,\). A glance at Table 3 shows that these values can be much larger than \(w_\text {asym}(\gamma _r)\), but it is the latter quantity that is of interest because of the consistency of \(\hat{\gamma }_r\) for \(\gamma _r\). It follows from (13) that to obtain a large-sample \(100(1-\alpha )\,\%\) confidence interval for \(\gamma _{r}\) of desired width \(W_0\), one requires

$$\begin{aligned} n\ge n_0=n_0(r,\alpha ,W_0)= \biggl \{\frac{\max _F w_\text {asym}(\gamma _r)\;z_{1-\alpha /2}}{W_0}\biggr \}^2. \end{aligned}$$
(14)
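
The bound (14) is straightforward to evaluate. The sketch below is an illustration under stated assumptions: `required_n` is a hypothetical helper name, and the example uses \(\max _F w_\text {asym}(\gamma _r)=\pi \), the value this appendix reports for the Cauchy worst case at \(r=0.25\), together with an arbitrarily chosen target width \(W_0=0.2\).

```python
import math
from statistics import NormalDist


def required_n(w_max, alpha, target_width):
    """Smallest integer n satisfying (14): n >= (w_max * z_{1-alpha/2} / W_0)^2."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if target_width <= 0:
        raise ValueError("target width must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard normal quantile
    return math.ceil((w_max * z / target_width) ** 2)


if __name__ == "__main__":
    # Worst case reported in the text for r = 0.25: w_max = pi
    print(required_n(math.pi, alpha=0.05, target_width=0.2))
```

Because \(n_0\) scales with \(1/W_0^2\), halving the desired width roughly quadruples the required sample size.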

Table 6 shows that, for the 15 families considered here and each \(r\) ranging from \(0.05\) to \(0.25\), \(\max _F w_\text {asym}(\gamma _{r})\) occurs for \(F\) equal to the Cauchy distribution. Thus, we want the value of \(n_0(r)\) for this worst case. Letting \(\theta _r=\pi (r-0.5)\), one finds \(x_r=\tan (\theta _r)\) and \(g_r=\pi /\cos ^{2}(\theta _r),\) so that the constants (8) are

$$\begin{aligned} a_0(r)&= \frac{\pi ^2}{4}\left\{ \frac{1}{\tan ^{2}(\theta _r)}+ \frac{8r}{\sin ^{2}(2\theta _r)}-\frac{4r}{\sin ^{2}(\theta _r)}\right\} \\ a_2(r)&= \frac{2\pi ^2r(1-2r)}{\sin ^{2}(2\theta _r)}.\nonumber \end{aligned}$$
(15)

A plot of \(a_0(r)\) against \(0<r<0.5\) shows the graph is U-shaped and symmetric about \(r=0.25\), with a minimum of \(a_0(0.25)=\pi ^2/4.\) The graph of \(a_2(r)\) is also U-shaped, symmetric and nearly identical to that of \(a_0(r)\) for \(r\) near 0.25, but diverges from it as \(r\) approaches the boundaries of \([0,0.5].\)

Because the Cauchy distribution is symmetric, \(\gamma _r=0\), so that \(w_\text {asym}(\gamma _r)=2\sqrt{a_0(r)}\,\). Its graph is likewise U-shaped and symmetric about \(r=0.25\), with minimum \(\pi .\) Thus, Bowley’s coefficient \(\hat{\gamma }_{0.25}\) minimizes over \(r\) the maximum over \(F\) of the asymptotic interval widths.
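
The claims about the Cauchy constants (15) can be verified numerically. The following sketch evaluates \(a_0(r)\) and \(a_2(r)\) as written in (15) and searches a grid of \(r\) values for the minimizer of \(2\sqrt{a_0(r)}\); the grid spacing of 0.01 and the function names are choices made here for illustration.

```python
import math


def theta(r):
    """theta_r = pi * (r - 0.5), as defined before (15)."""
    return math.pi * (r - 0.5)


def a0_cauchy(r):
    """a_0(r) from (15) for the Cauchy distribution."""
    t = theta(r)
    return (math.pi**2 / 4.0) * (
        1.0 / math.tan(t) ** 2
        + 8.0 * r / math.sin(2.0 * t) ** 2
        - 4.0 * r / math.sin(t) ** 2
    )


def a2_cauchy(r):
    """a_2(r) from (15) for the Cauchy distribution."""
    t = theta(r)
    return 2.0 * math.pi**2 * r * (1.0 - 2.0 * r) / math.sin(2.0 * t) ** 2


def w_asym_cauchy(r):
    """Width factor 2*sqrt(a_0(r)), valid because gamma_r = 0 for the Cauchy."""
    return 2.0 * math.sqrt(a0_cauchy(r))


if __name__ == "__main__":
    grid = [i / 100 for i in range(1, 50)]  # r = 0.01, ..., 0.49
    best_r = min(grid, key=w_asym_cauchy)
    print(best_r, w_asym_cauchy(best_r), math.pi)
    print(a0_cauchy(0.1), a0_cauchy(0.4))   # symmetry about r = 0.25
    print(a0_cauchy(0.25), a2_cauchy(0.25), math.pi**2 / 4.0)
```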

Cite this article

Staudte, R.G. Inference for quantile measures of skewness. TEST 23, 751–768 (2014). https://doi.org/10.1007/s11749-014-0391-5
