Inference for quantile measures of skewness

Staudte, Robert G.

doi:10.1007/s11749-014-0391-5

Original Paper
Published: 19 July 2014

TEST, Volume 23, pages 751–768 (2014)

Robert G. Staudte¹

Abstract

Given a location-scale family generated by a distribution with smooth positive density, the aim is to provide distribution-free tests and confidence intervals for a skewness coefficient determined by three quantiles. It is the Bowley–Hinkley ratio $S_r/R_r$, where $S_r=x_r+x_{1-r}-2x_{0.5}$ is the sum of two symmetric quantiles minus twice the median, and $R_r=x_{1-r}-x_r$ is the $r$th interquantile range. Here, $0<r< 0.5$ is to be chosen and fixed. The sample version of this ratio depends only on three order statistics and is the basis for tests and confidence intervals. It is shown that the variance stabilized version of this statistic leads to more powerful tests than the Studentized version of the sample version of $S_r$. Sample sizes required to obtain accurate coverage of confidence intervals with a prespecified width are provided.
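As a minimal sketch of the sample version of this ratio, the following Python computes $\hat{S}_r/\hat{R}_r$ from three sample quantiles. The function names and the linear-interpolation quantile rule are assumptions made for illustration; the paper's own choice of order statistics is not shown in this excerpt.

```python
import math


def sample_quantile(sorted_x, p):
    """Sample quantile by linear interpolation between order statistics.

    Assumption: this interpolation rule is one common convention, not
    necessarily the definition used in the paper.
    """
    n = len(sorted_x)
    h = (n - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, n - 1)
    return sorted_x[lo] + (h - lo) * (sorted_x[hi] - sorted_x[lo])


def quantile_skewness(data, r=0.25):
    """Sample Bowley-Hinkley ratio S_r / R_r for a fixed 0 < r < 0.5."""
    if not 0 < r < 0.5:
        raise ValueError("r must satisfy 0 < r < 0.5")
    xs = sorted(data)
    x_r = sample_quantile(xs, r)
    x_med = sample_quantile(xs, 0.5)
    x_1r = sample_quantile(xs, 1 - r)
    s_r = x_r + x_1r - 2 * x_med   # sum of symmetric quantiles minus twice the median
    r_r = x_1r - x_r               # r-th interquantile range
    if r_r == 0:
        raise ValueError("interquantile range is zero")
    return s_r / r_r


if __name__ == "__main__":
    # Right-skewed toy data: quartiles 2 and 8, median 4, so the ratio is 2/6.
    print(quantile_skewness([1, 2, 4, 8, 16], r=0.25))
```

The ratio lies in $[-1,1]$ and changes sign when the data are reflected, which is why it is read as a skewness coefficient.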

References

ABS (2011) Household data and income distribution, Australian Bureau of Statistics Report 6523.0. Canberra, ACT, Australia. www.ausstats.abs.gov.au
Bickel PJ, Doksum KA (1977) Mathematical statistics: basic ideas and selected topics. Holden-Day, San Francisco
Bloch DA, Gastwirth JL (1968) On a simple estimate of the reciprocal of the density function. Ann Math Stat 39(3):1083–1085
Bowley AL (1920) Elements of statistics, 4th edn (1st edn 1901). Scribner’s, New York
Brys G, Hubert M, Struyf A (2004) A robust measure of skewness. J Comput Graph Stat 13:996–1017
Critchley F, Jones MC (2008) Asymmetry and gradient asymmetry functions: density-based skewness and kurtosis. Scand J Stat 35(3):415–437
DasGupta A (2006) Asymptotic theory of statistics and probability. Springer, New York
David FN, Johnson NL (1956) Some tests of significance with ordered variables. J Roy Stat Soc Ser B 18:1–20
David HA (1981) Order statistics. Wiley, New York
Groeneveld RA (1991) An influence function approach to describing the skewness of a population. Am Stat 45:97–102
Groeneveld RA (1998) A class of quantile measures for kurtosis. Am Stat 52:325–329
Groeneveld RA, Meeden G (1984) Measuring skewness and kurtosis. Statistician 33:391–399
Gupta MK (1967) An asymptotically nonparametric test of symmetry. Ann Math Stat 38(3):849–866
Heritier S, Cantoni E, Copt S, Victoria-Feser M (2009) Robust methods in biostatistics. Wiley, Chichester
Hinkley DV (1975) On power transformations to symmetry. Biometrika 62:101–111
Hyndman RJ, Fan Y (1996) Sample quantiles in statistical packages. Am Stat 50:361–365
Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1. Wiley, New York
Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2. Wiley, New York
Kulinskaya E, Morgenthaler S, Staudte RG (2008) Meta analysis: a guide to calibrating and combining statistical evidence. Wiley series in probability and statistics. Wiley, Chichester
Kulinskaya E, Morgenthaler S, Staudte RG (2010) Variance stabilizing the difference of two binomial proportions. Am Stat 64:350–356
MacGillivray HL (1986) Skewness and asymmetry: measures and orderings. Ann Stat 14:994–1011
Morgenthaler S, Staudte RG (2012) Advantages of variance stabilization. Scand J Stat 39:714–728
Morgenthaler S, Staudte RG (2013) Evidence for alternative hypotheses. In: Becker C, Fried R, Kuhnt S (eds) Robustness and complex data structures; a Festschrift in Honour of Ursula Gather. Springer, Berlin, pp 315–329
Ngatchou-Wandji J (2006) On testing for the nullity of some skewness coefficients. Int Stat Rev 74:47–65
Oja H (1981) On location, scale, skewness and kurtosis of univariate distributions. Scand J Stat 8:154–168
Prendergast LA, Staudte RG (2014) Better than you think: interval estimators of the difference of binomial proportions. J Stat Plan Inference 148:38–48
Rosco JF, Jones MC, Pewsey A (2011) Skew t distributions via the sinh-arcsinh transformation. Test 20(3):630–652
Ruppert D (1987) What is kurtosis? An influence function approach. Am Stat 41(1):1–5
Siddiqui MM (1960) Distribution of quantiles in samples from a bivariate population. J Res Nat Bureau Stds Ser B 64:145–150
Staudte RG (2013a) Inference for the standardized median. In: Lahiri S, Schick A, Sengupta A, Sriram N (eds) Contemporary developments in statistical theory; a Festschrift for Hira Lal Koul. Springer, Berlin, pp 353–363
Staudte RG (2013b) Distribution-free confidence intervals for the standardized median. STAT 2(1):184–196
R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org. ISBN 3-900051-07-0
Tukey JW (1965) Which part of the sample contains the information? Proc Natl Acad Sci USA 53:127–134
van Zwet WR (1964) Transformations of random variables. Mathematisch Centrum, Amsterdam

Acknowledgments

The author is indebted to the Editor and referees, whose recommendations led to substantial improvement in presentation of the text. The author thanks Dr. Luke Prendergast of La Trobe University for helpful discussions.

Author information

Authors and Affiliations

La Trobe University, Melbourne, VIC 3086, Australia
Robert G. Staudte

Corresponding author

Correspondence to Robert G. Staudte.

Appendices

1.1 A1: Frequency tables of visits to doctors data, by gender

http://www.unige.ch/ses/dsec/staff/faculty/Cantoni-Eva/Books/RobustBiostat.html

The data are found at the above website. After extracting Chapter5.rdata, loading it into R and employing the following R commands, one obtains the table of data listed below.

There were, for example, 141 females who visited the doctor twice during the observation period. The number of females is 2,079 and the number of males is 789.

1.2 A2: Asymptotic widths of confidence intervals for $\gamma _r$

We now give a somewhat heuristic derivation, corroborated by simulation studies, of the expressions (6) and (7) for $W=U-L$, and also of $W_\text {DF}=U_\text {DF}-L_\text {DF}$ in (9). We assume $W_\text {DF} = W_{\text {DF},n,\alpha }(\hat{\gamma };\hat{a}_0,\hat{a}_1,\hat{a}_2)=W_{n,\alpha }(\hat{\gamma };a_0,a_1,a_2) +o_p(n^{-1/2})$. Let $b=\sinh ^{-1}\{l(\hat{\gamma })/D\}$ and $c=z_{1-\alpha /2}\sqrt{a_2/n}\,$. Then, using (2), (4), and (5), one can write, with the help of hyperbolic function relations:

$$\begin{aligned} W&= k^{-1}(De^{b+c})-k^{-1}(De^{b-c}) \nonumber \\&= \frac{D}{2a_2}\left\{ \sinh (b+c)-\sinh (b-c)\right\} \nonumber \\&= \frac{D}{a_2}\left\{ \cosh (b)\sinh (c)\right\} \nonumber \\&= \frac{D}{a_2}\left\{ c\;\cosh (b)+o(c)\right\} \text {\qquad as } c\rightarrow 0 \nonumber \\&= \frac{D}{a_2}\left[ c\;\cosh \left\{ \sinh ^{-1}\left( \frac{l(\hat{\gamma }) }{D}\right) \right\} \right] +o(c) \nonumber \\&= \frac{\sqrt{l^2(\hat{\gamma })+D^2}\,}{\sqrt{a_2}\,}\;\frac{z_{1-\alpha /2}}{\sqrt{n}\,}+o(n^{-1/2}) \text {\qquad as } n\rightarrow \infty \nonumber \\&= \frac{w_\text {asym}(\hat{\gamma })\;z_{1-\alpha /2}}{\sqrt{n}\,}+o(n^{-1/2}). \end{aligned}$$

(13)
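The steps of (13) can be checked numerically. The sketch below takes the second line of (13) as given and compares it with the product form and with the leading $n^{-1/2}$ term. The values passed for $l(\hat{\gamma})$, $D$ and $a_2$ are arbitrary illustrative numbers, not quantities from the paper, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def _b_and_c(l_val, D, a2, n, alpha):
    """b = asinh(l/D) and c = z_{1-alpha/2} * sqrt(a2/n), as defined in the text."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.asinh(l_val / D), z * math.sqrt(a2 / n)


def width_from_sinh(l_val, D, a2, n, alpha=0.05):
    """Second line of (13): D/(2 a2) * {sinh(b+c) - sinh(b-c)}."""
    b, c = _b_and_c(l_val, D, a2, n, alpha)
    return D / (2 * a2) * (math.sinh(b + c) - math.sinh(b - c))


def width_product_form(l_val, D, a2, n, alpha=0.05):
    """Third line of (13): (D/a2) * cosh(b) * sinh(c)."""
    b, c = _b_and_c(l_val, D, a2, n, alpha)
    return D / a2 * math.cosh(b) * math.sinh(c)


def width_asymptotic(l_val, D, a2, n, alpha=0.05):
    """Leading term of (13): sqrt(l^2 + D^2)/sqrt(a2) * z/sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.sqrt(l_val**2 + D**2) / math.sqrt(a2) * z / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical inputs: l = 0.3, D = 1.5, a2 = 2.0.
    for n in (25, 400, 10**6):
        exact = width_from_sinh(0.3, 1.5, 2.0, n)
        approx = width_asymptotic(0.3, 1.5, 2.0, n)
        print(n, exact, approx, exact / approx)
```

The ratio of the two widths equals $\sinh(c)/c$, which exceeds 1 and tends to 1 as $n$ grows, in line with the $o(n^{-1/2})$ remainder.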

Elementary calculations show that $w_\text {asym}(\gamma )=2\sqrt{q(\gamma )}=2\sqrt{a_0+a_1\gamma +a_2\gamma ^2}\,$ is convex with minimum at $\gamma _\text {min}=-a_1/(2a_2)=\hbox {Cov}[\hat{R}_r,\hat{S}_r]/\hbox {Var}[\hat{R}_r].$ Further,

$$\begin{aligned} w_\text {asym}(\gamma _\text {min})=\frac{2\{(n\hbox {Var}[\hat{S}_r])\, (1-\hbox {Corr}^2[\hat{R}_r,\hat{S}_r])\}^{1/2}\,}{R_r}. \end{aligned}$$

In general, $\gamma _\text {min}\ne S_r/R_r=\gamma _r=\lim \hat{\gamma }_r $, although they are both equal to 0 for symmetric $F$.
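The location of the minimum can be illustrated with a small grid search. This is a sketch under assumed constants: the values of $a_0$, $a_1$, $a_2$ below are hypothetical, chosen only so that the quadratic under the square root stays positive.

```python
import math


def w_asym(gamma, a0, a1, a2):
    """Asymptotic width factor 2*sqrt(a0 + a1*gamma + a2*gamma^2)."""
    return 2 * math.sqrt(a0 + a1 * gamma + a2 * gamma**2)


def gamma_min(a1, a2):
    """Minimiser of the quadratic under the square root: -a1/(2*a2)."""
    return -a1 / (2 * a2)


if __name__ == "__main__":
    a0, a1, a2 = 3.0, -1.0, 2.0          # hypothetical constants
    grid = [i / 1000 for i in range(-1000, 1001)]
    best = min(grid, key=lambda g: w_asym(g, a0, a1, a2))
    print(best, gamma_min(a1, a2))       # grid minimiser vs closed form
    # By convexity, the largest value on [-1, 1] is attained at an endpoint.
    print(max(w_asym(-1, a0, a1, a2), w_asym(1, a0, a1, a2)))
```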

The maximum of $w_\text {asym}(\gamma )$ over $[-1,1]$ is the larger of $2\sqrt{a_0-a_1+a_2}\,$ and $2\sqrt{a_0+a_1+a_2}\,$. A glance at Table 3 shows that these values can be much larger than $w_\text {asym}(\gamma _r)$, but it is the latter quantity that is of interest because of the consistency of $\hat{\gamma }_r$ for $\gamma _r$. It follows from (13) that to obtain a large-sample $100(1-\alpha )\,\%$ confidence interval for $\gamma _{r}$ of desired width $W_0$, one requires

$$\begin{aligned} n\ge n_0=n_0(r,\alpha ,W_0)= \biggl \{\frac{\max _F w_\text {asym}(\gamma _r)\;z_{1-\alpha /2}}{W_0}\biggr \}^2. \end{aligned}$$

(14)
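As a rough illustration of (14), the sketch below turns a worst-case width factor into a minimum sample size. The function name and the target width are assumptions; the input $\pi$ is the Cauchy value of $w_\text {asym}(\gamma _r)$ at $r=0.25$ given later in this appendix.

```python
import math
from statistics import NormalDist


def required_sample_size(w_max, target_width, alpha=0.05):
    """Smallest integer n with n >= (w_max * z_{1-alpha/2} / W_0)^2, as in (14)."""
    if w_max <= 0 or target_width <= 0:
        raise ValueError("w_max and target_width must be positive")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((w_max * z / target_width) ** 2)


if __name__ == "__main__":
    # Worst case w_max = pi (Cauchy, r = 0.25), 95% interval of width 0.2.
    print(required_sample_size(math.pi, 0.2))
```

Because $n_0$ scales with $1/W_0^2$, halving the target width roughly quadruples the required sample size.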

Table 6 shows that for the 15 families considered here, and each $r$ ranging from $0.05$ to $0.25$, the $\max _F w_\text {asym}(\gamma _{r})$ occurs for $F$ equal to the Cauchy distribution. Thus, we want the value of $n_0(r)$ for this worst case. Letting $\theta _r=\pi (r-0.5)$, one finds $x_r=\tan (\theta _r)$ and $g_r=\pi /\cos ^{2}(\theta _r),$ so that the constants (8) are

$$\begin{aligned} a_0(r)&= \frac{\pi ^2}{4}\left\{ \frac{1}{\tan ^{2}(\theta _r)}+ \frac{8r}{\sin ^{2}(2\theta _r)}-\frac{4r}{\sin ^{2}(\theta _r)}\right\} \\ a_2(r)&= \frac{2\pi ^2r(1-2r)}{\sin ^{2}(2\theta _r)}.\nonumber \end{aligned}$$

(15)

A plot of $a_0(r)$ against $0<r<0.5$ shows the graph is U-shaped and symmetric about $r=0.25$, with a minimum of $a_0(0.25)=\pi ^2/4.$ The graph of $a_2(r)$ is also U-shaped and symmetric about $r=0.25$, and is nearly identical to that of $a_0(r)$ for $r$ near 0.25, but diverges from it as $r$ approaches the endpoints of $(0,0.5).$

Similarly, the graph of $w_\text {asym}(\gamma _r)=2\sqrt{a_0(r)}\,$ is also U-shaped and symmetric about $r=0.25$, but with minimum $\pi .$ Thus, Bowley’s coefficient $\hat{\gamma }_{0.25}$ minimizes over $r$ the maximum over $F$ of the asymptotic interval widths.
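The stated properties of the Cauchy constants in (15) can be checked directly. The sketch below is a plain transcription of (15) into Python with hypothetical function names; it evaluates the constants at $r=0.25$, checks symmetry about that point, and searches a grid for the minimiser of $2\sqrt{a_0(r)}$.

```python
import math


def theta(r):
    """theta_r = pi * (r - 0.5)."""
    return math.pi * (r - 0.5)


def a0_cauchy(r):
    """a_0(r) from (15)."""
    t = theta(r)
    return (math.pi**2 / 4) * (
        1 / math.tan(t) ** 2
        + 8 * r / math.sin(2 * t) ** 2
        - 4 * r / math.sin(t) ** 2
    )


def a2_cauchy(r):
    """a_2(r) from (15)."""
    t = theta(r)
    return 2 * math.pi**2 * r * (1 - 2 * r) / math.sin(2 * t) ** 2


def w_asym_cauchy(r):
    """Width factor 2*sqrt(a_0(r)) for the Cauchy case."""
    return 2 * math.sqrt(a0_cauchy(r))


if __name__ == "__main__":
    print(a0_cauchy(0.25), a2_cauchy(0.25), math.pi**2 / 4)
    print(a0_cauchy(0.1), a0_cauchy(0.4))          # symmetry about r = 0.25
    grid = [i / 100 for i in range(1, 50)]
    r_best = min(grid, key=w_asym_cauchy)
    print(r_best, w_asym_cauchy(r_best), math.pi)  # minimiser and minimum width factor
```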

About this article

Cite this article

Staudte, R.G. Inference for quantile measures of skewness. TEST 23, 751–768 (2014). https://doi.org/10.1007/s11749-014-0391-5

Received: 10 December 2013
Accepted: 30 June 2014
Published: 19 July 2014
Issue Date: December 2014
DOI: https://doi.org/10.1007/s11749-014-0391-5

Keywords

Mathematics Subject Classification (2010)

62G99
