Box–Cox symmetric distributions and applications to nutritional data

Ferrari, Silvia L. P.; Fumes, Giovana

doi:10.1007/s10182-017-0291-6

Original Paper
Published: 13 February 2017

AStA Advances in Statistical Analysis, volume 101, pages 321–344 (2017)

Silvia L. P. Ferrari¹ &
Giovana Fumes²

Abstract

We introduce and study the Box–Cox symmetric class of distributions, which is useful for modeling positively skewed, possibly heavy-tailed, data. The new class of distributions includes the Box–Cox t, Box–Cox Cole-Green (or Box–Cox normal), Box–Cox power exponential distributions, and the class of the log-symmetric distributions as special cases. It provides easy parameter interpretation, which makes it convenient for regression modeling purposes. Additionally, it provides enough flexibility to handle outliers. The usefulness of the Box–Cox symmetric models is illustrated in a series of applications to nutritional data.

Notes

The slash distribution is the distribution of $Z/U^{1/q}$, where $q>0$ and $Z$ and $U$ are independent random variables with standard normal and uniform (on $(0,1)$) distributions, respectively.
If $\sigma |\lambda |=0$, $1/(\sigma |\lambda |)$ is interpreted as $\lim _{\sigma \lambda \rightarrow 0} 1/(\sigma |\lambda |)=\infty $, and $F(1/(\sigma |\lambda |))$ is taken as 1.
By convention, $y^{-\infty }=\infty $ if $0<y<1$, $=1$ if $y=1$, and $=0$ if $y>1$; $y^{\infty }=0$ if $0<y<1$, $=1$ if $y=1$, and $=\infty $ if $y>1$.
The tail indices were obtained using Maple 13; see http://www.maplesoft.com. The tail index for the log-power exponential distribution with $\tau >1$ was obtained for $\tau \in \mathbb {Q}$, and for the slash distribution, for $q \in \mathbb {N}^{*}$.
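
The ratio construction $Z/U^{1/q}$ in the first note (the slash distribution mentioned in the last note) can be simulated directly. The following is a minimal Python sketch, not the authors' code; the function names `sample_slash` and `tail_fraction` and the default seed are illustrative assumptions. It uses only the standard library.

```python
import math
import random


def sample_slash(n, q, seed=0):
    """Draw n values of Z / U**(1/q) with Z ~ N(0, 1) and U uniform, independent.

    Hypothetical helper for illustration; q must be positive.
    """
    if q <= 0:
        raise ValueError("q must be positive")
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        u = 1.0 - rng.random()  # lies in (0, 1], so the division is safe
        draws.append(z / u ** (1.0 / q))
    return draws


def tail_fraction(values, cutoff):
    """Share of values whose absolute value exceeds cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)
```

Comparing `tail_fraction(sample_slash(20000, 1.0), 3.0)` with the same quantity for a large `q` gives a quick feel for how small values of $q$ produce heavier tails, since $U^{1/q}$ approaches 1 as $q$ grows.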


Acknowledgements

We thank José Eduardo Corrente for providing the data used in this study, and Eliane C. Pinheiro for helpful discussions. We are grateful to the Associate Editor and two anonymous referees for constructive comments and suggestions. Funding was provided by Conselho Nacional de Desenvolvimento Científico e Tecnológico (Grant No. 304388-2014-9), Fundação de Amparo à Pesquisa do Estado de São Paulo (Grant No. 2012/21788-2), and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior.

Author information

Authors and Affiliations

Department of Statistics, University of São Paulo, São Paulo, Brazil
Silvia L. P. Ferrari
Department of Exact Sciences, ESALQ, University of São Paulo, Piracicaba, Brazil
Giovana Fumes

Corresponding author

Correspondence to Silvia L. P. Ferrari.

Appendix

In this Appendix, we give the first and second derivatives of the log-likelihood function with respect to the parameters. Let $z=h(y;\mu ,\sigma ,\lambda )$, where $h(y;\mu ,\sigma ,\lambda )$ is given in (1), $\varpi =-2r'(z^2)/r(z^2)$, and $\xi =r((\sigma \lambda )^{-2}) / R((\sigma |\lambda |)^{-1}).$ We have

$$\begin{aligned} \displaystyle \frac{\partial z}{\partial \mu }= & {} -\frac{1}{\mu \sigma } \left( \frac{y}{\mu }\right) ^\lambda \xrightarrow [\lambda \rightarrow 0]{} -\frac{1}{\mu \sigma }, \\ \displaystyle \frac{\partial z}{\partial \lambda }= & {} \frac{1}{\sigma \lambda ^2} \left\{ 1+ \left( \frac{y}{\mu }\right) ^{\lambda } \left[ -1 + \lambda \log \left( \frac{y}{\mu }\right) \right] \right\} \xrightarrow [\lambda \rightarrow 0]{} \frac{1}{2 \sigma }\left[ \log \left( \frac{y}{\mu }\right) \right] ^2,\\ \displaystyle \frac{\partial ^2 z}{\partial \mu ^2}= & {} \frac{(\lambda +1)}{\mu ^2 \sigma } \left( \frac{y}{\mu }\right) ^\lambda \xrightarrow [\lambda \rightarrow 0]{} \frac{1}{\mu ^2 \sigma },\\ \displaystyle \frac{\partial ^2 z}{\partial \lambda ^2}= & {} \frac{1}{\sigma \lambda ^3}\left\{ -2+ \left( \frac{y}{\mu }\right) ^{\lambda } \left[ 2- 2 \lambda \log \left( \frac{y}{\mu }\right) +\lambda ^2 \left( \log \left( \frac{y}{\mu }\right) \right) ^2 \right] \right\} \\&\xrightarrow [\lambda \rightarrow 0]{} \frac{1}{3 \sigma } \left[ \log \left( \frac{y}{\mu } \right) \right] ^3,\\ \displaystyle \frac{\partial ^2 z}{\partial \mu \partial \lambda }= & {} -\frac{1}{\mu \sigma } \left( \frac{y}{\mu }\right) ^\lambda \log \left( \frac{y}{\mu }\right) \xrightarrow [\lambda \rightarrow 0]{} -\frac{1}{\mu \sigma }\log \left( \frac{y}{\mu }\right) . \end{aligned}$$
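
As a rough illustration, the derivatives of $z$ above can be coded as follows. This sketch assumes that $h$ in (1), which is not reproduced in this excerpt, is the usual Box–Cox form $z=\{(y/\mu )^{\lambda }-1\}/(\sigma \lambda )$ for $\lambda \ne 0$ and $z=\log (y/\mu )/\sigma $ for $\lambda =0$; this form is consistent with the derivatives listed. The function name and the switch-over threshold `eps` are hypothetical choices, not part of the paper.

```python
import math


def z_and_derivatives(y, mu, sigma, lam, eps=1e-4):
    """Return z and its derivatives with respect to mu and lambda as a dict.

    eps is an illustrative threshold: for |lam| below it, the closed forms in
    lambda lose digits to cancellation, so the lambda -> 0 limits are used.
    """
    t = math.log(y / mu)
    p = math.exp(lam * t)  # equals (y / mu) ** lam
    if lam == 0.0:
        z = t / sigma
    else:
        z = math.expm1(lam * t) / (sigma * lam)
    if abs(lam) < eps:
        dz_dlam = t ** 2 / (2.0 * sigma)
        d2z_dlam2 = t ** 3 / (3.0 * sigma)
    else:
        dz_dlam = (1.0 + p * (lam * t - 1.0)) / (sigma * lam ** 2)
        d2z_dlam2 = (-2.0 + p * (2.0 - 2.0 * lam * t + (lam * t) ** 2)) / (
            sigma * lam ** 3
        )
    return {
        "z": z,
        "dz_dmu": -p / (mu * sigma),
        "dz_dlam": dz_dlam,
        "d2z_dmu2": (lam + 1.0) * p / (mu ** 2 * sigma),
        "d2z_dlam2": d2z_dlam2,
        "d2z_dmudlam": -p * t / (mu * sigma),
    }
```

Using the limiting expressions only in a small neighbourhood of $\lambda =0$ is a design choice: the closed forms divide a nearly cancelled numerator by $\lambda ^2$ or $\lambda ^3$, so they become numerically unreliable before $\lambda $ reaches zero exactly.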

Let $\ell $ denote the log-likelihood for a single observation y. We have

$$\begin{aligned} \ell = (\lambda -1) \log y - \lambda \log \mu - \log \sigma +\log r(z^2)- \log R\left( \frac{1}{\sigma |\lambda |}\right) , \end{aligned}$$

if $\lambda \ne 0$; the last term in $\ell $ is zero if $\lambda =0$. The first derivatives of $\ell $ are given by

$$\begin{aligned}&\frac{\partial \ell }{\partial \mu }= -\frac{\lambda }{\mu }-\varpi z\frac{\partial z}{\partial \mu },\\&\displaystyle \frac{\partial \ell }{\partial \sigma }=\left\{ \begin{array}{l@{\quad }l} \displaystyle {-\frac{1}{\sigma }+ \frac{\varpi z^2}{\sigma }+ \frac{\xi }{\sigma ^2 |\lambda |}}, &{} \quad {\hbox {if} \quad \lambda \ne 0}, \\ \displaystyle {-\frac{1}{\sigma }+ \frac{\varpi z^2}{\sigma }}, &{} \quad { \hbox {if} \quad \lambda = 0,} \end{array} \right. \\&\frac{\partial \ell }{\partial \lambda }= \log \left( \frac{y}{\mu }\right) -\varpi z\frac{\partial z}{\partial \lambda }+ \hbox {sign}(\lambda )\frac{ \xi }{\sigma \lambda ^2}. \end{aligned}$$
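
To make the log-likelihood and score expressions concrete, the sketch below evaluates them for one special case, the Box–Cox Cole-Green (Box–Cox normal) model. It assumes the standard normal kernel $r(u)=(2\pi )^{-1/2}e^{-u/2}$ with $R=\Phi $, so that $\varpi =1$ and $\xi =\phi (a)/\Phi (a)$ with $a=1/(\sigma |\lambda |)$, and it assumes the usual Box–Cox form of $h$ in (1). These choices and all function names are illustrative assumptions rather than the authors' implementation.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / SQRT_2PI


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def loglik(y, mu, sigma, lam):
    """Log-likelihood of one observation, normal kernel (assumed special case)."""
    t = math.log(y / mu)
    if lam == 0.0:
        z, log_R = t / sigma, 0.0  # the truncation term vanishes at lambda = 0
    else:
        z = math.expm1(lam * t) / (sigma * lam)
        log_R = math.log(norm_cdf(1.0 / (sigma * abs(lam))))
    log_r = -0.5 * z * z - math.log(SQRT_2PI)
    return ((lam - 1.0) * math.log(y) - lam * math.log(mu)
            - math.log(sigma) + log_r - log_R)


def score(y, mu, sigma, lam):
    """First derivatives of loglik with respect to (mu, sigma, lambda)."""
    t = math.log(y / mu)
    p = math.exp(lam * t)  # equals (y / mu) ** lam
    varpi = 1.0  # weighting function for the normal kernel
    dz_dmu = -p / (mu * sigma)
    if lam == 0.0:
        z = t / sigma
        dz_dlam = t * t / (2.0 * sigma)
        trunc_sigma = trunc_lam = 0.0  # xi terms drop out at lambda = 0
    else:
        z = math.expm1(lam * t) / (sigma * lam)
        dz_dlam = (1.0 + p * (lam * t - 1.0)) / (sigma * lam ** 2)
        a = 1.0 / (sigma * abs(lam))
        xi = norm_pdf(a) / norm_cdf(a)
        trunc_sigma = xi / (sigma ** 2 * abs(lam))
        trunc_lam = math.copysign(1.0, lam) * xi / (sigma * lam ** 2)
    dl_dmu = -lam / mu - varpi * z * dz_dmu
    dl_dsigma = -1.0 / sigma + varpi * z * z / sigma + trunc_sigma
    dl_dlam = t - varpi * z * dz_dlam + trunc_lam
    return dl_dmu, dl_dsigma, dl_dlam
```

A simple sanity check is to compare `score` with central finite differences of `loglik`, and to confirm numerically that `exp(loglik)` integrates to approximately 1 over $y>0$. For values of $\lambda $ very close to, but not equal to, zero the closed form for $\partial z/\partial \lambda $ is prone to cancellation, so a production version would switch to the limiting expression there.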

The second derivatives of $\ell $ are given by

$$\begin{aligned}&\frac{\partial ^2 \ell }{\partial \mu ^2}= \frac{\lambda }{\mu ^2}-\left( z \frac{\hbox {d}\varpi }{\hbox {d}z} +\varpi \right) \left( \frac{\partial z}{\partial \mu }\right) ^2-\varpi z \frac{\partial ^2 z}{\partial \mu ^2},\\&\displaystyle \frac{\partial ^2 \ell }{\partial \sigma ^2}=\left\{ \begin{array}{l@{\quad }l} \displaystyle {\frac{1}{\sigma ^2}- \frac{z^3}{\sigma ^2} \frac{\hbox {d}\varpi }{\hbox {d}z}- \frac{3 \varpi z^2}{\sigma ^2}+\frac{1}{\sigma ^2 |\lambda |} \frac{\partial \xi }{\partial \sigma }-\frac{2 \xi }{\sigma ^3 |\lambda |}}, &{} {\hbox {if} \quad \lambda \ne 0}, \\ \displaystyle {\frac{1}{\sigma ^2}- \frac{z^3}{\sigma ^2} \frac{\hbox {d}\varpi }{\hbox {d}z}- \frac{3 \varpi z^2}{\sigma ^2}}, &{} { \hbox {if} \quad \lambda = 0,} \end{array} \right. \\&\frac{\partial ^2 \ell }{\partial \lambda ^2}= -\left( z \frac{\hbox {d}\varpi }{\hbox {d}z} +\varpi \right) \left( \frac{\partial z}{\partial \lambda }\right) ^2-\varpi z \frac{\partial ^2 z}{\partial \lambda ^2}+\hbox {sign}(\lambda )\left( \frac{1}{\sigma \lambda ^2} \frac{\partial \xi }{\partial \lambda }-\frac{2 \xi }{\sigma \lambda ^3}\right) ,\\&\frac{\partial ^2 \ell }{\partial \mu \partial \sigma }=\frac{z}{\sigma } \frac{\partial z}{\partial \mu } \left( z \frac{\hbox {d}\varpi }{\hbox {d}z} + 2 \varpi \right) ,\\&\frac{\partial ^2 \ell }{\partial \mu \partial \lambda }= -\frac{1}{\mu }-\left( z \frac{\hbox {d}\varpi }{\hbox {d}z} +\varpi \right) \frac{\partial z}{\partial \mu }\frac{\partial z}{\partial \lambda }-\varpi z \frac{\partial ^2 z}{\partial \mu \partial \lambda },\\&\frac{\partial ^2 \ell }{\partial \sigma \partial \lambda }= \frac{z}{\sigma } \frac{\partial z}{\partial \lambda } \left( z \frac{\hbox {d}\varpi }{\hbox {d}z}+ 2 \varpi \right) + \frac{1}{ \sigma ^2 |\lambda |} \frac{\partial \xi }{\partial \lambda } - \hbox {sign}(\lambda ) \frac{\xi }{\sigma ^2 \lambda ^2}. \end{aligned}$$
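
The second derivatives that involve $\mu $ do not contain $\xi $, which makes them convenient for a numerical cross-check. The sketch below evaluates $\partial ^2 \ell /\partial \mu ^2$, $\partial ^2 \ell /\partial \mu \partial \sigma $ and $\partial ^2 \ell /\partial \mu \partial \lambda $ for the normal kernel, for which $\varpi =1$ and $\hbox {d}\varpi /\hbox {d}z=0$. The kernel choice, the assumed Box–Cox form of $h$ in (1), the restriction to $\lambda \ne 0$, and the function names are assumptions made for illustration only.

```python
import math


def loglik(y, mu, sigma, lam):
    """Log-likelihood of one observation, normal kernel (assumed special case)."""
    t = math.log(y / mu)
    if lam == 0.0:
        z, log_R = t / sigma, 0.0
    else:
        z = math.expm1(lam * t) / (sigma * lam)
        a = 1.0 / (sigma * abs(lam))
        log_R = math.log(0.5 * math.erfc(-a / math.sqrt(2.0)))
    log_r = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return ((lam - 1.0) * math.log(y) - lam * math.log(mu)
            - math.log(sigma) + log_r - log_R)


def mu_second_derivatives(y, mu, sigma, lam):
    """Return (d2l/dmu2, d2l/dmu dsigma, d2l/dmu dlambda) for lam != 0."""
    t = math.log(y / mu)
    p = math.exp(lam * t)  # equals (y / mu) ** lam
    z = math.expm1(lam * t) / (sigma * lam)
    dz_dmu = -p / (mu * sigma)
    dz_dlam = (1.0 + p * (lam * t - 1.0)) / (sigma * lam ** 2)
    d2z_dmu2 = (lam + 1.0) * p / (mu ** 2 * sigma)
    d2z_dmudlam = -p * t / (mu * sigma)
    varpi, dvarpi_dz = 1.0, 0.0  # normal kernel: constant weighting function
    common = z * dvarpi_dz + varpi
    d2_mumu = lam / mu ** 2 - common * dz_dmu ** 2 - varpi * z * d2z_dmu2
    d2_musigma = (z / sigma) * dz_dmu * (z * dvarpi_dz + 2.0 * varpi)
    d2_mulam = -1.0 / mu - common * dz_dmu * dz_dlam - varpi * z * d2z_dmudlam
    return d2_mumu, d2_musigma, d2_mulam
```

Second-order central differences of `loglik` should agree with these values to several digits, which is a cheap way to catch sign or transcription errors before coding the remaining entries of the Hessian.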

The first and second derivatives of $\ell $ are obtained by plugging in the derivatives of $z$ given above.

Note that the first derivatives of $\ell $ depend on the weighting function $\varpi $ ($\varpi $ is given in Table 3 for some distributions). Consequently, $\hbox {d}\varpi /\hbox {d}z$ appears in all the second derivatives of $\ell $. Note that $\partial \ell /\partial \sigma $ and $\partial \ell /\partial \lambda $ involve $\xi $, which in turn depends on the particular distribution in the BCS class and the truncation set. The first derivatives of $\xi $ appear in $\partial ^2 \ell /\partial \sigma ^2$, $\partial ^2 \ell /\partial \lambda ^2$ and $\partial ^2 \ell /\partial \sigma \partial \lambda $. The stability of the terms that involve $\xi $ and its first derivatives around $\lambda =0$ may vary according to different distributions. For instance, they may be unstable for the Box–Cox t distribution with small degrees of freedom parameter. Yet, in a simulation study of the Box–Cox t model with different values of the degrees of freedom parameter, the likelihood ratio test of $\mathrm{H}_0: \lambda =0$ performed well in terms of type I error probability; see Sect. 4.
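
Table 3 is not reproduced in this excerpt, but the weighting function can be derived from $\varpi =-2r'(z^2)/r(z^2)$ for familiar kernels. The sketch below assumes the normal kernel $r(u)\propto e^{-u/2}$, which gives $\varpi =1$, and the Student-t kernel $r(u)\propto (1+u/\tau )^{-(\tau +1)/2}$, which gives $\varpi =(\tau +1)/(\tau +z^2)$. These are standard results, but whether they match the parameterization of Table 3 is an assumption. The function names are illustrative.

```python
import math


def varpi_normal(z):
    """Weighting function for the normal kernel: constant in z."""
    return 1.0


def varpi_t(z, tau):
    """Weighting function for the Student-t kernel with tau degrees of freedom."""
    return (tau + 1.0) / (tau + z * z)


def dvarpi_t_dz(z, tau):
    """Derivative of varpi_t with respect to z, as needed in the second derivatives."""
    return -2.0 * z * (tau + 1.0) / (tau + z * z) ** 2


def varpi_from_log_r(log_r, z, h=1e-6):
    """Numerically evaluate -2 r'(u) / r(u) at u = z**2 from a log-kernel log_r(u)."""
    u = z * z
    return -2.0 * (log_r(u + h) - log_r(u - h)) / (2.0 * h)
```

For the t kernel, $\varpi $ decreases as $|z|$ grows, so observations with large $|z|$ receive less weight in the score equations; for example, `varpi_t(10.0, 4.0)` equals $5/104$, whereas the normal kernel weights every observation by 1. This is one way to see the flexibility to handle outliers mentioned in the abstract.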

About this article

Cite this article

Ferrari, S.L.P., Fumes, G. Box–Cox symmetric distributions and applications to nutritional data. AStA Adv Stat Anal 101, 321–344 (2017). https://doi.org/10.1007/s10182-017-0291-6

Received: 08 March 2016
Accepted: 02 February 2017
Published: 13 February 2017
Issue Date: July 2017
DOI: https://doi.org/10.1007/s10182-017-0291-6
