Box–Cox symmetric distributions and applications to nutritional data

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

We introduce and study the Box–Cox symmetric class of distributions, which is useful for modeling positively skewed, possibly heavy-tailed, data. The new class includes the Box–Cox t, Box–Cox Cole–Green (or Box–Cox normal), and Box–Cox power exponential distributions, as well as the class of log-symmetric distributions, as special cases. Its parameters are easy to interpret, which makes it convenient for regression modeling. Additionally, it is flexible enough to handle outliers. The usefulness of the Box–Cox symmetric models is illustrated in a series of applications to nutritional data.

Notes

  1. It is the distribution of \(Z/U^{1/q}\), where \(q>0\) and Z and U are independent random variables with standard normal and uniform distribution, respectively.

  2. If \(\sigma |{{\lambda }}|=0\), \(1/\sigma |{{\lambda }}|\) is interpreted as \(\lim _{\sigma {{\lambda }} \rightarrow 0}{( 1/\sigma |{{\lambda }}| )}=\infty \) and \(F ( 1/\sigma |{{\lambda }}|)\) is taken as 1.

  3. \(y^{-\infty }=\infty \), if \(0<y<1\), \(=1\), if \( y=1\), \(=0\), if \(y>1\); \(y^{\infty }=0\), if \(0<y<1\), \(=1\), if \( y=1\), \(=\infty \), if \(y>1\).

  4. The tail indices were obtained using Maple 13; see http://www.maplesoft.com. The tail index for the log-power exponential distribution with \(\tau >1\) was obtained for \(\tau \in \mathbb {Q}\), and for the slash distribution, for \(q \in \mathbb {N}^{*}\).
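The construction in Note 1 can be turned into a small sampler. The following is a minimal sketch, assuming U is uniform on (0, 1) (the usual convention for the slash distribution); the function name `slash_sample` and its arguments are illustrative and do not come from the paper.

```python
import random


def slash_sample(q, n, seed=None):
    """Draw n values of Z / U**(1/q), with Z ~ N(0, 1) and U ~ Uniform(0, 1).

    Assumption: U is uniform on (0, 1). Intended for moderate q; for q very
    close to zero, U**(1/q) can underflow to 0.0 and the division fails.
    """
    if q <= 0:
        raise ValueError("q must be positive")
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        u = 1.0 - rng.random()  # rng.random() is in [0, 1), so u is in (0, 1]
        draws.append(z / u ** (1.0 / q))
    return draws


if __name__ == "__main__":
    sample = slash_sample(q=1.0, n=10_000, seed=123)
    print("largest |draw|:", max(abs(x) for x in sample))
```

Because U never exceeds 1, each draw is at least as large in absolute value as the underlying normal draw, which is where the heavy tails come from; smaller q gives heavier tails.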

Acknowledgements

We thank José Eduardo Corrente for providing the data used in this study, and Eliane C. Pinheiro for helpful discussions. We are grateful to the Associate Editor and two anonymous referees for constructive comments and suggestions. Funding was provided by Conselho Nacional de Desenvolvimento Científico e Tecnológico (Grant No. 304388-2014-9), Fundação de Amparo à Pesquisa do Estado de São Paulo (Grant No. 2012/21788-2), and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior.

Author information

Corresponding author

Correspondence to Silvia L. P. Ferrari.

Appendix

In this appendix, we give the first and second derivatives of the log-likelihood function with respect to the parameters. Let \(z=h(y;\mu ,\sigma ,\lambda )\), where \(h(y;\mu ,\sigma ,\lambda )\) is given in (1), \(\varpi =-2r'(z^2)/r(z^2)\), and \(\xi =r((\sigma \lambda )^{-2}) / R((\sigma |\lambda |)^{-1}).\) We have

$$\begin{aligned} \displaystyle \frac{\partial z}{\partial \mu }= & {} -\frac{1}{\mu \sigma } \left( \frac{y}{\mu }\right) ^\lambda \xrightarrow [\lambda \rightarrow 0]{} -\frac{1}{\mu \sigma }, \\ \displaystyle \frac{\partial z}{\partial \lambda }= & {} \frac{1}{\sigma \lambda ^2} \left\{ 1+ \left( \frac{y}{\mu }\right) ^{\lambda } \left[ -1 + \lambda \log \left( \frac{y}{\mu }\right) \right] \right\} \xrightarrow [\lambda \rightarrow 0]{} \frac{1}{2 \sigma }\left[ \log \left( \frac{y}{\mu }\right) \right] ^2,\\ \displaystyle \frac{\partial ^2 z}{\partial \mu ^2}= & {} \frac{(\lambda +1)}{\mu ^2 \sigma } \left( \frac{y}{\mu }\right) ^\lambda \xrightarrow [\lambda \rightarrow 0]{} \frac{1}{\mu ^2 \sigma },\\ \displaystyle \frac{\partial ^2 z}{\partial \lambda ^2}= & {} \frac{1}{\sigma \lambda ^3}\left\{ -2+ \left( \frac{y}{\mu }\right) ^{\lambda } \left[ 2- 2 \lambda \log \left( \frac{y}{\mu }\right) +\lambda ^2 \left( \log \left( \frac{y}{\mu }\right) \right) ^2 \right] \right\} \\&\xrightarrow [\lambda \rightarrow 0]{} \frac{1}{3 \sigma } \left[ \log \left( \frac{y}{\mu } \right) \right] ^3,\\ \displaystyle \frac{\partial ^2 z}{\partial \mu \partial \lambda }= & {} -\frac{1}{\mu \sigma } \left( \frac{y}{\mu }\right) ^\lambda \log \left( \frac{y}{\mu }\right) \xrightarrow [\lambda \rightarrow 0]{} -\frac{1}{\mu \sigma }\log \left( \frac{y}{\mu }\right) . \end{aligned}$$
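The derivatives of z and their limits as \(\lambda \rightarrow 0\) can be checked numerically. The sketch below assumes the usual Box–Cox form \(z=\{(y/\mu )^\lambda -1\}/(\sigma \lambda )\) for \(\lambda \ne 0\) and \(z=\log (y/\mu )/\sigma \) for \(\lambda =0\), which is consistent with the derivatives listed above; Eq. (1) itself is not reproduced here, and all function names are illustrative.

```python
import math


def z_value(y, mu, sigma, lam):
    """Assumed Box-Cox form of z, with the lambda = 0 case handled separately."""
    w = math.log(y / mu)
    if lam == 0.0:
        return w / sigma
    # expm1 keeps precision when lam * w is close to zero
    return math.expm1(lam * w) / (sigma * lam)


def dz_dmu(y, mu, sigma, lam):
    """First derivative of z with respect to mu (valid for any lambda)."""
    return -((y / mu) ** lam) / (mu * sigma)


def dz_dlam(y, mu, sigma, lam):
    """First derivative of z with respect to lambda, using the limit at 0."""
    w = math.log(y / mu)
    if lam == 0.0:
        return w * w / (2.0 * sigma)
    p = (y / mu) ** lam
    return (1.0 + p * (lam * w - 1.0)) / (sigma * lam ** 2)


def central_diff(f, x, h=1e-6):
    """Two-sided finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    y, mu, sigma = 2.3, 1.7, 0.4
    for lam in (0.6, 0.0):
        numeric = central_diff(lambda l: z_value(y, mu, sigma, l), lam)
        print(lam, dz_dlam(y, mu, sigma, lam), numeric)
```

Comparing each closed-form derivative with a central difference, including at \(\lambda =0\), is a cheap way to catch sign or limit errors before the expressions are used in an optimizer.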

Let \(\ell \) denote the log-likelihood for a single observation y. We have

$$\begin{aligned} \ell = (\lambda -1) \log y - \lambda \log \mu - \log \sigma +\log r(z^2)- \log R\left( \frac{1}{\sigma |\lambda |}\right) , \end{aligned}$$

if \(\lambda \ne 0\); the last term in \(\ell \) is zero if \(\lambda =0\). The first derivatives of \(\ell \) are given by

$$\begin{aligned}&\frac{\partial \ell }{\partial \mu }= -\frac{\lambda }{\mu }-\varpi z\frac{\partial z}{\partial \mu },\\&\displaystyle \frac{\partial \ell }{\partial \sigma }=\left\{ \begin{array}{l@{\quad }l} \displaystyle {-\frac{1}{\sigma }+ \frac{\varpi z^2}{\sigma }+ \frac{\xi }{\sigma ^2 |\lambda |}}, &{} \quad {\hbox {if} \quad \lambda \ne 0}, \\ \displaystyle {-\frac{1}{\sigma }+ \frac{\varpi z^2}{\sigma }}, &{} \quad { \hbox {if} \quad \lambda = 0,} \end{array} \right. \\&\frac{\partial \ell }{\partial \lambda }= \log \left( \frac{y}{\mu }\right) -\varpi z\frac{\partial z}{\partial \lambda }+ \hbox {sign}(\lambda )\frac{ \xi }{\sigma \lambda ^2}. \end{aligned}$$
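To make the score expressions concrete, the sketch below evaluates them for one member of the class. It assumes the normal generating density, \(r(u)=(2\pi )^{-1/2}\exp (-u/2)\), for which \(\varpi =1\) and R is the standard normal distribution function, and it assumes the usual Box–Cox form of z for \(\lambda \ne 0\). The function names are illustrative, and the finite-difference gradient is included only as a cross-check.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def loglik(y, mu, sigma, lam):
    """Log-likelihood of one observation; normal generating density, lam != 0."""
    z = ((y / mu) ** lam - 1.0) / (sigma * lam)
    log_r = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    log_R = math.log(norm_cdf(1.0 / (sigma * abs(lam))))
    return ((lam - 1.0) * math.log(y) - lam * math.log(mu)
            - math.log(sigma) + log_r - log_R)


def score(y, mu, sigma, lam):
    """Analytic first derivatives (mu, sigma, lambda) for lam != 0."""
    w = math.log(y / mu)
    p = (y / mu) ** lam
    z = (p - 1.0) / (sigma * lam)
    varpi = 1.0  # -2 r'(u) / r(u) for the normal generating density
    u = 1.0 / (sigma * abs(lam))
    xi = norm_pdf(u) / norm_cdf(u)
    dz_dmu = -p / (mu * sigma)
    dz_dlam = (1.0 + p * (lam * w - 1.0)) / (sigma * lam ** 2)
    d_mu = -lam / mu - varpi * z * dz_dmu
    d_sigma = -1.0 / sigma + varpi * z * z / sigma + xi / (sigma ** 2 * abs(lam))
    d_lam = w - varpi * z * dz_dlam + math.copysign(1.0, lam) * xi / (sigma * lam ** 2)
    return d_mu, d_sigma, d_lam


def numeric_score(y, mu, sigma, lam, h=1e-6):
    """Central-difference gradient of loglik, used only as a check."""
    g_mu = (loglik(y, mu + h, sigma, lam) - loglik(y, mu - h, sigma, lam)) / (2 * h)
    g_sigma = (loglik(y, mu, sigma + h, lam) - loglik(y, mu, sigma - h, lam)) / (2 * h)
    g_lam = (loglik(y, mu, sigma, lam + h) - loglik(y, mu, sigma, lam - h)) / (2 * h)
    return g_mu, g_sigma, g_lam


if __name__ == "__main__":
    print(score(2.0, 1.5, 0.8, 1.5))
    print(numeric_score(2.0, 1.5, 0.8, 1.5))
```

The parameter values in the usage lines are arbitrary; they are chosen so that \(1/(\sigma |\lambda |)\) is small enough for the truncation term \(\xi \) to contribute visibly to the score.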

The second derivatives of \(\ell \) are given by

$$\begin{aligned}&\frac{\partial ^2 \ell }{\partial \mu ^2}= \frac{\lambda }{\mu ^2}-\left( z \frac{\hbox {d}\varpi }{\hbox {d}z} +\varpi \right) \left( \frac{\partial z}{\partial \mu }\right) ^2-\varpi z \frac{\partial ^2 z}{\partial \mu ^2},\\&\displaystyle \frac{\partial ^2 \ell }{\partial \sigma ^2}=\left\{ \begin{array}{l@{\quad }l} \displaystyle {\frac{1}{\sigma ^2}- \frac{z^3}{\sigma ^2} \frac{\hbox {d}\varpi }{\hbox {d}z}- \frac{3 \varpi z^2}{\sigma ^2}+\frac{1}{\sigma ^2 |\lambda |} \frac{\partial \xi }{\partial \sigma }-\frac{2 \xi }{\sigma ^3 |\lambda |}}, &{} {\hbox {if} \quad \lambda \ne 0}, \\ \displaystyle {\frac{1}{\sigma ^2}- \frac{z^3}{\sigma ^2} \frac{\hbox {d}\varpi }{\hbox {d}z}- \frac{3 \varpi z^2}{\sigma ^2}}, &{} { \hbox {if} \quad \lambda = 0,} \end{array} \right. \\&\frac{\partial ^2 \ell }{\partial \lambda ^2}= -\left( z \frac{\hbox {d}\varpi }{\hbox {d}z} +\varpi \right) \left( \frac{\partial z}{\partial \lambda }\right) ^2-\varpi z \frac{\partial ^2 z}{\partial \lambda ^2}+\hbox {sign}(\lambda )\left( \frac{1}{\sigma \lambda ^2} \frac{\partial \xi }{\partial \lambda }-\frac{2 \xi }{\sigma \lambda ^3}\right) ,\\&\frac{\partial ^2 \ell }{\partial \mu \partial \sigma }=\frac{z}{\sigma } \frac{\partial z}{\partial \mu } \left( z \frac{\hbox {d}\varpi }{\hbox {d}z} + 2 \varpi \right) ,\\&\frac{\partial ^2 \ell }{\partial \mu \partial \lambda }= -\frac{1}{\mu }-\left( z \frac{\hbox {d}\varpi }{\hbox {d}z} +\varpi \right) \frac{\partial z}{\partial \mu }\frac{\partial z}{\partial \lambda }-\varpi z \frac{\partial ^2 z}{\partial \mu \partial \lambda },\\&\frac{\partial ^2 \ell }{\partial \sigma \partial \lambda }= \frac{z}{\sigma } \frac{\partial z}{\partial \lambda } \left( z \frac{\hbox {d}\varpi }{\hbox {d}z}+ 2 \varpi \right) + \frac{1}{ \sigma ^2 |\lambda |} \frac{\partial \xi }{\partial \lambda } - \hbox {sign}(\lambda ) \frac{\xi }{\sigma ^2 \lambda ^2}. \end{aligned}$$
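As a worked special case, derived here from the definition \(\varpi =-2r'(z^2)/r(z^2)\) rather than quoted from the paper, assume the Student-t generating density with \(\nu \) degrees of freedom, \(r(u)\propto (1+u/\nu )^{-(\nu +1)/2}\). The factors involving \(\varpi \) in the second derivatives then become

$$\begin{aligned} \varpi&= \frac{\nu +1}{\nu +z^2}, \qquad \frac{\hbox {d}\varpi }{\hbox {d}z} = -\frac{2(\nu +1)z}{(\nu +z^2)^2},\\ z \frac{\hbox {d}\varpi }{\hbox {d}z}+\varpi&= \frac{(\nu +1)(\nu -z^2)}{(\nu +z^2)^2}, \qquad z \frac{\hbox {d}\varpi }{\hbox {d}z}+2\varpi = \frac{2\nu (\nu +1)}{(\nu +z^2)^2}. \end{aligned}$$

For the normal generating density, \(r(u)\propto \exp (-u/2)\), so \(\varpi =1\) and \(\hbox {d}\varpi /\hbox {d}z=0\), and the two combined factors reduce to 1 and 2, respectively.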

The first and second derivatives of \(\ell \) are obtained by plugging in the derivatives of z given above.

Note that the first derivatives of \(\ell \) depend on the weighting function \(\varpi \) (\(\varpi \) is given in Table 3 for some distributions). Consequently, \(\hbox {d}\varpi /\hbox {d}z\) appears in all the second derivatives of \(\ell \). Note also that \(\partial \ell /\partial \sigma \) and \(\partial \ell /\partial \lambda \) involve \(\xi \), which in turn depends on the particular distribution in the BCS class and the truncation set. The first derivatives of \(\xi \) appear in \(\partial ^2 \ell /\partial \sigma ^2\), \(\partial ^2 \ell /\partial \lambda ^2\) and \(\partial ^2 \ell /\partial \sigma \partial \lambda \). The stability of the terms that involve \(\xi \) and its first derivatives around \(\lambda =0\) may vary from one distribution to another. For instance, they may be unstable for the Box–Cox t distribution with a small degrees of freedom parameter. Even so, in a simulation study with different values of the degrees of freedom parameter, the likelihood ratio test of \(\mathrm{H}_0: \lambda =0\) in the Box–Cox t model performed well in terms of type I error probability; see Sect. 4.
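One way to look at the behavior around \(\lambda =0\) is to evaluate the \(\xi \) term of \(\partial \ell /\partial \lambda \), namely \(\hbox {sign}(\lambda )\,\xi /(\sigma \lambda ^2)\), for small \(|\lambda |\). The sketch below compares two generating densities chosen for illustration: the normal, and the Student-t with one degree of freedom (the Cauchy), whose density and distribution function have closed forms. It covers only this single term, not the derivatives of \(\xi \), and the function names are illustrative.

```python
import math


def xi_normal(u):
    """xi = r(u^2) / R(u) for the normal generating density."""
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return pdf / cdf


def xi_t1(u):
    """xi = r(u^2) / R(u) for the Student-t density with 1 degree of freedom."""
    pdf = 1.0 / (math.pi * (1.0 + u * u))
    cdf = 0.5 + math.atan(u) / math.pi
    return pdf / cdf


def lambda_score_term(xi, sigma, lam):
    """The term sign(lambda) * xi / (sigma * lambda^2), for lambda != 0."""
    u = 1.0 / (sigma * abs(lam))
    return math.copysign(1.0, lam) * xi(u) / (sigma * lam ** 2)


if __name__ == "__main__":
    sigma = 0.5
    for lam in (1e-1, 1e-2, 1e-3, -1e-3):
        print(lam,
              lambda_score_term(xi_normal, sigma, lam),
              lambda_score_term(xi_t1, sigma, lam))
```

Under these assumptions the normal term vanishes as \(\lambda \rightarrow 0\), because the normal density decays faster than \(\lambda ^2\). The t term with one degree of freedom instead approaches \(\sigma /\pi \) from the right and \(-\sigma /\pi \) from the left, so it does not match the value zero used at \(\lambda =0\).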

Cite this article

Ferrari, S.L.P., Fumes, G. Box–Cox symmetric distributions and applications to nutritional data. AStA Adv Stat Anal 101, 321–344 (2017). https://doi.org/10.1007/s10182-017-0291-6
