Annals of Operations Research, Volume 276, Issue 1–2, pp 331–347

Nonparametric additive beta regression for fractional response with application to body fat data

  • Kuangnan Fang (corresponding author)
  • Xinyan Fan
  • Wei Lan
  • Bingquan Wang
S.I.: Computational Biomedicine

Abstract

Fractional data restricted to the standard unit interval (0, 1) and following a highly skewed distribution are commonly encountered in areas such as economics, finance, and medicine. A natural way to model such data is the beta family, which is flexible enough to accommodate a wide variety of density shapes. In this paper, we propose a nonparametric additive beta regression model together with a variable selection procedure. The mean response is related to the covariates through a combination of unknown functions of the covariates, each of which can be approximated by a B-spline basis expansion. This approximation transforms the problem of variable selection into the problem of selecting groups of coefficients in the expansion. Using a penalized likelihood method for group variable selection, we select the significant covariates. We also establish the estimation and selection consistency and the properties of the penalized estimators. Simulation studies demonstrate that the proposed method performs well. Finally, we apply the proposed method to body fat data and obtain several important findings with satisfactory selection and prediction performance.
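The kind of objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the mean/precision parameterization of the beta density, Beta(mu*phi, (1-mu)*phi), as in Ferrari and Cribari-Neto (2004), a logit link, and the SCAD penalty of Fan and Li (2001) applied to the Euclidean norm of each coefficient group. The function names, the precomputed `basis` values, the scaling of the penalty by the sample size, and the default a = 3.7 are all assumptions made for illustration; group-size weights are omitted.

```python
import math

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty (Fan and Li, 2001) evaluated at t >= 0."""
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2

def group_scad(groups, lam, a=3.7):
    """Sum of SCAD penalties on the Euclidean norm of each coefficient group."""
    return sum(scad_penalty(math.sqrt(sum(b * b for b in g)), lam, a)
               for g in groups)

def beta_loglik(y, mu, phi):
    """Log-density of Beta(mu*phi, (1-mu)*phi) at y in (0, 1)."""
    p, q = mu * phi, (1 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(p) - math.lgamma(q)
            + (p - 1) * math.log(y) + (q - 1) * math.log(1 - y))

def penalized_objective(y, basis, groups, intercept, phi, lam):
    """Negative log-likelihood plus n times the group SCAD penalty.

    basis[i][j] holds the (assumed precomputed) spline basis values of
    covariate j for observation i; groups[j] holds the matching
    coefficients, so one group corresponds to one covariate.
    """
    nll = 0.0
    for yi, row in zip(y, basis):
        # Additive predictor: intercept plus one spline expansion per covariate.
        eta = intercept + sum(
            sum(bk * ck for bk, ck in zip(bj, gj))
            for bj, gj in zip(row, groups)
        )
        mu = 1.0 / (1.0 + math.exp(-eta))  # logit link (assumption)
        nll -= beta_loglik(yi, mu, phi)
    return nll + len(y) * group_scad(groups, lam)
```

A covariate is dropped from the model when its entire coefficient group is shrunk to zero, which is how selecting groups of coefficients amounts to selecting variables.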

Keywords

Nonparametric additive beta regression, Fractional data, Variable selection, Group SCAD

Notes

Acknowledgements

This study was supported by the National Natural Science Foundation of China (71471152) and the Fundamental Research Funds for the Central Universities of China (20720181003, 20720171095).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Kuangnan Fang (1, 3), corresponding author
  • Xinyan Fan (1)
  • Wei Lan (2)
  • Bingquan Wang (1)
  1. Department of Statistics, School of Economics, Xiamen University, Xiamen, China
  2. Statistics School and Center of Statistical Research, Southwestern University of Finance and Economics, Xiamen, China
  3. Data Mining Research Center, Xiamen University, Xiamen, China
