Sparse logistic functional principal component analysis for binary data

  • Original Paper
  • Statistics and Computing

Abstract

Functional binary data arise frequently in practice, but their discrete nature makes model estimation challenging. In this paper, we propose a sparse logistic functional principal component analysis (SLFPCA) method for functional binary data. SLFPCA seeks local sparsity in the eigenfunctions to make them easier to interpret. We formulate the problem as a penalized Bernoulli likelihood with both a roughness penalty and a sparseness penalty. We develop an algorithm based on majorization-minimization to optimize the penalized likelihood. The method is implemented in the R package SLFPCA. Theoretical results establish both consistency and sparsistency of the proposed method. Numerical experiments demonstrate the advantages of the SLFPCA approach, and the method is further applied to a physical activity dataset.
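
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of a majorization-minimization (MM) loop for a Bernoulli likelihood with a roughness penalty and a sparseness penalty. It is a toy illustration under stated assumptions, not the authors' algorithm and not the interface of the R package SLFPCA: it estimates a single logit curve on a grid rather than eigenfunctions, and the function names (`mm_fit`, `objective`), the second-difference roughness penalty, the L1 sparseness penalty, and the tuning values `lam` and `kappa` are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    # Logistic function written with tanh, which avoids overflow in exp.
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def soft_threshold(z, t):
    # Minimizer of the quadratic surrogate plus an L1 term; yields exact zeros.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def objective(theta, Y, D, lam, kappa):
    # Negative Bernoulli log-likelihood for logit P(Y_ij = 1) = theta_j,
    # plus roughness penalty kappa * ||D theta||^2 and sparseness penalty lam * ||theta||_1.
    n = Y.shape[0]
    nll = n * np.sum(np.logaddexp(0.0, theta)) - Y.sum(axis=0) @ theta
    return nll + kappa * np.sum((D @ theta) ** 2) + lam * np.sum(np.abs(theta))


def mm_fit(Y, lam, kappa, n_iter=300):
    # Hypothetical helper: MM iterations for the toy objective above.
    n, p = Y.shape
    D = np.diff(np.eye(p), n=2, axis=0)  # (p - 2, p) second-difference matrix
    # The Bernoulli curvature is at most 1/4 per observation, so the smooth part
    # is majorized by a quadratic with this constant curvature bound.
    L = n / 4.0 + 2.0 * kappa * np.linalg.norm(D, 2) ** 2
    s = Y.sum(axis=0)
    theta = np.zeros(p)
    history = [objective(theta, Y, D, lam, kappa)]
    for _ in range(n_iter):
        grad = n * sigmoid(theta) - s + 2.0 * kappa * (D.T @ (D @ theta))
        # Minimizing the surrogate exactly gives a soft-thresholded gradient step.
        theta = soft_threshold(theta - grad / L, lam / L)
        history.append(objective(theta, Y, D, lam, kappa))
    return theta, np.array(history)


# Simulated data (an assumption for illustration): the true logit curve is
# zero on the first half of the domain and a positive bump on the second half.
rng = np.random.default_rng(0)
n, p = 200, 50
t = np.linspace(0.0, 1.0, p)
true_theta = np.where(t < 0.5, 0.0, 3.0 * np.sin(2.0 * np.pi * (t - 0.5)))
Y = (rng.random((n, p)) < sigmoid(true_theta)).astype(float)

theta_hat, history = mm_fit(Y, lam=20.0, kappa=5.0)
```

Because each update minimizes a surrogate that lies above the penalized objective and touches it at the current iterate, the values in `history` should not increase from one iteration to the next. That monotone descent is the defining property of an MM scheme.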

Acknowledgements

This work was supported by Public Health & Disease Control and Prevention, Major Innovation & Planning Interdisciplinary Platform for the “Double-First Class” Initiative, Renmin University of China, and by the Outstanding Innovative Talents Cultivation Funded Programs 2021 of Renmin University of China.

Author information

Corresponding author

Correspondence to Jingxiao Zhang.

About this article

Cite this article

Zhong, R., Liu, S., Li, H. et al. Sparse logistic functional principal component analysis for binary data. Stat Comput 33, 15 (2023). https://doi.org/10.1007/s11222-022-10190-3

  • DOI: https://doi.org/10.1007/s11222-022-10190-3
