Abstract
Functional binary datasets arise frequently in practice, but the discrete nature of the data makes model estimation challenging. In this paper, we propose a sparse logistic functional principal component analysis (SLFPCA) method for functional binary data. SLFPCA seeks locally sparse eigenfunctions, which are easier to interpret. We formulate the problem through a penalized Bernoulli likelihood with both a roughness penalty and a sparseness penalty. We develop an algorithm based on majorization-minimization to optimize the penalized likelihood. The proposed method is implemented in the R package SLFPCA. Theoretical results establish both consistency and sparsistency of the proposed method. Numerical experiments demonstrate the advantages of the SLFPCA approach, and the method is further applied to a physical activity dataset.
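To make the ingredients named in the abstract concrete, the following is a minimal sketch of a majorization-minimization iteration for a Bernoulli likelihood with a roughness penalty and a sparseness penalty. It is an illustration under simplifying assumptions, not the authors' algorithm and not the interface of the SLFPCA R package: it fits a single coefficient vector on a grid (a logistic regression rather than a principal component model), uses squared second differences as the roughness penalty and an L1 norm as the sparseness penalty, and relies on the standard bound that the curvature of the logistic loss is at most 1/4. The names `mm_fit`, `lam`, and `kappa` are hypothetical.

```python
import math
import random


def soft_threshold(z, t):
    """Minimizer of 0.5*(b - z)**2 + t*|b|: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def second_diff(beta):
    """Second differences of beta, a discrete stand-in for a second derivative."""
    return [beta[j] - 2.0 * beta[j + 1] + beta[j + 2] for j in range(len(beta) - 2)]


def objective(beta, X, y, lam, kappa):
    """Bernoulli negative log-likelihood + roughness penalty + L1 penalty."""
    nll = 0.0
    for row, yi in zip(X, y):
        eta = sum(x * b for x, b in zip(row, beta))
        # log(1 + exp(eta)) evaluated without overflow
        nll += max(eta, 0.0) + math.log1p(math.exp(-abs(eta))) - yi * eta
    rough = sum(d * d for d in second_diff(beta))
    return nll + 0.5 * kappa * rough + lam * sum(abs(b) for b in beta)


def smooth_gradient(beta, X, y, kappa):
    """Gradient of the smooth part (likelihood plus roughness penalty)."""
    p = len(beta)
    grad = [0.0] * p
    for row, yi in zip(X, y):
        eta = sum(x * b for x, b in zip(row, beta))
        resid = 0.5 * (1.0 + math.tanh(0.5 * eta)) - yi  # sigmoid(eta) - y
        for j in range(p):
            grad[j] += resid * row[j]
    for j, d in enumerate(second_diff(beta)):
        grad[j] += kappa * d
        grad[j + 1] -= 2.0 * kappa * d
        grad[j + 2] += kappa * d
    return grad


def mm_fit(X, y, lam, kappa, n_iter=200):
    """Hypothetical MM loop: quadratic majorizer, then soft-thresholding."""
    p = len(X[0])
    # Upper bound on the curvature of the smooth part: the logistic Hessian is
    # at most 0.25 * X'X (bounded here by the squared Frobenius norm), and the
    # second-difference penalty matrix has largest eigenvalue at most 16.
    L = 0.25 * sum(x * x for row in X for x in row) + 16.0 * kappa
    beta = [0.0] * p
    history = [objective(beta, X, y, lam, kappa)]
    for _ in range(n_iter):
        grad = smooth_gradient(beta, X, y, kappa)
        # Minimizing the majorizer plus the L1 term has a closed form.
        beta = [soft_threshold(b - g / L, lam / L) for b, g in zip(beta, grad)]
        history.append(objective(beta, X, y, lam, kappa))
    return beta, history


if __name__ == "__main__":
    random.seed(0)
    X = [[random.gauss(0.0, 1.0) for _ in range(10)] for _ in range(100)]
    truth = [0.0, 0.0, 0.0, 1.5, 2.0, 1.5, 0.0, 0.0, 0.0, 0.0]  # locally sparse
    y = [float(random.random() < 0.5 * (1.0 + math.tanh(0.5 * sum(a * b for a, b in zip(r, truth)))))
         for r in X]
    beta, history = mm_fit(X, y, lam=5.0, kappa=0.1)
    print("exact zeros:", sum(b == 0.0 for b in beta), "of", len(beta))
```

Because each update minimizes a surrogate that lies above the penalized objective and touches it at the current iterate, the recorded objective values never increase. The soft-thresholding step is what produces coefficients that are exactly zero, which is the discrete analogue of an estimate vanishing on part of its domain. The curvature bound used here is deliberately crude, so a practical implementation would likely use a tighter one.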
Acknowledgements
This work was supported by Public Health & Disease Control and Prevention, Major Innovation & Planning Interdisciplinary Platform for the “Double-First Class” Initiative, Renmin University of China, and by the Outstanding Innovative Talents Cultivation Funded Programs 2021 of Renmin University of China.
About this article
Cite this article
Zhong, R., Liu, S., Li, H. et al. Sparse logistic functional principal component analysis for binary data. Stat Comput 33, 15 (2023). https://doi.org/10.1007/s11222-022-10190-3
DOI: https://doi.org/10.1007/s11222-022-10190-3