Abstract
We propose a copula-based approach for analyzing functional data with correlated multiple functional outcomes exhibiting heterogeneous shape characteristics. To accommodate the possibly large number of parameters due to having several functional outcomes, parameter estimation is performed in two steps: first, the parameters for the marginal distributions are estimated using the skew t family, and then the dependence structure both within and across outcomes is estimated using a Gaussian copula. We develop an estimation algorithm for the dependence parameters based on the Karhunen–Loève expansion and an EM algorithm that significantly reduces the dimension of the problem and is computationally efficient. We also demonstrate prediction of an unknown outcome when the other outcomes are known. We apply our methodology to diffusion tensor imaging data for multiple sclerosis (MS) patients with three outcomes and identify differences in both the marginal distributions and the dependence structure between the MS and control groups. Our proposed methodology is quite general and can be applied to other functional data with multiple outcomes in biology and other fields. Supplementary materials accompanying this paper appear online.
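The two-step idea in the abstract (margins first, then a Gaussian copula for the dependence, then prediction of one outcome from another) can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it uses two scalar outcomes rather than functional ones, it replaces the skew t marginal fit with an empirical CDF as a stand-in, and it omits the Karhunen–Loève expansion and the EM step. All function names (`normal_scores`, `fit_copula_correlation`, `predict_median`, `simulate`) and the simulated data are hypothetical and exist only for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def normal_scores(values):
    """Step 1 stand-in: push one outcome through its empirical CDF, then Phi^{-1}.

    Assumption: the paper fits a skew t margin here; the empirical CDF is a
    simpler substitute that needs no optimizer.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))  # stays inside (0, 1)
    return scores


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_copula_correlation(y1, y2):
    """Step 2: Gaussian copula correlation estimated from the normal scores."""
    return pearson(normal_scores(y1), normal_scores(y2))


def predict_median(y1_new, y1_train, y2_train, rho):
    """Conditional median of outcome 2 given outcome 1 under the fitted copula."""
    n = len(y1_train)
    u1 = (sum(v <= y1_new for v in y1_train) + 0.5) / (n + 1)
    z1 = STD_NORMAL.inv_cdf(u1)
    u2 = STD_NORMAL.cdf(rho * z1)  # median of Z2 given Z1 = z1 is rho * z1
    ranked = sorted(y2_train)
    return ranked[min(n - 1, int(u2 * n))]  # empirical quantile of outcome 2


def simulate(n, rho, seed=0):
    """Hypothetical data: correlated Gaussians pushed through monotone transforms."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y1.append(math.exp(z1))  # right-skewed margin
        y2.append(z2 ** 3)       # heavy-tailed margin
    return y1, y2


y1, y2 = simulate(2000, 0.7)
rho_hat = fit_copula_correlation(y1, y2)
print(round(rho_hat, 2))
```

Because the transforms applied to the simulated outcomes are monotone, the dependence recovered from the normal scores should be close to the correlation used in the simulation, even though neither margin is Gaussian. This separation of marginal shape from dependence is what makes the two-step strategy workable.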
Acknowledgements
The authors are very grateful for the constructive comments of the Editor, the Associate Editor, and two reviewers, which were extremely helpful in improving our work. The authors also thank Daniel Reich and Peter Calabresi and their research teams, who were instrumental in collecting the data for this study. Scans were funded by grants from the National Multiple Sclerosis Society and EMD Serono. We are grateful to Ciprian Crainiceanu for providing access to the data and for meaningful discussions and personal communications. Carroll was supported by Grant U01-CA057030 from the National Cancer Institute (NCI). Ruppert was supported by NCI Grant U01-CA057030 and NSF Grant AST-1312903.
Cite this article
Cao, J., Soiaporn, K., Carroll, R.J. et al. Modeling and Prediction of Multiple Correlated Functional Outcomes. JABES 24, 112–129 (2019). https://doi.org/10.1007/s13253-018-00344-0
DOI: https://doi.org/10.1007/s13253-018-00344-0