Modeling and Prediction of Multiple Correlated Functional Outcomes

  • Jiguo Cao
  • Kunlaya Soiaporn
  • Raymond J. Carroll
  • David Ruppert


We propose a copula-based approach for analyzing functional data with correlated multiple functional outcomes exhibiting heterogeneous shape characteristics. To accommodate the possibly large number of parameters due to having several functional outcomes, parameter estimation is performed in two steps: first, the parameters for the marginal distributions are estimated using the skew t family, and then the dependence structure both within and across outcomes is estimated using a Gaussian copula. We develop an estimation algorithm for the dependence parameters based on the Karhunen–Loève expansion and an EM algorithm that significantly reduces the dimension of the problem and is computationally efficient. We also demonstrate prediction of an unknown outcome when the other outcomes are known. We apply our methodology to diffusion tensor imaging data for multiple sclerosis (MS) patients with three outcomes and identify differences in both the marginal distributions and the dependence structure between the MS and control groups. Our proposed methodology is quite general and can be applied to other functional data with multiple outcomes in biology and other fields. Supplementary materials accompanying this paper appear online.
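
The two-step idea described above (marginals first, then a Gaussian copula for the dependence, then prediction of one outcome from another) can be illustrated with a minimal sketch. This is not the authors' implementation. It uses two scalar outcomes rather than functional ones, it replaces the skew t marginal fit with an empirical-CDF (rank) transform so that it needs only the Python standard library, and it omits the Karhunen–Loève expansion and the EM algorithm entirely. All function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula scale


def normal_scores(values):
    """Step 1 (stand-in): map a sample to normal scores via its empirical CDF.

    Assumption: the paper fits skew t marginals; ranks are used here only
    to keep the sketch dependency-free.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the pseudo-observation strictly inside (0, 1)
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, rho, seed=0):
    """Toy data: two skewed outcomes linked by a Gaussian copula."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y1.append(math.exp(z1))    # right-skewed marginal
        y2.append(-math.exp(-z2))  # left-skewed marginal
    return y1, y2


def predict_median(y1_new, y1_train, y2_train, rho):
    """Conditional median of outcome 2 given outcome 1 under the copula."""
    n = len(y1_train)
    # Empirical CDF of the new value, clamped so it stays inside (0, 1)
    rank = min(max(sum(v <= y1_new for v in y1_train), 1), n)
    z1 = STD_NORMAL.inv_cdf(rank / (n + 1))
    # On the normal-score scale the median of Z2 given Z1 = z1 is rho * z1
    u2 = STD_NORMAL.cdf(rho * z1)
    # Map back through the empirical quantile function of outcome 2
    idx = min(n - 1, max(0, round(u2 * (n + 1)) - 1))
    return sorted(y2_train)[idx]


y1, y2 = simulate(n=2000, rho=0.7)

# Step 2: copula correlation estimated from the normal scores
rho_hat = pearson(normal_scores(y1), normal_scores(y2))

# Prediction of outcome 2 when outcome 1 is known
pred_high = predict_median(5.0, y1, y2, rho_hat)
pred_low = predict_median(0.2, y1, y2, rho_hat)
```

Because the two marginals are handled separately from the dependence, the copula correlation is recovered even though one outcome is right-skewed and the other left-skewed. In the functional setting the same logic would apply pointwise along each curve, with a much larger correlation structure to estimate.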


Keywords: Diffusion tensor imaging; Gaussian copulas; Multiple sclerosis; Skewed functional data; Tractography data



The authors are very grateful for the constructive comments of the Editor, the Associate Editor and two reviewers, which were extremely helpful in improving this work. The authors also thank Daniel Reich and Peter Calabresi and their research teams, who were instrumental in collecting the data for this study. Scans were funded by grants from the National Multiple Sclerosis Society and EMD Serono. We are grateful to Ciprian Crainiceanu for providing access to the data and for meaningful discussions and personal communications. Carroll was supported by Grant U01-CA057030 from the National Cancer Institute (NCI). Ruppert was supported by NCI Grant U01-CA057030 and NSF Grant AST-1312903.

Supplementary material

Supplementary material 1 (pdf 2089 KB)



Copyright information

© International Biometric Society 2018

Authors and Affiliations

  1. Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada
  2. Capital One, Vienna, USA
  3. Department of Statistics, Texas A&M University, College Station, USA
  4. School of Mathematical and Physical Sciences, University of Technology Sydney, Broadway, Australia
  5. Department of Statistical Science and School of Operations Research and Information Engineering, Cornell University, Ithaca, USA
