Association Study of Alzheimer’s Disease with Tree-Guided Sparse Canonical Correlation Analysis

  • Shangchen Zhou
  • Shuai Yuan
  • Zhizhuo Zhang
  • Zenglin Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11307)


We consider the problem of finding sparse associations between two sources of data, for example between genetic variations (e.g., single nucleotide polymorphisms, SNPs) and phenotypical features (e.g., features from magnetic resonance imaging, MRI) in the study of Alzheimer’s disease (AD). Despite the success of Canonical Correlation Analysis (CCA) and its sparse variants in a number of applications, these methods usually neglect the natural tree structures underlying SNP and MRI data. Specifically, the whole candidate set, the genes, and the SNPs of each gene form a path in the tree structure of the SNP data, while the whole image, the regions of the image, and the features of each region form a path in the tree structure of the MRI data. To model the tree structure of features in both sources of data, we propose Tree-guided Sparse Canonical Correlation Analysis (TSCCA). The proposed model equips CCA with mixed-norm regularization terms that capture the underlying multilevel tree structures among both the inputs and outputs. To solve the resulting optimization problem, we introduce an efficient iterative algorithm for TSCCA that rewrites the tree-structured regularization in the common form of an overlapping group lasso. We evaluate the proposed model in a simulation study and in a real-world study on Alzheimer’s disease. Results of the simulation study show that the proposed method outperforms CCA with Lasso and group Lasso. The real-world study on Alzheimer’s disease shows that our model can find biologically meaningful associations between SNPs and MRI features.
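As a rough illustration of the rewriting step mentioned above, a tree-structured penalty can be written as an overlapping group lasso penalty in which every tree node contributes one group of features. The sketch below is not the authors' implementation and does not reproduce the exact TSCCA objective, which is not given in this abstract. The function name `tree_group_penalty`, the toy two-gene tree, and the unit group weights are all assumptions made for illustration.

```python
import math


def tree_group_penalty(coeffs, groups):
    """Weighted sum of L2 norms over (possibly nested) groups of coefficients.

    coeffs: list of floats, e.g. one canonical loading vector.
    groups: list of (indices, group_weight) pairs, one pair per tree node.
    """
    total = 0.0
    for indices, group_weight in groups:
        # L2 norm of the coefficients that belong to this tree node.
        norm = math.sqrt(sum(coeffs[i] ** 2 for i in indices))
        total += group_weight * norm
    return total


# Hypothetical toy tree over 4 SNPs: root -> gene A (SNPs 0, 1), gene B (SNPs 2, 3).
# Unit weights are an assumption; a real model would tune them per node.
groups = [
    ([0, 1, 2, 3], 1.0),  # root: the whole candidate set
    ([0, 1], 1.0),        # gene A
    ([2, 3], 1.0),        # gene B
    ([0], 1.0),           # leaves: individual SNPs
    ([1], 1.0),
    ([2], 1.0),
    ([3], 1.0),
]

u = [3.0, 4.0, 0.0, 0.0]  # loading vector in which only gene A is active
penalty = tree_group_penalty(u, groups)
```

Because the groups are nested, a feature is penalized once for every node on its path from the root to its leaf, so switching off an entire gene removes more penalty terms than switching off the same number of SNPs scattered across genes.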


Tree-guided Sparse Canonical correlation analysis Association study Alzheimer’s disease 



Datasets used in this paper were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. The investigators who contributed to the design and implementation of ADNI and/or collected data are listed on the ADNI official website.

This work was supported in part by grants from NSF China (No. 61572111), a 985 Project of UESTC (No. A1098531023601041), and a Fundamental Research Project of China Central Universities (No. ZYGX2016Z003).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Shangchen Zhou (1, 2)
  • Shuai Yuan (1)
  • Zhizhuo Zhang (3)
  • Zenglin Xu (1), corresponding author

  1. SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
  2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
  3. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
