Association Study of Alzheimer’s Disease with Tree-Guided Sparse Canonical Correlation Analysis
We consider the problem of finding sparse associations between two sources of data, for example between genetic variations (e.g., single nucleotide polymorphisms, SNPs) and phenotypic features (e.g., magnetic resonance imaging, MRI) in the study of Alzheimer's disease (AD). Despite the success of Canonical Correlation Analysis (CCA) and its sparse variants in a number of applications, these methods usually neglect the natural tree structures underlying SNP and MRI data. Specifically, in the SNP data, the whole candidate set, the genes, and the SNPs of each gene form a path of a tree structure; likewise, in the MRI data, the whole image, the regions of the image, and the features of each region form a path of a tree structure. To model the tree structure of features in both sources of data, we propose Tree-guided Sparse Canonical Correlation Analysis (TSCCA). The proposed model equips CCA with mixed-norm regularization terms that capture the underlying multilevel tree structures among both the inputs and the outputs. To solve the resulting optimization problem, we introduce an efficient iterative algorithm for TSCCA by rewriting the tree-structured regularization in the common form of overlapping group lasso. To evaluate the proposed model, we conduct a simulation study and a real-world study on Alzheimer's disease. Results of the simulation study show that the proposed method outperforms CCA with lasso and with group lasso, and the real-world study shows that our model finds biologically meaningful associations between SNPs and MRI features.
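The alternating scheme the abstract describes can be sketched as follows: sparse CCA is typically solved by alternately updating the two canonical vectors, applying a shrinkage (proximal) step after each update, in the spirit of the penalized matrix decomposition of Witten et al. (2009). The sketch below is a minimal illustration, not the authors' TSCCA implementation: it uses a non-overlapping group soft-threshold as a stand-in for the tree-structured penalty (a tree penalty applies such block-wise shrinkage at every level of the tree, yielding overlapping groups), and the function names, group encoding, and penalty weights `lam_u`, `lam_v` are all illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(v, groups, lam):
    # Block-wise shrinkage: proximal operator of the (non-overlapping)
    # group-lasso penalty. A singleton group reduces to the ordinary
    # lasso soft-threshold; a tree penalty would apply this at each level.
    w = v.copy()
    for g in groups:
        norm = np.linalg.norm(v[g])
        w[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * v[g]
    return w

def sparse_cca(X, Y, groups_u, groups_v, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Alternately maximize u^T X^T Y v with group-sparse shrinkage.

    X, Y are column-centered data matrices (n_samples x p and n_samples x q);
    groups_u, groups_v are lists of index arrays over the columns of X and Y.
    """
    C = X.T @ Y                       # cross-covariance matrix
    v = np.linalg.svd(C)[2][0]        # init from leading right singular vector
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        u = group_soft_threshold(C @ v, groups_u, lam_u)
        if np.linalg.norm(u) > 0:
            u /= np.linalg.norm(u)    # project back onto the unit ball
        v = group_soft_threshold(C.T @ u, groups_v, lam_v)
        if np.linalg.norm(v) > 0:
            v /= np.linalg.norm(v)
    return u, v
```

Here entire groups of coefficients (e.g., all SNPs of one gene, all features of one brain region) are zeroed out together, which is the structural effect the tree-guided penalty generalizes across multiple levels.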
Keywords: Tree-guided sparse canonical correlation analysis · Association study · Alzheimer's disease
Datasets used in this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.ucla.edu). The investigators who contributed to the design and implementation of ADNI and/or collected data are listed on the ADNI website.
This work was supported in part by grants from NSF China (No. 61572111), a 985 Project of UESTC (No. A1098531023601041), and a Fundamental Research Project of China Central Universities (No. ZYGX2016Z003).