A Fast SCCA Algorithm for Big Data Analysis in Brain Imaging Genetics

  • Conference paper
Graphs in Biomedical Image Analysis, Computational Anatomy and Imaging Genetics (GRAIL 2017, MICGen 2017, MFCA 2017)

Abstract

Mining big data in brain imaging genetics is an emerging topic in brain science. It can uncover meaningful associations between genetic variations and brain structures and functions. Sparse canonical correlation analysis (SCCA) has been introduced to discover bi-multivariate correlations with feature selection. However, existing SCCA methods cannot be directly applied to big brain imaging genetics data because of two limitations. First, they have cubic complexity in the size of the matrix involved, so they become computationally and memory intensive when the matrix is large. Second, the parameters of an SCCA method need to be fine-tuned in advance, which further increases the computational time dramatically and becomes severe in high-dimensional scenarios. In this paper, we propose two fast and efficient algorithms to speed up structure-aware SCCA (S2CCA) implementations without modifying the original SCCA models. The fast algorithms employ a divide-and-conquer strategy and are easy to implement. Experimental results show that our algorithms reduce running time significantly: they improve computational efficiency by tens to hundreds of times compared with conventional algorithms. In addition, our algorithms yield correlation coefficients and canonical loading profiles similar to those of the conventional implementations. The fast algorithms can also be easily parallelized to further reduce computational time. These results indicate that the proposed fast, scalable SCCA algorithms can be a powerful tool for big data analysis in brain imaging genetics.
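To make the SCCA setting concrete, the following is a minimal sketch of a generic L1-penalized, rank-1 SCCA solved by alternating soft-thresholding. It is not the S2CCA model or the fast divide-and-conquer algorithms proposed in the paper, whose details the abstract does not give. The sketch assumes standardized data and treats the within-set covariance as the identity. The names `sparse_cca`, `soft_threshold`, `unit`, `make_toy_data`, `frac_u`, and `frac_v` are hypothetical, and setting each threshold as a fraction of the largest entry is a convenience chosen for this illustration only.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry of a toward zero by lam (the L1 proximal operator)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def unit(a):
    """Scale a to unit Euclidean norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a


def sparse_cca(X, Y, frac_u=0.5, frac_v=0.5, n_iter=100, tol=1e-8):
    """Illustrative rank-1 sparse CCA (an assumed simplification, not S2CCA).

    X is n x p (e.g. genetic markers), Y is n x q (e.g. imaging measures).
    frac_u and frac_v in [0, 1) set each threshold as a fraction of the
    largest absolute entry, so at least one weight always survives.
    """
    n = X.shape[0]
    # Standardize columns (assumes no constant columns).
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    u = np.zeros(X.shape[1])
    v = unit(np.ones(Y.shape[1]))
    for _ in range(n_iter):
        # Compute X^T (Y v) without forming the p x q cross-covariance matrix.
        a = X.T @ (Y @ v) / n
        u_new = unit(soft_threshold(a, frac_u * np.abs(a).max()))
        b = Y.T @ (X @ u_new) / n
        v_new = unit(soft_threshold(b, frac_v * np.abs(b).max()))
        converged = max(np.abs(u_new - u).max(), np.abs(v_new - v).max()) < tol
        u, v = u_new, v_new
        if converged:
            break
    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr


def make_toy_data(n=200, p=50, q=40, n_signal=3, seed=0):
    """Synthetic data: the first n_signal columns of X and Y share one latent factor."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    X[:, :n_signal] = z[:, None] + 0.3 * X[:, :n_signal]
    Y[:, :n_signal] = z[:, None] + 0.3 * Y[:, :n_signal]
    return X, Y


if __name__ == "__main__":
    X, Y = make_toy_data()
    u, v, corr = sparse_cca(X, Y)
    print("nonzero u:", np.flatnonzero(u))
    print("nonzero v:", np.flatnonzero(v))
    print("canonical correlation:", round(float(corr), 3))
```

In this sketch the products `Y @ v` and `X @ u_new` are evaluated first, so the p x q cross-covariance matrix is never stored. That choice only illustrates why matrix size drives cost and memory in this setting; it should not be read as the paper's speed-up strategy.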

L. Du—This work was supported by NSFC (61602384), the Natural Science Basic Research Plan in Shaanxi Province of China (2017JQ6001), the China Postdoctoral Science Foundation (2017M613214), and the Fundamental Research Funds for the Central Universities (3102016OQD0065). This work was also supported by NIH R01 EB022574, R01 LM011360, U01 AG024904, P30 AG10133, R01 AG19771, UL1 TR001108, R01 AG 042437, R01 AG046171, and R01 AG040770, by DoD W81XWH-14-2-0151, W81XWH-13-1-0259, W81XWH-12-2-0012, and NCAA 14132004.

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Corresponding author

Correspondence to Lei Du.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Huang, Y. et al. (2017). A Fast SCCA Algorithm for Big Data Analysis in Brain Imaging Genetics. In: Cardoso, M., et al. Graphs in Biomedical Image Analysis, Computational Anatomy and Imaging Genetics. GRAIL 2017, MICGen 2017, MFCA 2017. Lecture Notes in Computer Science, vol 10551. Springer, Cham. https://doi.org/10.1007/978-3-319-67675-3_19

  • DOI: https://doi.org/10.1007/978-3-319-67675-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67674-6

  • Online ISBN: 978-3-319-67675-3

  • eBook Packages: Computer Science, Computer Science (R0)
