Neuroinformatics, Volume 17, Issue 4, pp 583–592

Independent Multiple Factor Association Analysis for Multiblock Data in Imaging Genetics

  • Natalia Vilor-Tejedor (corresponding author)
  • Mohammad Arfan Ikram
  • Gennady V. Roshchupkin
  • Alejandro Cáceres
  • Silvia Alemany
  • Meike W. Vernooij
  • Wiro J. Niessen
  • Cornelia M. van Duijn
  • Jordi Sunyer
  • Hieab H. Adams
  • Juan R. González
Original Article

Abstract

Multivariate methods have the potential to better capture complex relationships that may exist between different biological levels. Multiple Factor Analysis (MFA) is one of the most popular methods to obtain factor scores and measures of discrepancy between data sets. However, the singular value decomposition in MFA is based on PCA, which is adequate only if the data are normally distributed, linear, or stationary. In addition, including strongly correlated variables can overemphasize the contribution of the estimated components. In this work, we introduce a novel method, referred to as Independent Multifactorial Analysis (ICA-MFA), to derive relevant features from multiscale data. This method is an extended implementation of MFA in which the component value decomposition is based on Independent Component Analysis. In addition, ICA-MFA incorporates a predictive step based on Independent Component Regression. We evaluated the performance of ICA-MFA in a simulation study and compared it with both the MFA method and traditional univariate analyses. ICA-MFA explained up to 10-fold more variance than MFA and univariate methods. We applied the proposed algorithm in a study of 4057 individuals from the population-based Rotterdam Study with available genetic and neuroimaging data, as well as information about executive cognitive functioning. Specifically, we used ICA-MFA to detect relevant genetic features related to structural brain regions, which in turn were involved in the mechanisms of executive cognitive function. The proposed strategy makes it possible to determine the degree to which the whole set of genetic and/or neuroimaging markers contributes to the variability of the symptomatology jointly, rather than individually.
While univariate results and MFA combinations explained only a limited proportion of variance (less than 2%), our method increased the explained variance (10%) and allowed the identification of significant components that maximize the variance explained in the model. The potential application of the ICA-MFA algorithm constitutes an important aspect of integrating multivariate multiscale data, specifically in the field of Neurogenetics.
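
The abstract names three ingredients: MFA-style block weighting, a decomposition based on Independent Component Analysis, and a regression on the resulting components. The following is only a minimal NumPy sketch of how those pieces could fit together, not the authors' implementation. The function names, the tanh contrast, the symmetric FastICA update, the number of components, and the synthetic data are all illustrative assumptions that do not come from the article.

```python
import numpy as np


def mfa_weight(blocks):
    """Standardise each block, then divide it by its first singular value (MFA weighting)."""
    weighted = []
    for X in blocks:
        Z = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no constant columns
        s1 = np.linalg.svd(Z, compute_uv=False)[0]
        weighted.append(Z / s1)
    return np.hstack(weighted)


def _sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W so that the rows of W stay orthonormal."""
    d, E = np.linalg.eigh(W @ W.T)
    return E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ W


def fast_ica(X, n_components, n_iter=500, tol=1e-8, seed=0):
    """Symmetric FastICA with a tanh contrast (an assumed choice).

    Returns an (n_samples, n_components) matrix of component scores.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    # Whiten: keep the leading left singular vectors, scaled to unit variance.
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    Zw = U[:, :n_components] * np.sqrt(n)
    W = _sym_decorrelate(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        G = np.tanh(Zw @ W.T)
        # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, one row per component.
        W_new = (G.T @ Zw) / n - (1.0 - G**2).mean(axis=0)[:, None] * W
        W_new = _sym_decorrelate(W_new)
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return Zw @ W.T


def ic_regression(S, y):
    """Ordinary least squares of y on the component scores; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), S])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


if __name__ == "__main__":
    # Synthetic stand-ins for a genetic block, an imaging block, and an outcome.
    rng = np.random.default_rng(42)
    genetic = rng.laplace(size=(300, 8))
    imaging = genetic[:, :4] @ rng.standard_normal((4, 5)) + rng.laplace(size=(300, 5))
    outcome = imaging[:, 0] + rng.standard_normal(300)

    scores = fast_ica(mfa_weight([genetic, imaging]), n_components=4)
    _, r2 = ic_regression(scores, outcome)
    print(f"R^2 of the outcome on 4 independent components: {r2:.3f}")
```

Dividing each standardised block by its first singular value keeps a large block from dominating the joint decomposition, which is the usual motivation for MFA weighting. Replacing `fast_ica` with the leading principal components would give a PCA-based baseline for comparison on the same data.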

Keywords

Data integration ICA-MFA Imaging genetics Modelling Neurogenetics 

Notes

Acknowledgements

Natalia Vilor-Tejedor is funded by a pre-doctoral grant from the Agència de Gestió d’Ajuts Universitaris i de Recerca (2017 FI_B 00636), Generalitat de Catalunya – Fons Social Europeu. This work has been partially supported by a STSM Grant from EU COST Action 15120 Open Multiscale Systems Medicine (OpenMultiMed) and Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBERESP). Further support was obtained through the Ministerio de Economía e Innovación (Spain), grant MTM2015-68140-R. ISGlobal is a member of the CERCA Programme, Generalitat de Catalunya.

Silvia Alemany thanks the Institute of Health Carlos III for her Sara Borrell postdoctoral grant (CD14/00214).

The generation and management of GWAS genotype data for the Rotterdam Study are supported by the Netherlands Organization of Scientific Research NWO Investments (no. 175.010.2005.011, 911-03-012). This study is funded by the Research Institute for Diseases in the Elderly (014-93-015; RIDE2), the Netherlands Genomics Initiative (NGI)/Netherlands Organization for Scientific Research (NWO) project no. 050-060-810. The Rotterdam Study is funded by Erasmus Medical Center and Erasmus University, Rotterdam, Netherlands Organization for the Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. This research is supported by the Dutch Technology Foundation STW (12723), which is part of the NWO, and which is partly funded by the Ministry of Economic Affairs. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (project: ORACLE, grant agreement No: 678543).

Compliance with Ethical Standards

Conflict of Interest

None.

Supplementary material

12021_2019_9416_MOESM1_ESM.docx (291 kb)
ESM 1 (DOCX 284 kb)
12021_2019_9416_MOESM2_ESM.xlsx (15 kb)
ESM 2 (XLSX 14 kb)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Natalia Vilor-Tejedor (1, 2, 3, 4, 5; corresponding author)
  • Mohammad Arfan Ikram (6)
  • Gennady V. Roshchupkin (7, 8)
  • Alejandro Cáceres (3, 4, 5)
  • Silvia Alemany (3, 4)
  • Meike W. Vernooij (6, 7)
  • Wiro J. Niessen (7, 8, 9)
  • Cornelia M. van Duijn (6)
  • Jordi Sunyer (3, 4, 5, 10)
  • Hieab H. Adams (6, 7, 8)
  • Juan R. González (3, 4, 5)

  1. Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
  2. BarcelonaBeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain
  3. Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
  4. Universitat Pompeu Fabra (UPF), Barcelona, Spain
  5. CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
  6. Department of Epidemiology, Erasmus MC, Rotterdam, the Netherlands
  7. Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
  8. Department of Medical Informatics, Erasmus MC, Rotterdam, the Netherlands
  9. Faculty of Applied Sciences, Delft University of Technology, Delft, the Netherlands
  10. IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
