Abstract
As high-throughput technologies have become more common and accessible, it is increasingly usual to take several distinct, simultaneous approaches to study the same problem. In practice, this means that data of different types (expression, proteins, metabolites...) may be available for the same study, highlighting the need for methods and tools to analyze them in a combined way. In recent years, many methods have been developed that allow for the integrated analysis of different types of data. In keeping with a certain tradition in bioinformatics, many of these methodologies are rooted in machine learning, such as Bayesian networks, support vector machines or graph-based methods. In contrast with the large number of applications from these fields, multivariate statistics seems to have contributed less to “omic” data integration, even though it has a long tradition of being used to combine and visualize multidimensional data. In this work, we discuss the application of multivariate statistical approaches to integrate bio-molecular information by using multiple factor analysis. The techniques are applied to a real, unpublished data set consisting of three different data types: clinical variables, expression microarrays and DNA gel electrophoretic bands. We show how these statistical techniques can be used to perform dimension reduction and then visualize data of one type that are useful to explain those from other types. Whereas this is more or less straightforward when dealing with two types of data, it turns out to be more complicated when the goal is to visualize more than two types simultaneously. Comparison between the approaches shows that the information they provide is complementary, suggesting that their combined use yields more information than using any one of them alone.
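To make the idea of multiple factor analysis concrete, the following is a minimal sketch of the usual recipe: standardize each data block, divide it by its largest singular value so that no block dominates, concatenate the blocks, and run a global principal component analysis. It is not the code used in the chapter. The function name `mfa_scores`, the synthetic blocks, and the treatment of presence/absence bands as numeric columns are all assumptions made purely for illustration.

```python
import numpy as np

def mfa_scores(blocks, n_components=2):
    """Illustrative MFA sketch (hypothetical helper, not the chapter's code).

    blocks: list of (n_samples, p_k) arrays that share the same rows (samples).
    Returns the sample coordinates on the first global components.
    """
    weighted = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        Xc = X - X.mean(axis=0)              # center each variable
        sd = Xc.std(axis=0)
        sd[sd == 0] = 1.0                    # leave constant columns at zero
        Z = Xc / sd                          # standardize each variable
        # Largest singular value of the block; dividing by it puts every
        # block on a comparable scale before the global analysis.
        s1 = np.linalg.svd(Z, compute_uv=False)[0]
        weighted.append(Z / s1)
    G = np.hstack(weighted)                  # concatenated, weighted blocks
    U, S, Vt = np.linalg.svd(G, full_matrices=False)  # global PCA via SVD
    return U[:, :n_components] * S[:n_components]

# Synthetic stand-ins for the three data types (assumed shapes, random values).
rng = np.random.default_rng(0)
clinical = rng.normal(size=(20, 5))          # clinical variables
expression = rng.normal(size=(20, 200))      # expression microarray features
bands = rng.integers(0, 2, size=(20, 15))    # gel bands as presence/absence
scores = mfa_scores([clinical, expression, bands])
```

Plotting the two columns of `scores` against each other gives a shared map of the samples in which each data type contributes on an equal footing, despite the very different number of variables per block.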
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sánchez, A. et al. (2012). Multivariate Methods for the Integration and Visualization of Omics Data. In: Freitas, A.T., Navarro, A. (eds) Bioinformatics for Personalized Medicine. JBI 2010. Lecture Notes in Computer Science(), vol 6620. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28062-7_4
DOI: https://doi.org/10.1007/978-3-642-28062-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28061-0
Online ISBN: 978-3-642-28062-7
eBook Packages: Computer Science (R0)