Abstract
Extracting information effectively from the multidimensional data sets produced by phenotyping experiments is a growing challenge in biology. Data visualization tools are important resources for the exploratory analysis of such complex data sets. Phenotyping experiments on model organisms produce data sets in which a large number of phenotypic measures are collected for each individual in a group. A critical initial step in analyzing these data sets is the exploratory analysis of data distribution and correlation. To facilitate the rapid visualization and exploratory analysis of multidimensional complex trait data, we have developed a user-friendly, web-based software tool called Phenostat. Phenostat provides a dynamic graphical environment that allows the user to inspect the distributions of multiple variables in a data set simultaneously. Clicking directly on a graph selects an individual, displays its identity, and highlights its corresponding values in all graphs, so that the individual can be included in or excluded from the analysis. Statistical analysis is provided by R package functions. Phenostat is particularly suited to rapid distribution and correlation analysis of subsets of data. An analysis with Phenostat of behavioral and physiological data from a large mouse phenotyping experiment reveals previously unsuspected correlations. Phenostat is freely available to academic institutions and nonprofit organizations and can be used from our website at http://www.bioinfo.embl.it/phenostat/.
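The kind of subset analysis described in the abstract (per-measure distributions and pairwise correlations, recomputed after excluding selected individuals) can be illustrated with a minimal sketch. This is not Phenostat's code, which relies on R package functions; the record layout, the measure names, the numeric values, and the `summarize` and `pearson` helpers are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev

# Hypothetical records: one dict per individual, with made-up values.
RECORDS = [
    {"id": "m1", "body_weight": 24.1, "open_field_distance": 31.0, "corticosterone": 80.0},
    {"id": "m2", "body_weight": 26.3, "open_field_distance": 27.5, "corticosterone": 95.0},
    {"id": "m3", "body_weight": 22.8, "open_field_distance": 35.2, "corticosterone": 70.0},
    {"id": "m4", "body_weight": 25.0, "open_field_distance": 29.9, "corticosterone": 88.0},
    {"id": "m5", "body_weight": 40.0, "open_field_distance": 5.0, "corticosterone": 300.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(records, excluded_ids=frozenset()):
    """Return (distribution, correlation) for the individuals not excluded.

    distribution maps each measure to (mean, sample standard deviation);
    correlation maps each pair of measures to its Pearson coefficient.
    """
    kept = [r for r in records if r["id"] not in excluded_ids]
    measures = [key for key in kept[0] if key != "id"]
    columns = {m: [r[m] for r in kept] for m in measures}
    distribution = {m: (mean(v), stdev(v)) for m, v in columns.items()}
    correlation = {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(measures, 2)
    }
    return distribution, correlation


if __name__ == "__main__":
    # Compare the full group with the subset that excludes one individual.
    for label, excluded in (("all individuals", set()), ("without m5", {"m5"})):
        distribution, correlation = summarize(RECORDS, excluded)
        print(label)
        for measure, (avg, sd) in distribution.items():
            print(f"  {measure}: mean={avg:.2f} sd={sd:.2f}")
        for (a, b), r in correlation.items():
            print(f"  r({a}, {b}) = {r:.3f}")
```

Running the summary with and without a suspect individual mirrors, in a non-interactive way, the include/exclude step that the tool offers by clicking on a graph.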
Cite this article
Reuveni, E., Carola, V., Banchaabouchi, M.A. et al. Phenostat: visualization and statistical tool for analysis of phenotyping data. Mamm Genome 18, 677–681 (2007). https://doi.org/10.1007/s00335-007-9042-4
DOI: https://doi.org/10.1007/s00335-007-9042-4