Abstract
Recent advances in high-throughput genetic and genomic technologies, such as efficient DNA sequencers, multiplex genotyping platforms, microarrays, and proteomics and metabolomics assays, have given researchers an unprecedented ability to identify and characterize the genetic determinants of diseases and clinical outcomes of all sorts. Indeed, the application of these technologies has led to the identification of hundreds, if not thousands, of genomic loci that are associated with, or even responsible for, many different disease conditions, clinical outcomes, and responses to medications. As useful as these technologies are, however, their ability to generate data easily outpaces our ability to draw compelling inferences from those data. The field of bioinformatics evolved out of the need to manage, analyze, and interpret the high-dimensional data that high-throughput genetic and genomic technologies generate. Necessary bioinformatics tools include those that enable one to test statistical associations between DNA sequence variations and phenotypes, find patterns in gene or protein expression data, understand how specific perturbations in a protein may affect its function, and determine which fundamental processes, pathways, or genetic networks may harbor the molecular “lesions” that cause disease. In this chapter we provide an overview of available bioinformatics strategies and tools that have been applied, or could be applied, to the genetic and genomic dissection of complex pulmonary vascular diseases (PVD) and other complex diseases. We start with a brief summary of the motivation for examining the genetic basis of PVD, consider various strategies for identifying genetic factors contributing to PVD, and then describe the bioinformatics tools and resources available to facilitate these analyses. We provide relevant Web resources in addition to references, along with example analyses and results.
Additional Reading
Bull TM, Coldren CD, Geraci MW, Voelkel NF (2007) Gene expression profiling in pulmonary hypertension. Proc Am Thorac Soc 4:117–120
Sampsonas F, Karkoulias K, Kaparianos A, Spiropoulos K (2006) Genetics of chronic obstructive pulmonary disease, beyond a1-antitrypsin deficiency. Curr Med Chem 13:2857–2873
Pettersson F, Morris AP, Barnes MR, Cardon LR (2008) Goldsurfer2 (Gs2): a comprehensive tool for the analysis and visualization of genome wide association studies. BMC Bioinform 9:138
Aulchenko YS, Ripke S, Isaacs A, van Duijn CM (2007) GenABEL: an R library for genome-wide association analysis. Bioinformatics 23:1294–1296
Melzer D, Perry JR, Hernandez D et al (2008) A genome-wide association study identifies protein quantitative trait loci (pQTLs). PLoS Genet 4:e1000072
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Torkamani, A., Topol, E.J., Schork, N.J. (2011). Bioinformatics, Genomics, and Functional Genomics: Overview. In: Yuan, JJ., Garcia, J., West, J., Hales, C., Rich, S., Archer, S. (eds) Textbook of Pulmonary Vascular Disease. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-87429-6_39
DOI: https://doi.org/10.1007/978-0-387-87429-6_39
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-87428-9
Online ISBN: 978-0-387-87429-6
eBook Packages: Medicine (R0)