Current Computational Methods for Prioritizing Candidate Regulatory Polymorphisms

  • Protocol
Biomedical Informatics

Part of the book series: Methods in Molecular Biology™ ((MIMB,volume 569))

Summary

Discovery of DNA sequence variants responsible for human phenotypic variation is key to advances in molecular diagnostics and medicines. Historically, variants that alter the protein-coding sequence of genes have been targeted when attempting to identify a trait’s etiology, because the rules governing these regions are generally well understood and candidate variants can be easily selected. However, the effects of variants on gene regulation are increasingly regarded as being as important as protein-coding variation in uncovering the nature of phenotypic variation. I discuss resources and methodology that have recently been developed to computationally prioritize variants that may alter gene expression.

Acknowledgment

S.B.M. would like to thank Monica C. Sleumer, Daniel C. Jeffares, and Emmanouil T. Dermitzakis for critical review and support in development of this work. S.B.M. is funded by the European Molecular Biology Organization and the Natural Sciences and Engineering Research Council of Canada.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Montgomery, S. (2009). Current Computational Methods for Prioritizing Candidate Regulatory Polymorphisms. In: Astakhov, V. (eds) Biomedical Informatics. Methods in Molecular Biology™, vol 569. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-59745-524-4_5

  • DOI: https://doi.org/10.1007/978-1-59745-524-4_5

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-934115-63-3

  • Online ISBN: 978-1-59745-524-4

  • eBook Packages: Springer Protocols
