
Annotating the Regulatory Genome

  • Protocol

Part of the book series: Methods in Molecular Biology (MIMB, volume 674)

Abstract

Determining the timing and molecular repertoire responsible for gene expression is fundamental to understanding a gene’s function. Heritable differences in gene expression are increasingly regarded as explanatory for complex and common traits. For many known trait-predisposing genes, studies have sought to elucidate the logic behind their regulation. However, many challenges remain in deciphering these mechanisms. Among them: our understanding of regulatory complexity is limited, current models of gene regulation have low specificity, and any gene’s regulatory logic depends on biological context. Addressing these limitations and defining the regulatory genome is an ongoing challenge for molecular biology. We discuss current efforts to define and annotate the regulatory genome, focusing on curation and text-mining activities. We further highlight the types of information and the curation process used to describe regulatory elements within the ORegAnno database (www.oreganno.org), and how the general standards for such information are changing.

Author information

Corresponding author

Correspondence to Obi L. Griffith.

Appendix 1. ORegAnno XML Sample

The manual annotation method detailed in Section 2.4 of the ‘Methods and data’ is recommended for individual low-throughput studies. In some cases, a user may wish to upload a large collection of previously annotated publications, an existing database, or the results of a high-throughput experiment. For this purpose, an alternative ‘batch upload’ method is provided, which makes use of the ORegAnno XML template. Data may be converted into this XML format by following the example below. Many of the required fields overlap with those explained in Section 2.4, and each field is also explained in Table 20.2. Please contact the ORegAnno administrators (oreganno@bcgsc.ca) for help with creating and uploading an XML batch file.

<?xml version="1.0" encoding="ISO-8859-1"?>
<oreganno>
  <recordSet>
    <record>
      <id></id>
      <stabled></stabled>
      <type>REGULATORY REGION</type>
      <outcome>NEGATIVE OUTCOME</outcome>
      <geneId>ENSDARG00000062484</geneId>
      <geneName>ptprf</geneName>
      <geneSource>ENSEMBL</geneSource>
      <geneVersion>danio_rerio_core_42_6c</geneVersion>
      <tfId></tfId>
      <tfName></tfName>
      <tfSource></tfSource>
      <tfVersion></tfVersion>
      <lociName></lociName>
      <speciesName>Danio rerio</speciesName>
      <reference>19704032</reference>
      <date>5-Aug-2009</date>
      <sequence>
        <internalSequenceType>sequence</internalSequenceType>
        <sequence>GGTTAAGAGTGAAAAGAACCAACCTCCTCGAGGGTCTATGAGATGAGGTGAGAGTTTGACCGGGTGATTTAATGGA</sequence>
        <ensembl_database_name>danio_rerio_core_42_6c</ensembl_database_name>
        <sequence_region_name>2</sequence_region_name>
        <start>14342892</start>
        <end>14342967</end>
        <strand>1</strand>
        <verified>true</verified>
      </sequence>
      <sequenceWithFlank>
        <internalSequenceType>sequence_with_flank</internalSequenceType>
        <sequence>ttgacacagataacaactagcctgaacgaaatataacattgctcttgcatctcttttaatgcaggctcatgcaagtcacctgacacaacacattcagcctgaacacaaaggtgaggggcggcataacgcagggagtgggattgataacaagggtctctgattaaagatggatccaggttggggtctgcaagcggcGGTTAAGAGTGAAAAGAACCAACCTCCTCGAGGGTCTATGAGATGAGGTGAGAGTTTGACCGGGTGATTTAATGGAgatgaaattgaaagacagagacaaatggaaaacaagagaacatgaaaagacatttgtgaacaatttcatggctgttagaaaaaaaaagaaacacaatggaaatttttaaaagacagacacaaaagcataacattcacagaaaagtcggatattctaccatatttcatacatattgcagcaacatccccaatg</sequence>
        <ensembl_database_name>danio_rerio_core_42_6c</ensembl_database_name>
        <sequence_region_name>2</sequence_region_name>
        <start>14342697</start>
        <end>14343159</end>
        <strand>1</strand>
        <verified>true</verified>
      </sequenceWithFlank>
      <searchSpace>
        <internalSequenceType>searchSpace</internalSequenceType>
        <sequence>ttgacacagataacaactagcctgaacgaaatataacattgctcttgcatctcttttaatgcaggctcatgcaagtcacctgacacaacacattcagcctgaacacaaaggtgaggggcggcataacgcagggagtgggattgataacaagggtctctgattaaagatggatccaggttggggtctgcaagcggcGGTTAAGAGTGAAAAGAACCAACCTCCTCGAGGGTCTATGAGATGAGGTGAGAGTTTGACCGGGTGATTTAATGGAgatgaaattgaaagacagagacaaatggaaaacaagagaacatgaaaagacatttgtgaacaatttcatggctgttagaaaaaaaaagaaacacaatggaaatttttaaaagacagacacaaaagcataacattcacagaaaagtcggatattctaccatatttcatacatattgcagcaacatccccaatg</sequence>
        <ensembl_database_name>danio_rerio_core_42_6c</ensembl_database_name>
        <sequence_region_name>2</sequence_region_name>
        <start>14342697</start>
        <end>14343159</end>
        <strand>1</strand>
        <verified>true</verified>
      </searchSpace>
      <dataset>OREGDS00016</dataset>
      <evidenceSet>
        <evidence>
          <evidenceClassStableId>OREGEC00001</evidenceClassStableId>
          <evidenceTypeStableId>OREGET00002</evidenceTypeStableId>
          <evidenceSubtypeStableId>OREGES00021</evidenceSubtypeStableId>
          <comment>Each candidate conserved regulatory region was amplified by PCR and co-injected with an EGFP reporter construct into zebrafish embryos produced from natural matings between the 1-4 cleavage stages. Embryos were then assayed for GFP expression on the second day of development (approximately 24-16 hpf). The conserved region is recorded here as "sequence," and the entire tested PCR product is recorded as "searchSpace." This element contains the PCNE 67-Dr_ECR7_C2.</comment>
          <date>5-Aug-2009</date>
          <userName>hufton</userName>
        </evidence>
      </evidenceSet>
      <commentSet>
        <comment>
          <comment>Ancient phylogenetically conserved non-coding elements (PCNEs) were identified around gene families from mouse, zebrafish, fugu, and the invertebrate chordate amphioxus. 42 of these elements were tested for enhancer activity in transgenic zebrafish embryos, including 22 amphioxus elements and 20 fish elements. Results for each of these elements, and 9 randomly chosen negative control elements, are described in this dataset.</comment>
          <date>5-Aug-2009</date>
          <userName>hufton</userName>
        </comment>
      </commentSet>
      <scoreSet></scoreSet>
      <variationSet></variationSet>
      <metaDataSet></metaDataSet>
      <deprecatedByDate></deprecatedByDate>
      <deprecatedByStableID></deprecatedByStableID>
      <deprecatedByUser></deprecatedByUser>
    </record>
  </recordSet>
  <speciesSet>
    <species>
      <name>Danio rerio</name>
      <taxonId>7955</taxonId>
    </species>
  </speciesSet>
  <userName>hufton</userName>
</oreganno>
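
For large collections, a batch file in the shape of the sample above could be generated by a script rather than by hand. The following is a minimal sketch in Python using only the standard library. The helper names (sequence_block, batch_document) are hypothetical and not part of ORegAnno, only a subset of the template fields is filled in, and the length check against the coordinates is a local sanity test assumed here for illustration, not a documented ORegAnno requirement. Which fields the uploader actually requires should be confirmed against Table 20.2 or with the ORegAnno administrators.

```python
import xml.etree.ElementTree as ET


def sequence_block(tag, seq_type, sequence, db_name, region, start, end, strand=1):
    """Build a <sequence>, <sequenceWithFlank>, or <searchSpace> element.

    Raises ValueError if the sequence length disagrees with the inclusive
    start/end coordinates (a local sanity check, not an ORegAnno rule).
    """
    expected = end - start + 1
    if len(sequence) != expected:
        raise ValueError(
            f"{tag}: {len(sequence)} bases given, coordinates imply {expected}"
        )
    block = ET.Element(tag)
    for name, value in [
        ("internalSequenceType", seq_type),
        ("sequence", sequence),
        ("ensembl_database_name", db_name),
        ("sequence_region_name", region),
        ("start", start),
        ("end", end),
        ("strand", strand),
    ]:
        ET.SubElement(block, name).text = str(value)
    return block


def batch_document(records, user_name):
    """Wrap <record> elements in the <oreganno><recordSet> envelope."""
    root = ET.Element("oreganno")
    record_set = ET.SubElement(root, "recordSet")
    record_set.extend(records)
    ET.SubElement(root, "userName").text = user_name
    return root


# Values copied from the sample record; only a subset of fields is set here.
record = ET.Element("record")
for name, value in [
    ("type", "REGULATORY REGION"),
    ("outcome", "NEGATIVE OUTCOME"),
    ("geneId", "ENSDARG00000062484"),
    ("geneName", "ptprf"),
    ("geneSource", "ENSEMBL"),
    ("speciesName", "Danio rerio"),
    ("reference", "19704032"),
]:
    ET.SubElement(record, name).text = value

record.append(
    sequence_block(
        "sequence",
        "sequence",
        "GGTTAAGAGTGAAAAGAACCAACCTCCTCGAGGGTCTATGAGATGA"
        "GGTGAGAGTTTGACCGGGTGATTTAATGGA",
        "danio_rerio_core_42_6c",
        "2",
        14342892,
        14342967,
    )
)

root = batch_document([record], user_name="hufton")
# A non-UTF-8 encoding makes ElementTree emit the XML declaration as well.
xml_bytes = ET.tostring(root, encoding="ISO-8859-1")
```

Building each sequence block through one function keeps the element order identical to the template and catches coordinate mistakes, such as a flank whose start and end do not match the pasted sequence, before a file is sent for upload.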

Table 20.2 Explanation of data fields in ORegAnno XML template (In order of appearance)

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Montgomery, S.B., Kasaian, K., Jones, S.J., Griffith, O.L. (2010). Annotating the Regulatory Genome. In: Ladunga, I. (ed.) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_20

  • DOI: https://doi.org/10.1007/978-1-60761-854-6_20

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60761-853-9

  • Online ISBN: 978-1-60761-854-6

  • eBook Packages: Springer Protocols
