Design of Knowledge Bases for Plant Gene Regulatory Networks

Part of the Methods in Molecular Biology book series (MIMB, volume 1629)


A knowledge base that contains all the information a researcher needs to study gene regulation in a particular organism can be developed in four stages. The first stage is defining the data scope; here we describe the necessary information and resources and outline methods for obtaining the data. The second stage is designing the schema, a systematic plan that defines the entire arrangement of the database. The third stage is implementation, in which software is used to build the database according to the predefined schema. The final stage is development, in which the database is made available to users through a web-accessible system. The result is a knowledge base that integrates all the information pertaining to gene regulation and is easily expandable and transferable.
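The schema and implementation stages can be illustrated with a minimal sketch. It assumes a relational design stored in SQLite through Python's standard sqlite3 module; the table names, column names, and identifiers (gene, transcription_factor, interaction, GENE_A, TF_1) are hypothetical placeholders chosen for illustration and are not the schema of any published knowledge base.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an illustrative
# assumption, not the layout used by any particular knowledge base.
SCHEMA = """
CREATE TABLE gene (
    gene_id TEXT PRIMARY KEY,
    symbol  TEXT
);
CREATE TABLE transcription_factor (
    tf_id   TEXT PRIMARY KEY,
    gene_id TEXT NOT NULL REFERENCES gene(gene_id),
    family  TEXT
);
CREATE TABLE interaction (
    tf_id       TEXT NOT NULL REFERENCES transcription_factor(tf_id),
    target_gene TEXT NOT NULL REFERENCES gene(gene_id),
    evidence    TEXT NOT NULL,
    PRIMARY KEY (tf_id, target_gene, evidence)
);
"""


def build_database(path=":memory:"):
    """Implementation stage: create the database from the predefined schema."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def load_demo_rows(conn):
    """Insert placeholder rows; the identifiers are made up for the sketch."""
    conn.executemany(
        "INSERT INTO gene (gene_id, symbol) VALUES (?, ?)",
        [("GENE_A", "a"), ("GENE_B", "b"), ("GENE_C", "c")],
    )
    conn.execute(
        "INSERT INTO transcription_factor (tf_id, gene_id, family) VALUES (?, ?, ?)",
        ("TF_1", "GENE_A", "example_family"),
    )
    conn.executemany(
        "INSERT INTO interaction (tf_id, target_gene, evidence) VALUES (?, ?, ?)",
        [("TF_1", "GENE_C", "ChIP-Seq"), ("TF_1", "GENE_B", "EMSA")],
    )
    conn.commit()


def targets_of(conn, tf_id):
    """Return the distinct target genes recorded for one transcription factor."""
    rows = conn.execute(
        "SELECT DISTINCT target_gene FROM interaction "
        "WHERE tf_id = ? ORDER BY target_gene",
        (tf_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = build_database()
    load_demo_rows(connection)
    print(targets_of(connection, "TF_1"))
```

Keeping genes, transcription factors, and protein-DNA interactions in separate tables linked by keys is one way to keep such a design expandable: a new evidence type or a new organism adds rows rather than requiring a change to the table layout.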

Key words

Gene regulation, Transcription factors, Promoter, Protein-DNA interaction, Database design



Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. Center for Applied Plant Sciences (CAPS), The Ohio State University, Columbus, USA
  2. Molecular, Cellular and Developmental Biology Graduate Program, The Ohio State University, Columbus, USA
  3. Department of Molecular Genetics and Horticulture & Crop Sciences, The Ohio State University, Columbus, USA
  4. Department of Horticulture & Crop Sciences, The Ohio State University, Columbus, USA
