Nucleic Acid Sequence and Structure Databases

  • Protocol
Data Mining Techniques for the Life Sciences

Part of the book series: Methods in Molecular Biology (MIMB, volume 609)

Abstract

This chapter gives an overview of the most commonly used biological databases of nucleic acid sequences and their structures. We cover general sequence databases, databases for specific DNA features, noncoding RNA sequences, and RNA secondary and tertiary structures.

Acknowledgements

SW was supported by a GEN-AU mobility fellowship sponsored by the Bundesministerium für Wissenschaft und Forschung.

Copyright information

© 2010 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Washietl, S., Hofacker, I.L. (2010). Nucleic Acid Sequence and Structure Databases. In: Carugo, O., Eisenhaber, F. (eds) Data Mining Techniques for the Life Sciences. Methods in Molecular Biology, vol 609. Humana Press. https://doi.org/10.1007/978-1-60327-241-4_1

  • DOI: https://doi.org/10.1007/978-1-60327-241-4_1

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60327-240-7

  • Online ISBN: 978-1-60327-241-4

  • eBook Packages: Springer Protocols
