Biological Databases

Concise Guide to Databases

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

Bioinformatics is the application of computing to the storage and analysis of vast amounts of biological data. These data are available as sequences and as protein and nucleic acid structures. A sequence is represented in one dimension, while a structure adds three-dimensional data for a sequence. A biological database organises its data so that they can be easily accessed and analysed. Biological databases can be classified into sequence databases and structure databases. Sequence databases apply to both protein and nucleic acid sequences, while structure databases apply only to proteins. The first database was developed after the sequence of the insulin protein was made available in 1956; insulin was the first protein to be sequenced. During the 1960s, the first nucleic acid sequence, that of yeast tRNA, was determined. Three-dimensional protein structures were subsequently determined, and the Protein Data Bank was established with only 10 entries. It has since grown into a large database with over 10,000 entries. In 1986, the SWISS-PROT protein sequence database was developed; it holds about 70,000 protein sequences covering more than 5,000 model organisms (Babu 1997).
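The distinction between one-dimensional sequence data and three-dimensional structure data can be sketched in a few lines of Python. This is a minimal illustration, not the schema of any real database: the class names (SequenceRecord, Atom, StructureRecord), the function database_kind, the identifiers and the coordinates are all assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SequenceRecord:
    """One-dimensional data: an ordered string of residue codes."""
    identifier: str  # hypothetical identifier, not a real accession
    residues: str    # one letter per residue, read left to right


@dataclass(frozen=True)
class Atom:
    """A single atom position in three-dimensional space."""
    name: str
    x: float
    y: float
    z: float


@dataclass(frozen=True)
class StructureRecord:
    """A sequence plus three-dimensional coordinates for its atoms."""
    identifier: str
    residues: str
    atoms: Tuple[Atom, ...]


def database_kind(record) -> str:
    """Return which class of database the record would belong to."""
    if isinstance(record, StructureRecord):
        return "structure"
    if isinstance(record, SequenceRecord):
        return "sequence"
    raise TypeError("unknown record type")


# Toy data, invented for illustration only.
toy_sequence = SequenceRecord("toy-seq-1", "GIVEQ")
toy_structure = StructureRecord(
    "toy-struct-1",
    "GI",
    (Atom("CA", 0.0, 0.0, 0.0), Atom("CA", 3.8, 0.0, 0.0)),
)

print(database_kind(toy_sequence))
print(database_kind(toy_structure))
```

A sequence record needs only a string, whereas a structure record must also carry a coordinate triple for every atom, which is one reason the two kinds of data tend to be held in separate databases.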

References

  • Babu MM (1997) Biological databases and protein sequence analysis. https://www.mrc-lmb.cam.ac.uk/genomes/madanm/pdfs/biodbseq.pdf. Accessed 13 Sept 2019

  • Baxevanis AD, Bateman A (2018) The importance of biological databases in biological discovery. Curr Protoc Bioinform 50(1):1.1.1–1.1.8

  • Benson DA, Cavanaugh M, Clark K et al (2013) GenBank. Nucl Acids Res 41(Database issue):D36–D42

  • Berman H, Westbrook J, Feng Z, Gilliland G, Bhat T, Weissig H, Shindyalov IN, Zhuang P (2000) The protein data bank. Nucl Acids Res 28:235–242

  • Bourne P (2005) Will a biological database be different from a biological journal? PLoS Comput Biol 1:179–181

  • Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW (2016) GenBank. Nucl Acids Res 44:D67–D72

  • Cochrane G, Karsch-Mizrachi I, Takagi T (2016) The international nucleotide sequence database collaboration. Nucl Acids Res 44:D48–D50

  • EMBL-EBI (2020) Primary and secondary databases. https://www.ebi.ac.uk/training/online/course/bioinformatics-terrified-2018/primary-and-secondary-databases. Accessed 12 Dec 2019

  • Enago Academy (2019) Biological databases: an overview and future perspective. https://www.enago.com/academy/biological-databases-an-overview-and-future-perspectives/. Accessed 06 Dec 2019

  • GenBank (2020) GenBank Overview. https://www.ncbi.nlm.nih.gov/genbank/. Accessed 07 Nov 2019

  • Gibson R, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Goodgame N, Hoopen PT, Jayathilaka S, Kay S, Leinonen R et al (2016) Biocuration of functional annotation at the European nucleotide archive. Nucl Acids Res 44:D58–D66

  • Henneges C, Hinselmann G, Jung S, Madlung J, Schutz W et al (2009) Ranking methods for the prediction of frequent top scoring peptides from proteomics data. J Proteomics Bioinform 2:226–235

  • Holzinger A, Jurisica I (2014) Knowledge discovery and data mining in biomedical informatics: the future is in integrative, interactive machine learning solutions. Lecture notes in computer science, pp 1–18

  • Karthick RNS, Muthukumaran J (2008) Prediction of three dimensional model and active site analysis of inducible serine protease inhibitor-2 (ISPI-2), Galleria mellonella. J Comput Sci Syst Biol 1:119–125

  • Kodama Y, Shumway M, Leinonen R (2012) International nucleotide sequence database collaboration. The sequence read archive: explosive growth of sequencing data. Nucl Acids Res 40(Database issue):D54–D56

  • Kulikova T, Aldebert P, Althorpe N, Baker W, Bates K, Browne P, Broek A, Cochrane G, Duggan K, Eberhardt R, Faruque N, García-Pastor MP, Harte N, Kanz C, Leinonen R, Lin Q, Lombard V, Lopez R, Mancuso R, Apweiler R (2004) The EMBL nucleotide sequence database. Nucl Acids Res 32:D27–D30. https://doi.org/10.1093/nar/gkh120

  • Liu Z, Liu Y, Liu S, Ding X, Yang Y et al (2009) Analysis of the sequence of ITS1-5.8S-ITS2 regions of the three species of fructus Evodiae in Guizhou Province of China and identification of main ingredients of their medicinal chemistry. J Comput Sci Syst Biol 2:200–207

  • Manach C (2016) Metabolomics databases. In: Max Rubner conference, 10–12 October 2016. Max Rubner-Institut, Karlsruhe, Germany

  • Mashima J, Kodama Y, Fujisawa T, Katayama T, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T (2017) DNA data bank of Japan. Nucl Acids Res 45(D1):D25–D31

  • Moreau Y, Tranchevent L-C (2012) Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nat Rev Genet 13:523–536

  • Oswaldo Cruz Institute (2001) Characteristics of biological data. http://www.dbbm.fiocruz.br/class/Lecture/d17/db_overview/characteristics_of_biological_data.htm. Accessed 01 Nov 2019

  • Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O’Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, DiCuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM (2014) RefSeq: an update on mammalian reference sequences. Nucl Acids Res 42(D1):D756–D763

  • Riad AM, Hassan AE, Hassan QF (2009) Investigating performance of XML web services in real-time business systems. J Comput Sci Syst Biol 2:266–271

  • Shanthi V, Ramanathan K, Sethumadhavan R (2009) Role of the cation-π interaction in therapeutic proteins: a comparative study with conventional stabilizing forces. J Comput Sci Syst Biol 2:051–068

  • Singh S, Gupta SK, Nischal A, Khattri S, Nath R et al (2010) Comparative modeling study of the 3-D structure of small delta antigen protein of hepatitis delta virus. J Comput Sci Syst Biol 3:001–004

  • Toomula S, Kumar A, Kumar DS, Bheemidi VS (2011) Biological databases-integration of life science data. J Comput Sci Syst Biol 4(5):088–092

  • The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucl Acids Res 45(D1):D158–D169

  • UniProt Consortium (2020) The universal protein resource – UniProt. [Flyer obtained online]. Accessed 06 Mar 2019

  • Varsale AR, Wadnerkar AS, Mandage RH, Jadhavrao PK (2010) Cheminformatics. J Proteomics Bioinform 3:253–259

  • Wooley JC, Lin HS (eds) (2005) Catalyzing inquiry at the interface of computing and biology. National Research Council (US) Committee on Frontiers at the Interface of Computing and Biology. National Academies Press, Washington (DC)

  • wwPDB (2020) Worldwide Protein Data Bank (PDB). http://www.wwpdb.org/. Accessed 23 Aug 2019

  • Yadav G, Mohanty D (2017) Databases developed in India for biological sciences. J Proteins Proteomics 8(3):159–167

  • Zou D, Ma L, Yu J, Zhang Z (2015) Biological databases for human research. Genomics Proteomics Bioinform 13(1):55–63

Author information

Corresponding author

Correspondence to Konstantinos Domdouzis.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Domdouzis, K., Lake, P., Crowther, P. (2021). Biological Databases. In: Concise Guide to Databases. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-42224-0_18

  • DOI: https://doi.org/10.1007/978-3-030-42224-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-42223-3

  • Online ISBN: 978-3-030-42224-0

  • eBook Packages: Computer Science, Computer Science (R0)
