Biological Databases

Domdouzis, Konstantinos; Lake, Peter; Crowther, Paul

doi:10.1007/978-3-030-42224-0_18

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

Bioinformatics is the application of computing to the storage and analysis of vast amounts of biological data. These data are available as sequences and as protein and nucleic acid structures. A sequence is represented in a single dimension, while a structure holds the three-dimensional data of a sequence. A biological database organises its data so that they can be easily accessed and analysed. Biological databases can be classified into sequence databases and structure databases. Sequence databases apply to both protein and nucleic acid sequences, while structure databases apply only to proteins. The first database was developed after the insulin protein sequence was made available in 1956; insulin was the first protein to be sequenced. During the 1960s, the first nucleic acid sequence, that of yeast tRNA, was determined. Three-dimensional protein structures followed, and the Protein Data Bank was established with only 10 entries. It has since grown into a large database with over 10,000 entries. In 1986, the SWISS-PROT protein sequence database was developed; it holds about 70,000 protein sequences covering more than 5,000 model organisms (Babu 1997).
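
The distinction between one-dimensional sequence data and three-dimensional structure data can be illustrated with a minimal Python sketch. The class names, field names, accession strings, and all values below are hypothetical and chosen only for illustration; they do not reflect the schema of any real database mentioned in the chapter.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SequenceRecord:
    """One-dimensional data: an ordered string of residue letters."""
    accession: str  # hypothetical identifier, not a real database entry
    molecule: str   # "protein" or "nucleic acid"
    residues: str   # toy residue string


@dataclass
class StructureRecord:
    """Three-dimensional data: an (x, y, z) coordinate for each atom."""
    accession: str  # hypothetical identifier
    atoms: List[Tuple[str, float, float, float]]  # (atom name, x, y, z)

    def centroid(self) -> Tuple[float, float, float]:
        """Return the mean position of all atoms."""
        n = len(self.atoms)
        return (
            sum(atom[1] for atom in self.atoms) / n,
            sum(atom[2] for atom in self.atoms) / n,
            sum(atom[3] for atom in self.atoms) / n,
        )


# Toy records: a five-residue sequence and a three-atom structure.
seq = SequenceRecord(accession="SEQ0001", molecule="protein", residues="MKTAY")
structure = StructureRecord(
    accession="STR0001",
    atoms=[
        ("N", 0.0, 0.0, 0.0),
        ("CA", 1.5, 0.0, 0.0),
        ("C", 1.5, 1.5, 3.0),
    ],
)

print(len(seq.residues))     # position along the sequence is the only axis
print(structure.centroid())  # spatial queries need all three coordinates
```

A sequence record supports questions about order and position along a single axis, whereas a structure record supports spatial questions such as the centroid computed here.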

References

Babu MM (1997) Biological databases and protein sequence analysis. https://www.mrc-lmb.cam.ac.uk/genomes/madanm/pdfs/biodbseq.pdf. Accessed 13 Sept 2019
Baxevanis AD, Bateman A (2018) The importance of biological databases in biological discovery. Curr Protoc Bioinform 50(1):1.1.1–1.1.8
Benson DA, Cavanaugh M, Clark K et al (2013) GenBank. Nucl Acids Res 41(Database issue):D36–D42
Berman H, Westbrook J, Feng Z, Gilliland G, Bhat T, Weissig H, Shindyalov IN, Zhuang P (2000) The protein data bank. Nucl Acids Res 28:235–242
Bourne P (2005) Will a biological database be different from a biological journal? PLoS Comput Biol 1:179–181
Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW (2016) GenBank. Nucl Acids Res 44:D67–D72
Cochrane G, Karsch-Mizrachi I, Takagi T (2016) The international nucleotide sequence database collaboration. Nucl Acids Res 44:D48–D50
EMBL-EBI (2020) Primary and secondary databases. https://www.ebi.ac.uk/training/online/course/bioinformatics-terrified-2018/primary-and-secondary-databases. Accessed 12 Dec 2019
Enago Academy (2019) Biological databases: an overview and future perspective. https://www.enago.com/academy/biological-databases-an-overview-and-future-perspectives/. Accessed 06 Dec 2019
GenBank (2020) GenBank Overview. https://www.ncbi.nlm.nih.gov/genbank/. Accessed 07 Nov 2019
Gibson R, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Goodgame N, Hoopen PT, Jayathilaka S, Kay S, Leinonen R et al (2016) Biocuration of functional annotation at the European nucleotide archive. Nucl Acids Res 44:D58–D66
Henneges C, Hinselmann G, Jung S, Madlung J, Schutz W et al (2009) Ranking methods for the prediction of frequent top scoring peptides from proteomics data. J Proteomics Bioinform 2:226–235
Holzinger A, Jurisica I (2014) Knowledge discovery and data mining in biomedical informatics: the future is in integrative, interactive machine learning solutions. Lecture notes in computer science, pp 1–18
Karthick RNS, Muthukumaran J (2008) ‘Prediction of three dimensional model and active site analysis of inducible serine protease inhibitor -2 (ISPI -2)’, Galleria Mellonella. J Comput Sci Syst Biol 1:119–125
Kodama Y, Shumway M, Leinonen R (2012) International nucleotide sequence database collaboration. The sequence read archive: explosive growth of sequencing data. Nucl Acids Res 40(Database issue):D54–D56
Kulikova T, Aldebert P, Althorpe N, Baker W, Bates K, Browne P, Broek A, Cochrane G, Duggan K, Eberhardt R, Faruque N, García-Pastor MP, Harte N, Kanz C, Leinonen R, Lin Q, Lombard V, Lopez R, Mancuso R, Apweiler R (2004) The EMBL nucleotide sequence database. Nucl Acids Res 32:D27–D30. https://doi.org/10.1093/nar/gkh120
Liu Z, Liu Y, Liu S, Ding X, Yang Y et al (2009) Analysis of the sequence of ITS1-5.8S-ITS2 regions of the three species of fructus Evodiae in Guizhou Province of China and identification of main ingredients of their medicinal chemistry. J Comput Sci Syst Biol 2:200–207
Manach C (2016) Metabolomics databases. In: Max Rubner conference, 10–12 October 2016. Max Rubner-Institut, Karlsruhe, Germany
Mashima J, Kodama Y, Fujisawa T, Katayama T, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T (2017) DNA data bank of Japan. Nucl Acids Res 45(D1):D25–D31
Moreau Y, Tranchevent L-C (2012) Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nat Rev Genet 13:523–536
Oswaldo Cruz Institute (2001) Characteristics of biological data. http://www.dbbm.fiocruz.br/class/Lecture/d17/db_overview/characteristics_of_biological_data.htm. Accessed 01 Nov 2019
Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O’Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, DiCuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM (2014) RefSeq: an update on mammalian reference sequences. Nucl Acids Res 42(D1):D756–D763
Riad AM, Hassan AE, Hassan QF (2009) Investigating performance of XML web services in real-time business systems. J Comput Sci Syst Biol 2:266–271
Shanthi V, Ramanathan K, Sethumadhavan R (2009) Role of the cation-π interaction in therapeutic proteins: a comparative study with conventional stabilizing forces. J Comput Sci Syst Biol 2:051–068
Singh S, Gupta SK, Nischal A, Khattri S, Nath R et al (2010) Comparative modeling study of the 3-D structure of small delta antigen protein of hepatitis delta virus. J Comput Sci Syst Biol 3:001–004
Toomula S, Kumar A, Kumar DS, Bheemidi VS (2011) Biological databases-integration of life science data. J Comput Sci Syst Biol 4(5):088–092
The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucl Acids Res 45(D1):D158–D169
UniProt Consortium (2020) The universal protein resource – UniProt. [Flyer obtained online]. Accessed 06 Mar 2019
Varsale AR, Wadnerkar AS, Mandage RH, Jadhavrao PK (2010) Cheminformatics. J Proteomics Bioinform 3:253–259
Wooley JC, Lin HS (eds) (2005) Catalyzing inquiry at the interface of computing and biology. National Research Council (US) Committee on Frontiers at the Interface of Computing and Biology. National Academies Press, Washington (DC)
wwPDB (2020) Worldwide Protein Data Bank (PDB). http://www.wwpdb.org/. Accessed 23 Aug 2019
Yadav G, Mohanty D (2017) Databases developed in India for biological sciences. J Proteins Proteomics 8(3):159–167
Zou D, Ma L, Yu J, Zhang Z (2015) Biological databases for human research. Genomics Proteomics Bioinform 13(1):55–63

Author information

Authors and Affiliations

Sheffield Hallam University, Sheffield, UK
Konstantinos Domdouzis
Sheffield Hallam University, Sheffield, UK
Peter Lake
Sheffield Hallam University, Sheffield, UK
Paul Crowther

Corresponding author

Correspondence to Konstantinos Domdouzis.

Cite this chapter

Domdouzis, K., Lake, P., Crowther, P. (2021). Biological Databases. In: Concise Guide to Databases. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-42224-0_18

DOI: https://doi.org/10.1007/978-3-030-42224-0_18
Published: 21 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-42223-3
Online ISBN: 978-3-030-42224-0
eBook Packages: Computer Science, Computer Science (R0)