Abstract
Bio-informatics is an interdisciplinary science that addresses problems in biology and health care by combining tools from disciplines such as computer science and statistics for the storage, retrieval and processing of biological data. It can provide inputs to diverse sectors such as medicine, health, food and agriculture.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Prabhu, C., Chivukula, A., Mogadala, A., Ghosh, R., Livingston, L. (2019). Big Data Analytics in Bio-informatics. In: Big Data Analytics: Systems, Algorithms, Applications. Springer, Singapore. https://doi.org/10.1007/978-981-15-0094-7_13
DOI: https://doi.org/10.1007/978-981-15-0094-7_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0093-0
Online ISBN: 978-981-15-0094-7
eBook Packages: Computer Science (R0)