Big Data Analytics in Bio-informatics

Abstract

Bio-informatics is an interdisciplinary science that provides solutions in biology and health care by combining tools from disciplines such as computer science and statistics for the storage, retrieval and processing of biological data. It can provide inputs to diverse sectors such as medicine, health, food and agriculture.

Author information

Corresponding author

Correspondence to C.S.R. Prabhu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Prabhu, C., Chivukula, A., Mogadala, A., Ghosh, R., Livingston, L. (2019). Big Data Analytics in Bio-informatics. In: Big Data Analytics: Systems, Algorithms, Applications. Springer, Singapore. https://doi.org/10.1007/978-981-15-0094-7_13

  • DOI: https://doi.org/10.1007/978-981-15-0094-7_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0093-0

  • Online ISBN: 978-981-15-0094-7

  • eBook Packages: Computer Science, Computer Science (R0)
