Role of Data Science in the Field of Genomics and Basic Analysis of Raw Genomic Data Using Python

  • Conference paper
Data Science and Security

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 290)

Abstract

The application of genomics to identifying the nature and cause of diseases has increased markedly in this decade. Combined with new technologies, this field of the life sciences has produced an explosion of genomic sequence data. Appropriate analysis of such large volumes of data enables accurate prediction of disease, which supports preventive measures that can ultimately improve quality of life. Achieving this requires efficient, comprehensive analysis tools and storage mechanisms for handling the enormous volume of genomic data. This work gives an insight into the application of data science in genomics, with a demonstration using Python.

Corresponding author

Correspondence to S. Karthikeyan.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Karthikeyan, S., Jose, D.V. (2021). Role of Data Science in the Field of Genomics and Basic Analysis of Raw Genomic Data Using Python. In: Shukla, S., Unal, A., Kureethara, J.V., Mishra, D.K., Han, D.S. (eds) Data Science and Security. Lecture Notes in Networks and Systems, vol 290. Springer, Singapore. https://doi.org/10.1007/978-981-16-4486-3_19
