Role of Bioinformatics in Molecular Medicine

  • Chapter
Genomic Medicine

Abstract

The increasing clinical adoption of next-generation sequencing (NGS) has led to the widespread application of bioinformatics approaches to data analysis. Most laboratories have adopted custom implementations of the individual analysis steps, which are tied together into what is referred to as a bioinformatics workflow or pipeline. This chapter discusses the high-level approaches that have become standard in clinical bioinformatics, including details on file structure, a conceptual view of the algorithmic processes, and regulatory considerations. We also discuss some of the tools that are becoming more commonly implemented and are likely to become standard components in many laboratories.

Author information

Correspondence to Chad Vanderbilt.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Vanderbilt, C., Middha, S. (2020). Role of Bioinformatics in Molecular Medicine. In: Tafe, L., Arcila, M. (eds) Genomic Medicine. Springer, Cham. https://doi.org/10.1007/978-3-030-22922-1_4

  • DOI: https://doi.org/10.1007/978-3-030-22922-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22921-4

  • Online ISBN: 978-3-030-22922-1

  • eBook Packages: Medicine, Medicine (R0)
