Role of Bioinformatics in Molecular Medicine

Vanderbilt, Chad; Middha, Sumit

doi:10.1007/978-3-030-22922-1_4

Abstract

The increased clinical adoption of next-generation sequencing (NGS) has led to widespread application of bioinformatics approaches for data analytics. Laboratories have largely adopted custom implementations of the various analysis steps, which are tied together and referred to as a bioinformatics workflow or pipeline. This chapter discusses the high-level approaches that have become the standard for clinical bioinformatics, including details on the file structure, conceptualizing the algorithmic processes, and regulatory considerations. We also discuss some of the tools that are becoming more commonly implemented and are likely to become standard components in many laboratories.

Author information

Authors and Affiliations

Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Chad Vanderbilt & Sumit Middha

Corresponding author

Correspondence to Chad Vanderbilt.

Editor information

Editors and Affiliations

Dartmouth–Hitchcock Medical Center, Lebanon, NH, USA
Laura J. Tafe
Memorial Sloan Kettering Cancer Center, New York, NY, USA
Maria E. Arcila

About this chapter

Cite this chapter

Vanderbilt, C., Middha, S. (2020). Role of Bioinformatics in Molecular Medicine. In: Tafe, L., Arcila, M. (eds) Genomic Medicine. Springer, Cham. https://doi.org/10.1007/978-3-030-22922-1_4

DOI: https://doi.org/10.1007/978-3-030-22922-1_4
Published: 27 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22921-4
Online ISBN: 978-3-030-22922-1
eBook Packages: Medicine, Medicine (R0)
