Abstract
Bioinformatics requirements in the clinical environment are very specific: analytic techniques need to be fit for purpose, robust, and predictable. At the same time, the large volume of information produced during these analyses needs to be carefully managed and correctly used and interpreted. The challenge for clinical laboratories now is to implement production analytical processes that can handle different experimental approaches on current equipment, and to build in ways for these systems to evolve as developments likely to have an impact in the near future arrive. This is complicated by the many options available at each of the critical processing steps, so a clear method is needed for assembling appropriate pipelines. Here, I discuss the issues relevant to developing an informatics pipeline that meets these criteria, which should allow individual laboratories to assess their proposed strategies.
Abbreviations
- BAM: Binary version of SAM file
- CNV: Copy number variant
- NGS: Next-generation sequencing
- Q: Quality score
- QC: Quality control
- SNV: Single nucleotide variant
- SSV: Variant instance
- VCF: Variant call format
- WES: Whole-exome sequencing
- WGS: Whole-genome sequencing
Acknowledgments
I would like to acknowledge the long-standing and ongoing support of Mr Neill Hodgen and the Department of Clinical Immunology, Royal Perth Hospital.
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Allcock, R.J.N. (2014). Production and Analytic Bioinformatics for Next-Generation DNA Sequencing. In: Trent, R. (eds) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_2
DOI: https://doi.org/10.1007/978-1-4939-0847-9_2
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-0846-2
Online ISBN: 978-1-4939-0847-9
eBook Packages: Springer Protocols