Genomic Revolution-Driven Cancer Research
In the genomic era, the continuous development of cutting-edge technologies has greatly increased the generation of genomic data and markedly transformed bioinformatics research. This wealth of genomic information has in turn driven advances in high-capacity digital storage and computational efficiency. These developments, together with the vast amount of genomic data available in public databases, support the application of genomics to healthcare research, particularly the clinical diagnosis and treatment of human disease. It is therefore essential for biomedical students and researchers to acquire up-to-date knowledge of the wide range of computational genomic tools and analysis methods involved in disease gene discovery. In this chapter, we provide an overview of the computational (downstream) analysis of next-generation sequencing (NGS) data and some guidance on its applications in clinical research, using head and neck squamous cell carcinoma as an example. Overall, our aim is to give an accessible entry point to the analysis of NGS data for the identification of potential disease risk variants.
Keywords: Next-generation sequencing; Whole exome sequences; Bioinformatics tools; Head and neck squamous cell carcinoma; Variants; Clinical applications
BAM: Binary Alignment Map, a compressed binary format for storing large nucleotide sequence alignments.
FASTQ: A text-based format for storing both a DNA sequence and its corresponding quality scores.
Paired-end sequencing: A sequencing procedure in which both ends of the DNA fragments in a library are sequenced and the forward and reverse reads are aligned as read pairs.
Base calling converts the raw sequencing signals into actual sequence data, with a quality score assigned to each base.
Read: The WGS or WES procedure involves shearing DNA into hundreds of thousands of small fragments; the sequence obtained from each fragment is called a “read.”
Read depth: The average number of times that a given nucleotide in the genome has been read in a sequencing experiment. For instance, a 40× read depth means that each base is covered by 40 reads on average.
SAM: Sequence Alignment Map, a generic text-based format for storing large nucleotide sequence alignments.
Single-end sequencing: A sequencing procedure in which DNA fragments are sequenced from only one end.
A text-based, tab-delimited file format.
VCF: Variant Call Format, a text file format containing meta-information lines, a header line, and then data lines, each containing information about a position in the genome.
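The glossary definitions above can be made concrete with a short Python sketch. It is illustrative only and is not taken from the chapter: the function names, the example read, and the example variant line are all hypothetical. It also assumes that quality strings use the common Phred+33 encoding and that every sequenced base maps to the target region.

```python
def phred33_to_scores(quality: str) -> list[int]:
    """Decode a FASTQ quality string (assumption: Phred+33 encoding)."""
    return [ord(ch) - 33 for ch in quality]


def parse_fastq_record(lines: list[str]) -> dict:
    """Split one four-line FASTQ record into ID, sequence, and quality scores."""
    header, sequence, _separator, quality = lines
    return {
        "id": header[1:],  # drop the leading '@'
        "sequence": sequence,
        "scores": phred33_to_scores(quality),
    }


def mean_read_depth(read_lengths: list[int], target_length: int) -> float:
    """Average depth = total sequenced bases / length of the target region.

    Simplifying assumption: every base of every read maps to the target.
    """
    return sum(read_lengths) / target_length


def parse_vcf_line(line: str) -> dict:
    """Read the eight fixed, tab-separated columns of one VCF data line."""
    chrom, pos, var_id, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),  # several alternate alleles are comma-separated
        "qual": qual,
        "filter": flt,
        "info": info,
    }


if __name__ == "__main__":
    # Hypothetical read: six high-quality bases followed by one poor base.
    record = parse_fastq_record(["@read1", "GATTACA", "+", "IIIIII#"])
    print(record["scores"])

    # 400 reads of 100 bases over a 1,000-base target give a 40x mean depth.
    print(mean_read_depth([100] * 400, 1000))

    # Hypothetical variant line with the eight fixed VCF columns.
    print(parse_vcf_line("chr1\t12345\t.\tG\tA\t50\tPASS\tDP=40"))
```

In practice, established libraries and tools would be used to read these formats; the sketch is only meant to show what each file type stores.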