Abstract
Next-generation sequencing (NGS) refers to non-Sanger-based DNA sequencing methods that have replaced conventional sequencing methods. These methods have been widely used to analyze complete genomes (whole genome sequencing), the coding exons of already reported genes (whole exome sequencing), and the coding regions of selected genes only (targeted panels). In this chapter, we introduce NGS technology and outline its different types and applications. As advances in NGS data analysis have opened up new diagnostic and therapeutic opportunities, complementary approaches such as the machine learning algorithms used in NGS are briefly discussed at the end.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Prasad, A. et al. (2021). Next Generation Sequencing. In: Singh, V., Kumar, A. (eds) Advances in Bioinformatics. Springer, Singapore. https://doi.org/10.1007/978-981-33-6191-1_14
DOI: https://doi.org/10.1007/978-981-33-6191-1_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-6190-4
Online ISBN: 978-981-33-6191-1
eBook Packages: Computer Science, Computer Science (R0)