An Introduction to the Computational Challenges in Next Generation Sequencing

  • Zoltan Szallasi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 822)


During the last decade, next generation sequencing has become one of the research areas that pose the most significant challenges, both in big data handling and in algorithmic problems.

In this review we discuss these challenges, with particular emphasis on the issues where scientific innovation will be essential for progress.


Keywords: Next generation sequencing, Big data, Bioinformatics



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Computational Health Informatics Program, Boston Children’s Hospital, Harvard Medical School, Boston, USA
