Long Single-Molecule Reads Can Resolve the Complexity of the Influenza Virus Composed of Rare, Closely Related Mutant Variants

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2016)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9649)

Abstract

As a result of a high rate of mutations and recombination events, an RNA virus exists as a heterogeneous “swarm” of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate of these technologies limits the ability to reconstruct a heterogeneous viral population composed of rare, closely related mutant variants. In this paper, we present 2SNV, a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct a clone with a frequency of 0.2% and to distinguish clones that differ in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction. The open source implementation of 2SNV is freely available for download at http://alan.cs.gsu.edu/NGS/?q=content/2snv.
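The linkage idea in the abstract can be illustrated with a small sketch. It is not the 2SNV implementation, and the statistic is not taken from the paper: it assumes reads are already aligned to a consensus of equal length, and it uses a plain binomial tail test (a stand-in chosen for illustration) to ask whether two non-consensus alleles co-occur in the same reads more often than independent sequencing errors would. The names `binomial_tail`, `linked_pairs` and the threshold `alpha` are hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p). Intended for small n only."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def linked_pairs(reads, consensus, alpha=1e-3):
    """Find position pairs whose non-consensus alleles co-occur unexpectedly often.

    reads     -- aligned reads, each the same length as `consensus` (assumption)
    consensus -- consensus sequence of the read set
    alpha     -- illustrative significance threshold (assumption, not from the paper)
    """
    n = len(reads)
    # For every position, collect the indices of reads that differ from the consensus.
    minor = [
        {r for r, read in enumerate(reads) if read[pos] != consensus[pos]}
        for pos in range(len(consensus))
    ]
    hits = []
    for i in range(len(consensus)):
        for j in range(i + 1, len(consensus)):
            both = len(minor[i] & minor[j])
            if both == 0:
                continue
            # Under independent errors, a read carries both mismatches
            # with probability equal to the product of the two frequencies.
            p_indep = (len(minor[i]) / n) * (len(minor[j]) / n)
            if binomial_tail(n, both, p_indep) < alpha:
                hits.append((i, j, both))
    return hits


if __name__ == "__main__":
    # Toy data: 10 reads carry a linked variant pair, 6 reads carry an isolated mismatch.
    reads = ["AAAAAAAA"] * 85 + ["AAATAAAA"] * 5 + ["ACAAAAGA"] * 9 + ["ACATAAGA"]
    print(linked_pairs(reads, "AAAAAAAA"))
```

In the toy data, the two mismatches that always appear together are reported as a linked pair, while the isolated mismatch, which overlaps a variant read only once, is consistent with independent errors and is ignored.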

A. Artyomenko, N.C. Wu and S. Mangul: equal contributors.

Acknowledgments

We would like to thank H. Hao for performing the PacBio sequencing at the Johns Hopkins Deep Sequencing & Microarray Core Facility. A.A. was supported by the GSU Molecular Basis of Disease Fellowship. S.M. and E.E. were supported by National Science Foundation grants 0513612, 0731455, 0729049, 0916676, 1065276, 1302448 and 1320589, and National Institutes of Health grants K25-HL080079, U01-DA024417, P01-HL30568, P01-HL28481, R01-GM083198, R01-MH101782 and R01-ES022282. S.M. was supported in part by the Institute for Quantitative & Computational Biosciences Fellowship, UCLA.

Author information

Corresponding author

Correspondence to Alexander Artyomenko.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 813 KB)

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Artyomenko, A., Wu, N.C., Mangul, S., Eskin, E., Sun, R., Zelikovsky, A. (2016). Long Single-Molecule Reads Can Resolve the Complexity of the Influenza Virus Composed of Rare, Closely Related Mutant Variants. In: Singh, M. (ed.) Research in Computational Molecular Biology. RECOMB 2016. Lecture Notes in Computer Science, vol 9649. Springer, Cham. https://doi.org/10.1007/978-3-319-31957-5_12

  • DOI: https://doi.org/10.1007/978-3-319-31957-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31956-8

  • Online ISBN: 978-3-319-31957-5

  • eBook Packages: Computer Science, Computer Science (R0)
