Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8126)

Abstract

To overcome the limitations that DNA barcoding imposes when multiplexing a large number of samples on the current generation of high-throughput sequencing instruments, we recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing) [9]. We also demonstrated how this protocol enables de novo selective sequencing and assembly of large, highly repetitive genomes. Here we address the problem of decoding pooled sequenced data obtained from such a protocol. Our algorithm combines ideas from compressed sensing and the decoding of error-correcting codes. Experimental results on synthetic data for the rice genome and real data for the barley genome show that our decoding algorithm enables significantly higher-quality assemblies than the previous approach.
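As a rough illustration of what "decoding pooled data" means, the sketch below implements a naive group-testing decoder on a toy design. It is not the algorithm of the paper, which combines compressed sensing with error-correcting-code decoding; the function names `build_design` and `decode`, the pool counts, and the sample indices are all assumptions made for this example. Each sample is assigned a distinct set of pools (its signature), and a read observed in a set of pools is attributed to every sample whose signature is fully contained in that set.

```python
from itertools import combinations


def build_design(num_pools, pools_per_sample):
    """Give each sample a distinct signature: the set of pools it is placed in.

    Hypothetical toy design: one sample per subset of pools of the given size.
    """
    return [frozenset(c) for c in combinations(range(num_pools), pools_per_sample)]


def decode(observed_pools, design):
    """Naive decoder: keep every sample whose whole signature was observed."""
    observed = set(observed_pools)
    return [i for i, signature in enumerate(design) if signature <= observed]


# Toy design: 4 pools, each sample in 2 pools, giving 6 distinct samples.
design = build_design(num_pools=4, pools_per_sample=2)

# A read seen only in the pools of sample 3 decodes unambiguously.
single = decode(design[3], design)

# A read shared by samples 0 and 3 lights up the union of their pools,
# and the naive decoder then also reports sample 1 as a false positive.
shared = decode(design[0] | design[3], design)

print(single, shared)
```

The second case shows why naive decoding degrades when a read belongs to several samples or when pool observations are noisy, which is the setting the abstract targets with a more robust decoder.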

References

  1. Alon, S., Vigneault, F., Eminaga, S., et al.: Barcoding bias in high-throughput multiplex sequencing of miRNA. Genome Research 21(9), 1506–1511 (2011)

  2. Amir, A., Zuk, O.: Bacterial community reconstruction using compressed sensing. In: Bafna, V., Sahinalp, S.C. (eds.) RECOMB 2011. LNCS, vol. 6577, pp. 1–15. Springer, Heidelberg (2011)

  3. Earl, D., et al.: Assemblathon 1: A competitive assessment of de novo short read assembly methods. Genome Research 21(12), 2224–2241 (2011)

  4. Engler, F.W., Hatfield, J., Nelson, W., Soderlund, C.A.: Locating sequence on FPC maps and selecting a minimal tiling path. Genome Research 13(9), 2152–2163 (2003)

  5. Erlich, Y., Chang, K., Gordon, A., et al.: DNA sudoku - harnessing high-throughput sequencing for multiplexed specimen analysis. Genome Research 19(7), 1243–1253 (2009)

  6. Erlich, Y., Gordon, A., Brand, M., et al.: Compressed genotyping. IEEE Transactions on Information Theory 56(2), 706–723 (2010)

  7. Hajirasouliha, I., Hormozdiari, F., Sahinalp, S.C., Birol, I.: Optimal pooling for genome re-sequencing with ultra-high-throughput short-read technologies. Bioinformatics 24(13), i32–i40 (2008)

  8. Langmead, B., Trapnell, C., Pop, M., Salzberg, S.L.: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10(3), R25 (2009)

  9. Lonardi, S., Duma, D., Alpert, M., et al.: Combinatorial pooling enables selective sequencing of the barley gene space. PLoS Comput. Biol. 9(4), e1003010 (2013)

  10. Ngo, H.Q., Porat, E., Rudra, A.: Efficiently decodable compressed sensing by list-recoverable codes and recursion. In: STACS, pp. 230–241 (2012)

  11. Prabhu, S., Pe’er, I.: Overlapping pools for high-throughput targeted resequencing. Genome Research 19(7), 1254–1261 (2009)

  12. Shental, N., Amir, A., Zuk, O.: Identification of rare alleles and their carriers using compressed se(que)nsing. Nucleic Acids Research 38(19), e179 (2010)

  13. Simpson, J.T., Durbin, R.: Efficient de novo assembly of large genomes using compressed data structures. Genome Research 22(3), 549–556 (2012)

  14. The International Barley Genome Sequencing Consortium: A physical, genetic and functional sequence assembly of the barley genome. Nature 491, 711–716 (2012)

  15. Thierry-Mieg, N.: A new pooling strategy for high-throughput screening: the shifted transversal design. BMC Bioinformatics 7, 28 (2006)

  16. Tropp, J.A., Gilbert, A.C.: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inform. Theory 53, 4655–4666 (2007)

  17. Tropp, J.A., Gilbert, A.C., Strauss, M.J.: Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Process. 86(3), 572–588 (2006)

  18. Zerbino, D., Birney, E.: Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research 18(5), 821–829 (2008)


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Duma, D. et al. (2013). Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science, vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_7

  • DOI: https://doi.org/10.1007/978-3-642-40453-5_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40452-8

  • Online ISBN: 978-3-642-40453-5

  • eBook Packages: Computer Science, Computer Science (R0)
