Skip to main content

Processing Repetitive Sequence Structures with Mismatches at Streaming Rate

  • Conference paper
Field Programmable Logic and Application (FPL 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3203))

Included in the following conference series:

Abstract

With the accelerating growth of biological databases and the beginning of genome-scale processing, cost-effective high-performance sequence analysis remains an essential problem in bioinformatics. We examine the use of FPGAs for finding repetitive structures such as tandem repeats and palindromes under various mismatch models. For all problems addressed here, we process strings in streaming mode and obtain processing times of 5ns per character for arbitrary length strings. Using a Xilinx XC2VP100, we can find: (i) all repeats up to size 1024, each with any number of mismatches; (ii) all precise tandem arrays with repeats up to size 1024; (iii) all palindromes up to size 256, each with any number of mismatches, or (iv) a somewhat smaller size of (i) and (iii) with a single insertion or deletion. The speed-up factors range from 250 to 6000 over an efficient serial implementation which is itself many times faster than a direct implementation of a theoretically optimal serial algorithm.

This work was supported in part by the NSF through award 9702483 and the NIH through award RR020209-01.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 74.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Schuler, G.: Sequence alignment and database searching. In: Baxevanis, A., Ouellette, B. (eds.) Bioinformatics: A Practical Guide, Wiley, Chichester (2001)

    Google Scholar 

  2. Benson, G.: Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Research 27, 573–580 (1999)

    Article  Google Scholar 

  3. Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)

    Book  MATH  Google Scholar 

  4. Landau, G., Schmidt, J., Sokol, D.: An algorithm for approximate tandem repeats. J. Computational Biology 8, 1–18 (2001)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Conti, A.A., Van Court, T., Herbordt, M.C. (2004). Processing Repetitive Sequence Structures with Mismatches at Streaming Rate. In: Becker, J., Platzner, M., Vernalde, S. (eds) Field Programmable Logic and Application. FPL 2004. Lecture Notes in Computer Science, vol 3203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30117-2_131

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-30117-2_131

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22989-6

  • Online ISBN: 978-3-540-30117-2

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics