A Reconfigurable Index FLASH Memory tailored to Seed-Based Genomic Sequence Comparison Algorithms

  • D. Lavenier
  • G. Georges
  • X. Liu

Abstract

Genomic sequence comparison algorithms are the basic toolbox for processing large volumes of DNA or protein sequences. They are used both in systematic scans of databases, mostly to detect similarities with an unknown sequence, and in preliminary processing before more advanced bioinformatics analysis. Because genomic data is growing exponentially, new solutions are required to keep computation time reasonable. This paper presents a dedicated hardware architecture to speed up seed-based algorithms, which are currently the most popular heuristics for detecting alignments. The architecture combines FLASH and FPGA technologies on a common platform, allowing a large amount of data to be accessed rapidly and processed quickly. Experiments on database search and intensive sequence comparison demonstrate a good cost/performance ratio compared to standard approaches.

Keywords

bioinformatics, genomics, sequence comparison, reconfigurable architecture, FLASH memory, index, indexing, seed-based algorithm

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. IRISA/CNRS, Rennes, France
  2. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, CAS, Beijing, China
