S.F. Altschul, T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller and D.J. Lipman, “Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs,” Nucleic Acids Res., vol. 25, 1997, pp. 3389–3402.
B. Bloom, “Space/Time Trade-Offs in Hash Coding with Allowable Errors,” Commun. ACM, vol. 13, no. 7, 1970, pp. 422–426.
J. Buhler, “Mercury BLAST Dictionaries: Analysis and Performance Measurement,” Technical Report WUCSE-2007-13, Washington University in St. Louis, 2007.
J. Buhler, U. Keich and Y. Sun, “Designing Seeds for Similarity Search in Genomic DNA,” J. Comput. Syst. Sci., vol. 70, 2005, pp. 342–363.
L. Carter and M. Wegman, “Universal Classes of Hashing Functions,” J. Comput. Syst. Sci., vol. 18, 1979, pp. 143–154.
R. Chamberlain and R. Cytron, “Novel Techniques for Processing Unstructured Data Sets,” in Proc. of IEEE Aerospace Conf., Montana, March 2005.
R. Chamberlain and B. Shands, “Streaming Data from Disk Store to Application,” in Proc. of 3rd Int’l Workshop on Storage Network Architecture and Parallel I/Os, St. Louis, MO, September 2005, pp. 17–23.
R. Chamberlain, B. Shands and J. White, “Achieving Real Data Throughput for an FPGA Co-Processor on Commodity Server Platforms,” in Proc. of 1st Workshop on Building Block Engine Architectures for Computers and Networks, Boston, MA, October 2004.
R.D. Chamberlain, R.K. Cytron, M.A. Franklin and R.S. Indeck, “The Mercury System: Exploiting Truly Fast Hardware for Data Search,” in Proc. of Int’l Workshop on Storage Network Architecture and Parallel I/Os, September 2003, pp. 65–72.
Z.J. Czech, G. Havas and B.S. Majewski, “Perfect Hashing,” Theor. Comp. Sci., vol. 182, 1997, pp. 1–143.
W.J. Dally et al., “Merrimac: Supercomputing with Streams,” in Proc. of Supercomputing Conf., November 2003.
S. Dharmapurikar, P. Krishnamurthy, T. Sproull and J. Lockwood, “Deep Packet Inspection Using Parallel Bloom Filters,” IEEE Micro, vol. 24, no. 1, 2004, pp. 52–61.
R.K. Singh et al., “BioSCAN: A Dynamically Reconfigurable Systolic Array for Biosequence Analysis,” in Proc. CERCS 96, 1996.
M. Franklin, R. Chamberlain, M. Henrichs, B. Shands and J. White, “An Architecture for Fast Processing of Large Unstructured Data Sets,” in Proc. of the 22nd Int’l Conf. on Computer Design, October 2004, pp. 280–287.
T. Hagerup, P.B. Miltersen and R. Pagh, “Deterministic Dictionaries,” J. Algorithms, vol. 41, 2001, pp. 69–85.
J.D. Hirschberg, R. Hughey and K. Karplus, “Kestrel: A Programmable Array for Sequence Analysis,” in Proc. of IEEE International Conference on Application-Specific Systems, Architecture, and Processors, 1996, pp. 23–34.
D.T. Hoang, “Searching Genetic Databases on Splash 2,” in IEEE Workshop on FPGAs for Custom Computing Machines, 1993, pp. 185–191.
W.J. Kent, “BLAT: The BLAST-Like Alignment Tool,” Genome Res., vol. 12, 2002, pp. 656–664.
G. Knowles and P. Gardner-Stephen, “DASH: Localizing Dynamic Programming for Order of Magnitude Faster, Accurate Sequence Alignment,” in Proc. of the 3rd International IEEE Computer Society Computational Systems Bioinformatics Conference, 2004, pp. 732–735.
G. Knowles and P. Gardner-Stephen, “A New Hardware Architecture for Genomic and Proteomic Sequence Alignment,” in Proc. of IEEE Computational Systems Bioinformatics Conf., 2004.
J. Lancaster, J. Buhler and R.D. Chamberlain, “Acceleration of Ungapped Extension in Mercury BLAST,” in Proc. of the 7th Workshop on Media and Streaming Processors, November 2005.
D. Lavenier, S. Guytant, S. Derrien and S. Rubin, “A Reconfigurable Parallel Disk System for Filtering Genomic Banks,” in ERSA’03, Engineering of Reconfigurable Systems and Algorithms, 2003.
M. Li, B. Ma, D. Kisman and J. Tromp, “Patternhunter II: Highly Sensitive and Fast Homology Search,” J. Bioinform. Comput. Biol., vol. 2, 2004, pp. 417–439.
National Center for Biotechnology Information, “Growth of GenBank,” 2002, http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
Z. Ning, A.J. Cox and J.C. Mullikin, “SSAHA: A Fast Search Method for Large DNA Databases,” Genome Res., vol. 11, 2001, pp. 1725–1729.
N. Pappas, “Searching Biological Sequence Databases Using Distributed Adaptive Computing,” Master’s thesis, Virginia Polytechnic Institute and State University, 2003.
P.A. Pevzner and M.S. Waterman, “Multiple Filtration and Approximate Pattern Matching,” Algorithmica, vol. 13, no. 1/2, 1995, pp. 135–154.
M.V. Ramakrishna, E. Fu and E. Bahcekapili, “Efficient Hardware Hashing Functions for High Performance Computers,” IEEE Trans. Comput., vol. 46, 1997, pp. 1378–1381.
E. Riedel, C. Faloutsos, G. Gibson and D. Nagle, “Active Disks for Large-Scale Data Processing,” IEEE Comput., vol. 34, no. 6, June 2001, pp. 68–74.
T.F. Smith and M.S. Waterman, “Identification of Common Molecular Subsequences,” J. Mol. Biol., vol. 147, no. 1, March 1981, pp. 195–197.
R. Sprugnoli, “Perfect Hashing Functions: A Single Probe Retrieving Method for Static Sets,” Commun. ACM, vol. 20, no. 11, 1977, pp. 841–850.
R.E. Tarjan and A.C.C. Yao, “Storing a Sparse Table,” Commun. ACM, vol. 22, no. 11, 1979, pp. 606–611.
R.H. Waterston et al., “Initial Sequencing and Comparative Analysis of the Mouse Genome,” Nature, vol. 420, 2002, pp. 520–562.
B. West, R.D. Chamberlain, R.S. Indeck and Q. Zhang, “An FPGA-Based Search Engine for Unstructured Database,” in Proc. of 2nd Workshop on Application Specific Processors, December 2003, pp. 25–32.
Y. Yamaguchi, T. Maruyama and A. Konagaya, “High Speed Homology Search with FPGAs,” in Pacific Symposium on Biocomputing, 2002, pp. 271–282.
Q. Zhang, R.D. Chamberlain, R.S. Indeck, B. West and J. White, “Massively Parallel Data Mining Using Reconfigurable Hardware: Approximate String Matching,” in Proc. Workshop on Massively Parallel Processing, April 2004.
Z. Zhang, S. Schwartz, L. Wagner and W. Miller, “A Greedy Algorithm for Aligning DNA Sequences,” J. Comput. Biol., vol. 7, 2000, pp. 203–214.