MPI-HMMER-Boost: Distributed FPGA Acceleration

  • John Paul Walters
  • Xiandong Meng
  • Vipin Chaudhary
  • Tim Oliver
  • Leow Yuan Yeow
  • Bertil Schmidt
  • Darran Nathan
  • Joseph Landman
Abstract

HMMER, based on the profile Hidden Markov Model (HMM), is one of the most widely used sequence database searching tools, allowing researchers to compare HMMs to sequence databases or sequences to HMM databases. Such searches often take many hours and consume a great number of CPU cycles on modern computers. We present a cluster-enabled hardware/software-accelerated implementation of the HMMER search tool hmmsearch. Our results show that combining the parallel efficiency of a cluster with one or more high-speed hardware accelerators (FPGAs) can significantly improve performance for even the most time-consuming searches, often reducing search times from several hours to minutes.
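
The abstract does not spell out how work is divided between ordinary cluster nodes and FPGA-equipped nodes. As a rough illustration only, the sketch below simulates one plausible scheme: database chunks are handed out on demand to whichever worker becomes free first, so faster workers naturally take more chunks. The function name `simulate_search`, the chunk sizes, and the worker rates are all hypothetical and are not taken from the paper.

```python
import heapq


def simulate_search(chunk_sizes, worker_rates):
    """Simulate on-demand distribution of database chunks to workers.

    chunk_sizes:  work units in each database chunk (hypothetical units)
    worker_rates: work units per second each worker can score (hypothetical)
    Returns (makespan_seconds, chunks_handled_per_worker).
    """
    # Min-heap of (time at which the worker becomes free, worker id).
    free_at = [(0.0, w) for w in range(len(worker_rates))]
    heapq.heapify(free_at)

    handled = [0] * len(worker_rates)
    makespan = 0.0

    for size in chunk_sizes:
        # The next chunk goes to the worker that is free earliest.
        start, worker = heapq.heappop(free_at)
        done = start + size / worker_rates[worker]
        handled[worker] += 1
        makespan = max(makespan, done)
        heapq.heappush(free_at, (done, worker))

    return makespan, handled


if __name__ == "__main__":
    # Assumed setup: two CPU-only workers and one worker that is 4x faster,
    # standing in for an FPGA-accelerated node.
    makespan, handled = simulate_search([4] * 6, [1.0, 1.0, 4.0])
    print(makespan, handled)
```

In this toy setup the faster worker ends up taking most of the chunks, which is the intuition behind pairing a cluster with a small number of accelerators. A real implementation would exchange chunks and results over MPI rather than simulate timings.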

Keywords

HMMER, database searching, FPGA, VLSI, MPI, profile hidden Markov models

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • John Paul Walters (1)
  • Xiandong Meng (2)
  • Vipin Chaudhary (3)
  • Tim Oliver (4)
  • Leow Yuan Yeow (4)
  • Bertil Schmidt (5)
  • Darran Nathan (4)
  • Joseph Landman (6)

  1. Institute for Scientific Computing, Wayne State University, Detroit, USA
  2. Electrical and Computer Engineering Department, Wayne State University, Detroit, USA
  3. Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, USA
  4. Progeniq Pte Ltd., Singapore, Singapore
  5. UNSW Asia, Queenstown, Singapore
  6. Scalable Informatics LLC, Canton, USA
