Part of the book series: Advances in Intelligent and Soft Computing ((AINSC,volume 144))

Abstract

This paper describes initial work in the development of the DNA@Home volunteer computing project, which aims to use Gibbs sampling to identify and locate DNA control signals in full-genome-scale data sets. Most current research on sequence analysis for these control signals involves significantly smaller data sets; however, volunteer computing can provide the computational power needed to make full-genome analysis feasible. A fault-tolerant and asynchronous implementation of Gibbs sampling using the Berkeley Open Infrastructure for Network Computing (BOINC) is presented, which is currently being used to analyze the intergenic regions of the Mycobacterium tuberculosis genome. In only three months of limited operation, over 1,800 volunteered computing hosts have participated in the project, and it obtains the number of samples required for analysis of the Mycobacterium tuberculosis data set over 400 times faster than an average computing host. We feel that the preliminary results for this project provide a strong argument for the feasibility of, and public interest in, a volunteer computing project for this type of bioinformatics.
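For readers unfamiliar with the technique, the following is a minimal, single-machine sketch of a Gibbs site sampler for motif finding. It is a textbook-style illustration only, not the DNA@Home implementation: the function name `gibbs_motif_sampler`, its parameters, the one-site-per-sequence model, and the uniform background are all assumptions made for this example, and it includes none of the fault tolerance or asynchrony described above.

```python
import random


def gibbs_motif_sampler(seqs, w, iters=200, seed=0, pseudo=1.0):
    """Illustrative site sampler: one motif start of width w per sequence."""
    rng = random.Random(seed)
    alphabet = "ACGT"
    # Random initial start position in each sequence.
    pos = [rng.randrange(len(s) - w + 1) for s in seqs]
    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Position-specific counts built from every sequence except i.
            counts = [{a: pseudo for a in alphabet} for _ in range(w)]
            for j, t in enumerate(seqs):
                if j != i:
                    for k in range(w):
                        counts[k][t[pos[j] + k]] += 1
            total = len(seqs) - 1 + pseudo * len(alphabet)
            # Weight each candidate start by its likelihood under the motif
            # model (a uniform background is assumed, so it cancels out).
            weights = []
            for start in range(len(s) - w + 1):
                p = 1.0
                for k in range(w):
                    p *= counts[k][s[start + k]] / total
                weights.append(p)
            # Resample the start for sequence i from its conditional distribution.
            pos[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return pos


if __name__ == "__main__":
    toy = ["ACGTACGTTTGACA", "TTGACAGGGCCCAA", "GGCTTGACATCGCA"]
    print(gibbs_motif_sampler(toy, w=6, iters=50, seed=1))
```

Each sweep resamples one sequence's motif position conditioned on the positions currently held by all the others. Because independent chains started from different seeds do not need to communicate, work of this kind can plausibly be spread across many hosts, which is the general property a volunteer computing project relies on.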

Corresponding author

Correspondence to Travis Desell.

Copyright information

© 2012 Springer-Verlag GmbH Berlin Heidelberg

About this paper

Cite this paper

Desell, T., Newberg, L.A., Magdon-Ismail, M., Szymanski, B.K., Thompson, W. (2012). Finding Protein Binding Sites Using Volunteer Computing Grids. In: Gaol, F., Nguyen, Q. (eds) Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science. Advances in Intelligent and Soft Computing, vol 144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28314-7_52

  • DOI: https://doi.org/10.1007/978-3-642-28314-7_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28313-0

  • Online ISBN: 978-3-642-28314-7

  • eBook Packages: Engineering, Engineering (R0)
