Finding Protein Binding Sites Using Volunteer Computing Grids
This paper describes initial work in the development of the DNA@Home volunteer computing project, which aims to use Gibbs sampling to identify and locate DNA control signals in full-genome-scale data sets. Most current research on sequence analysis for these control signals involves significantly smaller data sets; volunteer computing, however, can provide the computational power needed to make full genome analysis feasible. A fault-tolerant, asynchronous implementation of Gibbs sampling using the Berkeley Open Infrastructure for Network Computing (BOINC) is presented, which is currently being used to analyze the intergenic regions of the Mycobacterium tuberculosis genome. In only three months of limited operation, over 1,800 volunteered computing hosts have participated in the project, and for the Mycobacterium tuberculosis dataset it obtains the number of samples required for analysis over 400 times faster than an average computing host. We feel that the preliminary results for this project provide a strong argument for the feasibility of, and public interest in, a volunteer computing project for this type of bioinformatics.
Keywords: Intergenic Region, Full Genome, Motif Model, Yersinia Pestis, Gibbs Sampling Algorithm