The use of graphics accelerators to detect functional signals in the regulatory regions of prokaryotic genes

Abstract

Methods for identifying significant contextual signals are widely used to search for transcription factor binding sites and to characterize the structural and functional organization of regulatory regions. These methods require neither prealignment of the analyzed sequences nor experimental information about the exact location of transcription factor binding sites. Methods based on identifying degenerate oligonucleotide motifs written in the 15-letter IUPAC code have become widespread. A fundamental problem with degenerate motifs is their great diversity, which forces researchers to rely on heuristics that do not guarantee that the most significant signal will be found. The development of high-performance computing systems based on graphics cards has made it possible to use exact exhaustive methods to identify significant motifs. We have developed a new system that identifies significant degenerate oligonucleotide motifs of a given length in regulatory regions. It runs on widely available graphics cards and searches for the signal with the greatest significance. The GPU was shown to be more efficient than the CPU for this task. Using the proposed approach, we analyzed the regulatory regions of B. subtilis, E. coli, H. pylori, M. gallisepticum, M. genitalium, and M. pneumoniae genes. Sets of degenerate motifs were identified for each prokaryotic species and classified by their similarity to the transcription factor binding sites of E. coli.
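To make the notion of a degenerate motif in the 15-letter IUPAC code concrete, the following minimal Python sketch encodes each IUPAC symbol as a 4-bit mask over {A, C, G, T} and counts how many sequences contain at least one match to a motif. It is an illustration only and not the authors' GPU implementation. The function names (`encode`, `matches_at`, `count_sequences_with_motif`, `search_space_size`) and the bit assignment are assumptions made for this example, and the sequences are assumed to contain only A, C, G, and T.

```python
# Illustrative sketch only; not the authors' GPU implementation.
# Each of the 15 IUPAC symbols is a non-empty subset of {A, C, G, T},
# stored here as a 4-bit mask (A=1, C=2, G=4, T=8). The bit layout is
# an assumption made for this example.
IUPAC = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "M": 1 | 2,      # A or C
    "R": 1 | 4,      # A or G
    "W": 1 | 8,      # A or T
    "S": 2 | 4,      # C or G
    "Y": 2 | 8,      # C or T
    "K": 4 | 8,      # G or T
    "V": 1 | 2 | 4,  # A, C or G (not T)
    "H": 1 | 2 | 8,  # A, C or T (not G)
    "D": 1 | 4 | 8,  # A, G or T (not C)
    "B": 2 | 4 | 8,  # C, G or T (not A)
    "N": 15,         # any nucleotide
}


def encode(text):
    """Convert a string of IUPAC symbols into a list of bit masks."""
    return [IUPAC[ch] for ch in text.upper()]


def matches_at(motif_masks, seq_masks, pos):
    """True if the motif matches the sequence starting at position pos.

    A motif symbol matches a nucleotide when their masks share a bit.
    The caller must ensure pos + len(motif_masks) <= len(seq_masks).
    """
    window = seq_masks[pos:pos + len(motif_masks)]
    return all(m & s for m, s in zip(motif_masks, window))


def count_sequences_with_motif(motif, sequences):
    """Count the sequences that contain at least one motif occurrence."""
    motif_masks = encode(motif)
    hits = 0
    for seq in sequences:
        seq_masks = encode(seq)
        last_start = len(seq_masks) - len(motif_masks)
        if any(matches_at(motif_masks, seq_masks, i)
               for i in range(last_start + 1)):
            hits += 1
    return hits


def search_space_size(length):
    """Number of distinct degenerate motifs of the given length."""
    return 15 ** length


if __name__ == "__main__":
    toy_sequences = ["GGTATAATCC", "CCTATATTGG", "GGGGGGGG"]
    print(count_sequences_with_motif("TATAWT", toy_sequences))
    print(search_space_size(8))
```

On the toy input, the motif TATAWT occurs in two of the three sequences, and there are 15^8 = 2,562,890,625 candidate motifs of length 8. The size of that space illustrates why an exhaustive search is costly on a CPU. Each motif can be scored independently of the others, so the problem is naturally data-parallel.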

Author information

Corresponding author

Correspondence to O. V. Vishnevsky.

Additional information

Original Russian Text © O.V. Vishnevsky, A.V. Bocharnikov, A.A. Romanenko, 2015, published in Vavilovskii Zhurnal Genetiki i Selektsii, 2015, Vol. 19, No. 6, pp. 661–667.

About this article

Cite this article

Vishnevsky, O.V., Bocharnikov, A.V. & Romanenko, A.A. The use of graphics accelerators to detect functional signals in the regulatory regions of prokaryotic genes. Russ J Genet Appl Res 6, 731–737 (2016). https://doi.org/10.1134/S2079059716070145

Keywords

  • degenerated oligonucleotide motif
  • transcription regulation
  • translation regulation
  • CUDA
  • GPU