The use of graphics accelerators to detect functional signals in the regulatory regions of prokaryotic genes

Russian Journal of Genetics: Applied Research

Abstract

Methods for identifying significant contextual signals are widely used to search for transcription factor binding sites and to characterize the structural and functional organization of regulatory regions. These methods require neither prealignment of the analyzed sequences nor experimental information about the exact location of the binding sites. Methods based on identifying degenerate oligonucleotide motifs written in the 15-letter IUPAC code have become widespread. A fundamental problem with degenerate motifs is their great diversity, which forces researchers to apply heuristics that do not guarantee that the most significant signal will be found. The development of high-performance computing systems based on graphics cards has made it possible to use exact exhaustive methods to identify significant motifs. We have developed a new system that identifies significant degenerate oligonucleotide motifs of a given length in regulatory regions; it runs on widely available graphics cards and searches for the signal with the greatest significance. The GPU was shown to be more efficient than the CPU for this task. Using the proposed approach, we analyzed the regulatory regions of B. subtilis, E. coli, H. pylori, M. gallisepticum, M. genitalium, and M. pneumoniae genes. Sets of degenerate motifs were identified for each prokaryotic species and classified by their similarity to the transcription factor binding sites of E. coli.
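To make the idea of an exhaustive search over degenerate motifs concrete, the following is a minimal CPU-only sketch in Python. It is not the authors' GPU system. The bitmask encoding, the function names (`encode`, `support`, `tail_probability`, `exhaustive_best`), the uniform base-frequency background, the equal-length assumption for the sequences, and the binomial tail used as a significance score are all illustrative assumptions. The sketch also shows why the motif space grows so quickly: there are 15^L candidate motifs of length L.

```python
from itertools import product
from math import comb

# 15-letter IUPAC nucleotide code as 4-bit masks (A=1, C=2, G=4, T=8).
IUPAC = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "M": 1 | 2, "R": 1 | 4, "W": 1 | 8,
    "S": 2 | 4, "Y": 2 | 8, "K": 4 | 8,
    "V": 1 | 2 | 4, "H": 1 | 2 | 8, "D": 1 | 4 | 8, "B": 2 | 4 | 8,
    "N": 15,
}


def encode(text):
    """Turn a motif or sequence string into a list of 4-bit masks."""
    return [IUPAC[ch] for ch in text.upper()]


def motif_space(length):
    """Number of distinct degenerate motifs of the given length."""
    return len(IUPAC) ** length


def occurs(motif_masks, seq_masks):
    """True if the motif matches at least one window of the sequence."""
    k = len(motif_masks)
    for start in range(len(seq_masks) - k + 1):
        window = seq_masks[start:start + k]
        if all(m & s for m, s in zip(motif_masks, window)):
            return True
    return False


def support(motif, sequences):
    """Number of sequences that contain at least one match of the motif."""
    masks = encode(motif)
    return sum(occurs(masks, encode(seq)) for seq in sequences)


def window_prob(motif_masks):
    """Chance that a random window matches, assuming uniform base frequencies."""
    p = 1.0
    for m in motif_masks:
        p *= bin(m).count("1") / 4.0
    return p


def tail_probability(motif, sequences):
    """Binomial tail P(X >= observed support); smaller means more significant.

    Illustrative score only: assumes equal-length sequences and treats
    overlapping windows as independent.
    """
    masks = encode(motif)
    n = len(sequences)
    windows = max(len(sequences[0]) - len(masks) + 1, 0)
    p_seq = 1.0 - (1.0 - window_prob(masks)) ** windows
    hits = support(motif, sequences)
    return sum(comb(n, i) * p_seq ** i * (1.0 - p_seq) ** (n - i)
               for i in range(hits, n + 1))


def exhaustive_best(sequences, length):
    """Score every one of the 15**length motifs and return the best one."""
    letters = sorted(IUPAC)
    candidates = ("".join(t) for t in product(letters, repeat=length))
    best = min(candidates, key=lambda m: tail_probability(m, sequences))
    return best, tail_probability(best, sequences)


if __name__ == "__main__":
    seqs = ["ACTTGACAGG", "CCTTGACTAA", "GGGGGGGGGG"]
    print(motif_space(8))             # size of the search space for length 8
    print(support("TTGACW", seqs))    # sequences matching TTGACA or TTGACT
    print(exhaustive_best(seqs, 3))   # brute force is feasible only for tiny lengths here
```

Each candidate motif is scored independently of all the others, so the loop in `exhaustive_best` is the kind of workload that can be spread across many GPU threads. How the authors actually partition the work on the graphics card is not stated in the abstract.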

Author information

Correspondence to O. V. Vishnevsky.

Additional information

Original Russian Text © O.V. Vishnevsky, A.V. Bocharnikov, A.A. Romanenko, 2015, published in Vavilovskii Zhurnal Genetiki i Selektsii, 2015, Vol. 19, No. 6, pp. 661–667.

About this article

Cite this article

Vishnevsky, O.V., Bocharnikov, A.V. & Romanenko, A.A. The use of graphics accelerators to detect functional signals in the regulatory regions of prokaryotic genes. Russ J Genet Appl Res 6, 731–737 (2016). https://doi.org/10.1134/S2079059716070145

  • DOI: https://doi.org/10.1134/S2079059716070145
