Abstract
Methods for identifying significant contextual signals are widely used to search for transcription factor binding sites and to characterize the structural and functional organization of regulatory regions. These methods require neither prealignment of the analyzed sequences nor experimental information about the exact location of transcription factor binding sites. Methods based on identifying degenerate oligonucleotide motifs written in the 15-letter IUPAC code have become widespread. A fundamental problem with degenerate motifs is their great diversity, which forces researchers to apply heuristics that do not guarantee that the most significant signal will be found. The development of high-performance computing systems based on graphics cards has made it possible to use exact exhaustive methods to identify significant motifs. We have developed a new system that uses widely available graphics cards to identify significant degenerate oligonucleotide motifs of a given length in regulatory regions and that finds the signal with the greatest significance. The GPU was shown to be more efficient than the CPU for this task. Using the proposed approach, we analyzed the regulatory regions of B. subtilis, E. coli, H. pylori, M. gallisepticum, M. genitalium, and M. pneumoniae genes. Sets of degenerate motifs were identified for each prokaryotic species and classified by their similarity to the transcription factor binding sites of E. coli.
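To make the size of the search problem concrete, the following minimal Python sketch shows how a degenerate motif in the 15-letter IUPAC code can be matched against sequences and how all motifs of a given length can be enumerated exhaustively. It is an illustration only, not the authors' GPU implementation: the bitmask encoding, the function names (`contains`, `support_table`, `search_space_size`), and the use of a raw sequence count as the score are assumptions made here. The abstract does not specify the significance statistic, and a raw count is not a substitute for one, since the fully degenerate motif (all N) trivially matches every sequence.

```python
from itertools import product

# 15-letter IUPAC nucleotide code. Each symbol stands for a non-empty subset
# of {A, C, G, T}, encoded here as a 4-bit mask (A=1, C=2, G=4, T=8).
# The bitmask encoding is an illustrative assumption, not taken from the paper.
IUPAC = {
    "A": 1, "C": 2, "G": 4, "T": 8,
    "M": 1 | 2,      # A or C
    "R": 1 | 4,      # A or G
    "W": 1 | 8,      # A or T
    "S": 2 | 4,      # C or G
    "Y": 2 | 8,      # C or T
    "K": 4 | 8,      # G or T
    "V": 1 | 2 | 4,  # not T
    "H": 1 | 2 | 8,  # not G
    "D": 1 | 4 | 8,  # not C
    "B": 2 | 4 | 8,  # not A
    "N": 1 | 2 | 4 | 8,
}


def contains(motif, seq):
    """Return True if the degenerate motif matches seq at any position."""
    motif_masks = [IUPAC[c] for c in motif]
    seq_masks = [IUPAC[c] for c in seq]
    last_start = len(seq_masks) - len(motif_masks)
    for start in range(last_start + 1):
        # A position matches when the motif symbol allows the sequence base.
        if all(m & seq_masks[start + i] for i, m in enumerate(motif_masks)):
            return True
    return False


def search_space_size(length):
    """Number of distinct degenerate motifs of the given length."""
    return len(IUPAC) ** length


def support_table(seqs, length):
    """Exhaustively enumerate every motif of the given length and count
    how many sequences contain it (a stand-in score, not a significance)."""
    table = {}
    for letters in product(IUPAC, repeat=length):
        motif = "".join(letters)
        table[motif] = sum(contains(motif, s) for s in seqs)
    return table


if __name__ == "__main__":
    toy_seqs = ["TTGACA", "CTGATA", "AAGCTT"]  # hypothetical toy data
    table = support_table(toy_seqs, 2)
    print("GA:", table["GA"], "GM:", table["GM"])
    print("motifs of length 8:", search_space_size(8))
```

The search space grows as 15 to the power of the motif length, and each motif must be checked against every sequence independently of all other motifs. That independence is what makes the exhaustive search a natural fit for the data-parallel execution model of graphics cards.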
Additional information
Original Russian Text © O.V. Vishnevsky, A.V. Bocharnikov, A.A. Romanenko, 2015, published in Vavilovskii Zhurnal Genetiki i Selektsii, 2015, Vol. 19, No. 6, pp. 661–667.
About this article
Cite this article
Vishnevsky, O.V., Bocharnikov, A.V. & Romanenko, A.A. The use of graphics accelerators to detect functional signals in the regulatory regions of prokaryotic genes. Russ J Genet Appl Res 6, 731–737 (2016). https://doi.org/10.1134/S2079059716070145
DOI: https://doi.org/10.1134/S2079059716070145