Abstract
Sequence motifs that occur in a particular order in proteins or DNA have proven to be of biological interest. This paper proposes a new method for locating occurrences of up to five user-defined motifs, in a specified order, in large protein and nucleotide sequence databases. The method is built on quantifiers in regular expressions and uses linked lists for data storage. Its applications include the extraction of relevant consensus regions from biological sequences. This might be useful for clustering protein families and for studying the correlation between the positions of motifs and their functional sites in DNA sequences.
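The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of an ordered motif search driven by regular-expression quantifiers; it is not the authors' SSMBS code. The function names, the `min_gap`/`max_gap` parameters, the toy sequence, and the example motifs are all assumptions made for illustration, and a plain Python list stands in for the linked lists mentioned above.

```python
import re

def build_ordered_pattern(motifs, min_gap=0, max_gap=50):
    """Compile one regex that matches the motifs in the given order.

    Assumption: consecutive motifs are separated by a gap of
    min_gap..max_gap arbitrary residues, expressed with a bounded
    quantifier. Each motif is itself a regex fragment.
    """
    if not 1 <= len(motifs) <= 5:
        raise ValueError("expected between one and five motifs")
    # Lazy bounded quantifier: prefer the nearest following motif.
    gap = ".{%d,%d}?" % (min_gap, max_gap)
    # Named groups m1..m5 keep positions retrievable per motif.
    groups = ["(?P<m%d>%s)" % (i + 1, m) for i, m in enumerate(motifs)]
    return re.compile(gap.join(groups))

def find_ordered_motifs(sequence, motifs, min_gap=0, max_gap=50):
    """Return one entry per hit: a list of (matched_text, 1-based start)."""
    pattern = build_ordered_pattern(motifs, min_gap, max_gap)
    hits = []
    # finditer reports non-overlapping hits, scanning left to right.
    for match in pattern.finditer(sequence):
        hit = []
        for i in range(1, len(motifs) + 1):
            name = "m%d" % i
            hit.append((match.group(name), match.start(name) + 1))
        hits.append(hit)
    return hits

# Hypothetical toy protein fragment and two illustrative motifs.
toy_sequence = "MKDADATGGGGKSTLLAAA"
toy_motifs = ["D.D.[TV]", "GK[ST]"]
result = find_ordered_motifs(toy_sequence, toy_motifs)
reversed_result = find_ordered_motifs(toy_sequence, toy_motifs[::-1])
```

Here `result` holds the hits in which the first motif precedes the second, while `reversed_result` is empty because the motifs do not occur in the opposite order in the toy sequence. A search over a full database would apply the same compiled pattern to each record in turn.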
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kumar, C., Kumar, N., Rangarajan, S., Balakrishnan, N., Sekar, K. (2008). A Method to Find Sequentially Separated Motifs in Biological Sequences (SSMBS). In: Chetty, M., Ngom, A., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2008. Lecture Notes in Computer Science, vol 5265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88436-1_2
DOI: https://doi.org/10.1007/978-3-540-88436-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88434-7
Online ISBN: 978-3-540-88436-1
eBook Packages: Computer Science, Computer Science (R0)