Pattern Matching for Motifs

  • Brendan Tse
  • David Hume
  • Yi-Ping Phoebe Chen


As noted in the introduction, any mammalian gene may have 50 to 100 or more binding sites for transcription factors scattered among its promoters and enhancers. Typically, any single transcription factor binds multiple sites. As noted above, genuine transcriptional regulatory elements tend to be clustered within conserved non-coding regions. Many transcription factors bind or act cooperatively, for example the Ets and AP1 families (Stacey et al., 1995), so their respective recognition motifs commonly occur side by side when they are functional. Regardless of the method used above, one can place an additional constraint on the analysis, and gain greater confidence in the predictions, by searching for clusters of predicted elements using programs such as Cluster-Buster (Frith et al., 2003). If the same clusters occur in genes with similar regulatory patterns, or across species, the analysis gains additional predictive power. When one includes multiple genes, the order and location of sites become irrelevant, and the output one seeks is the incidence of a particular site within a cluster, and its frequency when it is present. This constraint, in addition to those above, can help overcome the problem of transcription factor binding site degeneracy, and may bring us to a position in which it is possible to design machine learning approaches that distinguish classes of genes and likely transcriptional outputs based upon genomic sequence information alone.
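
The cluster search described above can be illustrated with a minimal sketch. It is not the Cluster-Buster algorithm, only a simple fixed-window heuristic written for illustration. The consensus patterns, the window size, the forward-strand-only scan, and the function names (`find_hits`, `find_clusters`, `motif_incidence`) are all assumptions made for this example and are not taken from the chapter.

```python
import re
from collections import Counter

# Illustrative consensus patterns (assumptions for this sketch only).
MOTIFS = {
    "AP1": "TGA[CG]TCA",
    "ETS": "GGA[AT]",
}


def find_hits(seq, motifs=MOTIFS):
    """Return sorted (position, motif_name) pairs for forward-strand matches."""
    seq = seq.upper()
    hits = []
    for name, pattern in motifs.items():
        # A lookahead reports overlapping matches as well.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start(), name))
    return sorted(hits)


def find_clusters(hits, window=50, min_distinct=2):
    """Group hits that start within `window` bp of a leading hit.

    A group is kept as a cluster only if it contains at least
    `min_distinct` different motif names.
    """
    clusters = []
    i = 0
    while i < len(hits):
        start = hits[i][0]
        j = i
        while j < len(hits) and hits[j][0] - start < window:
            j += 1
        group = hits[i:j]
        if len({name for _, name in group}) >= min_distinct:
            clusters.append(group)
            i = j  # continue after this cluster
        else:
            i += 1  # try the next hit as a cluster start
    return clusters


def motif_incidence(clusters_by_gene):
    """Summarise motifs across genes, ignoring order and location.

    Returns {motif: (genes_with_motif_in_a_cluster,
                     mean_copies_per_gene_when_present)}.
    """
    genes_with = Counter()
    copies = Counter()
    for clusters in clusters_by_gene.values():
        counts = Counter(name for cluster in clusters for _, name in cluster)
        for name, n in counts.items():
            genes_with[name] += 1
            copies[name] += n
    return {name: (genes_with[name], copies[name] / genes_with[name])
            for name in genes_with}


if __name__ == "__main__":
    # Toy sequence: an Ets-like and an AP1-like site close together,
    # followed by an isolated Ets-like site far downstream.
    toy = "TTTGGAAGTTGACTCATTT" + "A" * 100 + "GGAT"
    clusters = find_clusters(find_hits(toy))
    print(clusters)
    print(motif_incidence({"geneA": clusters, "geneB": []}))
```

In this sketch the isolated downstream site is discarded because no second motif type falls within the window, which mirrors the idea that co-occurring sites give more confidence than single degenerate matches. A realistic analysis would also scan the reverse strand and would likely use position weight matrices rather than fixed consensus strings.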


Keywords: Pattern Match; Query Sequence; Search Window; Mammalian Gene; Current Window




  1. Antequera, F. (2003) Structure, function and evolution of CpG island promoters. Cellular and Molecular Life Sciences 60: 1647–1658.
  2. DeRisi, J.L., Iyer, V.R. and Brown, P.O. (1997) Exploring the Metabolic and Genetic Control of Gene Expression on a Genomic Scale. Science 278: 680–686.
  3. Durbin, R., Eddy, S.R., Krogh, A. and Mitchison, G. (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press, New York.
  4. Frith, M.C., Li, M.C. and Weng, Z. (2003) Cluster-Buster: finding dense clusters of motifs in DNA sequences. Nucl. Acids. Res. 31: 3666–3668.
  5. Hume, D.A. (2000) Probability in transcriptional regulation and its implications for leukocyte differentiation and inducible gene expression. Blood 96: 2323–2328.
  6. Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., Adachi, J., Fukuda, S., Aizawa, K., Izawa, M., Nishi, K., Kiyosawa, H., Kondo, S., Yamanaka, I., Saito, T., Okazaki, Y., Gojobori, T., Bono, H., Kasukawa, T., Saito, R., Kadota, K., Matsuda, H., Ashburner, M., Batalov, S., Casavant, T., Fleischmann, W., Gaasterland, T., Gissi, C., King, B., Kochiwa, H., Kuehl, P., Lewis, S., Matsuo, Y., Nikaido, I., Pesole, G., Quackenbush, J., Schriml, L.M., Staubli, F., Suzuki, R., Tomita, M., Wagner, L., Washio, T., Sakai, K., Okido, T., Furuno, M., Aono, H., Baldarelli, R., Barsh, G., Blake, J., Boffelli, D., Bojunga, N., Carninci, P., de Bonaldo, M.F., Brownstein, M.J., Bult, C., Fletcher, C., Fujita, M., Gariboldi, M., Gustincich, S., Hill, D., Hofmann, M., Hume, D.A., Kamiya, M., Lee, N.H., Lyons, P., Marchionni, L., Mashima, J., Mazzarelli, J., Mombaerts, P., Nordone, P., Ring, B., Ringwald, M., Rodriguez, I., Sakamoto, N., Sasaki, H., Sato, K., Schonbach, C., Seya, T., Shibata, Y., Storch, K.F., Suzuki, H., Toyo-oka, K., Wang, K.H., Weitz, C., Whittaker, C., Wilming, L., Wynshaw-Boris, A., Yoshida, K., Hasegawa, Y., Kawaji, H., Kohtsuki, S., and Hayashizaki, Y. (2001) Functional annotation of a full length mouse cDNA collection. Nature 409: 685–690.
  7. Lee, T.I., Rinaldi, N.J., Robert, F., Odom, D.T., Bar-Joseph, Z., Gerber, G.K., Hannett, N.M., Harbison, C.T., Thompson, C.M., Simon, I., Zeitlinger, J., Jennings, E.G., Murray, H.L., Gordon, D.B., Ren, B., Wyrick, J.J., Tagne, J.B., Volkert, T.L., Fraenkel, E., Gifford, D.K. and Young, R.A. (2002) Transcriptional Regulatory Networks in Saccharomyces cerevisiae. Science 298: 799–804.
  8. Lemon, B. and Tjian, R. (2000) Orchestrated response: a symphony of transcription factors for gene control. Genes Dev. 14: 2551–2569.
  9. Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H., Yamanaka, I., Kiyosawa, H., Yagi, K., Tomaru, Y., Hasegawa, Y., Nogami, A., Schonbach, C., Gojobori, T., Baldarelli, R., Hill, D.P., Bult, C., Hume, D.A., Quackenbush, J., Schriml, L.M., Kanapin, A., Matsuda, H., Batalov, S., Beisel, K.W., Blake, J.A., Bradt, D., Brusic, V., Chothia, C., Corbani, L.E., Cousins, S., Dalla, E., Dragani, T.A., Fletcher, C.F., Forrest, A., Frazer, K.S., Gaasterland, T., Gariboldi, M., Gissi, C., Godzik, A., Gough, J., Grimmond, S., Gustincich, S., Hirokawa, N., Jackson, I.J., Jarvis, E.D., Kanai, A., Kawaji, H., Kawasawa, Y., Kedzierski, R.M., King, B.L., Konagaya, A., Kurochkin, I.V., Lee, Y., Lenhard, B., Lyons, P.A., Maglott, D.R., Maltais, L., Marchionni, L., McKenzie, L., Miki, H., Nagashima, T., Numata, K., Okido, T., Pavan, W.J., Pertea, G., Pesole, G., Petrovsky, N., Pillai, R., Pontius, J.U., Qi, D., Ramachandran, S., Ravasi, T., Reed, J.C., Reed, D.J., Reid, J., Ring, B.Z., Ringwald, M., Sandelin, A., Schneider, C., Semple, C.A.M., Setou, M., Shimada, K., Sultana, R., Takenaka, Y., Taylor, M.S., Teasdale, R.D., Tomita, M., Verardo, R., Wagner, L., Wahlestedt, C., Wang, Y., Watanabe, Y., Wells, C., Wilming, L.G., Wynshaw-Boris, A., Yanagisawa, M., et al. (2002) Analysis of the mouse transcriptome based on functional annotation of 60,770 full length cDNAs. Nature 420: 563–573.
  10. Pennacchio, L.A. and Rubin, E.M. (2001) Genomic strategies to identify mammalian regulatory sequences. Nature Reviews Genetics 2: 100–109.
  11. Ravasi, T., Hsu, K., Goyette, J., Schroder, K., Yang, Z., Rahimi, F., Miranda, L.P., Alewood, P.F., Hume, D.A. and Geczy, C. Probing the S100 protein family through genomic and functional analysis. Genomics.
  12. Rehli, M. (2002) Of mice and men: species variations of Toll-like receptor expression. Trends in Immunology 23: 375–378.
  13. Rombauts, S., Florquin, K., Lescot, M., Marchal, K., Rouze, P. and Van de Peer, Y. (2003) Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes. Plant Physiology 132: 1162–1176.
  14. Stacey, K., Fowles, L., Colman, M., Ostrowski, M. and Hume, D. (1995) Regulation of urokinase-type plasminogen activator gene transcription by macrophage colony-stimulating factor. Mol. Cell. Biol. 15: 3430–3441.
  15. Sweet, M.J. and Hume, D.A. (1996) Endotoxin signal transduction in macrophages. Journal of Leukocyte Biology 60: 8–26.
  16. Tagoh, H., Himes, R., Clarke, D., Leenen, P.J.M., Riggs, A.D., Hume, D. and Bonifer, C. (2002) Transcription factor complex formation and chromatin fine structure alterations at the murine c-fms (CSF-1 receptor) locus during maturation of myeloid precursor cells. Genes Dev. 16: 1721–1737.
  17. Walsh, N.C., Cahill, M., Carninci, P., Kawai, J., Okazaki, Y., Hayashizaki, Y., Hume, D.A. and Cassady, A.I. (2003) Multiple tissue-specific promoters control expression of the murine tartrate-resistant acid phosphatase gene. Gene 307: 111–123.
  18. Wang, T. and Stormo, G.D. (2003) Combining phylogenetic data with coregulated genes to identify regulatory motifs. Bioinformatics 19: 2369–2380.
  19. Wells, C., Ravasi, T., Faulkner, G., Carninci, P., Okazaki, Y., Hayashizaki, Y., Sweet, M.J., Wainwright, B.J. and Hume, D.A. (2003) Genetic control of the innate immune response. BMC Immunology 4.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Brendan Tse (1, 2)
  • David Hume (1, 2)
  • Yi-Ping Phoebe Chen (3, 4)

  1. CRC for Chronic Inflammatory Diseases, Australia
  2. Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
  3. School of Information Technology, Faculty of Science and Technology, Deakin University, Australia
  4. ARC Centre in Bioinformatics, Australia
