Finding Consensus Patterns in Very Scarce Biosequence Samples from Their Minimal Multiple Generalizations

  • Yen Kaow Ng
  • Takeshi Shinohara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


In this paper we examine the issues involved in finding consensus patterns from biosequence data of very small sample sizes by searching for a so-called minimal multiple generalization (mmg), that is, a set of syntactically minimal patterns that accounts for all the samples. The data we use are the sigma regulons with more conserved consensus patterns for the bacterium B. subtilis. By comparing the mmgs found over different search spaces, we found that it is possible to derive patterns close to the known consensus patterns simply by imposing some reasonable requirements on the kinds of patterns to obtain. We also propose some simple measures to evaluate the patterns in an mmg.
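The requirement that a set of patterns "accounts for all the samples" can be illustrated with a minimal sketch. It is not the authors' mmg search and does not test syntactic minimality; it only checks coverage. The pattern syntax (`*` as a variable), the function names, the non-erasing default (each variable stands for a non-empty string), and the toy sequences are all assumptions made for illustration and are not taken from the paper.

```python
import re

def pattern_to_regex(pattern, variable="*", erasing=False):
    """Translate a regular pattern (each variable used once) into a regex.

    Assumption: every occurrence of `variable` is a distinct variable.
    With erasing=False a variable must match at least one symbol.
    """
    gap = ".*" if erasing else ".+"
    parts = [gap if ch == variable else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))

def matches(pattern, sequence, **kw):
    """True if the whole sequence is an instance of the pattern."""
    return pattern_to_regex(pattern, **kw).fullmatch(sequence) is not None

def covers_all(patterns, samples, **kw):
    """True if every sample is matched by at least one pattern in the set."""
    return all(any(matches(p, s, **kw) for p in patterns) for s in samples)

# Hypothetical toy sequences and patterns, not data from the paper.
samples = ["TTGACA", "TTGCCA", "GGTATAAT"]
patterns = ["TTG*CA", "*TATAAT"]
print(covers_all(patterns, samples))
```

A set that passes this check is only a candidate. An mmg additionally requires that no pattern in the set can be replaced by a more specific one while still covering every sample, which this sketch does not attempt.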


Keywords: Regular Pattern, Pattern Class, Pattern Language, Variable Occurrence, Consensus Pattern





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yen Kaow Ng (1)
  • Takeshi Shinohara (2)
  1. Graduate School of Computer Science and Systems, Kyushu Institute of Technology, Iizuka, Japan
  2. Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan
