The Probability Distribution of Distance TSS-TLS Is Organism Characteristic and Can Be Used for Promoter Prediction

  • Yun Dai
  • Ren Zhang
  • Yan-Xia Lin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4031)


Transcription is a complicated process that involves the interactions of promoter cis-elements with multiple trans-protein factors. These specific interactions rely not only on sequence recognition between the cis- and trans-factors but also on a certain spatial arrangement of the factors in a complex. The relative positioning of the cis-elements involved provides the framework for such a spatial arrangement. The present study examines the distribution of the distance between the transcription start site and the translation start site of a gene (TSS-TLS) to test the assumption that, over the course of evolution, the TSS-TLS distance has become a distinct characteristic of a given organism. Four representative organisms (Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana and Homo sapiens) were chosen to study the probability distribution of the TSS-TLS distance. The statistical results show that the distance distributions vary significantly and are not independent of species. There appears to be a trend of increasing distance from simple prokaryotic to more complex eukaryotic organisms. With organism-specific distance distribution data, computational promoter prediction tools can be improved for higher accuracy.
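
As a rough illustration of the kind of analysis described above, the sketch below bins TSS-TLS distances, computes an empirical distribution per organism, and evaluates a Pearson chi-square statistic for independence between organism and distance bin. The abstract does not state which statistical test the authors used, so the chi-square test is an assumption made here for illustration. The function names, the bin boundaries, and the distance values are all hypothetical and are not data from the study.

```python
from bisect import bisect_right
from collections import Counter


def bin_index(distance, edges):
    """Index of the bin holding `distance`, given sorted bin edges."""
    return bisect_right(edges, distance)


def empirical_distribution(distances, edges):
    """Relative frequency of distances in each of len(edges) + 1 bins."""
    counts = Counter(bin_index(d, edges) for d in distances)
    n = len(distances)
    return [counts.get(i, 0) / n for i in range(len(edges) + 1)]


def chi_square_independence(samples, edges):
    """Pearson chi-square statistic and degrees of freedom for an
    organism-by-distance-bin contingency table."""
    n_bins = len(edges) + 1
    table = []
    for distances in samples.values():
        counts = Counter(bin_index(d, edges) for d in distances)
        table.append([counts.get(i, 0) for i in range(n_bins)])

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip bins that are empty for every organism
                stat += (observed - expected) ** 2 / expected

    used_bins = sum(1 for c in col_totals if c > 0)
    dof = (len(table) - 1) * (used_bins - 1)
    return stat, dof


# Made-up distances in base pairs, for illustration only.
samples = {
    "organism_A": [20, 35, 40, 55, 60, 80, 90, 120],
    "organism_B": [60, 110, 150, 220, 260, 300, 410, 520],
}
edges = [50, 100, 200]  # hypothetical bin boundaries in bp

for name, distances in samples.items():
    print(name, empirical_distribution(distances, edges))

stat, dof = chi_square_independence(samples, edges)
print("chi-square statistic:", round(stat, 3), "degrees of freedom:", dof)
```

A large statistic relative to the chi-square distribution with the reported degrees of freedom would suggest that the distance distribution depends on the organism. With real data, the expected count in each cell should be large enough for the chi-square approximation to be reasonable.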


Distance Distribution; Translation Start Site; Promoter Prediction; Promoter Recognition; Empirical Sample Distribution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yun Dai (1)
  • Ren Zhang (2)
  • Yan-Xia Lin (1)
  1. School of Mathematics and Applied Statistics, University of Wollongong, Australia
  2. School of Biological Sciences, University of Wollongong, Australia
