Databases, models, and algorithms for functional genomics

A bioinformatics perspective

  • Review
  • Molecular Biotechnology

Abstract

A variety of patterns have been observed in DNA and protein sequences that serve as control points for gene expression and cellular functions. Owing to the vital role of such patterns, they are generally cataloged and maintained in internationally shared databases. Furthermore, the variability in a family of observed patterns is often represented using computational models in order to facilitate their search within an uncharacterized biological sequence. Because biological data comprise a mosaic of sequence-level motifs, it is important to unravel the synergies of macromolecular coordination utilized in the cell-specific differential synthesis of proteins. This article provides an overview of the various pattern representation methodologies and surveys the pattern databases available to molecular biologists. Our aim is to describe the principles behind the computational modeling and analysis techniques utilized in bioinformatics research, with the objective of providing the insight necessary to better understand and effectively utilize the available databases and analysis tools. We also provide a detailed review of the DNA sequence-level patterns responsible for structural conformations within the Scaffold or Matrix Attachment Regions (S/MARs).
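As an illustration of how a computational model can capture the variability in a family of patterns and then be used to search an uncharacterized sequence, the sketch below builds a log-odds position weight matrix and slides it along a DNA string. It is a minimal example under stated assumptions, not necessarily the formulation used in the article: the aligned sites, the uniform background frequency of 0.25, the pseudocount of 1, and the function names `build_pwm` and `scan` are all invented for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned DNA sites.

    Assumes a uniform background frequency and a simple additive pseudocount.
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Return (offset, score) for every window of the sequence that fits the matrix."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


# Hypothetical aligned sites, invented for illustration only.
sites = ["TATAAA", "TATATA", "TATAAG", "TTTAAA"]
pwm = build_pwm(sites)

# Score every window of a made-up query sequence and keep the best one.
hits = scan("GCGCTATAAAGCGC", pwm)
best_offset, best_score = max(hits, key=lambda hit: hit[1])
print(best_offset, round(best_score, 2))
```

Windows that resemble the aligned sites receive positive scores, while windows dominated by bases never seen at a given position are penalized, so a score threshold can separate candidate matches from background.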

Author information

Corresponding author

Correspondence to Gautam B. Singh.

About this article

Cite this article

Singh, G.B., Singh, H. Databases, models, and algorithms for functional genomics. Mol Biotechnol 29, 165–183 (2005). https://doi.org/10.1385/MB:29:2:165

  • DOI: https://doi.org/10.1385/MB:29:2:165
