Large Scale Matching for Position Weight Matrices

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 4009)

Abstract

This paper addresses the problem of multiple pattern matching for motifs encoded by Position Weight Matrices. We first present an algorithm that uses a multi-index table to preprocess the set of motifs, allowing a dramatic decrease in computation time. We then show how to take advantage of similar motifs to avoid useless computations.
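
To make the problem statement concrete, the following is a minimal sketch of the naive baseline that such work aims to improve on: every matrix is scored against every window of the text, and windows whose score reaches a threshold are reported. It is not the multi-index algorithm of the paper. The names `window_score` and `scan`, the dictionary-based matrix layout, and the toy matrix are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation: a PWM is a list of columns,
# and each column maps a nucleotide to its score.
PWM = List[Dict[str, float]]


def window_score(pwm: PWM, seq: str, start: int) -> float:
    """Sum the per-position scores of the window seq[start:start+len(pwm)]."""
    return sum(col[seq[start + i]] for i, col in enumerate(pwm))


def scan(pwms: Dict[str, Tuple[PWM, float]], seq: str) -> List[Tuple[str, int, float]]:
    """Naive multiple matching: try every matrix at every position.

    Returns (matrix name, start position, score) for each window whose
    score is at least that matrix's threshold.
    """
    hits = []
    for name, (pwm, threshold) in pwms.items():
        for start in range(len(seq) - len(pwm) + 1):
            score = window_score(pwm, seq, start)
            if score >= threshold:
                hits.append((name, start, score))
    return hits


# Hypothetical 3-column matrix that favours the word "ACG".
toy = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
hits = scan({"toy": (toy, 3.0)}, "TACGACGT")
```

The cost of this baseline grows with the product of the text length, the number of matrices, and the matrix length, which is the computation time that preprocessing the motif set is meant to reduce.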

Keywords

  • Transcription Factor Binding Site
  • Similar Matrices
  • Score Threshold
  • Position Weight Matrix
  • Weighted Pattern

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

  • DOI: 10.1007/11780441_36
  • Chapter length: 12 pages

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liefooghe, A., Touzet, H., Varré, JS. (2006). Large Scale Matching for Position Weight Matrices. In: Lewenstein, M., Valiente, G. (eds) Combinatorial Pattern Matching. CPM 2006. Lecture Notes in Computer Science, vol 4009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780441_36

  • DOI: https://doi.org/10.1007/11780441_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35455-0

  • Online ISBN: 978-3-540-35461-1

  • eBook Packages: Computer Science (R0)