Matching Techniques in Genomic Sequences for Motif Searching

  • Chapter
Soft Computing for Data Mining Applications

Part of the book series: Studies in Computational Intelligence ((SCI,volume 190))

Abstract

Sequence retrieval serves as a preprocessing step for a number of other processes, including motif discovery, in which retrieved sequences are scored against a consensus before being recognized as a motif. Retrieval in turn depends on how the sequences are stored. Using two bits to represent each genomic character is optimal in terms of storage, but it provides no information about the length of runs of repeated characters or other details of positional significance. This chapter presents an alternative storage technique for sequences and its corresponding retrieval technique. We describe the technique using integers for clarity; using the bit equivalents of these integers in the actual representation could reduce storage requirements significantly. Before presenting our proposal, we outline the requirements of a storage technique from a motif discovery perspective.
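The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the chapter's storage scheme, which the abstract does not specify; it only contrasts plain two-bit packing with an integer run-length view of the same sequence. The base-to-code mapping and the function names `pack_two_bit`, `unpack_two_bit`, and `run_lengths` are assumptions made for illustration.

```python
from itertools import groupby

# Assumed mapping: any fixed assignment of the four bases to 2-bit codes would do.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_two_bit(seq):
    """Pack a DNA string into a single integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    return value


def unpack_two_bit(value, length):
    """Recover the DNA string; the length has to be stored separately."""
    shifts = range(2 * (length - 1), -1, -2)
    return "".join(BASES[(value >> shift) & 3] for shift in shifts)


def run_lengths(seq):
    """Describe the sequence as (base code, run length) integer pairs."""
    return [(CODES[base], len(list(run))) for base, run in groupby(seq)]


seq = "AAAACGGT"
packed = pack_two_bit(seq)   # compact, but runs are not visible without decoding
runs = run_lengths(seq)      # integer pairs that expose repeat lengths directly
print(packed, unpack_two_bit(packed, len(seq)), runs)
```

The packed form is the most compact, but finding the run of four `A`s requires decoding and scanning the bases. The integer pairs make run lengths directly readable, at the cost of extra storage whenever runs are short.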

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Venugopal, K.R., Srinivasa, K.G., Patnaik, L.M. (2009). Matching Techniques in Genomic Sequences for Motif Searching. In: Soft Computing for Data Mining Applications. Studies in Computational Intelligence, vol 190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00193-2_17

  • DOI: https://doi.org/10.1007/978-3-642-00193-2_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00192-5

  • Online ISBN: 978-3-642-00193-2

  • eBook Packages: Engineering, Engineering (R0)
