Segment Match Refinement and Applications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2452)

Abstract

Comparison of large, unfinished genomic sequences requires fast methods that are robust to misordering, misorientation, and duplications. A number of fast methods exist that can compute local similarities between such sequences, and from these similarities one may wish to derive an optimal one-to-one correspondence. However, existing methods for computing such a correspondence are either too costly to run or are inappropriate for unfinished sequences. We propose an efficient method for refining a set of segment matches such that the resulting segments are of maximal size without non-identity overlaps, that is, any two resulting segments on the same sequence are either identical or disjoint. This resolved set of segments can be used in various ways to compute a similarity measure between any two large sequences, and hence can be used in alignment, matching, or tree construction algorithms for two or more sequences.
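
To make the refinement goal concrete, the following is a minimal sketch of one naive way to reach it. It is not the efficient method the paper proposes, which is not shown in this preview. It assumes, purely for illustration, that each segment match is an ungapped, same-orientation triple `(a_start, b_start, length)` with half-open coordinates on two sequences A and B; the function name `refine_matches` is hypothetical. Cut points are propagated through the matches until nothing changes, and each match is then split at the cut points that fall inside it.

```python
def refine_matches(matches):
    """Split segment matches so that any two resulting segments on the
    same sequence are either identical or disjoint.

    Assumption (not from the paper): each match is (a_start, b_start, length),
    ungapped, forward orientation, half-open coordinates.
    This is a naive fixpoint iteration, not the paper's efficient method.
    """
    cuts_a, cuts_b = set(), set()
    for a, b, n in matches:
        cuts_a.update((a, a + n))
        cuts_b.update((b, b + n))

    # Propagate cut points across every match until a fixpoint is reached.
    changed = True
    while changed:
        changed = False
        for a, b, n in matches:
            # A cut strictly inside the A-side segment induces a cut on B.
            for p in [p for p in cuts_a if a < p < a + n]:
                q = b + (p - a)
                if q not in cuts_b:
                    cuts_b.add(q)
                    changed = True
            # A cut strictly inside the B-side segment induces a cut on A.
            for q in [q for q in cuts_b if b < q < b + n]:
                p = a + (q - b)
                if p not in cuts_a:
                    cuts_a.add(p)
                    changed = True

    # Split each match at the A-side cut points lying within it.
    refined = []
    for a, b, n in matches:
        points = sorted(p for p in cuts_a if a <= p <= a + n)
        for lo, hi in zip(points, points[1:]):
            refined.append((lo, b + (lo - a), hi - lo))
    return refined


if __name__ == "__main__":
    # Two matches whose A-side segments [0, 6) and [3, 9) partially overlap.
    print(refine_matches([(0, 10, 6), (3, 20, 6)]))
    # [(0, 10, 3), (3, 13, 3), (3, 20, 3), (6, 23, 3)]
```

In the example, the partial overlap on A is resolved into the pieces [0, 3), [3, 6), and [6, 9), and the piece [3, 6) appears in two refined matches as an identical segment. Each pass rescans every cut point for every match, so this sketch is only suitable for small inputs.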

New address: WSI-AB, Tübingen University, Sand 14, 72076 Tübingen, Germany.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Halpern, A.L., Huson, D.H., Reinert, K. (2002). Segment Match Refinement and Applications. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_10

  • DOI: https://doi.org/10.1007/3-540-45784-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44211-0

  • Online ISBN: 978-3-540-45784-8

  • eBook Packages: Springer Book Archive
