Parametric recomputing in alignment graphs

Part of the Lecture Notes in Computer Science book series (LNCS, volume 807)

Abstract

DNA/protein sequence alignments in computational molecular biology depend heavily on the settings of penalties for substitutions, insertions/deletions and gaps. Inappropriate choice of parameters causes irrelevant matches (“noise”) to be reported, thus obscuring biologically relevant matches. In practice, biologists frequently compare sequences in a few iterations, starting from a vague idea about appropriate parameters, then refining parameters to reduce noise. This procedure often helps to delineate biologically interesting similarities and to substantially reduce laborious analysis. This paper provides a computational underpinning for such iterative noise filtration in alignment graphs. Our main results assume that a preliminary “noisy” alignment, computed with reasonable but ad hoc parameters, is given; the problem is to modify the parameters to reduce noise. We present fast algorithms to refine penalty parameters and describe an application of these algorithms.
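
To make the dependence on penalty settings concrete, the following is a minimal sketch of a textbook global alignment score with a linear gap penalty. It is not the algorithm of this paper; the function name `alignment_score`, its parameters, and the two example sequences are assumptions chosen purely for illustration.

```python
def alignment_score(a, b, match=1.0, mismatch=1.0, gap=1.0):
    """Best global alignment score of strings a and b.

    Illustrative scoring scheme (an assumption, not taken from the paper):
    +match for each identical column, -mismatch for each substitution,
    -gap for each inserted or deleted symbol.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [-gap * i]  # a[:i] aligned against the empty prefix of b
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            diag = prev[j - 1] + (match if same else -mismatch)
            cur.append(max(diag, prev[j] - gap, cur[j - 1] - gap))
        prev = cur
    return prev[-1]


# Same sequences, two gap penalties: a cheap gap favors a gapped
# alignment, while an expensive gap favors the ungapped one.
for gap in (1.0, 4.0):
    print(gap, alignment_score("AACGT", "ACGTT", gap=gap))
```

With the cheap gap the optimum uses two gaps to line up four matching columns, while the expensive gap makes the ungapped alignment with three substitutions optimal. This is the kind of parameter sensitivity the abstract describes.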

Keywords

  • Decomposition Tree
  • Optimal Alignment
  • Locus Control Region
  • Alignment Graph
  • Computational Molecular Biology

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

The research was supported in part by the National Science Foundation under grant DIR-9106510.

The research was supported in part by the National Science Foundation under grant CCR-9308567 and the National Institutes of Health under grant R01 HG00987.

The research was supported in part by the National Institutes of Health under grant R01 LM05110.

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, X., Pevzner, P.A., Miller, W. (1994). Parametric recomputing in alignment graphs. In: Crochemore, M., Gusfield, D. (eds) Combinatorial Pattern Matching. CPM 1994. Lecture Notes in Computer Science, vol 807. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58094-8_8

  • DOI: https://doi.org/10.1007/3-540-58094-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58094-2

  • Online ISBN: 978-3-540-48450-9

  • eBook Packages: Springer Book Archive