Estimating the Accuracy of Multiple Alignments and its Use in Parameter Advising

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNBI, volume 7262)

Abstract

We develop a novel and general approach to estimating the accuracy of protein multiple sequence alignments without knowledge of a reference alignment, and use our approach to address a new problem that we call parameter advising. For protein alignments, we consider twelve independent features that contribute to a quality alignment. An accuracy estimator is learned that is a polynomial function of these features; its coefficients are determined by minimizing its error with respect to true accuracy using mathematical optimization. We evaluate this approach by applying it to the task of parameter advising: the problem of choosing alignment scoring parameters from a collection of parameter values to maximize the accuracy of a computed alignment. Our estimator, which we call Facet (for “feature-based accuracy estimator”), yields a parameter advisor that on the hardest benchmarks provides more than a 20% improvement in accuracy over the best default parameter choice, and outperforms the best prior approaches to selecting good alignments for parameter advising.
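The two ideas in the abstract, learning an estimator from features and then using it to pick a parameter choice, can be illustrated with a minimal sketch. This is not the Facet implementation: it assumes a degree-1 polynomial over only two made-up features, fits the coefficients by plain gradient descent on squared error (a stand-in for the mathematical optimization the abstract mentions), and uses hypothetical names (`estimate`, `fit`, `advise`, and the parameter labels) and synthetic toy values throughout.

```python
from typing import Dict, List, Sequence, Tuple

def estimate(coeffs: Sequence[float], features: Sequence[float]) -> float:
    """Degree-1 polynomial estimator: c0 + sum_i c_i * f_i."""
    return coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], features))

def fit(examples: List[Tuple[Sequence[float], float]],
        steps: int = 2000, rate: float = 0.5) -> List[float]:
    """Fit coefficients by gradient descent on mean squared error.

    Assumption: gradient descent is used here only for illustration;
    the abstract does not specify this method.
    """
    n = len(examples[0][0])
    coeffs = [0.0] * (n + 1)
    for _ in range(steps):
        grad = [0.0] * (n + 1)
        for feats, true_acc in examples:
            err = estimate(coeffs, feats) - true_acc
            grad[0] += err
            for i, f in enumerate(feats):
                grad[i + 1] += err * f
        coeffs = [c - rate * g / len(examples) for c, g in zip(coeffs, grad)]
    return coeffs

def advise(coeffs: Sequence[float],
           candidates: Dict[str, Sequence[float]]) -> str:
    """Return the parameter choice whose alignment has the highest estimate."""
    return max(candidates, key=lambda name: estimate(coeffs, candidates[name]))

# Synthetic training set: (feature values, true accuracy) pairs.
# These toy numbers are invented for the sketch, not benchmark data.
training = [
    ((0.0, 0.0), 0.2),
    ((1.0, 0.0), 0.7),
    ((0.0, 1.0), 0.5),
    ((1.0, 1.0), 1.0),
    ((0.5, 0.5), 0.6),
]
coeffs = fit(training)

# Feature values of the alignments computed under each hypothetical
# parameter choice for one input; no reference alignment is needed here.
candidates = {
    "default": (0.4, 0.4),
    "alt_a": (0.9, 0.2),
    "alt_b": (0.1, 0.8),
}
best = advise(coeffs, candidates)
```

The advisor never sees true accuracy at selection time: it ranks the candidate alignments purely by their estimated accuracy, which is why the quality of the estimator determines the quality of the advice.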

Keywords

  • Integer Linear Program
  • Structural Alignment
  • Accuracy Estimator
  • Parameter Choice
  • Balance Weight

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Research supported by US NSF Grants IIS-1050293 and DGE-0654435.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

DeBlasio, D.F., Wheeler, T.J., Kececioglu, J.D. (2012). Estimating the Accuracy of Multiple Alignments and its Use in Parameter Advising. In: Chor, B. (ed.) Research in Computational Molecular Biology. RECOMB 2012. Lecture Notes in Computer Science, vol 7262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29627-7_5

  • DOI: https://doi.org/10.1007/978-3-642-29627-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29626-0

  • Online ISBN: 978-3-642-29627-7

  • eBook Packages: Computer Science, Computer Science (R0)